Data Visualization, Excel, Power Map, Power View, SQL Saturday, SSRS

I’m Speaking About Geospatial Data Viz and Data Viz Best Practices in April

I will be speaking at two SQL Saturdays in April.  First, I will be at SQLSaturday #297 in Colorado Springs on April 12.  I’ll be presenting my session on The Accidental Report Designer: Data Visualization Best Practices in SSRS.  See my previous post on this subject to understand why I think it is an important message for all data professionals.  In this session, I share guidance on evaluating the effectiveness of data visualizations and a few good tips and demos to take your data visualizations from chaotic and ineffective to professional and powerful.

Next, I will be at SQL Saturday #291 in Chicago on April 26.  I will be presenting a new session to help you Choose Your Geospatial Mapping Adventure. Here’s the abstract:

You have geospatial data and you know that plotting it on a map will deliver insights. Or maybe you keep hearing “geospatial” as a current BI buzzword and you want to be informed about your options when you get to it. The Microsoft BI stack provides several options for visualizing location data. But how do you know which tool will best serve your purpose, or what factors are important in making this decision? We’ll identify factors to consider when choosing a tool to visualize your data in maps such as system requirements, availability and type of location data, required level of technical understanding, and more. You’ll leave the session confident in your ability to choose the best tool within Excel and SQL Server.

The Microsoft BI stack offers several great options for presenting data on a map, but your decision of which tool (Power View Map, Power Map, SSRS map, free mapping apps in Excel) can limit you in features, accessibility, and time to delivery. Some of the tools require more technical skill than others, but you lose some of the ability to customize when you go with the highest ease of use.  The session will be filled with tips, explanations of features and requirements, and demos.

If you will be in the area on the dates of these events, please stop by to participate in my sessions or just say hello.

Excel, Microsoft Technologies, Power BI, Power Map

Power Map for Excel is Now Generally Available for Office 365 With a Few New Features and Bug Fixes

Today, Microsoft announced that Power Map for Excel is now in GA. As Chris Webb noted, it will only be available for those that have Office 365 ProPlus. Those with standalone Excel or Office Professional Plus will not get the GA version of Power Map. I just finished applying the update to Office to get the GA version and I have a few observations.

Once you have the Office SP1 update, you will need to enable to the GA add-in and disable the preview add-in, as it installs separately from (rather than over the top of) the Power Map Preview add-in.

Enable Power Map Add-in

If you had the Power Map Preview installed, when you go to insert or open a Power Map, you will receive a dialog box that asks you to allow it to uninstall an older version.  This only takes a minute. Then you can open Power Map once again and work with the new GA version.

There don’t seem to be any significant differences between the preview and GA version, but they did fix a few bugs, add a couple of new features,  and make a few UI updates.

Some of the icons in the ribbon have been updated, but the options are still the same.
PM Updated Ribbon icons

If you click on Themes, you will notice there are now 12 themes available where before there were 8.

PM Themes

When you go to add a new layer, you will notice that the Chart Type drop-down was changed to an array of icons.  The available selections remain the same: stacked column, clustered column, bubble, heat map, region.

PM - chart type icons

They have added a new icon to the Layer Manager panel that allows you to add a new layer to the selected scene.

PM added New Layer IconThey also added a new option in the Layer Options to change the opacity of the layer.

PM Layer Option - Opacity

And they fixed the color picker in the Layer Options!  This was awful in the preview, but it looks much better now.

PM - Color Picker worksThey also updated the UI for annotations.

PM - Updated Annotation UI

And, I’m not 100% sure on this one, but I think they added the ability to add line breaks in the text of the annotations (using Shift + Enter).  I seem to remember struggling with the lack of this ability in the preview version.

PM - annotation line breaks

Those are all of the changes I noticed.  Leave me a comment if you see something I missed.  I was able to open and update several maps I made with the Preview version and I haven’t noticed any incompatibilities between the preview and GA versions.

There are still some features that I feel should be included in this product:

  • filter data in the map without having to make lots of calculated measures in Power Pivot to accomplish this
  • show/embed a Power Map on a web page without having to open it in Excel or download/play a video of the tour
  • include images and text in a single annotation
  • choose which fields appear in tool tips/hover text
  • reposition annotations so they don’t overlap with another annotation or a chart item
  • two-color scales for region shading (ex: blue -> red for showing temperature, red – > green for under/over goal)
  • stacked column charts with categories, let the user choose how the items are stacked ex: chronologically asc/desc, alphabetically by category, etc.)
  • create a map from a copy of an existing map

Hopefully some of them will come with the next release.

Happy mapping!

Excel, Microsoft Technologies, Power BI, Power Pivot, Power Query

Moving Calculated Measures in Power Pivot for Excel 2013

I learned a lesson the hard way: I shouldn’t change field names and data types in Power Pivot on tables that were imported using Power Query. My changes broke the connection between the two tools, so when I refreshed a query in Power Query that was set to load the results to my data model it caused a new/duplicate table to be created in my data model instead of updating the original table.

PPBlogTables

I already had Power View reports built off of my model that I didn’t want to rebuild, and I didn’t want to leave a duplicate table in my model, but I needed to make the data model refreshable. The issue can be fixed, but it’s a bit tedious:

  1. Create relationships for the new table to match those of the old table
  2. Rename the old table to something else
  3. Rename the new table to the original table name (usually it has a 1 after the table name)
  4. Rebuild or move any calculated measures to the new table
  5. Delete the old table

I was fine until I got to #4. You can’t just copy a calculated measure to a new table by cutting and pasting. You can copy the formula from the formula bar and paste it into a cell on another table, but you will get a warning that you can’t have two measures with the same name.

PPBLogDupName

Going this route means pasting your measure, renaming the old measure, and then renaming the new copy of the measure to the original name.  Due to the lack of the ability to filter data in Power Map, I had to create several calculated measures in my data model to limit the data shown in my map.  The idea of copying, pasting, and renaming several dozen measures was less than appealing.

It should be noted that in the current version of Power Pivot, you can put calculated measures on any table in the model without affecting the result of a query using the measure. The table in which it is located is more of an organization issue rather than a technical calculation issue.

I thought there must be a better way to move a measure. I looked through all of the buttons on the ribbon in the Power Pivot window and saw nothing helpful so I turned to Google (yes, I still choose Google over Bing). A TechNet article provided just the guidance I needed:

“To move a measure, you can cut and paste the formula in the Calculation Area. You can only move it to a different location in the Calculation Area of the table in which it was created. You cannot move a measure to different table in this view; instead use the Measure Settings dialog in the PowerPivot ribbon in Excel to change the association.”

The Measure Settings dialog box held the answer, although I found these settings under the Calculated Fields -> Managed Calculated Fields.  PPBlogManageFields

It contains a drop-down box that allows me to choose the table where the definition of the measure should be stored. This was a much quicker way to quickly move several calculated measures to another table in Power Pivot.

PPBlogcalcFieldSet

I have built several Power Pivot models and never run into this particular situation. I can’t say I’m glad it happened, but I can appreciate that it reminded me that there are useful things on the Power Pivot ribbon in the Excel window that aren’t accessible through the Power Pivot window.

Data Visualization, Excel, Microsoft Technologies, Power Query, Power View

Most Popular Names Visualized in Excel

A couple of months ago, I came across an article in the Atlantic that showed an animated gif of the most popular baby names by state by year. I decided to visualize this data using various add-ins and chart types in Excel.

Retrieving the Data

I found the data on the Social Security Administration website. This data showed the top 5 girl names in a table for each state for a given year. I needed to pull in multiple years, so I used Power Query and wrote a bit of M. I followed the guidance from a great blog post by Devin Knight:

  1. Choose From Web and enter the url for a page containing any year of data.
  2. Modify the M to create a function that accepts a year parameter and uses that to formulate the correct url for each year.
  3. Create a second query from scratch and provide an array of years.
  4. Convert this data to a table and insert a calculated column that pulls in the data from the first query.
  5. Expand the columns, rename columns, and change to appropriate data types where necessary.

For more detail, see Devin’s post. He does a great job of explaining and providing screenshots.

Once I had my data it was time to show it off.

Power Map

My first instinct was to use Power Map since its specialty is showing changes in geospatial data over time. After trying each type of chart in Power Map, I found that the region shading was the closest to what I wanted.

Power Map Baby NameAlthough I wasn’t just trying to duplicate the animated gif, I found it difficult to display this data in an equally effective way in Power Map due to the lack of ability to add data labels or customize annotations. There was no good way to show the most popular name on the state. I chose to use annotations to display the name. You can use data fields to populate the title and description of an annotation. The biggest drawback of using annotations is that there is no way to resize or relocate them. If I were to add annotations for each state, they would overlap and you wouldn’t be able to read many of them nor would you be able to see most of the map. I was able to add enough annotations that there was usually one annotation for each of the names shown on the map at any time. One other issue I noticed is that occasionally an annotation will fail to update on time, so the data in the annotation will be old and correspond to a previous year.  If you watch the video for my Power Map, you will see that the annotation for Virginia gets stuck on 1983 for a few years, but it eventually catches up. Update: Microsoft responded to my question on this.  We figured out that the annotations will update as expected if you set the Time to Date(Year) and then change the time settings to “Data Shows for an Instant”.  I have updated my workbook with these settings. You can watch the video of my Power Map tour here. Outside of the inability to add good labels, watching the states change color over time works well. You can have a legend that shows the names with the colors, but it will show every name from the data set and will not be an effective use of space.

Power View

Power View provides the ability to create bubble maps. I made a bubble map tiled by year.

Power View Baby NameIf you click on a year tile, you can then use the arrow keys to flip through the years; you can also use the scroll bar below the tiles. I think this provides a better visual than the Power Map for this situation since it is easier to see which names go with which states. Because of the bubble colors, I can see how the trend went from Jennifer to Jessica to Ashley in the 1980s. We also notice that there are 5+ top names from the 1990s forward, whereas the 1980s had 2 – 4.  The only real problem I encountered with the map in Power View was the limitation of the colors in the themes. When there are several names on the map, some of the colors can be very similar.  I tried several themes and couldn’t find one where this wasn’t the case.

Pivot Table

Next I decided to stray from maps and see if there were other chart choices that suited this data.  I created a pivot table with data bars and a timeline slicer.

Excel pivot baby nameI can still tell which baby names were most popular across the US and see how the trends change over time. The piece of information that I lose in moving away from maps is seeing how trends affect states that are geographically near each other. For instance, I can tell that Jessica was popular in 1985 in states such as California, Florida, New York, and Michigan. But I don’t immediately recognize that Jessica was popular only in New England, Florida, and the Mountain West and West Coast because Ashley had taken over the    Mid-Atlantic and Midwest. One advantage of the pivot table is that I can select multiple years. So I can learn things like Ashley was the most popular name overall in the 1990s.

Adventures with Apps for Office

Microsoft Research published several free apps for data visualization in Excel.  I decided to use a couple of them to visualize the baby name data. You can access these apps on the Insert tab.  Click Apps for Office -> See All.  Then click Find More Apps at the Office Store. One thing to note about these apps is that they require the data to be formatted in a certain way.  See the instructions in the links below for details on this.

apps for office

Streamgraph

A streamgraph shows how a set of numbers has changed over time using their relative area. It’s kind of like an area chart that doesn’t use an absolute Y axis.

steam graph baby names

I do not think this is an effective way to display the data.  Area graphs and streamgraphs both fail when trying to display a large number of values.  In this specific case, you can only pick a color and the shades are determined for you. It’s difficult to distinguish some of the shades from others. The fact that the values for the names can overlap and aren’t a specific shape (like a rectangle where you only have to judge length) make it difficult to interpret quickly.  You lose most of the advantages of pre-attentive processing because you really have to study the graph to get information from it.

Treemap

A treemap is a hierarchical view of data that breaks it into rectangles and uses size and color to communicate information.

tree map

For some reason there is a size limitation on this treemap so I couldn’t make it big enough to where all the names would fit in the boxes. Since the treemap plots the boxes so they fit into the overall rectangle, years are not placed in sequential order.  In my treemap, both the color and size represent the number of count of names for the year. Using a treemap removes the ability for the viewer to see trends over time.  I have to look in the bottom row for 1980 – 1982 and then jump to the top left to find 1985. I also had to summarize to top names by year, losing any geographical/state information.  One thing this does do well is compare 2 – 3 values for a name across years.  For instance, I can compare the shade of green for Jennifer in 1981, 1983, and 1985 to see the decline in popularity of Jennifer over time.

And the winner is…

I think the Power View does the best job of communicating all of the available information from the data set while being visually appealing. We can see geographic relationships and temporal relationships.  It’s easy to read and quickly see which names were top in which states at any year.

You can download my Excel file here if you would like to check it out or try to improve upon it.

Data Visualization, Excel, Microsoft Technologies, Power BI, Power Pivot, Power View

Infographic vs Power View

Someone on Twitter posted a link to an infographic on the 10 most visited cities in the world, which you can see below.  I’m interested in travel, and I like data, so I looked through it.

Most Visited Cities Infographic
Most Visited Cities Infographic from DailyInfographic.com

After a second or two of looking at it, my BI and dataviz nerdiness kicked in. Here were my thoughts:

  1. I know infographics are cool, and I love that they communicate an interesting data point to the general public in an easily consumable manner.  But I hate when they make them really large/long with a lot of text where you have to read through every bit of text to really get the info.  You lose a lot of the opportunities for preattentive processing with a format like this.
  2. I know I could make a great Power View from this data that could deliver information more efficiently, and I think it would look pretty cool.

Making the Power View

First, I had to get the data and put it in a Power Pivot model. I just copied the info from the infographic and put it in a single table in Excel.  You could split it out and put it into several tables (especially if you split the tourist attractions into 3 individual fields per city), but I didn’t need it for what I was making and the data was all at the same level of granularity in my single table.

Most Visited Excel table
Click to enlarge

I imported that into my Power Pivot model by selecting the table and  clicking the Add to Data Model button on the PowerPivot tab.  I had a vision of where I was going with my Power View, so I also imported a table of countries and flag images into Power Pivot. See yesterday’s post for info on how to import images into Power Pivot. I had originally split out the hotel price and the hotel name as two separate fields, but I found that I liked them together when I was building my Power View so I made a calculated field in Power Pivot to take care of that.  Here are my tables in Power Pivot:

Most visited power pivot 1
Click to enlarge
Most visited Power Pivot 2
Click to enlarge

You’ll notice on the Country Flags table, I had to change the table behavior properties to get the images to render properly in Power View. You can find this on the Advanced tab in Power Pivot. The important part is to set the default image to the flag image. Next I had to relate my most visited cities data to the appropriate flag.  Since there is one row per country in my Country Flags table, this was as easy as relating the country field in the Most Visited Cities table to the country field in the Country Flags table.

Most Visited Power Pivot 3
Click to enlarge

And finally, I built my report. You can download my workbook here.

Most Visited Power View
Click to enlarge

I used a map and cards to display the data. In the map, the bubble size represents number of visits. If you click on a bubble you filter the table below to see just the information for that location. If you hover over a bubble you can see the city name and visits per year. The table is sorted by visitors per year descending.

Why I Like This Power View

Let’s try an experiment: open the file in Excel and then open the infographic in your browser. See how long it takes you to answer the following questions with the infographic and then with the Power View.

  1. How many of the most visited cities are in the United States?
  2. If I were to plan a trip to visit 3 of the cities on this list, which ones would be most convenient to visit in one trip?
  3. What tourist attractions are in Hong Kong?
  4. How many visitors does Dubai get per year?
  5. In what country is Kuala Lumpur located? (I’m hoping you know this off the top of your head, but let’s pretend you need to look it up.)
  6. Which country has the lowest average hotel rate?

I think the Power View would answer all of these questions quicker or just as quick as the infographic.  If you have been looking at the infographic for a while, you might be pretty fast at #3, #4, and #5.  But you will be much slower at #1, #2, and #6 using the infographic.

There are two main reasons for this:

  1. The map provides visual information that you can take in quickly.  You can quickly see locations of these cities on the globe as well as their locations relative to each other. Because of preattentive processing (PDF) we can determine which places have the most visitors (size), which places are close to one another (spatial position), and where cities are located (color – the mauve bubbles stand out) without having to really think about it. I will note that both the Power View and the infographic do a good job of using enclosure to indicate which data points are related.  The Power View has the mauve bars to the left and the line above to separate each city.  I easily know that Malaysia is the country related to Kuala Lumpur and don’t confuse it with the information for Hong Kong above it.
  2. Interactivity allows us to quickly find information through sorting and filtering. For question #3, I just clicked on the bubble for Hong Kong to filter the table below and find that the tourist attractions in Hong Kong include Hong Kong Disneyland, Kowloon Walled City, and Ocean park Hong Kong. To find the lowest hotel rate, I just sorted my table by the Avg Cheap Hotel field ascending and looked in the first card to find Kuala Lumpur has hotels for $16. If you look for this information in the infographic you have to scan through every box until you find Hong Kong or look through all the boxes and remember the lowest hotel rate you have read.

Perhaps you think that it is unfair to compare an interactive Power View with a static infographic. I would argue that it makes a point.  When you build BI solutions including reports and ad hoc data discovery tools you should consider the needs of the end consumer to decide whether you need interactivity. They do make interactive infographics.  This author chose not to include that feature in her design. There are some advantages to the infographic around accessibility. The infographic can render in basically any browser on any device and can be downloaded as an image to where it doesn’t require internet connectivity. Power View maps currently use Silverlight and the maps require an internet connection. The preview of Power BI in Office 365 has an HTML view but it doesn’t yet work with maps, so you aren’t going to look at this Power View on an Android or iOS device.

I don’t think all infographics are inherently poorly designed; I don’t think this one is particularly bad.  I do think that the information could be better arranged to more effectively communicate to the audience. I recommend looking through Stephen Few’s presentation on Infographics and the Brain (PDF) for more info on this subject. I agree with Few that infographics are good when they simplify the data and tell a story to communicate a message with the data in addition to being entertaining.

Excel, Microsoft Technologies, Power BI, Power Pivot, Power View, SSAS, T-SQL

Fun With OPENROWSET

I’ve had several occasions to use OPENROWSET recently in T-SQL.  Although you can use it as an ad hoc alternative to a linked server to query a relational database, I’m finding it useful to get data from other sources. I’ll go into details of how to use it, but first I would like to acknowledge: OPENROWSET is not the answer to everything.  Sometimes you should use a linked server or SSIS or some other method.  I had some specific cases where I felt OPENROWSET made more sense for my task.

Importing Data from an Excel Spreadsheet

Technical need: I had some data that needed to be validated before importing into SQL Server from Excel as part of an upload process in an application.  If it didn’t pass validation, I did not want to import the data. Rather than use SSIS to import the data into a table and then check it, I used OPENROWSET to pull it directly from the Excel file into a temp table, ran all of my validation checks, and then imported the data using SSIS all within a stored procedure that was called by an application.  Field and table names have been changed to protect the innocent.

The stored procedure:

CREATE PROCEDURE dbo.MySproc
@datafile nvarchar(127) 
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    DECLARE
        @sql nvarchar(max)
    ,   @rowcount int
 SELECT
        @rowcount = 0

    DECLARE @FileExists int

    BEGIN TRY
        EXECUTE master.dbo.xp_fileExist @datafile, @FileExists out
        -- Value of 1 = file exists
        -- Value of 0 = file doesn't
        IF @FileExists = 0
        BEGIN
            SELECT
            CONCAT('The file ', @datafile, ' could not be found.') 
                                                   AS reasontext
        ,   NULL AS field2
        ,   NULL AS field3;            
            RETURN -1;
        END
    END TRY
    BEGIN CATCH
        SELECT
            ERROR_MESSAGE()  AS reasontext
        ,   NULL AS field2
        ,   NULL AS field3;
        RETURN -1;

    END CATCH
 TRUNCATE TABLE dbo.finaldest;
    IF OBJECT_ID('tempdb..##tempf') IS NOT NULL
    BEGIN
        DROP TABLE ##tempf;
    END;

--Create temp table to hold values from Excel file
Create table ##tempf (
Field1 [nvarchar](255),
Field2 [nvarchar](255),
Field3 [nvarchar](255),
Field4 [nvarchar](255),
Field5 [nvarchar](255),
Field6 [nvarchar](255)
);
--dynamic sql to populate temp table with values from Excel file
    SELECT
    @sql = N'INSERT INTO ##tempf
   SELECT
   [Field1],
   [Field2], 
   [Field3],
   [Field4],
   [Field5],
   [Field6],
    FROM OPENROWSET (
                ''Microsoft.ACE.OLEDB.12.0'',''Excel 12.0;Database='
                + @datafile
                + ';HDR=YES;IMEX=1'', 
                ''SELECT s.* FROM [Sheet1$] AS S 
                WHERE [Field2] IS NOT NULL AND [Field3] IS NOT NULL'')';

    BEGIN TRY 
        EXEC sp_executesql @sql;
    END TRY
    BEGIN CATCH
        SELECT
            'File not found or invalid file type' AS reasontext
            --ERROR_MESSAGE()  AS reasontext
        ,   NULL AS Field2
        ,   NULL AS Field3;
 IF OBJECT_ID('tempdb..##tempf', 'U') IS NOT NULL 
        BEGIN
            DROP TABLE ##tempf
        END
      RETURN -1
    END CATCH
/*Do lots of data validation here
...
*/
--if data validation tests are passed, write to final destination table
Insert INTO dbo.[finaldest]
   ([Field1],
      [Field2], 
      [Field3],
      [Field4],
      [Field5],
      [Field6])
   Select [Field1],
      [Field2], 
      [Field3],
      [Field4],
      [Field5],
      [Field6] 
      from ##tempf;

 Drop table ##tempf; 
END

To make this work you need to install the Microsoft ACE OLE DB 12.0 provider on your server. You cannot use the Jet OLEDB Provider with 64-bit SQL Server and 64-bit Excel. Once you have this installed you need to change some settings:

EXEC master.dbo.sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', 
N'AllowInProcess', 1
GO
EXEC master.dbo.sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', 
N'DynamicParameters', 1
GO
sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
sp_configure 'Ad Hoc Distributed Queries', 1;
GO
RECONFIGURE;
GO

Here is a blog post that explains the errors you might get if you don’t have these settings correct. It also notes that you can go check your settings in the registry keys. For SQL Server 2012 the path is HKEY_LOCAL_MACHINE -> SOFTWARE -> Microsoft -> Microsoft SQL Server -> MSSQL11.MSSQLSERVER -> Providers -> Microsoft.ACE.OLEDB.12.0.  You can then see that AllowInProcess and Dynamic parameters are set to 1.

RegKeyACEOLEDB

This provider seems to be a bit fussy and there are some things that took me a while to troubleshoot once I got started.  First, make sure that the location of the Excel file you are reading is accessible to the account that is running the SQL engine and doesn’t require admin access. The Linked Server errors that get returned with this really aren’t that helpful, so you have to use your Google-fu to work your way through it.  I used a global temporary table so it would be accessible across connections if it needed it to be (I was having trouble when I used a local temp table).  The application that calls this stored procedure queues up the calls and executes them serially so one can’t override the other.

Returning results from an Analysis Services cube in SQL

Technical need: I had some data (several facts with common dimensions) with calculations in a multidimensional cube and an application that needed that data but couldn’t/wouldn’t consume the MDX result set.

The query: For demonstration purposes I’m using a small query that returns the values for a drop-down box for selecting a member of the date dimension.

select a."[Measures].[DateKeys]" as DateKey, 
a."[Measures].[DateLevel]" as DateLevel, 
a."[Measures].[DateValues]" as DateValue 
from OPENROWSET 
('MSOLAP','Datasource=localhost; Initial Catalog=MyCubeDB;',
  
'with member [Measures].[DateValues] as
    Iif([Date].[Fiscal Calendar].currentmember.level_number = 2, "    " 
    + [Date].[Fiscal Calendar].currentmember.member_value, 
    [Date].[Fiscal Calendar].currentmember.member_value)
  member [Measures].[DateKeys] as 
    [Date].[Fiscal Calendar].currentmember.uniquename
  member [Measures].[DateLevel] as 
    [Date].[Fiscal Calendar].currentmember.level_number

select
 {[Measures].[DateValues], [Measures].[DateKeys], 
                           [Measures].[DateLevel]} on 0 ,
 {Exists(Descendants([Date].[Fiscal Calendar].[FY2014], 1, 
                         SELF_AND_BEFORE), , "MyMeasureGroup") } on 1
 from [MyCube]
') as a;

You can see that I explicitly named my fields, but you can also do a select * to get all the fields from the OPENROWSET. I had to put the fields from the cube in quotation marks to make it work because of the brackets in the field names. For Analysis Services, MSOLAP is the name of my provider.  You must have the correct version of this provider listed under the linked servers on the server where this is being executed.

MSOLAP ProviderFor SQL Server 2012 SP1, you can download the Microsoft Analysis Services OLE DB Provider for Microsoft SQL Server 2012 SP1 here. There is nothing special about the data source and initial catalog. They are exactly the same connection info you would use when connecting to SSAS through Management Studio.  The last part is simply the MDX query.  You should be able to copy that part out and run it in Management Studio connected to your SSAS database and get results.

Inserting Images Into a Database

Technical need: I was building a demo for Power BI and I needed to show some images in my Power View reports in Office 365.  In order for the images to show in Office 365, you must import the images into SQL Server and then import them into Power Pivot. First I downloaded images of flags for each country from the CIA world factbook.  I took the folder of flags and put it in my pictures folder. I already had a table in SQL Server that contained country names and codes.  So I just needed add a flag for each country.

The query:

  CREATE TABLE [dbo].[countryflagsdemo]
  (
  [id] int, 
  [country] varchar(100), 
   varchar(2),
  flagimage [varbinary](max)
  )
 --get ID, country, and code from existing table
  INSERT INTO dbo.countryflagsdemo (id, country, code)
  SELECT id, country, code FROM dbo.Countryflag1;

  DECLARE @i INT; 
  DECLARE @rows INT; 
  DECLARE @code VARCHAR(2);
  SET @i = 1;
--get max id value for loop
  SET @rows = (SELECT COUNT(country) FROM dbo.countryflagsdemo)
  WHILE (@i < @rows)
  BEGIN 

    DECLARE @sql VARCHAR(MAX); 
    SET @code = (SELECT code FROM dbo.countryflagsdemo WHERE [id] = @i);
    SELECT @sql = 'UPDATE dbo.countryflagsdemo 
      SET flagimage = 
      (SELECT BulkColumn FROM 
        OPENROWSET (BULK 
            ''C:\Users\mlongoria\Pictures\flags\large\' + @code 
               + '-lgflag.gif'', Single_Blob) as flagimage)
            WHERE  = ''' + @code + ''';'

    EXEC (@sql);
    SET @i = @i + 1; 
    --PRINT @code;

END 
GO

I figured out how to do this based upon this MSDN blog post. I inserted my other data first and then added the images.  I’m sure you could find a way to do this all in one select statement.  Since this was a one-time thing I didn’t see the need to find another way. It should be noted that the bulk insert must pull from a location on your computer; you can’t use links to the images on the internet. I downloaded the images in a folder. Each image name was the abbreviation for the country. If you are going to import this data into PowerPivot, be sure you change the table behavior properties so it will recognize the varbinary field as an image.

 

There may be other ways to accomplish my goals, but it’s nice to understand how to use OPENROWSET and have that in my toolbox.

Books, Data Visualization, Data Warehousing, Excel, PerformancePoint, SSAS, SSIS, Uncategorized

The One Book

I tend to get some variation of the following question as I present at SQL Saturdays and work with clients:

What is the one book I should read to gain a good understanding of this topic?

There are many great books out there on business intelligence, data warehousing, and the Microsoft BI stack.  Here is my list of those that had a significant impact on my learning and career or that I consider the current defining book on the subject.

Data Warehousing/Dimensional Modeling –  The Data Warehouse Toolkit by Ralph Kimball and Margy Ross (link is to the newest edition of the book) defines dimensional modeling, walks through many common data scenarios and explains recommended data modeling and ETL practices. If you are designing, building, or enhancing a dimensional data warehouse, please read this book.

Master Data ManagementMaster Data Management by David Loshin thoroughly explains the resources required, helps clarify goals, and discusses the challenges of implementing a master data management program in your organization. The book also discusses architecture options and provides a roadmap that you can adapt to your project. I actually did an independent study class based upon this book as part of my MBA. I would recommend it to anyone just getting into MDM as a great platform agnostic view of MDM.

Agile Data WarehousingAgile Data Warehousing Project Management: Business Intelligence Systems Using Scrum by Ralph Hughes is an enlightening read about applying the agile process to data warehousing by someone who understands both.  Use this book to help set expectations at the beginning of projects, estimate effort, decide what should be included in an iteration, and provide value early in your project.  It addresses DW enhancements as well as new DW projects.

SSISProfessional Microsoft SQL Server 2012 Integration Services contains great information whether you are just learning SSIS or you just need to understand the new features in 2012. It covers each task and data flow component, offers tips for performance tuning, and provides great examples.  If you haven’t moved up to SQL Server 2012 yet, you will want to learn about the project deployment model, environment variables, and project-level connection managers, which are all covered in this book.

SSAS TabularMicrosoft SQL Server 2012 Analysis Services: The BISM Tabular Model by Marco Russo, Alberto Ferrari, and Chris Webb
There aren’t that many books available on SSAS tabular models since it is fairly new. This one got me through a couple of projects as I learned tabular cubes and DAX.  You will read almost every page while building your first tabular cube.  Then you will return to find helpful tips on writing DAX calculations.

Data Visualization –  Information Dashboard Design by Stephen Few (link is to the newest edition of the book)
This book is what got me interested in data visualization.  The way I design reports and dashboards changed after I read it. I was torn whether to go with this book or Show Me the Numbers, but I think Information Dashboard Design covers the main points of Show Me the Numbers that I most want people to understand (it was also the first of Few’s books that I read). Today’s technology enables us to make reports that are high on shiny and low on actionable information. I see two troubling trends: reports that are just wide tables with several hundred lines that no one can read at one time, and really shiny dashboards with lots of bells and whistles that make them look really cool. This book will tell you there is an appropriate time and place for tables and dashboards.  And while visual appeal is good, it should not be at the expense of effectively communicating the intended message of the information.

Up Next

I just picked up some books that look very promising and may cause me to update my list above:

 

Happy reading!