Data Visualization, DAX, Excel, Microsoft Technologies, Power BI, Power Pivot, Power View

Color Coding Values in Power View Maps Based Upon Positive/Negative Sign

Power View can be a good tool for interactive data visualization and data discovery, but it has a few limitations with its mapping capabilities.

  • Power View only visualizes data on maps using bubbles/pies.
  • The size of the bubble on a Power View map can be misleading during analysis for data sets with small values and negative values.
  • By default, Power View can only display one measure on a map, which it uses to determine the size of the bubble.

Microsoft offers other mapping capabilities in Power Map and SSRS maps, so if all you need is a map, you can explore those other tools to determine if they better meet your needs. Check out my presentation on Choosing Your Geospatial Mapping Tool for more info on requirements, capabilities, and limitations of maps in Power View, Power Maps, and SSRS. If you want to see a map in the context of other charts, and you want the interactive highlighting and filtering of Power View, you may look for workarounds to the second and third issues listed above.

I recently had a client express disappointment about the display of negative values in Power View maps. A negative value simply gets a small bubble, and positive value gets a larger bubble. You have to hover over the bubbles to see each value and understand the bubble size that represents the change in sign from positive to negative. This gets a little tedious and minimizes the usefulness of the bubble map in the first place. The example Power View map below shows the year over year change in trespassing incidents in my city from 2013 to 2014. The small bubble has a value of -16, and the large bubble to the right has a value of 6.

PV Map Neg Values

One way to better present this type of data in a Power View map is to have positive values represented in one color and negative values represented in another with the bubble sizes showing the absolute value of the change. This proved to be less than straight forward as I worked my way through it. You can’t just make separate measures for the positive and negative values because Power View won’t let you put two measures on a map in that manner. You can’t simply create a calculated measure that determines the sign of your values and put it in the legend because Power View is expecting categorical data for the legend and it won’t let you put the numerical data field in the legend. Power View also doesn’t recognize calculated measures that return a text value (they don’t show up in the field list), so you can’t create a measure that turns the sign into a text value (“Positive” or “Negative”) in an effort to make it more categorical.

But you can use dynamic segmentation to show the second measure as a dimension. There is a great video on MSBI Academy that walks you through this. Although the video shows two different measures on a map, you can use this method to color code positive and negative values of a single measure. First, create a disconnected table for the number sign attributes. It should contain at least:

  • a text field with a descriptor
  • a field for the minimum value of the range
  • a field for the maximum value of the range.

The video shows storing that table in Excel (which is great if you plan to make changes), but you can now also paste the table directly into Power Pivot without any link to the table in Excel. My final table looks like the image below.

PV Map Sign Table

Next you need a calculation that uses the disconnected table and displays the absolute value of the measure you want to use for the bubble size.

Change In Incidents YOY Pos/Neg:=
     if(CALCULATE(Countrows('Sign'),
     FILTER('Sign',[Change in Incidents YOY] >= Sign[Min] 
     && [Change in Incidents YOY] < Sign[Max])
     ),
     abs([Change in Incidents YOY])
     )

Then you can build your map.  Use the Pos/Neg field for color, the new calculation for size, and your location fields of choice.

PV Map Pos Neg 1 Layer

Note that you will now see a warning message in the Power View fields list that relationships may be needed. You can disregard that. The use of a disconnected table is the cause of that message, and everything is working as intended.

PV Rel Msg

If your map has no drill down levels, you can stop here. Your map will now have the values for positive and negative in the legend with bubbles of two colors on the map. A bubble whose signed value is 5 will be the same size as a bubble whose signed value is -5 with the color indicating the sign of the number. In this analysis, a positive number indicates an increase in trespassing, so I chose a theme and ordered my values in the sign table such that the positive values were assigned the brighter color. These are the bubbles that need attention since trespassing is rising in the related zip codes.

If you do have drill down levels in your map, you have a couple more steps. You need another measure that deals with the drill down path. Without this extra step you end up with pies and unexpected values.

PV Map Pos Neg 2 Layer Bad

My map allows the user to drill down from zip code to beat. My new calculation references the top level of my drill down path ([Zip Code]), the calculation I just created ([Change In Incidents YOY Pos/Neg]), and the second level of the drill down path in my map ([Beat]). Note that this calculation is specific to the drill down path used in the map. If you have several maps with different drill down paths, you will need to create several calculations for this to work.

Change In Incidents YOY Pt 2:=
     IF(COUNTROWS(ALLSELECTED('KCPD Crime Data 2014'[Zip Code]))=1, 
     [Change In Incidents YOY Pos/Neg], 
     CALCULATE([Change In Incidents YOY Pos/Neg], 
     ALLSELECTED('KCPD Crime Data 2014'[Beat]))
     )

Now I can put this new measure in place of the original one, and the pies will go away, returning the expected color coded bubbles and values.

PV Map Pos Neg 2 Layers SBS

Excel, Microsoft Technologies, Power View, SSAS

Create a Power View Sheet Connected to an SSAS Tabular Model Without SharePoint

I have created Power View reports based upon SSAS Tabular models many times, but I typically go through SharePoint to get my data connection from a BISM connection file. I am now working on a project where I need to create Power Views connected to a tabular model without using SharePoint.  The way to do this was not immediately obvious to me, so I thought I would share this information. If you click Power View on the Insert Tab, you will get the following message: “Power View needs data to work with. Select a range of cells that contain data and then on the Insert tab click Power View.” Excel is assuming you want to you Power Pivot as your data source.

Instead, add a new data connection to your workbook that points to an SSAS Tabular model.

  1. On the Data tab, Choose From Other Sources -> From Analysis Services.
  2. When the dialog box appears, enter the server name and choose Windows Authentication. Then click Next.
  3. Select the appropriate database and perspective. Click Next.
  4. Update the file name and friendly name, if desired. Click Finish.
  5. Now that you have your data connection in your workbook, an Import Data dialog box will appear. Choose Power View Report and click OK.

A Power View sheet will be created in your workbook.  If you need another Power View sheet:

  1. Go to a regular Excel worksheet (not a Power View sheet).
  2. Click on the Existing Connections button.
  3. Double-click on the appropriate connection in the workbook.
  4. Choose Power View Report. Click OK.

The Office help documentation for this topic is here.

I also made a quick video that shows me creating two Power View tabs connected to an SSAS tabular model in an Excel workbook.

Data Visualization, Excel, Microsoft Technologies, Power Map, Power View, SQL Saturday, SSRS

Choosing A Mapping Tool in the SQL Server BI Stack

With the addition of the Power BI suite to the SQL Server BI stack, there are now 3 main options for creating data-driven maps. I have been working on a presentation to help you choose which mapping tool is appropriate for your needs on a given project. I gave the presentation for the first time in SQL Saturday Chicago, and I’ll be presenting it again at the first SQL Saturday – BA Edition in Dallas on May 3. I’ll share the highlights of the information I provide in my presentation in this blog post, and you’ll just have to attend a SQL Saturday where I’m speaking to get the rest of the details.

There are 3 main tools between Excel, SQL Server, and SharePoint that can visualize data on maps:

  • Power View
  • Power Maps
  • Reporting Services

There are also a few apps for Excel 2013 that create maps (Bing Maps, Geographic Heat Maps). I’ll explain the Excel apps in a later blog post, and focus on the more mainstream tools for today. Although you can make maps with each of these tools, there are some significant differences in features, time to delivery, and system requirements for each of these tools. The image below summarizes the pros and cons for each of these tools.

Maps OverviewPower View is the easiest to use, but it has the most limited features. The drag and drop interface allows you to build a map in a matter of minutes. But this ease of use comes at the price of limited customization. The only type of graph you get on a map is a bubble chart, which turns into pie charts if you add a category. You can only build one layer of data on your map. Also, fonts and colors are limited to the themes available within Power View. Power View offers great interactivity and cross-filtering capabilities. You can set up your map so it drills down from country to state to city. You can use slicers and other graphs to filter and highlight data on a map. An example of using a map to filter a related table in Power View can be found here. Power View reports are usually visually appealing and lend themselves well for presentations. And if you are using Power View inside of SharePoint, you can export your Power View to a PowerPoint presentation and retain interactivity.

Power Map’s strongest features are a time axis that enables you to watch trends over time, the ability to show multiple layers at once, and the storytelling capabilities that come from scenes and tours. Power Maps have more choices for graphs, including columns, bubbles, heat maps, and region shading. But there is no drill-down capability to see an increased level of detail. Power Map also lacks the ability to filter data.  For example, if you have sales data for three products and you only want to see sales for Product A, Power Map doesn’t offer a way to filter to just Product A. This means that you may have to add some DAX calculations to your Power Pivot model to facilitate your map creation. DAX has its own learning curve, so this increases the time required to build the map if you have to learn DAX and alter your Power Pivot model first. Update: As of Sept  2014 you now have the ability to filter your data in a Power Map. The great thing about Office365 is that you can get monthly updates.  The inconvenient thing about that is I have to go back and update blog posts when this happens. Power Map can facilitate great storytelling with data through the use of scenes and tours. You can look at a particular geography and time period in each scene and combine scenes into a tour, then you can share that tour to take users on your journey to gain insight.

Although Reporting Services isn’t getting much love from Microsoft’s marketing or engineering groups, it is still a very viable option for creating maps. While SSRS can require more technical expertise and time to deliver a finished product, it allows the most customization of the three tools and offers subscriptions (time-based and data driven) and multiple rendering and export formats. It also allows custom shapes, so you aren’t limited to whatever the Bing Maps API can do. This helps with custom geographical regions within a company or its customers. It also means you can map more than just points on a globe; you can map floor plans and other shapes you can capture with SQL Server spatial data types. Where Power Map and Power View don’t necessarily require a lot of infrastructure to share a report, SSRS reports require a Report Server or SharePoint Server to make a report available to users.

I’ve developed the following high-level categories to be used to evaluate the appropriateness of each tool for your map.

Maps Eval CriteriaI used these evaluation categories to summarize my findings for each tool in the table below. You’ll notice that I split Power View into Power View for Excel and Power View for SharePoint. This is because these two different development environments have some different system requirements, data sources, and methods of consumption for the resulting Power View report. Although there are two different modes of Reporting Services (Native and SharePoint Integrated), this difference only affects system requirements so SSRS was kept as a single category.

Maps SummaryFor system requirements, note that Power View works with Excel 2013 Professional Plus and O365 Pro Plus and a few other versions, but you must have an Office 365 Pro Plus subscription to get the GA version of Power Map.

Although the Power Map release notes state that you can use it with a 32-bit processor, large amounts of data or graphics intensive map layers can slow the computer to halt or cause Excel to crash. If you can help it, it’s best to have a 64-bit computer for Power Map development or consumption in Excel.

You’ll notice that the list of data sources for SSRS is fairly long compared to the other development environments, but don’t let that mislead you. I’m just listing direct data sources. Although Power Pivot only takes up one line, it can connect to a multitude of data sources, and Power Query extends those options even further. SSAS Tabular also has many options for data sources.

Also, be aware that bubble sizing in Power View for small values can be misleading. This issue is compounded by the fact that turning on data labels for a Power View map only labels the location and not the amount related to the size of the bubble.

In my presentation at SQL Saturday Chicago, one of the attendees expressed the opinion that Power Maps tend to be very busy and it’s difficult to tell where to look when you first encounter one.  I don’t disagree with that, so I’ll throw in the warning that it can be very easy to make a Power Map that is visually appealing but doesn’t provide any insight into your data. I think there are some good use cases where Power Map is the best tool to deliver insight, but not every map needs a time axis and multiple layers/multiple graph types. Another limitation of Power Map is that is can only be opened from the Excel file that contains it (there’s no way to view it in a web browser). While I expect that to change in the future, this is the current state today. Outside of exporting a tour to video, the development environment and the viewing environment are one and the same. To create a new Power Map or insert an existing Power Map, you must go to the Insert tab in Excel and choose Map. This will open the Power Map window where you can view and update your map.

I also want to point out that SSRS is the only one of the three tools that can make use of SQL Server spatial data types. If you would like to learn more about spatial data types in SQL Server, Hope Foley has a great presentation on the subject and has her slides and scripts posted on her blog. Jason Thomas has some good blog posts on SSRS maps.

Data Visualization, Excel, Power Map, Power View, SQL Saturday, SSRS

I’m Speaking About Geospatial Data Viz and Data Viz Best Practices in April

I will be speaking at two SQL Saturdays in April.  First, I will be at SQLSaturday #297 in Colorado Springs on April 12.  I’ll be presenting my session on The Accidental Report Designer: Data Visualization Best Practices in SSRS.  See my previous post on this subject to understand why I think it is an important message for all data professionals.  In this session, I share guidance on evaluating the effectiveness of data visualizations and a few good tips and demos to take your data visualizations from chaotic and ineffective to professional and powerful.

Next, I will be at SQL Saturday #291 in Chicago on April 26.  I will be presenting a new session to help you Choose Your Geospatial Mapping Adventure. Here’s the abstract:

You have geospatial data and you know that plotting it on a map will deliver insights. Or maybe you keep hearing “geospatial” as a current BI buzzword and you want to be informed about your options when you get to it. The Microsoft BI stack provides several options for visualizing location data. But how do you know which tool will best serve your purpose, or what factors are important in making this decision? We’ll identify factors to consider when choosing a tool to visualize your data in maps such as system requirements, availability and type of location data, required level of technical understanding, and more. You’ll leave the session confident in your ability to choose the best tool within Excel and SQL Server.

The Microsoft BI stack offers several great options for presenting data on a map, but your decision of which tool (Power View Map, Power Map, SSRS map, free mapping apps in Excel) can limit you in features, accessibility, and time to delivery. Some of the tools require more technical skill than others, but you lose some of the ability to customize when you go with the highest ease of use.  The session will be filled with tips, explanations of features and requirements, and demos.

If you will be in the area on the dates of these events, please stop by to participate in my sessions or just say hello.

Data Visualization, Excel, Microsoft Technologies, Power Query, Power View

Most Popular Names Visualized in Excel

A couple of months ago, I came across an article in the Atlantic that showed an animated gif of the most popular baby names by state by year. I decided to visualize this data using various add-ins and chart types in Excel.

Retrieving the Data

I found the data on the Social Security Administration website. This data showed the top 5 girl names in a table for each state for a given year. I needed to pull in multiple years, so I used Power Query and wrote a bit of M. I followed the guidance from a great blog post by Devin Knight:

  1. Choose From Web and enter the url for a page containing any year of data.
  2. Modify the M to create a function that accepts a year parameter and uses that to formulate the correct url for each year.
  3. Create a second query from scratch and provide an array of years.
  4. Convert this data to a table and insert a calculated column that pulls in the data from the first query.
  5. Expand the columns, rename columns, and change to appropriate data types where necessary.

For more detail, see Devin’s post. He does a great job of explaining and providing screenshots.

Once I had my data it was time to show it off.

Power Map

My first instinct was to use Power Map since its specialty is showing changes in geospatial data over time. After trying each type of chart in Power Map, I found that the region shading was the closest to what I wanted.

Power Map Baby NameAlthough I wasn’t just trying to duplicate the animated gif, I found it difficult to display this data in an equally effective way in Power Map due to the lack of ability to add data labels or customize annotations. There was no good way to show the most popular name on the state. I chose to use annotations to display the name. You can use data fields to populate the title and description of an annotation. The biggest drawback of using annotations is that there is no way to resize or relocate them. If I were to add annotations for each state, they would overlap and you wouldn’t be able to read many of them nor would you be able to see most of the map. I was able to add enough annotations that there was usually one annotation for each of the names shown on the map at any time. One other issue I noticed is that occasionally an annotation will fail to update on time, so the data in the annotation will be old and correspond to a previous year.  If you watch the video for my Power Map, you will see that the annotation for Virginia gets stuck on 1983 for a few years, but it eventually catches up. Update: Microsoft responded to my question on this.  We figured out that the annotations will update as expected if you set the Time to Date(Year) and then change the time settings to “Data Shows for an Instant”.  I have updated my workbook with these settings. You can watch the video of my Power Map tour here. Outside of the inability to add good labels, watching the states change color over time works well. You can have a legend that shows the names with the colors, but it will show every name from the data set and will not be an effective use of space.

Power View

Power View provides the ability to create bubble maps. I made a bubble map tiled by year.

Power View Baby NameIf you click on a year tile, you can then use the arrow keys to flip through the years; you can also use the scroll bar below the tiles. I think this provides a better visual than the Power Map for this situation since it is easier to see which names go with which states. Because of the bubble colors, I can see how the trend went from Jennifer to Jessica to Ashley in the 1980s. We also notice that there are 5+ top names from the 1990s forward, whereas the 1980s had 2 – 4.  The only real problem I encountered with the map in Power View was the limitation of the colors in the themes. When there are several names on the map, some of the colors can be very similar.  I tried several themes and couldn’t find one where this wasn’t the case.

Pivot Table

Next I decided to stray from maps and see if there were other chart choices that suited this data.  I created a pivot table with data bars and a timeline slicer.

Excel pivot baby nameI can still tell which baby names were most popular across the US and see how the trends change over time. The piece of information that I lose in moving away from maps is seeing how trends affect states that are geographically near each other. For instance, I can tell that Jessica was popular in 1985 in states such as California, Florida, New York, and Michigan. But I don’t immediately recognize that Jessica was popular only in New England, Florida, and the Mountain West and West Coast because Ashley had taken over the    Mid-Atlantic and Midwest. One advantage of the pivot table is that I can select multiple years. So I can learn things like Ashley was the most popular name overall in the 1990s.

Adventures with Apps for Office

Microsoft Research published several free apps for data visualization in Excel.  I decided to use a couple of them to visualize the baby name data. You can access these apps on the Insert tab.  Click Apps for Office -> See All.  Then click Find More Apps at the Office Store. One thing to note about these apps is that they require the data to be formatted in a certain way.  See the instructions in the links below for details on this.

apps for office

Streamgraph

A streamgraph shows how a set of numbers has changed over time using their relative area. It’s kind of like an area chart that doesn’t use an absolute Y axis.

steam graph baby names

I do not think this is an effective way to display the data.  Area graphs and streamgraphs both fail when trying to display a large number of values.  In this specific case, you can only pick a color and the shades are determined for you. It’s difficult to distinguish some of the shades from others. The fact that the values for the names can overlap and aren’t a specific shape (like a rectangle where you only have to judge length) make it difficult to interpret quickly.  You lose most of the advantages of pre-attentive processing because you really have to study the graph to get information from it.

Treemap

A treemap is a hierarchical view of data that breaks it into rectangles and uses size and color to communicate information.

tree map

For some reason there is a size limitation on this treemap so I couldn’t make it big enough to where all the names would fit in the boxes. Since the treemap plots the boxes so they fit into the overall rectangle, years are not placed in sequential order.  In my treemap, both the color and size represent the number of count of names for the year. Using a treemap removes the ability for the viewer to see trends over time.  I have to look in the bottom row for 1980 – 1982 and then jump to the top left to find 1985. I also had to summarize to top names by year, losing any geographical/state information.  One thing this does do well is compare 2 – 3 values for a name across years.  For instance, I can compare the shade of green for Jennifer in 1981, 1983, and 1985 to see the decline in popularity of Jennifer over time.

And the winner is…

I think the Power View does the best job of communicating all of the available information from the data set while being visually appealing. We can see geographic relationships and temporal relationships.  It’s easy to read and quickly see which names were top in which states at any year.

You can download my Excel file here if you would like to check it out or try to improve upon it.

Data Visualization, Excel, Microsoft Technologies, Power BI, Power Pivot, Power View

Infographic vs Power View

Someone on Twitter posted a link to an infographic on the 10 most visited cities in the world, which you can see below.  I’m interested in travel, and I like data, so I looked through it.

Most Visited Cities Infographic
Most Visited Cities Infographic from DailyInfographic.com

After a second or two of looking at it, my BI and dataviz nerdiness kicked in. Here were my thoughts:

  1. I know infographics are cool, and I love that they communicate an interesting data point to the general public in an easily consumable manner.  But I hate when they make them really large/long with a lot of text where you have to read through every bit of text to really get the info.  You lose a lot of the opportunities for preattentive processing with a format like this.
  2. I know I could make a great Power View from this data that could deliver information more efficiently, and I think it would look pretty cool.

Making the Power View

First, I had to get the data and put it in a Power Pivot model. I just copied the info from the infographic and put it in a single table in Excel.  You could split it out and put it into several tables (especially if you split the tourist attractions into 3 individual fields per city), but I didn’t need it for what I was making and the data was all at the same level of granularity in my single table.

Most Visited Excel table
Click to enlarge

I imported that into my Power Pivot model by selecting the table and  clicking the Add to Data Model button on the PowerPivot tab.  I had a vision of where I was going with my Power View, so I also imported a table of countries and flag images into Power Pivot. See yesterday’s post for info on how to import images into Power Pivot. I had originally split out the hotel price and the hotel name as two separate fields, but I found that I liked them together when I was building my Power View so I made a calculated field in Power Pivot to take care of that.  Here are my tables in Power Pivot:

Most visited power pivot 1
Click to enlarge
Most visited Power Pivot 2
Click to enlarge

You’ll notice on the Country Flags table, I had to change the table behavior properties to get the images to render properly in Power View. You can find this on the Advanced tab in Power Pivot. The important part is to set the default image to the flag image. Next I had to relate my most visited cities data to the appropriate flag.  Since there is one row per country in my Country Flags table, this was as easy as relating the country field in the Most Visited Cities table to the country field in the Country Flags table.

Most Visited Power Pivot 3
Click to enlarge

And finally, I built my report. You can download my workbook here.

Most Visited Power View
Click to enlarge

I used a map and cards to display the data. In the map, the bubble size represents number of visits. If you click on a bubble you filter the table below to see just the information for that location. If you hover over a bubble you can see the city name and visits per year. The table is sorted by visitors per year descending.

Why I Like This Power View

Let’s try an experiment: open the file in Excel and then open the infographic in your browser. See how long it takes you to answer the following questions with the infographic and then with the Power View.

  1. How many of the most visited cities are in the United States?
  2. If I were to plan a trip to visit 3 of the cities on this list, which ones would be most convenient to visit in one trip?
  3. What tourist attractions are in Hong Kong?
  4. How many visitors does Dubai get per year?
  5. In what country is Kuala Lumpur located? (I’m hoping you know this off the top of your head, but let’s pretend you need to look it up.)
  6. Which country has the lowest average hotel rate?

I think the Power View would answer all of these questions quicker or just as quick as the infographic.  If you have been looking at the infographic for a while, you might be pretty fast at #3, #4, and #5.  But you will be much slower at #1, #2, and #6 using the infographic.

There are two main reasons for this:

  1. The map provides visual information that you can take in quickly.  You can quickly see locations of these cities on the globe as well as their locations relative to each other. Because of preattentive processing (PDF) we can determine which places have the most visitors (size), which places are close to one another (spatial position), and where cities are located (color – the mauve bubbles stand out) without having to really think about it. I will note that both the Power View and the infographic do a good job of using enclosure to indicate which data points are related.  The Power View has the mauve bars to the left and the line above to separate each city.  I easily know that Malaysia is the country related to Kuala Lumpur and don’t confuse it with the information for Hong Kong above it.
  2. Interactivity allows us to quickly find information through sorting and filtering. For question #3, I just clicked on the bubble for Hong Kong to filter the table below and find that the tourist attractions in Hong Kong include Hong Kong Disneyland, Kowloon Walled City, and Ocean park Hong Kong. To find the lowest hotel rate, I just sorted my table by the Avg Cheap Hotel field ascending and looked in the first card to find Kuala Lumpur has hotels for $16. If you look for this information in the infographic you have to scan through every box until you find Hong Kong or look through all the boxes and remember the lowest hotel rate you have read.

Perhaps you think that it is unfair to compare an interactive Power View with a static infographic. I would argue that it makes a point.  When you build BI solutions including reports and ad hoc data discovery tools you should consider the needs of the end consumer to decide whether you need interactivity. They do make interactive infographics.  This author chose not to include that feature in her design. There are some advantages to the infographic around accessibility. The infographic can render in basically any browser on any device and can be downloaded as an image to where it doesn’t require internet connectivity. Power View maps currently use Silverlight and the maps require an internet connection. The preview of Power BI in Office 365 has an HTML view but it doesn’t yet work with maps, so you aren’t going to look at this Power View on an Android or iOS device.

I don’t think all infographics are inherently poorly designed; I don’t think this one is particularly bad.  I do think that the information could be better arranged to more effectively communicate to the audience. I recommend looking through Stephen Few’s presentation on Infographics and the Brain (PDF) for more info on this subject. I agree with Few that infographics are good when they simplify the data and tell a story to communicate a message with the data in addition to being entertaining.

Excel, Microsoft Technologies, Power BI, Power Pivot, Power View, SSAS, T-SQL

Fun With OPENROWSET

I’ve had several occasions to use OPENROWSET recently in T-SQL.  Although you can use it as an ad hoc alternative to a linked server to query a relational database, I’m finding it useful to get data from other sources. I’ll go into details of how to use it, but first I would like to acknowledge: OPENROWSET is not the answer to everything.  Sometimes you should use a linked server or SSIS or some other method.  I had some specific cases where I felt OPENROWSET made more sense for my task.

Importing Data from an Excel Spreadsheet

Technical need: I had some data that needed to be validated before importing into SQL Server from Excel as part of an upload process in an application.  If it didn’t pass validation, I did not want to import the data. Rather than use SSIS to import the data into a table and then check it, I used OPENROWSET to pull it directly from the Excel file into a temp table, ran all of my validation checks, and then imported the data using SSIS all within a stored procedure that was called by an application.  Field and table names have been changed to protect the innocent.

The stored procedure:

CREATE PROCEDURE dbo.MySproc
@datafile nvarchar(127) 
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    DECLARE
        @sql nvarchar(max)
    ,   @rowcount int
 SELECT
        @rowcount = 0

    DECLARE @FileExists int

    BEGIN TRY
        EXECUTE master.dbo.xp_fileExist @datafile, @FileExists out
        -- Value of 1 = file exists
        -- Value of 0 = file doesn't
        IF @FileExists = 0
        BEGIN
            SELECT
            CONCAT('The file ', @datafile, ' could not be found.') 
                                                   AS reasontext
        ,   NULL AS field2
        ,   NULL AS field3;            
            RETURN -1;
        END
    END TRY
    BEGIN CATCH
        SELECT
            ERROR_MESSAGE()  AS reasontext
        ,   NULL AS field2
        ,   NULL AS field3;
        RETURN -1;

    END CATCH
 TRUNCATE TABLE dbo.finaldest;
    IF OBJECT_ID('tempdb..##tempf') IS NOT NULL
    BEGIN
        DROP TABLE ##tempf;
    END;

--Create temp table to hold values from Excel file
Create table ##tempf (
Field1 [nvarchar](255),
Field2 [nvarchar](255),
Field3 [nvarchar](255),
Field4 [nvarchar](255),
Field5 [nvarchar](255),
Field6 [nvarchar](255)
);
--dynamic sql to populate temp table with values from Excel file
    SELECT
    @sql = N'INSERT INTO ##tempf
   SELECT
   [Field1],
   [Field2], 
   [Field3],
   [Field4],
   [Field5],
   [Field6],
    FROM OPENROWSET (
                ''Microsoft.ACE.OLEDB.12.0'',''Excel 12.0;Database='
                + @datafile
                + ';HDR=YES;IMEX=1'', 
                ''SELECT s.* FROM [Sheet1$] AS S 
                WHERE [Field2] IS NOT NULL AND [Field3] IS NOT NULL'')';

    BEGIN TRY 
        EXEC sp_executesql @sql;
    END TRY
    BEGIN CATCH
        SELECT
            'File not found or invalid file type' AS reasontext
            --ERROR_MESSAGE()  AS reasontext
        ,   NULL AS Field2
        ,   NULL AS Field3;
 IF OBJECT_ID('tempdb..##tempf', 'U') IS NOT NULL 
        BEGIN
            DROP TABLE ##tempf
        END
      RETURN -1
    END CATCH
/*Do lots of data validation here
...
*/
--if data validation tests are passed, write to final destination table
Insert INTO dbo.[finaldest]
   ([Field1],
      [Field2], 
      [Field3],
      [Field4],
      [Field5],
      [Field6])
   Select [Field1],
      [Field2], 
      [Field3],
      [Field4],
      [Field5],
      [Field6] 
      from ##tempf;

 Drop table ##tempf; 
END

To make this work you need to install the Microsoft ACE OLE DB 12.0 provider on your server. You cannot use the Jet OLEDB Provider with 64-bit SQL Server and 64-bit Excel. Once you have this installed you need to change some settings:

EXEC master.dbo.sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', 
N'AllowInProcess', 1
GO
EXEC master.dbo.sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', 
N'DynamicParameters', 1
GO
sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
sp_configure 'Ad Hoc Distributed Queries', 1;
GO
RECONFIGURE;
GO

Here is a blog post that explains the errors you might get if you don’t have these settings correct. It also notes that you can go check your settings in the registry keys. For SQL Server 2012 the path is HKEY_LOCAL_MACHINE -> SOFTWARE -> Microsoft -> Microsoft SQL Server -> MSSQL11.MSSQLSERVER -> Providers -> Microsoft.ACE.OLEDB.12.0.  You can then see that AllowInProcess and Dynamic parameters are set to 1.

RegKeyACEOLEDB

This provider seems to be a bit fussy and there are some things that took me a while to troubleshoot once I got started.  First, make sure that the location of the Excel file you are reading is accessible to the account that is running the SQL engine and doesn’t require admin access. The Linked Server errors that get returned with this really aren’t that helpful, so you have to use your Google-fu to work your way through it.  I used a global temporary table so it would be accessible across connections if it needed it to be (I was having trouble when I used a local temp table).  The application that calls this stored procedure queues up the calls and executes them serially so one can’t override the other.

Returning results from an Analysis Services cube in SQL

Technical need: I had some data (several facts with common dimensions) with calculations in a multidimensional cube and an application that needed that data but couldn’t/wouldn’t consume the MDX result set.

The query: For demonstration purposes I’m using a small query that returns the values for a drop-down box for selecting a member of the date dimension.

select a."[Measures].[DateKeys]" as DateKey, 
a."[Measures].[DateLevel]" as DateLevel, 
a."[Measures].[DateValues]" as DateValue 
from OPENROWSET 
('MSOLAP','Datasource=localhost; Initial Catalog=MyCubeDB;',
  
'with member [Measures].[DateValues] as
    Iif([Date].[Fiscal Calendar].currentmember.level_number = 2, "    " 
    + [Date].[Fiscal Calendar].currentmember.member_value, 
    [Date].[Fiscal Calendar].currentmember.member_value)
  member [Measures].[DateKeys] as 
    [Date].[Fiscal Calendar].currentmember.uniquename
  member [Measures].[DateLevel] as 
    [Date].[Fiscal Calendar].currentmember.level_number

select
 {[Measures].[DateValues], [Measures].[DateKeys], 
                           [Measures].[DateLevel]} on 0 ,
 {Exists(Descendants([Date].[Fiscal Calendar].[FY2014], 1, 
                         SELF_AND_BEFORE), , "MyMeasureGroup") } on 1
 from [MyCube]
') as a;

You can see that I explicitly named my fields, but you can also do a select * to get all the fields from the OPENROWSET. I had to put the fields from the cube in quotation marks to make it work because of the brackets in the field names. For Analysis Services, MSOLAP is the name of my provider.  You must have the correct version of this provider listed under the linked servers on the server where this is being executed.

MSOLAP ProviderFor SQL Server 2012 SP1, you can download the Microsoft Analysis Services OLE DB Provider for Microsoft SQL Server 2012 SP1 here. There is nothing special about the data source and initial catalog. They are exactly the same connection info you would use when connecting to SSAS through Management Studio.  The last part is simply the MDX query.  You should be able to copy that part out and run it in Management Studio connected to your SSAS database and get results.

Inserting Images Into a Database

Technical need: I was building a demo for Power BI and I needed to show some images in my Power View reports in Office 365.  In order for the images to show in Office 365, you must import the images into SQL Server and then import them into Power Pivot. First I downloaded images of flags for each country from the CIA world factbook.  I took the folder of flags and put it in my pictures folder. I already had a table in SQL Server that contained country names and codes.  So I just needed add a flag for each country.

The query:

  CREATE TABLE [dbo].[countryflagsdemo]
  (
  [id] int, 
  [country] varchar(100), 
   varchar(2),
  flagimage [varbinary](max)
  )
 --get ID, country, and code from existing table
  INSERT INTO dbo.countryflagsdemo (id, country, code)
  SELECT id, country, code FROM dbo.Countryflag1;

  DECLARE @i INT; 
  DECLARE @rows INT; 
  DECLARE @code VARCHAR(2);
  SET @i = 1;
--get max id value for loop
  SET @rows = (SELECT COUNT(country) FROM dbo.countryflagsdemo)
  WHILE (@i < @rows)
  BEGIN 

    DECLARE @sql VARCHAR(MAX); 
    SET @code = (SELECT code FROM dbo.countryflagsdemo WHERE [id] = @i);
    SELECT @sql = 'UPDATE dbo.countryflagsdemo 
      SET flagimage = 
      (SELECT BulkColumn FROM 
        OPENROWSET (BULK 
            ''C:\Users\mlongoria\Pictures\flags\large\' + @code 
               + '-lgflag.gif'', Single_Blob) as flagimage)
            WHERE  = ''' + @code + ''';'

    EXEC (@sql);
    SET @i = @i + 1; 
    --PRINT @code;

END 
GO

I figured out how to do this based upon this MSDN blog post. I inserted my other data first and then added the images.  I’m sure you could find a way to do this all in one select statement.  Since this was a one-time thing I didn’t see the need to find another way. It should be noted that the bulk insert must pull from a location on your computer; you can’t use links to the images on the internet. I downloaded the images in a folder. Each image name was the abbreviation for the country. If you are going to import this data into PowerPivot, be sure you change the table behavior properties so it will recognize the varbinary field as an image.

 

There may be other ways to accomplish my goals, but it’s nice to understand how to use OPENROWSET and have that in my toolbox.