I spent a good bit of time looking for the definitions/descriptions of the TMSCHEMA DMVs that allow us to view metadata and monitor the health of SSAS 2016 tabular models. As far as I can tell there are no details about them on any Microsoft site. Many of the columns are obvious, but there are a few fields that show IDs rather than descriptions (e.g., ExplicitDataType in TMSCHEMA_COLUMNS, Type in TMSCHEMA_DATA_SOURCES). It would be great to get the DMVs documented similar to the MDSCHEMA DMVs as they are quite useful for tasks like documenting your tabular model. Since the TMSCHEMA DMVs work in Azure Analysis Services as well, I have logged this request on the Azure AS User Voice for that. Please lend me a vote so we can make this information more easily available.
Back in June, I posted the feedback I received on the abstracts I submitted to PASS Summit 2016. I wasn’t originally selected to speak, but I did have one talk that was selected as an alternate. It turns out that a couple of speakers had to cancel, and I am now speaking at PASS Summit. PASS contacted me two weeks ago to ask if I could fill in. Luckily, my session that was chosen as an alternate was one that I give frequently at SQL Saturdays, so it was feasible to get it ready to present at PASS Summit.
I am excited and honored to be speaking at PASS Summit, but I’m not stressed about it. Part of it is that I’ve only had two weeks to stress, and part of it is that I have quite a bit of experience speaking and I feel confident that I can deliver a decent (if not downright good) session. I’m still updating my presentation and practicing the delivery in preparation for this week, but I feel comfortable with the process of preparing. I’ve spoken at PASS Summit and many other conferences and user groups over the last five years. Over that time I have developed a process for preparing to speak, which includes building and editing the presentation as well as some disaster recovery measures. Once I have checked all the boxes I can feel that I have sufficiently prepared. It doesn’t mean everything will go perfectly, but it helps make sure I haven’t forgotten anything.
There are a couple of things I hope you take away from this.
- If you are selected as an alternate, it is a real possibility that you will be added to the schedule. This worked out nicely for me because I was already scheduled to be at PASS Summit and my presentation was already in a fairly finished state. You will definitely never be chosen if you never submit. So if you are interested, throw your hat in the ring next year.
- Giving a good presentation (and feeling confident about it) takes practice. If you are just beginning your speaking efforts in the SQL community and you are nervous about it, know that it gets easier (for most people) as you gain more experience. If you enjoy it, keep putting yourself out there. Speaking is something I thoroughly enjoy these days, and I have (mostly) conquered the nerves. Some of that is because I have experienced failures in presentations and the world did not end. Some of that is the result of my presentation prep checklist, which helps me feel prepared.
The downside of the last-minute addition is that I haven’t had as much time to market my session. So if you like Power BI, data visualization, or Kansas City BBQ, and you are attending PASS Summit, I invite you to join me on Friday morning at 8 am in room 3AB. I’m discussing how to apply good data visualization practices in Power BI using data on BBQ restaurants.
Here’s the description for the session I’m presenting:
Overcoming Chartaphobia with Power BI
Do reports in your organization consist mostly of giant tables of data? Perhaps you have gotten as far as adding KPIs or conditional highlighting to the tables. Maybe you have charts, but they are hideous and distracting. Although tables of data presented as reports are commonly found in many organizations, they may be doing you and your users a disservice. We’ll discuss why cognitive psychological studies tell us that graphs and pictures are more effective at communicating trends and comparisons and how to prepare to create good data visualizations. Then we’ll explore how to employ purposeful data viz designs to help users achieve their goal of making informed decisions, using a fun and useful Power BI dashboard. You’ll leave with guidance on how to take boring or unreadable tables of data and turn them into useful and visually appealing reports.
I hope to see you in my session on Friday!
If you were used to documenting your SSAS model using the MDSchema rowsets, you might have noticed that some of them do not work with the new tabular models. For example, the MDSCHEMA_MEASUREGROUP_DIMENSIONS DMV seems to just return one row per table rather than a list of the relationships between tables.
Not to worry, though. With the new compatibility level 1200 SSAS Tabular models came some new DMVs. They are, unfortunately, not documented at this time (as of 3 Oct 2016). I expect them to show up in the official documentation at some point in the future. Until then, I have published a Gist that provides a list of those views as well as some notes on what kind of useful info is in each one, including the main DMVs you will want to check out for model documentation purposes (there are 36 TMSCHEMA DMVs in all).
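For example, each of these DMVs can be queried with a simple DMX statement in an MDX query window in SSMS (swap in whichever TMSCHEMA view you need):

```sql
select * from $system.TMSCHEMA_TABLES
```

DMX supports only very basic filtering (a simple WHERE on column values), which is part of why joining DMV results requires pulling them out of the engine first.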
There are a few fields that show IDs rather than descriptions (e.g., ExplicitDataType in TMSCHEMA_COLUMNS, Type in TMSCHEMA_DATA_SOURCES). I’ve been able to figure out the descriptions that correspond to some of the values, but I’m still missing a few.
As with the MDSCHEMA DMVs, the TMSCHEMA DMVs can be used to document your model. This also works with Power BI models. You’ll just need to change the connection information to your Power BI instance while the .pbix file is open in Power BI Desktop. For more information on analyzing your Power BI model, see Chris Webb’s post. Since DMVs require DMX queries, you are somewhat limited in your ability to join the data from the DMVs together. This makes it necessary to store the data somewhere before transforming and merging them. Since I needed to document a tabular model for a client, I created a Power Pivot model in Excel 2016. You can download my documentation model here. My example model was connected to a quick demo SSAS model I made based upon the Wide World Importers DW database.
My Power Pivot Model
My Power Pivot model uses Power Query to import the data from the DMVs. This enables me to merge data into a model that I feel makes the most sense and doesn’t suffer from too many relationships and hidden tables (which is what would happen if you just imported the results of each DMV query into the model). In order to make this work for any model, I employed a trick I learned from Chris Webb to avoid duplicating connection information across queries. I built my model with the normal embedded connection strings and then converted them to use the Connection Info tables. As I did that, I ran into a Power Query limitation, as evidenced by the error message below.
“Formula.Firewall: Query ‘QueryName’ (step ‘StepName’) references other queries or steps and so may not directly access a data source. Please rebuild this data combination.”
Ken Puls has a great explanation of how to get around this issue. Basically, you have to redesign your queries so that you have “staging tables”, which you can then use to build finalized tables. This redesign gets you around the error since the finalized tables don’t have the embedded source connection, just a reference to the results from another query.
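A rough sketch of that pattern in Power Query M follows. The query names here are illustrative; TabularInstanceName and TabularDBName are the two parameter queries this post’s workbook uses, and AnalysisServices.Database is the standard M connector for SSAS.

```m
// Staging query (named e.g. "Tables Staging"): the only query that
// touches the data source directly.
let
    Source = AnalysisServices.Database(
        TabularInstanceName,
        TabularDBName,
        [Query = "select * from $system.TMSCHEMA_TABLES"]
    )
in
    Source

// Finalized query: references only the staging query, so it has no
// embedded connection and does not trip the Formula.Firewall error.
let
    Source = #"Tables Staging"
in
    Source
```

Any merges or transformations then happen in the finalized queries, one step removed from the data source.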
This Excel file documents:
- Database and model info
- Tables and source queries
- Security roles, membership, and permissions
- Perspectives and the columns, hierarchies, and measures they contain
The Database Info tab uses cube functions to display the values shown. The rest of the tabs use pivot tables and slicers.
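Cube functions pull individual model values into worksheet cells. As a sketch: "ThisWorkbookDataModel" is Excel’s default connection name for the workbook data model, and the measure name below is illustrative, not one from the actual workbook.

```
=CUBEVALUE("ThisWorkbookDataModel", "[Measures].[Table Count]")
```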
You can download and use this model if you have Excel 2016 or 2013 with Power Query. Just download the file, change the values in the TabularInstanceName and TabularDBName queries, and refresh. I was able to open the file and update the queries with no issues and no changes in Excel 2013.
This is great if you just need documentation or a quick way to look up specific information. But we could really analyze this using visualizations other than tables in Power BI Desktop.
My Power BI Model
I imported the Excel model into Power BI, made a few adjustments, and then added some visualizations. This report is a bit pared down from a visualization standpoint compared to the Excel file because I eliminated some of the data that wasn’t analytically relevant for me.
Click here to view this report in Power BI.
On the first page, I have several summary numbers that tell me about the contents of the model. It gives me a decent idea of the size and complexity of the model.
The second page uses a bar chart as an interactive filter against a list of table source queries and columns.
The third page shows relationships between the tables. Slicers on the left filter both the force-directed graph and the table. Smart filters at the top of the page let you filter the Table and To Table in the relationships.
The last page shows security by role and table, again using column charts as interactive filters.
You can download the Power BI model here.
Update: Check out the Demo Day video I made on the BlueGranite blog for further explanation and a demo of the Tabular Model documentation.
My friend and coworker Melissa Coates (aka @sqlchick) messaged me the other day to see if I could help with a DAX formula. She had a Power BI dashboard in which she needed a very particular interaction to occur. She had slicers for geographic attributes such as Region and Territory, in addition to a chart that showed the percent of the regional total that each product type represented. The product type was in the fact/data table. Region and territory were in a dimension/lookup table and formed a hierarchy where a region was made up of one or more territories and each territory had only one region.
Back in the MDX days this would be easy because we could go up the hierarchy (assuming the hierarchy is called Geography) using something like
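(The original snippet isn’t reproduced here; the following is a sketch of the idea using the MDX Ancestor() function, with assumed dimension, hierarchy, and level names.)

```mdx
-- Sketch only: assumes a Geography user hierarchy with a Region level
Ancestor(
    [Geography].[Geography].CurrentMember,
    [Geography].[Geography].[Region]
)
```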
DAX doesn’t have this same capability, so you have to approach it a bit differently.
I made a quick demo model to mimic the circumstances.
- The Geography table contains an ID as well as Region and Territory.
- The Product table contains an ID and Product Name.
- The Product Type is actually in the fact/data table along with the Sales Amount.
- The relationships between the tables are many:1 from the Sales table to the lookup tables, both with cross filter direction set to single.
The ultimate calculation we wanted was <regional product type sales amount> / <region sales amount total for all product types>.
The initial calculation we started with was:
Percent Of Region Total := CALCULATE ( SUM ( 'Sales'[Sales Amount] ), ALLSELECTED ( 'Geography'[Region] ), ALL ( 'Geography'[Territory] ) ) / CALCULATE ( SUM ( 'Sales'[Sales Amount] ), ALL ( 'Sales'[Product Type] ), ALLSELECTED ( 'Geography'[Region] ), ALL ( 'Geography'[Territory] ) )
Notice that the numerator and denominator are the same, except the denominator overrides the query context for Product Type. This calculation worked fine when a single region was selected and no territories were selected. Once a territory is selected, it limits the scope to just that territory rather than the entire region, which is not desired.
In order to make sure I understood the query and filter context, I added two calculated measures:
TestFilters := IF ( HASONEFILTER ( 'Geography'[Region] ), "1", "Many" )
TestValues := IF ( HASONEVALUE ( 'Geography'[Region] ), "1", "Many" )
I put these measures in a pivot table along with my product types and my geography slicers, and I selected no regions and one territory. The TestFilters measure returned “Many”, but the TestValues measure returned “1”. So I had my first clue as to how to find my desired region.
Next I worked on my numerator in my final calculation. I knew my measure would include the sum of Sales Amount, so I started with that and added CALCULATE() to it in order to change the context. I needed to remove the territory context so I started with
Calculate(Sum('Sales'[Sales Amount]), ALL('Geography'[Territory]))
Then I needed to group those results by region and filter to only the desired region, which I did using MAXX(VALUES(Geography[Region])). If I selected a single region or a single territory, I would get only one region back in the values. MAXX() was used to retrieve a single result rather than a table. My numerator ended up as:
Region Contr Numerator := MAXX ( VALUES ( Geography[Region] ), CALCULATE ( SUM ( Sales[Sales Amount] ), ALL ( Geography[Territory] ) ) )
This way, the context of the region corresponding to the selected territory is preserved outside of the CALCULATE() by VALUES('Geography'[Region]), while the territory filter is removed inside the CALCULATE() to provide the region total. This formula works when one region and no territories are selected, as well as when one territory and no regions are selected.
The denominator is written similarly, with the addition of the ALL('Sales'[Product Type]) filter inside of the CALCULATE().
Region Contr Denominator = MAXX ( VALUES ( Geography[Region] ), CALCULATE ( SUM ( Sales[Sales Amount] ), ALL ( 'Sales'[Product Type] ), ALL ( Geography[Territory] ) ) )
To finish up, I created my final calculation and added a bit of error handling for cases when no regions or territories are selected:
Region Avg = IF ( HASONEVALUE ( 'Geography'[Region] ), DIVIDE ( [Region Contr Numerator], [Region Contr Denominator] ), BLANK () )
These calculations provided the desired result.
Note that this only works with the cross filter direction set to single in the relationships. Also, originally I had a Product Type attribute in my Product table, used DAX to add it to my Sales table (to replicate the situation we were solving for), and then forgot and put the Product Type from Product in my visuals. This clearly didn’t work because I didn’t set my context in the calculations using the field from the Product table.
Hopefully this will come in handy if you have some interesting filter/slicer requirements in your reports.
I just finished updating my Power BI updates list for the last time. I started the list in November as a way to keep track of all the changes. When I first started it, they weren’t posting the updates to the blog or the What’s New sections in the documentation. Now that Microsoft is on a steady cadence of monthly releases and doing a great job of communicating updates, there isn’t much need for this except as a fun data source. It’s also been about a year since Power BI V2 became generally available, so it seems like a good stopping point.
I removed the link to my updates page, but I left the page available for anyone who has it bookmarked and put a note on the page that the list is no longer being updated. I want to leave it in place because I appreciated the comments and the Power BI visuals people made using the list as a data source.
If you would like a copy of it to continue your own list or to use as a data source, you can access the file at http://1drv.ms/21Kjk3f.
Update: The ability to change the color of a KPI was delivered in August 2016!
Color blindness, or color vision deficiency (CVD), affects 1 in 12 men and 1 in 200 women in the world. The chances are good that you have met someone who is colorblind, but you may not have realized it. I know at least two people in the #sqlfamily who are colorblind. There are many types of CVD, but the most common is deuteranomaly (red/green color blindness). Those with deuteranomaly have trouble distinguishing shades of red and green as well as some shades of gray, purple, and bluish green.
And which colors do we love to use with KPIs? Red and green, of course! Color is a very powerful tool in data viz. We use it to indicate meaning and to draw attention to something important. KPI boxes are used to display key metrics in an efficient manner. These key metrics are usually rather important, and our users need to be able to see their status at a glance.
I quite like the design of the KPI boxes in Power BI, but for some reason they were created without the ability to adjust the color associated with the state (good/bad). Shown below, they use the common red/green color scheme.
This may look fine to you, but if you are a deuteranope (green-blind), it looks like this:
That makes it much more difficult to tell how things are going. If I were a deuteranope, I would have a hard time getting the same quick information from the KPI boxes. There are some visual clues in the KPI box to help those with CVD: there is an indicator (! for bad and ✓ for good) as well as the percent above or below the goal shown under the value. But both of those elements are rather small, and I have found no way to increase their size within the KPI box. For those without CVD, the color shown immediately communicates the state without me having to concentrate on the text within the KPI box. Color is a preattentive attribute. I notice it without conscious effort. But I need to be conscious of my audience’s ability to do the same with my color choices.
The video below shows what Power BI KPIs look like through various types of CVD.
If you work at a small company where you know no one is colorblind, or if you use Power BI for personal projects only, you may not care. But if you are an organization that makes data viz for the general public, I encourage you to be aware of what your audience sees and adjust your color choices so you are effectively communicating to your entire audience.
Blue and orange is a recommended palette to avoid common CVD issues. Unfortunately, I can’t change the Power BI KPIs to blue and orange, nor can I change the intensity of the hues to make them more easily distinguishable. If you agree that this functionality should be available in Power BI KPI boxes and would like to make your voice heard, please vote for my suggestion at:
Microsoft watches the suggestions on that site and has already implemented several of them once they were shown to be popular. Please help me out and spare a vote or two for the color vision deficient among us.
Update (25 Apr 2016): Microsoft saw my request and has already committed to getting this functionality into an upcoming release! You can see Miljan Braticevic’s comment below, and the idea is already marked as planned at the link above. That was a quick response!
This week I was asked to create a matrix in a Power BI report that looks like this:
To my surprise, Power BI only lets you put multiple values on columns in a matrix. You can’t stack metrics vertically. Note: this is true as of 8 Jan 2016 but may change in the future. If you agree that this should be a feature in Power BI, please make your voice heard and vote for this idea on the Power BI forum and encourage others to vote for it as well.
Please vote for ‘Matrix Measures or Values on Rows instead of Columns’ in Power BI. It is much needed. https://t.co/Du2TyEb0xU
— Meagan Longoria (@mmarie) January 6, 2016
Since I needed to finish this report, I got a little creative with a disconnected table and some DAX to implement a workaround in my tabular model.
The Disconnected Table
I created a table called Row Order in my tabular model using the following SQL query as a source.
select 1 as [Row Order], 'Lost Customers' as [Customer Metric]
union all
select 2 as [Row Order], 'Inactive Customers' as [Customer Metric]
union all
select 3 as [Row Order], 'New Customers' as [Customer Metric]
I set the Sort By Column property on the Customer Metric column to Row Order and hid the Row Order column.
I already had the base metrics for my matrix:
- Lost Customers – Week (customers considered lost in the current week, based upon the selected date)
- Lost Customers – Month (customers considered lost in the current month, based upon the selected date)
- Lost Customers – Quarter (customers considered lost in the current quarter, based upon the selected date)
- Inactive Customers – Week (customers considered to have moved to inactive status in the current week, based upon the selected date)
- Inactive Customers – Month (customers considered to have moved to inactive status in the current month, based upon the selected date)
- Inactive Customers – Quarter (customers considered to have moved to inactive status in the current quarter, based upon the selected date)
- New Customers – Week (customers who have made their first purchase in the current week, based upon the selected date)
- New Customers – Month (customers who have made their first purchase in the current month, based upon the selected date)
- New Customers – Quarter (customers who have made their first purchase in the current quarter, based upon the selected date)
As a side note, when creating time-based measures I prefer to create the measure such that it accepts a date and then filters off of that, rather than assuming everyone wants to see the current week/month/quarter. I then add an Is Today field on my date table so that if someone does want to see the current week, they can easily select that in a filter on this field. This means that I can see lost customers for any week, not just the current one. Maybe I want to filter my dashboard to the end of December. If I had just created a measure that assumed I wanted last month, I wouldn’t be able to just change my date filter and get the data I want. As another example, lots of people create a metric to calculate sales in the last 6 weeks. But many people assume they want the last 6 weeks from today, rather than allowing the user to choose a date and then calculating sales in the 6 weeks leading up to that day. I much prefer the more flexible calculation.
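A minimal sketch of what such a date-parameterized weekly measure might look like (table, column, and measure names here are illustrative, and the underlying [New Customers] logic is not shown):

New Customers - Week := VAR SelectedDate = MAX ( 'Date'[Date] ) RETURN CALCULATE ( [New Customers], DATESINPERIOD ( 'Date'[Date], SelectedDate, -7, DAY ) )

With an Is Today flag on the date table, filtering to Is Today = TRUE gives you the current week while still letting users pick any other date.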
But having this flexibility in my calculations meant that I couldn’t just put some calculated column for time period in my date dimension. And even if I had made the assumption that I wanted to see the data as of today, it would probably be a separate table or a many-to-many relationship (e.g., 5 Jan 2016 would be in the last week, month, and year).
With my new disconnected table added, I just needed to add a few more measures. First, I needed something to check which customer metric I should be showing:
RowOrder:=Max('Row Order'[Row Order])
I hid the RowOrder measure. Then I created one measure for each time frame (week/month/quarter) that would select the correct measure based upon the Customer Metric on that row.
Weekly Customers:=IF(HASONEVALUE('Row Order'[Row Order]), SWITCH([RowOrder], 1, [Lost Customers - Week], 2, [Inactive Customers - Week], 3, [New Customers - Week], BLANK()), BLANK())
Monthly Customers:=IF(HASONEVALUE('Row Order'[Row Order]), SWITCH([RowOrder], 1, [Lost Customers - Month], 2, [Inactive Customers - Month], 3, [New Customers - Month], BLANK()), BLANK())
Quarterly Customers:=IF(HASONEVALUE('Row Order'[Row Order]), SWITCH([RowOrder], 1, [Lost Customers - Quarter], 2, [Inactive Customers - Quarter], 3, [New Customers - Quarter], BLANK()), BLANK())
Power BI Matrix
I opened my report on Power BI.com and created my matrix.
I put Customer Metric in Rows and Weekly Customers, Monthly Customers, and Quarterly Customers in Values. Then I filtered my table to As of Today = “As Of Today”.
That gives me the matrix I wanted to create. I don’t love that I had to add the extra table, but it works for now until I can achieve this natively with Power BI.