Month: February 2015

Improving Performance in Excel and Power View Reports with a Power Pivot Data Source

On a recent project at work, I ran into some performance issues with reports in Excel built against a Power Pivot model. I had 2 Power Views and 2 Excel pivot table reports, of which both Excel reports were slow and one Power View was slow. I did some research on how to improve performance to make the reports return results more quickly, and I found some great resources that I’ve compiled here in the hopes that they help you as well.

Purposeful Sorting of Source Queries

Alberto Ferrari suggests that you check the sorting and design of the tables you import into Power Pivot.  It seems that sort order does not significantly affect the final size of the data in Power Pivot, but it can affect performance when querying the model. I believe the goal is to order the columns based upon the number of distinct values ascending. I didn’t see a measurable improvement in querying my Power Pivot model after implementing this, but the benefits realized probably depend a lot on the type and size of your data.

Power Pivot Data Source Tips

André Van de Graaf has a great list of Power Pivot tips on his blog.  My favorites are:

  • Use views to import data from a database.
  • Import only the columns that are useful and analytically relevant.
  • Reduce the number of rows to import by including only those you are going to analyze (e.g.: perhaps you only need the last two years of data instead of five).

I used database views to specify only the columns and rows I needed in addition to performing any calculations that could be done on the row-level data.  Putting the calculation in the SQL view is often more efficient (and easier for DAX beginners) than creating a calculated column in a Power Pivot table.

Query Performance and Slicers

Power Pivot Pro has a good blog post about report performance when using slicers.  It includes some great tips:

  • Avoid using slicers for fields with many distinct values or fields that come from very large (usually fact/measures) tables.
  • Reduce the number of slicers.  If you can’t reduce the number of slicers, consider turning off cross-filtering on the slicers.
  • Reduce the number and complexity of measures.

It turns out that slicers issue 2 queries each: one to get the list of values and one to determine if those values have related data in the connected pivot table(s).  Slicers are a wonderful usability improvement for report consumers, and I think you should use them in place of report filters whenever it makes sense.  But they can slow performance to an unacceptable level. This is why you may need to remove a couple of slicers from a report that has several. Or you might just turn off cross-filtering.  Since the query for the pivot table basically has to be executed to determine which of the slicer values have related data, simplifying the measures and calcs used in your pivot table helps as well.

My Power Pivot model contained sales data by product. Those products were segmented into 3 product categories.  I had a product category slicer in my report for which I turned off cross-filtering.  With only 3 values, the cross-filtering wasn’t as valuable as the performance improvement achieved by turning it off.

Performance Tuning of Tabular Models

I also re-read the white paper on performance tuning SSAS tabular. Some of the suggestions weren’t applicable since I didn’t control the server where the Power Pivot model sits (my workbook was stored in Office 365). But the tips on changing the DAX calculations and the model are helpful:

  • Convert calculations into calculated columns.
  • Avoid error handling functions.
  • Detect filters using optimized functions like HASONEVALUE() and HASONEFILTER() rather than COUNTROWS().
  • Use CALCULATE() instead of FILTER().
  • Decide whether to use ISBLANK() versus ISEMPTY().
  • Use natural hierarchies, if possible.

Calculated columns are computed when the model is processed (refreshed) whereas calculated measures are computed during query execution. So if your queries are slow, you may want to attempt to use a calculated column to simplify a complex calculated measure, which should improve query execution times.

I used HASONEVALUE() in several calculations that dealt with time intelligence. We did a lot of year over year comparisons, so I had formulas to compare a 4 week period to the same 4 week period in the prior year based upon the fiscal calendar.

Qty Sold Last 4 Wks Prior Yr:=if(HASONEVALUE('End Date'[FiscalWeekKey]),
 CALCULATE([Total Quantity Sold], FILTER (ALL ('End Date' ),
 'End Date'[DaySeq] <= MAX ( 'End Date'[dayseqlastyear]) 
 && 'End Date'[dayseq] > Max('End Date'[dayseqlastyear]) - 28)),BLANK())

HasOneValue checks to make sure the row or column in the pivot table where this calculation is used includes only 1 week so that I can be sure I have the correct 4 week period.

This project reinforced the fact that DAX CALCULATE() function is very powerful and usually much faster than FILTER(). I had a few calculations that started out taking minutes to return for a model that had one fact table with about 8 million rows and 3 small dimension tables.  I rewrote the calculations to remove the FILTER() function and just use CALCULATE() with the same criteria in the filter clause, and they returned the correct values in under a second.

Conclusion

In the end, I used a combination of these tips for my project.  The changes that had the biggest impact were:

  • I reduced the number of rows I was importing.  I originally was pulling all data from 1/1/2010 forward, but I changed it to show the current calendar year plus the previous 3 years.
  • I turned off cross-filtering on some of my slicers.
  • I rewrote some calculated measures to use CALCULATE and remove FILTER()

As PowerPivotPro said almost 3 years ago, Slicer cross filtering is very often the #1 source of slowdown. It is still true today, and it will be one of the first places I look when I am faced with performance issues in Excel reports in the future.