Data Visualization, Microsoft Technologies, Power BI

Violin Plots in Power BI

In case you aren’t familiar, I would like to introduce you to the violin plot.

A violin plot is a nifty chart that shows both distribution and density of data. It’s essentially a box plot with a density plot on each side. Box plots are a common way to show variation in data, but their limitation is that you can’t see frequency of values. In other words, you can see statistics such as min, max, median, mean, or quartiles, but you can’t see the individual values nor how often they occurred.

example box plot
Example box plot showing min, max, median, and quartiles

The violin plot overcomes this limitation (by adding the density plot) without taking up much more room on the canvas.

In Power BI, you can quickly make a violin plot using a custom visual. The Violin Plot custom visual (created by Daniel Marsh-Patrick) has many useful formatting options. First, you can choose to turn off the box plot and just show the density plot. Or you can choose to trade the box plot for a barcode plot.

box plot with bar code plot
Box plot with barcode plot in Power BI

Formatting the Violin Plot

There are several sections of formatting for this visual. I’ll call out a few important options here. First, the Violin Options allow you to change the following settings related to the density plot portion of the violin plot.

Formatting options for the density plot in the violin plot.

Inner padding controls the space between each violin. Stroke width changes the width of the outline of the density plot. The sampling resolution controls the detail in the outline of the density plot. Check out Wikipedia to learn more about the kernel density estimation options.

The Sorting section allows you to choose the display order of the plots. In the example above, the sort order is set to sort by category. You can then choose whether the items should be sorted ascending or descending.

Sorting options for the violin plot in Power BI

Next you can choose the color and transparency of your density plot. You have the ability to choose a different color for each plot (violin), but please don’t unless you have a good reason to do so.

Data colors controls the color of the density plot

The Combo Plot section controls the look of the bar code plot or box plot. Inner padding determines the width of the plot. Stroke width controls the width of the individual lines in the bar code plot, or the outline and whiskers in the box plot. You can change the box or bar color in this section. For the barcode plot, you can choose whether you would like to show the first and third quartiles and the median the color, line thickness, and line style of their markers.

Also make sure to check out the Tooltip section. It allows you to show various statistics in the tooltip without having to calculate them in DAX or show them elsewhere in the visual.

Violin Plot Custom Visual Issues & Limitations

This is a well designed custom visual, but there are a couple of small things I hope will be enhanced in the future.

  1. The mean and standard deviation in the tooltip are not rounded to a reasonable amount of digits after the decimal.
  2. The visual does not seem to respond to the Show Data keyboard command that places data in a screen reader friendly table.

As always, make sure to read the fine print about what each custom visual is allowed to do. Make sure you understand the permissions you are granting and that you and your organization are ok with them. For example, I used public weather data in my violin plot, so I had no concerns about sending the data over the internet. I would be more cautious if I were dealing with something more sensitive like patient data in a hospital.

Introducing the Violin Plot to Your Users

I think violin plots (especially the flavor with the bar code plot) are fairly easy to read once you have seen one, but many people may not be familiar with them. In my weather example above, I made an extra legend to help explain what the various colors of lines mean.

Another thing you might consider is adding an explainer on how to read the chart. I used a violin plot with a coworker who does not nerd out on data viz to show query costs from queries executed in SQL Server, and I added an image that explains how to read the chart.

Example explanation of how to read a violin plot

After all, we use data visualization to analyze and present data effectively. If our users don’t understand it, we aren’t doing our job well.

Have you used the violin plot in Power BI? Leave me a comment about what kind of data you used it with and how you liked the resulting visual.

Azure, Azure Data Factory, Microsoft Technologies, Power BI

How Many Data Gateways Does My Azure BI Architecture Need?

Computer with lock protecting data

It’s not always obvious when you need a data gateway in Azure, and not all gateways are labeled as such. So I thought I would walk through various applications that act as a data gateway and discuss when, where, and how many are needed.

Note: I’m ignoring VPN gateways and application gateways for the rest of this post. You could set up a VPN gateway to get rid of some of the data gateways, but I’m assuming your networking/VPN situation is fixed at this point and working from there. This is a common case in my experience as many organizations are not ready to jump into Express Route when first planning their BI architecture.

Let’s start with what services may require you to use a data gateway.

You will need a data gateway when you are using Power BI, Azure Analysis Services, PowerApps, Microsoft Flow, Azure Logic Apps, Azure Data Factory, or Azure ML with a data source/destination that is in a private network that isn’t connected to your Azure subscription with a VPN gateway. Note that a private network includes on-premises data sources and Azure Virtual Machines as well as Azure SQL Databases and Azure SQL Data Warehouses that require use of VNet service endpoints rather than public endpoints.  

Luckily, many of these services can use the same data gateway. Power BI, Azure Analysis Services, PowerApps, Microsoft Flow, and Logic Apps all use the On Premises Data Gateway. Azure Data Factory (V1 and V2) and Azure Machine Learning Studio use the Data Factory Self-Hosted Integration Runtime.

On Premises Data Gateway (Power BI et al.)

If you are using one or more of the following:

  • Power BI
  • Azure Analysis Services
  • PowerApps
  • Microsoft Flow
  • Logic Apps

and you have a data source in a private network, you need at least one gateway. But there are a few considerations that might cause you to set up more gateways.

  1. Your services must be in the same region to use the same gateway. This means that your Power BI/Office 365 region and Azure region for your Azure Analysis Services resource must match for them to all use one gateway.  If you have resources in different regions, you will need one gateway per region.
  2. You may want high availability for your gateway. You can create high availability clusters so when one gateway is down, traffic is rerouted to another available gateway in the cluster.
  3. You may want to segment traffic to ensure the necessary resources for certain ad hoc live/direct queries or scheduled refreshes. If your usage and refresh patterns warrant it, you may want to set up one gateway for scheduled refreshes and one gateway for live/direct queries back to any on-premises data sources. Or you might make sure live/direct queries for two different high-traffic models go through different gateways so as not to block each other. This isn’t always warranted, but it can be a good strategy.

Data Factory Self-hosted Integration Runtime

If you are using Azure Data Factory (V1 or V2) or Azure ML with a data source in a private network, you will need at least one gateway. But that gateway is called a Self-hosted Integration Runtime (IR).

Self-hosted IRs can be shared across data factories in the same Azure Active Directory tenant. They can be associated with up to four machines to scale out or provide higher availability. So while you may only need one node, you might want a second so that your IR is not the single point of failure.

Or you may want multiple IRs to boost throughput of copy activities. For instance, copying from an on-premises file server with one IR node is about 195 Megabytes per second (MB/s). But with 4 IR nodes, it can be as fast as 505 MB/s.

Factors that Affect the Number of Data Gateways Needed

The main factors determining the number of gateways you need are:

  1. Number of data sources in private networks (including Azure VNets)
  2. Location of services in Azure and O365 (number of regions and tenants)
  3. Desire for high availability
  4. Desire for increased throughput or segmented traffic

If you are importing your data to Azure and using an Azure SQL DB with no VNet as the source for your Power BI model, you won’t need an On Premises Data Gateway. If you used Data Factory to copy your data from an on-premises SQL Server to Azure Data Lake and then Azure SQL DB, you need a Self-Hosted Integration Runtime.

If all your source data is already in Azure, and your source for Power BI or Azure Analysis Services is Azure SQL DW on a VNet, you will need at least one On-Premises Data Gateway.

If you import a lot of data to Azure every day using Data Factory, and you land that data to Azure SQL DW on a VNet, then use Azure Analysis Services as the data source for Power BI reports, you might want a self-hosted integration runtime with a few nodes and a couple of on-premises gateways clustered for high availability.

Have a Plan For Your Gateways

The gateways/integration runtimes are not hard to install. They are just often not considered, and projects get stalled waiting until a machine is provisioned to install them on. And many people forget to plan for high availability in their gateways. Make sure you have the right number of gateways and IR nodes to get your desired features and connectivity. You can add gateways/nodes later, but you don’t want to get caught with no high availability when it really matters.

Accessibility, Data Visualization, Microsoft Technologies, Power BI

Tab Order Enhances Power BI Report Accessibility

Confused starLike a diamond in the sky.
How I wonder what you are!
Twinkle, twinkle, little star,
Twinkle, twinkle, little star,
Up above the world so high,
How I wonder what you are!

Confused?

You were probably expecting a different order. Order is an important element when singing a song or telling a story or explaining information.

In western cultures we tend to read left to right, top to bottom, making a Z-pattern. This applies to books and blogs as well as reports and data visualizations. But if you are using the keyboard to navigate in a Power BI report, the order in which you arrive at visuals will not follow your vision unless you set the new tab order property. If you have low or no vision, this becomes an even bigger issue because you may not be able to see that you are navigating visuals out of visual order because the screen reader just reads whatever comes next.

It takes effort to consume each visual, and many visuals need the context of the other visuals around them to be most useful. When we present information out of order, we are putting more cognitive load on our users, forcing them to hold information in their limited working memory until they arrive at another visual that helps put the pieces together to make sense.

What is Tab Order?

Tab order is the order in which users interact with the items on a page using the keyboard. Generally, we want tab order to be predictable and to closely match the visual order on the page (unless there is a good reason to deviate). If you press the tab key in a Power BI report, you might be surprised at the seemingly random order in which you move from visual to visual on the page. The default order is the order in which the visuals were placed on the page in Power BI Desktop, or the last modified order in PowerBI.com if you have edited your report there.

I wrote about the issues with tab order in Power BI back in February and posted an idea for it. So I’m quite happy to see it come to fruition. Not only does it increase usability and accessibility, it also helps meet WCAG Success Criterion 2.4.3: Focus Order (accessibility compliance guidelines, for those who do not geek out on this stuff).

How To Set Tab Order In Power BI Desktop

To set the tab order of visuals on a report page in Power BI Desktop, go to the View tab, open the Selection Pane and select Tab Order at the top of the Selection Pane.

TabOrder1TabOrder2

From there, you can move visuals up and down in order, or hide them from tab order completely. This is helpful if you have decorative items on the page whose selection has no value to the user.

To change the tab order, you can either drag an item to a new position in the list, or you can select the item and click the up or down arrows above the list.

TabOrderCHange.gif

In case you missed it, slicers are now keyboard accessible. If you would like users to select values in slicers before using the other visuals on the page, make sure to put the slicers early in the tab order.

It only takes a minute to set the tab order, but it greatly increases usability for keyboard users.

Data Visualization, Microsoft Technologies, PASS Summit, Power BI

Power BI Visual Usability Checklist

At PASS Summit, I presented a session called “Do Your Data Visualizations Need a Makeover?”. In my session I explained how we often set ourselves up for failure when conducting explanatory data visualization before we ever place a visual on the page by not preparing appropriately, and I provided tips to improve. I also gave examples of visual design mistakes I see often. I polled the audience, and they shared some mistakes that they had seen often or that really bothered them. If you missed my presentation, you can watch it on PASS TV or Youtube.

As a companion to my presentation, I created the Power BI Visualization Usability Checklist. For those who are new to data visualization in Power BI, or those that want to employ some type of quality check, I think this is a good place to start. I occasionally do data viz makeover engagements to help people create a report that is more engaging and more widely adopted. This list draws from that experience as well as the tweaks I find myself making to my own Power BI reports. And now I have added a few things that my PASS Summit audience mentioned – thanks to those who shared their suggestions and experiences!

I’m not here to tell you to always use a certain color theme or font, or that everything should be a bar chart. Data visualization is situational and dependent upon your intended audience. I hope I can encourage you to consider your audience, how they take in information, and what information they are looking for.

This checklist provides guidelines to help make sure your report communicates your intended message in a way that works for your intended audience. It has two pages. The Data Viz Usability Checklist page contains the main checklist for you to use while building or reviewing a Power BI report.

Screenshot of Power BI Visualization Usability Checklist

The Data Viz Usability Concepts page gives you quick definitions and links for further reading about the underlying design concepts that inform my list.

Screenshot of Power BI Visualization Usability Concepts

Download the checklist here. I also have a checklist for accessibility in Power BI reports which you can find here.

If you have a suggestion to add to either list, please leave me a comment!

 

Data Visualization, Microsoft Technologies, Power BI

Considerations for Using Layout Images in Power BI

Using layout images in Power BI has become a popular design trend. When I say layout images, I’m referring to background images with shapes around areas where visuals are placed. This is different from the new wallpaper feature that became available in the July release, which can be used to format the grey area outside your report page and extend the main color of background images.

Layout images can help with spacing and alignment within a report and can help create consistency across reports. They can also help create affordances, using consistent layout and design to make it obvious how users should interact with our reports.

I use layout images in some of my reports, but I don’t think they are necessary on every report. There are a couple of things to consider when using layout images.

  1. Don’t let your layout image take the focus away from the data. This can happen due to lack of color contrast or because the color(s) used in the layout image are much more intense than the visuals on your report page.
  2. While we may strive for consistency in report design, especially in larger organizations, we can’t let a layout keep us from creating the most effective visual to communicate the information in our data. If we start with a layout and limit ourselves to only visuals that fit that layout, that constraint may prevent us from creating a better report. If you have identified the chart type you need to communicate your information, but it doesn’t fit in your 3-column layout because the visual needs to be a bit wider, get a new template that accommodates your visual. I would rather see slightly different templates than ineffective chart types or the right chart types but the visuals squished into a space where they don’t fit (hard to read, truncated labels, etc.).

You can make your own layout images or choose one that someone else has created. PowerBI.Tips offers Power BI layouts in the form of Power BI Templates (.PBIT files) with background images set on the report pages where you replace the data source and repopulate the visuals. The templates also contain report themes.

Frederik Hedenstrom has a grid generator where you can set the width, height, columns, spacing, and colors. Then you can download your image and set it as your page background.

When I make my own layouts, I typically just use PowerPoint. Usually, the layout is the last thing I add rather than the first, but you can do what works best for you. This is the process I use to make layout images:

  1. Take a screenshot of my report page and paste it onto a blank PowerPoint slide.
  2.  Draw shapes (usually rectangles or rectangles with rounded corners) over the screenshot where the visuals are placed.
  3. Delete the screenshot from the slide.
  4. Format the slide background and the shapes (alignment, colors, outline, shadow effects, etc.).
  5. Export the slide as an image.
  6. Import the image as the page background. Adjust the image fit and transparency as needed.

Doing it myself is a bit of work, but it means I get exactly the effect I want. And I’ve gotten pretty quick at it now that I have done it several times. But again, you can take shortcuts using the resources mentioned above. Once I have my background layout image set, I check that it isn’t too distracting by asking myself these questions:

  • Is the background more intense than the visuals to where I look at it before I look at the data visualizations?
  • Do my visuals no longer stand out because the background color is too similar to the colors in my visuals?
  • Can I still clearly read my charts, including all titles and labels now that the background image is in place?

To help illustrate my points, take a look at this example report I have been working on.

PBINoLayoutImage
Version 1: No Layout Image

PBILayoutImage
Version 2: Layout Image Applied

PBILayoutImageTooDark
Version 3: Layout Image Too Distracting

In Version 1, there is no layout image. Some might think the report looks a tad bare. While there is a layout of 3 rows, it’s not immediately obvious.

In version 2, I have applied a layout image. It is subtle, using a light gray background color and a soft shadow around the shapes. It emphasizes the 3 rows, which makes sense in this report. In the top row I have the title and some summary numbers. In the middle row, I’m slicing number of reviews and % recommended by high-level categories. In the bottom row I have one visual that slices average and median ratings by a more detailed category.

In version 3, I changed the background to a dark gray/light black.  With that background, the dark color is the thing in the report that stands out most to me, but it provides no information and doesn’t enhance the user experience more than the subtle light version of the layout image.

Final Thoughts

Layout images can be useful. You can save time by using images created by others, but don’t let a layout needlessly constrain your data visualization or distract from the information in your report.

 

Accessibility, Data Visualization, Microsoft Technologies, Power BI

Power BI Report Accessibility Checklist

In many cases, some small changes can go a long way in making your Power BI reports more accessible for users with different abilities. The checklist below lists considerations you should make in your report design to create more inclusive reports. I’ll update this post as new features are released.

Accessibility Checklist

Last Updated: 29-Nov-2018

All Visuals

  • Ensure color contrast between title, axis label, and data label text and the background are at least 4.5:1.
  • Avoid using color as the only means of conveying information. Use text or icons to supplement or replace the color.
  • Replace unnecessary jargon or acronyms.
  • Ensure alt text is added to all non-decorative visuals on the page.
  • Check that your report page works for users with color vision deficiency.

Slicers

  • If you have a collection of several slicers on your report pages, ensure your design is consistent across pages. Use the same font, colors, and spatial position as much as possible.

Textbox

  • Ensure color contrast between font and background are at least 4.5:1.
  • Make sure to put text contents in the alt text box so screen readers can read them.

Visual Interactions

  • Is key information only accessible through an interaction? If so, rearrange your visuals so they are pre-filtered to make the important conclusion more obvious.
  • Are you using bookmarks for navigation? Try navigating your report with a keyboard to ensure the experience is acceptable for keyboard-only users.

Sort Order

  • Have you purposefully set the sort order of each visual on the page? The accessible Show Data table shows the data in the sort order you have set on the visual.

Tooltips

  • Don’t use tooltips to convey important information. Users with motor issues and users who do not use a mouse will have difficulties accessing them.
  • Do add tooltips to charts as ancillary information. It is included in the accessible Show Data table for each visual.

Video

  • Avoid video that automatically starts when the page is rendered.
  • Ensure your video has captions or provide a transcript.

Audio

  • Avoid audio that automatically starts when the page is rendered.
  • Provide a transcript for any audio.

Shapes

  • Avoid using too many decorative shapes. They are announced by the screen reader when reading the page.
  • When using shapes to call out data points, use alt text to explain what is being called out.

Images

  • When using images to call out data points, use alt text to explain what is being called out.
  • Avoid using too many decorative images. They are announced by the screen reader when reading the page.

Custom Visuals

  • Check the accessible Show Data table for custom visuals. If the information shown is not sufficient, look for another visual.
  • If using the Play Axis custom visual, ensure it does not autoplay. Make it obvious that the user must press the play/pause button to start/stop the changing values.

Across Visuals on the Page

  • Set tab order and turn off tab order on any decorative items.

Power BI Accessibility Features

Tools to Check Accessibility in your Power BI Report

Keyboard Only Navigation

  • Use a keyboard to navigate and interact with your report, without using a mouse.

Color Vision Deficiency

Low Vision

  • Use a mobile device with brightness on low to test mobile reports
  • Use WebAIM or Accessible Colors to check color contrast of text vs background
  • Use The Squint Test to check that a Power BI report makes sense to someone with low vision

If you would like to suggest an update to the list, feel free to leave a comment on this post.

Data Visualization, Microsoft Technologies, Power BI

Choosing a Color Palette For Your Power BI Report

Color is a powerful attribute in data visualization. In a good visualization, it can focus attention and enhance meaning and clarity. When color is used poorly, it creates clutter and confusion. Power BI has a default color palette, but it isn’t always optimal or even appropriate for many reports. Luckily, Power BI allows you to use any color that can be defined by a hex code where visuals allow colors to be changed. With so many choices, choosing a color palette can be overwhelming. Below are some tips to help you choose a good color palette for your Power BI reports.

A color palette is simply a collection of colors applied to the visual elements in your report. What we typically refer to as color is a combination of three main properties: hue (base color on the color wheel), intensity (brightness or gray-ness) and value (lightness or darkness). You can build an engaging and professional looking report with just 6 colors. It’s possible to have fewer colors or more colors, but 6 should cover many common visualization needs. If you are using more than 6 colors, you might want to check that you are optimizing engagement and cognitive load.

  1. Main color – default color on graphs
  2. Color 2 – used when multiple colors are needed in a graph or report
  3. Color 3 – used when multiple colors are needed in a graph or report and Color 2 has already been used
  4. Highlight color – a color used to highlight important data points to make them stand out from other points on the page
  5. Border color – a light color used for borders on tables and KPIs where necessary
  6. Title color – color used for visual titles and axis labels as appropriate

Example Power BI color palette with 6 colors

While your title and border colors don’t have to be variations of gray, gray is a practical color for these purposes when using a white background. You could also use brown, blue, purple, etc. You just want to ensure that your text color has sufficient contrast from its background and that it isn’t more intense than your data colors. I tend to make my border color a tint of the title color.

This is a good place to define a few terms. A hue is a base color without black, white, or gray added, which you might find on a basic color wheel. A shade is achieved by adding black to any pure hue, making it darker. A tint is achieve by adding white to a pure hue, making it lighter. A tone is achieved by adding gray to a pure hue, making it less saturated and more muted.

I think it’s easiest to start your color palette by choosing the main color. This could be inspired by your corporate color palette or logo, your favorite color, or a color associated with the subject matter of your report. Be aware that color has cultural meaning and conveys emotion, and you want to choose a color that conveys the appropriate tone of your report. This can get tricky, so just try to make sure you don’t choose a main color that has a lot of cognitive dissonance with your subject. For instance, if you are creating a report for a U.S. audience about our gun violence problem, you probably wouldn’t use a light, happy, pastel green as your main color. But you could be fine using a bright red, dark blue, orange, black, or several other colors.

You will want to choose a main color that has a medium intensity.  If it is too bright or too dark, you won’t have any room to use a more intense version of that color to focus attention. And since you want your main, secondary, and tertiary color to be the same intensity, it would feel as if everything on the page were yelling at you if all three colors were bold. If you are starting from a corporate color palette, be aware that most brand color palettes were designed for websites and print collateral, not data visualization. They are usually too intense or bright to serve as your main data visualization colors. But you can use a tint or tone of your corporate colors so your reports stay on brand.

Once you have chosen your main color, you need to decide what type of color scheme you would like to use. Common options include:

  • monochromatic – tints and shades of a single hue
  • complementary – colors that are opposite each other on the color wheel
  • analogous – colors that are next to each other on the color wheel
  • split complementary – a base color plus two colors that are adjacent to its complement on the color wheel
  • triadic – colors that are evenly spaced on the color wheel

Example color schemes from ColorHexa.com

My current favorite tool for choosing colors is ColorHexa. It provides hex colors, color schemes (as shown above), tints, shades, and tones.

Once you have made some initial color choices, test it out on a few charts to ensure you can answer yes to the following questions:

  • Are all colors easily distinguishable from each other? If you were to use the main, secondary, and tertiary color in a line chart, could you easily follow the lines as they cross each other?
  • Is your color palette color blind friendly? You can use ColorHexa or Coblis to check this. It’s not always obvious when you have people with color vision deficiency in your intended audience, so it’s better to use colors that are easily distinguishable by those with red-green color blindness (deuteranomaly, deuteranopia, protanomaly, and protanopia).
  • Does your highlight color have high contrast from your other colors so it is obvious that it is being used to draw attention to a trend or data point?
  • If you are using a non-white background color, do your colors stand out sufficiently from your background?
  • When you look at your color choices, do you find the combination generally appealing, balanced, and not overly jarring? This is a bit subjective, but if you look at your colors and have a negative reaction then your audience will probably have a similar reaction.

Finally, be aware that colors display differently on different screens and surfaces. You can put a lot of time and effort into choosing the perfect colors, then share a report with someone and have it look rather different on their monitor or when viewed on a projector screen. If you can, review your colors on the equipment that your intended audience will most commonly use to make sure it looks good for them.

If you are still having trouble choosing colors, you can check out the Power BI Report Theme Gallery for some inspiration. Not every example in the gallery shows good color choices, but you can still use it to get ideas.

Once you have your color palette, you can reuse it in future reports by making a report theme. And if you aren’t a fan of manually writing JSON for your report theme, check out the Report Theme Generator from PowerBI.Tips. If you define your colors in a report theme, Power BI will create tints and shades of your colors for you, saving you the trouble of having to look them up yourself.

Power BI Color Picker

If you have advice to help others choose colors, leave a comment on this post or tweet me.