Azure, Azure Data Factory, Microsoft Technologies, Power BI

How Many Data Gateways Does My Azure BI Architecture Need?

Computer with lock protecting data

It’s not always obvious when you need a data gateway in Azure, and not all gateways are labeled as such. So I thought I would walk through various applications that act as a data gateway and discuss when, where, and how many are needed.

Note: I’m ignoring VPN gateways and application gateways for the rest of this post. You could set up a VPN gateway to get rid of some of the data gateways, but I’m assuming your networking/VPN situation is fixed at this point and working from there. This is a common case in my experience, as many organizations are not ready to jump into ExpressRoute when first planning their BI architecture.

Let’s start with what services may require you to use a data gateway.

You will need a data gateway when you are using Power BI, Azure Analysis Services, PowerApps, Microsoft Flow, Azure Logic Apps, Azure Data Factory, or Azure ML with a data source/destination that is in a private network that isn’t connected to your Azure subscription with a VPN gateway. Note that a private network includes on-premises data sources and Azure Virtual Machines as well as Azure SQL Databases and Azure SQL Data Warehouses that require use of VNet service endpoints rather than public endpoints.  

Luckily, many of these services can use the same data gateway. Power BI, Azure Analysis Services, PowerApps, Microsoft Flow, and Logic Apps all use the On-Premises Data Gateway. Azure Data Factory (V1 and V2) and Azure Machine Learning Studio use the Data Factory Self-hosted Integration Runtime.

On-Premises Data Gateway (Power BI et al.)

If you are using one or more of the following:

  • Power BI
  • Azure Analysis Services
  • PowerApps
  • Microsoft Flow
  • Logic Apps

and you have a data source in a private network, you need at least one gateway. But there are a few considerations that might cause you to set up more gateways.

  1. Your services must be in the same region to use the same gateway. This means that your Power BI/Office 365 region and Azure region for your Azure Analysis Services resource must match for them to all use one gateway.  If you have resources in different regions, you will need one gateway per region.
  2. You may want high availability for your gateway. You can create high availability clusters so when one gateway is down, traffic is rerouted to another available gateway in the cluster.
  3. You may want to segment traffic to ensure the necessary resources for certain ad hoc live/direct queries or scheduled refreshes. If your usage and refresh patterns warrant it, you may want to set up one gateway for scheduled refreshes and one gateway for live/direct queries back to any on-premises data sources. Or you might make sure live/direct queries for two different high-traffic models go through different gateways so as not to block each other. This isn’t always warranted, but it can be a good strategy.
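To make the three considerations above concrete, here is a minimal sketch of how they multiply. It assumes a single private network, one gateway cluster per region per traffic segment, and two members in a high-availability cluster. The function name and the two-member figure are illustrative assumptions, not limits of the product.

```python
def estimate_gateway_installs(regions: int,
                              traffic_segments: int = 1,
                              high_availability: bool = False) -> int:
    """Rough count of On-Premises Data Gateway installations.

    Assumptions (for illustration only): one gateway cluster per region
    per traffic segment, and two members in a high-availability cluster.
    """
    if regions < 1 or traffic_segments < 1:
        raise ValueError("regions and traffic_segments must be at least 1")
    members_per_cluster = 2 if high_availability else 1
    return regions * traffic_segments * members_per_cluster


# One region, no segmentation, no high availability
print(estimate_gateway_installs(regions=1))

# Two regions, separate gateways for refreshes and live queries, clustered
print(estimate_gateway_installs(regions=2, traffic_segments=2,
                                high_availability=True))
```

Treat the result as a planning starting point. Your actual number depends on your usage and refresh patterns.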

Data Factory Self-hosted Integration Runtime

If you are using Azure Data Factory (V1 or V2) or Azure ML with a data source in a private network, you will need at least one gateway. But that gateway is called a Self-hosted Integration Runtime (IR).

Self-hosted IRs can be shared across data factories in the same Azure Active Directory tenant. They can be associated with up to four machines to scale out or provide higher availability. So while you may only need one node, you might want a second so that your IR is not the single point of failure.

Or you may want multiple IR nodes to boost the throughput of copy activities. For instance, copying from an on-premises file server with one IR node runs at about 195 megabytes per second (MB/s). But with four IR nodes, it can be as fast as 505 MB/s.
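To put those throughput figures in perspective, here is a quick back-of-the-envelope calculation. It uses only the two numbers quoted above. The 1 TB data size and the decimal unit conversion (1 GB = 1,000 MB) are assumptions chosen for illustration.

```python
ONE_NODE_MB_PER_S = 195   # figure quoted above for one IR node
FOUR_NODE_MB_PER_S = 505  # figure quoted above for four IR nodes


def copy_minutes(size_gb: float, throughput_mb_per_s: float) -> float:
    """Minutes needed to copy size_gb at a sustained throughput."""
    size_mb = size_gb * 1000  # decimal units: 1 GB = 1,000 MB
    return size_mb / throughput_mb_per_s / 60


speedup = FOUR_NODE_MB_PER_S / ONE_NODE_MB_PER_S
efficiency = speedup / 4  # share of a perfect 4x speedup

print(f"Speedup with four nodes: {speedup:.2f}x")
print(f"Scaling efficiency: {efficiency:.0%}")
print(f"1 TB with one node: {copy_minutes(1000, ONE_NODE_MB_PER_S):.0f} min")
print(f"1 TB with four nodes: {copy_minutes(1000, FOUR_NODE_MB_PER_S):.0f} min")
```

Four nodes give roughly a 2.6x speedup rather than 4x, so adding nodes helps but does not scale linearly.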

Factors that Affect the Number of Data Gateways Needed

The main factors determining the number of gateways you need are:

  1. Number of data sources in private networks (including Azure VNets)
  2. Location of services in Azure and O365 (number of regions and tenants)
  3. Desire for high availability
  4. Desire for increased throughput or segmented traffic

If you are importing your data to Azure and using an Azure SQL DB with no VNet as the source for your Power BI model, you won’t need an On-Premises Data Gateway. If you use Data Factory to copy your data from an on-premises SQL Server to Azure Data Lake and then Azure SQL DB, you will need a Self-hosted Integration Runtime.

If all your source data is already in Azure, and your source for Power BI or Azure Analysis Services is Azure SQL DW on a VNet, you will need at least one On-Premises Data Gateway.

If you import a lot of data to Azure every day using Data Factory, and you land that data to Azure SQL DW on a VNet, then use Azure Analysis Services as the data source for Power BI reports, you might want a self-hosted integration runtime with a few nodes and a couple of on-premises gateways clustered for high availability.
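The scenarios above follow a simple rule: a gateway is needed only when a service talks to a source in a private network, and the service determines which kind of gateway. Here is a minimal sketch of that rule. The service-to-gateway mapping comes from this post. The function name and input shape are my own for illustration.

```python
ON_PREM_GATEWAY = "On-Premises Data Gateway"
SELF_HOSTED_IR = "Self-hosted Integration Runtime"

# Which gateway type each service uses, as described in this post
GATEWAY_FOR_SERVICE = {
    "Power BI": ON_PREM_GATEWAY,
    "Azure Analysis Services": ON_PREM_GATEWAY,
    "PowerApps": ON_PREM_GATEWAY,
    "Microsoft Flow": ON_PREM_GATEWAY,
    "Logic Apps": ON_PREM_GATEWAY,
    "Azure Data Factory": SELF_HOSTED_IR,
    "Azure ML": SELF_HOSTED_IR,
}


def required_gateways(connections):
    """Return the set of gateway types needed.

    connections is an iterable of (service, source_is_private) pairs,
    where source_is_private is True for on-premises sources and for
    Azure sources reachable only through a VNet.
    """
    needed = set()
    for service, source_is_private in connections:
        gateway = GATEWAY_FOR_SERVICE[service]  # KeyError for unknown services
        if source_is_private:
            needed.add(gateway)
    return needed


# Data Factory copies from on-premises SQL Server; Power BI reads a public Azure SQL DB
print(required_gateways([("Azure Data Factory", True), ("Power BI", False)]))

# Power BI reads Azure SQL DW on a VNet
print(required_gateways([("Power BI", True)]))
```

This only tells you which types of gateway you need. How many of each depends on the factors listed earlier.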

Have a Plan For Your Gateways

The gateways/integration runtimes are not hard to install. They are just often not considered, and projects get stalled waiting until a machine is provisioned to install them on. And many people forget to plan for high availability in their gateways. Make sure you have the right number of gateways and IR nodes to get your desired features and connectivity. You can add gateways/nodes later, but you don’t want to get caught with no high availability when it really matters.

Conferences, Data Visualization, Microsoft Technologies, PASS Summit

Join me for the PASS Data Expert Series Feb 7

I’m honored to have one of my PASS Summit sessions chosen to be part of the PASS Data Expert Series on February 7. PASS has curated the top-rated, most impactful sessions from PASS Summit 2018 for a day of solutions and best practices to help keep you at the top of your field. There are three tracks: Analytics, Data Management, and Architecture. My session is in the Analytics track along with some other great sessions from Alberto Ferrari, Jen Underwood, Carlos Bossy, Matt How, and Richard Campbell.

The video for my session, titled “Do Your Data Visualizations Need a Makeover?”, starts at 16:00 UTC (9 AM MT). I’ll be online in the webinar for live Q&A and chat related to the session.

I hope you’ll register and chat with me about data visualizations in need of a makeover on February 7.


Quick Programming Note: New Job!

I started this blog in 2013 during my first consulting job. In 2014, I joined BlueGranite, which is where I have been for the last 4 years. The knowledge gained while working with them has been the inspiration for a lot of the content on this blog. It has truly been a pleasure to work at BlueGranite. Work is fun when your leadership team is trustworthy and transparent and open to feedback, and your coworkers are smart and motivated and want to help you learn. I can’t express how much I appreciate the professional relationships and friendships formed during my time with them.

But an interesting opportunity presented itself to me, and I have decided to take it.

Denny Cherry & Associates logo

Today is my first day working for Denny Cherry & Associates! I’m excited to be working with Denny (B|T), Joey (B|T), Kerry (B|T), Monica (B|T), John (B|T), and Peter (T). I’ve known the consultants at Denny Cherry & Associates for years through the SQL Community. My chats with them at conferences and on Twitter demonstrated their incredible SQL and Azure knowledge, and I can’t wait to learn from them. I might even teach them a thing or two about BI/Analytics. I feel very fortunate to work for a company that values and encourages participation in the technical community. There are five Microsoft MVPs (including me), and all six consultants are speakers and bloggers. DCAC does a healthy mix of implementations, migrations, health checks, support, and training. They have completed many interesting and impactful Azure projects, which was a big draw for me.

Otherwise, things should be business as usual for me: work from home with the bulldog and a little travel as needed. I’ll continue to blog here in my free time with new coworkers and projects to serve as my inspiration.

Azure, Microsoft Technologies

The Necessary Extras That Aren’t Shown in Your Azure BI Architecture Diagram

When we talk about Azure architectures for data warehousing or analytics, we usually show a diagram that looks like the one below.

Modern DW Architecture

This diagram is a great start to explain what services will be used in Azure to build out a solution or platform. But many times, we add the specific resource names and stop there. If you have built several projects in Azure, you will know there are some other things for which you will need to plan. So what’s missing?

Azure Active Directory

Let’s start with Azure Active Directory (AAD). In order to provision the resources in the diagram, your Azure subscription must already be associated with an Active Directory. AAD is Microsoft’s cloud-based identity and access management service. Members of an organization have a user account that can sign in to various services. AAD is used to access Office 365, Power BI, and Dynamics 365, as well as the Azure portal. It can also be used to grant access and permissions to specific Azure resources.

For instance, users who will participate in Data Factory pipeline development must be assigned to the Data Factory Contributor role in Azure. Users can authenticate via AAD to log in to Azure SQL DW. You can use AD Groups to grant permissions to users interacting with various Azure resources such as Azure SQL DW or SQL DB as well as Azure Analysis Services and Power BI. It’s more efficient to manage permissions for groups than to do it for each user. You’ll need a plan to manage group membership.

You may also need to register applications in Azure Active Directory to allow Azure services to authenticate and interact with each other. While the guidance for many services is now to use managed identities, this may not be available for every situation just yet.  

If your organization has some infrastructure on premises, it is likely that they have Active Directory on premises as well. So you will want to make sure you have a solution in place to sync your on-premises and Azure Active Directory.


Virtual Networks

Virtual Networks (or VNets) allow many types of Azure resources to securely communicate with each other, the internet, and on-premises networks. You can have multiple virtual networks in an Azure subscription. Each virtual network is isolated from other VNets, unless you set up VNet peering.

Some PaaS services such as Azure Storage Accounts, Azure SQL DW, and Azure Analysis Services support Virtual Network Service Endpoints. A common usage scenario is to set up VNets and VNet Service Endpoints to connect resources to on-premises networks. Some organizations prefer to use VNet Service Endpoints instead of public service endpoints, making it so that traffic can only access the resource from within the organization’s local network.

In order to connect a VNet to an on-premises network or another VNet (outside of peering), you’ll need a VPN Gateway. You’ll need to identify the most appropriate type of VPN Gateway: Point-to-Site, Site-to-Site, or ExpressRoute. These offerings differ based on bandwidth, protocols supported, routing, connection resiliency, and SLAs. Pricing can vary greatly based upon your gateway type.

While VNets and VPN Gateways are probably the most common networking resources in Azure, there are many other networking services and related design decisions to consider as you plan an Azure deployment.

Data Gateways

Your BI solution may be entirely in Azure, but if you need to retrieve data from data sources in a private network (on premises or on a VNet), you’ll need a gateway. If you are using Azure Data Factory, you’ll need a Self-hosted Integration Runtime (IR). If the source for your Power BI or Azure Analysis Services model is on a private network, you’ll need an On-Premises Data Gateway. You can use the same gateway to connect to Analysis Services, Power BI, Logic Apps, Power Apps, and Flow. If you will have a mix of Analysis Services and Power BI models sharing the same gateway, you’ll need to make sure that your Power BI region and your AAS region match.

These gateways require that you have a machine inside the private network on which to install them. And if you want to scale out or have failover capabilities, you may need multiple gateways and multiple VMs. So while you may be building a solution in Azure, you might end up with a few on-premises VMs to be able to securely move source data.

Dev and Test Environments

Our nice and tidy diagram above is only showing production. We also need at least a development environment and maybe one or more test environments. You’ll need to decide how to design your dev/test environments. You may want duplicate resources in a separate resource group; e.g. a dev resource group that contains a dev Data Factory, a dev Azure SQL DW, a dev Azure Analysis Services, etc. While separating environments by resource group is common, it’s not your only option. You will need to decide if you prefer to separate environments by resource group, subscription, directory, or some combination of the three.
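If you go the resource group route, it helps to write down what each environment will contain before you start provisioning. Here is a minimal sketch that lists one resource group per environment with a matching set of resources. The naming pattern, the abbreviations, and the project name are hypothetical, and the names are not validated against Azure’s naming rules, which vary by resource type.

```python
# Hypothetical abbreviations for the resources in the diagram
RESOURCE_TYPES = {
    "adf": "Data Factory",
    "sqldw": "Azure SQL DW",
    "aas": "Azure Analysis Services",
}


def plan_environments(project, environments=("dev", "test", "prod")):
    """Map each environment's resource group to the resources it holds."""
    plan = {}
    for env in environments:
        resource_group = f"rg-{project}-{env}"
        plan[resource_group] = [f"{abbr}-{project}-{env}" for abbr in RESOURCE_TYPES]
    return plan


for group, resources in plan_environments("bi").items():
    print(group, resources)
```

The same idea extends to separating by subscription or directory: add that level to the plan and decide which resources are duplicated at each one.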

ARM templates and PowerShell are very useful for deploying updates and creating new environments. Also, take a look at Azure Blueprints.

You’ll also want to investigate ways to keep the costs of non-prod environments down via smaller-sized resources or pausing or deleting resources where applicable.

Plan For, Don’t Worry About, the Extras

There are several other ancillary Azure services that could/should be part of your solution.

  • For source control, GitHub and Azure Repos have the easiest integration, especially with Azure Data Factory. You’ll not only want source control for things like database projects and Data Factory pipelines, but also possibly for ARM templates and PowerShell scripts used to deploy resources (think: infrastructure as code).
  • You’ll want to set up Azure Monitoring, including alerts to let you know when processes and services are not running as intended.
  • If you want more cost management support than just setting a spending limit on a subscription (if it is available for your subscription type), it may be helpful to set budgets in Azure so you can be notified when you hit certain percentages or amounts.
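The budget idea in the last bullet boils down to comparing spend to a budget at a few thresholds. Here is a minimal sketch of that logic. The threshold percentages and dollar amounts are made-up examples, and this is not how Azure implements budgets, just the arithmetic behind the notifications.

```python
def crossed_thresholds(spend, budget, thresholds=(0.5, 0.8, 1.0)):
    """Return the budget thresholds (as fractions) that spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    used = spend / budget
    return [t for t in thresholds if used >= t]


# Hypothetical month: 850 spent against a budget of 1,000
for threshold in crossed_thresholds(spend=850, budget=1000):
    print(f"Notify: reached {threshold:.0%} of budget")
```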

Be sure to think through the entire data/solution management lifecycle. You may want to include extra services for cataloging, governing, and archiving your data.

This may sound like a complex list, but these resources and services are fairly easy to work with. Azure Active Directory has a user-friendly GUI in the portal, and commands can be automated with PowerShell, requiring relatively little effort to manage users and groups to support your BI environment. VNets and VPN Gateways are a little more complex, but there are step-by-step instructions available for many tasks in the Microsoft Docs. The Power BI Gateway and ADF IR have quick and easy GUI installers that just take a few minutes. You can automate Azure deployments with Azure Pipelines or PowerShell scripts.

None of these things are really that awful to implement or manage in Azure, unless you weren’t aware of them and your project is paused until you can get them set up.

Is there anything else you find is commonly left out when planning for BI solutions in Azure? Leave me a comment if you would like to add to the list.

Update (1/26/19): Helpful readers have commented on other aspects of your Azure BI architecture they felt were often overlooked:

  • Make sure you have a plan for how to process your Azure Analysis Services model. Will you use Azure Automation? Call an Azure Function from Data Factory?
  • Be sure to organize and tag your resources appropriately to help you understand and control costs.
  • Don’t forget Azure Key Vault. This will help you keep keys and passwords secure.

(Thanks to Chad Toney, Shannon Holck, and Santiago Cepas for these suggestions.)
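To show why tagging helps with cost control, here is a minimal sketch that totals cost by tag value. The resource list, tag names, and costs are invented sample data. In practice you would get this from your cost reports.

```python
from collections import defaultdict


def cost_by_tag(resources, tag_name):
    """Total monthly cost per value of tag_name; untagged resources are grouped together."""
    totals = defaultdict(float)
    for resource in resources:
        value = resource.get("tags", {}).get(tag_name, "(untagged)")
        totals[value] += resource["monthly_cost"]
    return dict(totals)


# Invented sample data
resources = [
    {"name": "adf-bi-prod", "monthly_cost": 200, "tags": {"environment": "prod"}},
    {"name": "sqldw-bi-prod", "monthly_cost": 1500, "tags": {"environment": "prod"}},
    {"name": "sqldw-bi-dev", "monthly_cost": 300, "tags": {"environment": "dev"}},
    {"name": "vm-gateway", "monthly_cost": 100},
]

print(cost_by_tag(resources, "environment"))
```

The "(untagged)" bucket is the useful part: anything that lands there is a cost you cannot attribute.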

Accessibility, Data Visualization, Microsoft Technologies, Power BI

Tab Order Enhances Power BI Report Accessibility

Confused star

Like a diamond in the sky.
How I wonder what you are!
Twinkle, twinkle, little star,
Twinkle, twinkle, little star,
Up above the world so high,
How I wonder what you are!


You were probably expecting a different order. Order is an important element when singing a song or telling a story or explaining information.

In western cultures we tend to read left to right, top to bottom, making a Z-pattern. This applies to books and blogs as well as reports and data visualizations. But if you are using the keyboard to navigate in a Power BI report, the order in which you arrive at visuals will not follow your vision unless you set the new tab order property. If you have low or no vision, this becomes an even bigger issue because you may not be able to see that you are navigating visuals out of visual order because the screen reader just reads whatever comes next.

It takes effort to consume each visual, and many visuals need the context of the other visuals around them to be most useful. When we present information out of order, we are putting more cognitive load on our users, forcing them to hold information in their limited working memory until they arrive at another visual that helps put the pieces together to make sense.

What is Tab Order?

Tab order is the order in which users interact with the items on a page using the keyboard. Generally, we want tab order to be predictable and to closely match the visual order on the page (unless there is a good reason to deviate). If you press the tab key in a Power BI report, you might be surprised at the seemingly random order in which you move from visual to visual on the page. The default order is the order in which the visuals were placed on the page in Power BI Desktop, or the last modified order in if you have edited your report there.

I wrote about the issues with tab order in Power BI back in February and posted an idea for it. So I’m quite happy to see it come to fruition. Not only does it increase usability and accessibility, it also helps meet WCAG Success Criterion 2.4.3: Focus Order (accessibility compliance guidelines, for those who do not geek out on this stuff).
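If you want a starting point for the visual order your tab order should generally match, here is a minimal sketch that sorts visuals into a Z-pattern from their top-left coordinates. Power BI Desktop does not do this for you, so you would still set the order by hand. The Visual class, the coordinates, and the row tolerance are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Visual:
    name: str
    x: float  # distance of the left edge from the left of the page
    y: float  # distance of the top edge from the top of the page


def z_pattern_order(visuals, row_tolerance=20):
    """Order visuals top to bottom, then left to right within each row.

    Visuals whose top edges are within row_tolerance of the first visual
    in a row are treated as belonging to that row.
    """
    rows = []
    for visual in sorted(visuals, key=lambda v: v.y):
        if rows and abs(visual.y - rows[-1][0].y) <= row_tolerance:
            rows[-1].append(visual)
        else:
            rows.append([visual])
    return [v.name for row in rows for v in sorted(row, key=lambda v: v.x)]


page = [
    Visual("Sales chart", x=400, y=105),
    Visual("Title", x=0, y=0),
    Visual("Date slicer", x=0, y=100),
    Visual("Detail table", x=0, y=400),
]
print(z_pattern_order(page))
```

The tolerance matters because visuals in the same row are rarely aligned to the exact pixel. Without it, a chart sitting a few pixels lower than the slicer beside it would still sort correctly here, but two visuals slightly out of alignment in the other direction would swap places.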

How To Set Tab Order In Power BI Desktop

To set the tab order of visuals on a report page in Power BI Desktop, go to the View tab, open the Selection Pane and select Tab Order at the top of the Selection Pane.


From there, you can move visuals up and down in order, or hide them from tab order completely. This is helpful if you have decorative items on the page whose selection has no value to the user.

To change the tab order, you can either drag an item to a new position in the list, or you can select the item and click the up or down arrows above the list.


In case you missed it, slicers are now keyboard accessible. If you would like users to select values in slicers before using the other visuals on the page, make sure to put the slicers early in the tab order.

It only takes a minute to set the tab order, but it greatly increases usability for keyboard users.

Data Visualization, Microsoft Technologies, PASS Summit, Power BI

Power BI Visual Usability Checklist

At PASS Summit, I presented a session called “Do Your Data Visualizations Need a Makeover?”. In my session I explained how, by not preparing appropriately, we often set ourselves up for failure in explanatory data visualization before we ever place a visual on the page, and I provided tips to improve. I also gave examples of visual design mistakes I see often. I polled the audience, and they shared some mistakes that they had seen often or that really bothered them. If you missed my presentation, you can watch it on PASS TV or YouTube.

As a companion to my presentation, I created the Power BI Visualization Usability Checklist. For those who are new to data visualization in Power BI, or those that want to employ some type of quality check, I think this is a good place to start. I occasionally do data viz makeover engagements to help people create a report that is more engaging and more widely adopted. This list draws from that experience as well as the tweaks I find myself making to my own Power BI reports. And now I have added a few things that my PASS Summit audience mentioned – thanks to those who shared their suggestions and experiences!

I’m not here to tell you to always use a certain color theme or font, or that everything should be a bar chart. Data visualization is situational and dependent upon your intended audience. I hope I can encourage you to consider your audience, how they take in information, and what information they are looking for.

This checklist provides guidelines to help make sure your report communicates your intended message in a way that works for your intended audience. It has two pages. The Data Viz Usability Checklist page contains the main checklist for you to use while building or reviewing a Power BI report.

Screenshot of Power BI Visualization Usability Checklist

The Data Viz Usability Concepts page gives you quick definitions and links for further reading about the underlying design concepts that inform my list.

Screenshot of Power BI Visualization Usability Concepts

Download the checklist here. I also have a checklist for accessibility in Power BI reports which you can find here.

If you have a suggestion to add to either list, please leave me a comment!


Microsoft Technologies, PASS Summit, Personal

Learning Better Presentation Skills (T-SQL Tuesday #108)

TSQLTuesday

This month’s T-SQL Tuesday is hosted by Malathi Mahadevan (@SqlMal). The topic is to pick one thing I would like to learn that is not SQL Server.

I’m going to go a different direction than I think most people will. I spend a lot of time learning new technologies in Azure, but I am also focusing on learning better presentation skills and improving my use of related technologies. Yes, that often means PowerPoint. But sometimes I do presentations directly in Power BI when I am presenting data or mostly doing demos. Building presentations requires tech, design, and speaking skills. And I enjoy that mix.

I enjoy presenting at user groups and conferences, and lately I’ve been branching out in the types of presentations I give. At PASS Summit this year, I delivered a pre-con and a general session, and I participated in a panel and the BI Power Hour.  Each one required a slightly different presentation style.

Just like we may cringe when we go back and look at old code and wonder what we were thinking, I have the same reaction when I go back and look at old presentations.

You would think that, as a data viz person, I would be better at building engaging presentations since a lot of the data viz concepts apply to visual presentation, but it’s still a struggle and a constant endeavor to improve. I have made some progress to date. Below is a sample from a presentation on good report design practices in SSRS that I gave in 2015.

Old Presentation

While it’s not the worst slide I’ve ever seen, it’s definitely not the best. Here are a few slides from my PASS Summit presentation called “Do Your Data Visualizations Need A Makeover?”, which cover the same topic as the above slide.


The new slides are much more pleasant and engaging. I have changed to this style based upon what I have learned so far from Echo Rivera’s blog and her Countdown to Stellar Slides program. I use more and larger images and less text on each slide. This naturally leads to having more slides, but I spend less time on each one. And there is less of a chance that I will revert to just reading from a slide since there is just less text. I get to tell you about what that one sentence means to me.

My goal is to learn how to deliver presentations that are more accessible and more engaging.

I blogged about accessible slide templates earlier this year. I got interested in accessibility when I was learning how to make more accessible Power BI reports.  I want people to feel welcome and get something out of my talks, even if they have visual, auditory, or information processing challenges. So far, what I have learned is that I can be more inclusive with a handful of small changes.

To continue my learning about presentation delivery, I plan to:

And of course, I plan to give presentations so I can try out what I learn and improve from there. I have already submitted to SQLSaturday Colorado Springs, and I’m sure I will add more presentations next year.

If you have resources that have been particularly helpful in improving your presentation delivery, please leave them in the comments.

Happy T-SQL Tuesday!