Since conditional formatting was released for Power BI, I have seen countless examples of bar charts with a gradient color fill. If you aren’t careful about the gradient colors (maybe you just used the defaults), you will end up with poor color contrast. Luckily, there are a couple of quick fixes (less than 30 seconds for most people to implement) that can improve your color contrast.
Check out this video to get a more detailed explanation of the problem and potential solutions.
If you aren’t the type to watch a longer video, here are my tips:
If you are using the same measure for both the gradient and the bar length, consider whether you even need to have the gradient coloring. It’s not adding information.
If you do need/want the gradient, either because you are encoding a different measure or you just can’t bear to part with it, consider adding a border around the individual bars. Make sure the contrast between the background color and the bar colors meets the desired contrast ratio (usually 3:1).
A couple of weeks ago, I encountered a DAX question that I had not previously considered. The asker had a situation where there were two paths between two tables: one direct between a fact and a dimension, and another that went through a different dimension and a bridge table. This could happen in many scenarios:
Sales: There is a salesperson who is directly responsible for a sale. Sales leadership in various positions also get credit for that sale based upon the client or region to which the sale is associated, and this may logically be a many-to-many relationship (multiple people get credit for the same sale, and a person gets credit for multiple sales).
Consulting: Hours are worked and billed by a consultant, but the team lead(s) or tech lead(s) associated with each project also get some sort of credit or responsibility for all hours or invoices on a project.
Law firms: An attorney works directly on a matter and bills for that time. Billing attorneys approve the timesheets or other attorneys get some sort of credit/commission on the work based upon a relationship to the matter.
This scenario creates a data model similar to the below (I went with the sales example here). There is a direct relationship between Sales and Employee. On the sale, that related employee is the person who directly made the sale. But there are others who are associated with that sale because they have some sort of ownership of the client relationship. In this situation, it’s not just the manager of the salesperson – there could be a matrix where several people are involved. To accommodate this, we have a bridge table that relates the client dimension to the employee and role dimensions. The role dimension might define roles such as sales region lead, sales VP, or sales product lead, and there might be multiple people in each role.
It’s very common to have role-playing dimensions, where the fact table has multiple columns that can relate to the target dimension. We set one as active and the other as inactive, and we can then use the USERELATIONSHIP() function with CALCULATE().
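The classic role-playing pattern looks something like the following sketch. The Ship Date column and Date table here are purely illustrative (they are not part of this post's model); the point is that USERELATIONSHIP() activates an inactive relationship for the duration of one CALCULATE().

```dax
-- Hypothetical example: 'Sales'[Ship Date] has an inactive
-- relationship to 'Date'[Date]; this measure activates it.
Sales by Ship Date =
CALCULATE (
    SUM ( 'Sales'[Sale Amount] ),
    USERELATIONSHIP ( 'Sales'[Ship Date], 'Date'[Date] )
)
```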
But this situation is different because it’s not about two direct relationships between two tables. In one path, we are going directly between Sales and Employee. In another path, we go from Sales through Client and Bridge to Employee. As you can see in the image, Power BI allows both of these relationships to be active at the same time. We can still use DAX to switch between the two, but we need to use different functions.
The Data
In my sales table, I have 4 rows:
There are 4 sales. Two sales are related to client 1 and two are related to client 2. Employee 1 made $15 in sales. Employee 2 made $20 in sales.
By default, if I create a table with the sales amount and the employee, it will use the direct relationship between Sales and Employee.
Employee 3 and Employee 4 have relationships to both Client 1 and Client 2 in the bridge table.
The bridge table has a bidirectional relationship with the Client table, and the Employee table has a single-direction 1:many relationship with the bridge table. When we use this path between the tables, we expect Employee 3 and Employee 4 to each have associated sale amounts of $35.
The DAX
I originally wasn’t sure if this could be done. I tried using TREATAS() to switch relationships, but that wasn’t enough to ignore a physical relationship. But I asked around and Ed Hansberry provided a solution using CROSSFILTER().
I started by creating a base measure:
Amt = SUM('Sales'[Sale Amount])
I created the following measure to use the bridge table:
Amt Bridge = CALCULATE([Amt], CROSSFILTER('Employee'[Employee ID], 'Sales'[Sales Worker Emp ID], None))
While I could have just used the [Amt] measure for the direct relationship, I tested a measure that is more explicit about ignoring the relationship between the Employee and Bridge tables.
Amt Dir = CALCULATE([Amt], CROSSFILTER('Bridge'[Employee ID], 'Employee'[Employee ID], None))
Again, this produces the same result as just using the original [Amt] measure.
CROSSFILTER() was exactly what I needed to accomplish my goal. Setting the CrossFilterType argument to None allows me to ignore an existing active physical relationship when used as a filter inside of CALCULATE().
While private endpoints and virtual networks (VNets) are preferred, sometimes we need to configure Azure SQL Database or Azure Storage to allow use of public endpoints. In that case, an IP-based firewall is used to prevent traffic from unauthorized locations. But Azure Data Factory’s Azure Integration Runtimes do not have a single static IP. So how do we keep the firewall updated so that ADF can access these resources?
One option is to run everything through a self-hosted integration runtime, which can have a static public IP. But that also means you have to install (and keep updated) a Java runtime if you are converting data to Parquet or ORC files.
Another option is to obtain the IP range list published by Microsoft and update the firewall. That is what I did for a recent project.
The list of IP ranges is published weekly as a JSON file. New ranges appearing in the file will not be used in Azure for at least one week. The file name is in the format “ServiceTags_Public_<YYYYmmdd>.json”. It appears that the date in the file always references the Monday of the week of publication.
This means we can use an automation tool to get the information we want from the file each week and update our firewall rules. The Az PowerShell library makes this task achievable with a few lines of code.
Modifying firewall IP ranges with PowerShell
The full script is here. The following is an explanation of each section.
To make my PowerShell script more reusable, I added parameters.
This script logs into Azure using the managed identity of the Azure Automation account. This account needs contributor access on the SQL Server in order to modify the firewall rules. It also needs Storage Account Contributor on the storage account to modify the firewall rules there.
# Ensures you do not inherit an AzContext in your runbook
Disable-AzContextAutosave -Scope Process
# Connect to Azure with system-assigned managed identity
$AzureContext = (Connect-AzAccount -Identity).context
# Set and store context
$AzureContext = Set-AzContext -SubscriptionName $AzureContext.Subscription -DefaultProfile $AzureContext
$sub = Select-AzSubscription -SubscriptionName $subscription
Write-Output "Connected to Azure"
To download the file containing the service tag IP ranges, we need to dynamically generate the correct file name and then call Invoke-RestMethod. Then we can filter the IP ranges down to only the Data Factory IPs in our specified region.
# Download the service tags for Azure services
$last_week = (Get-Date)
while ($last_week.DayOfWeek -ne 'Monday') { $last_week = $last_week.AddDays(-1) }
$uri = "https://download.microsoft.com/download/7/1/D/71D86715-5596-4529-9B13-DA13A5DE5B63/ServiceTags_Public_$($last_week.ToString('yyyyMMdd')).json"
$service_tags = Invoke-RestMethod -Uri $uri
# Keep only the addressPrefixes for the DataFactory.{region} service tag
$address_prefixes = ($service_tags.values | Where-Object { $_.name -eq "DataFactory.$region" }).properties.addressPrefixes
Write-Output "Latest IP ranges acquired"
The downloaded file lists IP ranges for all the service tags, both for overall Azure and for services in specific regions. The service tag we care about is called DataFactory.{region}. So if your data factory is in Central US, you would look for that service tag. Under DataFactory.CentralUS in the screenshot below, we see the array of addressPrefixes. This is the info we need.
I’m working under the assumption that everyone in the organization should be accessing the storage account and SQL server while on the corporate VPN, so I know that anything that doesn’t come from that IP range or from ADF is unnecessary. So I’m deleting all the storage firewall rules and SQL firewall rules and then repopulating them.
A couple of things to note: Storage account firewall rules do not have a name, and they accept CIDR notation. Azure SQL firewall rules have a name and expect an IPv4 range. When we delete storage account firewall rules, we must provide the resource group, storage account name, and IP address/range. When we delete SQL firewall rules, we must provide the resource group, server name, and firewall rule name. But the approach is the same for storage and SQL server: get the list of existing rules, loop through and delete each one.
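A sketch of that delete step is below, using cmdlets from the Az module. The variable names ($resource_group, $server_name, $storage_account) are my stand-ins for the script's parameters, not necessarily the names used in the original script.

```powershell
# Remove every existing Azure SQL firewall rule on the server
Get-AzSqlServerFirewallRule -ResourceGroupName $resource_group -ServerName $server_name |
    ForEach-Object {
        Remove-AzSqlServerFirewallRule -ResourceGroupName $resource_group `
            -ServerName $server_name -FirewallRuleName $_.FirewallRuleName
    }

# Remove every existing IP rule from the storage account firewall
$rule_set = Get-AzStorageAccountNetworkRuleSet -ResourceGroupName $resource_group -Name $storage_account
foreach ($rule in $rule_set.IpRules) {
    Remove-AzStorageAccountNetworkRule -ResourceGroupName $resource_group `
        -Name $storage_account -IPAddressOrRange $rule.IPAddressOrRange
}
```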
To add the ADF IP ranges, I updated both the storage account and SQL server in the same ForEach loop. You can do them in separate loops if you would like to make sure you complete one and then the other.
$addrct = 0
foreach ($address_prefix in $address_prefixes) {
As stated earlier, the storage account firewall accepts CIDR notation, so we can quickly loop through each range we retrieved from the file and add it to the storage account firewall.
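Inside the loop, the storage side is a one-liner since the firewall takes CIDR directly (again, the resource group and account variable names here are my assumptions):

```powershell
# Each $address_prefix is a CIDR string like "20.65.128.0/19"
Add-AzStorageAccountNetworkRule -ResourceGroupName $resource_group `
    -Name $storage_account -IPAddressOrRange $address_prefix | Out-Null
```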
Then we need to convert each range to a start and end IP for SQL server. We get the first address in the CIDR range and make that our start IP. In many scripts you will see people add 1 to this address to get the range start, but I tested this, and it actually needs this first IP address to be present and not incremented by 1. To get the end IP, we get the number after the slash, subtract it from 32, take 2 to the power of that resulting number and subtract 1, and add it to the start address.
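That conversion can be sketched in PowerShell as follows. This is a standalone example with a made-up CIDR block; the original script's variable names and exact approach may differ.

```powershell
# Hypothetical example range; any CIDR block works the same way
$cidr = '20.65.128.0/19'
$start_ip, $prefix = $cidr.Split('/')

# Number of addresses above the start: 2^(32 - prefix) - 1
$host_count = [math]::Pow(2, 32 - [int]$prefix) - 1

# Convert the start address to a 32-bit integer (network order is big-endian)
$bytes = ([System.Net.IPAddress]$start_ip).GetAddressBytes()
[array]::Reverse($bytes)
$start_int = [System.BitConverter]::ToUInt32($bytes, 0)

# Add the host count and convert back to dotted notation for the end address
$end_bytes = [System.BitConverter]::GetBytes([uint32]($start_int + $host_count))
[array]::Reverse($end_bytes)
$end_ip = ([System.Net.IPAddress]::new($end_bytes)).ToString()
# $start_ip is 20.65.128.0 and $end_ip is 20.65.159.255
```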
Azure SQL Server firewall rules require a rule name, so I named mine “ADF Rule {#}” where the number is a variable that is incremented by 1 in each execution of the loop. My rule name for the first firewall rule is “ADF Rule 1”. You could do other things with this, such as adding the date in the name of the rule.
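Putting that together, the SQL half of the loop body might look like this, assuming $start_ip and $end_ip hold the converted range from the previous step and $resource_group/$server_name come from the script parameters (names are my guesses):

```powershell
# Increment the counter and create a named rule with the IPv4 range
$addrct = $addrct + 1
New-AzSqlServerFirewallRule -ResourceGroupName $resource_group -ServerName $server_name `
    -FirewallRuleName "ADF Rule $addrct" -StartIpAddress $start_ip -EndIpAddress $end_ip | Out-Null
```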
Once I have added the ADF-related firewall rules, I go back and add the static IP addresses for the organization.
If you know you have more than 2 simple IP addresses to add back at the end, you could create an array of IP addresses and loop through them. I decided to keep my script simple here, but you should do what balances reusability with maintainability in your scenario.
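If you go the array route, a sketch might look like the following. The addresses are placeholders from the TEST-NET-2 documentation range, and the variable names are assumptions, not the original script's.

```powershell
$corp_ips = @('198.51.100.10', '198.51.100.11')  # placeholder corporate IPs
$i = 0
foreach ($ip in $corp_ips) {
    $i = $i + 1
    # Single addresses use the same start and end IP on the SQL side
    New-AzSqlServerFirewallRule -ResourceGroupName $resource_group -ServerName $server_name `
        -FirewallRuleName "Corp Rule $i" -StartIpAddress $ip -EndIpAddress $ip | Out-Null
    Add-AzStorageAccountNetworkRule -ResourceGroupName $resource_group `
        -Name $storage_account -IPAddressOrRange $ip | Out-Null
}
```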