Accessibility, Data Visualization, Microsoft Technologies, Power BI

Your gradient fill bar charts in Power BI have poor color contrast, but you can fix them

Since conditional formatting was released for Power BI, I have seen countless examples of bar charts that have a gradient color fill. If you aren’t careful about the gradient colors (maybe you just used the default colors), you will end up with poor color contrast. Luckily there are a couple of quick (less than 30 seconds for most people to implement) fixes that can improve your color contrast.

Check out this video to get a more detailed explanation of the problem and potential solutions.

Meagan’s 10-minute video on color contrast issues with gradient color fill in Power BI bar charts shows how to identify and resolve the color contrast issues

If you aren’t the type to watch a longer video, here are my tips:

  1. If you are using the same measure for both the gradient and the bar length, consider whether you even need to have the gradient coloring. It’s not adding information.
  2. If you do need/want the gradient, either because you are encoding a different measure or you just can’t bear to part with it, consider adding a border around the individual bars. Make sure the color contrast from the background and bar colors meet the desired color contrast ratio (usually 3:1).

Were you aware we can now put borders on the individual bars in our bar charts? If you’d like some practice with this formatting option, check out the Workout Wednesday for Power BI exercise from 2023 Week 52.

DAX, Microsoft Technologies, Power BI

Switching between different active physical relationships in a Power BI model

A couple of weeks ago, I encountered a DAX question that I had not previously considered. They had a situation where there were two paths between two tables: on direct between a fact and dimension and another that went through a different dimension and a bridge table. This could happen in many scenarios:

  • Sales: There is a salesperson who is directly responsible for a sale. Sales leadership in various positions also get credit for that sale based upon the client or region to which the sale is associated, and this may logically be a many:many relationship (multiple people get credit for the same sale and a person gets credit for multiple sales).
  • Consulting: Hours are worked and billed by a consultant, but the team lead(s) or tech lead(s) associated with each project also get some sort of credit or responsibility for all hours or invoices on a project.
  • Law firms: An attorney works directly on a matter and bills for that time. Billing attorneys approve the timesheets or other attorneys get some sort of credit/commission on the work based upon a relationship to the matter.

This scenario creates a data model similar to the below (I went with the sales example here). There is a direct relationship between Sales and Employee. On the sale, that related employee is the person who directly made the sale. But there are others who are associated with that sale because they have some sort of ownership of the client relationship. In this situation, it’s not just the manager of the salesperson – there could be a matrix where several people are involved. To accommodate this, we have a bridge table that relates the client dimension with the employee, and role dimensions. The role dimension might define the role as sales region lead or sales VP or sales product lead, and there might be multiple people in each role.

a Power BI model with 5 tables: role, bridge, client, employee, and sales. Role is related to Bridge. Bridge is related to client and employee. Employee is related to Sales. Client is related to Sales.

It’s very common to have role-playing dimensions, where the fact table has multiple columns that can relate to the target dimension. We set one as active and the other as inactive, and we can then use the USERELATIONSHIP() function with CALCULATE().

But this situation is different because it’s not about two direct relationships between two tables. In one path, we are going directly between Sales and Employee. In another path, we go from Sales through Client and Bridge to Employee. As you can see in the image, Power BI allows both of these relationships to be active at the same time. We can still use DAX to switch between the two, but we need to use different functions.

The Data

In my sales table, I have 4 rows:

a table showing 4 rows of data with columns: Sales Worker Emp ID, Sale Amount, Client. There are 2 clients and two employees in the data.
Sales fact table in the Power BI model

There are 4 sales. Two sales are related to client 1 and two are related to client 2. Employee 1 made $15 in sales. Employee 2 made $20 in sales.

By default, if I create a table with the sales amount and the employee, it will use the direct relationship between Sales and Employee.

A table visual that shows Alice has $15 in sales and Bob has $20 in sales.
Table visual in Power BI that uses the direct relationship between Employee and Sales tables

Employee 3 and Employee 4 have relationships to both Client 1 and Client 2 in the bridge table.

The bridge table contains 4 rows of data. Employees 3 and 4 are related to both clients 1 and 2.
Bridge table relating employees to clients in Power BI

The bridge table has a bidirectional relationship with the Client table, and the Employee table has a single-direction 1:many relationship with the bridge table. When we use this path between the tables, we expect Employee 3 and Employee 4 to each have associated sale amounts of $35.

A table visual showing Sue and Jerry both having $35 in sales.
Table visual in Power BI that uses the the bridge table

The DAX

I originally wasn’t sure if this could be done. I tried using TREATAS() to switch relationships, but that wasn’t enough to ignore a physical relationship. But I asked around and Ed Hansberry provided a solution using CROSSFILTER().

I started by creating a base measure:

Amt = Sum('Sales'[Sale Amount])

I created the following measure to use the bridge table:

Amt Bridge = CALCULATE([Amt],CROSSFILTER('Employee'[Employee ID], 'Sales'[Sales Worker Emp ID], None))

While I could have just used the [Amt] column for the direct relationship, I tested a measure that is more explicit in ignoring the relationship between the Employee and Bridge tables.

Amt Dir = CALCULATE([Amt], CROSSFILTER('Bridge'[Employee ID], Employee[Employee ID], NONE))

Again, this produces the same result as just using the original [Amt] measure.

CROSSFILTER() was exactly what I needed to accomplish my goal. Setting the CrossFilterType argument to None allows me to ignore an existing active physical relationship when used as a filter inside of CALCULATE().

The two table visuals shown next to each other where the direct relationship to sales and the indirect relationship to sales are shown on one report page in Power BI.
Azure, Azure Data Factory, Azure SQL DB, Azure Storage, Microsoft Technologies, PowerShell

Update Azure SQL database and storage account public endpoint firewalls with Data Factory IP ranges

While a private endpoint and vNets are preferred, sometimes we need to configure Azure SQL Database or Azure Storage to allow use of public endpoints. In that case, an IP-based firewall is used to prevent traffic from unauthorized locations. But Azure Data Factory’s Azure Integration Runtimes do not have a single static IP. So how do we keep the firewall updated so that ADF can access these resources?

One option is to run everything through a self-hosted integration runtime, which can have a static public IP. But that also means you have to install (and keep updated) a Java SDK if you are converting data to Parquet or ORC files.

Another option is to obtain the IP range list published by Microsoft and update the firewall. That is what I did for a recent project.

The list of IP ranges is published weekly as a JSON file. New ranges appearing in the file will not be used in Azure for at least one week. The file name is in the format “ServiceTags_Public_<YYYYmmdd>.json”. It appears that the date in the file always references the Monday of the week of publication.

This means we can use an automation tool to get the information we want from the file each week and update our firewall rules. The Az PowerShell library makes this task achievable with a few lines of code.

Modifying firewall IP ranges with PowerShell

The full script is here. The following is an explanation of each section.

To make my PowerShell script more reusable, I added parameters.

# Input bindings passed in via param block

Param
(
    [Parameter (Mandatory= $false)]
    [String] $storageRG = "storageRG",

    [Parameter (Mandatory= $false)]
    [String] $sqlRG = "sqlRG",

    [Parameter (Mandatory= $false)]
    [String] $storageacctname = "stgacct",

    [Parameter (Mandatory= $false)]
    [String] $sqlservername = "sqldev",

    [Parameter (Mandatory= $false)]
    [String] $region = "EastUS",

    [Parameter (Mandatory= $false)]
    [String] $subscription = "mysub"
)

This script logs into Azure using the managed identity of the Azure Automation account. This account needs contributor access on the SQL Server in order to modify the firewall rules. It also needs Storage Account Contributor on the storage account to modify the firewall rules there.


# Import needed libraries
Import-Module Az.Storage
Import-Module Az.Sql
Import-Module Az.Resources

# Ensures you do not inherit an AzContext in your runbook
Disable-AzContextAutosave -Scope Process

# Connect to Azure with system-assigned managed identity
$AzureContext = (Connect-AzAccount -Identity).context

# Set and store context
$AzureContext = Set-AzContext -SubscriptionName $AzureContext.Subscription -DefaultProfile $AzureContext
$sub = Select-azSubscription -SubscriptionName $subscription
Write-Output "Connected to Azure"

To download the file containing the IP service tag IP ranges, we need to dynamically generate the correct file name and then call Invoke-RestMethod. Then we can filter the IP ranges down to only the data factory IPs in our specified region

# Download the service tags for Azure services

$last_week = (Get-Date)

while ($last_week.DayOfWeek -ne 'Monday') {
$last_week = $last_week.AddDays(-1)
}

$last_week = $last_week.ToString("yyyyMMdd")

$url = 'https://download.microsoft.com/download/7/1/D/71D86715-5596-4529-9B13-DA13A5DE5B63/ServiceTags_Public_' + $last_week + '.json'

$ip_ranges = Invoke-RestMethod $url -Method 'GET'

# Filter to only ADF in specified region

$STagName = "DataFactory.${region}"

$address_prefixes = $ip_ranges.values `| Where-Object {$_.name -eq $STagName} `| Select-Object -ExpandProperty properties ` | Select-Object -ExpandProperty addressPrefixes

$address_prefixes = $address_prefixes `
| Where-Object { $_ -notmatch ":" } `
| ForEach-Object { @{ ipAddressOrRange = $_ } }

$address_prefixes
Write-Output "Latest IP ranges acquired"

The downloaded file lists IP ranges for all the service tags, both for overall Azure and for services in specific regions. The service tag we care about is called DataFactory.{region}. So if your data factory is in Central US, you would look for that service tag. Under DataFactory.CentralUS in the screenshot below, we see the array of addressPrefixes. This is the info we need.

I’m working under the assumption that everyone in the organization should be accessing the storage account and SQL server while on the corporate VPN, so I know that anything that doesn’t come from that IP range from ADF is unnecessary. So I’m deleting all the storage firewall rules and SQL firewall rules and then repopulating them.

#delete storage firewall rules

$SIPs = (Get-AzStorageAccountNetworkRuleSet `
   -ResourceGroupName $storageRG `
   -Name $storageacctname).IpRules.IPAddressOrRange

foreach ($IPR in $SIPs) {
Remove-AzStorageAccountNetworkRule `
   -ResourceGroupName $storageRG `
   -Name $storageacctname `
   -IPAddressOrRange $IPR
}
Write-Output "Storage firewall rules removed"


#delete sql firewall rules

$FRules = Get-AzSqlServerFirewallRule `
   -ResourceGroupName $sqlRG `
   -ServerName $sqlservername

foreach ($FRule in $FRules) {
Write-Output "Removing " + $Frule.FirewallRuleName
Remove-AzSqlServerFirewallRule -ServerName $sqlservername `
    -ResourceGroupName $sqlRG `
    -FirewallRuleName $Frule.FirewallRuleName
}
Write-Output "SQL firewall rules removed"

A couple of things to note: Storage account firewall rules do not have a name, and they accept CIDR notation. Azure SQL firewall rules have a name and expect an IPv4 range. When we delete storage account firewall rules, we must provide the resource group, storage account name, and IP address/range. When we delete SQL firewall rules, we must provide the resource group, server name, and firewall rule name. But the approach is the same for storage and SQL server: get the list of existing rules, loop through and delete each one.

To add the ADF IP ranges, I updated both the storage account and SQL server in the same ForEach loop. You can do them in separate loops if you would like to make sure you complete one and then the other.

$addrct = 0
foreach ($address_prefix in $address_prefixes.values) {

# Add rule to storage account firewall
Add-AzStorageAccountNetworkRule `
   -ResourceGroupName $storageRG `
   -Name $storageacctname `   -IPAddressOrRange $address_prefix

# Add rule to sql server firewall
$addrct = $addrct + 1
$RuleName = "ADF Rule " + $addrct.ToString()

#Convert CIDR to IPV4 start and end
$StrNetworkAddress = ($address_prefix.split("/"))[0]
$NetworkIP = `
   ([System.Net.IPAddress]$StrNetworkAddress).GetAddressBytes()
[Array]::Reverse($NetworkIP)
$NetworkIP = ([System.Net.IPAddress]($NetworkIP `
    -join ".")).Address
$StartIP = $NetworkIP

If (($StartIP.Gettype()).Name -ine "double")
{
$StartIP = [Convert]::ToDouble($StartIP)
}
$StartIP = [System.Net.IPAddress]$StartIP

[int]$NetworkLength = ($address_prefix.split("/"))[1]
$IPLength = 32-$NetworkLength
$NumberOfIPs = ([System.Math]::Pow(2, $IPLength)) -1
$EndIP = $NetworkIP + $NumberOfIPs

If (($EndIP.Gettype()).Name -ine "double")
{
$EndIP = [Convert]::ToDouble($EndIP)
}
$EndIP = [System.Net.IPAddress]$EndIP

New-AzSqlServerFirewallRule -ResourceGroupName $sqlRG `
    -ServerName $sqlservername `
    -FirewallRuleName $RuleName `
    -StartIpAddress $StartIP.ToString() `
    -EndIpAddress $EndIP.ToString()
}

As stated earlier, the storage account firewall accepts CIDR notation, so we can quickly loop through each range we retrieved from the file and add it to the storage account firewall.

Then we need to convert each range to a start and end IP for SQL server. We get the first address in the CIDR range and make that our start IP. In many scripts you will see people add 1 to this address to get the range start, but I tested this, and it actually needs this first IP address to be present and not incremented by 1. To get the end IP, we get the number after the slash, subtract it from 32, take 2 to the power of that resulting number and subtract 1, and add it to the start address.

Azure SQL Server firewall rules require a rule name, so I named mine “ADF Rule {#}” where the number is a variable that is incremented by 1 in each execution of the loop. My rule name for the first firewall rule is “ADF Rule 1”. You could do other things with this, such as adding the date in the name of the rule.

Once I have added the ADF-related firewall rules, I go back and add the static IP addresses for the organization.

#Static IPs 
$CorpIP = "X.X.X.X"
$CorpIP2 = "X.X.X.X"

#re-add static IP to Storage
Add-AzStorageAccountNetworkRule `
  -ResourceGroupName $storageRG `
  -Name $storageacctname `
  -IPAddressOrRange $CorpIP

Add-AzStorageAccountNetworkRule `
  -ResourceGroupName $storageRG `
  -Name $storageacctname `
  -IPAddressOrRange $CorpIP2

#re-add static IP to SQL

New-AzSqlServerFirewallRule `
  -ResourceGroupName $sqlRG `
  -ServerName $sqlservername `
  -FirewallRuleName "CorpIP" `
  -StartIpAddress $CorpIP `
  -EndIpAddress $CorpIP

New-AzSqlServerFirewallRule `
  -ResourceGroupName $sqlRG `
  -ServerName $sqlservername `
  -FirewallRuleName "CorpIP2" `
  -StartIpAddress $CorpIP2 `
  -EndIpAddress $CorpIP2

If you know you have more than 2 simple IP addresses to add back at the end, you could create an array of IP addresses and loop through them. I decided to keep my script simple here, but you should do what balances reusability with maintainability in your scenario.

You can get the full script on GitHub.

Thanks to Prashanth Kumar and Tao Yang for making their code available on their blogs, which helped me create my solution tailored for ADF IP ranges.