Next week I’m speaking at at the Dynamic Communities Power Up event titled “Exploring the Power BI Ecosystem“. It takes place on May 27 & 28, 2020. This exciting 2-Day virtual event is designed to ensure attendees have a complete view of the Power BI product and surrounding ecosystem, provide expanded knowledge of the core components and showcase the possibilities for continued exploration and innovation.
Sessions during the event are 2.5 hours long, to really give you time to get into a topic. There are healthy 45-minute breaks between sessions to give you time to attend to personal matters. And the sessions are recorded to give you a chance to catch anything you miss. Some sessions, including mine, offer a take-home exercise to help solidify concepts discussed during the session.
I’m presenting Data Visualization and Storytelling on May 28 at 9am EST/1pm UTC. In this session, you will learn how to build eye-catching Power BI reports to support decision making. You will also see the importance and a realistic approach to data storytelling.
The following topics will be showcased through practical examples:
Creating beautiful reports: prioritizing your KPIs, playing with colors, grid
Choosing the best chart to illustrate your point
Introduction to the concept of Data Storytelling
Implementing quality checks on your report design
Implementing navigation in your report: bookmarks, drill-through, page-report tooltips, interactive Q&A
This training is a paid event, but it’s just $399 for the full 2 days. This training is great if you are a beginner-to-intermediate Power BI user trying to round out your skills across the many areas of the Power BI suite. You can head over to the website to register. I hope to see you there!
I had the privilege of working with Tessa Hurr (PM on the Power BI team) on a presentation for the 2020 Microsoft Business Applications Summit (MBAS) about five features in Power BI that increase report accessibility. This 23-minute presentation is almost entirely demos, and only a few slides. While we talk about some features such as alt text and tab order that are primarily used for accessibility purposes, we also talk about how chart titles, header tooltips, and report themes can be used to make your report more accessible.
The conference was entirely online this year, and you can catch the sessions on demand now. I hope you’ll take some time to watch my session as well as the other great content that came from the conference. You can watch my session on the MBAS website.
Data Factory allows parameterization in many parts of our solutions. We can parameterize things such as connection information in linked services as well as blob storage containers and files in datasets. We can also parameterize certain properties in activities. For instance, we can write an expression to determine the stored procedure to be executed in a Stored Procedure Activity or the filename in the sink (destination) of a Copy Activity.
But we cannot parameterize the invoked pipeline in an Execute Pipeline Activity. This means we need to find workarounds in order to have a metadata-driven execution framework. What I mean by metadata-driven execution framework is that data is stored in a datastore (in my case, a SQL Database) and used to determine what pipelines and activities get executed. With this type of framework, if I don’t want a specific pipeline to execute, I would just update my data in the datastore rather than delete the pipeline execution from the parent pipeline. We’ve been doing this type of development in SSIS for years, and Biml has played a big part in that. But SSIS allows us to parameterize the Execute Package Task.
Since we can’t implement this parameterized execution of pipelines natively, we need to look for something that Data Factory can call to accomplish the task. Paul Andrew has a nice framework that uses Azure Functions. I was working on a Data Factory solution for a client who doesn’t have C# or PowerShell developers on hand to help with the ELT process, so we needed to explore a low-code solution.
While there is no Logic App activity in Data Factory, we can use a Web Activity to call the Logic App. I might have a pipeline that looks something like what is pictured below.
Within the ForEach loop is a single Web Activity.
I used some variables and parameters in an expression to populate the URL so it would be dynamic. I used a GET method in the call.
My initial version of my Logic App is shown below.
I added path parameters in my HTTP request trigger to allow me to capture the information I need to execute the appropriate pipeline. For me this included the pipeline name, a data source ID, and a country. Your parameters would vary according to your requirements.
Logic apps has an action called “Create a pipeline run”. You tell it which data factory, which pipeline, and any parameter values needed for the pipeline execution.
At this point in the workflow, our pipeline would be executing. But now we need to know when it has finished. That’s what the Initialize Variable and Until Loop actions are handling. I created a string variable called Pipeline Status and set the default value to “InProgress”. My Until loop action checks my pipeline execution status. If it’s still running, it waits 5 seconds, gets the new status, and assigns that status to the variable. This repeats until the pipeline execution is no longer in progress.
Here’s the expression I used to check whether the pipeline execution is still running:
Once the pipeline execution is complete, an HTTP response with the pipeline status is sent back to the caller.
This is all great until you find out that Logic Apps will experience an HTTP timeout if the request takes more than 2 minutes.
Do you have any pipelines that take longer than two minutes to execute? If so, you need to change your solution to handle this. Note that you would have the same issue with Azure Functions, although it would give you 230 seconds instead of 120 seconds before it timed out. We need to switch to an asynchronous call to support long running pipelines. Paul has already done this in his framework using Azure Functions. In Logic Apps, we can change our response to an asynchronous response and then implement a polling pattern to check the status. We could alternatively implement a webhook action. I’ll write about updating the solution to handle long running pipelines in a future post.
Times are stressful right now. There is an ongoing pandemic affecting people’s health and livelihoods. Schedules are messed up, kids are home. People who aren’t used to working remotely are fumbling through learning how to work from home. And then there are the normal stresses that aren’t taking a break just because there is a pandemic: arguments with family members, home repairs, student loan debt.
I recently read the book Design for Real Life by Eric Meyer and Sara Wachter-Boettcher. I discovered it because I read another book by Sara Wachter-Boettcher,Technically Wrong: Sexist Apps, Biased Algorithms, and Other Threats of Toxic Tech that encourages us to look at the technology around us and consider how it can and has negatively affected people. If you work in analytics or web design, I recommend both books. She approaches design more from the context of user interfaces and web apps, but there is a lot of applicability to data visualization.
What’s a Stress Case?
Design for Real Life encourages us to consider stress cases. The idea behind stress cases is to consider our users as full, complex, sometimes distracted, vulnerable, and stressed-out people. They aren’t just happy personas who always click the right buttons and share all of our knowledge and assumptions.
Certain use cases, often labeled as edge cases, often go unidentified or dismissed as too infrequent. User interface designers, including data visualization designers, need to train ourselves to seek out those stress cases and understand how our design choices affect people in those usage scenarios. This helps us to minimize harm done by our design. But also, as stated in the book,
When we make things for people at their worst, they’ll work that much better when people are at their best.
Design for Real Life
There is a toxic trend in tech that we developers make things for ourselves, valuing the developers and designers over the users. Designing for stress cases helps us change that trend. We need to value our users’ time, understand our biases as designers, and make features that match our users’ priorities. A very applicable part of this for data visualization is to consider all the contexts that usage scenarios might happen.
Stress Cases in Data Visualization
Whether you are designing a visualization to be embedded in an app, included in an online news article, or published in a corporate dashboard, you can design for stress cases.
Let’s think about a corporate HR report built in Power BI that shows a prediction of employee retention. HR and team managers may use this information to help them assign projects or give promotions and raises. This dashboard and the conversation around it may address information related to identity (race, gender, sexual orientation). The use of the dashboard may affect someone’s compensation and job satisfaction.
How can we design for stress cases here?
We should be asking if we have the right (appropriate and validated) data to accomplish our goals, not just using whatever we can get our hands on. We definitely need to consider laws against discrimination that might be applicable to how we use demographic data. We need to consider the cost or harm done if we allow users to see these predictions down to the individual employee rather than an aggregation. And we should always be asking what actions people are taking as a result of using our dashboard. Have we considered any unintended consequences?
We want to use plain, easy to understand language. We want to explain our data sources and (at least high-level) how the predictive algorithm works. And we need to explain how we intend for this dashboard to be used. Basically, we want to be as transparent and easy to understand as possible. This dashboard can affect people’s livelihoods and happiness as well as the operational and financial health of the business. These data points are more than just pretty dots – they are people.
We also want to make sure our dashboard is easy to use. Think about digital affordances. We want it to be clear what happens when someone interacts with it. We can get fancy with bookmarks and editing interactions in a Power BI report. But does our intended user understand what is shown when they click a button that loads a bookmark? Can the intended user easily get what they need? Or do they have to jump through hoops to get to the right page and set the right filters?
Let’s say Sarah is a new manager at an organization that is trying to improve a high attrition rate, and she needs this information for a meeting with her boss and peers. She’s sitting in an online meeting using a 3-year-old tablet with 10 minutes of remaining battery life. Her two-year-old child is running around in the next the room. The group is discussing whether they should let some people go and then try to rebuild their teams. Can Sarah easily pull up the dashboard on her tablet and set the correct filters to see attrition numbers and retention factors for her team while trying to put on a brave face in front of her colleagues as she races against the tablet’s remaining battery time?
Visualizing the Coronavirus
Currently, it seems everyone in the data community wants to visualize the spread of COVID-19. Just look at the Power BI Data Stories Gallery. I get the appeal of using new and relevant data for practice, but consider your audience and the consequences of making it public. Do you trust and understand your data? Have you considered how the chart types and colors you choose can misrepresent the data? Who is it helping for you to publish your visualization out into the world? What message is your visualization sending? Is it unintentionally adding stress for your already stressed-out twitter followers? Most of us are not working with city officials or healthcare practitioners. We are visualizing this “for fun”. People are showing off Power BI/Tableau/D3 skills and not necessarily communicating clearly with purpose.
Your twitter friend/follower Frank is really concerned about the pandemic. His daughter lost her job at a restaurant. He’s worried about lining up projects for the next few months. And his wife is immunocompromised and in a higher risk group for getting COVID-19. He feels scared and helpless.
Your friend Dave is worried about his mother who lives alone a few hours away from him. He is bombarded by messages every day that say older people are more at risk of getting COVID-19. His mom says she is fine, but he wants her to come and stay with him. It’s all he’s thought about all day.
If you know nothing about epidemiology and don’t understand the response from various national, state, and local governments in your data, and you publish your viz with a choropleth (filled) map covered in shades of red, how will that affect Frank and Dave and everyone else who reads it? Did you add value to the COVID-19 conversation, or just increase confusion and fear?
To sum it up — #vizresponsibly; which may mean not publishing your visualizations in the public domain at all.
Stress cases can be related to a crisis, mundane technology failures, or just situations that are stressful in the context of the user’s life. If you are visualizing data, remember there are people who may be interacting with our visuals in less than ideal circumstances. We need to design for them as much as for our ideal use cases.
If you have real-life examples of designing for stress cases in data visualization, please share them in the comments or tweet me at @mmarie.
If you’ve never thought about accessibility in data visualization before, here is what I want you to know.
Your explanatory data visualization should be communicating something to your intended audience. You can’t assume people in your intended audience do not have a disability. People with disabilities want to consume data visualizations, too.
We can’t make everything 100% usable for everyone. But that doesn’t mean we should do nothing. Achieving accessibility is a shared responsibility of the tool maker and the visualization designer. There are several things we can do to increase accessibility using any data visualization tool that don’t require much effort. Regardless of the tool you use, you can usually control things like color contrast, keyboard tab/reading order, and removing or replacing jargon.
Accessible design may seem foreign or tedious in the beginning. We tend to design for ourselves because that is the user we understand most. But if we start adding tasks like checking color contrast and setting reading order into our normal design routine, it just becomes habit. Over time, those accessible design habits become easier and more intuitive.
I hope that one day accessible design will just be design. You can be part of that effort, whether you are a professional designer, a database administrator just trying to show some performance statistics, or an analyst putting together a report.
Listen to the podcast for my top 5 things you should do to make your data visualizations more accessible.
I had the pleasure of presenting a full-day pre-conference session on the Friday before SQLSaturday Austin-BI last weekend. I could spend paragraphs telling you how enjoyable and friendly and inclusive the event was. But I’d like to focus on one really cool aspect of my speaking experience: I had both live captioning and sign language interpreters in my pre-con session.
First, let’s talk about the captions. While PowerPoint does have live captions/subtitles, that only works when you are using PowerPoint. When you show a demo or go to a web page, taking PowerPoint off the screen, you lose that ability. So we had a special setup provided by Shawn Weisfeld (Twitter|GitHub).
How the Live Captions Worked
The presenter connects their laptop to the Epiphan Pearl with an HDMI cable so they can send the video (picture) from the laptop. The speaker wears a lavalier microphone, which sends audio to the Pearl. The transcription green screen computer takes audio from the Pearl, sends it to Azure to be transcribed using Cognitive Services, and overlays the returned transcription text on a green screen input that is sent back to the Pearl. The projector gets the combined output of the transcription text and the presenter’s computer video output.
You can see an example of what it looked like from my presentation on Saturday in the tweet below. There are lots more pictures of it on Twitter with the #SqlSatAustinBI hashtag.
While this setup requires a bit more hardware, it worked so well! It took about 10 minutes to get it set up in the morning. As the speaker, I didn’t have to do anything but wear a mic. It transcribed everything I said regardless of what program my laptop was showing. There was very little lag. It seemed to be less than one second between when I would say something and when we would see it on the screen. While I try to speak clearly and slowly, sometimes I slip and fall back into speaking quickly. But the transcription kept up well. Some attendees said it was great to have the captions up on the screen to help them understand what I said when I occasionally spoke too quickly. The captions are placed at the top of the screen, above the image coming from my laptop, so I didn’t have to adjust my slides or anything to allow space for the captions.
The live captions were a big success. They helped not only people who had trouble hearing, but also those who spoke English as a second language and those who weren’t familiar with some of the terms I used and needed to see them spelled out.
Presenting With Sign Language Interpreters
This was my first time presenting with sign language interpreters to help communicate with my audience. Since the pre-con session lasted multiple hours, there were two interpreters in my room. They would switch places about every hour. They were kind enough to answer a few questions for me during breaks.
I asked them if it was difficult to sign all the technical terminology used and if they tried to study up on terms ahead of time. One of them told me that they don’t study the subject and they fingerspell all the technical terms. Most of my terms were spelled on my slides, and I saw the interpreter look at the slide to get the spelling. When someone asked a question about the font I was using, the interpreter asked me to spell it out, since it wasn’t written anywhere. I asked if having printed slides helped (I provided PDFs of the slides to the attendees at the beginning of the session). One of the interpreters told me no, because they were already watching the signer for questions and watching my slides and listening to me.
What I loved most about having the interpreters there was that the person using the service got to fully participate in the session. They asked questions and made comments like anyone else. And they participated in hands-on small group activities.
Check out this great photo of one of the interpreters in action during a small group activity.
Having ASL interpreters didn’t require any extra effort on my part. I didn’t have to practice with them beforehand or provide them with any of my conference materials. They were great professionals and were able to keep up with me through lecture, demos, small group exercises, and Q&A.
Sign language interpreters cost money. And they should – they provide a valuable service. In this case, the interpreters were provided by the State of Texas because the person using the service worked for the state government. Because this was training for their job, the person’s employer was obligated to provide this service. So we were lucky that it didn’t cost us anything.
While the SQLSaturday organizers were coordinating the ASL interpreters, they found out that there is a fund in Texas that can help with accessibility services when a person’s employer doesn’t/can’t provide them. It may not be the same in every state, but it’s definitely something to look into if you need to pay for interpreters for an event like this.
Make Your Next Event More Accessible
I have organized events, and I understand the effort that it requires. I’m so happy that Angela and Mike made the effort to make SQLSaturday Austin-BI a more inclusive event. I would like to challenge you to do the same for the next event you organize or the next presentation you give at a tech conference.
Your conference may not be able to afford the Epiphan Pearl (note: the original model we used is discontinued, but there is a new model) and the Azure costs. I’d like to see SQLSaturdays join together and purchase equipment and share across events – it would be great if PASS would help with this. Or maybe a company involved in the community could sponsor them? If we can’t do that, we could always start small with the built-in capabilities in PowerPoint and work our way up from there.
It was a great experience as a speaker and as an audience member to have the live captions. And I was so happy that someone wanted to attend my session and was making the effort to sign up and request the ASL interpreters. I hope we see more of that in the future. But we need to do our part to let people know that we welcome that and we will work to make it happen.
We can now pass dynamic values to linked services at run time in Data Factory. This enables us to do things like connecting to different databases on the same server using one linked service. Some linked services in Azure Data Factory can be parameterized through the UI. Others require that you modify the JSON to achieve your goal.
Recently, I needed to parameterize a Data Factory linked service pointing to a REST API. At this time, REST APIs require you to modify the JSON yourself.
In order to pass dynamic values to a linked service, we need to parameterize the linked service, the dataset, and the activity.
I have a pipeline where I log the pipeline start to a database with a stored procedure, lookup a username in Key Vault, copy data from a REST API to data lake storage, and log the end of the pipeline with a stored procedure. My username and password are stored in separate secrets in Key Vault, so I had to do a lookup with a web activity to get the username. The password is retrieved using Key Vault inside the linked service. Data Factory doesn’t currently support retrieving the username from Key Vault so I had to roll my own Key Vault lookup there.
I have parameterized my linked service that points to the source of the data I am copying. My linked service has 3 parameters: BaseUrl, Username, and SecretName. The JSON for my linked service is below. You can see that I need to reference the parameter as the value for the appropriate property and also define the parameter at the bottom.
I have defined these three parameters in my dataset, along with one more parameter that is specific to the dataset (that doesn’t get passed to the linked service). I don’t need to set the default value on the Parameters tab of the dataset.
On the Connection tab of the dataset, I set the value as shown below. We can see that Data Factory recognizes that I have 3 parameters on the linked service being used. The relativeURL is only used in the dataset and is not used in the linked service. The value of each of these properties must match the parameter name on the Parameters tab of the dataset.
In my copy activity, I can see my 4 dataset parameters on the Source tab. There, I can write expressions to provide the values that should be passed through to the dataset, 3 of which are passed through to the linked service. In my case, this is a child pipeline that is called from a parent pipeline that passes in some values through pipeline parameters which are used in the expressions in the copy activity source.
And that’s it. I can run my pipeline and have it call different REST APIs using one linked service and one dataset.