Power BI: Where Should My Data Live? Webcast

When you start a Power BI project, you need to decide how and where to store the data in your dataset. There are three “traditional” options:

  • Imported Model: Data is imported, compressed, and stored in the PBIX file, which is then published to the Power BI Service (or Report Server if you are on-prem).
  • Live Connection: Data is stored in Analysis Services, and your Power BI dataset is really a reference to the Analysis Services database.
  • DirectQuery: Data remains in the source system; Power BI stores metadata and a reference to the source data, executing live queries when a user interacts with a report.

As Power BI has evolved, there are now some variations and additions to those options. Composite models allow you to combine imported data sources and DirectQuery data sources. We also now have dataflows, which allow you to use self-service data prep to define and share reusable data entities.

Each of these options has its advantages and limitations. There is no single option that is always the right choice.

If you have been struggling with this topic, or just want to double-check your thinking, please join me and Kerry Tyler (@AirborneGeek on Twitter) for our Denny Cherry & Associates Consulting webcast on April 5th at 12pm Mountain / 2pm Eastern.

The webcast will review your options for where to store data and explain the factors that determine which option is right for you. We’ll discuss the obvious requirements, such as data size, license costs and management, and desired data latency. We’ll also talk about other factors, such as the desire for self-service BI and avoiding data model sprawl. We’ll have content to present, but we are also happy to take questions during the webcast.

Register for the webcast today and join us next Friday, April 5th.

Power BI Now Has Keyboard Accessible Visual Interactions

The March 2019 release of Power BI Desktop has brought us keyboard accessible visual interactions. One of Power BI’s natural strengths is that you can click on a data point within a visual and have it cross-highlight or cross-filter the other visuals on a page. But keyboard-only users weren’t able to use this feature until now. This greatly improves the accessibility of the Power BI report consumption experience.

Below is a demonstration of interacting with a visual using keyboard commands. Notice how I can select specific data points within the line chart, and the other charts on the page filter based upon the selection.

Keyboard commands can now access visual interactions

If you are a person who prefers to use a keyboard over a mouse, this might also be something you want to try. Relevant keyboard commands include:

  • Ctrl + right arrow: Move focus into the chart area of the visual
  • Tab or arrow keys: Navigate between data points (or legend items in a chart that contains a legend)
  • Enter or Space: Select a data point within a visual
  • Ctrl + Enter or Ctrl + Space: Select multiple data points within a visual
  • Ctrl + Shift + C: Clear all selections

I think this was the last big missing piece of keyboard accessibility. I’m excited to see its impact on users. If you try keyboard accessible visual interactions, or are taking advantage of keyboard accessibility in Power BI in general, please let me know how you are liking it. Tweet me at @mmarie or send me a note via my blog contact form.

There is Now A Delete Activity in Data Factory V2!

Data Factory can be a great tool for cloud and hybrid data integration. But since its inception, it was less than straightforward to move data (copy it to another location and then delete the original copy).

It is a common practice to load data to blob storage or data lake storage before loading to a database, especially if your data is coming from outside of Azure. We often create a staging area in our data lakes to hold data until it has been loaded to its next destination. Then we delete the data in the staging area once our subsequent load is successful. But before February 2019, there was no Delete activity. We had to write an Azure Function or use a Logic App called by a Web Activity in order to delete a file. I imagine every person who started working with Data Factory had to go and look this up.
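
That workaround typically looked something like the sketch below: a Web activity calling the HTTP trigger of a Logic App (or an Azure Function) that performed the actual delete. The activity name, URL, and payload here are placeholders, not from any real pipeline.

  {
      "name": "DeleteStagedFileViaLogicApp",
      "type": "WebActivity",
      "typeProperties": {
          "url": "https://<your-logic-app-http-trigger-url>",
          "method": "POST",
          "body": {
              "folderPath": "staging/sales",
              "fileName": "sales.csv"
          }
      }
  }

The Logic App or Function itself then had to contain the logic to connect to storage and delete the file, which was one more thing to build, deploy, and secure.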

But now Data Factory V2 has a Delete activity.

Data Factory V2 Copy activity followed by a Delete activity
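
A rough JSON sketch of that same pattern is below. The pipeline, activity, and dataset names are made up, and the Copy activity’s source and sink settings are reduced to simple placeholders; the point is that the Delete activity references the staging dataset and only runs once the copy has succeeded.

  {
      "name": "LoadFromStagingAndCleanUp",
      "properties": {
          "activities": [
              {
                  "name": "CopyStagedData",
                  "type": "Copy",
                  "inputs": [ { "referenceName": "StagedFiles", "type": "DatasetReference" } ],
                  "outputs": [ { "referenceName": "TargetTable", "type": "DatasetReference" } ],
                  "typeProperties": {
                      "source": { "type": "BlobSource" },
                      "sink": { "type": "SqlSink" }
                  }
              },
              {
                  "name": "DeleteStagedData",
                  "type": "Delete",
                  "dependsOn": [
                      { "activity": "CopyStagedData", "dependencyConditions": [ "Succeeded" ] }
                  ],
                  "typeProperties": {
                      "dataset": { "referenceName": "StagedFiles", "type": "DatasetReference" }
                  }
              }
          ]
      }
  }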

How the Delete Activity Works

The Delete activity can delete from the following data stores:

  • Azure blob storage
  • ADLS Gen 1
  • ADLS Gen 2
  • File systems
  • FTP
  • SFTP
  • Amazon S3

You can delete files or folders. You can also specify whether you want to delete recursively (delete the contents of the specified folder, including all of its subfolders). If you want to delete files/folders from a file system on a private network (e.g., on premises), you will need to use a self-hosted integration runtime running version 3.14 or higher. Data Factory will need write access to your data store in order to perform the delete.

You can log the deleted file names as part of the Delete activity. It requires you to provide a blob storage or ADLS Gen 1 or 2 account as a place to write the logs.
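
As a sketch (the dataset and linked service names below are made up), the Delete activity’s typeProperties might look roughly like this with recursive delete and logging both turned on:

  "typeProperties": {
      "dataset": { "referenceName": "StagedFiles", "type": "DatasetReference" },
      "recursive": true,
      "enableLogging": true,
      "logStorageSettings": {
          "linkedServiceName": { "referenceName": "AuditLogStorage", "type": "LinkedServiceReference" },
          "path": "deletelogs/staging"
      }
  }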

You can parameterize the following properties in the Delete activity itself:

  • Timeout
  • Retry
  • Delete file recursively
  • Max concurrent connections
  • Enable Logging
  • Logging folder path

You can also parameterize your dataset as usual.
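
For example, if you added pipeline parameters named DeleteRecursively and EnableDeleteLogging (hypothetical names), the parameterized properties would show up as dynamic content expressions, roughly like this:

  "typeProperties": {
      "dataset": { "referenceName": "StagedFiles", "type": "DatasetReference" },
      "recursive": {
          "value": "@pipeline().parameters.DeleteRecursively",
          "type": "Expression"
      },
      "enableLogging": {
          "value": "@pipeline().parameters.EnableDeleteLogging",
          "type": "Expression"
      }
  }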

All that’s required in the Delete activity is an activity name and dataset. The other properties are optional. Just be sure you have specified the appropriate file path. Maybe try this out in dev before you accidentally delete your way through prod.

To delete all contents of a folder (including subfolders), specify the folder path in your dataset and leave the file name blank, then check the box for “Delete file recursively”.

You can use a wildcard (*) to specify files, but it cannot be used for folders.
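
For instance, to delete only the .csv files in a staging folder, a blob storage dataset might look something like the sketch below (the dataset, linked service, and path names are made up). The wildcard goes in the file name, while the folder path stays literal.

  {
      "name": "StagedCsvFiles",
      "properties": {
          "type": "AzureBlob",
          "linkedServiceName": { "referenceName": "StagingBlobStorage", "type": "LinkedServiceReference" },
          "typeProperties": {
              "folderPath": "staging/sales",
              "fileName": "*.csv"
          }
      }
  }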

Here’s to much more efficient development of data movement pipelines in Azure Data Factory V2.