Azure, Azure SQL DB, Microsoft Technologies, SQL Server

How to Change the Browser Used by SSMS for AAD Auth

Did you know that you can change the browser used by SQL Server Management Studio to authenticate using Azure Active Directory to a SQL database in Azure?

I had been experiencing serious delays with the window that pops up to accept my credentials taking 30 seconds or more to populate. I also once got a warning that the browser I was using was old.

I recently learned that I can change the browser used for this purpose in the SSMS settings.

First, make sure you have updated SSMS to the latest version.

Then, go to Tools > Options. Select Azure Services from the menu on the left in the Options dialog. In the Miscellaneous section, you will see a setting titled “UseSystemBrowser”. By default, it will be set to false.

The Options dialog in SSMS showing the UseSystemBrowser setting

When you set this to true, the window to complete AAD authentication will open in the default browser on your machine. For instance, my default browser is set to Chrome (sorry, Edge 😉). So now my authentication window opens as a browser tab in my existing open Chrome window. Then I can enter my credentials as normal. And changing the browser fixed my issues with the authentication window being slow.

Azure SQL DB, Azure SQL DW, Data Warehousing, Microsoft Technologies, Python, SQL Server

pandas.DataFrame.drop_duplicates and SQL Server unique constraints

I’ve now helped people with this issue a few times, so I thought I should blog it for anyone else that runs into the “mystery”.

Here’s the scenario: You are using Python, perhaps in Azure Databricks, to manipulate data before inserting it into a SQL Database. Your source data is a flattened data extract and you need to create a unique list of values for an entity found in the data. For example, you have a dataset containing sales for the last month and you want a list of the unique products that have been sold. Then you insert the unique product values into a SQL table with a unique constraint, but you encounter issues on the insert related to unique values.

Given a data frame with one column containing product names, you might write some Python like df.dropDuplicates(subset="ProductName").

This would give you a list of distinct product names. But Python is case sensitive. So if you have product “abc123” and product “ABC123” those are considered distinct.

Case sensitivity in SQL Server

SQL Server is case insensitive by default. While you can use a case sensitive collation, most databases don’t.

Let’s say I create a table to contain products and add a unique constraint on product name. Adding a unique constraint means that I cannot insert two values into that column that are the same.

CREATE TABLE [dbo].[Products] (
    [ProductName] nvarchar(100) NOT NULL
)
GO

ALTER TABLE [dbo].[Products] 
ADD CONSTRAINT UQ_Product_ProductName UNIQUE (ProductName)

I can insert my first row.

Insert into dbo.Products (ProductName)
values ('abc123')

Now I try to insert another row where the letters are capitalized.

Insert into dbo.Products (ProductName)
values ('ABC123')

This fails with the following error:

Violation of UNIQUE KEY constraint 'UQ_Product_ProductName'. Cannot insert duplicate key in object 'dbo.Products'. The duplicate key value is (ABC123).

This is because my database collation is set to SQL_Latin1_General_CP1_CI_AS (the CI means case insensitive).

Potential solution

If you need to get to a list of case insensitive distinct values, you can do the following:

  1. Create a column that contains the values of the ProductName converted to lower case.
  2. Run the dropDuplicates on the lowercase column.
  3. Drop the lowercase column.

You can see code on this Stack Overflow solution as an example (as always, please read it and use at your own risk).

Of course, you need be sure that you don’t need to keep both (differently cased) versions of the value as distinct before you follow the above steps. That’s a business/data decision that must be considered. If you are fine to only have one value regardless of case, this should work. Be careful if you further use this data in Python. pandas.DataFrame.merge (similar to a SQL join) is case sensitive, as are most Python functions. Make sure you are handling your data correctly there, or just do your joins before you deduplicate.

Azure, Azure Data Factory, Azure SQL DB, Microsoft Technologies, PowerShell

Thoughts on Unique Resource Names in Azure

Each resource type in Azure has a naming scope within which the resource name must be unique. For PaaS resources such as Azure SQL Server (server for Azure SQL DB) and Azure Data Factory, the name must be globally unique within the resource type. This means that you can’t have two data factories with the same name, but you can have a data factory and a SQL server with the same name. Virtual machine names must be unique within the resource group. Azure Storage accounts must be globally unique. Azure SQL Databases should be unique within the server.

Since Azure allows you to create a data factory and a SQL server with the same resource name, you may think this is fine. But you may want to avoid this, especially if you plan on using system-defined managed identities or using Azure PowerShell/CLI. And if you aren’t planning on using these things, you might want to reconsider.

I ran into this issue of resources with the same name in a client environment and then recreated it in my Azure subscription to better understand it.

I already had a data factory named adf-deploydemo-dev so I made an Azure SQL server named adf-deploydemo-dev and added a database with the same name.

A data factory named adf-deploymentdemo-dev, a SQL Server named adf-deploymentdemo-dev, and a database named adf-deploymentdemo-dev
A data factory, a SQL Database, and a SQL Server all with the same name in the same region and same resource group

Azure Data Factory should automatically create its system-assigned managed identity. It will use the resource name for the name of the service principal. When you go to create a linked service in Azure Data Factory Studio and choose to use Managed Identity as the authentication method, you will see the name and object ID of the managed identity.

Managed identity name: adf-deploymentdemo-dev. Managed identity object ID: 575e8c6e-dfe6-4b5f-91be-40b0f0b9643b
Information shown in my data factory when creating a linked service for a storage account.

For the Azure SQL Server, we can create a managed identity using PowerShell. The Set-AzSqlServer cmdlet has an -AssignIdentity parameter, which creates the system-assigned managed identity.

Executing PowerShell command: Set-AzSqlServer -AssignIdentity -ResourceGroupName 'ADFDemployDemoDev' -ServerName 'adf-deploydemo-dev'
Executing the PowerShell command to create a managed identity

If you use Get-AzSqlServer to retrieve the information and assign the Identity property to a variable, you can then see the system-assigned managed identity and its application ID.

Executing PowerShell command: $S = Get-AzSqlServer -ResourceGroupName 'ADFDemployDemoDev' -ServerName 'adf-deploydemo-dev'
$S.Identity
The results show principalID, Type, and TenantID
Verifying the managed identity is in place for an Azure SQL server.

Now when I look in Active Directory, I can see both managed identities have the same name but different application IDs and object IDs.

Two managed identities in AAD, both called adf-deploymentdeo-dev.
Two managed service principals used for managed identities that have the same name but different IDs

Everything is technically working right now, but I have introduced some needless ambiguity that can cause misunderstandings and issues.

Let’s say that I want to grant the Storage Blob Data Reader role to my data factory. I go to the storage account, choose to add a role assignment, select the role, and then go to add members. This is what I see:

The user interface to select members to add to a role assignment shows users and service principals by name, so ti contains two objects named adf-deploydemo-dev
Which managed identity belongs to the data factory?

Or let’s say that I use PowerShell to get lists of resources by name. I may be locating resources to add tags, add a resource lock, or move the resource to another region or resource group.

Executing PowerShell command Get-AzResource - Name 'adf-deploydemo-dev' | ft
Getting resources by name returns all three resources

If I don’t specify the resource type, I will get my data factory, my database, and my server in the results. You may be saying “Well, I would always specify the type.” Even if that is true, are you sure all coworkers and consultants touching your Azure resources would do the same?

Why introduce this ambiguity when there is no need to do so?

There are some good tips in the Cloud Adoption Framework in Microsoft Docs about naming conventions. Your organization probably wants to decide up front what names are acceptable and then use Azure Policy as well as good processes to ensure adherence to your defined conventions. And if I were the consultant advising you, I would suggest that resources within your tenant be unique across resource types. The suggestion in Docs is to use a resource type abbreviation at the beginning of your resource name. That would avoid the issue I have demonstrated above. Naming conventions should be adjusted to your organization’s needs, but the ones suggested in Docs are a good place to start if you need some help. It is beneficial to have some kind of resource naming convention beyond just whatever is allowed by Azure.

Azure, Azure SQL DB, Microsoft Technologies, T-SQL

Altering a Computed Column in a Temporal Table in Azure SQL

System-versioned temporal tables were introduced in SQL Server 2016. They provide information about data stored in the table at any point in time by storing an effective dated version of each row rather than only the data that is correct at the current time

You can alter a temporal table to add or change columns, but you must first turn off system versioning. Let’s look at an example.

CREATE TABLE [dbo].[DatabaseSize](
	 [DatabaseID] [varchar](200) NOT NULL 
	,[ServerName] [varchar](100) NOT NULL
	,[DatabaseName] [varchar](100) NOT NULL
	,[SizeBytes] [bigint] NULL
	,[SizeMB]  AS ([SizeBytes]/(1048576))
	,[ValidFrom] [datetime2](7) GENERATED ALWAYS AS ROW START NOT NULL
	,[ValidTo] [datetime2](7) GENERATED ALWAYS AS ROW END NOT NULL
	,PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
	,CONSTRAINT PK_DatabaseSize_DatabaseID PRIMARY KEY CLUSTERED (DatabaseID)
) WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = [dbo].[DatabaseSizeHistory]));

Temporal tables must have a primary key defined. They also must contain two datetime2 columns, declared as GENERATED ALWAYS AS ROW START / END. The statement above creates both the current table and a history table.

The history table has the same schema as the current table, with one difference: the SizeMB column in the history table is not a computed column.

The dbo.DatabaseSize table is a system-versioned table. The DatabaseSizeHistory table is the related history table. DatabaseSizeHistory contains the same columns as DatabaseSize, except the SizeMB column is not a computed column in the history table.


When I initially created the table, I typoed the formula in the computed column. You can’t alter a computed column — you must drop and recreate the column. This is no problem, just turn off system versioning and alter your table, and turn system versioning back on.

But if you try this without specifying your history table, you will find that it stops using the history table created earlier and makes a new history table.

dbo.DatabaseSize is a system versioned table. The history table now shows as dbo.MSSL_TemporalHistoryFor_1909581841

If you specify your history table while turning system versioning back on, you will encounter an error:

Setting SYSTEM_VERSIONING to ON failed because column 'SizeMB' at ordinal 5 in history table 'Test.dbo.DatabaseSizeHistory' has a different name than the column 'ValidFrom' at the same ordinal in table 'Test.dbo.databasesize'.

Temporal tables match the columns between the current table and history table not only by name and data type but by the column’s ordinal position. Dropping and adding the computed column changed its order as it was added to the end of the table.

You can change the column order of a table in the SQL Server Management Studio UI by right-clicking on the table, selecting Design, and then dragging the column to the correct position. Note that you cannot do this on the system-versioned table while system versioning is on. You can either change the column order on the history table, or turn system versioning off and then change the current table.

dragging the SizeMB column to the bottom of the columns list in the table.

Once the column orders match, you can turn system versioning back on and specify the the history table.

ALTER TABLE [dbo].[DatabaseSize]
SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = [dbo].[DatabaseSizeHistory]));

This time, the command should complete successfully. You’ll want to drop any unused auto-created history tables before you’re finished.

Azure, Azure SQL DB, Microsoft Technologies, SQL Server

New Centralized View of SQL Resources in Azure

Yesterday some new views were made available in the Azure portal that will be helpful to those of us who create or manage Azure SQL resources.

First, a new guided approach to creating resources has been added to the Azure portal. We now have a unified experience to create Azure SQL resources that offers guidance as to the type of Azure SQL resource you need for your use case: SQL database, managed instance, or SQL Server virtual machine. This new Azure SQL blade under Marketplace offers a high-level description of each offering and the scenario that it best serves. If you already know what you want but are having trouble remembering exactly what the resource is called in the marketplace, this can also alleviate that issue.

New guided experience for creating SQL Azure resources in the Azure Portal
New guided experience for creating SQL Azure resources in the Azure Portal

Notice that SQL virtual machine images are a listed offering in the new experience. As Microsoft phrased it, “SQL Server on Azure VMs is now a first-class member of the Azure SQL family.” This blade gives you an easily accessible place to see all the SQL Server VM images without having to search through lots of other unrelated VM images.

A drop-down box listing all of the available SQL Server VM images
A drop-down box listing all of the available SQL Server VM images

Once your Azure SQL resources are created, you can use the new centralized management hub to administer them. Locate the Azure SQL resources blade to see a list of all of your single databases, database servers, elastic pools, managed instances, and virtual machines running SQL.

Centralized management hub for Azure SQL resources
Centralized management hub for Azure SQL resources

This is the foundation for a unified database platform in Azure with more consistency across offerings and more manageability features to come in the future. For more information, read the announcement from Microsoft or watch the new video they posted on Channel 9.