Azure, Microsoft Technologies, SQL Server

Check if File Exists Before Deploying SQL Script to Azure SQL Managed Instance in Azure Release Pipelines

I have been in Azure DevOps pipelines a lot recently, helping clients set up automated releases. Many of my clients are not in a place where automated build and deploy of their SQL databases makes sense, so they deploy using change scripts that are reviewed before deployment.

We chose release pipelines over the YAML pipelines because it was easier to manage for the client and pretty quick to set up. While I had done this before, I had a couple of new challenges:

  • I was deploying to an Azure SQL managed instance that had no public endpoint.
  • There were multiple databases for which there may or may not be a change script to execute in each release.

This took a bit longer than I expected, and I enlisted my coworker Bill to help me work through the final details.

Setup

There are a few pre-requisites to make this work.

Self-hosted Agent

First, you need to run the job on a self-hosted agent that is located on a virtual machine that has access to the managed instance. This requires downloading and installing the agent on the VM. We ended up installing this on the same VM where we have our self-hosted integration runtime for Azure Data Factory, at least for deployment from dev to test. This was ok because the VM had enough resources to support both, and deployments are not long or frequent.

When you create a new self-hosted agent in Azure DevOps, a modal dialog appears, which provides you with a link to download the agent and a command line script to install and configure the agent.

Dialog from Azure DevOps titled "Get the agent". There are options for Windows, macOS, and Linux. There is a button to copy a link to download the agent. Then there is a set of command line scripts to create the agent, and configure the agent.
Dialog showing the instructions to download and install the Azure Pipelines Self-hosted agent

We configured the agent to run as a service, allowing it to log on as Network Service. We then could validate it was running by opening the Services window and looking for Azure Pipelines Agent.

You must add an inbound firewall rule for the VM to allow the traffic from the IP addresses related to dev.azure.com. This can be done in the Azure portal on the Networking settings for the VM.

Then you must install the SqlServer module for PowerShell on the VM. And you must also install sqlpackage on the VM. We also updated the System Path variable to include the path to sqlpackage (which currently gets installed at C:\Program Files\Microsoft SQL Server\160\DAC\bin).

DevOps Permissions

Azure DevOps has a build agent with a service principal for each project. That service principal must have Contribute permissions on the repo in order to follow my steps.

Database Permissions

If you want the Azure DevOps build agent to execute a SQL script, the service principal must have appropriate permissions on the database(s). The service principal will likely need to create objects, execute stored procedures, write data, and read data. But this depends on the content of the change scripts you want to execute.

The Pipeline

In this project we had one pipeline to release some analytics databases and Azure Data Factory assets from dev to test.

Therefore, we included the latest version of the repos containing our database projects and data factory resources as artifacts in our pipeline. While the database objects and data factory resources were related, we decided they weren’t dependent upon each other. It was ok that one might fail and one succeed.

Azure DevOps release pipeline with 2 artifacts (database code and ADF code) and 2 stages (Database deployment and ADF Deployment).

Side note: I love the ADF Tools extension for Azure DevOps from SQL Player. It is currently my preferred way to deploy Azure Data Factory. I find it is a much better experience than ARM template deployment. So that is what I used here in this ADF Deploy job.

Repo Setup

Getting back to databases, we had 3 databases to which we might want to deploy changes. We decided that we would set a standard of having one change script per database. We have an Azure DevOps repo that contains one solution with 3 database projects. These databases are highly related to each other, so the decision was made to keep them all in one repo. I added a folder to the repo called ChangeScripts. ChangeScripts contains two subfolders: Current and Previous. When we are ready to deploy database changes, we add a .sql file with the name of the database into the Current folder. For any deployment, there may be 1, 2, or 3 change scripts (1 for each database that has changes). When the release pipeline runs, the change scripts in the Current folder are executed and the files are renamed and moved into the Previous folder. This is why my DevOps service principal needed Contribute permissions on the repo.

Here are the tasks in my Database Deployment job:

  1. Check if SQL Scripts exist in the expected location.
  2. If a script exists for Database1, execute the script.
  3. If a script exists for Database 2, execute the script.
  4. If a script exists for Database 3, execute the script.
  5. Rename the change scripts to fit the pattern [DatabaseName][yyyyMMddHHmm].sql and move them to the Previous folder.

Configuring the Tasks

The first thing we must do is set the agent job to run using the agent pool containing the self-hosted agent. There is another important setting under Additional options for Allow scripts to access the OAuth token. This allows that service principal to authenticate and execute the SQL scripts, so make sure that is checked.

My first task, called Check for SQL Scripts is a PowerShell task. I used an inline script, but you can also reference a file path that is accessible by the agent. My script looks for the three files and then sets a pipeline variable for as true or false to indicate the presence each file. This variable is used later in the process.

My task is configured with the working directory set to $(System.DefaultWorkingDirectory)/_Databases/ChangeScripts/Current. There are no environment variables or output variables.

Here’s my PowerShell script:

$DB1Script = "Database1.sql"
$DB2Script = "Database2.sql"
$DB3Script = "Database3.sql"

$IsDB1Script = Test-Path -Path $DB1Script -PathType Leaf
$IsDB2Script =  Test-Path -Path $DB2Script -PathType Leaf
$IsDB3Script = Test-Path -Path $DB3Script -PathType Leaf

Write-Host $IsDB1Script  $IsDB2Script $IsDB3Script

Write-Output ("##vso[task.setvariable variable=DB1Script;]$IsDB1Script")

Write-Output ("##vso[task.setvariable variable=DB2Script;]$IsDB2Script")

Write-Output ("##vso[task.setvariable variable=DB3Script;]$IsDB3Script")

We got stuck on the syntax for setting variables for a while. Most of the things we found in Microsoft Docs and on blogs were close, but didn’t work. Finally, we found this blog, which showed us the syntax that worked – be careful with the quotes, semi-colon, and parentheses.

My second task executes the script for Database1. It uses the Azure SQL Database deployment task. When you use this task, you can specify whether it deploys a dacpac or executes a SQL script. We chose to execute a SQL script that we stored in the same repo as the database projects. For this task, you must specify the Azure Subscription, the target server, and the target database. I set my authentication type to Service Principal, since I had granted access to the Azure DevOps service principal during setup. My Deploy type was set to SQL Script File. And my SQL Script location was set to $(System.DefaultWorkingDirectory)/_Databases/ChangeScripts/Current/Database1.sql. An important configuration on this task is to set Run this task to Custom Conditions. I specified my custom condition as eq(variables['DB1Script'], 'True'). This tells the task to execute only if the variable we set in the previous task is equal to True. It seems data types don’t really matter here – I set the variable in the PowerShell script as a boolean, but it was converted to a string in Azure DevOps.

My third and fourth tasks do the same activities for Database2 and Database3.

My last task, called Move Change Scripts to Previous Folder, is another inline PowerShell script. The script uses the variables set in the first task to determine if the file exists and needs to be moved. Because the script is working on a local copy on the VM, it must commit the changes back to the git repo after renaming and moving the file(s). The Working Directory for this task is set to $(System.DefaultWorkingDirectory)/_Databases/ChangeScripts.

Here’s the code:

# Write your PowerShell commands here.

Write-Host "Getting Files"
$path = Get-Location
Write-Host $path


$File1 = "Current/Database1.sql"
$File2 = "Current/Database2.sql"
$File3 = "Current/Database3.sql"

$IsDB1Script = Test-Path -Path $File1 -PathType Leaf
$IsDB2Script = Test-Path -Path $File2 -PathType Leaf
$IsDB3Script = Test-Path -Path $File3 -PathType Leaf

$DateString = Get-Date -Format "yyyyMMddHHmm"

$NewFile1 = "Previous/Database1$($DateString).sql"
$NewFile2 = "Previous/Database2$($DateString).sql"
$NewFile3 = "Previous/Database3$($DateString).sql"


 Write-Host "Changing file locations"
    git config --global user.email "myemail@domain.com"
    git config --global user.name "My Release Pipeline"

   
if($IsDB1Script) 
{
    $File1 |  git mv ($File1) ($NewFile1) 
    Write-Host "Moved Database1 script"
}

if($IsDB2Script)
{
    $File2 |  git mv ($File2) ($NewFile2) 
  Write-Host "Moved Database2 script"
}

if($IsDB3Script)
{
  $File3 |  git mv ($File3) ($NewFile3) 
  Write-Host "Moved Database3 script"
}

git commit -a -m 'Move executed change scripts to Previous Folder'
git push origin HEAD:master

Anything you output from your PowerShell script using Write-Host is saved in the release logs. This is helpful for troubleshooting if something goes wrong.

As you can see above, you can execute git commands from PowerShell. This Stack Overflow question helped me understand the syntax.

And with those five steps, I’m able to execute change scripts for any or all of my three databases. We don’t anticipate adding many more databases, so this configuration is easily manageable for the situation. I wouldn’t do this (one step per database) for a situation in which I have dozens or hundreds of databases that may be created or deleted at any time.