Azure, Data Warehousing, Microsoft Technologies, T-SQL

Create a Date Dimension in Azure SQL Data Warehouse

Most data warehouses and data marts require a date dimension or calendar table. Those of us that have been building data warehouses in SQL Server for a while have collected our favorite scripts to build out a date dimension. For a standard date dimension, I am a fan of Aaron  Bertrand’s script posted on MSSQLTips.com. But the current version (as of Aug 8, 2016) of Azure SQL Data Warehouse doesn’t support computed columns, which are used in Aaron’s script.

I adapted Aaron’s script to work in Azure SQL Data Warehouse and am sharing it with you below, so you don’t have to do the same. I did leave out holidays because I didn’t need them in my calendar table.  To add them back in, just add the column back to the table and use Aaron’s update statements. I also changed the MonthYear field to include a space between the month and year. Otherwise, my script should produce the same results as Aaron’s in a Azure SQL DW friendly way. Notice that I chose to use Round Robin distribution for my date table. Also, tables are now created with clustered columnstore indexes by default in Azure SQL DW.


Creates a table called RPT.Calendar. Change the table name on line 69. Change date range on line 2.
DECLARE @StartDate DATE = '20100101', @NumberOfYears INT = 30;
prevent set or regional settings from interfering with
interpretation of dates / literals
CREATE TABLE #dimdate
(
[date] DATE,
[day] tinyint,
[month] tinyint,
FirstOfMonth date,
[MonthName] varchar(12),
[week] tinyint,
[ISOweek] tinyint,
[DayOfWeek] tinyint,
[quarter] tinyint,
[year] smallint,
FirstOfYear date,
Style112 char(8),
Style101 char(10)
);
SET DATEFIRST 7;
SET DATEFORMAT mdy;
SET LANGUAGE US_ENGLISH;
DECLARE @CutoffDate DATE = DATEADD(YEAR, @NumberOfYears, @StartDate);
this is just a holding table for intermediate calculations:
use the catalog views to generate as many rows as we need
INSERT #dimdate([date])
SELECT d
FROM
(
SELECT d = DATEADD(DAY, rn 1, @StartDate)
FROM
(
SELECT TOP (DATEDIFF(DAY, @StartDate, @CutoffDate))
rn = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
FROM sys.all_objects AS s1
CROSS JOIN sys.all_objects AS s2
on my system this would support > 5 million days
ORDER BY s1.[object_id]
) AS x
) AS y;
UPDATE #DimDate
set
[day] = DATEPART(DAY, [date]),
[month] = DATEPART(MONTH, [date]),
FirstOfMonth = CONVERT(DATE, DATEADD(MONTH, DATEDIFF(MONTH, 0, [date]), 0)),
[MonthName] = DATENAME(MONTH, [date]),
[week] = DATEPART(WEEK, [date]),
[ISOweek] = DATEPART(ISO_WEEK, [date]),
[DayOfWeek] = DATEPART(WEEKDAY, [date]),
[quarter] = DATEPART(QUARTER, [date]),
[year] = DATEPART(YEAR, [date]),
FirstOfYear = CONVERT(DATE, DATEADD(YEAR, DATEDIFF(YEAR, 0, [date]), 0)),
Style112 = CONVERT(CHAR(8), [date], 112),
Style101 = CONVERT(CHAR(10), [date], 101)
;
CREATE TABLE RPT.Calendar
WITH
(
DISTRIBUTION = ROUND_ROBIN
)
AS
SELECT
DateKey = CONVERT(INT, Style112),
[Date] = [date],
[Day] = CONVERT(TINYINT, [day]),
DaySuffix = CONVERT(CHAR(2), CASE WHEN [day] / 10 = 1 THEN 'th' ELSE
CASE RIGHT([day], 1) WHEN '1' THEN 'st' WHEN '2' THEN 'nd'
WHEN '3' THEN 'rd' ELSE 'th' END END),
[Weekday] = CONVERT(TINYINT, [DayOfWeek]),
[WeekDayName] = CONVERT(VARCHAR(10), DATENAME(WEEKDAY, [date])),
[DOWInMonth] = CONVERT(TINYINT, ROW_NUMBER() OVER
(PARTITION BY FirstOfMonth, [DayOfWeek] ORDER BY [date])),
[DayOfYear] = CONVERT(SMALLINT, DATEPART(DAYOFYEAR, [date])),
WeekOfMonth = CONVERT(TINYINT, DENSE_RANK() OVER
(PARTITION BY [year], [month] ORDER BY [week])),
WeekOfYear = CONVERT(TINYINT, [week]),
ISOWeekOfYear = CONVERT(TINYINT, ISOWeek),
[Month] = CONVERT(TINYINT, [month]),
[MonthName] = CONVERT(VARCHAR(10), [MonthName]),
[Quarter] = CONVERT(TINYINT, [quarter]),
QuarterName = CONVERT(VARCHAR(6), CASE [quarter] WHEN 1 THEN 'First'
WHEN 2 THEN 'Second' WHEN 3 THEN 'Third' WHEN 4 THEN 'Fourth' END),
[Year] = [year],
MMYYYY = CONVERT(CHAR(6), LEFT(Style101, 2) + LEFT(Style112, 4)),
MonthYear = CONVERT(CHAR(8), LEFT([MonthName], 3) + ' ' + LEFT(Style112, 4)),
FirstDayOfMonth = FirstOfMonth,
LastDayOfMonth = MAX([date]) OVER (PARTITION BY [year], [month]),
FirstDayOfQuarter = MIN([date]) OVER (PARTITION BY [year], [quarter]),
LastDayOfQuarter = MAX([date]) OVER (PARTITION BY [year], [quarter]),
FirstDayOfYear = FirstOfYear,
LastDayOfYear = MAX([date]) OVER (PARTITION BY [year]),
FirstDayOfNextMonth = DATEADD(MONTH, 1, FirstOfMonth),
FirstDayOfNextYear = DATEADD(YEAR, 1, FirstOfYear)
FROM #dimdate
;
DROP Table #dimdate;

Instead of the computed columns, I inserted the dates, then updated the other columns in the temporary table. Then I used the Create Table As Select syntax to create my final RPT.Calendar table.

Happy time trending!

4 thoughts on “Create a Date Dimension in Azure SQL Data Warehouse”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s