I’ve been working on a project that includes geographical data representing stops on a delivery route. I’ve just completed loading this data into a data mart. The source data contains longitude and latitude in millionths of a degree with 9 digits of data. We haven’t decided what tool we will use to visualize this data yet, but we know Power View and Power Map both accept latitude and longitude values. I decided to store my longitude and latitude data in decimal (9,6) fields. There is a good possibility that we may be computing distances between points in the future, so I thought it would be good to store the data as a spatial data type as well. I thought I would share a few things that I learned along the way.
There are two spatial data types in SQL Server: geometry and geography. Geometry represents the flat-earth system where units are all equally spaced apart. Geography represents the round-earth system measured in latitude and longitude. Since I had longitude and latitude in my data, I used the geography data type. The geography spatial data type is implemented as a .NET common language runtime (CLR) data type in SQL Server.
I populated my table using a query of which I’ve included a snippet below. You can see the use of the Point function to create my geography values.
SELECT ... [Latitude] = [Latitude] / 1000000.0 ,[Longitude] = [Longitude] / 1000000.0 ,[GeographyPoint] = geography::Point([Latitude]/1000000.0, [Longitude]/1000000.0, 4326) FROM [MySourceTable]
The Point function accepts a a latitude, longitude, and SRID, and returns a geography value. An SRID is a unique identifier associated with a coordinate system, tolerance, and resolution. SRIDs are not specific to SQL Server. They are maintained by the International Association of Oil & Gas Producers (OGP) Surveying & Positioning Committee. Here’s a blog post that I think does a good job explaining many of the terms associated with spatial data in SQL Server.
Tip #1: You can see a list of SRIDs available in SQL Server by running the following query. SQL Server uses the default SRID of 4326, which is the WGS 84 spatial reference system.
SELECT * FROM sys.spatial_reference_systems
My source database has planned delivery stops and times and actual delivery stops and times stored in separate columns in a very wide table. I decided to pivot that data and create a table with a scenario key that refers to either plan or actual data. To do this, I wrote 2 queries and attempted to union them together to produce my final data set. That’s when I learned:
Tip #2: When SQL Server performs a UNION it must compare values to remove duplicate rows. CLR user-defined type columns like geography are not comparable. As long as there is no risk of duplicate data between the two sets, you can use UNION ALL.
The query below works just fine as long as you use UNION ALL.
SELECT [Latitude] ,[Longitude] ,[GeographyPoint] FROM [MySourceTable] WHERE [RouteID] = 1 UNION ALL SELECT [Latitude] ,[Longitude] ,[GeographyPoint] FROM [MySourceTable] WHERE [RouteID] = 5
As I finalized my table design I considered using a computed column to store my geography data. But I encountered an issue when I went to add a spatial index. Spatial indexes are built on top of B+ trees. They decompose space into 4 levels of grids. I think spatial indexes are interesting, but they have some restrictions of which you should be aware. They require the table to have a clustered primary key. They cannot be specified on indexed views. And…
Tip #3: You can create a computed column to store the geography point based upon the latitude and longitude. But you cannot create a spatial index on a computed column.
If you try to create a spatial index on a computed column you will get SQL Server error message 6342.
You don’t have to use spatial data types just because you have spatial data. Many data viz tools have built-in geocoding that will accept longitude and latitude or an address. But spatial data types can be useful when calculating distances between two points and planning and measuring routes.