Why querying CosmosDB from SQL Server? Cosmos DB enables you to store documents and other non-relational types of data and provides SQL/API that enables you to fetch, filter, and sort results. However, in some cases, you would need to run more complex queries that have GROUP BY, HAVING, analytical functions, or to join your non-relational data from CosmosDB with data that you are storing in SQL Server tables. In that case, you might want to leverage full power of TransactSQL to query data in Cosmos DB.
Cosmos DB setup
First you need to setup CosmosDB account and add some documents to your collections. I have created CosmosDB account called odbc, with database WWI and collection Orders, and added three documents:
I’m accessing this collection using SQL API.
Now you need to install ODBC Driver for CosmosDB on the computer where you have SQL Server installed. I’m using Microsoft Azure Cosmos DB ODBC 64-bit.msi for 64-bit Windows – 64-bit versions of Windows 8.1 or later, Windows 8, Windows 7, Windows Server 2012 R2, Windows Server 2012, and Windows Server 2008 R2.
Once you install this driver, you should setup ODBC source in system DSN and test the connection:
Once you setup everything, you can use classic OPENROWSET function to query CosmosDB data by specifying CosmosDB DSN in the connection:
As a results you will get three rows representing three documents in CosmosDB. Missing fields will be returned as NULL. You can also filter results based on some criterion like:
SELECT a.* FROM OPENROWSET('MSDASQL', 'DSN=cosmosdb1', 'select * from Orders where billto_Name = ''John Smith''') as a
One interesting thing that you need to be aware is that complex JSON objects such as shipTo or billTo from the first image are flattened and every field is returned in the format <object name>_<field name>. You should be aware of this if you expected to have sub-objects as-is. Also, in my case, array properties such as tags are not mapped/returned.
ODBC Driver for Cosmos DB is enables you to run Transact-SQL queries on CosmosDb data, which might be useful if you need rich data analytic on remote data stored in CosmosDB. In this case, you can send query to CosmosDB with predicate that will filter the results, select only the fields that you need and then do rich-analysis on SQL Server using full Transact-SQL language.
Note that this is possible only on SQL Server and not in Azure SQL Database because CosmosDb driver is not installed on Azure SQL and you cannot add your own drivers. If you are interested for this feature on Azure SQL you can send the idea on feedback for SQL Database or SQL Managed Instance.
Source: Microsoft Blog – SQL Server Storage Engine – Running Azure CosmosDB queries from SQL Server using ODBC driver