Azure Cosmos DB: Choosing the Right NoSQL Solution
Azure Cosmos DB is a globally distributed, multi-model database service that targets single-digit-millisecond reads and writes in every region you replicate to. Picking the right API, partition key, and throughput model is the difference between a system that scales effortlessly and one that throttles constantly.
API Selection Guide
flowchart TD
Q1{What data model\ndo you need?}
Q1 -- Documents / JSON --> Q2
Q1 -- Key-Value --> API_TABLE[Table API\nSimple KV, Migration from Azure Table]
Q1 -- Graphs --> API_GREMLIN[Gremlin API\nRelationship traversal]
Q1 -- Cassandra compat --> API_CASS[Cassandra API\nLift-and-shift from Cassandra]
Q1 -- MongoDB compat --> API_MONGO[MongoDB API\nMigration from MongoDB]
Q2{New workload or\nMongoDB migration?}
Q2 -- New workload --> API_NOSQL[NoSQL API ✅\nRecommended default\nSDK, LINQ, serverless]
Q2 -- MongoDB migration --> API_MONGO
style API_NOSQL fill:#d1fae5,stroke:#059669,color:#065f46
style API_MONGO fill:#dbeafe,stroke:#3b82f6,color:#1e3a8a
style API_TABLE fill:#fef3c7,stroke:#f59e0b,color:#78350f
style API_GREMLIN fill:#ede9fe,stroke:#8b5cf6,color:#4c1d95
style API_CASS fill:#fee2e2,stroke:#ef4444,color:#7f1d1d
| API | Wire Protocol | Best For | SDK Support |
|---|---|---|---|
| NoSQL (default) | Cosmos native | New apps, rich queries, LINQ | .NET, Java, Python, Node.js, Go |
| MongoDB | MongoDB 4.x | Migration from MongoDB | All MongoDB drivers |
| Cassandra | Cassandra CQL | Migration from Cassandra | Cassandra drivers |
| Gremlin | TinkerPop | Graph traversal, recommendations | TinkerPop clients |
| Table | OData | Simple KV migration from Azure Tables | Table SDK |
Start with the NoSQL API for new workloads. It offers the richest SDK experience, server-side JavaScript, change feed, and the best integration with Azure services.
Consistency Levels
flowchart LR
subgraph Levels["← Stronger consistency ——————————— Better performance →"]
direction LR
S[Strong]
BS[Bounded\nStaleness]
SE[Session\n★ Default]
CP[Consistent\nPrefix]
EV[Eventual]
end
S -.->|Higher latency\nHigher RU cost| EV
EV -.->|Lower latency\nLower RU cost| S
style SE fill:#d1fae5,stroke:#059669,color:#065f46
style S fill:#fee2e2,stroke:#ef4444,color:#7f1d1d
style EV fill:#dbeafe,stroke:#3b82f6,color:#1e3a8a
| Level | Guarantees | Read RU Cost | Use When |
|---|---|---|---|
| Strong | Linearisable reads | ~2× | Financial transactions requiring absolute accuracy |
| Bounded Staleness | Lag bounded by K ops or T time | ~2× | Global apps needing near-strong |
| Session ⭐ | Consistent within a session | 1× | Most OLTP apps; read your own writes |
| Consistent Prefix | No out-of-order reads | 1× | Order matters but lag is acceptable |
| Eventual | Replicas converge over time; no ordering guarantee | 1× | Analytics, counters, cache warm-up |
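To make the cost side of the trade-off concrete, here is a minimal sketch that scales a read charge measured at Session consistency to another level. It assumes that Strong and Bounded Staleness reads cost roughly twice as much as reads at the three weaker levels; that multiplier is an assumption of this sketch and should be checked against the current Cosmos DB documentation. The function name `estimate_read_ru` is hypothetical.

```shell
#!/usr/bin/env bash
# Estimate the read RU charge at a given consistency level from the charge
# measured at Session consistency.
# Assumption: Strong and Bounded Staleness reads cost ~2x, the rest ~1x.
estimate_read_ru() {
  local session_ru="$1" level="$2" factor
  case "$level" in
    Strong|BoundedStaleness) factor=2 ;;
    Session|ConsistentPrefix|Eventual) factor=1 ;;
    *) echo "unknown consistency level: $level" >&2; return 1 ;;
  esac
  echo $(( session_ru * factor ))
}

estimate_read_ru 10 Strong     # prints 20
estimate_read_ru 10 Session    # prints 10
```

Measure the real charge from the SDK response (for example `RequestCharge` in .NET) before relying on an estimate like this.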
Partition Key Design
flowchart TD
subgraph Good["✅ Good Partition Keys"]
G1[tenantId — SaaS apps]
G2[userId — User data]
G3[orderId — Orders]
G4[deviceId — IoT]
end
subgraph Bad["❌ Anti-Patterns"]
B1[status — low cardinality\nhot partition]
B2[createdDate — range\nhot partition today]
B3[type — only a few values]
end
subgraph Synthetic["🔧 Synthetic Keys\nfor tricky cases"]
S1["userId + categoryId\n→ 'u123_electronics'"]
S2["tenantId + month\n→ 't456_2025-04'"]
end
style Good fill:#d1fae5,stroke:#059669,color:#065f46
style Bad fill:#fee2e2,stroke:#ef4444,color:#7f1d1d
style Synthetic fill:#fef3c7,stroke:#f59e0b,color:#78350f
Rules for choosing a partition key:
- High cardinality: thousands or millions of unique values
- Even distribution: storage and request volume spread evenly across partitions
- Matches your access pattern: appears as a filter in most queries, so they stay within one partition
- Avoid hotspots: no single key value should dominate traffic or storage
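The rules above can be checked against a sample of real data before committing to a key. The sketch below builds a synthetic key in the `userId_categoryId` style from the diagram and reports how much of a sample the most frequent key value holds. Both function names are hypothetical, and the input format (one candidate key value per line) is an assumption.

```shell
#!/usr/bin/env bash
# Build a synthetic partition key by joining two fields with "_".
make_synthetic_key() {
  printf '%s_%s\n' "$1" "$2"
}

# Read one candidate key value per line on stdin and print the share
# (whole percent, rounded down) held by the most frequent value.
# A high number hints at a hot partition.
max_key_share() {
  sort | uniq -c | awk '{ total += $1; if ($1 > max) max = $1 }
                        END { if (total > 0) print int(100 * max / total) }'
}

make_synthetic_key u123 electronics                              # prints u123_electronics
make_synthetic_key t456 2025-04                                  # prints t456_2025-04
printf '%s\n' Pending Pending Pending Shipped | max_key_share    # prints 75
```

A `status` column where one value holds 75% of the items, as in the last call, is the low-cardinality anti-pattern from the diagram.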
Step 1: Create a Cosmos DB Account
RESOURCE_GROUP="rg-data-platform"
ACCOUNT_NAME="acct-biz-global"
az group create --name $RESOURCE_GROUP --location eastus
# Multi-region account with Session consistency
az cosmosdb create \
--name $ACCOUNT_NAME \
--resource-group $RESOURCE_GROUP \
--locations regionName=eastus failoverPriority=0 isZoneRedundant=false \
--locations regionName=westeurope failoverPriority=1 isZoneRedundant=false \
--default-consistency-level Session
# Create database and container
az cosmosdb sql database create \
--account-name $ACCOUNT_NAME \
--resource-group $RESOURCE_GROUP \
--name ordersdb
az cosmosdb sql container create \
--account-name $ACCOUNT_NAME \
--resource-group $RESOURCE_GROUP \
--database-name ordersdb \
--name orders \
--partition-key-path /partitionKey \
--throughput 400
Sample Document Model
{
"id": "ord-8847",
"partitionKey": "tenant-acme",
"customerId": "cust-123",
"status": "Pending",
"lineItems": [
{ "sku": "A100", "qty": 2, "price": 49.95 },
{ "sku": "B200", "qty": 1, "price": 49.60 }
],
"total": 149.50,
"ttl": 604800
}
Partition key: tenant-acme (the tenant ID). It gives predictable distribution and a natural access boundary, and tenant IDs are high-cardinality in a SaaS context.
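Two numbers in the sample document are derived values, and a quick check helps keep them honest: the expiry of 604800 is seven days expressed in seconds, and `total` is the sum of quantity times price over the line items. The helper names below are hypothetical, and the "qty price" input format is an assumption made for this sketch.

```shell
#!/usr/bin/env bash
# Convert a retention period in days to a TTL value in seconds.
ttl_seconds() {
  echo $(( $1 * 24 * 60 * 60 ))
}

# Sum qty * price over "qty price" lines on stdin, printed with two decimals.
order_total() {
  LC_ALL=C awk '{ sum += $1 * $2 } END { printf "%.2f\n", sum }'
}

ttl_seconds 7                                       # prints 604800
printf '%s\n' '2 49.95' '1 49.60' | order_total     # prints 149.50
```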
Step 2: Indexing Policy
By default Cosmos DB indexes every field. For large arrays or rarely queried sub-objects, explicitly exclude them to reduce write RU costs:
{
"indexingMode": "consistent",
"automatic": true,
"includedPaths": [{ "path": "/*" }],
"excludedPaths": [
{ "path": "/lineItems/*" },
{ "path": "/\"_etag\"/?" }
]
}
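Set the policy via the .NET SDK when the container is created (CreateContainerIfNotExistsAsync leaves the policy of an existing container untouched; use ReplaceContainerAsync to change it later):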
var props = new ContainerProperties("orders", "/partitionKey")
{
IndexingPolicy = new IndexingPolicy
{
Automatic = true,
IndexingMode = IndexingMode.Consistent,
IncludedPaths = { new IncludedPath { Path = "/*" } },
ExcludedPaths =
{
new ExcludedPath { Path = "/lineItems/*" },
new ExcludedPath { Path = "/\"_etag\"/?" }
}
}
};
await database.CreateContainerIfNotExistsAsync(props, throughput: 400);
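The same JSON policy can also be applied to an existing container from the CLI. The sketch below writes the policy to a local file and, by default, only prints the `az cosmosdb sql container update` command so it can be reviewed first. The file name `indexing-policy.json`, the `APPLY` switch, and the function name are assumptions of this sketch; the account and resource group defaults reuse the names from Step 1.

```shell
#!/usr/bin/env bash
# Write the indexing policy to a local file (hypothetical file name).
POLICY_FILE="${POLICY_FILE:-indexing-policy.json}"
cat > "$POLICY_FILE" <<'EOF'
{
  "indexingMode": "consistent",
  "automatic": true,
  "includedPaths": [{ "path": "/*" }],
  "excludedPaths": [
    { "path": "/lineItems/*" },
    { "path": "/\"_etag\"/?" }
  ]
}
EOF

# Print the az command by default; run it only when APPLY=1 is set.
apply_indexing_policy() {
  set -- az cosmosdb sql container update \
    --account-name "${ACCOUNT_NAME:-acct-biz-global}" \
    --resource-group "${RESOURCE_GROUP:-rg-data-platform}" \
    --database-name ordersdb \
    --name orders \
    --idx "@${POLICY_FILE}"
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

apply_indexing_policy
```

Running with `APPLY=1` requires a logged-in Azure CLI session. Index changes on a populated container are applied in the background, so query charges may take a while to settle.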
Step 3: CRUD and Bulk Operations (.NET)
var client = new CosmosClient(endpoint, key, new CosmosClientOptions
{
AllowBulkExecution = true // Enable for bulk inserts
});
var container = client.GetContainer("ordersdb", "orders");
// Create
await container.CreateItemAsync(order, new PartitionKey(order.partitionKey));
// Point read (about 1 RU for a 1 KB item, fastest path)
var response = await container.ReadItemAsync<Order>(
order.id, new PartitionKey(order.partitionKey));
// Query with projection (lower RU than SELECT *)
var query = new QueryDefinition(
"SELECT c.id, c.status, c.total FROM c " +
"WHERE c.partitionKey = @tenant AND c.status = @status")
.WithParameter("@tenant", tenantId)
.WithParameter("@status", "Pending");
var iterator = container.GetItemQueryIterator<OrderSummary>(
query,
requestOptions: new QueryRequestOptions
{
PartitionKey = new PartitionKey(tenantId)
});
while (iterator.HasMoreResults)
{
var page = await iterator.ReadNextAsync();
// page.RequestCharge = RUs consumed
}
// Bulk insert: with AllowBulkExecution = true the SDK groups these concurrent calls into batches
var tasks = ordersToInsert.Select(o =>
container.CreateItemAsync(o, new PartitionKey(o.partitionKey)));
await Task.WhenAll(tasks);
Throughput Models
flowchart LR
subgraph Manual["Manual / Provisioned"]
M_F[Fixed RU/s\ne.g. 400 RU always]
M_F --> M_USE[Predictable cost\nSteady traffic]
end
subgraph Auto["Autoscale"]
A_MIN[Min 10%\nof max RU/s]
A_MAX[Max RU/s\nyou define]
A_MIN <-->|Scales automatically| A_MAX
A_MAX --> A_USE[Spiky workloads\nNo manual tuning]
end
subgraph Serverless["Serverless"]
SL[Pay per\nrequest only]
SL --> SL_USE[Dev / test\nUnpredictable low volume]
end
style Manual fill:#dbeafe,stroke:#3b82f6,color:#1e3a8a
style Auto fill:#d1fae5,stroke:#059669,color:#065f46
style Serverless fill:#fef3c7,stroke:#f59e0b,color:#78350f
| Model | Cost Pattern | Scale Behaviour | Best For |
|---|---|---|---|
| Provisioned (Manual) | Fixed per hour | Fixed, no auto-scale | Steady, predictable workloads |
| Autoscale | Billed hourly for the highest RU/s reached, minimum 10% of max | Scales between 10% and 100% of max | Spiky / variable workloads |
| Serverless | Per-request billing | No provisioned capacity | Dev/test, low or sporadic volume |
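The autoscale range is easy to work out by hand, but a small helper makes the floor and ceiling explicit when comparing models. This is a sketch under a simplifying assumption: each hour is billed at the highest RU/s reached in that hour, clamped between 10% of the configured maximum and the maximum itself. The function names are hypothetical, and actual pricing per RU/s is deliberately left out.

```shell
#!/usr/bin/env bash
# Lowest RU/s an autoscale container scales down to: 10% of the configured max.
autoscale_min_rus() {
  echo $(( $1 / 10 ))
}

# RU/s billed for one hour, given the configured max and the peak demand in
# that hour. Assumption: billing follows the peak, clamped to [10% of max, max].
autoscale_billed_rus() {
  local max="$1" peak="$2" min
  min=$(( max / 10 ))
  if [ "$peak" -lt "$min" ]; then
    echo "$min"
  elif [ "$peak" -gt "$max" ]; then
    echo "$max"
  else
    echo "$peak"
  fi
}

autoscale_min_rus 4000            # prints 400
autoscale_billed_rus 4000 150     # prints 400  (a quiet hour still bills the floor)
autoscale_billed_rus 4000 2500    # prints 2500
autoscale_billed_rus 4000 9000    # prints 4000 (demand above max is throttled)
```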
Cost Optimisation Strategies
| Strategy | RU Impact | How To |
|---|---|---|
| Autoscale | Efficient burst handling | Set max RU, let Cosmos scale |
| Analytical Store | Removes RU overhead from analytics | Enable per container |
| TTL (Time-to-live) | Frees storage + index | Enable TTL on the container, then set ttl on items |
| Excluded paths | Lowers write RU | Skip indexing for large unused arrays |
| Caching (Redis) | Reduces read RU dramatically | Cache hot items by id + partitionKey |
| Bulk execution | Reduces per-call overhead | Enable AllowBulkExecution in SDK |
| Query projection | Lower response/RU | SELECT c.id, c.total not SELECT * |
Security
# Use Managed Identity — no keys in application code
az cosmosdb sql role assignment create \
--account-name $ACCOUNT_NAME \
--resource-group $RESOURCE_GROUP \
--role-definition-name "Cosmos DB Built-in Data Contributor" \
--principal-id <managed-identity-object-id> \
--scope "/"
# Private endpoint (private access path; public network access must be disabled on the account separately)
az network private-endpoint create \
--name pe-cosmos \
--resource-group $RESOURCE_GROUP \
--vnet-name vnet-app \
--subnet snet-data \
--private-connection-resource-id $(az cosmosdb show -n $ACCOUNT_NAME -g $RESOURCE_GROUP --query id -o tsv) \
--group-id Sql \
--connection-name cosmos-private-connection
Security checklist:
| Control | Recommendation |
|---|---|
| Authentication | Managed Identity via DefaultAzureCredential — no connection strings |
| Network | Private Endpoints + deny public access for production |
| Encryption | CMK (Customer Managed Keys) for regulated workloads |
| Least privilege | Assign Cosmos DB Built-in Data Reader to read-only identities |
| Auditing | Enable diagnostic logs → Log Analytics |
Monitoring and Alerts
# Alert when RU consumption spikes above 50,000 in a 5-min window ($SUB must hold the subscription ID)
az monitor metrics alert create \
--name cosmos-ru-burst \
--resource-group $RESOURCE_GROUP \
--scopes "/subscriptions/$SUB/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.DocumentDB/databaseAccounts/$ACCOUNT_NAME" \
--condition "total TotalRequestUnits > 50000" \
--window-size 5m \
--evaluation-frequency 1m \
--action "/subscriptions/$SUB/resourceGroups/$RESOURCE_GROUP/providers/microsoft.insights/actionGroups/ag-oncall"
Key metrics to monitor:
| Metric | What to Watch | Alert Threshold |
|---|---|---|
| TotalRequestUnits | Overall RU consumption | > 80% of provisioned |
| NormalizedRUConsumption | Per-partition hotspots | Any partition > 90% |
| ThrottledRequests | HTTP 429 rate | > 0 in production |
| ServerSideLatency | Data plane p99 | > 10 ms |
| AvailabilityPercentage | SLA compliance | < 99.99% |
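A threshold such as "80% of provisioned" has to be turned into an absolute RU count before it can go into an alert condition. The sketch below does that arithmetic, assuming the alert sums TotalRequestUnits over the window (a `total` aggregation): provisioned RU/s times window length in seconds times the target percentage. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Absolute RU threshold for an alert that sums TotalRequestUnits over a window.
# Arguments: provisioned RU/s, window length in seconds, utilisation percent.
ru_alert_threshold() {
  local provisioned_rus="$1" window_seconds="$2" percent="$3"
  echo $(( provisioned_rus * window_seconds * percent / 100 ))
}

ru_alert_threshold 400 300 80     # prints 96000
```

For the 400 RU/s container created earlier, 80% over a 5-minute window works out to 96,000 RUs.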
Troubleshooting
| Symptom | Root Cause | Fix |
|---|---|---|
| High RU per query | SELECT * or full partition scans | Add projections; exclude unused indexed paths |
| Hot partition (partition > 90% RU) | Low-cardinality partition key | Redesign key; add synthetic key |
| Frequent 429 throttling | RU under-provisioned | Increase RU or enable autoscale |
| Latency increase globally | Cross-region reads with Strong consistency | Downgrade to Session where safe |
| High storage cost | Orphaned / stale documents | Enable TTL; archive only what you must keep |
| SDK timeout | Network / large document | Enable retry policy; split large docs |
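The Cosmos DB SDKs already retry throttled requests on their own, so the retry advice in the table mostly concerns SDK configuration. For scripts that call the CLI during incidents or deployments, the same idea can be sketched as a generic exponential backoff wrapper. The function name and its argument order are hypothetical; nothing here is specific to Cosmos DB.

```shell
#!/usr/bin/env bash
# Retry a command with exponential backoff.
# Usage: retry_with_backoff MAX_TRIES BASE_DELAY_SECONDS command [args...]
retry_with_backoff() {
  local max_tries="$1" delay="$2" attempt=1
  shift 2
  while :; do
    "$@" && return 0                              # success: stop retrying
    [ "$attempt" -ge "$max_tries" ] && return 1   # out of attempts: give up
    sleep "$delay"
    delay=$(( delay * 2 ))                        # double the wait each round
    attempt=$(( attempt + 1 ))
  done
}

retry_with_backoff 3 0 true && echo "succeeded"   # prints succeeded
```

A call such as `retry_with_backoff 5 1 az cosmosdb show -n "$ACCOUNT_NAME" -g "$RESOURCE_GROUP"` would wait 1, 2, 4, and 8 seconds between its five attempts.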
Key Takeaways
- ✅ Start with the NoSQL API for new workloads — richest SDK and query support
- ✅ Session consistency is the right default for most OLTP applications
- ✅ Partition key choice is the most critical design decision — high cardinality, even distribution
- ✅ Autoscale handles burst traffic without manual intervention
- ✅ Use Managed Identity — zero connection strings in application code
- ✅ Excluded indexing paths and query projections are the fastest way to cut RU costs