# Azure DevOps Enterprise CI/CD Pipelines Deep Dive
```yaml
pool:
  vmImage: ubuntu-latest

steps:
- task: NodeTool@0
  inputs:
    versionSpec: '18.x'
- script: npm ci
  displayName: Install deps
- script: npm test -- --ci
  displayName: Run tests
```

Expected output:

```text
Test Files 3 passed (3)
Tests 26 passed (26)
Duration 1.23s
```
## Expanding to Multi-Stage
```yaml
name: $(Date:yyyyMMdd).$(Rev:r)

trigger:
  branches:
    include: [ main ]
  batch: true

pr:
  branches:
    include: [ main ]

variables:
  NodeVersion: '18.x'
  BuildConfiguration: 'Release'
  ArtifactName: 'webapp'
  EnableCodeCoverage: true

stages:
- stage: Build
  displayName: Build & Unit Test
  jobs:
  - job: build
    pool: { vmImage: ubuntu-latest }
    steps:
    - task: NodeTool@0
      inputs: { versionSpec: $(NodeVersion) }
    - script: npm ci
      displayName: Install
    - script: npm run lint
      displayName: Lint
    - script: npm test -- --coverage
      displayName: Unit Tests
    - publish: $(System.DefaultWorkingDirectory)
      artifact: $(ArtifactName)

- stage: Quality
  dependsOn: Build
  jobs:
  - job: security
    pool: { vmImage: ubuntu-latest }
    steps:
    # "|| true" keeps the job green; the report is collected but does not gate the build.
    - script: npm audit --json > audit.json || true
      displayName: Dependency Audit
    - task: Bash@3
      displayName: SAST (Example)
      inputs:
        targetType: inline
        script: |
          echo "Run static analysis tool here"
    - script: echo "Generate SBOM" && echo "sbom" > sbom.txt
      displayName: SBOM Generation
    - publish: sbom.txt
      artifact: sbom

- stage: Package
  dependsOn: Quality
  jobs:
  - job: publish
    steps:
    - download: current
      artifact: $(ArtifactName)
    - script: echo "Signing artifact"
      displayName: Sign Artifact
    - task: UniversalPackages@0
      inputs:
        command: publish
        publishDirectory: $(Pipeline.Workspace)/$(ArtifactName)
        feedsToUsePublish: internal
        vstsFeedPublish: my-feed-id
        vstsFeedPackagePublish: $(ArtifactName)
        versionOption: custom
        # Universal Packages expect a SemVer version (for example 1.0.0);
        # the two-part run name defined above may need adapting.
        versionPublish: $(Build.BuildNumber)

- stage: Deploy_Dev
  displayName: Deploy Dev Environment
  dependsOn: Package
  jobs:
  - deployment: devDeploy
    environment: dev
    strategy:
      runOnce:
        deploy:
          steps:
          # Deployment jobs do not check out sources by default; the Bicep file needs them.
          - checkout: self
          - task: AzureCLI@2
            inputs:
              azureSubscription: 'my-azure-connection'  # placeholder service connection (required input)
              scriptType: bash
              scriptLocation: inlineScript
              inlineScript: |
                echo "Deploy to dev with Bicep"
                az deployment group create -g rg-dev -f infra/main.bicep
          - script: echo "App deployment"

- stage: Deploy_QA
  displayName: Deploy QA Environment
  dependsOn: Deploy_Dev
  jobs:
  - deployment: qaDeploy
    environment: qa
    strategy:
      runOnce:
        deploy:
          steps:
          - checkout: self
          - task: AzureCLI@2
            inputs:
              azureSubscription: 'my-azure-connection'  # placeholder service connection (required input)
              scriptType: bash
              scriptLocation: inlineScript
              inlineScript: |
                echo "Deploy to QA"
                az deployment group create -g rg-qa -f infra/main.bicep

- stage: Deploy_Prod
  displayName: Deploy Production (Blue/Green)
  dependsOn: Deploy_QA
  jobs:
  - deployment: prodBlue
    environment: prod
    strategy:
      runOnce:
        deploy:
          steps:
          - script: echo "Deploy blue slot"
          - task: AzureCLI@2
            inputs:
              azureSubscription: 'my-azure-connection'  # placeholder service connection (required input)
              scriptType: bash
              scriptLocation: inlineScript
              inlineScript: |
                echo "Swap after health checks"
                # az webapp deployment slot swap --name myapp --slot blue --target-slot production
          - script: echo "Run smoke tests"
          - script: echo "Swap to production if healthy"
```
## Environment Approvals & Checks
| Feature | Purpose | Example |
|---------|---------|---------|
| Manual Approval | Human gate before prod | Release manager signs off |
| Business Hours Check | Restrict deployment windows | Block outside 08:00–18:00 |
| Quality Gate (Tests % / Coverage) | Enforce minimum reliability | Coverage ≥ 80% |
| Security Scan Threshold | Block critical vulnerabilities | No Critical severity allowed |
| Work Item Linking | Traceability | Build must reference user story |
| Required Templates | Consistency | Standard header & scanning steps |
Configure approvals and checks on each environment in Azure DevOps (Pipelines → Environments → select an environment → Approvals and checks) so that every pipeline targeting that environment is gated centrally.
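
Approvals and checks live on the environment rather than in YAML; a pipeline opts in by targeting that environment from a deployment job. The fragment below is a minimal sketch, assuming the `prod` environment used in the pipeline above already has an approval check configured. The branch condition and the `prodDeploy` job name are illustrative additions, not part of the original pipeline.

```yaml
stages:
- stage: Deploy_Prod
  dependsOn: Deploy_QA
  # Assumed extra guard: only deploy runs that were built from main.
  condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
  jobs:
  - deployment: prodDeploy          # hypothetical job name
    # Checks configured on this environment are evaluated before the stage starts.
    environment: prod
    strategy:
      runOnce:
        deploy:
          steps:
          - script: echo "Runs only after approvals and checks pass"
```

Keeping the human gate on the environment means a pipeline author cannot remove it by editing YAML, while the `condition` covers rules that are simple enough to express in the pipeline itself.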
## Secrets & Identity Strategy
| Aspect | Recommendation | Rationale |
|--------|---------------|----------|
| Authentication to Azure | OIDC federation (no PAT/secret) | Eliminates credential sprawl |
| Runtime Secrets | Key Vault references (managed identity) | Rotation + RBAC control |
| Pipeline Variables | Variable groups (locked + audit) | Central governance |
| Service Connections | Least privilege scoped managed identity | Reduce blast radius |
| Encryption in Transit | TLS everywhere | Compliance baseline |
| Encryption at Rest | Microsoft-managed keys (optionally customer-managed keys, CMK) | Control + compliance |
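
As an illustration of the "Runtime Secrets" row, the sketch below pulls a secret from Key Vault at job time instead of storing it in the pipeline. The service connection name, vault name, and secret name are all hypothetical placeholders; the identity behind the service connection is assumed to have get/list permission on the vault's secrets.

```yaml
steps:
- task: AzureKeyVault@2
  displayName: Fetch secrets from Key Vault
  inputs:
    azureSubscription: 'my-azure-connection'  # hypothetical service connection
    KeyVaultName: 'kv-webapp-dev'             # hypothetical vault name
    SecretsFilter: 'DbPassword'               # hypothetical secret name; '*' would fetch all
    RunAsPreJob: false

- script: echo "The secret is available to this step as DB_PASSWORD"
  displayName: Use secret via environment variable
  env:
    # Secret variables are not exposed to scripts automatically; map them explicitly.
    DB_PASSWORD: $(DbPassword)
```

Filtering to named secrets rather than `'*'` keeps the job's exposure aligned with the least-privilege recommendation in the table.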
## Supply Chain Security
| Control | Description | Tooling |
|---------|-------------|---------|
| SBOM | Inventory of dependencies | `cyclonedx`, `syft` |
| Digital Signing | Sign artifacts/packages | Azure Sign or cosign |
| Provenance Metadata | Build identity & commit | Pipeline variables + attestation |
| Vulnerability Scans | Dependency & container | `npm audit`, Trivy |
| License Compliance | Approved license list | Scan + policy file |
| Tamper Detection | Hash verification pre-deploy | Compare hash vs manifest |
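
Two of these controls can be sketched directly as pipeline steps: a vulnerability gate and a hash check before deployment. This is a minimal sketch, not a complete supply-chain setup. It assumes a Linux agent (for `tar` and `sha256sum`), a hypothetical `dist/` build output folder, and a hypothetical `webapp.tgz` package name.

```yaml
steps:
# Fails the job when npm reports vulnerabilities at Critical severity.
- script: npm audit --audit-level=critical
  displayName: Dependency audit gate

# Build side: package the app and record its SHA-256 hash in a manifest file.
- script: |
    tar -czf webapp.tgz dist/
    sha256sum webapp.tgz > webapp.tgz.sha256
  displayName: Package and hash
  workingDirectory: $(Build.SourcesDirectory)

# Deploy side (after downloading both files): stop if the hash no longer matches.
- script: sha256sum --check webapp.tgz.sha256
  displayName: Verify artifact hash
  workingDirectory: $(Pipeline.Workspace)/webapp
```

A hash stored next to the artifact mainly catches corruption or accidental replacement; pairing it with a signed manifest is what makes deliberate tampering detectable.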
## Deployment Patterns
| Pattern | Flow | Benefits | Risks |
|---------|------|---------|------|
| Blue/Green | Parallel slots, traffic switch | Fast rollback | Higher infra cost |
| Canary | Gradual % traffic shift | Early failure detection | Complex routing |
| Rolling | Batch replace instances | Reduced downtime | Possible partial inconsistency |
| Ring (Phased) | Internal → pilot → full | Controlled exposure | Longer lead time |
| Shadow | Duplicate traffic, observe | Zero risk to users | Expensive, complex |
Choose a pattern based on risk appetite, compliance requirements, and recovery objectives.
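
For comparison with the `runOnce` jobs used earlier, here is a sketch of how the canary pattern maps onto a deployment job's lifecycle hooks. The job name, the increments, and the echo placeholders are assumptions; Microsoft's documented canary examples target Kubernetes resources, so treat this as a shape to adapt rather than a drop-in configuration.

```yaml
jobs:
- deployment: canaryDeploy            # hypothetical job name
  environment: prod
  strategy:
    canary:
      increments: [10, 25]            # assumed percentages, rolled out in turn
      deploy:
        steps:
        - script: echo "Deploy canary at $(strategy.increment)%"
      routeTraffic:
        steps:
        - script: echo "Shift $(strategy.increment)% of traffic to the new version"
      postRouteTraffic:
        steps:
        - script: echo "Compare error rate and latency against the stable version"
      on:
        failure:
          steps:
          - script: echo "Route traffic back to the stable version"
        success:
          steps:
          - script: echo "Promote the new version to all traffic"
```

The `on.failure` hook is where the "early failure detection" benefit pays off: a failed health comparison stops progression before the next increment.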
## Observability Integration
_Pipeline deploys the app, which emits logs/metrics/traces to Application Insights, powering dashboards and alerts._
Telemetry steps:
1. Emit structured logs (correlation IDs attached)
2. Trace deployment events (custom event with build number)
3. Capture performance metrics (CPU, latency, error rate)
4. Alert on SLO breaches (error % or P95 latency)
5. Link work items to incidents (bi-directional traceability)
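Kusto queries (Application Insights):

```kusto
// Exceptions in the last hour, grouped by type
exceptions
| where timestamp > ago(1h)
| summarize count() by type

// 95th-percentile request duration in 5-minute bins
requests
| summarize percentile(duration, 95) by bin(timestamp, 5m)
```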

## Cost & Performance Optimization
| Lever | Action | Impact |
|---|---|---|
| Parallel Jobs | Only where independent | Reduce overall duration |
| Caching | Cache npm/dependency artifacts | Faster rebuilds |
| Incremental Tests | Run impacted tests only | Shorter feedback cycle |
| Ephemeral Agents | Use cloud-hosted scale set | Eliminate idle VM cost |
| Artifact Retention | Short TTL for non-release builds | Lower storage cost |
| Consolidated Scans | Merge SAST/DAST in single job | Fewer agent minutes |
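
The caching lever is usually the quickest to apply. A minimal sketch for npm, assuming a committed `package-lock.json` at the repository root:

```yaml
variables:
  # Point npm's cache at a folder inside the pipeline workspace so it can be saved and restored.
  npm_config_cache: $(Pipeline.Workspace)/.npm

steps:
- task: Cache@2
  displayName: Cache npm
  inputs:
    # The cache is reused until the OS or the lockfile content changes.
    key: 'npm | "$(Agent.OS)" | package-lock.json'
    restoreKeys: |
      npm | "$(Agent.OS)"
    path: $(npm_config_cache)

- script: npm ci
  displayName: Install
```

This caches npm's download cache rather than `node_modules`, because `npm ci` deletes `node_modules` before installing.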

## Rollback & DR
| Scenario | Mechanism | Steps |
|---|---|---|
| Failed Blue/Green | Slot swap back | Previous slot remains intact |
| Canary failure | Halt progression + revert config | Roll traffic to stable % |
| Data migration issue | Versioned scripts + backups | Restore DB snapshot |
| Regional outage | Multi-region deployment + traffic manager | Redirect to secondary region |
| Pipeline mistake | Re-run last good build by tag | Immutable artifact restore |
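
The blue/green rollback in the first row can be wired into the deployment job itself through the `on.failure` hook. In this sketch, `myapp` and the `blue` slot come from the pipeline shown earlier, while the service connection and the `rg-prod` resource group are hypothetical. It assumes the failure happens after the swap, for example in smoke tests; if the job fails before swapping, running this step would swap the wrong way.

```yaml
jobs:
- deployment: prodBlue
  environment: prod
  strategy:
    runOnce:
      deploy:
        steps:
        - script: echo "Swap blue into production, then run smoke tests"
      on:
        failure:
          steps:
          - task: AzureCLI@2
            displayName: Swap back to previous slot
            inputs:
              azureSubscription: 'my-azure-connection'  # hypothetical service connection
              scriptType: bash
              scriptLocation: inlineScript
              inlineScript: |
                # Swapping the same two slots again restores the previous production build.
                az webapp deployment slot swap \
                  --resource-group rg-prod \
                  --name myapp \
                  --slot blue \
                  --target-slot production
```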

## Advanced Multi-Stage with Approvals & Checks (Excerpt)
Figure: Power Automate integration – approval flow with email notifications.
```yaml
stages:
- stage: Deploy_Prod
  dependsOn: Deploy_QA
  # Manual approval is configured as a check on the environments below
  # (Approvals and checks); it is not a YAML stage property.
  jobs:
  - deployment: prodRing1
    environment: prod-ring1
    strategy:
      runOnce:
        deploy:
          steps:
          - script: echo "Deploy ring1"
  - deployment: prodRing2
    dependsOn: prodRing1   # jobs in a stage run in parallel unless ordered explicitly
    environment: prod-ring2
    strategy:
      runOnce:
        deploy:
          steps:
          - script: echo "Deploy ring2"
  - deployment: prodFinalize
    dependsOn: prodRing2
    environment: prod
    strategy:
      runOnce:
        deploy:
          steps:
          - script: echo "Finalize deployment"
```
## Troubleshooting Matrix
| Symptom | Likely Cause | Diagnosis | Resolution |
|---------|--------------|----------|-----------|
| Slow pipeline | Redundant sequential jobs | Timeline view | Parallelize independent steps |
| Failing approval | Incorrect approvers list | Environment settings | Update approvals config |
| Secrets not available | Missing Key Vault permission | Pipeline logs / Key Vault RBAC | Grant get/list to identity |
| Artifact mismatch | Not downloading correct version | Job logs | Pin version via build number |
| High error rate post-deploy | Misconfigured connection strings | App Insights logs | Rollback + fix config |
| SBOM empty | Tool misconfigured path | Task logs | Adjust working directory |
| Canary fails | Feature toggle logic | Metrics comparison | Revert toggle + investigate |
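
For the "Artifact mismatch" row, one way to pin a version is a pipeline resource with an explicit run number. The alias, the source pipeline name, and the version below are all hypothetical; this sketch assumes the build and the deployment live in separate pipelines.

```yaml
resources:
  pipelines:
  - pipeline: webappBuild       # hypothetical alias used by the download step
    source: webapp-ci           # hypothetical name of the pipeline that produced the artifact
    version: '20240101.1'       # hypothetical run number to pin

steps:
# Downloads the artifact from the pinned run instead of the latest one.
- download: webappBuild
  artifact: webapp
```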
## References
- [Azure DevOps YAML Pipelines](https://learn.microsoft.com/azure/devops/pipelines/yaml-schema)
- [Pipeline Environments & Approvals](https://learn.microsoft.com/azure/devops/pipelines/process/environments)
- [Service Connections Guidance](https://learn.microsoft.com/azure/devops/pipelines/library/service-endpoints)
- [Key Vault Integration](https://learn.microsoft.com/azure/devops/pipelines/release/key-vault-integration)
- [Security Scanning](https://learn.microsoft.com/azure/devops/pipelines/security/overview)
- [App Service Deployment Slots](https://learn.microsoft.com/azure/app-service/deploy-staging-slots)
- [Application Insights](https://learn.microsoft.com/azure/azure-monitor/app/app-insights-overview)
## Architecture Decision and Tradeoffs
When designing cloud infrastructure solutions with Azure, consider these key architectural trade-offs:
| Approach | Best For | Tradeoff |
|----------|----------|----------|
| Managed / platform service | Rapid delivery, reduced ops burden | Less customisation, potential vendor lock-in |
| Custom / self-hosted | Full control, advanced tuning | Higher operational overhead and cost |
> **Recommendation:** Start with the managed approach for most workloads and move to custom only when specific requirements demand it.
## Validation and Versioning
- Last validated: April 2026
- Validate examples against your tenant, region, and SKU constraints before production rollout.
- Keep module, CLI, and SDK versions pinned in automation pipelines and review quarterly.
## Security and Governance Considerations
- Apply least-privilege access using RBAC roles and just-in-time elevation for admin tasks.
- Store secrets in managed secret stores and avoid embedding credentials in scripts or source files.
- Enable audit logging, data protection policies, and periodic access reviews for regulated workloads.
## Cost and Performance Notes
- Define budgets and alerts, then monitor usage and cost trends continuously after go-live.
- Baseline performance with synthetic and real-user checks before and after major changes.
- Scale resources with measured thresholds and revisit sizing after usage pattern changes.
## Official Microsoft References
- https://learn.microsoft.com/azure/
- https://learn.microsoft.com/azure/architecture/
- https://learn.microsoft.com/azure/well-architected/
## Public Examples from Official Sources
- These examples are sourced from official public Microsoft documentation and sample repositories.
- Documentation examples: https://learn.microsoft.com/azure/architecture/
- Sample repositories: https://github.com/Azure-Samples
- Prefer adapting these examples to your tenant, subscriptions, and governance requirements before production use.
## Key Takeaways
- Choose hosting and scale appropriately
- Secure with Key Vault + Managed Identity
- Monitor with Application Insights