What should an Azure cost optimization checklist include?

A practical Azure checklist should cover VM and AKS rightsizing, Reserved Instance and Savings Plan coverage on the stable baseline, blob storage lifecycle and tier management, networking and egress controls (especially NAT and cross-region traffic), and governance — required cost tags, budgets, anomaly alerts, and policy guardrails.

How much can a focused Azure cost pass realistically save?

Mid-market Azure tenants that have not run a structured optimization pass in 12+ months typically cut 20–35% within 60 days without any architectural rewrite. The biggest movers are reservation coverage on stable baseline compute, downsizing the long tail of over-provisioned VMs and SQL databases, and Log Analytics retention tuning.

Should we use Reservations or Savings Plans on Azure?

Use Reservations when your VM family is firmly fixed and you need the deepest discount on a specific SKU. Use the Azure Compute Savings Plan when your workload mix shifts (different VM sizes or families over time) — you trade a few percent of discount for flexibility. Most modern Azure environments end up with a small RI baseline plus a Savings Plan covering the variable tail.

How often should Azure cost reviews run?

Run weekly operational reviews focused on anomalies and budget tracking, plus a monthly strategic review covering reservation coverage, rightsizing recommendations, and unit economics (cost per customer or per workload). Quarterly, revisit architecture-level decisions like region placement and storage tiering.

Azure Cost Optimization Checklist (2026)

A 37-item checklist for FinOps and platform teams running mid-market Azure tenants.


Most mid-market Azure environments at $10k–$80k/mo of cloud spend can take 20–35% out of the bill in a single 60-day push, without rewriting a single application. The waste is spread across over-provisioned VMs, under-reserved baseline compute, untiered blob storage, NAT and cross-region egress, and Log Analytics retention defaults that nobody set on purpose.

Where Azure spend actually goes

Before you optimize anything, know where the bill comes from. Typical mid-market Azure tenants we audit break down roughly like this — your mix will vary by workload, but the ranking rarely does:

Cost category	Share of bill	Optimization difficulty
VMs, VMSS, AKS node pools	45–60%	Medium — rightsize + reserve
Managed databases (SQL DB, PostgreSQL, Cosmos DB)	10–20%	Medium — tier + serverless
Blob and managed-disk storage	8–15%	Easy — lifecycle + tier
Networking (NAT, peering, egress, App Gateway)	6–14%	Hard — architectural
Log Analytics, App Insights, Sentinel	4–10%	Easy — retention + filtering
App Service, Functions, Container Apps	3–8%	Easy — plan tier + scale

The pattern: compute and managed databases are the bulk of the bill but they're medium-difficulty to optimize — they need rightsizing data and reservation analysis. Storage, observability, and App Service are smaller but easy wins. Most teams should attack the easy categories first to fund the work, then tackle compute.

Case study: $52k/mo Azure tenant, B2B SaaS company

Anonymized engagement — a Series-B SaaS company running three products on Azure across two regions. AKS for application services, Azure SQL for the primary OLTP database, Cosmos DB for one product's session store, blob storage for customer file uploads, and a Sentinel + Log Analytics setup their security team had configured 18 months prior and never revisited.

Line item	Before	After	Saved
AKS node pools (D8s v5, mostly idle nights/weekends)	$18,400/mo	$10,200/mo	$8,200
Azure SQL DB (Business Critical, low vCore utilization)	$9,600/mo	$5,400/mo	$4,200
Cosmos DB (provisioned RU, peak-sized)	$5,200/mo	$2,800/mo	$2,400
Blob storage (no lifecycle, all Hot tier)	$4,100/mo	$1,650/mo	$2,450
NAT Gateway + cross-region egress	$3,800/mo	$2,100/mo	$1,700
Log Analytics + Sentinel (verbose ingestion)	$4,200/mo	$1,400/mo	$2,800
App Service, misc	$6,700/mo	$5,200/mo	$1,500
Monthly total	$52,000	$28,750	$23,250

45% reduction over 60 days. The biggest movers: switching AKS to scheduled-scaling with mixed reservation + Savings Plan coverage, dropping Azure SQL from Business Critical to General Purpose with the right Storage Performance tier, switching Cosmos DB to autoscale RU/s, and pulling Sentinel data ingestion under control with table-level retention. No customer-facing changes.
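The totals and the headline percentage can be re-derived from the table. The snippet below is a minimal sketch that copies the before/after figures from the case study; the dictionary layout and the `summarize` helper are illustrative names, not part of any Azure tooling.

```python
# Before/after monthly figures (USD) copied from the case-study table.
line_items = {
    "AKS node pools": (18_400, 10_200),
    "Azure SQL DB": (9_600, 5_400),
    "Cosmos DB": (5_200, 2_800),
    "Blob storage": (4_100, 1_650),
    "NAT Gateway + egress": (3_800, 2_100),
    "Log Analytics + Sentinel": (4_200, 1_400),
    "App Service, misc": (6_700, 5_200),
}


def summarize(items):
    """Return (before_total, after_total, saved, percent_saved)."""
    before = sum(b for b, _ in items.values())
    after = sum(a for _, a in items.values())
    saved = before - after
    return before, after, saved, 100 * saved / before


before, after, saved, pct = summarize(line_items)
print(f"${before:,} -> ${after:,} (saved ${saved:,}, {pct:.1f}%)")
# prints: $52,000 -> $28,750 (saved $23,250, 44.7%)
```

The 44.7% result is what the text rounds to 45%.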

1) VM and AKS rightsizing
Typical savings: 20–35% of compute

Open Azure Advisor → Cost and review every "Right-size or shutdown underutilized virtual machines" recommendation. Filter by impact and start with the largest.
Pull 14-day p95 CPU and memory per VM via Azure Monitor. Any VM where p95 CPU stays under 30% is a downsize candidate.
Benchmark newer VM generations (Dasv6, Easv6, Dpsv6 ARM) before re-buying. Per-dollar performance for general-purpose workloads typically improves 15–25% generation-to-generation.
Audit AKS node pools: confirm cluster autoscaler is enabled with sensible min/max bounds, and that user node pools downscale on idle (system pools usually shouldn't).
Use Spot node pools in AKS for stateless, interruption-tolerant workloads (CI runners, batch jobs, background processors). Spot pricing typically runs 60–90% off pay-as-you-go, varies by region and SKU, and capacity can be evicted at short notice.
Schedule auto-shutdown on every non-production VM via Azure DevTest Labs or a tag-based automation runbook. Static dev/staging VMs running 24/7 are the single most common waste pattern.
Audit VMSS instances: confirm scale-in policies match scale-out triggers — many VMSS deployments scale up readily but never scale down because the metric and threshold weren't paired.
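The p95 rule from the checklist above can be sketched as a small filter over exported utilization samples. This assumes you have already pulled 14 days of CPU percentages per VM from Azure Monitor into plain lists; the VM names, sample values, and function names are hypothetical.

```python
from statistics import quantiles


def p95(samples):
    """95th percentile of a list of utilization samples (percent)."""
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return quantiles(samples, n=100, method="inclusive")[94]


def downsize_candidates(cpu_by_vm, threshold=30.0):
    """Names of VMs whose p95 CPU stays under the threshold, sorted."""
    return sorted(vm for vm, s in cpu_by_vm.items() if p95(s) < threshold)


# Hypothetical samples standing in for a 14-day Azure Monitor export.
cpu_by_vm = {
    "vm-idle": [5, 10, 12, 8, 15, 20] * 10,
    "vm-busy": [40, 55, 70, 65, 80, 90] * 10,
}
print(downsize_candidates(cpu_by_vm))
```

The 30% threshold is the checklist's own cutoff. Memory p95 should be checked the same way before resizing, since a CPU-idle VM can still be memory-bound.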

2) Managed databases and PaaS data tier
Typical savings: 25–45% of database spend

Review Azure SQL Database DTU/vCore utilization over 14 days. Business Critical tier is often overspecified — General Purpose with the right Storage Performance setting matches most OLTP workloads.
Evaluate Azure SQL Serverless for dev/test or low-utilization databases. Auto-pause cuts compute cost to zero during idle periods; storage is still billed.
For Cosmos DB, switch from manual provisioned throughput to autoscale RU/s on workloads with variable throughput. Autoscale bills each hour for the highest RU/s the workload scaled to, with a floor of 10% of the configured maximum and a higher per-RU/s rate than manual throughput, so it pays off when average load sits well below peak.
Audit Cosmos DB indexing policy: the default policy indexes every property. Excluding paths you never query can reduce write RU consumption by 20–60%.
Right-size Azure Database for PostgreSQL/MySQL by moving from Memory Optimized to General Purpose where the workload allows. Reserved capacity adds another 30–55% off the baseline.
Move old backups and snapshots off premium-priced backup storage to LRS where compliance permits.
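Whether Cosmos DB autoscale beats manual provisioned throughput depends on how spiky the load is. The sketch below compares the two on an hourly RU/s trace. The 1.5x autoscale rate multiplier and the 10% floor reflect the commonly published single-write-region billing model, and the `0.008` rate per 100 RU/s-hour is a placeholder; treat all three as assumptions and check current pricing for your region.

```python
def manual_cost(hourly_ru, rate_per_100ru_hour):
    """Manual throughput: pay for peak RU/s in every hour."""
    peak = max(hourly_ru)
    return peak / 100 * rate_per_100ru_hour * len(hourly_ru)


def autoscale_cost(hourly_ru, rate_per_100ru_hour, max_ru=None,
                   multiplier=1.5, floor=0.10):
    """Autoscale: pay per hour for the highest RU/s reached, with a floor."""
    max_ru = max_ru or max(hourly_ru)
    billed = (max(ru, floor * max_ru) for ru in hourly_ru)
    return sum(b / 100 * rate_per_100ru_hour * multiplier for b in billed)


# Hypothetical day: 4 peak hours at 10,000 RU/s, 20 quiet hours at 1,000 RU/s.
spiky = [10_000] * 4 + [1_000] * 20
flat = [5_000] * 24
RATE = 0.008  # placeholder USD per 100 RU/s per hour

print(round(manual_cost(spiky, RATE), 2), round(autoscale_cost(spiky, RATE), 2))
print(round(manual_cost(flat, RATE), 2), round(autoscale_cost(flat, RATE), 2))
```

On the spiky trace autoscale comes out well ahead; on the flat trace it costs more, which is why the checklist limits the recommendation to variable workloads.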

3) Reservations and Azure Savings Plans
Typical savings: 30–55% on baseline compute

Identify your true stable baseline: the compute that runs 24/7 even after all the rightsizing in section 1. Reserve only this baseline.
Default to Azure Compute Savings Plan unless your VM SKU is firmly locked (e.g., a specific GPU family). The flexibility usually outweighs the 3–5% discount difference vs RIs.
Default to 1-year terms. Choose 3-year only when you have more than 18 months of stable workload data to justify it.
Layer reservations across services: SQL DB RIs, App Service RIs, Cosmos DB RIs, Synapse RIs are separate from VM reservations and often forgotten.
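Set a monthly reservation utilization review: if any reservation drops below 95% utilization for two consecutive months, that's a signal to exchange it or request a refund, within Azure's reservation exchange and refund limits. Azure has no resale marketplace for reservations.
Verify Azure Hybrid Benefit is applied wherever eligible: Windows Server VMs, SQL Server VMs, and AHB for AKS Windows containers. If you already own the licenses with Software Assurance or a qualifying subscription, this is a 30–55% discount at no extra cost, and it's silently missed in 1 of 3 tenants we audit.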
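The monthly utilization review above reduces to a simple rule over exported utilization history. This sketch assumes you have monthly utilization percentages per reservation (oldest first) from Cost Management; the reservation names and the `flag_underused` helper are hypothetical, and the check looks only at the most recent two months.

```python
def flag_underused(utilization_by_reservation, threshold=95.0, months=2):
    """Reservations whose last `months` readings are all below threshold."""
    flagged = []
    for name, series in utilization_by_reservation.items():
        recent = series[-months:]
        # Require a full window so brand-new reservations are not flagged.
        if len(recent) == months and all(u < threshold for u in recent):
            flagged.append(name)
    return sorted(flagged)


# Hypothetical monthly utilization (%), oldest first.
history = {
    "vm-d8s-ri": [99, 97, 96],
    "sql-ri": [98, 93, 90],
    "cosmos-ri": [90, 99, 94],
}
print(flag_underused(history))  # prints: ['sql-ri']
```

Only `sql-ri` is flagged: `cosmos-ri` dipped in its latest month but not for two months running.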

4) Storage lifecycle and disk hygiene
Typical savings: 40–60% of storage spend

Enable Blob lifecycle management on every container with predictable access patterns: Hot → Cool after 30 days, Cool → Archive after 90 days, delete after retention period.
For containers with unpredictable access, evaluate access-based tiering: lifecycle rules keyed on last access time, with automatic move back to Hot when a blob is read. Cheaper than always-Hot for any container where <30% of objects are accessed monthly.
Delete orphaned managed disks: detached disks left after VM termination. Use az disk list --query '[?managedBy==`null`]' to find them (the JMESPath null literal needs backticks, hence the single quotes).
Audit premium-tier managed disks: Premium SSD v1 is rarely needed. Many workloads run fine on Premium SSD v2 (cheaper, more configurable) or Standard SSD.
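Clean up old snapshots: set a retention policy. On Azure, each full snapshot is billed on the disk's used size, while incremental snapshots bill only the changes since the previous one, so prefer incremental snapshots where possible.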
Review Storage Account redundancy: GRS and RA-GRS are 2x the cost of LRS. Use them only where business continuity actually requires geo-redundancy.
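The Hot → Cool → Archive → delete rule from the first item in this section maps onto a Blob lifecycle management policy document. The sketch below builds one as JSON, assuming the tiering thresholds count days since last modification; the 365-day delete window, the rule name, and the `uploads/` prefix are placeholders for your own retention policy.

```python
import json


def build_lifecycle_policy(cool_after=30, archive_after=90,
                           delete_after=365, prefix=None):
    """Build a lifecycle policy dict: Cool, then Archive, then delete."""
    filters = {"blobTypes": ["blockBlob"]}
    if prefix:
        filters["prefixMatch"] = [prefix]
    base_blob = {
        "tierToCool": {"daysAfterModificationGreaterThan": cool_after},
        "tierToArchive": {"daysAfterModificationGreaterThan": archive_after},
        "delete": {"daysAfterModificationGreaterThan": delete_after},
    }
    rule = {
        "enabled": True,
        "name": "tier-then-expire",  # hypothetical rule name
        "type": "Lifecycle",
        "definition": {"actions": {"baseBlob": base_blob}, "filters": filters},
    }
    return {"rules": [rule]}


policy = build_lifecycle_policy(prefix="uploads/")
print(json.dumps(policy, indent=2))
```

Saved to a file, a document like this can typically be applied with az storage account management-policy create --policy @policy.json. Check archive rehydration costs and early-deletion charges before enabling it on data that may be read again.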

5) Networking, NAT, and egress
Typical savings: 15–35% of networking spend

Audit the top 10 egress contributors by workload and region. NAT Gateway data-processing fees alone list at about $0.045/GB, charged on top of any bandwidth cost; at scale this can rival or exceed the raw transfer charges.
For container workloads pulling images, use Azure Container Registry geo-replicated to the workload region rather than pulling cross-region through NAT.
Place Private Endpoints on chatty PaaS services (Storage, SQL, Cosmos) instead of routing service traffic through NAT or public peering.
Consolidate cross-region traffic: review whether dev/test environments genuinely need cross-region peering, or if same-region would work.
Evaluate Azure Front Door / CDN for static and global traffic. CDN egress is meaningfully cheaper than origin egress at any non-trivial scale.
Right-size Application Gateway and Load Balancer SKUs: the Standard tier (Standard_v2 on Application Gateway) is sufficient for most workloads; the WAF tier (WAF_v2) costs ~2x and is only justified when you actually use WAF rules.
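To size the NAT data-processing line item from the first item in this section, a rough ranking of contributors is usually enough. This sketch assumes you have monthly GB through NAT per workload (for example from flow logs or cost exports) and uses $0.045/GB as the processing rate; the workload names and volumes are made up for illustration.

```python
def top_egress_costs(gb_by_workload, rate_per_gb=0.045, top_n=10):
    """Rank workloads by GB through NAT and attach the processing fee."""
    ranked = sorted(gb_by_workload.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, gb, round(gb * rate_per_gb, 2)) for name, gb in ranked[:top_n]]


# Hypothetical monthly volumes in GB.
volumes = {"image-pulls": 20_000, "api": 4_000, "batch": 12_000}
for name, gb, fee in top_egress_costs(volumes):
    print(f"{name:12s} {gb:>7,} GB  ${fee:,.2f}")
```

The fee here covers NAT processing only; bandwidth charges for the same traffic are billed separately.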

6) FinOps governance and observability
Typical savings: 50–70% of Log Analytics + faster catch on anomalies

Set per-table retention in Log Analytics. A single long retention setting (90 days or more) applied across all tables is the most common Azure observability waste pattern. Operational tables: 30 days. Audit/security: per policy.
Audit Sentinel data sources: every connector you enable ingests data. Disable connectors for sources your SOC doesn't actively monitor.
Enforce required cost allocation tags via Azure Policy (e.g., environment, team, product, cost-center). Without tags, chargeback/showback is impossible.
Set budgets and anomaly alerts in Cost Management per subscription, resource group, and product tag. Anomaly alerts catch the slow drift that threshold alerts miss.
Track unit economics: cost per customer, cost per workload, or cost per transaction. Absolute monthly spend hides the customer-cohort metrics that actually matter.
Run a monthly cost review with engineering, platform, and finance stakeholders. The single biggest predictor of sustained Azure savings is whether someone is actually looking at the bill on a cadence.
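Before enforcing tags through Azure Policy, it helps to see how far the current estate is from compliant. The sketch below audits an exported resource list for the example tags named above. It assumes records shaped like the JSON that az resource list returns (a `name` field and a `tags` object that may be null); the resource names are hypothetical.

```python
REQUIRED_TAGS = ("environment", "team", "product", "cost-center")


def missing_tags(resources, required=REQUIRED_TAGS):
    """Map resource name -> list of required tags it lacks."""
    report = {}
    for res in resources:
        # Tag names are compared case-insensitively; tags may be None.
        present = {key.lower() for key in (res.get("tags") or {})}
        gaps = [tag for tag in required if tag not in present]
        if gaps:
            report[res["name"]] = gaps
    return report


# Hypothetical export.
resources = [
    {"name": "aks-prod", "tags": {"environment": "prod", "team": "platform",
                                  "product": "core", "cost-center": "1001"}},
    {"name": "vm-scratch", "tags": None},
    {"name": "sql-reporting", "tags": {"Environment": "prod", "team": "data"}},
]
print(missing_tags(resources))
```

Resources that show up here are the ones that would break chargeback or showback, so they make a natural backlog before a deny-mode policy is switched on.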

Azure-specific tooling worth evaluating

For mid-market Azure tenants past $30k/mo, third-party tooling often pays for itself within 2–4 months by automating what's manual on this checklist. Best fit depends on your existing stack:

We don't currently take affiliate commissions on these — if a tool comes up in an audit recommendation, it's because it fits the workload, not because it pays a referral.

Common Azure cost anti-patterns

30-day Azure cost optimization plan

Re-measure in 60 days. If the bill hasn't moved 20%, the bottleneck is almost always organizational — either nobody owns the optimization work, or the team owning it doesn't have authority to change workloads. The fix at that point is governance, not technology.
