Every organization I have audited has Key Vault. Every one of them says they rotate secrets. Almost none of them can tell me, on a specific secret, when it was last rotated, what consumed it, and what would break if it rotated at 3 p.m. on a Tuesday. Rotation on Azure Key Vault is not a hard feature. It is a hard operational practice, and the gap between "we have rotation" and "rotation actually works" is where most of the real risk lives.
This post covers the three object types Key Vault manages — secrets, keys, and certificates — and the rotation pattern that works for each at production scale. It assumes you are using Key Vault with RBAC (the access policy model is legacy; please stop deploying new vaults on it) and that you have some form of infrastructure as code for the resources that consume these materials.
The Rotation Problem in One Paragraph
Rotation is not "generate a new value and put it in Key Vault." Rotation is "generate a new value, get every consumer to pick it up, retire the old value, and verify nothing broke." The part in the middle — the handoff — is the part that every rotation strategy is really about. Key Vault gives you the primitives for the first and third steps; the handoff is your architecture.
There are three handoff patterns that work:
- Just-in-time retrieval: consumers read the secret every time they need it. Rotation is immediate. Cost is a Key Vault call on every request, which adds latency and can run into the vault's throttling limits at volume.
- Cached with TTL: consumers read the secret into memory with a 5–15 minute TTL. Rotation completes in one TTL. Cost is a small window of stale reads.
- Two-secret overlap: two versions of the secret are valid simultaneously for a defined overlap period. Rotation is non-atomic but safe. Cost is coordinating the overlap.
Every rotation strategy you build is a combination of one of these and some automation around generation.
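As a rough illustration of the cached-with-TTL pattern, here is a minimal Python sketch. The `CachedSecret` class and its `fetch` callable are hypothetical names introduced for this example; in a real consumer, `fetch` would wrap whatever Key Vault read the application already performs.

```python
import time
from typing import Callable, Optional


class CachedSecret:
    """Holds one secret value in memory and refetches it once the TTL elapses."""

    def __init__(
        self,
        fetch: Callable[[], str],          # hypothetical: wraps the real Key Vault read
        ttl_seconds: float = 600.0,        # 10 minutes, inside the 5-15 minute range
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Refetch on first use or once the cached value is older than the TTL.
        if self._value is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value
```

With this shape, a rotated value reaches every consumer within one TTL of being written, which is the "small window of stale reads" described above.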
Secret Rotation With Event Grid
Key Vault emits events to Azure Event Grid on every object lifecycle change — SecretNewVersionCreated, SecretNearExpiry, SecretExpired, and the equivalents for keys and certificates. Since 2020 this has been the native way to tie rotation into downstream automation, and by 2024 it is the only pattern I recommend for new workloads.
The working architecture is:
- A Key Vault secret has a rotation policy (auto-rotation for supported secret types, or a timer-triggered function for custom ones).
- Rotation generates a new secret version and stores it in Key Vault.
- Key Vault publishes a Microsoft.KeyVault.SecretNewVersionCreated event to an Event Grid system topic.
- An Azure Function subscribed to that topic performs the handoff: it calls the target system to accept the new secret, verifies, and retires the old version.
For Azure SQL passwords, storage account keys, and Azure Cosmos DB keys, Microsoft ships reference implementations of the rotation function, and Key Vault's auto-rotation handles the generation side. For custom secrets (API keys to a third-party SaaS, a webhook signing secret), you write the rotation function yourself. The template is the same: new version event in, API call to the target out, old version disabled on success.
The part most teams miss is the retirement. The rotation function creates the new version and hands it to the target, but the old version stays valid in Key Vault indefinitely unless something disables it. A rotation that never retires the old version is not a rotation; it is a proliferation. The retirement step — az keyvault secret set-attributes --enabled false on the old version after a safe window — has to be part of the automation.
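The handoff-then-retire flow can be sketched roughly as follows. This is not a complete Azure Function: the four callables (`apply_to_target`, `verify_target`, `list_enabled_versions`, `disable_version`) are hypothetical stand-ins for the target system's API and the Key Vault operations, and the event field names (`VaultName`, `ObjectName`, `Version`) are assumed to follow the Key Vault Event Grid event schema. The sketch retires old versions immediately after verification; a production version would wait out a safe window first.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List, Mapping, Optional

NEW_VERSION_EVENT = "Microsoft.KeyVault.SecretNewVersionCreated"


@dataclass(frozen=True)
class SecretVersionRef:
    vault_name: str
    secret_name: str
    version: str


def parse_new_version_event(event: Mapping[str, Any]) -> Optional[SecretVersionRef]:
    """Return the secret version named in the event, or None for other event types."""
    if event.get("eventType") != NEW_VERSION_EVENT:
        return None
    data = event["data"]  # assumed field names, see lead-in
    return SecretVersionRef(data["VaultName"], data["ObjectName"], data["Version"])


def handle_rotation(
    event: Mapping[str, Any],
    apply_to_target: Callable[[SecretVersionRef], None],
    verify_target: Callable[[SecretVersionRef], bool],
    list_enabled_versions: Callable[[str], Iterable[str]],
    disable_version: Callable[[str, str], None],
) -> List[str]:
    """Hand the new version to the target, verify it, then disable older versions."""
    ref = parse_new_version_event(event)
    if ref is None:
        return []
    apply_to_target(ref)
    if not verify_target(ref):
        # Leave old versions enabled so the target keeps working on failure.
        raise RuntimeError("target rejected version %s of %s" % (ref.version, ref.secret_name))
    retired = []
    for version in list_enabled_versions(ref.secret_name):
        if version != ref.version:
            disable_version(ref.secret_name, version)  # the retirement step
            retired.append(version)
    return retired
```

The important design choice is the ordering: nothing is disabled until the target has verifiably accepted the new version, so a failed handoff leaves the old credential usable.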
The Two-Secret Strategy for Non-Atomic Rotation
Some targets cannot accept a new credential atomically. A service account in a legacy system may have one password slot. A webhook endpoint may only accept one signing secret. For these, the two-secret strategy is the answer.
The pattern stores two secrets in Key Vault — webhook-signing-primary and webhook-signing-secondary — and consumers try both on every validation. Rotation works by:
- Generate a new value and write it to the secondary slot.
- Wait for the TTL of the consumer cache to expire and confirm all consumers accept both values.
- Configure the producer (the webhook sender) to sign with the secondary.
- Promote secondary to primary; write a fresh new value to secondary for the next cycle.
This buys a safe handoff without requiring atomicity, at the cost of complexity, and it is the right tradeoff for credentials that cross trust boundaries where a coordinated handoff is not possible. I have deployed this for webhook signing keys, OAuth client secrets, and JWT signing material. The implementation is a few hundred lines of Azure Function code plus Key Vault RBAC role assignments that grant the producer and consumers the right scope.
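The consumer side of this pattern, "try both on every validation," might look like the following sketch. It assumes an HMAC-SHA256 hex signature, which is a common webhook scheme but not something this post specifies; `sign` and `verify_with_any` are hypothetical helper names.

```python
import hashlib
import hmac
from typing import Iterable


def sign(secret: bytes, payload: bytes) -> str:
    """Producer side: hex HMAC-SHA256 of the payload (assumed scheme)."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_with_any(payload: bytes, signature: str, secrets: Iterable[bytes]) -> bool:
    """Consumer side: accept the signature if any currently valid secret produced it."""
    provided = signature.encode("utf-8")
    accepted = False
    for secret in secrets:
        expected = sign(secret, payload).encode("utf-8")
        # compare_digest is a constant-time comparison; every slot is checked
        # so the loop does not reveal which slot matched.
        if hmac.compare_digest(expected, provided):
            accepted = True
    return accepted
```

A consumer would call `verify_with_any(body, header_signature, [primary, secondary])` with the two values it has cached from webhook-signing-primary and webhook-signing-secondary.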
Keys and the HSM Question
Cryptographic keys in Key Vault have different rotation requirements from secrets. A secret is data; a key is data plus an operation. Rotating an RSA signing key means all signatures made with the old key need a path to verification after rotation, which is usually "keep the old key enabled for verify-only operations until every signed artifact has expired."
Key Vault supports two key rotation modes: on-demand rotation, which creates a new key version immediately, and auto-rotation. Auto-rotation (GA since mid-2022) generates a new key version on a schedule (typical values are 60 or 90 days), and the new version becomes the default. The old version stays accessible by version-specific URI for operations that need it. For signing keys with long-lived artifacts (Notation image signatures, for instance), the old version needs to stay enabled for the lifetime of the artifacts it signed.
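One way to reason about when an old signing key version can be disabled is sketched below. It rests on a simplifying assumption that is mine, not Key Vault's: a version stops producing new signatures the moment its successor is created, so the last artifact it signed expires no later than the successor's creation time plus the maximum artifact lifetime. `KeyVersion` and `versions_safe_to_disable` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class KeyVersion:
    version: str
    created_on: datetime


def versions_safe_to_disable(
    versions: Iterable[KeyVersion],
    max_artifact_lifetime: timedelta,
    now: datetime,
) -> List[str]:
    """Return old versions whose signed artifacts have all expired by `now`."""
    ordered = sorted(versions, key=lambda v: v.created_on)
    safe = []
    # Pair each version with its successor; the newest version has no
    # successor and is therefore never returned.
    for old, successor in zip(ordered, ordered[1:]):
        if successor.created_on + max_artifact_lifetime <= now:
            safe.append(old.version)
    return safe
```

If signing with an old version can continue after rotation (for example, through a pinned version URI), the cutoff has to be the time of the last observed sign operation instead.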
For workloads that require FIPS 140-2 Level 3 validation, software-protected keys are out: you need HSM-backed keys on the Premium SKU at minimum, and for very high-assurance workloads, Managed HSM is the right tier. Managed HSM has a different rotation API but the same lifecycle patterns apply. The big difference with Managed HSM is key release policies: you can attach an attestation policy to a key so that the key is only released to an attested trusted execution environment (TEE). That is a different kind of rotation consideration, because the policy itself is part of the key's trust boundary.
Certificate Rotation and the Integration Story
Key Vault certificates are the rotation object that breaks most silently, because the failure mode is "the certificate expires and the service stops accepting connections," and the weeks of warning beforehand pass without anyone noticing. The auto-renewal feature (GA since 2018) is table stakes: every certificate should have a renewal policy that triggers 30 days before expiry. But renewal is only half the story. Renewal generates a new certificate version; the target system still has to pick it up.
For Azure App Service and Application Gateway, the integration is native — the service imports the certificate from Key Vault and auto-updates on new versions, usually within 24 hours. For any other consumer — NGINX in a VM, an AKS ingress, a third-party appliance — the pickup is your automation. Event Grid events on CertificateNewVersionCreated are the hook, and the downstream Function or Logic App is the actuator.
The specific failure mode to plan for: the new certificate version is issued, the Event Grid event fires, the Function runs the deployment, and the deployment fails because the target is having a bad day. If the function does not retry and does not alert, the old certificate expires 30 days later while everyone assumes it renewed. Retries with exponential backoff plus alerting on rotation failures are not optional; they are the difference between rotation and hoping.
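A minimal sketch of retry with exponential backoff plus an alert on final failure is shown below. The `deploy` and `alert` callables are hypothetical: `deploy` stands for whatever pushes the new certificate to the target, and `alert` for whatever pages a human. The delay values are illustrative defaults, not recommendations from any Azure documentation.

```python
import time
from typing import Callable, Optional


def run_with_backoff(
    deploy: Callable[[], None],
    alert: Callable[[str], None],
    max_attempts: int = 5,
    base_delay: float = 2.0,
    max_delay: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try deploy() up to max_attempts times; alert and return False if all fail."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            deploy()
            return True
        except Exception as exc:  # broad on purpose: any failure counts as a retry
            last_error = exc
            if attempt < max_attempts:
                # Delay doubles each attempt (2s, 4s, 8s, ...), capped at max_delay.
                sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
    alert("certificate deployment failed after %d attempts: %s" % (max_attempts, last_error))
    return False
```

The alert fires only after the final attempt, so a transient "bad day" on the target produces retries rather than pages, while a persistent failure cannot pass silently.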
Access, Audit, and Rotation Observability
Every rotation event should be auditable. Key Vault's diagnostic settings, sent to Log Analytics, record every secret read, write, and version change with the calling identity. The base audit log exists. What most teams are missing is the higher-level view: "for every secret in every vault, show me the last rotation date and the next scheduled rotation." Building that view is a Kusto query on AzureDiagnostics plus the vault's secret list, and it is usually the first dashboard I help teams build.
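The shape of that higher-level view can be sketched in a few lines once the data is in hand. The example below assumes the last rotation date per secret has already been extracted (from the audit log or the vault's secret list) and that each secret has an intended cadence recorded somewhere; `SecretRecord` and `rotation_report` are hypothetical names, and the Kusto query itself is not shown.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Any, Dict, Iterable, List


@dataclass(frozen=True)
class SecretRecord:
    vault: str
    name: str
    last_rotated: datetime
    cadence: timedelta  # intended rotation interval for this secret


def rotation_report(records: Iterable[SecretRecord], now: datetime) -> List[Dict[str, Any]]:
    """One row per secret with its last rotation, next due date, and overdue flag."""
    rows = []
    for record in records:
        next_due = record.last_rotated + record.cadence
        rows.append({
            "vault": record.vault,
            "name": record.name,
            "last_rotated": record.last_rotated,
            "next_due": next_due,
            "overdue": next_due < now,
        })
    # Most urgent first: the earliest due date sorts to the top.
    rows.sort(key=lambda row: row["next_due"])
    return rows
```

Sorting by due date puts the secrets that have silently missed their cadence at the top of the dashboard.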
The secondary observability control is alerting on rotation failure. Alerts driven by the SecretNearExpiry and SecretExpired Event Grid events (a secret reaching expiry with no new version created in the preceding window) catch the silent-failure case above.
How Safeguard Helps
Safeguard inventories every Key Vault object across subscriptions, reconciles rotation policies against actual rotation history, and surfaces secrets that have not rotated within their intended cadence — the "we have rotation" versus "rotation actually works" gap that I opened with. For certificates, it tracks pickup status on downstream consumers through the Azure Resource Graph so a failed Application Gateway refresh shows up before the certificate expires. Rotation coverage becomes a reported metric rather than a spreadsheet, and the long tail of "secrets nobody rotates because nobody remembers what uses them" becomes a closable backlog.