Azure Key Vault Managed HSM for Artifact Signing: Pattern Library

Managed HSM gives you FIPS 140-3 Level 3 key custody in Azure. We map the patterns for using it as the root of trust for code signing, container signing, and SBOM attestation.

Alex
Security Engineer

A software supply chain program eventually runs into a question that engineering teams would rather avoid: where does the signing key actually live? Storing a code-signing key on a build agent is a known anti-pattern, but the practical alternatives have historically had ugly tradeoffs. Hardware security modules in a data center solve the trust problem and introduce a procurement problem. Cloud KMS services solve the procurement problem at the cost of inheriting the cloud provider's compliance posture for the key material. Azure Key Vault Managed HSM threads this needle. It is a single-tenant HSM cluster running on hardware certified at FIPS 140-3 Level 3, exposed through familiar Azure APIs, with key custody that even Microsoft cannot bypass under its security domain model. For supply chain defenders building a signing program in 2026, it is one of the most operationally usable options. It is also one of the easiest to misconfigure into a service that provides regulatory checkbox value without real security improvement.

What makes Managed HSM different from standard Key Vault?

Standard Azure Key Vault is a multi-tenant service: the Standard tier protects keys in software, and the Premium tier adds HSM-backed keys held in a shared HSM pool. Managed HSM is a single-tenant cluster of FIPS 140-3 Level 3 HSMs dedicated to one customer. Three differences matter operationally. First, Managed HSM uses a security domain, protected by a customer-controlled set of three to ten RSA key pairs with a quorum required for recovery, which means Microsoft cannot administratively reset or recover the cluster without those keys. Second, Managed HSM has its own role-based access control model with separate roles for administrators, crypto officers, and crypto users, so least privilege can be applied at the level of individual keys and operations rather than the whole resource. Third, Managed HSM provides higher throughput and lower latency for signing operations than standard Key Vault, which matters when you sign every artifact in a CI pipeline rather than occasional production releases.

How should the security domain actually be handled?

The security domain is the most important artifact Managed HSM produces and the one most often mishandled. When you activate a newly provisioned Managed HSM, you supply a set of RSA public keys and a quorum (commonly three of five); the HSM encrypts its security domain to those keys and emits an encrypted security-domain file. To recover the HSM after a disaster, a credential compromise, or a misconfiguration, you must present at least a quorum of the corresponding private keys. The right pattern is to generate those keys on hardware tokens that never touch a build agent, distribute them to officers across organizational boundaries (security, compliance, engineering leadership, legal), store the encrypted security-domain file in a location independent of Azure (an on-prem safe, an offline encrypted backup, or a customer-controlled S3 bucket if you accept the cross-cloud dependency), and rehearse the recovery procedure annually. The wrong pattern is generating all five keys on the same workstation and storing the security-domain file in the same Azure subscription as the HSM, which turns a Level 3 control into a Level 1 control in practice.
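As a rough sketch of the activation step, the Azure CLI exposes `az keyvault security-domain download`, which takes the officers' public certificates, a quorum, and an output path. The HSM name, the file naming, and the `check_quorum` helper below are illustrative assumptions, and the bounds it enforces (3 to 10 keys, quorum of at least 2) reflect one reading of the documentation that should be confirmed before use. Only public certificates are passed in; the private keys stay on the officers' tokens.

```shell
# Hypothetical HSM name, reused from the role assignment example in this article.
HSM_NAME="myorg-mhsm"

# Sanity-check quorum parameters before touching the HSM.
# Assumed bounds: 3 to 10 wrapping keys, quorum >= 2 and <= number of keys.
check_quorum() {
  local quorum="$1" total="$2"
  [ "$total" -ge 3 ] && [ "$total" -le 10 ] &&
    [ "$quorum" -ge 2 ] && [ "$quorum" -le "$total" ]
}

# Usage: download_security_domain <quorum> <officer-cert.cer>...
# Each argument after the quorum is one officer's PUBLIC certificate file.
download_security_domain() {
  local quorum="$1"; shift
  check_quorum "$quorum" "$#" || { echo "invalid quorum or key count" >&2; return 1; }
  az keyvault security-domain download \
    --hsm-name "$HSM_NAME" \
    --sd-wrapping-keys "$@" \
    --sd-quorum "$quorum" \
    --security-domain-file "${HSM_NAME}-security-domain.json"
}
```

A call such as `download_security_domain 3 officer1.cer officer2.cer officer3.cer officer4.cer officer5.cer` would correspond to the three-of-five arrangement; the resulting file then belongs in the off-Azure storage described above.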

What is the right role split for an artifact signing pipeline?

Managed HSM defines roles at the data-plane level, separate from Azure RBAC at the management plane. The roles that matter for a signing pipeline are Managed HSM Crypto User (can perform crypto operations using a key), Managed HSM Crypto Officer (can manage key lifecycle), and Managed HSM Administrator (can manage roles and the HSM itself). The right split for a CI/CD signing flow is: build agents authenticate with a workload identity that has Crypto User on the specific signing key, with permission for sign but not decrypt or unwrapKey; key lifecycle (creation, rotation, deletion) is performed by a Crypto Officer identity that requires a manual approval workflow and is not accessible to CI; administrators of the HSM are a small group of humans, ideally requiring privileged access workstations to authenticate. The build agent's identity should not be able to read the key material or list keys it does not need; both are properties of the role assignment, not the HSM.

# Assign a workload identity Crypto User on a specific signing key only
az keyvault role assignment create \
  --hsm-name myorg-mhsm \
  --role "Managed HSM Crypto User" \
  --assignee-object-id "$BUILD_WORKLOAD_PRINCIPAL" \
  --scope "/keys/codesign-prod-2026"

How does this integrate with Notation, Cosign, and Sigstore?

All three of the main artifact signing toolchains can use Managed HSM as a key backend. Notation supports it through a plugin model that calls Azure APIs with the build agent's managed identity. Cosign supports KMS-backed keys via a URI scheme that includes the vault and key reference. Sigstore's keyless signing is a different model that relies on short-lived certificates from Fulcio, but organizations that prefer keyed signing can run their own Fulcio instance with a Managed HSM-backed CA key, getting Sigstore's transparency log benefits while keeping a long-term root of trust under FIPS 140-3 Level 3 custody. The choice between these is less about Managed HSM and more about how you want to handle revocation, transparency, and offline verification. The common requirement is that the key never leaves the HSM. All three toolchains support that, so the right one for your environment is whichever your developers will actually adopt.
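For the Cosign case, a minimal sketch of the URI-based key reference might look like the following. It assumes Cosign's `azurekms://<vault host>/<key name>` scheme and that your Cosign version accepts a Managed HSM hostname in that position, which is worth verifying; the host and key name are hypothetical and mirror the earlier role assignment example.

```shell
# Hypothetical HSM host and key name.
HSM_HOST="myorg-mhsm.managedhsm.azure.net"
KEY_NAME="codesign-prod-2026"

# Build the KMS key reference Cosign expects for Azure-hosted keys.
cosign_key_uri() {
  printf 'azurekms://%s/%s\n' "$1" "$2"
}

# Sign an image by digest; the private key operation happens inside the HSM.
# Usage: sign_image registry.example.com/app@sha256:<digest>
sign_image() {
  local image_ref="$1"
  cosign sign --key "$(cosign_key_uri "$HSM_HOST" "$KEY_NAME")" "$image_ref"
}
```

The build agent's workload identity supplies the Azure credentials, so no key material or secret appears in the pipeline definition.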

What about geographic and tenancy constraints?

Managed HSM is provisioned to a specific Azure region and does not replicate cross-region automatically. For organizations with regulatory requirements that signing keys remain in a specific jurisdiction, that is a feature: the key is provably resident in the region you provisioned to. For organizations with availability requirements that signing must continue if a region goes offline, that is a constraint: you need a second Managed HSM in another region, an independent security domain, and a process for re-signing critical artifacts under the secondary HSM during an outage. The wrong shortcut is to back up keys across regions; the security domain model deliberately prevents that, and any procedure that appears to do so should be treated as a misconfiguration. The right pattern is hot standby: two HSMs, two key identities, signatures from either accepted by the verification policy, and a documented procedure for promoting the standby during an incident.
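The "signatures from either accepted" part of that policy can be sketched as a verification loop over the trusted public keys. This is only an illustration using `cosign verify --key`; the function names and the key file names in the usage note are hypothetical, and a production policy would more likely live in an admission controller or policy engine than in a script.

```shell
# Verify one image against one trusted public key (file path or KMS URI).
verify_one() {
  cosign verify --key "$1" "$2" >/dev/null 2>&1
}

# Accept the image if ANY of the trusted keys verifies it.
# Usage: verify_with_any_key <image-ref> <key>...
verify_with_any_key() {
  local image="$1"; shift
  local key
  for key in "$@"; do
    if verify_one "$key" "$image"; then
      echo "verified with $key"
      return 0
    fi
  done
  echo "no trusted key verified $image" >&2
  return 1
}
```

For example, `verify_with_any_key "$IMAGE" primary-region.pub standby-region.pub` passes when either HSM produced the signature, so promoting the standby does not require changing the verifier.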

How does signing cost and latency play out at scale?

Managed HSM pricing has historically been a per-cluster monthly cost plus a per-operation cost. For organizations that sign every CI build, the per-operation cost adds up. The cost-optimization patterns are: cache the result of attestation generation rather than re-signing identical artifacts, batch signing operations where the tooling supports it (Notation and Cosign both support signing multiple artifacts in a session), and split signing across two key types — a fast-rotating ephemeral signing key for development artifacts and a long-lived release key for production artifacts. Latency is rarely the bottleneck for signing itself; the typical Managed HSM sign operation completes in tens of milliseconds. The bottleneck is usually the surrounding workflow: pulling certificates, building attestation envelopes, uploading to transparency logs. Profile end-to-end before optimizing the HSM call.
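The caching pattern can be approximated with a marker file keyed by the artifact reference, so an identical artifact is sent to the HSM only once. This is a sketch under stated assumptions: `SIGN_CACHE_DIR`, `SIGNING_KEY_URI`, and both function names are hypothetical, the cache is local to one runner, and a shared pipeline would need a shared store instead of a directory.

```shell
# Hypothetical cache location; override per runner or pipeline.
SIGN_CACHE_DIR="${SIGN_CACHE_DIR:-.sign-cache}"

# Hypothetical wrapper around the real signing call (one billable HSM operation).
sign_digest() {
  cosign sign --key "$SIGNING_KEY_URI" "$1"
}

# Sign an artifact reference only if it has not been signed before.
# Prints "cached" on a cache hit and "signed" after a successful new signature.
sign_once() {
  local ref="$1"
  local marker
  marker="$SIGN_CACHE_DIR/$(printf '%s' "$ref" | sha256sum | cut -d' ' -f1)"
  mkdir -p "$SIGN_CACHE_DIR"
  if [ -e "$marker" ]; then
    echo "cached"
    return 0
  fi
  # Record the marker only when signing succeeded, so failures are retried.
  sign_digest "$ref" && touch "$marker" && echo "signed"
}
```

Keying on a digest-pinned reference rather than a tag matters here, since a mutable tag could point at different content on the next build.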

How Safeguard Helps

Safeguard inventories every Managed HSM and standard Key Vault across your Azure tenants, mapping which signing keys are bound to which CI/CD identities, when each was last rotated, and which artifacts in production were signed with which key version. Policy gates block infrastructure-as-code changes that grant decrypt or unwrapKey permissions to build identities that only need sign, that move signing keys into standard Key Vault from Managed HSM, or that disable purge protection on a vault holding signing material. Griffin AI traces a deployed artifact back through its signature to the HSM key, the role assignment that permitted the sign operation, and the workload identity that requested it — turning "is this image really ours" into a one-query answer. Continuous monitoring of HSM activity logs, security domain status, and role assignments produces a defended signing program that satisfies auditors and survives the operational realities of teams, rotations, and incident response.
