Datadog is the rare observability platform that has credibly pushed into security without abandoning its roots. If you already have Datadog instrumented across your infrastructure, the incremental effort to turn it into a supply chain security sensor is smaller than standing up a dedicated SIEM. The platform is not without limitations, especially around historical data retention and some detection rule expressiveness, but for teams that want fast time to value, it is a reasonable fit.
This post covers how I have approached supply chain monitoring in Datadog across three engagements over the past year. The patterns are fairly consistent, though the specific sources and rules vary with each customer's stack.
Why Datadog Specifically
Most of the teams I work with already have Datadog for application monitoring and infrastructure metrics. Their engineering organizations live in Datadog dashboards. That is a real advantage for security, because detections and dashboards that land in the same tool engineers already use get attention in a way that alerts in a separate SIEM often do not.
Datadog's Cloud SIEM specifically brought detection rule authoring and log-based alerting into the same UI as infrastructure monitoring. Combined with Application Security Management for runtime attack detection and the broad logs pipeline for ingesting anything with a JSON structure, the platform covers enough of the supply chain surface to be useful.
Ingesting Supply Chain Sources
Most of the important sources for supply chain monitoring are not covered by native Datadog integrations, so ingestion ends up relying on some mix of the generic HTTP logs API, the Datadog agent with custom log sources, and the few relevant integrations that do exist, like the GitHub integration in the default integration catalog.
The sources I typically wire up:
- GitHub audit log via the native Datadog GitHub integration
- Jenkins build logs via the Datadog Jenkins plugin, which adds build metadata as tags
- Artifactory access logs via the Datadog Artifactory integration that ships logs from the built-in Artifactory log aggregator
- npm registry events via a custom Lambda that polls the npm changes feed and posts to the logs API (sketched below)
- Container image scan results from Trivy or Grype output via a simple shipper
All of these land in the Datadog logs pipeline, and I tag them with service:supply_chain along with more specific tags like component:scm or component:registry. Those tags drive the dashboards and detection rule filtering.
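The npm piece is the only one that needs real code. Here is a minimal sketch of that Lambda, assuming the public npm replication changes feed and the standard Datadog logs intake endpoint; the feed URL, event attribute layout, and the checkpoint helpers are my own choices for illustration, not anything npm or Datadog mandates:

```python
import json
import os
import urllib.request

DD_LOGS_URL = "https://http-intake.logs.datadoghq.com/api/v2/logs"
NPM_CHANGES_URL = "https://replicate.npmjs.com/_changes"  # CouchDB-style changes feed

def get_checkpoint() -> str:
    # Placeholder: in practice, read the last processed sequence from SSM or DynamoDB.
    return os.environ.get("NPM_SINCE_SEQ", "now")

def save_checkpoint(seq) -> None:
    # Placeholder: in practice, persist the sequence for the next invocation.
    pass

def handler(event, context):
    """Poll the npm changes feed and forward new package events to the Datadog logs API."""
    since = get_checkpoint()
    with urllib.request.urlopen(f"{NPM_CHANGES_URL}?since={since}&limit=200", timeout=30) as resp:
        changes = json.loads(resp.read())

    logs = [
        {
            "ddsource": "npm_registry",
            "ddtags": "service:supply_chain,component:registry",
            "service": "supply_chain",
            "message": json.dumps({
                "evt": {"name": "npm_package_change"},
                "package": change.get("id"),
                "seq": change.get("seq"),
            }),
        }
        for change in changes.get("results", [])
    ]
    if logs:
        req = urllib.request.Request(
            DD_LOGS_URL,
            data=json.dumps(logs).encode(),
            headers={"Content-Type": "application/json", "DD-API-KEY": os.environ["DD_API_KEY"]},
        )
        urllib.request.urlopen(req, timeout=30)

    save_checkpoint(changes.get("last_seq", since))
```

The Trivy and Grype shipper follows the same shape and shows up again in the container image detection below.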
Detection Rule: Unusual Workflow Execution Time
GitHub Actions workflows have predictable durations. When a workflow suddenly takes three times as long as usual, something changed. This could be legitimate, like a new test suite, but it could also indicate crypto mining embedded in a workflow step or additional exfiltration activity extending runtime.
The detection rule pairs a log query with a baseline comparison, shown here schematically:
```
source:github @evt.name:workflow_run.completed
@duration:>300000000000
(@duration > avg_past_14d(@duration) * 3)
```
The detection fires when a workflow's duration exceeds three times its 14-day average. The baseline is computed with a Datadog metric query that rolls up workflow duration per workflow name.
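One way to compute that baseline is to generate a log-based metric from the workflow_run events and query its 14-day average per workflow through the metrics query API. A rough sketch, assuming a log-generated metric I am calling supply_chain.workflow.duration tagged by workflow_name (the metric name and tag are mine):

```python
import json
import os
import time
import urllib.parse
import urllib.request

def workflow_baseline(workflow_name: str) -> float:
    """Return the 14-day average duration for one workflow via the Datadog metrics query API."""
    now = int(time.time())
    # Hypothetical log-generated metric; created from the GitHub workflow_run events.
    query = f"avg:supply_chain.workflow.duration{{workflow_name:{workflow_name}}}"
    params = urllib.parse.urlencode({"from": now - 14 * 86400, "to": now, "query": query})
    req = urllib.request.Request(
        f"https://api.datadoghq.com/api/v1/query?{params}",
        headers={
            "DD-API-KEY": os.environ["DD_API_KEY"],
            "DD-APPLICATION-KEY": os.environ["DD_APP_KEY"],
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        series = json.loads(resp.read()).get("series", [])
    points = [p[1] for s in series for p in s.get("pointlist", []) if p[1] is not None]
    return sum(points) / len(points) if points else 0.0

def is_anomalous(duration: float, workflow_name: str) -> bool:
    """A run is flagged when its duration exceeds three times the baseline."""
    baseline = workflow_baseline(workflow_name)
    return baseline > 0 and duration > 3 * baseline
```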
In practice this rule has caught two legitimate regressions (a test suite that grew without anyone noticing) and one actually suspicious case where a workflow pulled a secondary artifact and ran it for several minutes before the legitimate build steps.
Detection Rule: New Package Source at Build Time
Build agents that suddenly install from a new registry deserve attention. The rule uses Datadog's process monitoring in combination with build logs:
```
source:jenkins @build.step_name:*install*
@registry:*
NOT @registry:(registry.npmjs.org OR artifactory.internal.com)
```
This requires parsing the registry URL out of npm, pip, or maven command outputs, which I handle in a logs pipeline processor. The processor matches common patterns like npm install --registry=<url> and extracts the registry hostname into the @registry attribute.
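The processor itself is configured in the Datadog UI, but the matching logic is just a handful of patterns. Here is the equivalent extraction sketched in Python, which I also run shipper-side when the build log format is under my control; the patterns are illustrative, not exhaustive:

```python
import re
from urllib.parse import urlparse

# Illustrative patterns for registry URLs appearing in npm, pip, and Maven install output.
REGISTRY_PATTERNS = [
    re.compile(r"--registry[= ](\S+)"),                           # npm install --registry=<url>
    re.compile(r"(?:--index-url|-i)[= ](\S+)"),                   # pip install --index-url <url>
    re.compile(r"Downloading(?: from [\w-]+)?: (https?://\S+)"),  # Maven download lines
]

def extract_registry(line: str) -> str | None:
    """Return the registry hostname referenced in a build log line, if any."""
    for pattern in REGISTRY_PATTERNS:
        match = pattern.search(line)
        if match:
            url = match.group(1)
            return urlparse(url if "://" in url else f"https://{url}").hostname
    return None

assert extract_registry("npm install --registry=https://evil.example.com left-pad") == "evil.example.com"
```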
Detection Rule: Container Image Vulnerability at Runtime
Datadog's Container Image Management scans images in production and surfaces vulnerabilities. The detection rule fires when a critical CVE appears in an image actively running in a production namespace:
```
@container.image.vulnerability.severity:critical
@container.namespace:production
@evt.name:image_scan_result
```
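Where I feed this from the Trivy shipper mentioned in the ingestion list rather than Datadog's own scans, the shipper maps Trivy's JSON report into the same attributes the rule queries. A minimal sketch, assuming Trivy's standard JSON output; the evt.name and attribute layout are my conventions, chosen to match the rule above:

```python
import json
import os
import subprocess
import urllib.request

DD_LOGS_URL = "https://http-intake.logs.datadoghq.com/api/v2/logs"

def ship_trivy_results(image: str, namespace: str) -> None:
    """Run Trivy against an image and forward each finding to Datadog as a structured log."""
    report = json.loads(
        subprocess.run(
            ["trivy", "image", "--format", "json", "--quiet", image],
            check=True, capture_output=True, text=True,
        ).stdout
    )
    logs = []
    for result in report.get("Results", []):
        for vuln in result.get("Vulnerabilities", []) or []:
            logs.append({
                "ddsource": "trivy",
                "ddtags": "service:supply_chain,component:image",
                "service": "supply_chain",
                "message": json.dumps({
                    "evt": {"name": "image_scan_result"},
                    "container": {
                        "namespace": namespace,
                        "image": {
                            "name": image,
                            "vulnerability": {
                                "id": vuln.get("VulnerabilityID"),
                                "severity": vuln.get("Severity", "").lower(),
                            },
                        },
                    },
                }),
            })
    if logs:
        req = urllib.request.Request(
            DD_LOGS_URL,
            data=json.dumps(logs).encode(),
            headers={"Content-Type": "application/json", "DD-API-KEY": os.environ["DD_API_KEY"]},
        )
        urllib.request.urlopen(req, timeout=30)
```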
The value of this detection comes from its correlation with other signals. Datadog can join the image vulnerability event with the Kubernetes audit logs that show which service account deployed the image, which surfaces both the risk and the actor accountability.
Detection Rule: Secret Detected in CI Log
GitHub and GitLab have native secret scanning, but they miss secrets that appear in build output rather than committed files. Datadog's Sensitive Data Scanner can run on log streams, including Jenkins build logs. The configuration looks like:
```
rule: detect_aws_access_key_in_build_logs
matches: AKIA[0-9A-Z]{16}
replace: [REDACTED]
scanning_group: build_logs
```
When the scanner matches, it can trigger a detection rule that pages the security team. The scanner also redacts the secret in the log stream, which prevents it from being visible to anyone with log read access. That matters because build logs often have much broader read access than SCM repositories.
Dashboards for Engineering Teams
The detection rules are one thing, but engineering leaders want a view that shows overall pipeline security health. I built a dashboard with these panels:
- Build success rate segmented by SBOM generation status
- Mean time from vulnerability disclosure to pipeline-blocked deployment
- Top 20 dependencies by production footprint with risk score overlay
- Secret scanner match rate across the last 30 days
- Attestation coverage percentage across production services
The attestation coverage panel in particular has been useful for driving adoption. When engineering leadership sees that 30 percent of services deploy without signed attestations, and that percentage shows up week over week, it becomes a metric teams care about.
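The attestation coverage number itself comes from outside Datadog; I compute it during deploys and submit it as a custom metric that the dashboard panel reads. A sketch against the v1 series endpoint, with a metric name and input shape I made up for illustration:

```python
import json
import os
import time
import urllib.request

def submit_attestation_coverage(services_with_attestation: dict[str, bool]) -> None:
    """Submit the percentage of production services deploying with signed attestations as a gauge."""
    total = len(services_with_attestation)
    signed = sum(1 for has_attestation in services_with_attestation.values() if has_attestation)
    coverage = 100.0 * signed / total if total else 0.0

    payload = {
        "series": [{
            "metric": "supply_chain.attestation.coverage",  # hypothetical metric name
            "points": [[int(time.time()), coverage]],
            "type": "gauge",
            "tags": ["service:supply_chain", "env:production"],
        }]
    }
    req = urllib.request.Request(
        "https://api.datadoghq.com/api/v1/series",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json", "DD-API-KEY": os.environ["DD_API_KEY"]},
    )
    urllib.request.urlopen(req, timeout=30)
```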
Using Watchdog for Anomaly Detection
Datadog's Watchdog feature finds anomalies in metrics automatically. For supply chain monitoring I apply Watchdog to:
- npm install count per build agent per hour
- Number of unique packages installed per project per day
- Artifactory upload count per user per hour
- GitHub Actions secret reads per repository per day
Watchdog surfaces deviations without requiring explicit rule authoring. The trade-off is reduced precision; it flags any unusual pattern, not specifically malicious ones. I treat Watchdog alerts as triage signals rather than high-confidence detections.
Integrating with Application Security Management
Datadog ASM adds runtime context that SIEMs generally lack. When a service starts making unexpected outbound connections, ASM detects it. For supply chain monitoring, this matters because the payload of a compromised package often manifests as runtime behavior rather than a build-time signature.
I correlate ASM events with package install events from build logs. If a service that includes a package installed in the last 14 days suddenly shows anomalous outbound connections detected by ASM, the detection rule fires with both events attached. This narrows the investigation dramatically compared to a bare ASM alert.
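Where the built-in correlation rules do not quite fit, the same join can be scripted against the logs search API: given the service attached to an ASM anomaly, check whether that service's build logs show a package install in the last 14 days. A sketch, assuming the build logs carry a service attribute that matches the runtime service name (the query shape and attribute names are my conventions):

```python
import json
import os
import urllib.request
from datetime import datetime, timedelta, timezone

DD_SEARCH_URL = "https://api.datadoghq.com/api/v2/logs/events/search"

def recent_installs_for_service(service: str) -> list[dict]:
    """Return build-log events showing package installs for this service in the last 14 days."""
    now = datetime.now(timezone.utc)
    body = {
        "filter": {
            "query": f"source:jenkins @build.step_name:*install* @service:{service}",
            "from": (now - timedelta(days=14)).isoformat(),
            "to": now.isoformat(),
        },
        "page": {"limit": 100},
    }
    req = urllib.request.Request(
        DD_SEARCH_URL,
        data=json.dumps(body).encode(),
        headers={
            "Content-Type": "application/json",
            "DD-API-KEY": os.environ["DD_API_KEY"],
            "DD-APPLICATION-KEY": os.environ["DD_APP_KEY"],
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read()).get("data", [])

def asm_alert_has_supply_chain_context(asm_event: dict) -> bool:
    """True when an ASM anomaly's service also installed new packages recently."""
    return bool(recent_installs_for_service(asm_event.get("service", "")))
```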
Retention and Historical Analysis
One real limitation is Datadog's log retention. Indexed logs default to 15 days of retention, 30 days in some configurations. Supply chain investigations often span months, because a compromised package can sit in your dependency tree for a year before anyone notices. For long-horizon investigations I configure Datadog to archive logs to S3 and use Log Rehydration to pull them back on demand. It is more cumbersome than native long-term retention, but it works.
Tuning and False Positive Management
The build duration detection was the noisiest one initially. Test suites legitimately grow. I tuned it by requiring both the duration anomaly and at least one other signal, like an unfamiliar process on the build agent or a new registry hostname in the logs. Combining weak signals into a higher-confidence detection reduced alert volume by roughly 70 percent.
How Safeguard Helps
Safeguard ships a Datadog integration that forwards supply chain findings, component risk scores, and SBOM analysis results into your Datadog logs pipeline as structured events. This lets your existing detection rules and dashboards reference Safeguard enrichment without custom data engineering. When a runtime anomaly surfaces in Datadog ASM, the linked Safeguard finding provides the build-time context engineers need to determine whether the cause is a compromised dependency. Teams using this integration typically reduce triage time on supply chain alerts because the upstream context is already attached to the event in Datadog.