
Safeguard Local Runner: Agentic Security on Your Laptop

The Local Runner is a command-line agent that runs Safeguard workflows against your working tree. Think claude-code-for-security, but for supply chain.

Shadab Khan
Security Engineer
7 min read

The Safeguard Local Runner is generally available. It is a command-line agent that runs Safeguard workflows against your local working tree — scan, classify, reach, remediate, and attest — the way Claude Code runs code-editing workflows. The mental model we use internally is "claude-code-for-security," but the workflows are structured and auditable rather than open-ended chat.

The Local Runner is the same Rust core that powers the desktop application, packaged as a single binary. It is appropriate for three audiences: developers who want to run workflows on a repo before committing, security engineers who want to script scans against many repos without a CI detour, and CI pipelines that want a sandbox-free agent with proper capability scoping.

What does the Local Runner actually do?

Answer first: it runs a named workflow against a local target and produces a structured result. If the workflow calls Griffin, it will edit your working tree; if it calls Eagle or Lino, it will write findings or evidence.

The basic invocation is safeguard run <workflow> --target <path>. The workflow is either one of the 50+ library templates or a YAML file in your repo under .safeguard/workflows/. The target is a repo, a directory, an image name, or a package manifest. The runner resolves the workflow, prompts for any missing inputs, runs each step, and writes a structured run record to disk.

Three typical sessions show the range. First, a developer about to open a PR runs safeguard run triage-diff --target . to scan the diff, classify any new dependencies with Eagle, run reachability on any new imports, and print a summary. Second, a security engineer runs safeguard run remediate-open-cves --target ./repos/* against a directory of clones to open remediation PRs in parallel. Third, a CI pipeline runs safeguard run gate-release --target ./image.tar --output json as a release gate and parses the JSON.

The runner differs from a typical CLI tool in two big ways. It is resumable: a remediation run that gets interrupted can be resumed with safeguard run resume <run-id>, and the model state is restored from the run record. And it is auditable: every tool call, model decision, and file edit is recorded in a cryptographic hash chain, so the run can be replayed byte-for-byte after the fact.
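
To make the hash-chain idea concrete, here is a minimal sketch of a tamper-evident event log. It is an illustration of the general technique only: the entry layout, the event fields, and the function names (entry_hash, append, verify) are assumptions made for this example, not the runner's actual run-record format.

```python
import hashlib
import json

GENESIS = "0" * 64  # stand-in predecessor hash for the first entry (assumption)


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash an event together with the hash of the entry before it."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list, event: dict) -> None:
    """Append an event, linking it to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev": prev, "hash": entry_hash(prev, event)})


def verify(chain: list) -> bool:
    """Recompute every link. An edited or reordered entry, or one removed
    from the middle, fails. Truncating the tail is not detected here."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True


# Hypothetical events, loosely modeled on "tool call, model decision, file edit".
record = []
append(record, {"kind": "tool_call", "tool": "scan", "files": 12})
append(record, {"kind": "model_decision", "package": "pkg:npm/lodash@4.17.20"})
append(record, {"kind": "file_edit", "path": "package.json"})
intact = verify(record)

# Changing an earlier event after the fact breaks verification.
record[1]["event"]["package"] = "pkg:npm/lodash@4.17.21"
tampered_ok = verify(record)
```

Because each hash covers the previous one, an auditor only needs the final hash from a trusted source to detect changes anywhere earlier in the record.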

How does it differ from running the same workflows in CI?

Three differences worth calling out: it edits your actual working tree, it uses your local credentials, and it talks to models faster when the desktop app is running.

Editing your actual working tree is the point. A CI-based remediation opens a branch and a PR; a Local Runner remediation opens a branch on your local copy and leaves you at the final commit, ready to review and push. For developers who want to understand a fix before it lands, this is the natural flow. For CI use cases, combining --no-commit with --output diff produces the diff without editing the tree, which is close to what a CI step would do. The default, however, is to edit.

Local credentials mean the runner uses your git credentials, your cloud credentials, your registry credentials, and your package manager credentials. This matters for private registries, private GitHub orgs, and environments where the CI service account cannot reach the same resources a developer can. The runner respects git credential helpers, npmrc, pip.conf, kubeconfig, and AWS/GCP/Azure CLI auth on its own — there is no secondary credential store.

Model latency is lower when the desktop application is running because the runner proxies through the desktop's local model cache. On a typical MacBook, a triage-diff run that takes 8 seconds without the desktop takes 2 seconds with it. If the desktop is not running, the runner talks directly to the backend and the latency is network-bound.

# A real session, roughly
$ safeguard run triage-diff --target .
# scanning 12 changed files...
# classifying 3 new dependencies (eagle@3.0)...
#   pkg:npm/left-pad@2.0.0          benign
#   pkg:npm/lodash@4.17.20          vulnerable.prototype_pollution (reachable)
#   pkg:npm/async@3.2.3             benign
# 1 finding. Run `safeguard run remediate --finding fnd_01J9...` to fix.

How does capability scoping work?

Answer first: workflows declare the capabilities they need, the runner prompts the user to grant them, and the grants are scoped per-invocation by default.

The capability model has six buckets: fs.read, fs.write, net, git.push, registry.publish, and exec. A workflow that only scans needs fs.read. A remediation workflow needs fs.read, fs.write, and usually exec to run the test suite. Publishing an attested artifact needs registry.publish. The runner refuses to execute a step that requires a capability the invocation does not have.
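
The refusal rule can be sketched in a few lines. This is a simplified model under stated assumptions: the six capability names come from the text above, while the Step type, the CapabilityError exception, and the check_step and run functions are hypothetical names invented for this example and say nothing about how the runner is implemented.

```python
from dataclasses import dataclass

# The six capability buckets named in the text.
CAPABILITIES = frozenset(
    {"fs.read", "fs.write", "net", "git.push", "registry.publish", "exec"}
)


class CapabilityError(Exception):
    """Raised when a step needs a capability the invocation lacks."""


@dataclass(frozen=True)
class Step:  # hypothetical step shape for illustration
    name: str
    requires: frozenset


def check_step(step: Step, granted: frozenset) -> None:
    """Reject unknown capability names, then reject missing grants."""
    unknown = step.requires - CAPABILITIES
    if unknown:
        raise CapabilityError(f"{step.name}: unknown capabilities {sorted(unknown)}")
    missing = step.requires - granted
    if missing:
        raise CapabilityError(f"{step.name}: missing grants {sorted(missing)}")


def run(steps, granted) -> list:
    """Check each step just before it would run; return the steps that passed."""
    done = []
    for step in steps:
        check_step(step, frozenset(granted))
        done.append(step.name)  # a real runner would execute the step here
    return done


scan = Step("scan", frozenset({"fs.read"}))
fix = Step("apply-fix", frozenset({"fs.read", "fs.write", "exec"}))

scan_only = run([scan], {"fs.read"})  # a scan-only workflow needs just fs.read

try:
    run([scan, fix], {"fs.read"})
    refused = ""
except CapabilityError as err:
    refused = str(err)  # "apply-fix: missing grants ['exec', 'fs.write']"
```

Checking per step rather than once up front means a workflow can get through its read-only steps before it is stopped at the first step that exceeds its grants.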

Grants can be per-invocation (the default, which prompts every time), per-workflow (remembered for that workflow until revoked), or per-workspace (set at the workspace level in the web app and propagated down to any runner authenticated to the workspace). Destructive capabilities, namely git.push and registry.publish, always stay per-invocation, no matter what the per-workflow or per-workspace setting is. This is deliberate and not configurable; the prompt proved to be the right friction during private preview.
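
The scope rules reduce to one override, sketched below. The scope and capability names are taken from the text; the functions effective_scope and needs_fresh_grant, and the idea of a set of remembered grants, are assumptions made for this example.

```python
# Capabilities the text calls destructive.
DESTRUCTIVE = frozenset({"git.push", "registry.publish"})
SCOPES = ("per-invocation", "per-workflow", "per-workspace")


def effective_scope(capability: str, configured: str) -> str:
    """Return the scope that applies after the destructive override."""
    if configured not in SCOPES:
        raise ValueError(f"unknown scope: {configured}")
    if capability in DESTRUCTIVE:
        return "per-invocation"  # not configurable, per the text
    return configured


def needs_fresh_grant(capability: str, configured: str, remembered: set) -> bool:
    """True when this invocation must be granted the capability again."""
    if effective_scope(capability, configured) == "per-invocation":
        return True
    return capability not in remembered


remembered = {"fs.read", "git.push"}  # hypothetical previously granted set
read_again = needs_fresh_grant("fs.read", "per-workflow", remembered)   # False
push_again = needs_fresh_grant("git.push", "per-workspace", remembered)  # True
```

In this model a remembered git.push grant is simply ignored, which matches the rule that the broader settings never apply to destructive capabilities.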

For CI use, grants can be pre-declared in the invocation environment, which is how you avoid interactive prompts in a pipeline:

# .github/workflows/security.yml
- name: Remediate reachable CVEs
  run: safeguard run remediate-reachable --target .
  env:
    SG_GRANTS: "fs.read,fs.write,exec,git.push"
    SG_WORKSPACE: "acme-prod"
    SG_TOKEN: ${{ secrets.SAFEGUARD_TOKEN }}
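
For readers scripting around this, a comma-separated grant list like the SG_GRANTS value above could be parsed and validated as follows. This is a sketch only: the variable name and capability names come from the text, but parse_grants and grants_from_env are hypothetical helpers, and how the runner itself treats whitespace, duplicates, or unknown names is not specified here.

```python
import os

# The six capability buckets named in the text.
CAPABILITIES = frozenset(
    {"fs.read", "fs.write", "net", "git.push", "registry.publish", "exec"}
)


def parse_grants(raw: str) -> frozenset:
    """Split a comma-separated grant list and reject unknown names."""
    names = [part.strip() for part in raw.split(",") if part.strip()]
    unknown = sorted(set(names) - CAPABILITIES)
    if unknown:
        raise ValueError(f"unknown capabilities in SG_GRANTS: {unknown}")
    return frozenset(names)


def grants_from_env(env=os.environ) -> frozenset:
    """Read pre-declared grants; an unset variable means no grants."""
    return parse_grants(env.get("SG_GRANTS", ""))


# Same value as the pipeline snippet above, passed as an explicit mapping.
grants = grants_from_env({"SG_GRANTS": "fs.read,fs.write,exec,git.push"})
none_declared = grants_from_env({})
```

Failing loudly on an unknown name is a deliberate choice in this sketch: a typo such as fs.wrte would otherwise silently drop a grant and surface later as a refused step.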

The runner records grants in the run record. If you audit who had what capability at runtime, the answer is in the record with a timestamp and the invoking identity.

What kinds of workflows ship with it?

The library has 50+ templates and the runner can execute all of them. The most-used ones cluster into five groups.

Triage workflows — triage-diff, triage-repo, triage-image, triage-branch — scan and classify a target and produce a findings summary. Read-only, good for first-time use.

Remediation workflows — remediate-reachable, remediate-high-severity, remediate-deprecated, remediate-base-image — open or land remediation changes against a target. Each has a policy knob (allow major bumps, allow base-image changes, and so on) and each produces a Lino evidence item on completion.

Attestation workflows — attest-release, attest-image, attest-sbom — produce a signed attestation bundle for an artifact. Used by release pipelines to produce the bundle that downstream consumers verify.

Gate workflows — gate-release, gate-merge, gate-dependency-add — evaluate a target against a policy and return pass/fail with reasons. Designed to be called from CI.

Compliance workflows — evidence-q1, evidence-for-control, evidence-for-framework — query Lino and produce evidence packages. Mostly used from scripts, sometimes from assistants via the MCP Server.

You can write your own workflows in YAML in the repo, under .safeguard/workflows/. The schema is versioned, the editor in the web app provides validation, and the Local Runner validates a custom workflow before executing it. The workflow library, the IDE extensions, the MCP Server, and the desktop all share the same workflow format, so a workflow authored anywhere runs anywhere.

How Safeguard.sh Helps

The Local Runner is the single binary that makes Safeguard usable from a terminal, and it is the most portable way to integrate the platform into existing developer and CI workflows. It runs the same engine as the desktop application and speaks the same workflow format as the web app at app.safeguard.sh, the MCP Server, and the IDE extensions for VS Code, Cursor, and JetBrains — no separate authoring. Every run produces Lino 2.0 evidence, talks to Eagle 3.0 and Griffin 3.0 against the current workspace config, and resolves against the Gold Registry when remediation pins are needed. Capability scoping makes the runner safe to use in sensitive environments, and FedRAMP HIGH, IL7, and SOC 2 Type II coverage applies to the runner in the Gov region when authenticated against a Gov workspace.
