Company · Engineering Principles

How we ship. Twelve principles that govern engineering at Safeguard.

The three constitutions cover what we build and how we behave. This page covers how we write and ship the software underneath all of it. The principles below are how senior engineers at Safeguard hold themselves and each other accountable — in design review, in code review, on call, and in post-mortems.

The twelve principles.

Each principle is short on its own. The elaboration is for the engineer who is trying to apply it on a Tuesday afternoon.

Write the test first when you can; otherwise write it before you call it done

Test-first is the default. When the shape of the work makes test-first impractical — exploratory spikes, performance investigations — we still require a passing test before the change is considered done. "It works on my machine and I will add the test later" is not a state we ship from.

Reproducibility is a feature, not a hope

Every build, every model release, every audit log is reproducible from the recorded recipe. If a binary cannot be regenerated from source plus a recorded environment, the binary is not allowed in production. The same rule applies to model weights and to the eval results that accompany them.
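
A minimal sketch of how that rule could be checked, assuming the recorded recipe stores a SHA-256 digest of each artifact. The function names `artifact_digest` and `is_reproducible` are hypothetical illustrations, not Safeguard tooling.

```python
import hashlib


def artifact_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def is_reproducible(rebuilt: bytes, recorded_digest: str) -> bool:
    """True when a rebuild from source matches the digest recorded at release."""
    return artifact_digest(rebuilt) == recorded_digest


# Hypothetical example: the digest recorded when the artifact was first built.
recorded = artifact_digest(b"example-binary-v1")
print(is_reproducible(b"example-binary-v1", recorded))
print(is_reproducible(b"example-binary-v2", recorded))
```

The same comparison would apply to model weights or eval result files, since the check only looks at bytes.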

The on-call rotation is short, fair, and respected

Rotations are kept short enough that nobody burns out and wide enough that everybody has skin in the game. Compensation for on-call is part of the package, not a favour. If on-call is unsustainable, the system is unsustainable — and the fix is the system, not the rotation length.

Post-mortems are blameless and read by the whole team

When an incident happens, we write a post-mortem that names components, timelines, and decisions — not people. The whole team reads it. The action items have owners and dates. We track them to closure with the same discipline we track production bugs.
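
One way to picture "owners and dates" tracked to closure is a small record per action item plus an overdue query. This is a sketch under assumptions: the `ActionItem` fields and the `overdue` helper are hypothetical names, not an existing tracker schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ActionItem:
    summary: str
    owner: str                        # every item has a named owner
    due: date                         # and a date
    closed_on: Optional[date] = None  # None while the item is still open


def overdue(items: List[ActionItem], today: date) -> List[ActionItem]:
    """Return open items whose due date has already passed."""
    return [item for item in items if item.closed_on is None and item.due < today]
```

Running `overdue` in the same review where production bugs are triaged keeps the two lists under one discipline.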

We deploy frequently in small increments

Small, reversible changes go to production often. Long-lived branches are a smell — they accumulate risk that has to be resolved in one merge event. If a branch is more than a few days old and not behind a feature flag, somebody is making a tradeoff we should examine.

Customer code is sacred

We do not read it casually, log it incidentally, train models on it, or move it across tenant boundaries without explicit consent. The platform is designed so that even an authorised engineer cannot accidentally cross those lines. The few paths that do touch customer code are audited per access.

Feature flags are not a substitute for a decision

Flags are for ramp, soak, and rollback — not for postponing the question of what the right behaviour is. A flag that has been on at 100% for three months should become a default. A flag that has been at 0% for three months should be deleted. Flag debt is technical debt.
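
The three-month rule can be expressed as a small classifier. This is a sketch only: the `Flag` record, the `flag_action` helper, and the 90-day approximation of "three months" are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "three months" is approximated as 90 days.
STALE_AFTER = timedelta(days=90)


@dataclass
class Flag:
    name: str
    rollout_percent: int   # 0 to 100
    unchanged_since: date  # last date the rollout percentage changed


def flag_action(flag: Flag, today: date) -> str:
    """Classify a flag as 'make-default', 'delete', or 'keep'."""
    if today - flag.unchanged_since < STALE_AFTER:
        return "keep"
    if flag.rollout_percent == 100:
        return "make-default"
    if flag.rollout_percent == 0:
        return "delete"
    # Partial rollouts are not covered by the rule as written, so leave them.
    return "keep"
```

A periodic job that reports every flag whose action is not "keep" would make flag debt visible alongside other technical debt.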

Performance budgets are committed before the work starts

p95 latency is a feature with a target written into the design document. If the implementation comes in over budget, we go back to the design rather than ship and chase the regression afterwards. Budgets exist for memory, p95 latency, model inference cost, and binary size, depending on the surface.
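
A latency budget check could look like the sketch below. It assumes the nearest-rank definition of a percentile and a budget expressed in milliseconds; the names `p95` and `within_budget` are hypothetical.

```python
from typing import List


def p95(samples_ms: List[float]) -> float:
    """95th percentile of the samples, using the nearest-rank method."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # Integer form of ceil(0.95 * n), which avoids floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def within_budget(samples_ms: List[float], budget_ms: float) -> bool:
    """True when the measured p95 is at or under the committed budget."""
    return p95(samples_ms) <= budget_ms
```

Wiring a check like this into CI turns the number in the design document into a gate rather than a goal.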

The interface is the contract

Public APIs — REST, gRPC, the trace format, the SDK shapes, the model output schema — are versioned. We do not break them without a deprecation window, a migration guide, and an opt-in ramp. Internal APIs follow the same discipline whenever they cross a team boundary.

Code review is a senior responsibility, not a chore

Reviews from peers are mandatory. Reviews from leads are owed within a working day for non-trivial changes. A review is a teaching surface — "this works" is not a review; "this works, here is the edge case I would test, here is the simpler alternative" is.

Security review is upstream of merge, not downstream of release

Sensitive paths (anything touching auth, tenant boundaries, model serving, or customer data) pass through security review before merge. Catching a security issue in production is a failure mode, not the design. The security-review label is non-negotiable on those paths.

Documentation is part of the deliverable

A feature without docs is half-done. Docs means the internal runbook, the customer-facing changelog, and the API reference if applicable. Docs are reviewed alongside the code in the same pull request. A merged feature that has no merged docs is a regression waiting for the next on-call shift.

Anti-patterns

What we do NOT do.

These practices are not allowed under this engineering bar, even when the temptation to make an exception is high.

Silent skipping of tests. If a test is being skipped, the skip is annotated with a reason and a ticket.

One-off bypasses of policy gates that do not get logged. Every bypass is a logged event with a justification.

Untested hotfixes to production. Even at 2am during an incident, the hotfix has a test before it merges.

Force-pushing to shared branches. The branch history is the audit trail.

Copy-pasting customer code into prompts, chat, or any external surface. Customer code stays where customer code lives.

Deploying without a rollback plan. If the deployment cannot be rolled back, the deployment cannot ship.

On-call & incident response

When something breaks.

Rotation length

One-week rotation with a day-and-night handoff at a fixed time. Long enough to build context, short enough that a hard week does not become a hard month.

Severity & SLAs

Severity definitions and response SLAs are documented and linked from the public status page. A P1 has a written response window; a P3 has a written triage window. No verbal SLAs.

Post-mortem template

Blameless, structured: timeline, decisions, contributing factors, action items with owners and dates. The template is short on purpose: the effort should go into reading it, not writing it.

Customer comms during P1

Public status updates at a fixed cadence during a P1 incident — even when the cadence is "still investigating, next update at X." Silence is not allowed during a customer-impacting event.

Code review norms

How code gets reviewed.

Two-reviewer minimum

Non-trivial changes need at least two approvals. Trivial changes — typo fixes, dependency bumps with automated checks — can ship on a single approval with the appropriate label.

Written rationale on approval

An approval comes with a sentence — what was checked, what was deferred, what the reviewer is on the hook for. "LGTM" alone is a not-yet, not a yes.

Security-review label

Sensitive paths carry a security-review label. The label cannot be removed by the author; it is removed by the security reviewer once the review passes.

Owner sign-off across areas

Changes that cross an ownership boundary require the owning team's sign-off. The CODEOWNERS file is the source of truth and is reviewed every quarter.
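
Taken together, the review norms above amount to a merge gate. The sketch below is one possible reading, not a description of Safeguard's actual tooling: the `Change` and `Approval` records and the `can_merge` function are hypothetical, and the trivial-change label is simplified to a boolean. Owner sign-off is left out because it depends on the contents of the CODEOWNERS file.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Approval:
    reviewer: str
    rationale: str  # what was checked, what was deferred, what the reviewer owns


@dataclass
class Change:
    trivial: bool  # simplification of "the appropriate label" for trivial changes
    approvals: List[Approval] = field(default_factory=list)
    labels: Set[str] = field(default_factory=set)


def counts(approval: Approval) -> bool:
    """An approval counts only when it carries more than a bare 'LGTM'."""
    text = approval.rationale.strip()
    return text != "" and text.lower() != "lgtm"


def can_merge(change: Change) -> bool:
    # The label is removed only once the security review passes,
    # so its presence means the review is still pending.
    if "security-review" in change.labels:
        return False
    needed = 1 if change.trivial else 2
    reviewers = {a.reviewer for a in change.approvals if counts(a)}
    return len(reviewers) >= needed
```

Counting distinct reviewers, rather than approvals, keeps one person from satisfying the two-reviewer minimum alone.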

Onboarding

How we onboard engineers.

Five steps from day one to first on-call shift. Each step has a written exit criterion, signed off by the engineer's lead.

Week 1

Shadow on-call

New engineer shadows the on-call rotation — pages, dashboards, runbooks. Reads incidents from the last quarter. No production responsibility yet; observation only.

Week 2

Ship a tiny end-to-end change

A small, real change that goes from local clone to production behind a flag. The point is the path, not the size — every step of the release pipeline gets exercised.

Week 4

Own a small feature

First small feature in the engineer's area. Design doc, implementation, tests, docs, deployment — the whole loop, owned end to end. Reviewed by their lead.

Month 3

Own a moderate feature

A feature that crosses a system or team boundary. The engineer is now the person who answers questions about that area in design review. Mentorship still active.

Month 6

Join on-call rotation

By month six, the engineer is on the rotation as a primary. Not before — running production for paying customers is not an entry-level responsibility.

Where to go next.

Values · three constitutions

Security, AI, and Human Values. The three documents that govern how Safeguard builds, ships, and behaves.

Leadership

Who leads, how they lead, the leadership operating manual, and the decision-making rules of the road.

Mission & vision

Why we built this, the ten-year arc, success and failure criteria, and the leading indicators.

Security posture

The platform's own security posture — tenant isolation, signed weights, key management, the bug bounty programme.

If this is how you want to build.

Open engineering roles — across model serving, scanner fusion, control plane, and the surfaces in the IDE — are listed in careers. The principles above are the contract on day one.