Lino runs inline on a developer's laptop in under 100 ms. Griffin L is seventy times its size and takes seconds to answer. The student inherits most of the teacher's judgement on a narrow but useful slice of security tasks — sink detection, sanitiser scoring, inline triage — and concedes the rest. The technique that makes that trade work is distillation.
A smaller "student" model is trained to mimic the input-to-output behaviour of a larger "teacher", recovering most of the teacher's capability at a fraction of the parameter count.
In plain label distillation, the student only sees the teacher's final answer. In trace distillation — the variant Lino uses — the student also sees the teacher's intermediate reasoning, so it learns the reasoning shape and not just the verdict.
The result is a model that is dramatically smaller and faster, that gives up some of the teacher's reach — long context, deep multi-hop reasoning, novel-pattern generalisation — and keeps the parts that matter for its specific job.
Every stage is load-bearing. Skip the trace step and you get a faster classifier that confidently misjudges sanitised flows. Skip quantisation and you get a model that's too slow to live inline. Skip the realistic prompt distribution and you get a model that fails on the prompts it will actually see.
Inputs are drawn from real engineering workflows: sink detections, sanitiser-quality checks, dangerous-import flags, suspicious deserialisation patterns, and inline questions a developer would actually ask their IDE. We don't sample from synthetic prompt collections — the distribution has to look like what the inline model will face in production.
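One way to keep the training distribution honest is to sample prompt categories in proportion to observed production traffic. The category names below match the workflows listed above, but the weights and the helper itself are illustrative assumptions, not values from the source:

```python
import random

# Hypothetical category weights, shaped like observed production traffic.
# The proportions are illustrative, not measured values.
CATEGORY_WEIGHTS = {
    "sink_detection": 0.35,
    "sanitiser_quality": 0.25,
    "dangerous_import": 0.15,
    "deserialisation": 0.10,
    "inline_question": 0.15,
}

def sample_category(rng: random.Random) -> str:
    """Draw a prompt category proportional to its production frequency."""
    cats, weights = zip(*CATEGORY_WEIGHTS.items())
    return rng.choices(cats, weights=weights, k=1)[0]

# Over many draws, the training mix converges to the production mix.
rng = random.Random(0)
counts = {c: 0 for c in CATEGORY_WEIGHTS}
for _ in range(10_000):
    counts[sample_category(rng)] += 1
```

The point of weighting by production frequency rather than sampling categories uniformly is that the student's error budget gets spent where the traffic actually is.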
Griffin L produces the answer and, crucially, the structured reasoning trace that led to it: the hypothesised exploit, the cited path through the call graph, the disproof attempt, the proposed patch. The trace is the supervision signal — not just the final label — and is what turns label distillation into trace distillation.
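A teacher record might be represented as a small structured object. The field names and the example values below are a sketch inferred from the trace components the text names (hypothesis, cited path, disproof attempt, patch, verdict), not Griffin L's actual schema:

```python
from dataclasses import dataclass

@dataclass
class TeacherTrace:
    """One teacher output: the verdict plus the reasoning that led to it.
    Field names are illustrative, not the real schema."""
    hypothesis: str        # the hypothesised exploit
    cited_path: list[str]  # the cited route through the call graph
    disproof_attempt: str  # what was checked in trying to rule it out
    proposed_patch: str    # the suggested fix
    verdict: str           # the final label

trace = TeacherTrace(
    hypothesis="user input reaches os.system via build_cmd",
    cited_path=["handler.parse", "build_cmd", "os.system"],
    disproof_attempt="checked for shlex.quote on the argument: absent",
    proposed_patch="wrap the argument in shlex.quote before interpolation",
    verdict="vulnerable",
)
```

Everything except `verdict` is supervision that plain label distillation would throw away.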
The student model is optimised against two objectives in parallel: (input, final-label) gives it the verdict, (input, intermediate-trace) forces it to learn the reasoning shape. Both signals are weighted; trace distillation prevents the student from collapsing to a confident-but-shallow classifier.
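The two-objective setup can be sketched as a weighted sum of a label loss and a per-step trace loss. The cross-entropy form and the mixing weight `alpha` are assumptions for illustration; the source only says both signals are weighted, not how:

```python
import math

def cross_entropy(probs: list[float], target: int) -> float:
    """Negative log-likelihood of the target index."""
    return -math.log(probs[target])

def distillation_loss(
    label_probs: list[float],
    label_target: int,
    trace_probs_seq: list[list[float]],
    trace_targets: list[int],
    alpha: float = 0.5,  # hypothetical mixing weight
) -> float:
    """Combine the (input, final-label) and (input, intermediate-trace)
    objectives. alpha=0 is plain label distillation; alpha>0 adds trace
    supervision, which is what keeps the student from collapsing to a
    confident-but-shallow classifier."""
    label_loss = cross_entropy(label_probs, label_target)
    trace_loss = sum(
        cross_entropy(p, t) for p, t in zip(trace_probs_seq, trace_targets)
    ) / len(trace_targets)
    return alpha * trace_loss + (1.0 - alpha) * label_loss
```

In a real training loop both terms would be token-level losses over model logits; the scalar version here just shows how the weighting composes.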
Once the student passes the eval harness against Griffin, its weights are quantised to INT8 for on-device inference, packaged with the IDE extension and CLI, and pinned by SHA. The same artifact runs identically on every developer machine — no cloud round-trip, no per-call cost, no source code leaving the laptop.
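INT8 quantisation in its simplest symmetric, per-tensor form looks like the sketch below. This is the general technique, not Lino's actual scheme, which the source does not specify:

```python
def quantise_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric per-tensor INT8 quantisation: map floats onto [-128, 127]
    with a single scale factor. Assumes at least one non-zero weight."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantise(q: list[int], scale: float) -> list[float]:
    """Recover approximate float weights for inference-time maths."""
    return [x * scale for x in q]
```

Each weight shrinks from 4 bytes to 1, which is what makes the on-device, no-round-trip deployment practical; the cost is a small, bounded rounding error per weight.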
Plain label distillation hands the student a stack of (input, answer) pairs and tells it to fit them. The student learns to predict the verdict — and forgets the shape of the reasoning that produced it. That is fine for image classification. It is not fine for security, because security verdicts depend on conditions the model has to actively check: is this sink reachable? Is this sanitiser sufficient? Does this dangerous import actually get called?
Trace distillation forces the student to walk through the same intermediate steps the teacher walked through. The supervision signal includes the hypothesis, the cited path, the sanitiser check, the disproof attempt. The student doesn't just learn the answer — it learns the procedure that produces the answer, which is what generalises to inputs it hasn't seen.
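One common way to supervise every intermediate step is to expand each teacher trace into prefix-to-next-step training pairs. The construction below is a minimal sketch under that assumption, not the pipeline's actual pair format:

```python
def trace_to_examples(
    prompt: str, trace_steps: list[str], verdict: str
) -> list[tuple[str, str]]:
    """Expand one teacher trace into step-level supervision pairs.
    Each prefix of the reasoning becomes an input; the next step
    (and finally the verdict) becomes the target."""
    examples = []
    context = prompt
    for step in trace_steps + [verdict]:
        examples.append((context, step))
        context = context + "\n" + step
    return examples
```

A single teacher call therefore yields several training examples, and the student is graded on producing the hypothesis, the cited path, and the disproof attempt, not only the final verdict.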
That procedure is what gives Lino its accuracy at sub-100 ms. The student is small, but it's following a reasoning recipe inherited from a model seventy times its size.
Teacher (Griffin L)
prompt ─┐
▼
hypothesise
│
▼
cite path
│
▼
attempt disproof
│
▼
verdict
────────────────────────────
Plain label distillation
prompt ──────────► verdict
(student never sees the steps)
Trace distillation (Lino)
prompt ──► hypothesise
│
▼
cite path
│
▼
attempt disproof
│
▼
verdict
(student is supervised on
every intermediate step)

A 1B student does not match a 70B teacher across the board. The point of the lineup is that it doesn't have to — Eagle and Griffin pick up everything Lino concedes, and the routing layer knows when to defer.
Distillation is what makes Lino possible. The corpus is what makes Griffin worth distilling from. The evaluation harness is what proves the student didn't silently regress on the cases that matter.
Drop Lino into the IDE. Measure its latency, refusal rate, and agreement with Griffin on the cases that matter. The distillation work shows up in those three numbers.