Understand this update in under a minute
- In one sentence: Google DeepMind updated its Frontier Safety Framework on April 17, 2026, adding harmful-manipulation thresholds and Tracked Capability Levels to spot emerging risks earlier and to formalize safety-case reviews before launches.
- What happened: DeepMind’s post is dated September 22, 2025 and carries an update timestamp of April 17, 2026.
- Why it matters: DeepMind describes this as the third iteration of its Frontier Safety Framework, expanding its risk domains and refining its risk assessment processes.
- What to do next: DeepMind ties the update to safety-case reviews before relevant external launches and to more detail on end-to-end risk management.
Why it matters
As frontier models get embedded into products, “trust us” safety claims aren’t enough. Public frameworks like the FSF shape what customers, regulators, and partners expect to see: explicit thresholds, documented mitigations, and clear go/no-go governance.
What changed
The most important change is operational: DeepMind adds Tracked Capability Levels to watch for meaningful risks below the “critical” threshold, and it extends its taxonomy to include harmful manipulation as a first-class risk domain alongside areas like cyber and CBRN misuse.
Practical read
The accompanying FSF 3.1 document describes how the company plans to use early-warning evaluations, alert thresholds, and safety-case reviews as part of a broader risk acceptance process. Even if you don’t adopt the exact same structure, it’s a useful blueprint for internal gating and disclosure.
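To make the "internal gating" idea concrete, here is a minimal sketch of what a threshold-based go/no-go check could look like. This is purely illustrative: the domain names, scores, and threshold values are hypothetical, and the code is not DeepMind's actual FSF process, only a simple instance of the pattern of mapping evaluation results against tracked and critical levels.

```python
# Illustrative sketch of threshold-based launch gating.
# All names and numbers are hypothetical, not DeepMind's implementation.
from dataclasses import dataclass


@dataclass
class CapabilityThresholds:
    tracked: float   # early-warning ("alert") level
    critical: float  # level that triggers a full safety-case review


def gate(domain: str, score: float, t: CapabilityThresholds) -> str:
    """Map an evaluation score for one risk domain to a gating decision."""
    if score >= t.critical:
        return f"{domain}: BLOCK launch pending safety-case review"
    if score >= t.tracked:
        return f"{domain}: ALERT - apply mitigations and re-evaluate"
    return f"{domain}: PASS - continue routine monitoring"


# Hypothetical per-domain thresholds and evaluation scores.
thresholds = {
    "cyber": CapabilityThresholds(tracked=0.4, critical=0.8),
    "harmful_manipulation": CapabilityThresholds(tracked=0.3, critical=0.7),
}
scores = {"cyber": 0.55, "harmful_manipulation": 0.2}

for domain, score in scores.items():
    print(gate(domain, score, thresholds[domain]))
```

The design point is the tiered response: a tracked level triggers mitigations and re-evaluation well before the critical level that blocks a launch outright, which mirrors the "watch for meaningful risks below the critical threshold" idea in the post.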
What to watch
For organizations buying or deploying AI systems, this is a reminder to ask for artifacts, not assurances: what evaluations were run, what mitigations were required, and what monitoring exists post-deployment.