Anthropic donates Petri alignment auditing tool to independent Meridian Labs
Anthropic says it is handing Petri, its open-source alignment auditing toolbox, to Meridian Labs and releasing Petri 3.0 with more adaptable and realistic behavior tests.
Passed source freshness, duplicate, QA, and review checks before publishing. Main source freshness limit: 14 days.
- Source count: 1
- Primary sources: 1
- QA status: pass
Plain English
What this means in simple words
Petri runs scripted, multi-step “test conversations” to see how an AI model behaves. Anthropic says an independent nonprofit will now maintain and improve it.
What happened
On May 7, 2026, Anthropic said it is transferring development of Petri, its open-source alignment testing toolbox, to Meridian Labs alongside updates branded as Petri 3.0.
Why it matters
Safety tests matter most when multiple labs and regulators can trust and reuse them. Moving a widely used audit tool to an independent home can make results feel more neutral while keeping the tooling maintained as model APIs evolve.
Key points
- Anthropic says Petri 3.0 changes the architecture to make audits more adaptable across use cases.
- It describes a “Dish” add-on meant to make tests more realistic and reduce eval-awareness artifacts.
- Anthropic says Petri is moving to Meridian Labs so the tooling is not owned by a single AI lab.
What to watch
Watch whether Meridian Labs publishes clearer baselines and versioning, so different labs can compare results over time without changes to the tool itself skewing the comparison.
Key terms
- Alignment audit: A structured evaluation that checks whether a model reliably follows safety and policy expectations across scenarios.
- Eval-awareness: When a model recognizes it is being tested and changes behavior, which can make evaluations less representative.
Sources
Source dates are original publication dates. The posted date above is when The AI Tea published this explanation.
- "Donating our open-source alignment tool" · Anthropic · Research blog · Original source May 7, 2026 · Source age 3 days · Primary