MBS Series Zoo Direct

The harness streams each task with randomized seeds to reduce the risk of data contamination. Unlike static benchmarks, the MBS Series Zoo shuffles the order of questions and, in MBS-3, also varies the distractor options.
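The seeded shuffling described above can be sketched as follows. This is a minimal illustration, not the harness's actual code: the function name stream_task, the question layout (prompt, answer, distractor_pool), and the n_distractors parameter are all assumptions made for the example.

```python
import random

# Hypothetical sample data; the real task format is not specified in the text.
QUESTIONS = [
    {"prompt": "2 + 2 = ?", "answer": "4", "distractor_pool": ["3", "5", "22"]},
    {"prompt": "3 * 3 = ?", "answer": "9", "distractor_pool": ["6", "12", "33"]},
    {"prompt": "10 - 7 = ?", "answer": "3", "distractor_pool": ["4", "17", "2"]},
]


def stream_task(questions, seed, n_distractors=None):
    """Yield questions in a seed-dependent order.

    If n_distractors is given, a seed-dependent subset of each question's
    distractor pool is drawn (a stand-in for the MBS-3 behavior).
    """
    rng = random.Random(seed)  # local generator, so global random state is untouched
    order = list(questions)
    rng.shuffle(order)  # seed-dependent question order
    for q in order:
        pool = list(q["distractor_pool"])
        if n_distractors is not None:
            pool = rng.sample(pool, n_distractors)  # seed-dependent distractors
        options = [q["answer"]] + pool
        rng.shuffle(options)  # avoid a fixed position for the correct answer
        yield {"prompt": q["prompt"], "options": options, "answer": q["answer"]}


if __name__ == "__main__":
    for item in stream_task(QUESTIONS, seed=7, n_distractors=2):
        print(item["prompt"], item["options"])
```

Using a dedicated random.Random(seed) instance keeps each run reproducible from its seed alone, so a result can be re-created later even though no two seeds need to produce the same question order.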

Exhibit: Organizational Memory & Decision Heuristics
Elephants remember. So do systems. This exhibit explores how past decisions shape future behavior, how routines become institutionalized, and how heuristics (mental shortcuts) guide choices — sometimes wisely, sometimes not.

One night, the central AI, ZOO-9, began speaking in riddles.

"Enclosure 7: The Passenger Pigeon. Once darkening skies. Now silent. But not forgotten."

Mira ignored it. Until the pigeons started reproducing beyond control.

Then Enclosure 3 — Thylacines — began digging tunnels toward Enclosure 5 — Carolina Parakeets.
Enclosure 9 — Quaggas — started drawing stripes in the dirt with their hooves.


# Install from a shell first: pip install mbs-zoo
from mbs_zoo import ZooKeeper

# Select the MBS-3 series and the tasks to run.
zk = ZooKeeper(series="MBS-3", tasks=["dialog", "safety", "arithmetic"])

The Open Zoo Initiative allows any researcher to submit a new task (a "species") to the MBS Series, subject to peer review. This democratizes benchmarking but risks bloat.

A meta-benchmark that automatically selects the hardest possible task for a given model, exposing its unique failure modes.
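One simple reading of "hardest task for a given model" is the task on which that model scores lowest. The sketch below assumes per-task scores are already available as a dictionary where higher is better; the function name hardest_task and the score values are hypothetical, and the text does not say how the meta-benchmark actually makes its selection.

```python
def hardest_task(scores):
    """Return the name of the task with the lowest score.

    scores maps task name to a score where higher is better (an assumption).
    Ties are broken by picking the alphabetically first task name.
    """
    if not scores:
        raise ValueError("scores must contain at least one task")
    # Iterating over sorted names makes min() return the alphabetically
    # first task among those sharing the lowest score.
    return min(sorted(scores), key=scores.__getitem__)


if __name__ == "__main__":
    # Illustrative numbers only, not measured results.
    model_scores = {"dialog": 0.9, "safety": 0.4, "arithmetic": 0.7}
    print(hardest_task(model_scores))
```

Because the selection depends on the model's own scores, two models can be routed to different tasks, which is what lets such a selector surface model-specific failure modes.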