Explainer · AI Builder

I Built a High School Where 48 AI Agents Live, Fight, and Break Down — No Writers Allowed

No scripts. No writers. 48 autonomous AI agents navigate senior year at Westbrook High — and the emergent behavior is more honest than anything a human writer would draft.

The video version · same thesis, looser edits

The premise is simple. The results aren’t.

Give 48 AI agents distinct personalities, secrets, and stress thresholds. Drop them into a high school. Inject a pressure event — a 98% college acceptance rate banner, a cellphone ban, a mandatory tracking app. Then step back and watch what happens.

No scripts. No dialogue trees. No narrative arc imposed from above. The agents observe their environment, reflect on their own identity, decide what to do, and act. Sometimes they post to a social feed. Sometimes they send private DMs. Sometimes they physically isolate themselves in the library and pull their hood low.

This is Westbrook High — a persistent social simulation built on the architectural principles of Google DeepMind’s Concordia framework. Eight episodes in, the agents have produced emergent psychology that a professional writer wouldn’t dare script: an overachiever feeling relief at bombing an MCAT practice test, a D1 basketball recruit secretly seeking tutoring while his coach blames the phone ban for ruining recruitment, and a class president manufacturing “championship energy” posts while his private DMs reveal cascading anxiety.

The synthetic data generated by these runs isn’t entertainment. It’s a technical proof-of-concept for stakeholder analysis, market simulation, and social risk modeling.

What Concordia actually gives you

Google DeepMind published Concordia as an open-source library for building Generative Agent-Based Models (GABMs). The architecture is inspired by tabletop role-playing games, and it provides three critical abstractions that make multi-agent simulation viable:

The Game Master

In Concordia’s model, the Game Master (GM) is the physics engine of your simulation. It doesn’t write the story — it adjudicates it. When an agent says “I walk to the principal’s office and demand answers,” the GM evaluates the world state, checks physical plausibility, and resolves the action into a narrative outcome.

This separation is what makes autonomous simulation possible. Agents propose actions in natural language. The GM determines what actually happens. No agent can unilaterally alter reality — every action passes through an impartial resolver.

In the Westbrook simulation, the GM operates on gemini-3.1-pro-preview with medium thinking depth. It narrates scenes, resolves physical actions (a student storming out of class, a coach confronting the principal), and feeds ambient context back to every agent each round. The agents never see raw simulation state — they see the world through the GM’s narration, exactly like players at a tabletop game.
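Concordia’s real interfaces are richer than this, but the adjudication pattern reduces to something like the sketch below. Everything in it (the class names, `resolve`, the prompt wording, the `llm.generate` call) is illustrative, not Concordia’s actual API:

```python
# Illustrative sketch of the Game Master pattern; not Concordia's real API.
from dataclasses import dataclass, field

@dataclass
class WorldState:
    facts: list[str] = field(default_factory=list)  # ground truth accumulated so far

class GameMaster:
    def __init__(self, llm, world: WorldState):
        self.llm = llm      # any text client with a generate(prompt) method (assumed)
        self.world = world

    def resolve(self, agent_name: str, proposed_action: str) -> str:
        """Adjudicate a proposed action into what actually happens."""
        prompt = (
            "You are the impartial physics engine of a school simulation.\n"
            "World state so far:\n" + "\n".join(self.world.facts) + "\n\n"
            f"{agent_name} attempts: {proposed_action}\n"
            "Decide what plausibly happens and narrate the outcome in third person."
        )
        outcome = self.llm.generate(prompt)
        self.world.facts.append(outcome)  # only the GM ever writes to reality
        return outcome
```

The key property is in the last two lines: agents receive the returned narration, never write access to the `world` object. That is the “no agent can unilaterally alter reality” guarantee expressed in code.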

Entity-Component Architecture

Concordia uses a modular entity-component design. Each agent is an entity. Behavior is assembled from swappable components — memory systems, reasoning chains, sensory modules, decision-making logic.

This means you can add a reflection chain to one agent without touching another. You can give a teacher agent different cognitive components than a student. You can scale from 5 agents to 50 without rewriting core simulation code.
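A minimal sketch of the idea, with hypothetical class names rather than Concordia’s own (the teacher’s name is invented): behavior lives in components, and an entity is just a named bundle of them.

```python
# Hypothetical entity-component sketch; Concordia's actual classes differ.

class Component:
    def context(self, observation: str) -> str:
        return ""  # each component contributes a slice of the agent's prompt

class Memory(Component):
    def __init__(self):
        self.entries: list[str] = []
    def context(self, observation: str) -> str:
        self.entries.append(observation)
        return "Recent memories:\n" + "\n".join(self.entries[-5:])

class Reflection(Component):
    def context(self, observation: str) -> str:
        return "Before acting: what's happening, who am I, what would I do here?"

class Entity:
    def __init__(self, name: str, components: list[Component]):
        self.name, self.components = name, components
    def act(self, llm, observation: str) -> str:
        ctx = "\n".join(c.context(observation) for c in self.components)
        return llm.generate(f"{ctx}\n\nObservation: {observation}\n{self.name} decides to:")

# Swap components per agent without touching core simulation code:
tyler = Entity("Tyler Jackson", [Memory(), Reflection()])
ms_rivera = Entity("Ms. Rivera", [Memory()])  # a teacher without the reflection chain
```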

The Simulation Loop

Every round follows the same pattern (a condensed code sketch follows the list):

  1. GM narrates the scene — ambient state, time of day, environmental context
  2. Timed events fire — if the config schedules a PA announcement, a cafeteria confrontation, or a town hall for this round, the GM injects it
  3. Each agent observes — reads the social feed, DMs, scene narration
  4. Each agent reflects (if enabled) — runs a structured inner monologue
  5. Each agent decides — the LLM chooses an action: post, comment, DM, physical action, or nothing
  6. GM resolves — physical actions get adjudicated, outcomes written into the world state
  7. State persists — memories, stress levels, relationships carry forward
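In code, one round of that loop might reduce to the sketch below. Every method name is a placeholder for whatever the real engine does at that step:

```python
# One simulation round, mirroring the seven steps above. All names illustrative.
def run_round(gm, agents, events, feed, round_no):
    scene = gm.narrate_scene(round_no)              # 1. GM sets ambient state
    for event in events.due(round_no):              # 2. timed events fire
        gm.inject(event)
    for agent in agents:
        obs = agent.observe(scene, feed)            # 3. feed, DMs, scene narration
        if agent.reflection_enabled:
            agent.reflect(obs)                      # 4. structured inner monologue
        action = agent.decide(obs)                  # 5. post, comment, DM, act, or pass
        if action.is_physical:
            gm.resolve(agent.name, action.text)     # 6. GM adjudicates physical acts
        else:
            feed.apply(agent.name, action)
    for agent in agents:
        agent.persist()                             # 7. memories, stress, relationships
```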

The multi-agent tiered model

Running 48 agents through full LLM reasoning every round would burn through API budgets in minutes and produce a simulation that costs more per episode than a cable TV show.

The solution is the tiered model: a small core of main characters runs the full LLM reasoning stack every round, while the rest of the 48 run as lightweight background agents. These background agents never call an LLM. They’re statistical noise generators with tunable probabilities for posting, commenting, and liking. Their comments come from template pools that match the simulation’s tone:

“Same tbh” “This school I swear 💀” “Free our phones ✊”

They’re cheap, they’re fast, and they’re essential. Without them, the social feed feels like a private group chat between five people. With them, it feels like an actual school — messy, noisy, and unpredictable.
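Because they skip the LLM entirely, a background agent can be nearly trivial. This sketch is an assumption about the mechanism, not the project’s code; the probabilities and method names are invented, and the template lines are the ones quoted above:

```python
import random

# Invented sketch of a background agent: no LLM calls, just tunable noise.
COMMENT_POOL = ["Same tbh", "This school I swear 💀", "Free our phones ✊"]

class BackgroundAgent:
    def __init__(self, name: str, p_like: float = 0.4, p_comment: float = 0.2):
        self.name, self.p_like, self.p_comment = name, p_like, p_comment

    def act(self, feed):
        posts = feed.recent_posts()
        if not posts:
            return
        if random.random() < self.p_like:
            feed.like(self.name, random.choice(posts))
        if random.random() < self.p_comment:
            feed.comment(self.name, random.choice(posts),
                         random.choice(COMMENT_POOL))
        # Most rounds, most students do nothing — which is also realistic.
```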

Soul files: giving agents identity

Every agent in the simulation is defined by a soul file — a Markdown document with YAML frontmatter. The frontmatter carries structured state: mood, energy, stress, relationships, actions, memory. The Markdown body is raw personality prose that gets injected as the LLM’s system prompt.
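The exact schema isn’t published; assembled from the fields named above, a soul file plausibly looks something like this (the numeric values and relationship note are invented for illustration):

```markdown
---
name: Tyler Jackson
mood: confident          # the public mask
energy: 0.8
stress: 0.7              # higher than the mask suggests
relationships:
  sam: approachable, might ask him for help
actions: []
memory: []
---
Tyler is a D1 basketball recruit who is failing calculus and could lose
his scholarship eligibility. He hides this behind locker-room confidence.
When academics come up, he deflects: "yeah it's fine," "I'll figure it out."
```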

Here’s what makes soul files different from simple system prompts: they carry secrets.

Tyler Jackson’s soul file tells the LLM that he’s failing calculus and might lose his D1 scholarship eligibility. It tells the LLM that he hides this behind locker-room confidence. It tells the LLM that when academics come up, he deflects with “yeah it’s fine” and “I’ll figure it out.”

The simulation doesn’t know Tyler is going to crack. The soul file creates the conditions for cracking — and then the LLM, reacting to accumulating pressure events, decides when and how the mask slips.

Priya Patel’s soul file plants a different bomb: she’s the robotics president headed to med school because her physician parents said so. But she bombed a practice MCAT — and felt relieved. That relief is the secret. Every time the simulation throws academic pressure at Priya, the LLM has to resolve the tension between the public persona (disciplined STEM overachiever) and the private truth (she doesn’t want the path she’s on).

No human writer chooses the moment these secrets surface. The simulation does.

The Concordia reflection chain

Episode 3 introduced visible thought telemetry — the three-question reflection chain that fires before each agent acts:

  1. SITUATION — “What’s happening around me?”
  2. IDENTITY — “What kind of person am I?”
  3. INTENT — “What would I do here?”

This is the Concordia-inspired cognitive loop that turns reactive language generation into something that resembles deliberation. Without it, agents respond to stimuli like chatbots — fast, surface-level, contextless. With it, each action passes through a filter of self-awareness.
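Mechanically, a sketch of the chain is three sequential LLM calls, each conditioned on the answers before it. The prompt wording here is invented; only the three question labels come from the simulation:

```python
# Sketch of the three-question reflection chain; prompt text is illustrative.
def reflect(llm, soul_prompt: str, observation: str) -> dict:
    situation = llm.generate(
        f"{soul_prompt}\nYou observe: {observation}\n"
        "SITUATION: What's happening around me?")
    identity = llm.generate(
        f"{soul_prompt}\nSITUATION: {situation}\n"
        "IDENTITY: What kind of person am I?")
    intent = llm.generate(
        f"{soul_prompt}\nSITUATION: {situation}\nIDENTITY: {identity}\n"
        "INTENT: What would I do here?")
    # Logged as visible thought telemetry before the action itself fires.
    return {"situation": situation, "identity": identity, "intent": intent}
```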

When Tyler sees D1 scouts arriving at the gym, his reflection chain produces a situation assessment (“scouts are here, this is my shot”), an identity check (“I’m the player everyone’s counting on”), and an intent that reveals the gap (“smile, perform, hide the 47 I got on the calc midterm”). The public post says “locked in 🔒”. The thought telemetry says something far more fragile.

This is where synthetic data gets genuinely useful. The gap between public action and private reasoning is the entire point of stakeholder analysis. What people say and what they think are often different — and that divergence is where risk lives.

What eight episodes revealed

Emergent psychology is more honest than scripted drama

A human writer would have Tyler dramatically fail a test in front of his teammates. The simulation did something subtler — it had Tyler quietly seek tutoring from Sam, the kid who fixes things with his hands, because Tyler’s soul file says he avoids vulnerability and Sam’s says he’s approachable. No writer planned that interaction. The relationship graph connected them.

Policy shocks produce realistic cascading reactions

Episode 4 introduced a school-wide cellphone ban. The simulation produced exactly what a real school would produce: athletes panicking about missing D1 scout DMs, parents flooding the school board, students finding workarounds within two rounds (bathroom breaks, smartwatches, paper notes). One student organized a protest livestream from outside the building during lunch — an action the GM had to adjudicate against the school’s physical boundaries.

Hallucinations become data

Episode 5 was an accident. The agents misinterpreted the cellphone ban prompt and thought the principal had banned backpacks. What followed was emergent chaos: a straight-A student calculating the changing center of gravity for carrying 15kg of books without a bag, athletes panicking about lost gear, teachers calling the policy “safety theater.”

A traditional simulation would discard this as a failure. For synthetic data generation, it’s a feature. The agents’ pattern of reaction to an absurd policy was structurally identical to their reaction to a reasonable one — which tells you something meaningful about how populations process institutional mandates regardless of their content.

The “NPC glitch zone”

By Episode 6, agents who’d had their phones removed for a simulated week began exhibiting social withdrawal. Tyler developed physical tics (heel tapping, pacing). Jordan retreated into headphones. Students stopped making eye contact in hallways. One character described the school feed as “a creepy, repetitive NPC neighborhood.”

Nobody programmed withdrawal behavior. The agents’ stress states escalated across rounds, their memory banks accumulated negative experiences, and the LLM — given a high-stress, low-energy, socially isolated character state — generated behavior that looks like withdrawal because it is the language pattern of withdrawal.

Why synthetic data matters

The Westbrook simulation isn’t a show. The episodes are a rendering of synthetic data — cherry-picked runs that demonstrate the engine’s capabilities. The underlying value is the data itself.

Multi-agent simulation with Concordia-style architecture generates synthetic interaction data that captures dynamics a spreadsheet never will:

  • Information cascading — how rumors propagate through tiered social networks
  • Policy stress-testing — how populations react to institutional mandates before you deploy them
  • Stakeholder divergence — the gap between public positions and private reasoning under pressure
  • Emergent coalitions — which agents self-organize into alliances, and around what grievances

A school is a controlled microcosm. The same architecture applies to simulating market reactions, organizational restructuring, community response to regulatory changes, or crisis communication — any scenario where multiple agents with competing interests interact over time.

The production pipeline

Raw simulation output is JSON — round-by-round action logs, agent states, GM narrations, social feed posts. Turning that into watchable episodes requires a separate pipeline.

The separation matters. The simulation engine produces data. The production pipeline renders data. They’re independent systems — you can run the simulation without ever making a video, and you can re-render the same data with different visual treatments.
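The project’s log schema isn’t shown; assembled from the fields the article names (round number, GM narration, reflections, actions, agent states), a single round record might look roughly like this, with invented values:

```json
{
  "round": 12,
  "gm_narration": "Third period. The gym doors swing open and two D1 scouts walk in.",
  "actions": [
    {
      "agent": "tyler_jackson",
      "reflection": {
        "situation": "scouts are here, this is my shot",
        "identity": "I'm the player everyone's counting on",
        "intent": "smile, perform, hide the 47 I got on the calc midterm"
      },
      "action": { "type": "post", "text": "locked in 🔒" }
    }
  ],
  "agent_states": {
    "tyler_jackson": { "stress": 0.82, "energy": 0.55 }
  }
}
```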

Watch the series

The full Westbrook High series is available as a YouTube playlist. Each episode is a snapshot of a different simulation run — same characters, evolving memories, different pressure events.

Full playlist: Simulated Worlds: Westbrook High

Eight episodes and counting — from the initial 98% pressure test through cellphone bans, tracking app rebellions, accidental hallucination runs, and organic burnout with zero event injection. Every word, thought, and action is generated autonomously. The humans just point the camera.
