Inside Garry Tan's G-Brain: The Open-Source Repo YC Wants Every Company to Clone
Tom Blomfield told founders to build 'Garry's G-Brain, but for every business in the world.' We opened the repo, read the code, ran the benchmarks, and found four architectural bets that rewrite how agents should handle memory, routing, and self-maintenance.
“We need Garry’s G-Brain, but for every business in the world.”
That’s Tom Blomfield — YC General Partner, founder of Monzo and GoCardless — writing on Y Combinator’s Summer 2026 Request for Startups. Not a suggestion. A declaration of a new investable category.
What makes the statement land differently from a typical accelerator wish list: the CEO of YC already shipped the personal version. Open source. Running on his own data. Garry Tan’s repo is called G-Brain, and it opens with a sentence that frames everything that follows:
Your AI agent is smart but forgetful. G-Brain gives it a brain.
The numbers Tan publishes alongside it: 17,888 pages. 4,383 people. 723 companies. 21 cron jobs running on schedule. Built in 12 days. Not a demo. Production data — his own.
When the CEO ships open-source code and a General Partner names that exact code as the prototype for a category they want to fund, that isn’t coincidence. That’s a thesis.
What G-Brain actually is
In plain terms, three pieces — each a deliberate architectural choice:
- Markdown files on disk are the source of truth. A human can always read and edit them. Git tracks every change. The brain is not locked behind an API.
- A retrieval layer sits on top: Postgres with pgvector, or PGlite, an embedded Postgres build, for the zero-server setup. Hybrid search: keyword + vector + reciprocal rank fusion.
- 29 markdown skill files tell the agent how to use it, plus a process Tan calls the dream cycle that runs while he sleeps. In his words: “I wake up and the brain is smarter than when I went to sleep.”
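The fusion step is the standard reciprocal rank fusion trick: take the keyword ranking and the vector ranking, score each page by the sum of 1/(k + rank) across the lists it appears in, and sort. A minimal sketch (the function name and the example page IDs are illustrative, not from the repo):

```python
def rrf_merge(keyword_hits, vector_hits, k=60):
    """Merge two ranked lists of page IDs with reciprocal rank fusion.

    Each list is ordered best-first; a page's fused score is the sum of
    1 / (k + rank) across every list it appears in (ranks start at 1).
    k=60 is the value from the original RRF paper, a common default.
    """
    scores = {}
    for hits in (keyword_hits, vector_hits):
        for rank, page_id in enumerate(hits, start=1):
            scores[page_id] = scores.get(page_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

merged = rrf_merge(["acme", "bob", "demo"], ["bob", "garry", "acme"])
# "bob" leads: it ranks high in both lists, which is the point of fusion.
```

Pages that both retrievers like float to the top even when neither ranked them first; pages only one retriever found still survive with a discounted score.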
The gap, stated up front
G-Brain is a personal brain. Blomfield is asking founders to build the company version.
One user with one brain — solved. Many users with shared brains — not solved. Multi-tenancy, access control, write conflict resolution, consensus across teams — none of those are addressed in the public repo. That gap is the entire investment thesis. We’ll come back to it.
Claim 1: Knowledge graphs beat vector retrieval
The kind of question vector search struggles with — Who works at Acme AI? What has Bob invested in this quarter? — is a relationship question. Cosine similarity over text embeddings doesn’t reach those answers. A typed link traversal does.
How the graph wires itself
On every page write, G-Brain extracts entity references with regex. It infers typed edges — attended, works_at, invested_in, founded, advises — from the page’s role and the surrounding text.
Zero language model calls. The graph wires itself, on the cheap, on every save.
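The mechanics are simple enough to sketch. Assuming wikilink-style `[[Entity]]` references and a handful of verb cues (both assumptions; the repo's actual patterns may differ), edge extraction is a regex pass per sentence:

```python
import re

WIKILINK = re.compile(r"\[\[([^\]|]+)\]\]")  # matches [[Acme AI]]-style refs

# Verb cue -> edge type. A tiny stand-in for whatever cue list G-Brain uses.
EDGE_CUES = {
    "works at": "works_at",
    "invested in": "invested_in",
    "founded": "founded",
    "advises": "advises",
}

def extract_edges(source_page, text):
    """Yield (source, edge_type, target) triples. Zero LLM calls."""
    edges = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        targets = WIKILINK.findall(sentence)
        lowered = sentence.lower()
        for cue, edge_type in EDGE_CUES.items():
            if cue in lowered:
                for target in targets:
                    edges.append((source_page, edge_type, target))
    return edges

extract_edges("bob", "Bob works at [[Acme AI]]. He founded [[Monzo]] years ago.")
```

The trade is obvious: regex misses paraphrases an LLM would catch, but it runs on every save for free, and the benchmark below suggests cheap typed links still beat no links by a wide margin.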
The number that matters comes from Tan’s own benchmark suite, BrainBench:
| Metric | Graph ON | Graph OFF | Delta |
|---|---|---|---|
| Precision @ 5 | 49.1% | 17.7% | +31.4 pts |
Same corpus of 240 rich-prose pages. Same embedder. Same pipeline. Only the typed-link extraction changed.
Transferable principle: The graph wires itself. Regex over language model, where it can be.
Claim 2: Skill files are code
Tan’s exact phrase: “Skill files are code. The runtime is dumb. The intelligence lives in markdown a human can read, version, and audit.”
29 skill files ship with G-Brain. A single file routes intent to the right skill — it’s literally called the resolver. There’s no orchestrator. No planner. No chain. There’s a flat directory of fat markdown files, and an agent that reads the right one when it has a task.
Why this matters
Most agent frameworks bury their instructions in code. You can’t read them. You can’t review them. You can’t change them without a deployment. Tan’s bet is the inverse:
| Most frameworks | G-Brain |
|---|---|
| Instructions buried in code | Instructions in markdown |
| Can’t read them | Read them in any editor |
| Can’t review them | Review them in a PR |
| Can’t change without a deploy | Change them with one commit |
If you can’t read your agent’s instructions, you don’t own them.
Transferable principle: Plain text outlives code.
Claim 3: Minions crush sub-agents under load
This is a benchmark, not a philosophy. Tan ran a real production task: pull a month of his social media posts from an external API, ingest them end-to-end into the brain. Realistic load — 19 cron jobs already in flight.
He compared two routes:
- Route A: Spawn a sub-agent (what every agent framework does by default)
- Route B: Queue a Minion — a deterministic background job, Postgres-native, no LLM in the hot path
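The Minion shape is worth making concrete. In Postgres this is typically a jobs table drained with `SELECT ... FOR UPDATE SKIP LOCKED`; the sketch below keeps the queue in memory so it runs standalone, but the structure is the same: a registry of named deterministic handlers and a drain loop with no LLM anywhere. Names and payloads are illustrative:

```python
import queue

HANDLERS = {}  # job name -> deterministic function; no LLM anywhere

def minion(name):
    """Register a deterministic background job under a stable name."""
    def register(fn):
        HANDLERS[name] = fn
        return fn
    return register

jobs = queue.Queue()  # G-Brain persists this in Postgres; in-memory here

@minion("ingest_posts")
def ingest_posts(payload):
    # Plain parsing: split a raw API dump into individual posts.
    return [p.strip() for p in payload["raw"].split("\n") if p.strip()]

def run_pending():
    """Drain the queue. Every job is plain code, so token cost is $0."""
    results = []
    while not jobs.empty():
        name, payload = jobs.get()
        results.append(HANDLERS[name](payload))
    return results

jobs.put(("ingest_posts", {"raw": "post one\n\npost two\n"}))
```

Queueing a job is a row insert; running it is a function call. Nothing in the hot path can time out waiting on a model.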
The numbers, side by side
| | Sub-agent | Minion |
|---|---|---|
| Wall time | >10,000 ms (gateway timeout) | 753 ms |
| Token cost | ~$0.03 per run | $0.00 |
| Success rate | 0% (couldn’t spawn under load) | 100% |
The routing rule Tan extracts: Deterministic work goes to Minions. Judgment work goes to sub-agents. Most of what we call “agent work” is the first thing pretending to be the second.
Transferable principle: Determinism is a routing decision, not a fallback.
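Tan's routing rule reduces to a one-branch dispatcher. The `needs_judgment` flag is a hypothetical field for illustration; the real decision is whoever writes the task deciding, honestly, whether the step needs open-ended reasoning:

```python
def route(task):
    """Route by 'does this step need judgment?', not by framework default.

    `needs_judgment` is a hypothetical flag set when the step requires
    open-ended reasoning (drafting, ranking, resolving ambiguity).
    """
    if task.get("needs_judgment"):
        return "sub_agent"   # LLM in the loop: pay tokens, accept latency
    return "minion"          # deterministic job: queue it, zero tokens

route({"name": "fetch_posts", "needs_judgment": False})  # -> "minion"
```

The default matters: anything not explicitly flagged as judgment work falls through to the cheap path.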
Claim 4: The dream cycle — memory as process
Eight phases run on a schedule while the user sleeps:
- Lint — catch artifacts
- Backlinks — enforce the link iron law
- Sync — pick up overnight changes
- Synthesize — distill conversation transcripts into reflections
- Extract — auto-link new pages
- Patterns — roll reflections into long-term themes (Tan calls them 25-year patterns)
- Embed — refresh stale embeddings
- Orphans — re-link orphaned pages
None of these phases is novel on its own. The interesting part is what the pipeline does to the data over time.
Overnight, conversation transcripts get distilled into reflections. Cross-session reflections aggregate into long-term patterns. Citations get audited. Orphan pages get re-linked. Stale embeddings get refreshed. The brain isn’t searched at the end of this. It’s reorganized.
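Structurally, the cycle is just an ordered list of phase functions, each taking the brain state and returning an improved one. A toy sketch with two of the eight phases; the distillation and pattern logic here is deliberately simplistic stand-in code, not what G-Brain does:

```python
def synthesize(brain):
    """Distill raw transcripts into one-line reflections (toy version:
    keep each transcript's first sentence)."""
    brain["reflections"] += [t.split(".")[0] for t in brain.pop("transcripts", [])]
    return brain

def patterns(brain):
    """Roll repeated reflections up into long-term themes (toy version:
    any reflection seen at least twice becomes a pattern)."""
    seen = {}
    for r in brain["reflections"]:
        seen[r] = seen.get(r, 0) + 1
    brain["patterns"] = [r for r, n in seen.items() if n >= 2]
    return brain

DREAM_CYCLE = [synthesize, patterns]  # the real cycle has eight phases

def run_dream_cycle(brain):
    for phase in DREAM_CYCLE:
        brain = phase(brain)
    return brain
```

The shape is what transfers: each phase is idempotent-ish maintenance over the whole corpus, and the scheduler just runs the list every night.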
A static knowledge base ages — pages drift, links rot, old context goes stale. A self-maintaining one compounds. Most knowledge tools degrade with use. This one appreciates.
Transferable principle: Memory should be a process, not a snapshot.
The deeper lesson: Two architecture mistakes
Step back. Most failures we blame on AI agents aren’t intelligence failures. They’re two distinct architecture mistakes — and naming them is the load-bearing move.
Mistake 1: Amnesia
Agents have no persistent world model. Every session starts cold. Tokens get spent every conversation re-establishing context that should have been written down once and indexed forever.
- Vector retrieval patches this badly — fuzzy recall, no relationships
- Knowledge graphs patch it well — typed links, backlink ranking, persistent world model
Mistake 2: Mis-routing
Deterministic work gets sent through a reasoning model because the framework doesn’t distinguish between work that needs judgment and work that just needs running.
- Sub-agents stall under load — tokens spent on a decision that wasn’t a decision
- Minions land in milliseconds — zero LLM in the hot path
The architect’s question isn’t can my agent do this? It’s should it? Memory is one layer. Routing is another. Tan’s bet: separate world knowledge from operational logic. Markdown is durable. Chains aren’t.
What changes if Blomfield is right
If the company brain becomes the default substrate for AI automation, three things shift:
1. Vector databases compress as a category. Knowledge graphs replace them as the default substrate. Vector retrieval dominated 2023–2024 because it was the only thing that worked. Once typed link extraction at scale is a solved problem, the comparison ends.
2. Orchestration frameworks become legacy infrastructure. The same way ORMs displaced query builders. The framework that makes the agent dumber wins. Chains of LLM calls — LangChain, multi-agent orchestrators — become the old layer. Thin runtime + fat skills become the new one.
3. Token bills drop 50–80%. Sub-agent spawning gets a Minions filter in front of it; the savings come from workloads where most steps were queued shell work pretending to be reasoning.
Where the thesis might break
G-Brain is personal. One user. One brain. Multi-tenancy, access control, write conflict resolution, consensus across teams — Tan hasn’t solved any of those publicly. That’s exactly the gap Blomfield is paying founders to close.
YC isn’t betting on a product. They’re betting on a stack. “Company Brain” is one of four pieces named in their Summer Request for Startups — alongside AI OS for Companies (Diana Hu), Software for Agents (Aaron Epstein), and AI-Native Services (Gustaf Alströmer). Tan shipped the only piece YC built itself.
The Architect’s Lens takeaway
Three points to write down:
- Memory is the leverage layer. If your stack doesn’t separate memory from reasoning, you’re paying tokens for state.
- Determinism is a routing decision, not a fallback. Most agent work is queued shell work in disguise. Route it accordingly.
- Markdown outlives code. Skills you can read survive longer than chains you can’t.