The LLM Wiki: A Compounding Knowledge Base
Most RAG systems rediscover knowledge from scratch on every query. Instead, you can configure your LLM to passively build and maintain a structured, interlinked wiki that gets smarter every time you drop a file into it.
The video version · same thesis, looser edits
Most people’s experience with LLMs and documents looks like RAG (Retrieval-Augmented Generation). You upload a collection of files, the LLM retrieves relevant chunks at query time, and it generates an answer.
This works, but the LLM is rediscovering knowledge from scratch on every question. There is no accumulation. If you ask a subtle question that requires synthesizing five different documents, the LLM has to find and piece together the relevant fragments every single time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most commercial RAG systems work exactly this way.
The LLM Wiki Pattern is different. Instead of just retrieving from raw documents at query time, the LLM incrementally builds and maintains a persistent wiki—a structured, interlinked collection of markdown files that sits between you and the raw sources.
The Core Idea
When you add a new source into an LLM Wiki, the LLM doesn’t just index it for later retrieval. It reads it, extracts the key information, and physically integrates it into the existing wiki. It updates entity pages, revises topic summaries, notes where new data contradicts old claims, and strengthens the evolving synthesis. The knowledge is compiled once, and then kept current.
This is the key difference: the wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you’ve read. The wiki keeps getting richer with every source you add and every question you ask.
You rarely write the wiki yourself. You are in charge of sourcing, exploration, and asking the right questions. The LLM does the grunt work—the summarizing, cross-referencing, filing, and bookkeeping that makes a knowledge base actually useful. I keep my LLM agent open on one side and Obsidian open on the other. Obsidian is the IDE; the LLM is the tireless junior programmer; the wiki is the codebase.
This architecture scales across multiple domains:
- Personal: Tracking goals, journaling, and filing podcast notes to build a psychological profile over time.
- Research: Deep-dives where you incrementally build a thesis across weeks of reading papers.
- Reading: Filing book chapters to build character trees and plot nodes, similar to a fan-wiki like Tolkien Gateway.
- Business: Generating an internal project wiki automatically fed by Slack threads, meeting transcripts, and PRDs.
The Architecture
There are three layers to this setup:
- Raw Sources: Your absolute source of truth. Articles, PDFs, images. These are immutable—the LLM reads them but never modifies them.
- The Wiki: A directory of LLM-generated markdown files. Summaries, concept comparisons, and overviews. The LLM exclusively owns this layer. It writes, updates, and interlinks everything.
- The Schema: A configuration document (e.g. AGENTS.md) that explicitly tells the LLM how the wiki is structured, the naming conventions, and the exact workflows for ingesting sources and linting the wiki.
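To make the three layers concrete, here is what a minimal schema document might look like. This is a hypothetical AGENTS.md excerpt; the directory names, link syntax, and log format are illustrative choices, not part of the pattern itself:

```
# Wiki Schema

## Layout
- sources/      : raw documents (read-only; never edit)
- wiki/         : generated markdown pages (you own this layer)
- wiki/index.md : master catalog of every page
- wiki/log.md   : append-only change log

## Conventions
- Page names: kebab-case, e.g. wiki/topic-comparisons.md
- Cross-links: [[wikilink]] syntax
- Log entries: ## [YYYY-MM-DD] <action> | <title>

## Workflows
- Ingest: summarize the source, update index.md, cascade
  updates to affected concept pages, append a log entry.
- Lint: flag contradictions, stale claims, and orphan pages.
```

The exact conventions matter less than the fact that they are written down: the schema is what keeps the LLM's output consistent across many sessions.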
Daily Operations
1. Ingest
You drop a new source into the raw collection and command the LLM to process it. The LLM reads the source, writes a summary page, updates the master index, cascades updates across relevant concept pages, and appends a timestamped entry to the log. A single ingestion might touch 10 to 15 wiki pages in one pass.
2. Query
You ask questions against the wiki. The LLM searches for relevant pages, reads them, and synthesizes an answer with citations.
The important insight here: good answers should be filed back into the wiki as new pages. A comparison matrix you asked for or a connection you discovered shouldn’t disappear into chat history. They are valuable artifacts that compound the knowledge base.
3. Lint
Periodically, you ask the LLM to health-check the wiki: look for contradictions between pages, stale claims that newer sources have superseded, orphan pages with no inbound links, and redundant legacy pages. The LLM operates as a janitor, proposing structural fixes to keep the wiki healthy as it scales.
Navigation and History
Two special files help the LLM (and you) navigate the wiki as it grows:
- index.md: A structured catalog of everything in the wiki. Every page is listed with a link, a one-line summary, and metadata like source counts. When answering a query, the LLM reads the index first to find relevant pages, then drills into them. This avoids the need for heavy vector/embedding infrastructure at moderate scales.
- log.md: An append-only chronological record of what happened and when. If each entry starts with a consistent prefix (e.g., ## [2026-04-02] ingest | Article Title), the log becomes highly parseable with simple Unix tools like grep.
Why This Works
The tedious part of maintaining a knowledge base is not the reading or the thinking—it is the bookkeeping. Humans abandon personal wikis because the maintenance burden eventually grows faster than the value.
LLMs don’t get bored. They don’t forget to update a cross-reference, and they comfortably touch 15 files in a single pass. The wiki stays maintained because the cost of maintenance drops to near zero.
The human’s job is to curate sources, direct the analysis, ask good questions, and think about what it all means. The LLM’s job is everything else.