Grokers — The Codebase Brain

The Problem

Every AI tool re-reads your code from scratch

Today's AI coding assistants — Cursor, Claude Code, Copilot — reconstruct a mental model of your codebase on every single query. They read files, trace imports, reason about which functions call which. Then they throw it all away.

Next query: same work again. For a codebase of any real size, this is slow, expensive, and imprecise. They can only afford to look at a narrow slice, so they miss context that exists two files away.

The Grokers insight

Pay the comprehension cost once. Store the result permanently as a queryable graph. Every subsequent question — "what calls this function?", "what config keys does this module read?", "which conventions govern how I extend this?" — is answered in milliseconds from the database, not minutes of LLM inference.

For any codebase touched more than a few times, this model is orders of magnitude cheaper. For systematic changes spanning hundreds of files, it's the difference between possible and impossible.

How It Works

Parse once. Ask forever.

Grokers runs in four passes. The first three run automatically when you index a repo. The fourth, regrok, runs incrementally afterward as files change.

Parse every file

Grokers uses tree-sitter — a fast, precise parser — on 11 programming languages plus HTML, CSS, templates, and config files. Every function, class, and method becomes a stream in the database with its file location, language, and parameters.

Build the dependency graph

Call edges, hook relationships, config reads, external service calls — all stored as typed relations between streams. Circular dependencies are detected and grouped. The graph is sorted bottom-up so leaf functions come first.
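A minimal sketch of the grouping and ordering step, assuming a plain object that maps each function to its callees. The graph contents and the name bottomUpGroups are hypothetical, and Tarjan's algorithm is one common way to do this, not necessarily what Grokers uses internally.

```javascript
// Hypothetical call graph: each key calls the functions in its array.
const calls = {
  main: ["parse", "render"],
  parse: ["tokenize"],
  render: ["escape", "layout"],
  layout: ["render"], // render <-> layout form a cycle
  tokenize: [],
  escape: [],
};

// Tarjan's strongly-connected-components algorithm. It emits a component
// only after every component reachable from it, so the result is already
// callee-first (bottom-up), with each cycle grouped into one entry.
function bottomUpGroups(graph) {
  let counter = 0;
  const index = new Map();
  const low = new Map();
  const stack = [];
  const onStack = new Set();
  const groups = [];

  function visit(node) {
    index.set(node, counter);
    low.set(node, counter);
    counter += 1;
    stack.push(node);
    onStack.add(node);

    for (const callee of graph[node] || []) {
      if (!index.has(callee)) {
        visit(callee);
        low.set(node, Math.min(low.get(node), low.get(callee)));
      } else if (onStack.has(callee)) {
        low.set(node, Math.min(low.get(node), index.get(callee)));
      }
    }

    // node is the root of a component: pop its members off the stack.
    if (low.get(node) === index.get(node)) {
      const group = [];
      let member;
      do {
        member = stack.pop();
        onStack.delete(member);
        group.push(member);
      } while (member !== node);
      groups.push(group.sort());
    }
  }

  for (const node of Object.keys(graph)) {
    if (!index.has(node)) visit(node);
  }
  return groups;
}

const groups = bottomUpGroups(calls);
console.log(groups);
// [ ['tokenize'], ['parse'], ['escape'], ['layout', 'render'], ['main'] ]
```

Leaf functions come out first, and the render/layout cycle appears as a single group that can be handled as one unit.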

Analyze bottom-up with AI

Starting from leaf functions (things that call nothing), AI agents analyze each symbol. By the time an agent looks at a complex function, all its callees have already been analyzed. The agent reasons about known facts, not raw source. Each analysis produces preconditions, postconditions, side effects, and invariants — stored permanently.
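The bottom-up pass can be sketched roughly as follows. The ordering, the call edges, and analyzeSymbol are all hypothetical stand-ins; in particular, analyzeSymbol replaces the AI agent with a stub that only records which callee facts it was given.

```javascript
// Hypothetical inputs: a callee-first ordering and the call edges behind it.
const order = ["tokenize", "escape", "parse", "render"];
const callees = {
  tokenize: [],
  escape: [],
  parse: ["tokenize"],
  render: ["escape", "parse"],
};

// Stand-in for the AI agent. A real agent would produce preconditions,
// postconditions, side effects, and invariants; this stub only records
// which already-known facts it was handed.
function analyzeSymbol(name, knownFacts) {
  return { name, basedOn: Object.keys(knownFacts).sort() };
}

function analyzeBottomUp(order, callees, analyze) {
  const results = new Map(); // stored analyses, keyed by symbol name
  for (const name of order) {
    const knownFacts = {};
    for (const callee of callees[name] || []) {
      if (!results.has(callee)) {
        throw new Error(`${callee} must be analyzed before ${name}`);
      }
      knownFacts[callee] = results.get(callee);
    }
    results.set(name, analyze(name, knownFacts));
  }
  return results;
}

const analyses = analyzeBottomUp(order, callees, analyzeSymbol);
console.log(analyses.get("render").basedOn); // [ 'escape', 'parse' ]
```

By the time render is analyzed, the results for escape and parse already exist, so the analyzer receives stored facts instead of raw source.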

Regrok: stay current incrementally

When files change, Grokers re-parses only the changed files, reconciles the new symbol set against the old, and re-analyzes only the functions whose behavior actually changed. Everything else stays as-is. The graph grows more accurate over time.
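A rough sketch of the reconcile step, assuming each symbol is available as a name plus its source text. Comparing whitespace-normalized source is only a crude proxy for "behavior actually changed", and the function names here are hypothetical.

```javascript
// Ignore whitespace-only edits when comparing two versions of a symbol.
// (A crude proxy: it also ignores whitespace changes inside string literals.)
function normalize(source) {
  return source.replace(/\s+/g, " ").trim();
}

// oldSymbols and newSymbols are Maps of qualified name -> source text.
function reconcile(oldSymbols, newSymbols) {
  const result = { added: [], removed: [], changed: [], unchanged: [] };
  for (const [name, source] of newSymbols) {
    if (!oldSymbols.has(name)) {
      result.added.push(name);
    } else if (normalize(oldSymbols.get(name)) !== normalize(source)) {
      result.changed.push(name);
    } else {
      result.unchanged.push(name);
    }
  }
  for (const name of oldSymbols.keys()) {
    if (!newSymbols.has(name)) result.removed.push(name);
  }
  return result;
}

const before = new Map([
  ["sum", "function sum(a, b) { return a + b; }"],
  ["greet", "function greet(n) { return 'hi ' + n; }"],
  ["old", "function old() {}"],
]);
const after = new Map([
  ["sum", "function sum(a, b) {\n  return a + b;\n}"], // reformatted only
  ["greet", "function greet(n) { return 'hello ' + n; }"], // body changed
  ["fresh", "function fresh() {}"],
]);

const diff = reconcile(before, after);
console.log(diff);
// { added: ['fresh'], removed: ['old'], changed: ['greet'], unchanged: ['sum'] }
```

Only the added and changed symbols would need re-analysis; everything in unchanged keeps its stored results.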

The Knowledge Graph

Everything is a stream. Relations are the graph.

Grokers stores all knowledge as Qbix streams — structured, access-controlled, real-time. Each node type has a specific shape. Relations between nodes encode the dependency structure.

Grokers/method

A function or method. Stores qualName, file, language, parameters, preconditions, postconditions, side effects, invariants, and call count.
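For illustration, a Grokers/method record might look something like the object below. Apart from qualName, the field names are guesses based on the description above, and the file path and parameter are hypothetical.

```javascript
// Hypothetical shape of a Grokers/method record. Field names follow the
// prose description; the real stream attributes may be named differently.
function makeMethodRecord({ qualName, file, language, parameters = [] }) {
  if (!qualName || !file || !language) {
    throw new Error("qualName, file, and language are required");
  }
  return {
    type: "Grokers/method",
    qualName,
    file,
    language,
    parameters,
    // Filled in later by the analysis pass.
    preconditions: [],
    postconditions: [],
    sideEffects: [],
    invariants: [],
    callCount: 0,
  };
}

const record = makeMethodRecord({
  qualName: "JWTManager::validate",
  file: "classes/JWTManager.php", // hypothetical path
  language: "php",
  parameters: ["token"], // hypothetical parameter
});
console.log(record.type, record.qualName);
```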

Grokers/class

A class or interface. Groups methods via Grokers/method relations. Created automatically from the methods it contains.

Grokers/extern

An external resource the code touches. Q.Config.get(['Q','app']) becomes Grokers/extern/config/Q.app. Config keys, text strings, API endpoints, hooks — all mapped.
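The mapping from a config read to an extern name can be sketched like this. It uses a regular expression purely for brevity (Grokers itself parses with tree-sitter), and externName, findConfigReads, and the second config key are hypothetical.

```javascript
// Build the extern stream name for a resource kind and a key path.
function externName(kind, keyPath) {
  return `Grokers/extern/${kind}/${keyPath.join(".")}`;
}

// Find literal Q.Config.get([...]) calls in a source string. A regex is a
// simplification here; it only handles array literals of quoted strings.
function findConfigReads(source) {
  const pattern = /Q\.Config\.get\(\s*\[([^\]]*)\]/g;
  const names = [];
  for (const match of source.matchAll(pattern)) {
    const keys = match[1]
      .split(",")
      .map((part) => part.trim().replace(/^['"]|['"]$/g, ""))
      .filter((part) => part.length > 0);
    names.push(externName("config", keys));
  }
  return names;
}

const sample = `
  var app = Q.Config.get(['Q','app']);
  var timeout = Q.Config.get(["Users", "login", "timeout"]); // hypothetical key
`;
console.log(findConfigReads(sample));
// [ 'Grokers/extern/config/Q.app', 'Grokers/extern/config/Users.login.timeout' ]
```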

Grokers/concept

A named subsystem, automatically discovered from patterns. Q.Config, Q.Text, Q.Streams are concepts. Scored by how many functions reference them (importance) and how consistent the evidence is (confidence).

Grokers/convention

A recipe for extending the framework. "How do you define a new tool?" "Where does the file go?" "What must the constructor return?" Scored by confidence on a 0–1 scale. Only conventions scoring above 0.85 are injected into AI prompts.

Grokers/clue

A signal the AI flagged for further investigation. Unusual patterns, possible dynamic dispatch, unresolved references. Feeds the Investigator agent.

Grokers/impact

A report of what changes when a function changes. Created automatically when regrok detects a modification. Shows the affected callers, the conventions that go stale, and a severity level (breaking, likely-breaking, uncertain, or informational).

Concepts & Conventions

The codebase teaches itself

The most valuable thing Grokers builds is not the function graph — it's the higher-level understanding that emerges from it.

Concepts grow from evidence

When Grokers sees fifty functions calling Q.Config.get(), it creates a Grokers/concept called "Q.Config" and maps out every config key the codebase uses. Each new function that references it increments the concept's importance score. Exceptions lower confidence. The concept gets sharper over time.

Conventions encode the rules

When Grokers analyzes Q.Tool.define(), it creates a Grokers/convention capturing the full recipe: where the file goes, what the constructor receives, the lifecycle order, what not to do. Any AI asked to write a new tool gets this convention injected automatically, so it works from a signature confirmed in the graph instead of guessing one.

Scored, not binary

Every concept and convention has a confidence score (0–1) based on how many examples confirm it vs. contradict it, and an importance score based on how many functions depend on it. Only high-confidence, high-importance knowledge gets injected into AI prompts. The rest is stored and available for search.
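One simple way to picture the scoring, as a sketch only: the text does not give the actual formula, so the ratio below, the importance cutoff, and the sample counts are all assumptions. The 0.85 confidence threshold is the one stated for conventions.

```javascript
// Assumed formula: confidence is the share of examples that confirm the
// pattern; importance is the number of functions that depend on it.
function scoreKnowledge({ confirming, contradicting, dependents }) {
  const total = confirming + contradicting;
  return {
    confidence: total === 0 ? 0 : confirming / total,
    importance: dependents,
  };
}

// Keep only items that clear both thresholds.
function selectForPrompt(items, { minConfidence = 0.85, minImportance = 10 } = {}) {
  return items
    .filter((item) => {
      const { confidence, importance } = scoreKnowledge(item);
      return confidence > minConfidence && importance >= minImportance;
    })
    .map((item) => item.name);
}

// Hypothetical evidence counts.
const knowledge = [
  { name: "define-a-tool", confirming: 48, contradicting: 2, dependents: 50 },
  { name: "naming-of-handlers", confirming: 6, contradicting: 4, dependents: 30 },
  { name: "rare-helper", confirming: 3, contradicting: 0, dependents: 3 },
];

console.log(selectForPrompt(knowledge)); // [ 'define-a-tool' ]
```

The second item fails on confidence (0.6) and the third on importance, so both stay in the store for search without being injected.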

Connected to the graph

Conventions link to the functions they describe via Grokers/derivedFrom relations. When those functions change, Grokers automatically marks the conventions stale. AI tools skip stale conventions rather than inject outdated recipes.
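A small sketch of how staleness could propagate along those relations. The convention names and the in-memory data layout are hypothetical; in Grokers these would be streams and relations, not plain objects.

```javascript
// Hypothetical data: two conventions and the functions each was derived from.
const conventions = new Map([
  ["define-a-tool", { stale: false }],
  ["read-config", { stale: false }],
]);
const derivedFrom = [
  { convention: "define-a-tool", fn: "Q.Tool.define" },
  { convention: "read-config", fn: "Q.Config.get" },
];

// Mark every convention derived from a changed function as stale.
function markStale(conventions, derivedFrom, changedFunctions) {
  const changed = new Set(changedFunctions);
  for (const { convention, fn } of derivedFrom) {
    if (changed.has(fn) && conventions.has(convention)) {
      conventions.get(convention).stale = true;
    }
  }
}

// Only fresh conventions are eligible for prompt injection.
function freshConventions(conventions) {
  return [...conventions]
    .filter(([, convention]) => !convention.stale)
    .map(([name]) => name);
}

markStale(conventions, derivedFrom, ["Q.Tool.define"]);
console.log(freshConventions(conventions)); // [ 'read-config' ]
```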

Impact Reports

Change something. Know everything it affects.

When regrok detects that a function's implementation changed, Grokers doesn't just mark it for re-analysis. It immediately produces an impact report — a hierarchical record of everything downstream.

Function changes → Grokers/impact created
    ↳ All direct callers listed (with severity)
    ↳ Stale conventions flagged
    ↳ Grokers/symbol/changed posted on repo

The impact report is a stream in the graph, not a document. Teams can subscribe to it, bots can act on it, developers can browse it. And before making any change, you can ask Grokers to produce a proposed impact report — the same structure, but hypothetical, so you understand the consequences before you write a line.
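To make the report structure concrete, here is a minimal sketch that lists direct callers for a changed function. The call graph, the field names, and especially the severity rule are assumptions for illustration; the text does not say how Grokers assigns severity.

```javascript
// Hypothetical call graph: each key calls the functions in its array.
const callGraph = {
  "LoginHandler::handle": ["JWTManager::validate"],
  "ApiGateway::authorize": ["JWTManager::validate", "RateLimiter::check"],
  "JWTManager::validate": [],
  "RateLimiter::check": [],
};

// Assumed rule, for illustration only: a signature change is "breaking",
// anything else is "uncertain".
function buildImpactReport(change, graph, { proposed = false } = {}) {
  const severity = change.signatureChanged ? "breaking" : "uncertain";
  const callers = Object.keys(graph)
    .filter((caller) => graph[caller].includes(change.qualName))
    .sort()
    .map((caller) => ({ caller, severity }));
  return { type: "Grokers/impact", proposed, symbol: change.qualName, callers };
}

const report = buildImpactReport(
  { qualName: "JWTManager::validate", signatureChanged: true },
  callGraph,
  { proposed: true } // hypothetical report, produced before any edit
);
console.log(report.callers.map((c) => `${c.caller}: ${c.severity}`));
// [ 'ApiGateway::authorize: breaking', 'LoginHandler::handle: breaking' ]
```

The same function serves both cases: pass proposed: true for a what-if report, or leave it off for a report on a change that already happened.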

AI Safety

No hallucination. No guessing.

Every AI agent that uses Grokers operates under a hard rule: never invent a function signature you haven't seen in the graph. If you need to know how stateChanged() works, look it up. If a convention lists related functions, walk the graph to them before referencing them.

This is enforced in the prompt, not just hoped for. The result is AI-generated code that follows actual framework conventions — because it looked them up — rather than plausible-sounding fabrications.

The graph also makes this practical. Because Grokers pre-computed all the relationships, looking up a function is a millisecond database query, not a file search. The AI doesn't have to guess because looking it up is trivially fast.
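The idea of a cheap lookup can be sketched with an in-memory index. Everything here is hypothetical (the edges, the signature string, the function names), and a Map stands in for the database.

```javascript
// Hypothetical precomputed data, as it might look after indexing.
const callEdges = [
  ["LoginHandler::handle", "JWTManager::validate"],
  ["ApiGateway::authorize", "JWTManager::validate"],
  ["JWTManager::validate", "JWTManager::decode"],
];
const signatures = new Map([
  ["JWTManager::validate", "validate(token)"], // hypothetical signature
]);

// Build the reverse index once; each later lookup is a single Map read.
const callersOf = new Map();
for (const [caller, callee] of callEdges) {
  if (!callersOf.has(callee)) callersOf.set(callee, []);
  callersOf.get(callee).push(caller);
}

function whatCalls(qualName) {
  return callersOf.get(qualName) || [];
}

// The hard rule as a guard: return a confirmed signature or refuse.
function confirmedSignature(qualName) {
  if (!signatures.has(qualName)) {
    throw new Error(`No confirmed signature for ${qualName}; look it up first`);
  }
  return signatures.get(qualName);
}

console.log(whatCalls("JWTManager::validate"));
// [ 'LoginHandler::handle', 'ApiGateway::authorize' ]
console.log(confirmedSignature("JWTManager::validate")); // validate(token)
```

An unknown name makes confirmedSignature throw instead of returning a guess, which mirrors the rule that agents look signatures up before using them.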

Get Started

One command to start.

Grokers ships as a Qbix plugin. Once installed, a single CLI command indexes a repository and starts the analysis pipeline.

# Full index + analysis
node Grokers.js grok /path/to/repo
# Update after file changes
node Grokers.js regrok /path/to/repo
# Ask a question
node Grokers.js ask /path/to/repo "What does JWTManager::validate assume about its input?"
# Check progress
node Grokers.js status /path/to/repo

Your codebase, fully understood