
Knowledge Graphs for AI Coding: How They Work and Why Ours Is Different

What a knowledge graph is, why vector search alone is not enough, and the specific mechanisms that make our hook-driven, cross-project KG different from Mem0, Letta, and Augment Code.

Every AI coding assistant forgets. Close the session, reopen tomorrow, the context vanishes. Standard RAG helps: dump project files into a vector database, let the model retrieve similar chunks. But this works only when question and answer share surface-level vocabulary. Decisions, tradeoffs, architectural rationale — these live outside the code itself.

A knowledge graph addresses a harder problem: holding institutional memory in a structure that survives sessions. The difference between tools lies in how that memory stays synced, how it's scoped, and how results get ranked.

What a Knowledge Graph Actually Is

A knowledge graph is a structured store of facts. Each fact is a node - a concept, a tool, a project, a decision - and nodes are connected by typed edges that describe the relationship: uses, implements, extends, depends on. Unlike a flat document store, the graph knows that "FastAPI" and "Pydantic" are not just two words that sometimes appear together, but that one uses the other.

Modern implementations layer semantic embeddings on top of the graph: every node's content becomes a vector, so you can search by intent, not just by keyword. The combination matters. A pure vector database returns chunks that look similar; a graph lets you follow typed links from a relevant node to discover the decisions and parent concepts behind it.
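A minimal sketch of this dual structure, assuming nothing about the Orchestrator's internals: each node carries content plus an embedding for vector search, and typed edges let you hop from a hit to its related decisions.

```python
from dataclasses import dataclass, field

# Illustrative node shape: content + embedding for similarity search,
# typed edges for graph traversal. Names and vectors are made up.
@dataclass
class Node:
    name: str
    text: str
    embedding: list[float]
    edges: dict[str, list[str]] = field(default_factory=dict)  # edge type -> targets

graph = {
    "FastAPI": Node("FastAPI", "Async Python web framework", [0.9, 0.1],
                    edges={"uses": ["Pydantic"]}),
    "Pydantic": Node("Pydantic", "Data validation via type hints", [0.8, 0.2]),
}

def neighbors(name: str, edge_type: str) -> list[str]:
    """Follow a typed link from one node to its targets."""
    return graph[name].edges.get(edge_type, [])

print(neighbors("FastAPI", "uses"))  # -> ['Pydantic']
```

A vector index over the `embedding` fields answers "what looks similar"; the `edges` dict answers "what is connected and how" — two different questions a flat document store collapses into one.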

The idea predates AI. Google's Knowledge Graph (2012), Palantir, Neo4j, Weaviate, and Obsidian's wiki syntax all share this lineage. What's new since 2024: AI coding agents need this memory at runtime. Mem0 extracts entities from chat; Letta models memory as tiered OS pages. Both are production systems, and both work differently from ours.

What Makes Ours Different

The VibeCoded Orchestrator runs a knowledge graph as first-class infrastructure through five concrete mechanisms, none of which appear combined elsewhere. Weaviate backend, self-hosted. Text embeddings via qwen3-embedding:0.6b (1024-dim), code embeddings via CodeSage-Large-v2 (2048-dim), both local. No data leaves your machine.

1. Hook-driven auto-sync, not explicit memory calls. A PostToolUse hook watches Edit(knowledge/**.md), fires on save, re-embeds and upserts to Weaviate in under two seconds—no human step. Mem0 and Letta need explicit memory.add() calls in agent code; Augment runs cloud indexing at slower cadence. We get edit-to-searchable freshness on every save, automatically.
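The hook's sync step can be sketched roughly as follows. This is an illustrative stand-in, not the Orchestrator's actual code: `embed_text` and `upsert_node` are hypothetical names, and the in-memory dict stands in for the Weaviate collection.

```python
import hashlib
from pathlib import Path

STORE: dict[str, dict] = {}  # stand-in for the Weaviate collection

def embed_text(text: str) -> list[float]:
    # Placeholder for the local embedding-model call.
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:4]]

def upsert_node(path: Path, text: str) -> None:
    STORE[path.stem] = {"vector": embed_text(text), "content": text}

def on_post_tool_use(edited_path: str, content: str) -> bool:
    """Fires after every Edit; only knowledge/**.md files trigger a re-embed."""
    path = Path(edited_path)
    if path.suffix == ".md" and "knowledge" in path.parts:
        upsert_node(path, content)
        return True
    return False

on_post_tool_use("knowledge/concepts/fastapi.md", "FastAPI uses Pydantic.")
```

The point of the pattern: the sync lives in the editing pipeline itself, so freshness is a side effect of saving, not a separate step an agent can forget.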

2. Cross-project shared collection. Per-project KG_COLLECTION plus a shared collection: patterns documented in project A become searchable from B. Cursor, Windsurf, Augment all scope memory to single workspaces. We optimize for developers working across multiple codebases in a week, where a JWT decision from one project should surface in another.
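The cross-project lookup amounts to querying two collections and merging by score. A minimal sketch under assumed names — the collection naming and the stub `search` function are illustrative, not the shipped API:

```python
# Fake collections standing in for per-project and shared Weaviate collections.
COLLECTIONS = {
    "kg_project_a": [{"title": "JWT refresh-token decision", "score": 0.91}],
    "kg_shared":    [{"title": "Blackboard pattern notes", "score": 0.78}],
}

def search(collection: str, query: str) -> list[dict]:
    # Stand-in for a real hybrid query against one collection.
    return COLLECTIONS.get(collection, [])

def cross_project_search(query: str, project: str) -> list[dict]:
    """Merge project-scoped and shared hits, best score first."""
    hits = search(f"kg_{project}", query) + search("kg_shared", query)
    return sorted(hits, key=lambda h: h["score"], reverse=True)

results = cross_project_search("auth decisions", "project_a")
```

Because the shared collection is queried on every search, a pattern documented in one workspace surfaces in any other with no export step.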

3. Plain markdown + typed wikilinks, no proprietary store. Nodes are .md files in knowledge/, one per concept/tool/project/pattern. Relationships are wikilinks: [[uses::FastAPI]], [[implements::Blackboard Pattern]], [[extends::Base Agent]]. Developers read, edit, diff, grep them directly. Mem0 and Letta keep memory in proprietary backends—useful but opaque.
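Because the store is plain text, extracting the typed edges is a one-liner. The wikilink syntax below is from the article; the parser itself is a minimal sketch, not the Orchestrator's implementation.

```python
import re

# Matches [[edge_type::Target Name]] and captures both halves.
WIKILINK = re.compile(r"\[\[(\w+)::([^\]]+)\]\]")

def parse_links(markdown: str) -> list[tuple[str, str]]:
    """Extract (edge_type, target) pairs from a knowledge node."""
    return WIKILINK.findall(markdown)

node = "This service [[uses::FastAPI]] and [[implements::Blackboard Pattern]]."
print(parse_links(node))
# -> [('uses', 'FastAPI'), ('implements', 'Blackboard Pattern')]
```

The same files that feed the graph remain greppable and diffable, so the graph's edge list is always recoverable from the repo itself.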

4. Token-aware result detail (Pro). hybrid_search(detail="titles" | "descriptions" | "full") returns three calibrated levels: titles only (browse), 6-line summaries (triage), full content (implement). The trick: every KG node stores a pre-built summary alongside full text. A PostToolUse hook runs kg-summary-generator.sh (Haiku-powered, debounced by content hash) on every write. Summaries exist at query time; nothing is generated on retrieval.

The RL-scored reranker (Pro) picks detail per result by confidence score: top-3 high-score hits get full text, mid-score get summaries, low-score get titles. Static rerankers dump full chunks regardless. Token waste prevented by design.
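The per-result detail selection can be sketched as a threshold rule. The thresholds, field names, and function are illustrative assumptions, not the Pro reranker's actual values:

```python
def pick_detail(ranked_hits: list[dict], top_full: int = 3,
                full_floor: float = 0.8, summary_floor: float = 0.5) -> list[dict]:
    """Top-N confident hits get full text, mid-score hits get the pre-built
    summary, everything else collapses to a title."""
    out = []
    for i, hit in enumerate(ranked_hits):
        if i < top_full and hit["score"] >= full_floor:
            detail = "full"
        elif hit["score"] >= summary_floor:
            detail = "summary"
        else:
            detail = "title"
        out.append({**hit, "detail": detail})
    return out

hits = [{"id": "postmortem", "score": 0.92},
        {"id": "middleware", "score": 0.61},
        {"id": "old-notes", "score": 0.30}]
```

Run on these scores, the postmortem comes back full, the middleware pattern as a summary, and the stale notes as a bare title — the token budget follows confidence instead of being spent uniformly.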

5. semantic_graph_search for typed-link traversal. Vector search finds similar text. semantic_graph_search starts at a seed concept, walks typed edges to depth N, returns the connected subgraph—surfacing "why" rather than just topically similar text.
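The traversal itself is a depth-limited breadth-first walk over typed edges. A minimal sketch — the seed graph and edge dict are illustrative, not the shipped `semantic_graph_search`:

```python
from collections import deque

# Toy edge list: node -> [(edge_type, target), ...]
EDGES = {
    "Auth Service": [("uses", "FastAPI"), ("implements", "JWT Decision")],
    "FastAPI": [("uses", "Pydantic")],
    "JWT Decision": [],
    "Pydantic": [],
}

def graph_search(seed: str, depth: int) -> set[str]:
    """Walk typed links breadth-first up to `depth` hops from the seed."""
    seen, queue = {seed}, deque([(seed, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == depth:
            continue
        for _edge_type, target in EDGES.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, hops + 1))
    return seen

print(graph_search("Auth Service", 2))
```

Starting from "Auth Service", one hop reaches the JWT decision node even though its text may share no vocabulary with the query — the connection, not the wording, is what surfaces it.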

A knowledge-curator agent reviews new nodes, deduplicates, suggests missing links, flags stale content. The store_knowledge_node MCP tool is read-write for both humans and agents. The KG learns as agents learn.

If this is the kind of memory layer you want in your editor, the base Orchestrator is free under AGPL-3.0 and ships with the full KG, code graph, hooks, and both search tools - see the Orchestrator page.

Two Concrete Scenarios

Scenario 1. A developer starts a new FastAPI service and asks the AI how to handle auth. The first response includes a link to a decision node from a Next.js project six weeks ago - the one that explained why short-lived JWTs with refresh tokens were chosen over sessions for that particular threat model. The reasoning carries over even though the stack is different. That cross-project surfacing comes from mechanism (2).

Scenario 2. Someone asks "what did we decide about rate limiting?" Grep finds nothing because no file contains that exact phrase. hybrid_search returns three nodes - a Redis token bucket implementation, a middleware pattern, and an incident postmortem - and on Pro, the reranker hands back the postmortem in full but the others as six-line summaries, because the postmortem is what the question really wants. Mechanism (4) at work.

Where It Sits in the Landscape

| Tool | Approach | Local? | Cross-project? | Auto-sync? | Detail tiers? | Open Source? | Price |
| --- | --- | --- | --- | --- | --- | --- | --- |
| VibeCoded Orchestrator | KG + code graph, typed links, hook-driven | Yes | Yes (shared collection) | Yes (PostToolUse, <2s) | Yes (Pro, RL-scored) | AGPL-3.0 | Free |
| VibeCoded Orchestrator Pro | Adds RL-scored reranker over the same KG | Yes | Yes | Yes | Yes | Source-available | €19/mo, €149/yr, €199 lifetime |
| Cursor | Per-project embedding index, 200K context window | No | No | Implicit | No | No | $20/mo+ |
| GitHub Copilot | Completion-focused, no persistent graph | No | No | n/a | No | No | $10/mo+ |
| Augment Code | Context Engine (graph-aware cloud indexing) | No | Limited | Cloud-paced | No | No | Paid |
| Windsurf (Codeium) | Codemaps (graph of code structure) | No | No | Implicit | No | No | $15/mo+ |
| Mem0 | Vector + optional graph extraction from chat | Self-host option | Yes | Explicit add() calls | No | Apache 2.0 (core) | Graph on $249/mo Pro |
| Letta (ex-MemGPT) | Tiered OS-style memory, agent-managed | Yes | Yes | Explicit | Tiered (different concept) | Apache 2.0 | Free (self-host) |
| Obsidian + KG plugins | Personal notes, not coding-specific | Yes | Yes | Manual | No | Free | Free / $8 sync |

We optimize for long-running development: cross-project knowledge that travels with you, hook-driven freshness (no explicit memory.add() calls), inspectable markdown storage. Other tools solve adjacent problems. Augment targets massive monorepos; Letta pages memory per-agent; Mem0 extracts from chat history; Cursor's 200K window masks single-project limits. Each solves a different shape of work. If you work across projects and want the AI to learn with you across all of them, the Orchestrator fits.

Real failure modes exist: stale nodes, noise from over-linking, curation cost. Auto-sync, curator agents, and RL-scored detail levels keep that cost low.

Get It

The base Orchestrator - full KG, code graph, 20 hooks, 4 MCPs, agents and skills - is free and AGPL-3.0. If you want the RL-scored reranker that prevents token waste on long agent loops, Orchestrator Pro is €19/month, €149/year, or €199 lifetime (capped to the first 100 customers).
