---
title: "Microsoft GraphRAG: what we learned"
url: https://memory.wiki/6WkjlKgA
updated: 2026-05-14T18:15:49.480Z
hub: https://memory.wiki/hub/demo
bundle_count: 1
concept_count: 12
source: "memory.wiki"
---
# Microsoft GraphRAG: what we learned

> Read the 2024 paper and the follow-ups (Project NotebookLM, the v1.1 release, the open-source community fork). Notes for the team to reference when "GraphRAG" comes up.

## The thesis, in one sentence

> Build a knowledge graph from a document corpus, run community detection on it, and answer queries by traversing the graph instead of just embedding-matching.

## What's good

- **Multi-hop reasoning.** GraphRAG beats naive RAG on the "compare X and Y across documents" class of questions. The community-detection step gives it a structural prior that vector recall misses.
- **Honest about its costs.** The paper is explicit that GraphRAG is 10-50x more expensive than naive RAG to build, because the graph extraction is an LLM-per-chunk operation. They don't bury this.
- **The open-source release is real.** The Python package works. The community fork (rust-graphrag) shaves indexing time by ~3x.
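
To make the multi-hop point concrete, here is a minimal sketch of why a graph helps with "compare X and Y across documents" questions. It is not GraphRAG's actual query pipeline; the edge list, document names, and function names are hypothetical, chosen only to show that a path between two entities can stitch together facts from different source documents.

```python
from collections import deque

# Hypothetical toy graph: (entity, relation, entity, source document).
EDGES = [
    ("Mem0", "implements", "Extracted memory", "doc-a.md"),
    ("Letta", "implements", "Extracted memory", "doc-b.md"),
    ("GraphRAG", "uses", "Community detection", "doc-c.md"),
]

def build_adjacency(edges):
    """Index edges in both directions so traversal can follow either way."""
    adj = {}
    for head, rel, tail, doc in edges:
        adj.setdefault(head, []).append((tail, rel, doc))
        adj.setdefault(tail, []).append((head, rel, doc))
    return adj

def find_path(adj, start, goal):
    """Breadth-first search; returns the shortest list of hops, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, hops = queue.popleft()
        if node == goal:
            return hops
        for nxt, rel, doc in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                # Hops record traversal order, not the original edge direction.
                queue.append((nxt, hops + [(node, rel, nxt, doc)]))
    return None

adj = build_adjacency(EDGES)
path = find_path(adj, "Mem0", "Letta")
for step in path:
    print(step)
```

The two hops come from two different documents, which is the connection a single embedding lookup over isolated chunks tends to miss.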

## What's structurally different from us

GraphRAG is a **service**. You hand it a corpus, it builds an index, and it answers queries internally on behalf of the system that calls it. The graph never leaves the service.

mdfy is a **delivery model**. We don't traverse the graph internally on the receiver's behalf — we ship the graph in the URL response and let the receiving AI inherit it. Same primitive (a knowledge graph over docs), different delivery shape.
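
As an illustration of "shipping the graph in the response", the sketch below serializes concepts and relations into markdown sections shaped like the "Concepts in this document" and "Concept relations" sections at the end of this page. This is an assumption about the shape only, not mdfy's actual serializer; `Concept`, `Relation`, and `render_graph_markdown` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Concept:
    name: str
    kind: str          # e.g. "entity", "concept", "tag"
    description: str

@dataclass
class Relation:
    head: str
    label: str
    tail: str

def render_graph_markdown(concepts, relations):
    """Serialize the graph as markdown a receiving model can read directly."""
    lines = ["## Concepts in this document"]
    for c in concepts:
        lines.append(f"- **{c.name}** _({c.kind})_")
        lines.append(f"  {c.description}")
    lines.append("")
    lines.append("## Concept relations (within this doc's concepts)")
    for r in relations:
        lines.append(f"- **{r.head}** {r.label} **{r.tail}**")
    return "\n".join(lines)

payload = render_graph_markdown(
    [Concept("GraphRAG", "entity", "Knowledge-graph-based retrieval system.")],
    [Relation("GraphRAG", "uses for clustering", "Community detection")],
)
print(payload)
```

The point of the design is that nothing on the receiving side has to run a query: the graph arrives as plain text alongside the document body.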

## Where the comparison breaks down

The question "is mdfy a GraphRAG implementation?" is the wrong question. We share the substrate (LLM-extracted entities + relations over a markdown corpus). We don't share the API surface. GraphRAG is a Python package you embed in a backend; mdfy is a URL you paste into Claude.

## What we should take

1. **Community detection.** We currently group concepts by simple cosine clustering. GraphRAG's Leiden community detection groups nodes by graph structure rather than embedding distance alone, and should hold up better across corpora of widely varying size. Worth porting.
2. **Hierarchical summaries.** GraphRAG indexes summaries at multiple zoom levels (community → community-of-communities). We index summaries at the bundle level only. Possible v7 expansion.
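
For item 1, the sketch below shows what grouping by graph structure looks like in the smallest possible form. It deliberately uses weighted label propagation as a stand-in, not Leiden: there is no modularity optimization and no refinement step, and a real port would more likely lean on an existing Leiden implementation. The edge weights and the function name are hypothetical.

```python
from collections import defaultdict

def label_propagation(weighted_edges, max_rounds=20):
    """Group nodes by repeatedly adopting the heaviest neighboring label.

    Stand-in for Leiden: no modularity optimization, no refinement step.
    """
    neighbors = defaultdict(dict)
    for a, b, w in weighted_edges:
        neighbors[a][b] = w
        neighbors[b][a] = w
    labels = {node: node for node in neighbors}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(neighbors):  # fixed order keeps runs repeatable
            scores = defaultdict(float)
            for other, w in neighbors[node].items():
                scores[labels[other]] += w
            # Heaviest label wins; ties go to the alphabetically smallest label.
            best = min(scores, key=lambda lab: (-scores[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    groups = defaultdict(set)
    for node, lab in labels.items():
        groups[lab].add(node)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical toy graph: two tight triangles joined by one weak edge.
edges = [
    ("GraphRAG", "Community detection", 1.0),
    ("GraphRAG", "Knowledge Graphs", 1.0),
    ("Community detection", "Knowledge Graphs", 1.0),
    ("Mem0", "Letta", 1.0),
    ("Mem0", "Extracted memory", 1.0),
    ("Letta", "Extracted memory", 1.0),
    ("Knowledge Graphs", "Extracted memory", 0.2),
]
communities = label_propagation(edges)
print(communities)
```

The resulting groups are the units that item 2 would summarize at the lowest zoom level, with a second pass over a graph of communities giving the next level up.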

## What we should leave

The whole "build-time is expensive, query-time is fast" trade-off. Our users won't tolerate a blocking build step, so graph extraction has to run as a streaming-friendly background job, not a batch one.
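
A minimal sketch of what "streaming-friendly" could mean here, assuming extraction can be done one document at a time: the graph grows incrementally and a usable snapshot is available after every document. `extract_stub` is a hypothetical placeholder for the LLM extraction call, and the `head|label|tail` line format exists only to keep the example self-contained.

```python
def extract_stub(doc_text):
    """Hypothetical placeholder for an LLM extraction call.

    Treats each line of the form "head|label|tail" as one relation.
    """
    triples = []
    for line in doc_text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3:
            triples.append(tuple(parts))
    return triples

def stream_graph(docs, extract=extract_stub):
    """Yield a growing graph after each document instead of one batch result."""
    nodes, edges = set(), set()
    for doc_id, text in docs:
        for head, label, tail in extract(text):
            nodes.update((head, tail))
            edges.add((head, label, tail))
        # Immutable snapshots let a reader use the partial graph mid-run.
        yield doc_id, frozenset(nodes), frozenset(edges)

docs = [
    ("a.md", "GraphRAG|uses|Community detection"),
    ("b.md", "Mem0|implements|Extracted memory\nLetta|implements|Extracted memory"),
]
snapshots = list(stream_graph(docs))
for doc_id, nodes, edges in snapshots:
    print(doc_id, len(nodes), len(edges))
```

Because the function is a generator, a background worker can process documents as they arrive and publish each snapshot without waiting for the whole corpus.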


---

## Concepts in this document
- **mdfy** _(entity)_
  A tool that stores project context and decision history, integrated into Cursor via custom rules.
- **Knowledge Management** _(tag)_
  Domain of organizing, storing, and retrieving information for human and AI use.
- **llms.txt** _(concept)_
  Plain-text discoverability standard for AI agents at site root, analogous to robots.txt and sitemap.xml.
- **Obsidian** _(entity)_
  The primary subject being tested for import functionality and markdown compatibility.
- **Mem0** _(entity)_
  Extracted memory system that produces factual, direct summaries from conversation history with strong extraction quality but limited stylistic or contextual nuance.
- **Letta** _(entity)_
  Extracted memory system that infers behavioral patterns and work-shape preferences from conversation history with loose but contextually interesting extraction.
- **Multi-hop reasoning** _(concept)_
  GraphRAG's capability to answer comparative questions across documents by traversing graph structure; advantage over vector-only retrieval.
- **Community detection** _(concept)_
  Graph clustering technique (Leiden algorithm) that GraphRAG uses for structural priors; identified as improvement candidate for mdfy.
- **RAG Systems** _(tag)_
  Broad category of retrieval-augmented generation approaches for enhancing AI with external knowledge.
- **GraphRAG** _(entity)_
  Microsoft's knowledge-graph-based retrieval system that uses community detection for multi-hop reasoning; primary subject of analysis.
- **Extracted memory** _(concept)_
  AI-generated user profiles automatically inferred from conversation history without direct user authorship, central to both Mem0 and Letta approaches.
- **Knowledge Graphs** _(concept)_
  Structured representations of information as interconnected entities and relationships, enabling multi-hop reasoning.

## Concept relations (within this doc's concepts)
- **Knowledge Graphs** enables **Multi-hop reasoning**
- **Community detection** analyzes the structure of **Knowledge Graphs**
- **Mem0** implements **Extracted memory**
- **Letta** implements **Extracted memory**
- **GraphRAG** implements **Knowledge Graphs**
- **Mem0** is a type of **Extracted memory**
- **Letta** is a type of **Extracted memory**
- **mdfy** should adopt from **Community detection**
- **GraphRAG** enables capability **Multi-hop reasoning**
- **GraphRAG** uses for clustering **Community detection**

## Bundles containing this document
- [AI memory research: the public frontier](https://memory.wiki/b/wpwVCSDF)
  > Side-by-side notes on Mem0, Letta, Microsoft GraphRAG, Karpathy's LLM Wiki, llms.txt adoption.

_Hub canonical:_ https://memory.wiki/hub/demo
_Concept digest:_ https://memory.wiki/raw/hub/demo?digest=1&compact=1
