Bundle & Doc URL enrichment — verified at 100% cross-AI
"Hub URLs hit 100% across Claude / OpenAI / Gemini. Bundle URLs and individual doc URLs need the same structural pass — and the same measurement — before they earn the same claim."
🎯 Result (Round 5, MWBench)
10 bundle queries × 3 runners × 2 modes = 60 cells, all 100%. 10 doc queries × 3 runners × 2 modes = 60 cells, all 100%. Cross-AI consistency: 100% on every URL shape and mode.
Compact mode is 8–9× cheaper in input tokens than full for bundles. Doc URLs are small enough that compact ≈ full in size (compactMarkdown only strips whitespace; nothing to strip on a short doc).
The shape problem
A Memory.Wiki user can paste three URL shapes into an AI:
| URL | Scope | State before this work |
|---|---|---|
/hub/<slug> |
Whole knowledge hub | ✅ Enriched (concept_index + concept_relations + bundle AI graphs + all-docs catalog + per-doc gist + skeleton) |
/b/<id> |
Curated bundle of docs | ⚠ Had bundle AI graph (themes/insights/connections/concepts), but per-doc digest was just 1. [title](url) — annotation — no body signal |
/<id> |
Single document | ⚠ Just frontmatter + raw markdown body. No concepts, no relations, no parent bundles, no hub pointer |
What changed
Bundle (/raw/bundle/<id>)
Digest mode now mirrors the hub all-docs catalog:
### N. [Doc title](https://memory.wiki/<id>)
> annotation (if any)
gist (Facts → summary → firstParagraph chain)
*sections:* H2 headings + first-line skeleton
Bundle still leads with its graph_data (themes / cross-doc insights / key takeaways / connections / concepts / concept relations) — that's the bundle's distinct value. Now followed by enriched per-doc digest so an AI fetching the bundle URL has answer-ready material without fetching each member doc individually.
Single doc (/raw/<id>)
Body unchanged, but a knowledge-graph context block appended:
---
title: ...
url: ...
hub: https://memory.wiki/hub/<slug>
bundle_count: N
concept_count: M
---
[body markdown]
---
## Concepts in this document
- **<concept>** _(type)_
description
## Concept relations (within this doc's concepts)
- **<concept A>** relation_label **<concept B>**
## Bundles containing this document
- [<bundle title>](https://memory.wiki/b/<id>)
> description
_Hub canonical:_ https://memory.wiki/hub/<slug> · _Concept digest:_ https://memory.wiki/raw/hub/<slug>?digest=1&compact=1
An AI landing on a single doc URL now knows:
- Which concepts the doc touches (from hub-wide
concept_index, filtered bydoc_ids) - How those concepts relate to each other (from
concept_relations) - Which bundles this doc lives in (so it can fetch a richer multi-doc payload if needed)
- The owner's hub canonical (so it can fetch the whole graph if the question demands it)
Bench validation
Bundle queries (10 across 5 bundles)
| Bundle | Sample query |
|---|---|
ggAzbcHr — Knowledge Graph as AI-Native Infrastructure |
"What MRR target does the v7 business plan project at 12 months?" |
9FATHAnw — mdfy Foundations |
"What are the three URL primitives mdfy uses?" |
Ih0eUczw — Phase 2 strategy |
"What is the strongest single recommendation from the native skills strategy?" |
4OGRyHs9 — case studies |
"How does mdfy help onboarding every new Claude Code session?" |
rN2L-MvM — Memory Wiki v7 |
"What three-tier architecture does memory.wiki use?" |
All 10 → 100% on Claude / OpenAI / Gemini × full / compact.
Doc queries (10 across 10 single docs)
Sample:
- "How many moving parts does mdfy have according to the architecture doc?" →
KRKz_MD- - "What Saudi Arabia national AI project did Hyunsang Cho work on?" →
ZVfnXzCU - "How many slides are in the a16z speedrun pitch deck?" →
gnEMFJgI - "What MRR target does memory.wiki v7 project for the 12-month horizon?" →
qHc1FWxq
All 10 → 100% on Claude / OpenAI / Gemini × full / compact.
Why this matters
The hub URL is the "everything" payload — heavy. Most user pastes are scoped:
- "Summarize this bundle" → user pastes
/b/<id>. The bundle's AI graph + member-doc gists in one payload means the AI answers without 8 round-trips. - "What does this doc say about X" → user pastes
/<id>. The body answers the doc question; the appended concept block tells the AI "this doc is part of concept graph Y, see also bundles Z" — so follow-up questions don't blow up cross-doc context.
Without this, bundle and doc URLs were structurally weaker than hub URLs — but most paste traffic in the wild is bundle / doc, not hub. Now all three shapes carry the same kind of structural signal at their own scope, and the bench confirms they all deliver 100% cross-AI parity.
Shared utility
Extracted extractFacts / firstParagraph / extractSkeleton / docGist to apps/web/src/lib/doc-gist.ts. Three routes (hub, bundle, doc) now share the same gist chain. Adding a new gist heuristic anywhere — e.g. ## Q&A block extraction, or LLM-generated long-form summaries — drops in one place and propagates.
Harness change
run-bench.mjs now takes per-query scope and scope_id. Each query carries its own URL shape; corpus is fetched per (scope, scope_id, mode) tuple and cached. Same harness handles hub, bundle, and doc bench cleanly — switch query file, same orchestration:
bashnode eval/run-bench.mjs --queries=queries/v1.jsonl # hub (20 queries)
node eval/run-bench.mjs --queries=queries/bundles-v1.jsonl # bundle (10 queries)
node eval/run-bench.mjs --queries=queries/docs-v1.jsonl # doc (10 queries)
Judge picks the right URL automatically based on run.scope + run.scope_id.
Facts
- 10 bundle queries × 3 runners × 2 modes = 60 cells, all 100%
- 10 doc queries × 3 runners × 2 modes = 60 cells, all 100%
- Cross-AI consistency 100% across both URL shapes
- Bundle compact 8–9× cheaper in input tokens vs bundle full
- Doc URLs are small enough that compactMarkdown gives no size reduction
- Shared
apps/web/src/lib/doc-gist.tspowers gist chain across hub/bundle/doc routes run-bench.mjstakes per-queryscope+scope_id; same harness handles all three URL shapes- Combined with MWBench Round 4 for hub URLs, every URL shape Memory.Wiki exposes hits 100% cross-AI accuracy