Bundle & Doc URL enrichment — verified at 100% cross-AI

"Hub URLs hit 100% across Claude / OpenAI / Gemini. Bundle URLs and individual doc URLs need the same structural pass — and the same measurement — before they earn the same claim."

🎯 Result (Round 5, MWBench)

10 bundle queries × 3 runners × 2 modes = 60 cells, all 100%. 10 doc queries × 3 runners × 2 modes = 60 cells, all 100%. Cross-AI consistency: 100% on every URL shape and mode.

Compact mode is 8–9× cheaper in input tokens than full for bundles. Doc URLs are small enough that compact ≈ full in size (compactMarkdown only strips whitespace; nothing to strip on a short doc).

The shape problem

A Memory.Wiki user can paste three URL shapes into an AI:

URL Scope State before this work
/hub/<slug> Whole knowledge hub ✅ Enriched (concept_index + concept_relations + bundle AI graphs + all-docs catalog + per-doc gist + skeleton)
/b/<id> Curated bundle of docs ⚠ Had bundle AI graph (themes/insights/connections/concepts), but per-doc digest was just 1. [title](url) — annotation — no body signal
/<id> Single document ⚠ Just frontmatter + raw markdown body. No concepts, no relations, no parent bundles, no hub pointer

What changed

Bundle (/raw/bundle/<id>)

Digest mode now mirrors the hub all-docs catalog:

### N. [Doc title](https://memory.wiki/<id>) > annotation (if any) gist (Facts → summary → firstParagraph chain) *sections:* H2 headings + first-line skeleton

Bundle still leads with its graph_data (themes / cross-doc insights / key takeaways / connections / concepts / concept relations) — that's the bundle's distinct value. Now followed by enriched per-doc digest so an AI fetching the bundle URL has answer-ready material without fetching each member doc individually.

Single doc (/raw/<id>)

Body unchanged, but a knowledge-graph context block appended:

--- title: ... url: ... hub: https://memory.wiki/hub/<slug> bundle_count: N concept_count: M --- [body markdown] --- ## Concepts in this document - **<concept>** _(type)_ description ## Concept relations (within this doc's concepts) - **<concept A>** relation_label **<concept B>** ## Bundles containing this document - [<bundle title>](https://memory.wiki/b/<id>) > description _Hub canonical:_ https://memory.wiki/hub/<slug> · _Concept digest:_ https://memory.wiki/raw/hub/<slug>?digest=1&compact=1

An AI landing on a single doc URL now knows:

  • Which concepts the doc touches (from hub-wide concept_index, filtered by doc_ids)
  • How those concepts relate to each other (from concept_relations)
  • Which bundles this doc lives in (so it can fetch a richer multi-doc payload if needed)
  • The owner's hub canonical (so it can fetch the whole graph if the question demands it)

Bench validation

Bundle queries (10 across 5 bundles)

Bundle Sample query
ggAzbcHr — Knowledge Graph as AI-Native Infrastructure "What MRR target does the v7 business plan project at 12 months?"
9FATHAnw — mdfy Foundations "What are the three URL primitives mdfy uses?"
Ih0eUczw — Phase 2 strategy "What is the strongest single recommendation from the native skills strategy?"
4OGRyHs9 — case studies "How does mdfy help onboarding every new Claude Code session?"
rN2L-MvM — Memory Wiki v7 "What three-tier architecture does memory.wiki use?"

All 10 → 100% on Claude / OpenAI / Gemini × full / compact.

Doc queries (10 across 10 single docs)

Sample:

  • "How many moving parts does mdfy have according to the architecture doc?" → KRKz_MD-
  • "What Saudi Arabia national AI project did Hyunsang Cho work on?" → ZVfnXzCU
  • "How many slides are in the a16z speedrun pitch deck?" → gnEMFJgI
  • "What MRR target does memory.wiki v7 project for the 12-month horizon?" → qHc1FWxq

All 10 → 100% on Claude / OpenAI / Gemini × full / compact.

Why this matters

The hub URL is the "everything" payload — heavy. Most user pastes are scoped:

  • "Summarize this bundle" → user pastes /b/<id>. The bundle's AI graph + member-doc gists in one payload means the AI answers without 8 round-trips.
  • "What does this doc say about X" → user pastes /<id>. The body answers the doc question; the appended concept block tells the AI "this doc is part of concept graph Y, see also bundles Z" — so follow-up questions don't blow up cross-doc context.

Without this, bundle and doc URLs were structurally weaker than hub URLs — but most paste traffic in the wild is bundle / doc, not hub. Now all three shapes carry the same kind of structural signal at their own scope, and the bench confirms they all deliver 100% cross-AI parity.

Shared utility

Extracted extractFacts / firstParagraph / extractSkeleton / docGist to apps/web/src/lib/doc-gist.ts. Three routes (hub, bundle, doc) now share the same gist chain. Adding a new gist heuristic anywhere — e.g. ## Q&A block extraction, or LLM-generated long-form summaries — drops in one place and propagates.

Harness change

run-bench.mjs now takes per-query scope and scope_id. Each query carries its own URL shape; corpus is fetched per (scope, scope_id, mode) tuple and cached. Same harness handles hub, bundle, and doc bench cleanly — switch query file, same orchestration:

bash
node eval/run-bench.mjs --queries=queries/v1.jsonl # hub (20 queries) node eval/run-bench.mjs --queries=queries/bundles-v1.jsonl # bundle (10 queries) node eval/run-bench.mjs --queries=queries/docs-v1.jsonl # doc (10 queries)

Judge picks the right URL automatically based on run.scope + run.scope_id.

Facts

  • 10 bundle queries × 3 runners × 2 modes = 60 cells, all 100%
  • 10 doc queries × 3 runners × 2 modes = 60 cells, all 100%
  • Cross-AI consistency 100% across both URL shapes
  • Bundle compact 8–9× cheaper in input tokens vs bundle full
  • Doc URLs are small enough that compactMarkdown gives no size reduction
  • Shared apps/web/src/lib/doc-gist.ts powers gist chain across hub/bundle/doc routes
  • run-bench.mjs takes per-query scope + scope_id; same harness handles all three URL shapes
  • Combined with MWBench Round 4 for hub URLs, every URL shape Memory.Wiki exposes hits 100% cross-AI accuracy