Bundle & Doc URL enrichment — verified at 100% cross-AI

"Hub URLs hit 100% across Claude / OpenAI / Gemini. Bundle URLs and individual doc URLs need the same structural pass — and the same measurement — before they earn the same claim."

🎯 Result (Round 5, MWBench)

10 bundle queries × 3 runners × 2 modes = 60 cells, all 100%. 10 doc queries × 3 runners × 2 modes = 60 cells, all 100%. Cross-AI consistency: 100% on every URL shape and mode.

Compact mode is 8–9× cheaper in input tokens than full for bundles. Doc URLs are small enough that compact ≈ full in size (compactMarkdown only strips whitespace; nothing to strip on a short doc).

The shape problem

A Memory.Wiki user can paste three URL shapes into an AI:

URL	Scope	State before this work
`/hub/<slug>`	Whole knowledge hub	✅ Enriched (concept_index + concept_relations + bundle AI graphs + all-docs catalog + per-doc gist + skeleton)
`/b/<id>`	Curated bundle of docs	⚠ Had bundle AI graph (themes/insights/connections/concepts), but per-doc digest was just `1. [title](url) — annotation` — no body signal
`/<id>`	Single document	⚠ Just frontmatter + raw markdown body. No concepts, no relations, no parent bundles, no hub pointer

What changed

Bundle (`/raw/bundle/<id>`)

Digest mode now mirrors the hub all-docs catalog:


### N. [Doc title](https://memory.wiki/<id>)
> annotation (if any)
gist (Facts → summary → firstParagraph chain)
*sections:* H2 headings + first-line skeleton

Bundle still leads with its graph_data (themes / cross-doc insights / key takeaways / connections / concepts / concept relations) — that's the bundle's distinct value. Now followed by enriched per-doc digest so an AI fetching the bundle URL has answer-ready material without fetching each member doc individually.

Single doc (`/raw/<id>`)

Body unchanged, but a knowledge-graph context block appended:


---
title: ...
url: ...
hub: https://memory.wiki/hub/<slug>
bundle_count: N
concept_count: M
---

[body markdown]

---

## Concepts in this document
- **<concept>** _(type)_
  description

## Concept relations (within this doc's concepts)
- **<concept A>** relation_label **<concept B>**

## Bundles containing this document
- [<bundle title>](https://memory.wiki/b/<id>)
  > description

_Hub canonical:_ https://memory.wiki/hub/<slug> · _Concept digest:_ https://memory.wiki/raw/hub/<slug>?digest=1&compact=1

An AI landing on a single doc URL now knows:

Which concepts the doc touches (from hub-wide concept_index, filtered by doc_ids)
How those concepts relate to each other (from concept_relations)
Which bundles this doc lives in (so it can fetch a richer multi-doc payload if needed)
The owner's hub canonical (so it can fetch the whole graph if the question demands it)

Bench validation

Bundle queries (10 across 5 bundles)

Bundle	Sample query
`ggAzbcHr` — Knowledge Graph as AI-Native Infrastructure	"What MRR target does the v7 business plan project at 12 months?"
`9FATHAnw` — mdfy Foundations	"What are the three URL primitives mdfy uses?"
`Ih0eUczw` — Phase 2 strategy	"What is the strongest single recommendation from the native skills strategy?"
`4OGRyHs9` — case studies	"How does mdfy help onboarding every new Claude Code session?"
`rN2L-MvM` — Memory Wiki v7	"What three-tier architecture does memory.wiki use?"

All 10 → 100% on Claude / OpenAI / Gemini × full / compact.

Doc queries (10 across 10 single docs)

Sample:

"How many moving parts does mdfy have according to the architecture doc?" → KRKz_MD-
"What Saudi Arabia national AI project did Hyunsang Cho work on?" → ZVfnXzCU
"How many slides are in the a16z speedrun pitch deck?" → gnEMFJgI
"What MRR target does memory.wiki v7 project for the 12-month horizon?" → qHc1FWxq

All 10 → 100% on Claude / OpenAI / Gemini × full / compact.

Why this matters

The hub URL is the "everything" payload — heavy. Most user pastes are scoped:

"Summarize this bundle" → user pastes /b/<id>. The bundle's AI graph + member-doc gists in one payload means the AI answers without 8 round-trips.
"What does this doc say about X" → user pastes /<id>. The body answers the doc question; the appended concept block tells the AI "this doc is part of concept graph Y, see also bundles Z" — so follow-up questions don't blow up cross-doc context.

Without this, bundle and doc URLs were structurally weaker than hub URLs — but most paste traffic in the wild is bundle / doc, not hub. Now all three shapes carry the same kind of structural signal at their own scope, and the bench confirms they all deliver 100% cross-AI parity.

Shared utility

Extracted extractFacts / firstParagraph / extractSkeleton / docGist to apps/web/src/lib/doc-gist.ts. Three routes (hub, bundle, doc) now share the same gist chain. Adding a new gist heuristic anywhere — e.g. ## Q&A block extraction, or LLM-generated long-form summaries — drops in one place and propagates.

Harness change

run-bench.mjs now takes per-query scope and scope_id. Each query carries its own URL shape; corpus is fetched per (scope, scope_id, mode) tuple and cached. Same harness handles hub, bundle, and doc bench cleanly — switch query file, same orchestration:

bash
node eval/run-bench.mjs --queries=queries/v1.jsonl         # hub (20 queries)
node eval/run-bench.mjs --queries=queries/bundles-v1.jsonl # bundle (10 queries)
node eval/run-bench.mjs --queries=queries/docs-v1.jsonl    # doc (10 queries)

Judge picks the right URL automatically based on run.scope + run.scope_id.

Facts

10 bundle queries × 3 runners × 2 modes = 60 cells, all 100%
10 doc queries × 3 runners × 2 modes = 60 cells, all 100%
Cross-AI consistency 100% across both URL shapes
Bundle compact 8–9× cheaper in input tokens vs bundle full
Doc URLs are small enough that compactMarkdown gives no size reduction
Shared apps/web/src/lib/doc-gist.ts powers gist chain across hub/bundle/doc routes
run-bench.mjs takes per-query scope + scope_id; same harness handles all three URL shapes
Combined with MWBench Round 4 for hub URLs, every URL shape Memory.Wiki exposes hits 100% cross-AI accuracy