One URL.
Every AI.
100% verified.

memory.wiki delivers your knowledge to Claude, ChatGPT, and Gemini through a single URL. That single-URL, cross-AI delivery is the wedge. MWBench is the open eval that measures whether the wedge actually works, including on content the AI has never seen during training.

Headline result

Mode	raymindai (familiar hub)	mwbench-zorblax (synthetic, unseen)
Paste mode / full corpus: AI receives every doc body in the prompt	100%	100%
Paste mode / compact: 8–9× smaller payload (concept digest + skeleton)	100%	100%
Browse mode (AI fetches the URL): the real user scenario	98%	100%
Adversarial refusal: AI correctly refuses when the corpus lacks the answer	100%	n/a
Tool-use rate: did the AI actually fetch the URL when handed one	100%	100%

Three runners: claude-sonnet-4-6, gpt-5.5, gemini-3.5-flash. Judge: quote-evidence, requires a literal corpus quote per claim.

Two independent axes

Axis 1

Browse vs Paste

Paste: the bench tool fetches the URL itself and includes the body in the prompt. The AI reads what is in front of it, so this mode serves as an internal sanity check.

Browse: the AI gets only the URL plus a fetch_url tool. It decides to fetch, follows links inside the hub, then answers. This is what happens when a user pastes a memory.wiki URL into Claude.ai or ChatGPT.
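
The difference between the two modes can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the harness's code: the prompt wording, the `run_browse` loop, the scripted stand-in model, the in-memory `FAKE_WEB`, and the example URLs are all hypothetical. Only the tool name `fetch_url` comes from the description above.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical in-memory "web" (URL -> page body); stands in for real HTTP fetches.
FAKE_WEB: Dict[str, str] = {
    "https://example.test/hub": "Hub index. See https://example.test/hub/doc-1",
    "https://example.test/hub/doc-1": "The launch codename is BlueHeron.",
}

def fetch_url(url: str) -> str:
    """Stand-in for the fetch_url tool: returns the page body or an error string."""
    return FAKE_WEB.get(url, "ERROR: not found")

def build_paste_prompt(url: str, question: str) -> str:
    """Paste mode: the bench fetches the URL itself and inlines the body."""
    return f"Context:\n{fetch_url(url)}\n\nQuestion: {question}"

def build_browse_prompt(url: str, question: str) -> str:
    """Browse mode: the model sees only the URL; the body is never inlined."""
    return f"Use the fetch_url tool to read {url}, then answer.\n\nQuestion: {question}"

# A model step reads the transcript so far and returns either
# ("fetch", url) to call the tool or ("answer", text) to finish.
ModelStep = Callable[[List[str]], Tuple[str, str]]

def run_browse(model_step: ModelStep, url: str, question: str,
               max_steps: int = 5) -> Tuple[str, List[str]]:
    """Run the tool loop; return the final answer and the list of fetched URLs."""
    transcript = [build_browse_prompt(url, question)]
    fetched: List[str] = []
    for _ in range(max_steps):
        action, value = model_step(transcript)
        if action == "answer":
            return value, fetched
        # The model chose to call the tool: fetch and append the result.
        fetched.append(value)
        transcript.append(f"fetch_url({value}) -> {fetch_url(value)}")
    return "", fetched  # step budget exhausted without an answer

def scripted_model(transcript: List[str]) -> Tuple[str, str]:
    """Toy model: fetch the hub, follow its one link, then answer."""
    if len(transcript) == 1:
        return ("fetch", "https://example.test/hub")
    if len(transcript) == 2:
        return ("fetch", "https://example.test/hub/doc-1")
    return ("answer", "The launch codename is BlueHeron.")

answer, fetched = run_browse(scripted_model, "https://example.test/hub",
                             "What is the launch codename?")
tool_used = len(fetched) > 0  # what a tool-use rate would count per query
```

In this sketch the paste prompt already contains the document body, while the browse prompt contains only the URL, so a correct browse answer can only come from the fetches recorded in `fetched`.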

Axis 2

Familiar vs Unseen

Familiar (raymindai): a public hub that may have been crawled into AI training data. Some of the accuracy could be memorization.

Unseen (mwbench-zorblax): a synthetic hub seeded for this test. Every fact is fictional (ZorblaxCorp, CipherPlate v3.4.1, Talia Renford), so none of them exists anywhere in AI training data. Only the URL fetch can produce correct answers.

Browse × Unseen is the only cell that fully isolates the wedge: the AI must fetch (Browse), and memorization is impossible (Unseen). A 100% score across Claude, GPT, and Gemini means the cross-AI URL delivery model genuinely works, not just on content the AI happened to memorize.

We don't bench every hub

The cross-AI wedge is proven at the system level, not per hub. The unseen-hub result (100% on content the AIs have never seen) depends on how memory.wiki delivers a hub, not on what any one hub contains, so every hub built on memory.wiki inherits the same property automatically. Re-running the bench on every customer hub would repeat a proof we've already given.

The harness, the data, and the deeper write-ups are below for anyone who wants to audit the claim or run it themselves.

Methodology

Three runners. Each query runs through claude-sonnet-4-6 (1M context), gpt-5.5, and gemini-3.5-flash. Same prompt template, same tool spec for browse mode (fetch_url), independent API calls.

Quote-evidence judge. The judge model (claude-sonnet-4-6) is given the runner's full corpus and must produce a literal quote from that corpus for every substantive claim in the answer. The score is the share of claims supported by such a quote. No “this sounds like hallucination” guesswork. Every percentage point is auditable.

Cross-doc synthesis is allowed. A claim is grounded if it appears anywhere in the runner's corpus, not just in the doc the query targets. Mirrors how real users ask multi-doc questions.
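
The scoring rule described above (a literal quote per claim, grounded anywhere in the corpus) can be sketched as follows. This is a minimal illustration, not the judge's actual implementation: the function names, the `(claim, quote)` data shape, the exact-substring match, and the choice to return `None` for an answer with no claims are all assumptions, and the corpus text is made up.

```python
from typing import List, Optional, Tuple

def is_grounded(quote: str, corpus_docs: List[str]) -> bool:
    """A quote counts only if it is non-empty and appears verbatim in some doc.

    Checking every doc, not just the targeted one, is what permits
    cross-doc synthesis.
    """
    return bool(quote) and any(quote in doc for doc in corpus_docs)

def score_answer(claims: List[Tuple[str, str]],
                 corpus_docs: List[str]) -> Optional[float]:
    """Return supported claims / total claims.

    claims: (claim_text, supporting_quote) pairs as a judge might emit them.
    Returns None when there are no claims to grade (an assumption of this sketch).
    """
    if not claims:
        return None
    supported = sum(1 for _, quote in claims if is_grounded(quote, corpus_docs))
    return supported / len(claims)

# Hypothetical two-document corpus.
corpus = [
    "Doc A: The launch codename is BlueHeron.",
    "Doc B: The review meeting is on Thursday.",
]

claims = [
    ("The codename is BlueHeron", "The launch codename is BlueHeron."),  # found in Doc A
    ("The review is on Thursday", "The review meeting is on Thursday."),  # found in Doc B
    ("The budget is $2M", "The budget is $2M."),                          # quote not in corpus
    ("The team has 12 people", ""),                                       # no quote offered
]

score = score_answer(claims, corpus)  # 2 supported out of 4 claims
```

Because each supported claim carries the quote that grounds it, anyone auditing a score can check the quotes against the corpus directly.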

Adversarial subset. Five queries ask for facts that are NOT in the corpus (someone's home address, an unannounced acquisition, etc.). An empty answer is treated as an implicit refusal. This catches the classic “AI made something up rather than admitting it didn't know” failure mode.
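
A refusal check along these lines might look like the sketch below. Only the empty-answer rule comes from the description above. The phrase list in `REFUSAL_MARKERS`, the function names, and the sample answers are hypothetical, and the real harness may detect explicit refusals differently.

```python
from typing import List

# Hypothetical refusal phrases; the source specifies only the empty-answer rule.
REFUSAL_MARKERS = ("not in the corpus", "i don't know", "cannot find")

def is_refusal(answer: str) -> bool:
    """Treat an empty or whitespace-only answer as an implicit refusal."""
    text = answer.strip().lower()
    if not text:
        return True
    # Assumed extension: also accept a few explicit refusal phrasings.
    return any(marker in text for marker in REFUSAL_MARKERS)

def refusal_rate(answers: List[str]) -> float:
    """Share of adversarial queries on which the model refused."""
    if not answers:
        raise ValueError("need at least one answer")
    return sum(is_refusal(a) for a in answers) / len(answers)

# Five made-up answers to adversarial queries; the last one is a fabrication.
answers = [
    "",
    "   ",
    "That address is not in the corpus.",
    "I don't know.",
    "She lives at 12 Elm Street.",
]

rate = refusal_rate(answers)  # 4 refusals out of 5 answers
```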

Reproducible. The harness is in the /eval directory of github.com/raymindai/memory-wiki. Re-run any round with node eval/run-bench.mjs or node eval/run-browse-bench.mjs.

The full write-ups

MWBench v1, main write-up

9 rounds, 8 production deploys, ~600 bench cells. Compact 33% to 100%, paste 100%, browse 90% to 100%, adversarial 100%, tool-use 100%.

Rounds 6-7: an honest measurement of browse mode

Real-world scenario: the AI receives only a URL and a fetch_url tool. It discovers, fetches, and answers, all without the corpus in the prompt.

Bundle & Doc URL enrichment

How the three URL shapes (hub / bundle / doc) carry knowledge-graph signal at their own scope.

Try it yourself

Sign up at memory.wiki, capture five docs from any AI chat, and paste your hub URL into Claude.ai or ChatGPT. The AI will fetch, read, and answer, even on content it has never seen during training.

Start free
