---
title: "Decision: Anthropic Haiku for hub-recall reranker"
url: https://memory.wiki/sB3a5eOG
updated: 2026-05-14T18:15:49.480Z
hub: https://memory.wiki/hub/demo
concept_count: 12
source: "memory.wiki"
---
# Decision: Anthropic Haiku for hub-recall reranker

> Logged 2026-04-11.

## What we ship

Hybrid retrieval (BM25 + pgvector union, top-30) → Anthropic Haiku 4.5 reranks → top-k returned to caller.
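
A minimal sketch of that flow, under stated assumptions: the function and parameter names are hypothetical, the reranker is passed in as a plain callable rather than a real Haiku call, and "top-30" is read as a cap on the deduplicated union (the note does not say whether the cap applies per retriever or to the union).

```python
from typing import Callable, Sequence


def hybrid_recall(
    query: str,
    bm25_hits: Sequence[str],
    vector_hits: Sequence[str],
    rerank: Callable[[str, list[str]], list[float]],
    k: int = 5,
    pool_size: int = 30,
) -> list[str]:
    """Union two ranked candidate lists, rerank the pool, return the top k."""
    # Interleave the two lists so neither retriever dominates the capped pool,
    # dropping duplicates while keeping first-seen order.
    pool: list[str] = []
    seen: set[str] = set()
    for i in range(max(len(bm25_hits), len(vector_hits))):
        for hits in (bm25_hits, vector_hits):
            if i < len(hits) and hits[i] not in seen:
                seen.add(hits[i])
                pool.append(hits[i])
    pool = pool[:pool_size]

    # The reranker returns one score per candidate; higher means more relevant.
    scores = rerank(query, pool)
    ranked = sorted(zip(pool, scores), key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in ranked[:k]]


def toy_rerank(query: str, docs: list[str]) -> list[float]:
    """Stand-in scorer (NOT the real reranker): counts query words in each doc."""
    words = set(query.lower().split())
    return [float(len(words & set(d.lower().split()))) for d in docs]
```

In production the `rerank` callable would wrap the Haiku call. Keeping it injectable is also what makes it cheap to swap in another reranker for an eval.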

## Why Haiku, not Voyage rerank-2 / Cohere rerank-v3 / Mixedbread

Voyage rerank-2 is the obvious technical choice (it's a purpose-built reranker from a shop that specializes in retrieval models). I ran the eval anyway:

| Reranker | nDCG@5 (our eval set) | p95 latency | Price (USD per 1M tokens) |
|---|---:|---:|---:|
| Haiku 4.5 | 0.83 | 320ms | $1 in / $5 out |
| Voyage rerank-2 | 0.85 | 110ms | $0.50 |
| Cohere rerank-v3 | 0.84 | 180ms | $1.00 |
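
For reference, nDCG@5 for a single query can be computed as in the sketch below. It assumes graded relevance labels and the linear-gain form of DCG, and it assumes the input list covers every judged document for the query in the order the reranker returned them. The note does not say which nDCG variant the eval used, so treat this as one common formulation, not the eval's actual code.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k results (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 5) -> float:
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

The table figures would then be the mean of `ndcg_at_k` across the eval queries.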

On these numbers, Voyage and Cohere score slightly higher and are faster. So why Haiku?

- **Single-vendor story.** We already use Anthropic for capture summarisation and graph extraction. Adding a second model provider for *just* reranking costs more operationally than the marginal quality gain is worth.
- **The eval gap is inside noise.** Our eval set has 60 queries. The 0.02 nDCG@5 gap between Haiku and Voyage falls inside the bootstrap confidence interval, so we can't prove the difference is real at this scale.
- **Latency budget has room.** Our recall budget is 800ms p95 end-to-end. The reranker is 320ms of that. We're not against a wall.
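
The noise argument above rests on a bootstrap over per-query scores. One way to run that check is a paired bootstrap on the per-query nDCG differences, sketched here with only the standard library. The function name, the resample count, and the 95% level are assumptions for illustration; the note does not record how the original interval was computed.

```python
import random
from typing import Sequence


def paired_bootstrap_ci(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    n_resamples: int = 10_000,
    alpha: float = 0.05,
    seed: int = 0,
) -> tuple[float, float]:
    """Percentile bootstrap CI for mean(scores_a - scores_b), paired by query."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two equal-length, non-empty score lists")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # Resample queries with replacement and record the mean difference each time.
    means = sorted(sum(rng.choices(diffs, k=n)) / n for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper
```

Feed it the 60 per-query nDCG@5 scores for each reranker. If the returned interval contains 0, the gap cannot be distinguished from sampling noise at this eval size.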

## When I'd revisit

- If Voyage releases rerank-2.5 with a meaningful jump on the long-doc benchmark. Voyage has explicitly said they're working on it.
- If we ever need to serve recall to free-tier users at scale. At that volume, the gap between Voyage's $0.50 per 1M tokens and Haiku's $1 in / $5 out per 1M would compound enough to matter.
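
To put rough numbers on the cost trigger, the sketch below turns the per-1M-token prices from the table into a per-query cost. The prices come from the table; the token counts in the usage note are illustrative assumptions, not measurements from our pipeline, and the function names are hypothetical.

```python
def llm_rerank_cost(
    input_tokens: int,
    output_tokens: int,
    usd_in_per_m: float = 1.0,   # Haiku 4.5 input price from the table
    usd_out_per_m: float = 5.0,  # Haiku 4.5 output price from the table
) -> float:
    """USD cost of one rerank call billed separately for input and output tokens."""
    return (input_tokens * usd_in_per_m + output_tokens * usd_out_per_m) / 1_000_000


def flat_rerank_cost(tokens: int, usd_per_m: float = 0.50) -> float:
    """USD cost of one rerank call billed at a flat per-token rate (Voyage's table price)."""
    return tokens * usd_per_m / 1_000_000
```

For example, assuming 30 candidates of about 200 tokens each (6,000 input tokens) and about 100 output tokens, `llm_rerank_cost(6000, 100)` comes to $0.0065 per query against $0.003 for `flat_rerank_cost(6000)`. The absolute figures depend entirely on the assumed token counts, but the ratio shows why the gap only starts to matter at free-tier volume.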

Until then, single-vendor wins.


---

## Concepts in this document
- **Supabase** _(entity)_
  Backend-as-a-service providing authentication, database, and row-level security without separate auth overhead.
- **Vendor consolidation** _(concept)_
  The operational principle of reducing authentication surfaces, SDKs, and control planes by keeping vector search within the existing Postgres infrastructure.
- **pgvector** _(entity)_
  PostgreSQL extension providing vector data type and HNSW indexing for efficient similarity search.
- **Anthropic** _(entity)_
  Incumbent AI vendor mentioned as competitive threat with integrated memory and user lock-in.
- **Performance Optimization** _(concept)_
  Critical performance considerations driving technical architecture decisions.
- **AI-First Architecture** _(concept)_
  Design philosophy that assumes AI capabilities replace traditional manual processes.
- **Supabase Postgres** _(entity)_
  The existing database infrastructure that pgvector extends, serving as the single unified platform for relational and vector queries.
- **comrak** _(entity)_
  The Rust markdown parser selected for production use due to GFM compliance and performance.
- **Cost Optimization** _(tag)_
  Financial considerations in technology and vendor selection decisions.
- **Rust + WASM** _(entity)_
  Core technology stack providing performance advantages for markdown rendering.
- **Anthropic Haiku 4.5** _(entity)_
  The selected reranker model chosen to rerank hybrid retrieval results in the hub-recall pipeline.
- **AI-First Design** _(concept)_
  Product philosophy prioritizing AI agent compatibility over traditional user patterns.

## Concept relations (within this doc's concepts)
- **AI-First Architecture** synergizes with **Vendor consolidation**
- **Performance Optimization** balanced priorities **Cost Optimization**
- **Vendor consolidation** reduces complexity costs **Cost Optimization**
- **Anthropic** supports strategy **Vendor consolidation**
- **Supabase** enables approach **Vendor consolidation**
- **Supabase** hosts **pgvector**
- **Supabase** exemplifies approach **Vendor consolidation**
- **Supabase** reduces complexity **Vendor consolidation**
- **Supabase** integrates **pgvector**
- **Supabase** exemplifies strategy **Vendor consolidation**
- **Vendor consolidation** guides **Supabase**

_Hub canonical:_ https://memory.wiki/hub/demo
_Concept digest:_ https://memory.wiki/raw/hub/demo?digest=1&compact=1