Project Acme — Architecture
Recreated outline. Original was lost to the SAMPLE_WELCOME race.
Overview
Acme is a post-call sales intelligence layer. Pipeline (high level):
audio in → transcription → extraction → CRM write
           (Whisper)       (Claude)     (Salesforce / HubSpot)
Components
Ingestion
- Cloud upload endpoint (POST /v1/calls)
- Webhook from Zoom / Gong / Chorus
- File-store: object storage (raw audio, retained 30d)
Transcription
- Provider: Whisper (self-hosted) or Deepgram (managed)
- Output: speaker-diarised transcript JSON
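A minimal sketch of what the speaker-diarised transcript JSON could look like. The field names (speaker, start, end, text), the "S0"/"S1" labels, and the helper functions are assumptions for illustration, not the actual provider output format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Segment:
    speaker: str  # hypothetical diarisation label, e.g. "S0"
    start: float  # seconds from the start of the audio
    end: float
    text: str


def to_transcript_json(call_id: str, segments: list[Segment]) -> str:
    """Serialise segments in time order; the JSON layout is illustrative."""
    ordered = sorted(segments, key=lambda s: s.start)
    return json.dumps({"call_id": call_id, "segments": [asdict(s) for s in ordered]})


def speaker_turns(segments: list[Segment]) -> list[tuple[str, str]]:
    """Merge consecutive segments from the same speaker into single turns."""
    turns: list[tuple[str, str]] = []
    for seg in sorted(segments, key=lambda s: s.start):
        if turns and turns[-1][0] == seg.speaker:
            turns[-1] = (seg.speaker, turns[-1][1] + " " + seg.text)
        else:
            turns.append((seg.speaker, seg.text))
    return turns
```

Merging segments into turns keeps the prompt for the extraction step shorter without losing who said what.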
Extraction
- LLM: Claude with structured output (JSON schema)
- Inputs: transcript, account context, recent CRM state
- Outputs: follow_ups[], decisions[], next_steps[]
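One way the structured output could be pinned down, as a sketch. The schema below assumes each of the three output arrays holds plain strings; the real item shape is not specified in this outline. EXTRACTION_SCHEMA and parse_extraction are hypothetical names, and the check is hand-rolled with the standard library rather than a JSON Schema validator.

```python
import json

# Illustrative JSON Schema for the extraction output.
# Assumption: items are plain strings; the outline does not define their shape.
EXTRACTION_SCHEMA = {
    "type": "object",
    "properties": {
        kind: {"type": "array", "items": {"type": "string"}}
        for kind in ("follow_ups", "decisions", "next_steps")
    },
    "required": ["follow_ups", "decisions", "next_steps"],
    "additionalProperties": False,
}


def parse_extraction(raw: str) -> dict[str, list[str]]:
    """Parse model output and reject anything that does not match the schema."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("extraction must be a JSON object")
    required = EXTRACTION_SCHEMA["required"]
    missing = [k for k in required if k not in data]
    extra = [k for k in data if k not in required]
    if missing or extra:
        raise ValueError(f"missing={missing} extra={extra}")
    for key in required:
        items = data[key]
        if not isinstance(items, list) or not all(isinstance(i, str) for i in items):
            raise ValueError(f"{key} must be a list of strings")
    return data
```

Rejecting malformed output before it reaches the approval queue keeps reviewers from seeing partial or mis-typed extractions.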
CRM bridge
- OAuth tokens per tenant (Salesforce, HubSpot)
- Idempotent write with external ID = call_id
- Approval queue: nothing writes without human OK in v1
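The approval gate and the idempotent write can be sketched together. FakeCRM is an in-memory stand-in, not a real Salesforce or HubSpot client, and ApprovalQueue, submit, and approve are hypothetical names. The point is only the two properties from the list above: nothing is written before approval, and writes keyed by call_id never create duplicates.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCRM:
    """In-memory stand-in for a CRM client; upsert is keyed by external ID."""
    records: dict[str, dict] = field(default_factory=dict)
    write_count: int = 0

    def upsert(self, external_id: str, payload: dict) -> None:
        self.write_count += 1
        self.records[external_id] = payload  # same key overwrites, never duplicates


@dataclass
class ApprovalQueue:
    crm: FakeCRM
    pending: dict[str, dict] = field(default_factory=dict)

    def submit(self, call_id: str, payload: dict) -> None:
        """Queue an extraction; nothing reaches the CRM at this point."""
        self.pending[call_id] = payload

    def approve(self, call_id: str, approver_id: str) -> bool:
        """Write to the CRM once a human approves. Returns False if nothing was pending."""
        payload = self.pending.pop(call_id, None)
        if payload is None:
            return False
        self.crm.upsert(external_id=call_id, payload={**payload, "approver_id": approver_id})
        return True
```

A repeated approve for the same call is a no-op here, and a retried upsert with the same call_id overwrites the existing record instead of adding a second one.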
Data model
calls(id, tenant_id, audio_url, status, transcript_id, ...)
extractions(id, call_id, kind, payload, status, approver_id)
integrations(tenant_id, provider, refresh_token, ...)
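The three tables above could be expressed as DDL along these lines. This is a sketch using SQLite: the column types, NOT NULL constraints, keys, and the open_db helper are all assumptions, and the columns elided with "..." in the outline are left out rather than guessed.

```python
import sqlite3

# Only the columns named in the outline; elided ("...") columns are omitted.
# Types and constraints are assumptions for illustration.
SCHEMA = """
CREATE TABLE calls (
    id            TEXT PRIMARY KEY,
    tenant_id     TEXT NOT NULL,
    audio_url     TEXT,
    status        TEXT NOT NULL,
    transcript_id TEXT
);
CREATE TABLE extractions (
    id          TEXT PRIMARY KEY,
    call_id     TEXT NOT NULL REFERENCES calls(id),
    kind        TEXT NOT NULL,   -- e.g. follow_ups, decisions, next_steps
    payload     TEXT NOT NULL,   -- JSON-encoded extraction
    status      TEXT NOT NULL,
    approver_id TEXT             -- NULL until a human approves
);
CREATE TABLE integrations (
    tenant_id     TEXT NOT NULL,
    provider      TEXT NOT NULL,
    refresh_token TEXT NOT NULL,
    PRIMARY KEY (tenant_id, provider)
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the schema on a fresh database with foreign keys enforced."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

With foreign keys on, an extraction cannot reference a call that does not exist, which matches the call_id link implied by the outline.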
Open questions
- Streaming vs batch transcription
- Multi-tenant inference vs per-tenant model pinning
- Retention: raw audio vs transcript vs extracted-only