Project Acme — Architecture

Recreated outline. Original was lost to the SAMPLE_WELCOME race.

Overview

Acme is a post-call sales intelligence layer. The pipeline at a high level:


audio in → transcription         → extraction → CRM write
           (Whisper / Deepgram)    (Claude)     (Salesforce / HubSpot)
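
The flow above can be sketched as a chain of stages that each enrich a per-call context. This is a minimal illustration, not the actual implementation: CallContext, the stage functions and run_pipeline are hypothetical names, the stage bodies are placeholders, and the approval step described under CRM bridge is left out here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CallContext:
    """Hypothetical per-call state passed from stage to stage."""
    call_id: str
    audio_url: str
    transcript: Optional[dict] = None
    extraction: Optional[dict] = None
    crm_written: bool = False


Stage = Callable[[CallContext], CallContext]


def transcribe(ctx: CallContext) -> CallContext:
    # Placeholder: a real stage would call the transcription provider.
    ctx.transcript = {"segments": []}
    return ctx


def extract(ctx: CallContext) -> CallContext:
    if ctx.transcript is None:
        raise ValueError("extraction requires a transcript")
    # Placeholder: a real stage would call the LLM.
    ctx.extraction = {"follow_ups": [], "decisions": [], "next_steps": []}
    return ctx


def write_crm(ctx: CallContext) -> CallContext:
    if ctx.extraction is None:
        raise ValueError("CRM write requires an extraction")
    # Placeholder: a real stage would go through the CRM bridge.
    ctx.crm_written = True
    return ctx


PIPELINE: List[Stage] = [transcribe, extract, write_crm]


def run_pipeline(ctx: CallContext, stages: List[Stage]) -> CallContext:
    """Run the stages in order; each stage depends on the previous one."""
    for stage in stages:
        ctx = stage(ctx)
    return ctx
```

Calling run_pipeline(CallContext("call-1", "audio-location"), PIPELINE) walks one call through all three stages; running a stage out of order raises ValueError.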

Components

Ingestion

Cloud upload endpoint (POST /v1/calls)
Webhook from Zoom / Gong / Chorus
File store: object storage (raw audio, retained for 30 days)
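
A rough sketch of how the 30-day retention on raw audio might be checked. It assumes the window is counted from the upload time, which the outline does not state; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Retention window for raw audio, per the outline (30 days).
RAW_AUDIO_RETENTION = timedelta(days=30)


def audio_expires_at(uploaded_at: datetime) -> datetime:
    """Assumption: retention is measured from the upload timestamp."""
    return uploaded_at + RAW_AUDIO_RETENTION


def is_audio_expired(uploaded_at: datetime, now: Optional[datetime] = None) -> bool:
    """True once the retention window has fully elapsed."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= audio_expires_at(uploaded_at)
```

A periodic cleanup job could call is_audio_expired for each stored object and delete those that return True.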

Transcription

Provider: Whisper (self-hosted) or Deepgram (managed)
Output: speaker-diarised transcript JSON
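
One possible shape for the speaker-diarised transcript JSON, shown as a sketch. The field names (speaker, start_s, end_s, text, provider) are assumptions for illustration; the outline does not define the schema, and neither provider's native output is modelled here.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Segment:
    """One diarised span of speech (hypothetical field names)."""
    speaker: str
    start_s: float
    end_s: float
    text: str


@dataclass
class Transcript:
    call_id: str
    provider: str
    segments: List[Segment] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict recurses into the nested Segment dataclasses.
        return json.dumps(asdict(self))

    def speakers(self) -> List[str]:
        """Distinct speakers in order of first appearance."""
        seen: List[str] = []
        for seg in self.segments:
            if seg.speaker not in seen:
                seen.append(seg.speaker)
        return seen
```

Normalising both providers into one shape like this would keep the extraction step independent of which transcription backend produced the text.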

Extraction

LLM: Claude with structured output (JSON schema)
Inputs: transcript, account context, recent CRM state
Outputs: follow_ups[], decisions[], next_steps[]
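
A sketch of what the structured-output schema could look like, with a small hand-rolled check so the example has no third-party dependencies. Only the three top-level arrays come from the outline; the item shape (an object with a string "text") is an assumption, as are the names EXTRACTION_SCHEMA and validation_errors.

```python
from typing import Any, List

EXTRACTION_KINDS = ("follow_ups", "decisions", "next_steps")

# Assumed item shape: the outline only names the three arrays.
ITEM_SCHEMA = {
    "type": "object",
    "properties": {"text": {"type": "string"}},
    "required": ["text"],
}

# JSON Schema for the extraction payload, built from the three kinds.
EXTRACTION_SCHEMA = {
    "type": "object",
    "properties": {
        kind: {"type": "array", "items": ITEM_SCHEMA} for kind in EXTRACTION_KINDS
    },
    "required": list(EXTRACTION_KINDS),
    "additionalProperties": False,
}


def validation_errors(payload: Any) -> List[str]:
    """Return a list of problems; an empty list means the payload is acceptable."""
    if not isinstance(payload, dict):
        return ["payload must be an object"]
    errors: List[str] = []
    for kind in EXTRACTION_KINDS:
        items = payload.get(kind)
        if not isinstance(items, list):
            errors.append(f"{kind}: expected a list")
            continue
        for i, item in enumerate(items):
            if not (isinstance(item, dict) and isinstance(item.get("text"), str)):
                errors.append(f"{kind}[{i}]: expected an object with a string 'text'")
    extra = sorted(set(payload) - set(EXTRACTION_KINDS))
    if extra:
        errors.append(f"unexpected keys: {extra}")
    return errors
```

Checking the model output before it reaches the approval queue means malformed payloads can be rejected or retried rather than shown to a reviewer.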

CRM bridge

OAuth tokens per tenant (Salesforce, HubSpot)
Idempotent write with external ID = call_id
Approval queue: in v1, nothing is written to the CRM without human approval
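
The idempotent write and the approval gate can be illustrated together. This is a sketch against an in-memory stand-in, not a real Salesforce or HubSpot client: FakeCrm, NotApprovedError and write_extraction are hypothetical names, and approval is modelled simply as a non-empty approver_id.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class NotApprovedError(Exception):
    """Raised when a write is attempted without human approval."""


@dataclass
class FakeCrm:
    """In-memory stand-in for a CRM that supports upsert by external ID."""
    records: Dict[str, dict] = field(default_factory=dict)
    write_count: int = 0

    def upsert(self, external_id: str, payload: dict) -> None:
        # Keyed by external ID: a repeat write replaces, never duplicates.
        self.records[external_id] = dict(payload)
        self.write_count += 1


def write_extraction(
    crm: FakeCrm, call_id: str, payload: dict, approver_id: Optional[str]
) -> None:
    """Write an approved extraction, using call_id as the external ID."""
    if not approver_id:
        raise NotApprovedError(f"extraction for {call_id} has no approver")
    crm.upsert(external_id=call_id, payload=payload)
```

Retrying write_extraction for the same call_id, for example after a timeout, leaves a single CRM record, which is the property the external ID is meant to provide.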

Data model

calls(id, tenant_id, audio_url, status, transcript_id, ...)
extractions(id, call_id, kind, payload, status, approver_id)
integrations(tenant_id, provider, refresh_token, ...)
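
The three tables could be expressed as follows, using SQLite from the Python standard library purely for illustration. Column types, the key choices (including the composite key on integrations), and storing payload as JSON text are all assumptions; the columns elided with "..." in the outline remain elided.

```python
import sqlite3

SCHEMA = """
CREATE TABLE calls (
    id            TEXT PRIMARY KEY,
    tenant_id     TEXT NOT NULL,
    audio_url     TEXT,
    status        TEXT NOT NULL,
    transcript_id TEXT
    -- remaining columns are elided in the outline
);
CREATE TABLE extractions (
    id          TEXT PRIMARY KEY,
    call_id     TEXT NOT NULL REFERENCES calls(id),
    kind        TEXT NOT NULL,   -- e.g. follow_ups, decisions, next_steps
    payload     TEXT NOT NULL,   -- assumed: JSON stored as text
    status      TEXT NOT NULL,
    approver_id TEXT             -- NULL until a human approves
);
CREATE TABLE integrations (
    tenant_id     TEXT NOT NULL,
    provider      TEXT NOT NULL,
    refresh_token TEXT NOT NULL,
    -- remaining columns are elided in the outline
    PRIMARY KEY (tenant_id, provider)  -- assumed: one integration per provider per tenant
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the sketch schema and enable foreign-key enforcement."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

With foreign keys enabled, an extraction row cannot reference a call that does not exist, which keeps the approval queue tied to real calls.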

Open questions

Streaming vs batch transcription
Multi-tenant inference vs per-tenant model pinning
Retention: raw audio vs transcript vs extracted-only