Project Acme — Architecture

Recreated outline. Original was lost to the SAMPLE_WELCOME race.

Overview

Acme is a post-call sales intelligence layer. High-level data flow:

audio in → transcription (Whisper) → extraction (Claude) → CRM write (Salesforce / HubSpot)

Components

Ingestion

  • Cloud upload endpoint (POST /v1/calls)
  • Webhook from Zoom / Gong / Chorus
  • File store: object storage (raw audio, retained 30 days)
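
A minimal sketch of what the upload path might do once audio arrives, assuming an in-memory stand-in for object storage. The names InMemoryObjectStore, StoredAudio and ingest_call are hypothetical and not part of the original design; only the 30-day retention figure comes from the outline.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# From the outline: raw audio is retained for 30 days.
RAW_AUDIO_RETENTION = timedelta(days=30)


@dataclass
class StoredAudio:
    call_id: str
    tenant_id: str
    audio_url: str
    expires_at: datetime


class InMemoryObjectStore:
    """Hypothetical stand-in for real object storage."""

    def __init__(self):
        self.blobs = {}

    def put(self, key: str, data: bytes) -> str:
        self.blobs[key] = data
        return f"mem://{key}"


def ingest_call(store, tenant_id: str, audio: bytes, now=None) -> StoredAudio:
    """Store raw audio and record when it should be purged."""
    now = now or datetime.now(timezone.utc)
    call_id = str(uuid.uuid4())
    # Key layout (tenant prefix, then call ID) is an assumption.
    url = store.put(f"{tenant_id}/{call_id}.audio", audio)
    return StoredAudio(call_id, tenant_id, url, now + RAW_AUDIO_RETENTION)
```

Passing now explicitly keeps the retention arithmetic testable; a real handler behind POST /v1/calls would also need authentication and size limits, which this sketch leaves out.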

Transcription

  • Provider: Whisper (self-hosted) or Deepgram (managed)
  • Output: speaker-diarised transcript JSON
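
One possible shape for the speaker-diarised transcript JSON, sketched under the assumption that each segment carries a speaker label, start and end times in seconds, and text. The field names and the merge_adjacent helper are illustrative and not taken from Whisper or Deepgram output.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Segment:
    speaker: str   # assumed label, e.g. "rep" or "customer"
    start: float   # seconds from start of call
    end: float
    text: str


def merge_adjacent(segments):
    """Merge consecutive segments spoken by the same speaker."""
    merged = []
    for seg in segments:
        if merged and merged[-1].speaker == seg.speaker:
            last = merged[-1]
            merged[-1] = Segment(last.speaker, last.start, seg.end,
                                 f"{last.text} {seg.text}")
        else:
            merged.append(seg)
    return merged


def to_transcript_json(call_id: str, segments) -> str:
    """Serialise the transcript as a JSON document keyed by call ID."""
    return json.dumps({"call_id": call_id,
                       "segments": [asdict(s) for s in segments]})
```

Merging adjacent turns before extraction keeps the transcript shorter without losing who said what.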

Extraction

  • LLM: Claude with structured output (JSON schema)
  • Inputs: transcript, account context, recent CRM state
  • Outputs: follow_ups[], decisions[], next_steps[]
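
The three output arrays could be pinned down with a JSON schema along these lines. This is a sketch: the item type (plain strings) is an assumption, and parse_extraction is a hypothetical hand-rolled check on the model's raw JSON reply, not a call into any Claude SDK.

```python
import json

# Output keys come from the outline; string items are an assumption.
EXTRACTION_SCHEMA = {
    "type": "object",
    "properties": {
        "follow_ups": {"type": "array", "items": {"type": "string"}},
        "decisions": {"type": "array", "items": {"type": "string"}},
        "next_steps": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["follow_ups", "decisions", "next_steps"],
    "additionalProperties": False,
}


def parse_extraction(raw: str) -> dict:
    """Parse the model's reply and reject anything off-schema."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("extraction must be a JSON object")
    required = EXTRACTION_SCHEMA["required"]
    missing = [k for k in required if k not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    extra = sorted(set(data) - set(required))
    if extra:
        raise ValueError(f"unexpected keys: {extra}")
    for key in required:
        items = data[key]
        if not isinstance(items, list) or not all(isinstance(i, str) for i in items):
            raise ValueError(f"{key} must be a list of strings")
    return data
```

Rejecting off-schema replies before they reach the approval queue means reviewers only ever see well-formed extractions.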

CRM bridge

  • OAuth tokens per tenant (Salesforce, HubSpot)
  • Idempotent write with external ID = call_id
  • Approval queue: in v1, nothing is written to the CRM without human approval
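
How the approval gate and the idempotent write might fit together, as a sketch. FakeCRM and ApprovalQueue are hypothetical in-memory stand-ins; a real bridge would call the Salesforce or HubSpot API with the tenant's OAuth token.

```python
class FakeCRM:
    """Hypothetical stand-in for a CRM client with upsert-by-external-ID."""

    def __init__(self):
        self.records = {}
        self.write_count = 0

    def upsert(self, external_id: str, fields: dict) -> None:
        # Same external ID overwrites the same record, so retries are safe.
        self.write_count += 1
        self.records[external_id] = dict(fields)


class ApprovalQueue:
    """Holds extractions until a human approves them."""

    def __init__(self, crm):
        self.crm = crm
        self.pending = {}
        self.written = set()

    def submit(self, call_id: str, fields: dict) -> None:
        # Queued only; nothing reaches the CRM at this point.
        self.pending[call_id] = fields

    def approve(self, call_id: str, approver_id: str) -> bool:
        """Write once per call_id; return False if already written."""
        if call_id in self.written:
            return False
        fields = self.pending.pop(call_id)  # KeyError if never submitted
        self.crm.upsert(external_id=call_id,
                        fields={**fields, "approver_id": approver_id})
        self.written.add(call_id)
        return True
```

Using call_id as the external ID means a duplicate approval or a retried job updates the same CRM record instead of creating a second one.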

Data model

  • calls(id, tenant_id, audio_url, status, transcript_id, ...)
  • extractions(id, call_id, kind, payload, status, approver_id)
  • integrations(tenant_id, provider, refresh_token, ...)
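
The tables above could be prototyped in SQLite as follows. This is a sketch: column types, nullability, and the keys are assumptions, and the columns elided with "..." in the outline are left out rather than guessed.

```python
import sqlite3

# Only the columns named in the outline; types and constraints are assumed.
SCHEMA = """
CREATE TABLE calls (
    id            TEXT PRIMARY KEY,
    tenant_id     TEXT NOT NULL,
    audio_url     TEXT,
    status        TEXT NOT NULL,
    transcript_id TEXT
);
CREATE TABLE extractions (
    id          TEXT PRIMARY KEY,
    call_id     TEXT NOT NULL REFERENCES calls(id),
    kind        TEXT NOT NULL,
    payload     TEXT NOT NULL,  -- JSON-encoded extraction
    status      TEXT NOT NULL,
    approver_id TEXT            -- NULL until a human approves
);
CREATE TABLE integrations (
    tenant_id     TEXT NOT NULL,
    provider      TEXT NOT NULL,
    refresh_token TEXT NOT NULL,
    PRIMARY KEY (tenant_id, provider)
);
"""


def create_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketched schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

The composite key on integrations assumes one connection per tenant per provider; if a tenant can link several Salesforce orgs, that key would need to change.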

Open questions

  • Streaming vs batch transcription
  • Multi-tenant inference vs per-tenant model pinning
  • Retention: raw audio vs transcript vs extracted-only