Human overview · for understanding

Meeting-Transcription Pipeline — human overview

Fireflies HYBRID smart relay · built, merged to main, live on v102 · 2026-06-24

Master summary — the gist in 30 seconds

TL;DR: Every sales meeting now gets recorded, transcribed, summarized in Hungarian, and filed against the right lead: bot-free for small calls (up to 3 people), with a notetaker bot auto-invited only for bigger ones. Built, tested (558 tests green), and live on the main link.

Inputs: a Google Meet call (or a recorded audio file, or a pasted transcript). Outputs: the transcript + a Hungarian AI summary on the lead's deal card AND a full row in a dedicated Notion 'Call Recordings & Transcripts' database — never anything auto-sent to the lead.

Why this matters: The capture step used to be half-built: the live-bot path was dead code with a crash-on-paste landmine. Now the system is resilient, with four capture routes, idempotent storage, and graceful degradation when an account prerequisite isn't set yet, so no meeting silently goes unrecorded.
flowchart LR
  A[Meet call / upload / paste] --> B{how many<br/>humans?}
  B -->|≤3| C[bot-free SDK]
  B -->|>3| D[Notetaker bot]
  C --> E[Fireflies webhook]
  D --> E
  E --> F[match the deal]
  F --> G[store + HU summary]
  G --> H[board pill]
  G --> I[Notion calls DB]

1 · The smart relay (≤3 SDK / >3 bot)

TL;DR: A router counts the real humans and picks the cheapest capture that works.

Input: the Meet participant list. Output: a routing decision — 'sdk' (no bot, the common 1:1 case), 'bot' (auto-invite the notetaker for >3, since the bot-free SDK caps at 3), or 'none' (only one person = a holding room).

Why it matters: Most calls are small and deserve an invisible recording; big calls still get captured rather than silently truncated. One decision function makes the routing predictable and testable.
flowchart TD
  P[participants] --> N{distinct<br/>humans}
  N -->|<2| X[none]
  N -->|2-3| S[sdk · bot-free]
  N -->|>3| B[bot · addToLiveMeeting]
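
The routing rule above can be sketched as a single pure function. This is a minimal illustration, not the project's actual router: the participant shape (`email`, `is_bot`) and the name `route_capture` are assumptions made for the example; only the thresholds and the three route names come from this overview.

```python
from typing import Iterable, Literal

Route = Literal["sdk", "bot", "none"]

SDK_MAX_HUMANS = 3  # the bot-free SDK caps at 3, per this overview


def route_capture(participants: Iterable[dict]) -> Route:
    """Pick a capture route from the distinct human participants.

    Each participant is a dict with a hypothetical "email" key and an
    optional hypothetical "is_bot" flag; the real payload may differ.
    """
    humans = {
        p["email"].strip().lower()          # dedupe the same person joining twice
        for p in participants
        if p.get("email") and not p.get("is_bot", False)
    }
    if len(humans) < 2:
        return "none"                       # a holding room, nothing to record
    if len(humans) <= SDK_MAX_HUMANS:
        return "sdk"                        # bot-free capture
    return "bot"                            # auto-invite the notetaker
```

Keeping the decision free of I/O is what makes it easy to cover with table-style tests.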

2 · Single-flight bot dispatch (the landmine, fixed)

TL;DR: The bot is invited exactly once per meeting, using the store's atomic claim rather than a method that never existed.

Input: repeated 'participant joined' events for the same meeting. Output: one bot invite, then no-ops. The guard is store().put_if_absent('meet_bot_sent:URL') — the documented store().set(ttl=) would have thrown AttributeError and never dispatched.

Why it matters: Webhooks retry. Without an atomic claim you either double-invite (and hit Fireflies' 3-per-20-min cap) or, with the broken snippet, never invite at all. This is the difference between 'reliable' and 'mysteriously flaky'.
flowchart LR
  E1[event #1] --> C{claim<br/>meet_bot_sent}
  C -->|new| D[dispatch once]
  E2[event #2] --> C
  C -->|exists| K[skip]
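
The single-flight guard can be illustrated with an in-memory stand-in for the store. This is a sketch under stated assumptions: `MemoryStore` and `on_participant_joined` are hypothetical names, and the real `put_if_absent` signature and return value are not shown in this overview; here it is assumed to return `True` only for the caller that creates the key.

```python
import threading
from typing import Callable


class MemoryStore:
    """Hypothetical in-memory stand-in for the real store."""

    def __init__(self) -> None:
        self._data: dict = {}
        self._lock = threading.Lock()

    def put_if_absent(self, key: str, value: object = True) -> bool:
        # Check-and-set under one lock so only one caller can win the claim.
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True


def on_participant_joined(
    store: MemoryStore, meeting_url: str, dispatch: Callable[[str], None]
) -> bool:
    """Invite the bot at most once per meeting; return True if dispatched."""
    if not store.put_if_absent(f"meet_bot_sent:{meeting_url}"):
        return False  # an earlier event already claimed this meeting
    dispatch(meeting_url)
    return True
```

In this sketch the claim is taken before dispatching, so a retried event becomes a no-op even if it arrives while the first invite is still in flight.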

3 · Matching a transcript to the right lead

TL;DR: Use the strongest signal first: the deal id we round-tripped, then calendar, then email.

Input: a finished transcript's webhook. Output: the exact deal. New: read clientReferenceId (the deal id we already pass on upload) BEFORE the email→CRM→meeting-id chain, so an aliased or unknown attendee email no longer loses the match.

Why it matters: Email-only matching silently dropped calls where the lead joined under a different address. Carrying our own deal id removes that whole failure class for the upload and bot-relay paths.
flowchart TD
  W[webhook] --> R1{clientReferenceId?}
  R1 -->|yes| D[deal]
  R1 -->|no| R2{cal_id / email}
  R2 -->|hit| D
  R2 -->|miss| SL[Slack flag · link by hand]
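
The matching order can be sketched as a short fallback chain. Only the `clientReferenceId` field name comes from this overview; `match_deal`, the lookup callables, and the `flag_unmatched` hook (standing in for the Slack flag) are hypothetical names used for illustration.

```python
from typing import Callable, Iterable, Optional

Lookup = Callable[[dict], Optional[str]]


def match_deal(
    payload: dict,
    fallbacks: Iterable[Lookup],
    flag_unmatched: Callable[[dict], None],
) -> Optional[str]:
    """Resolve a webhook payload to a deal id, strongest signal first."""
    # 1. The deal id we passed on upload and got back on the webhook.
    ref = payload.get("clientReferenceId")
    if ref:
        return str(ref)
    # 2. Weaker signals in order (for example calendar id, then email).
    for lookup in fallbacks:
        deal_id = lookup(payload)
        if deal_id:
            return deal_id
    # 3. Nothing matched: flag for manual linking instead of dropping the call.
    flag_unmatched(payload)
    return None
```

Passing the fallbacks as an ordered list keeps the priority explicit and lets each lookup be tested on its own.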

4 · The Notion calls database

TL;DR: Every call becomes one idempotent, lead-linked row with the full transcript as the page body.

Input: a matched + summarized call. Output: a row in 'Call Recordings & Transcripts' — Lead→CRM relation, recording URL, duration, source, Hungarian AI summary, status, and the diarized transcript as markdown. Deduped on Transcript ID so a retried webhook never doubles it.

Why it matters: The deal card only ever held the summary; the raw transcript had nowhere to live. The calls DB is now the system of record for the full conversation, searchable and linked back to the CRM.
flowchart LR
  T[transcript] --> Q{Transcript ID<br/>seen?}
  Q -->|yes| K[skip]
  Q -->|no| C[create row]
  C --> L[Lead → CRM]
  C --> M[markdown body]
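
The dedupe-then-create step can be illustrated without touching the Notion API. The sketch below uses an in-memory dict keyed by Transcript ID; `CallsDB`, `create_if_new`, the field names, and the markdown layout of the diarized turns are all assumptions for the example, not the project's schema.

```python
from typing import Dict, List


def transcript_to_markdown(turns: List[dict]) -> str:
    """Render diarized turns as markdown, one paragraph per speaker turn."""
    return "\n\n".join(f"**{t['speaker']}:** {t['text']}" for t in turns)


class CallsDB:
    """Hypothetical in-memory stand-in for the calls database."""

    def __init__(self) -> None:
        self.rows: Dict[str, dict] = {}

    def create_if_new(self, call: dict) -> bool:
        """Create one row per Transcript ID; return False if already stored."""
        transcript_id = call["transcript_id"]
        if transcript_id in self.rows:
            return False  # retried webhook: skip instead of doubling the row
        self.rows[transcript_id] = {
            "lead": call["deal_id"],            # relation back to the CRM
            "summary_hu": call["summary_hu"],   # Hungarian AI summary
            "body": transcript_to_markdown(call["turns"]),
        }
        return True
```

Against a real database the same shape would be a query on the Transcript ID property followed by a create only on a miss.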

5 · What's live now vs what you must unlock

TL;DR: The floor is live today; the invisible ≤3 capture waits on Workspace admin approval and a Fireflies plan tier.

Input: your account actions. Output: full automation. Live now: upload + paste → Notion row, the relay routing, idempotency. Blocked on you: Workspace admin approval for the bot-free Meet SDK, the account-level Fireflies webhook, and the Meet participant-joined subscription.

Why it matters: Claude can't grant Google admin consent or pick a paid plan tier. The build is honest about that: it fails CLOSED (no crash) when a prerequisite is missing and degrades to the always-on upload/relay floor.
flowchart TD
  NOW[live v102: upload · paste · relay routing] --> OK[usable today]
  OWN[owner: admin + plan + webhook + sub] --> SDK[≤3 bot-free auto-capture]
  OWN --> TRG[automatic >3 live trigger]
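
The fail-closed behaviour can be sketched as a capability check. This is an illustration only: the flag names, the capability labels, and the all-or-nothing gate (every owner prerequisite must be set before either automatic route unlocks, as the diagram above suggests) are assumptions, not the build's actual configuration.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple


@dataclass(frozen=True)
class Prerequisites:
    """Hypothetical owner-side switches; all default to 'not done yet'."""
    workspace_admin_approved: bool = False
    fireflies_plan_tier: bool = False
    fireflies_webhook: bool = False
    meet_join_subscription: bool = False


FLOOR = ("upload", "paste", "relay_routing")        # always available
UNLOCKED = ("sdk_auto_capture", "live_bot_trigger")  # need every prerequisite


def available_capabilities(p: Prerequisites) -> Tuple[tuple, List[str]]:
    """Return (capabilities, missing prerequisites) without ever raising."""
    missing = [f.name for f in fields(p) if not getattr(p, f.name)]
    if missing:
        return FLOOR, missing   # fail closed: degrade to the floor
    return FLOOR + UNLOCKED, []
```

Returning the list of missing prerequisites alongside the capabilities gives the owner a concrete to-do list instead of a silent downgrade.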