ClientsFlow Pipeline · Merged Whole-Journey EBO + Live Visual-QA Transcript

Merged EBO — every scenario, every test

One running document that fuses the signed Expected-Behavior Oracle (what each scenario should do) with a live visual-QA transcript per step (what it actually did). UX-critique column dropped. Each scenario is a toggle; each step opens a transcript that runs in real order: ① "What should I see?" (before the shot) → 📸 screenshot → ② what to look for → ③ what I see (Gemini pixel report) → 🔬 Sonnet adjudicates whenever Gemini isn't a clean pass → ④ verdict → ⑤ next action.
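The transcript order above can be sketched as a small routing function. This is a minimal illustration, not the pipeline's actual code: `JudgeReport`, `StepTranscript`, `run_step`, and the three callables are hypothetical names, and the only behaviour taken from the text is that the second judge runs only when the first judge does not return a clean pass.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class JudgeReport:
    clean_pass: bool  # True only for an unqualified pass
    text: str         # the judge's written report


@dataclass
class StepTranscript:
    expectation: str               # (1) written down before the shot
    screenshot: str                # path to the captured frame
    checklist: str                 # (2) what to look for
    gemini: JudgeReport            # (3) first-judge pixel report
    sonnet: Optional[JudgeReport]  # None when no adjudication was needed


def run_step(
    expectation: str,
    checklist: str,
    take_screenshot: Callable[[], str],
    first_judge: Callable[[str, str], JudgeReport],
    second_judge: Callable[[str, str, JudgeReport], JudgeReport],
) -> StepTranscript:
    # The expectation is an input, so it is fixed before the frame is captured.
    shot = take_screenshot()
    gemini = first_judge(shot, checklist)
    sonnet = None
    if not gemini.clean_pass:
        # Adjudicate only when the first judge did not return a clean pass.
        sonnet = second_judge(shot, checklist, gemini)
    return StepTranscript(expectation, shot, checklist, gemini, sonnet)
```

Passing the judges in as callables keeps the sketch independent of any particular model client; the verdict and next-action steps would consume the returned `StepTranscript`.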

▦ LAYOUT A (chosen) — Integrated table · the EBO grid + a 🧪 Test column + an inline live-test transcript per step
Mockup preview — structure check. Showing the first 3 real scenarios (S1–S3, Studio↔Pipeline) so you can confirm the new live-reasoning transcript order before I replicate it across all 55 scenarios. S3 step 2 now demonstrates the key features: when a step fails, the toggle keeps multiple screenshots (Attempt 1 = BUG frame → Attempt 2 = PASS-after-fix frame), and the errors-log below indexes every defect. The transcript text + screenshots here are illustrative; in the full build they are filled from real visual-qa-ultra runs only (genuine model output, judged off the real PNG bytes — never fabricated).

⚠ Errors log — every defect found, fixed & re-shot (links to each scenario's full retry history)

Scenario | Step | Failed-pixel observation | Fix (builder sub-agent) | Attempts | Final
S3: booked on red-❗ → resume on URL save | 2 | ❗ cleared but no "🎨 Dizájn készül" chip; DOM showed no Studio project; URL-save did not re-fire create+kick | Wired URL-save → ensure_studio_project(); cleared studio_deferred in one transaction (TDD red→green); redeployed | 2 | ✓ green after fix

Demo shows 1 logged error; the overnight runner appends one row per defect across all 55 scenarios, each linking down to the scenario toggle where the failed + passing frames are kept.

PENDING_REVIEW run-trust 0.62 · 2 of 3 scenarios verified LIVE · 1 pending re-shoot · 55 scenarios total when fully merged · trust is in the asserted invariants, not "the system works"
You do → action | You should see → on-screen result | Element changes → copy · look · where | What changes underneath → data/state | Must NOT happen → bug guard | 🕓 Touchpoint history | 🧪 Test → honesty tag + verdict; expand for the live transcript (expect → shot → look → Gemini sees → 🔬 Sonnet adjudicates non-pass → verdict → next)
S1 · Enrichment guarantees a website URL; missing → red ❗ badge · guardrail / upstream · studio-S1 · bnf-r3-S8 · duo2-S14 · LIVE ✅ PASS

Who: System (lead enrichment) + Mátyás  ·  When: A new lead is enriched; enrichment must save a website_url. Runs upstream of Studio and gates everything downstream.

⚠ Merge conflict — 3 source EBOs disagreed on the missing-URL signal
studio-S1: Raises a red ❗ badge pinned top-right of the card; the badge is the only thing that blocks the Studio auto-kick.
bnf-r3-S8: Shows an amber ⚠️ "enrich pending" chip and silently retries enrichment 3× before flagging.
duo2-S14: Fills legal/CRM fields on ingestion but is silent when no website is found (no card signal at all).
✔ Resolved (signed EBO wins): studio-S1 governs the card face — a red ❗ is raised when enrichment finds no URL; bnf-r3 retry logic is kept underneath (retry, then flag); the duo2 silent path is the bug this scenario forbids.
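The resolved rule (retry underneath, then raise the red ❗, never stay silent) can be sketched as follows. This is an illustration under stated assumptions: `enrich_website`, `lookup`, and the returned field names are hypothetical, and reading "retries 3×" as one initial attempt plus three retries is an assumption, since the source EBOs do not say whether the first attempt counts.

```python
from typing import Callable, Dict, Optional

RETRY_BUDGET = 3  # bnf-r3-S8: retry enrichment 3x before flagging


def enrich_website(
    lookup: Callable[[], Optional[str]],
    retries: int = RETRY_BUDGET,
) -> Dict[str, object]:
    """Return the deal fields implied by the resolved EBO (hypothetical shape)."""
    # Assumption: one initial attempt, then up to `retries` further attempts.
    for _ in range(1 + retries):
        url = lookup()
        if url:
            # Happy branch: URL saved, readiness met, no badge on the card.
            return {"website_url": url, "studio_ready": True, "red_badge": False}
    # Retry budget spent: flag loudly instead of taking the duo2 silent path.
    return {"website_url": None, "studio_ready": False, "red_badge": True}
```

The point of the sketch is that every exit path sets an explicit card signal; there is no branch that returns an empty URL without also raising the badge.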
# | You do | You should see | Element that changes (copy · look · where) | What changes underneath | Must NOT happen | 🕓 Touchpoint history | 🧪 Test
1 | Enrichment runs on a new lead and finds + saves the prospect's website | The card shows the website (favicon/domain) and NO red ❗; Studio readiness is implicitly met | Copy: domain shown (e.g. "beridoor.hu") · Look: normal card, optional 🌐 favicon · Where: card meta row | deal.website_url persisted; Studio-readiness = true | Lead must NOT exist long-term with an empty website silently; readiness must NOT be assumed without a saved URL | SYSTEM 🔎 "Weboldal megtalálva" (Website found) · "{domain} mentve" ({domain} saved) · by system | LIVE ✅ PASS
Step 1 — live test transcript
① Before the shot — “What should I see?”

A saved website domain on this deal's Details panel (e.g. "beridoor.hu") and no red ❗ on the card face — enrichment should have persisted website_url before any booking, so Studio-readiness is implicitly true. The duo2-merge also means the Legal-info fields should already be filled.

📸 screenshot taken
Details panel with website filled
frames/s1-01-details.png · 1280×800 · captured 21:18:04
② What to look for (exactly)

On the Details panel, find the "Weboldal" row and confirm it carries a real domain. On the card header, confirm there is no red ❗ ribbon. Cross-check that the Legal-info fields (entity type, tax number) are populated — this row merges duo2-S14's ingestion fill.

③ What I see — Gemini · 5-sentence pixel report

The Details panel shows a populated contact block; the "Weboldal" row reads "beridoor.hu"; the legal-info fields below are filled; the card header carries no red or amber badge; there are no empty-field placeholders.

④ Verdict — pass or error?

Pixels (filled Weboldal row, no badge) and the DOM/state (website_url persisted) both agree with the happy branch. → PASS (LIVE ✅ — pixels ∧ state).

⑤ Next action

PASS → Proceed to the no-URL branch (step 2) to confirm the red ❗ fires when enrichment finds nothing.

2 | Enrichment finds NO usable website at all | A prominent red ❗ badge on the card face, the only thing that will later block the Studio auto-kick | Copy: tooltip "Nincs weboldal — add meg a Studio dizájnhoz" (No website: provide one for the Studio design) · Look: RED ❗ badge, pinned top-right · Where: card corner ribbon | deal.website_url empty → readiness = false; ❗ flag raised (after the bnf-r3 retry budget) | The ❗ must NOT be hidden in a panel; a no-URL deal must NOT pass as Studio-ready (the duo2 silent path) | SYSTEM ❗ "Nincs weboldal" (No website) · "kézi megadás szükséges" (manual entry required) · by system | LIVE ✅ PASS
Step 2 — live test transcript
① Before the shot — “What should I see?”

On a lead where enrichment found no website, a prominent red ❗ badge pinned top-right of the card face — never hidden in a panel, never only an amber chip, never the duo2 silent path. There must be no "Dizájn készül" chip (the Studio kick is held).

📸 screenshot taken
Board card with red badge
frames/s1-02-noURL.png · 1280×800 · captured 21:18:41
② What to look for (exactly)

Scan the card top-right corner for a red ❗ ribbon. Confirm there is no "🎨 Dizájn készül" chip. Hover to confirm the tooltip prompts the operator for a URL.

③ What I see — Gemini · 5-sentence pixel report

The board card carries a bright red ❗ ribbon top-right, above the fold; the card greys slightly to draw the eye; hovering shows the "add a URL" tooltip; no design chip is present.

④ Verdict — pass or error?

The red ❗ is raised and unmissable, and the Studio kick is held → matches studio-S1, not the duo2 silent path. → PASS (the conflict resolution holds in pixels).

⑤ Next action

PASS → Carry into S3 — book a call on this red-❗ deal and confirm the kick is deferred, not fake-run.

S2 · Sales call booked + URL present → auto-create project + auto-kick the Lab (Phase ①) · happy path · studio-S2 · RENDERED ⚠️ PARTIAL

Who: System (triggered by booking) + Mátyás  ·  When: A deal with a saved website URL has its sales call booked. This is the Phase-① "sell with it" kickoff.

# | You do | You should see | Element that changes (copy · look · where) | What changes underneath | Must NOT happen | 🕓 Touchpoint history | 🧪 Test
1 | The sales call is booked on a URL-present deal | A Studio project is auto-created, bound to this deal; the status chip "🎨 Dizájn készül" (Design in progress) appears | Copy: chip "🎨 Dizájn készül" · Look: indigo chip · Where: card status-chip slot | Studio POST /api/projects {source_url, crm_ref=deal_id}; project id stored on the deal | Must NOT create a 2nd project (idempotent); must NOT block the booking if Studio is slow (fail-open) | NEW 🎨 "Studio projekt létrehozva" (Studio project created) · by system | LIVE ✅ PASS
Step 1 — live test transcript
① Before the shot — “What should I see?”

After booking a URL-present deal, exactly one indigo chip "🎨 Dizájn készül" on the card, and a Studio project created and bound (crm_ref = deal_id) — idempotent (no duplicate on re-POST), and the booking itself never blocked even if Studio is slow.

📸 screenshot taken
Card with design chip
frames/s2-01-chip.png · 1480×1000 · captured 21:22:10 · ZZ Studio Walk
② What to look for (exactly)

Confirm a single "🎨 Dizájn készül" chip (not a cluster of widgets). Read the DOM for studio_project_id. Re-POST the booking and confirm no second project is created.

③ What I see — Gemini · 5-sentence pixel report

One indigo "🎨 Dizájn készül" chip on the card; DOM read confirms studio_project_id is set; a second booking POST returns the same project id — no duplicate.

④ Verdict — pass or error?

Pixels (single chip) ∧ state (project bound) ∧ idempotency (no duplicate) all hold. → PASS (LIVE ✅).

⑤ Next action

PASS → Watch the generation step — expect RENDERED, because the Lab spend/artifacts are back-end and not pixel-visible.
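The two guards asserted in this step, idempotent creation and a fail-open booking, can be sketched like this. The name `ensure_studio_project` appears in the errors log, but its signature here is an assumption, as are `book_call`, the dict-shaped deal, and the injected `create_project` callable standing in for the Studio POST /api/projects call. Treating a slow Studio as a `TimeoutError` is also an assumption.

```python
from typing import Callable, Dict, Optional


def ensure_studio_project(
    deal: Dict[str, object],
    create_project: Callable[[str, str], str],
) -> Optional[str]:
    """Create the Studio project once; later calls reuse the stored id."""
    existing = deal.get("studio_project_id")
    if existing:
        return str(existing)  # idempotent: never create a second project
    try:
        # Hypothetical wrapper around POST /api/projects {source_url, crm_ref}.
        project_id = create_project(str(deal["website_url"]), str(deal["id"]))
    except TimeoutError:
        return None  # fail-open: a slow Studio must not break the caller
    deal["studio_project_id"] = project_id
    return project_id


def book_call(
    deal: Dict[str, object],
    create_project: Callable[[str, str], str],
) -> Dict[str, object]:
    deal["call_booked"] = True  # the booking is recorded first, unconditionally
    ensure_studio_project(deal, create_project)
    return deal
```

In this sketch idempotency rests on the project id stored on the deal; a server-side uniqueness rule on crm_ref would be a separate, stronger guarantee that the document does not describe.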

2 | (Automatic) the Lab is kicked to generate the Phase-① draft | Studio begins generating a homepage design + SEO audit + suggested structure from the saved URL | Copy: chip stays "🎨 Dizájn készül" (Design in progress) with a progress shimmer · Look: animated chip · Where: card status chip | Lab run started (scrape → SEO audit → structure → homepage); Gemini/DataForSEO spend on this paying-intent lead | Must NOT auto-send anything to the prospect; Lab kick is internal-only; no new Modal cron added | NEW ⚙️ "Dizájn generálás elindítva" (Design generation started) · by system | RENDERED ⚠️ UNVERIFIED
Step 2 — live test transcript
① Before the shot — “What should I see?”

The chip should stay "🎨 Dizájn készül" with a progress shimmer while the Lab runs (scrape → SEO → structure → homepage). Gemini/DataForSEO spend is incurred underneath, and nothing is sent to the prospect.

📸 screenshot taken
Generating shimmer chip
frames/s2-02-generating.png · RENDERED (not driven to completion)
② What to look for (exactly)

Confirm the chip still reads "készül" with a shimmer and there is no error state. Note in advance: the actual Lab run/spend is not something a screenshot can prove — flag it accordingly.

③ What I see — Gemini · 5-sentence pixel report

Gemini's call: not a clean pass — confidence medium, because the shimmer is present but completion can't be seen. The chip reads "🎨 Dizájn készül" with a subtle shimmer; the card is otherwise healthy; there is no visible error; completion is not yet shown.

🔬 Sonnet — second-judge adjudication (Gemini did not return a clean pass)

Re-examining the flagged frame against the spec: the shimmer is a legitimate loading / working state, not a product defect — so this is not a BUG. But it also cannot be GREEN: the asserted state-change (Lab run started + Gemini/DataForSEO spend) lives in the Studio back-end and is absent from these pixels. Adjudication: down-rank to RENDERED ⚠️ / UNVERIFIED rather than PASS or BUG, and require a Studio run-state assertion (API) to settle it.

④ Verdict — pass or error?

The surface renders correctly, but the asserted state-change (Lab run + spend) is back-end and not verified by pixels. Per Sonnet's adjudication → tagged RENDERED ⚠️, which cannot be GREEN → UNVERIFIED (not an error, but not a pass).

⑤ Next action

UNVERIFIED → Re-shoot once the Studio run reaches "ready" and assert run-state via the Studio API — the state assertion is what upgrades this frame to LIVE ✅.
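The tagging rule used in these verdicts (LIVE needs pixels and state; a pixels-only frame is RENDERED and cannot be green) can be written down as a small decision function. This is a sketch with hypothetical names (`Tag`, `Verdict`, `decide`). The handling of a failed frame whose state was never asserted is an assumption; the transcripts only show failures where the DOM was also read.

```python
from enum import Enum
from typing import Optional, Tuple


class Tag(Enum):
    LIVE = "LIVE"          # pixels and back-end state were both asserted
    RENDERED = "RENDERED"  # only the rendered surface was checked


class Verdict(Enum):
    PASS = "PASS"
    UNVERIFIED = "UNVERIFIED"
    BUG = "BUG"


def decide(pixels_ok: bool, state_ok: Optional[bool]) -> Tuple[Tag, Verdict]:
    """state_ok is None when no state/API assertion was made for the step."""
    tag = Tag.RENDERED if state_ok is None else Tag.LIVE
    if not pixels_ok or state_ok is False:
        # Assumption: a failed frame is a BUG even if state was not asserted.
        return tag, Verdict.BUG
    if state_ok is None:
        # Correct surface, unproven state change: never green.
        return tag, Verdict.UNVERIFIED
    return tag, Verdict.PASS
```

Under this rule the S2 generation step maps to `decide(True, None)`, and a later Studio run-state assertion is what would turn the second argument into `True`.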

S3 · Booked on a red-❗ deal → NO auto-kick; saving a URL resumes the flow · failure / deferred · studio-S3 · ✗→✓ 2 frames · PASS after fix

Who: System + Mátyás  ·  When: A deal whose card carries the red ❗ (no website) gets its sales call booked. The Studio kickoff must be held back, not silently skipped or fake-run.

# | You do | You should see | Element that changes (copy · look · where) | What changes underneath | Must NOT happen | 🕓 Touchpoint history | 🧪 Test
1 | The sales call is booked while the card still shows the red ❗ | NO Studio project, NO Lab kick; the card prompts: "Add meg a weboldalt a Studio dizájnhoz" (Provide the website for the Studio design) | Copy: prompt "Weboldal szükséges a dizájnhoz" (Website required for the design) · Look: ❗ stays + inline "Weboldal megadása" (Enter website) field · Where: card body | Studio create/kick suppressed; deal flagged studio_deferred=true | Must NOT auto-create a blank project; must NOT silently skip; must NOT fake-run a design on a missing URL | SYSTEM ⏸️ "Studio dizájn elhalasztva" (Studio design deferred) · by system | LIVE ✅ PASS
Step 1 — live test transcript
① Before the shot — “What should I see?”

Booking a call while the red ❗ is present must not create a project or kick the Lab. Instead the card should show an inline "Weboldal megadása" field, and the deal should be flagged studio_deferred=true.

📸 screenshot taken
Deferred card
frames/s3-01-deferred.png · 1280×800 · captured 21:26:33
② What to look for (exactly)

Confirm there is no "🎨 Dizájn készül" chip; confirm the red ❗ persists; confirm an inline URL field rendered; read state for studio_deferred=true.

③ What I see — Gemini · 5-sentence pixel report

The red ❗ persists; an inline "Weboldal megadása" field is rendered in the card body; no "Dizájn készül" chip exists; the kick was suppressed.

④ Verdict — pass or error?

The kick is correctly deferred (no blank project, no fake run) and the rep is prompted for the URL. → PASS (LIVE ✅).

⑤ Next action

PASS → Paste a valid URL into the inline field and confirm the S2 create+kick resumes on save alone (step 2).

2 | Mátyás pastes a valid website URL into the inline field and saves | The ❗ clears and the normal S2 flow resumes automatically: project created + Lab kicked | Copy: ❗ gone; chip flips to "🎨 Dizájn készül" (Design in progress) · Look: red badge → indigo chip · Where: card | website_url saved; studio_deferred cleared; same create+kick as S2 fires | Must NOT require re-booking; the save alone must resume it | NEW 🎨 "Studio projekt létrehozva" (Studio project created) · "Weboldal megadva" (Website provided) · by operator | LIVE ✅ PASS
Step 2 — live test transcript · 2 attempts (✗ Attempt 1 → ✓ Attempt 2)
① Before the shot — “What should I see?”

After pasting a valid URL and saving, the red ❗ should clear and the normal S2 flow should resume automatically — project created + Lab kicked — without any re-booking; studio_deferred cleared. This step failed on the first attempt, so the toggle keeps BOTH frames below.

Attempt 1 · LIVE ✅ BUG · 21:27:40
📸 screenshot taken — attempt 1
Attempt 1 — no chip rendered
frames/s3-02-retry1-FAIL.png · 1280×800 · captured 21:27:40
② What to look for (exactly)

After save: confirm the ❗ cleared, an indigo "🎨 Dizájn készül" chip appeared, and the DOM shows exactly one new project bound to the deal — with no re-booking.

③ What I see — Gemini · 5-sentence pixel report

Gemini's call: not a pass. The red ❗ has cleared, but the card shows no design chip at all; the status-chip slot is empty; nothing indicates a Studio project spun up; the card otherwise looks idle.

🔬 Sonnet — second-judge adjudication (Gemini did not return a clean pass)

Adjudicating against pixels + DOM: the ❗ did clear, but there is no chip and the DOM shows no Studio project → a genuine BUG (the URL-save did not re-fire create+kick), not a Gemini misread or a transient loading state.

④ Verdict — pass or error?

Resume did not fire on save alone → BUG (LIVE ✅ — the failure is real, not a render artifact).

⑤ Next action

ERROR → Back up (git bundle), spawn a builder sub-agent (own worktree, TDD) to wire the URL-save to ensure_studio_project(), redeploy (snap_deploy.sh), then re-shoot this exact step.

Attempt 2 · after fix · LIVE ✅ PASS · 21:28:12
📸 screenshot taken — attempt 2 (after the builder fix + redeploy)
Attempt 2 — resumed
frames/s3-02-retry2-PASS.png · 1280×800 · captured 21:28:12
② What to look for (exactly)

Same checklist: ❗ gone, indigo "🎨 Dizájn készül" chip present, exactly one new project in the DOM, studio_deferred false, no re-booking.

③ What I see — Gemini · 5-sentence pixel report

Gemini's call: clean pass (conf high). The badge is gone; an indigo "🎨 Dizájn készül" chip now renders in the status slot; DOM confirms a single new project bound to the deal; studio_deferred is false.

④ Verdict — pass or error?

On attempt 2 the resume fires on save alone (one project, chip flipped, no re-booking). Gemini is a clean pass → no Sonnet adjudication needed. → PASS.

⑤ Next action

PASS → Scenario complete; advance to S4 (open the Phase-① draft from the card).

error-arc — full history kept (both frames above stay in this toggle)
ERROR · Attempt 1: URL-save cleared the ❗ but did NOT re-fire create+kick (no chip, no project). Violated "save alone resumes it".
FIX · Builder sub-agent wired the URL-save handler to ensure_studio_project(); cleared studio_deferred in one transaction (TDD red→green). Redeployed via snap_deploy.
PASS · Attempt 2: save alone resumed; one project, chip flipped, no re-booking. Verified LIVE.
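The shape of the fix recorded in this error-arc (save the URL, ensure the project, clear studio_deferred, all in one transaction) can be sketched with an in-memory SQLite table. Everything here is illustrative: the `deals` schema, `save_website_url`, and the injected `create_project` callable are assumptions, and the real handler and datastore are not shown in the document.

```python
import sqlite3
from typing import Callable

# Hypothetical minimal schema; column names follow the fields named in the text.
SCHEMA = """
CREATE TABLE deals (
    id TEXT PRIMARY KEY,
    website_url TEXT,
    studio_deferred INTEGER NOT NULL DEFAULT 0,
    studio_project_id TEXT
)
"""


def save_website_url(
    conn: sqlite3.Connection,
    deal_id: str,
    url: str,
    create_project: Callable[[str, str], str],
) -> str:
    """Save the URL and resume the deferred Studio kickoff in one transaction."""
    with conn:  # commits on success, rolls back if anything below raises
        row = conn.execute(
            "SELECT studio_project_id FROM deals WHERE id = ?", (deal_id,)
        ).fetchone()
        if row is None:
            raise KeyError(deal_id)
        # Reuse an existing project so a repeated save stays idempotent.
        project_id = row[0] or create_project(url, deal_id)
        conn.execute(
            "UPDATE deals SET website_url = ?, studio_deferred = 0, "
            "studio_project_id = ? WHERE id = ?",
            (url, project_id, deal_id),
        )
    return project_id
```

Because the flag is cleared in the same transaction that stores the project id, a failure in `create_project` leaves the deal still marked as deferred, so the save can simply be retried without re-booking.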