Sequence Lifecycle Audit¶

Date: 2026-07-01 Scope: Full drip-campaign ("Sequences") lifecycle — data model, enrollment, execution, sending, compliance/opt-out, and analytics/reporting. Method: Source review of the executor, the two React hooks, all 36 sequence-related migrations, the unsubscribe function, and all enrollment/analytics UI components.

Primary artifacts audited - the platform (1,258 lines — the runtime engine) - the platform - the platform, the platform - the platform (enrollment dialogs, analytics dashboard, form) - the platform (schema, RLS, indexes, triggers, constraints)

Remediation status (2026-07-01): All findings below have been addressed. Schema/trigger/RPC/grant/index fixes ship in migration 20260701130000_sequence_lifecycle_audit_fixes.sql; runtime fixes in sequence-executor and sequence-unsubscribe; enrollment/reporting fixes in the app, the app, the enrollment menus, and SequenceAnalyticsDashboard. Docs updated in docs/02-user-guide/sequences.md and CLAUDE.md §12. One additional critical bug (C0) was found during remediation and fixed.

1. Executive summary¶

C0 — 🔴 Critical (found during remediation): the v1 daily-send-count RPC increment_sequence_send_count (used by WhatsApp Lite) still referenced the ON CONFLICT (email_account_id, send_date) target that migration 20260603050044 dropped, so every call raised 42P10. Combined with C2, WhatsApp Lite sequence sends were silently skipped and advanced. Fixed by recreating the RPC against the current (account_kind, email_account_id, send_date) unique key and switching the executor's WA Lite path to the generalized _v2 counter.

The sequences feature is functional and reasonably defensive at send time, but the audit surfaced 4 critical/high issues that cause silent message loss, broken unsubscribe state, defeated anti-spam pacing, and a cross-tenant IDOR, plus a cluster of medium reporting bugs that make the analytics dashboard actively misleading.

The single highest-confidence finding — corroborated independently from the DB, compliance, and analytics angles — is that the enrollment status 'unsubscribed' is not a permitted value in the CHECK constraint, so the unsubscribe endpoint's attempt to stop an enrollment silently fails.

#	Severity	Area	One-line
C1	🔴 Critical	Compliance / State	`'unsubscribed'` status violates the CHECK constraint → unsubscribe never updates the enrollment row
C2	🔴 Critical	Execution	Daily send-limit returns `skipped`, which the engine treats as success and advances past the step → the message is permanently dropped (not "retried tomorrow" as the error claims)
H1	🟠 High	Anti-spam	WhatsApp Lite anti-burst stagger is a no-op; all due WA Lite enrollments fire in the same cron tick with zero spacing
H2	🟠 High	Security	Test mode trusts `organization_id`/`contact_id`/`sequence_id` from the request body with no caller-org check → cross-tenant send + PII read (IDOR)
H3	🟠 High	Compliance	Re-enroll `upsert` overwrites existing rows → restarts `completed`/`removed`/`bounced` contacts from step 0
H4	🟠 High	Compliance	`resubscribe_on_opt_in` trigger silently reverses honored unsubscribes on any opt-in flip (import/bulk edit)
H5	🟠 High	Ingestion	`pause_sequence_on_reply` can insert `NULL` `step_id` → `NOT NULL` violation that blocks inbound message ingestion
M1–M9	🟡 Medium	Reporting / Execution	Inflated "Messages Sent", dead status filters, wrong unsub denominator, uncounted terminal states, no open/click/reply tracking, quota leak on failed sends, wrong WhatsApp template params (see §5)
L1–L8	⚪ Low	Hygiene	Missing GRANTs, FKs to `auth.users`, deprecated `std serve`, doc drift, non-self-contained migrations (see §6)

2. Intended lifecycle (as built)¶

enroll (upsert active, current_step_index=0, next_step_at=now+step0.delay)
 │
 ▼
cron every 5 min → sequence-executor
 │ SELECT status='active' AND next_step_at<=now LIMIT 50
 │ claim (optimistic lock: push next_step_at +10min)
 │ guard: sequence still active? deal-exit? sending window? WA-Lite stagger?
 │ executeStep(channel) → { sent | failed | skipped }
 │ INSERT sequence_execution_log
 ▼
 result=sent/skipped → advance current_step_index, next_step_at=now+nextStep.delay
 result=failed → retry w/ backoff (5→10→20min), mark 'failed' after 3 tries
 no more steps → 'completed'

Terminal / holding states written by the runtime: completed, failed, exited (deal-exit trigger + executor), paused (reply trigger), removed (UI). Constraint-permitted but effectively dead: bounced (only ever written to email_messages, never to sequence_enrollments). Written by code but constraint-forbidden: unsubscribed (C1).

Scheduling clock: a single next_step_at timestamptz per enrollment plus current_step_index into ordered sequence_steps; per-step relative delay_minutes. Timezone is resolved at send time from sequences.sending_window_timezone → org settings → UTC. There is no per-enrollment timezone column.

3. Critical & High findings (with fixes)¶

C1 — `'unsubscribed'` enrollment status is rejected by the CHECK constraint 🔴¶

Where: the platform writes status: "unsubscribed". The constraint only permits active, completed, paused, removed, failed, exited, bounced. Effect: The UPDATE fails with check_violation (23514). The function does not check that update's error (:152-157), so it is swallowed. The token is marked unsubscribed_at and contacts.opt_in_status is set false, but the enrollment row stays active. Unsubscribe only "works" transitively because the executor separately checks opt_in_status/the email token at send time. Enrollment state and unsubscribe reporting are broken, and if the opt_in_status fallback were ever removed, unsubscribe would stop working entirely. Fix: Add 'unsubscribed' to the CHECK constraint (mirroring how failed/exited/bounced were added), and .throwOnError (or check the error) on that update so failures aren't silent.

C2 — Daily send-limit `skipped` silently drops the message 🔴¶

Where: sequence-executor/index.ts:560-563 (WA Lite) and :913-916 (email) return status: "skipped" with the message "Will retry tomorrow." But the main loop only reschedules on failed; skipped falls into the else branch (:238 vs :265-294) which advances current_step_index. Effect: When a sender hits its daily cap, the step is marked skipped and the enrollment moves on to the next step. The capped message is never sent — the "retry tomorrow" promise is false. High-volume orgs silently lose sends. Fix: Treat daily-limit skipped as a reschedule (set next_step_at to tomorrow's window, do not advance current_step_index). Distinguish "retryable skip" (daily limit) from "terminal skip" (opt-out, missing contact info).

H1 — WhatsApp Lite anti-burst stagger never triggers 🟠¶

Where: sequence-executor/index.ts:171-192. On the first WA Lite send it seeds waLiteRollingMs = nowMs (the current time, :191), then for later enrollments checks if (waLiteRollingMs > nowMs) (:175). Effect: Because the clock is seeded to "now" rather than a future slot, waLiteRollingMs > nowMs is never true on subsequent iterations, so the jitter branch (:178-188) is dead code. Every due WA Lite enrollment in the batch fires in the same second — defeating the documented "random 1–19 min gap between contacts" anti-spam pacing (CLAUDE.md §12/§14). Blast is bounded only by the 50-row batch cap and the 200/day per-channel limit. Fix: Seed the rolling clock to a future slot on the first send: waLiteRollingMs = nowMs + jitter, and reschedule (not send) any enrollment whose slot is in the future.

H2 — Test mode is a cross-tenant IDOR 🟠¶

Where: handleTestMode (sequence-executor/index.ts:321-421). It reads sequence_id, contact_id, organization_id from the request body and fetches steps/contact by id with no organization_id scoping and no verification that the caller's JWT belongs to that org, then sends using the service role. The function relies on the default verify_jwt=true, so the caller only needs any valid account. Effect: A user in org A can invoke test mode with org B's sequence/contact/channel IDs and (a) read org B contact PII in the response and (b) send real messages through org B's channel accounts. Note test mode is not a dry run — it actually delivers (email/WA), only skipping the daily-limit increment. Fix: In test mode, resolve the caller's org from their JWT and require organization_id/sequence_id/contact_id to belong to it; scope all fetches by organization_id. Consider a true dry-run mode that renders without sending.

H3 — Re-enroll `upsert` resurrects/restarts contacts 🟠¶

Where: all enroll paths use .upsert(..., { onConflict: "sequence_id,contact_id" }) with status:"active", current_step_index:0 — useSupabaseSequences.ts:355-367, SequenceEnrollmentMenu.tsx:113, BulkSequenceEnrollmentMenu.tsx:82-95, DealSequenceEnrollmentMenu.tsx:95-107. Effect: Re-enrolling an existing contact overwrites prior state: a completed contact restarts from step 0, a removed/bounced/exited contact is reactivated, a mid-sequence contact jumps back to the start. Bulk/tag/deal enroll will restart everyone already in the sequence. Fix: Use a conditional insert (ignoreDuplicates, or WHERE status NOT IN ('active','completed',...)) or route enrollment through an edge function that decides restart-vs-skip explicitly.

H4 — Opt-in trigger reverses honored unsubscribes 🟠¶

Where: resubscribe_contact_emails_on_opt_in nulls unsubscribed_at for all tokens matching the contact whenever opt_in_status flips false→true. Effect: Any internal edit, CSV import, or bulk op that sets opt-in true silently re-subscribes people who used the unsubscribe link, and email sending resumes. This is a regulatory hazard (CAN-SPAM/GDPR: an unsubscribe must not be reversible by an internal action without fresh consent). Fix: Do not auto-clear unsubscribed_at on opt-in flips; require an explicit, logged re-consent action.

H5 — Reply-pause trigger can block inbound message ingestion 🟠¶

Where: pause_sequence_on_reply inserts a sequence_execution_log row with step_id = (SELECT ss.id... WHERE ss.step_order = se.current_step_index LIMIT 1). Effect: If no step matches that index (index past the last step — which happens because the executor advances current_step_index after a send — or steps were deleted), the subquery returns NULL, violating sequence_execution_log.step_id NOT NULL. Since this fires AFTER INSERT on messages, the failure aborts the inbound message insert for that contact — a webhook-path landmine. It also mis-attributes the pause to the next, not-yet-sent step, and its 'Auto-paused: contact replied' row is later rendered in the analytics "Errors & Warnings" panel as if a reply were a delivery failure. Fix: Guard the subquery (skip the log insert when no step resolves, or make step_id nullable for system events), and exclude reply-pause rows from the error/deliverability aggregation.

4. Reporting audit (the "clear reporting" ask)¶

The analytics dashboard (the app + the app) is read-rich but has correctness bugs that make headline numbers wrong, and large coverage gaps. Reporting reads only sequence_execution_log (statuses sent/failed/skipped) + enrollment states + a client-side cross-reference to email_bounces.

Reporting correctness bugs¶

M1 — "Messages Sent" & Success Rate use the wrong total. totalExecutions = executions.length (useSequenceAnalytics.ts:249) includes skipped rows; the dashboard labels it "Messages Sent" (:170-171) and divides success by it (:194-196). Skipped rows (opt-outs, missing contact info, daily-limit) are neither sent nor failed → both the count and the denominator are inflated. Same flaw in per-sequence (:302) and per-channel (:281) rates.
M2 — Dead status filters. Analytics filters on "success", "error", "bounced", "pending" (hook :207,210,250,251,266,282,283,290,317; dashboard :50,55) — none of which the executor can write (StepResult is sent|failed|skipped, :425-430) and the CHECK forbids. All permanently false; latent since the write/read contract was never reconciled.
M3 — Unsubscribe rate mixes date windows. totalUnsubscribes/totalEnrollments (dashboard :215-217): the numerator filters unsubscribed_at in-range while the denominator filters enrolled_at in-range (hook :146-153 vs :183-187), so a contact enrolled long ago but unsubscribing today inflates the rate — not a cohort rate.
M4 — Terminal states are uncounted. The overview buckets only active/completed/paused/removed (hook :245-248), but failed/exited/bounced are counted in totalEnrollments (:244). The status breakdown never sums to 100%, and goal-exits/bounces are invisible.
M5 — executed_at is the batch-start timestamp (:33 reused at :234), not the per-step send time, so intra-tick ordering in "Recent Executions" is arbitrary. Timeline bucketing uses UTC days shown as local (hook :394,467) → off-by-one for non-UTC orgs.

Reporting coverage gaps (what users can't see but want)¶

No open / click rate. Email opens/clicks land in email_tracking_events keyed by email_messages.tracking_id (email-tracking-pixel/index.ts:131-133) with no link to the sequence or enrollment.
No reply rate. Replies only flip the enrollment to paused via trigger; never counted.
No true delivery/bounce funnel. sent means "accepted by the send function," not delivered; a bounced email still counts as sent/success in per-step and channel rates.
Skip reasons not categorized. ~8 distinct skip reasons collapse into skipped, distinguishable only by free-text error_message — no breakdown.
No cohort funnel. The step "funnel" is execution-count based, not "of N enrolled, X reached step 3"; because failed steps retry and skips advance, execution counts ≠ unique contacts per step.
No conversion/goal metric, despite exit_on_deal_stages/exit_on_deal_created support in the executor (:115-137).

Recommended reporting model¶

Add a typed event_type/reason column (or a normalized sequence_events table) so sends, delivery, opens, clicks, replies, bounces, unsubscribes, and skip-reasons are first-class — not inferred from a 4-value status + free text.
Join email tracking (email_tracking_events) to the enrollment/step so open/click rates are attributable per sequence and step.
Fix denominators: "Messages Sent" = sent only; success rate = sent / (sent + failed); unsubscribe/reply rates as cohorts of the same enrolled population.
Surface all terminal states (failed/exited/bounced/unsubscribed) and a per-step cohort drop-off + goal-conversion view.

5. Other medium findings¶

M6 — Quota leak on failed sends. Daily-limit RPCs (increment_sequence_send_count, ..._v2) increment before the fetch (sequence-executor/index.ts:554-558, 906-911); if the downstream send returns failed, the count was already consumed, and each retry re-increments.
M7 — WhatsApp Cloud template parameters are wrong. buildTemplateComponents/contactFallback fill every placeholder ({{1}},{{2}},…) with the contact's full name (:613-624,626-689), ignoring the placeholder's meaning. Templates with non-name or multiple variables get incorrect data.
M8 — Terminal-skip enrollments keep iterating. Opt-out / missing-contact-info return skipped, which advances through every remaining step (each logging a skip) instead of terminating the enrollment — wasted cycles + log noise.
M9 — No enroll-time validation. Enrollment does not check unsubscribe/opt-out/valid-channel/bounce before writing, so junk enrollments inflate counts and rely entirely on send-time skips. A whatsapp_cloud step with no template silently skips forever without surfacing to the enroller.

6. Low / hygiene¶

L1 — Missing GRANTs. None of the sequence tables carry the project-mandated GRANT... TO authenticated/service_role (CLAUDE.md §4.4); they work only via an implicit base-schema grant not present in the migration tree.
L2 — FKs to auth.users. sequence_enrollments.enrolled_by/removed_by reference auth.users (20260330100834_*.sql:35,38), against CLAUDE.md §4.3 (use profiles(id)).
L3 — Deprecated std@0.168.0/serve in both sequence-executor and sequence-unsubscribe (CLAUDE.md §13.4).
L4 — Doc drift. CLAUDE.md and docs/02-user-guide/sequences.md reference a non-existent sequence_unsubscribes table (real: sequence_unsubscribe_tokens) and claim PBKDF2 tokens; the token is actually a 256-bit random hex (token TEXT DEFAULT encode(gen_random_bytes(32),'hex')) — unguessable and fine, but not what the spec says, and it never expires despite "invalid or expired" copy.
L5 — Constraint/code shipped out of sync. exited/failed were written by triggers/executor ~26 days before the CHECK was expanded to allow them (20260526093935 vs 20260621120000); in that window deal-stage updates for enrolled contacts could have thrown check_violation.
L6 — Non-self-contained migrations. public.sequences has no CREATE TABLE in the repo (Lovable-managed base); a fresh supabase db reset would fail on the first ALTER TABLE public.sequences.
L7 — Guest/demo RLS to anon unqualified on public.sequences (20260102190930_*.sql:65-68) — limited to the demo org, but an unqualified-role SELECT on a business table.
L8 — Missing composite index on sequence_execution_log(enrollment_id, step_id, status) for the per-step fail-count query (:240-245); acceptable now, latent at scale.

7. Prioritized remediation¶

Now (correctness/compliance/security): 1. C1 — add 'unsubscribed' to the CHECK; error-check the unsubscribe update. 2. C2 — reschedule (don't advance) on daily-limit skips. 3. H2 — org-scope test mode to the caller's JWT. 4. H4 — stop auto-reversing unsubscribes on opt-in. 5. H5 — guard the reply-pause step_id subquery.

Next (reliability/anti-spam): 6. H1 — fix the WA Lite stagger seed. 7. H3 — conditional enroll (no silent restart). 8. M6/M7/M8/M9 — quota accounting, template params, terminal-skip handling, enroll-time validation.

Reporting (the "clear reporting" deliverable): 9. M1–M5 + coverage gaps — typed event model, tracking join, corrected denominators, terminal-state + cohort + conversion views (see §4).

Hygiene: L1–L8.

All line references are against the repository state on branch claude/sequence-lifecycle-audit-n3ugik at audit time.