Sequence Lifecycle Audit¶
Date: 2026-07-01 Scope: Full drip-campaign ("Sequences") lifecycle — data model, enrollment, execution, sending, compliance/opt-out, and analytics/reporting. Method: Source review of the executor, the two React hooks, all 36 sequence-related migrations, the unsubscribe function, and all enrollment/analytics UI components.
Primary artifacts audited - the platform (1,258 lines — the runtime engine) - the platform - the platform, the platform - the platform (enrollment dialogs, analytics dashboard, form) - the platform (schema, RLS, indexes, triggers, constraints)
Remediation status (2026-07-01): All findings below have been addressed. Schema/trigger/RPC/grant/index fixes ship in migration
20260701130000_sequence_lifecycle_audit_fixes.sql; runtime fixes insequence-executorandsequence-unsubscribe; enrollment/reporting fixes in the app, the app, the enrollment menus, andSequenceAnalyticsDashboard. Docs updated indocs/02-user-guide/sequences.mdandCLAUDE.md §12. One additional critical bug (C0) was found during remediation and fixed.
1. Executive summary¶
C0 — 🔴 Critical (found during remediation): the v1 daily-send-count RPC increment_sequence_send_count (used by WhatsApp Lite) still referenced the ON CONFLICT (email_account_id, send_date) target that migration 20260603050044 dropped, so every call raised 42P10. Combined with C2, WhatsApp Lite sequence sends were silently skipped and advanced. Fixed by recreating the RPC against the current (account_kind, email_account_id, send_date) unique key and switching the executor's WA Lite path to the generalized _v2 counter.
The sequences feature is functional and reasonably defensive at send time, but the audit surfaced 4 critical/high issues that cause silent message loss, broken unsubscribe state, defeated anti-spam pacing, and a cross-tenant IDOR, plus a cluster of medium reporting bugs that make the analytics dashboard actively misleading.
The single highest-confidence finding — corroborated independently from the DB, compliance, and analytics angles — is that the enrollment status 'unsubscribed' is not a permitted value in the CHECK constraint, so the unsubscribe endpoint's attempt to stop an enrollment silently fails.
| # | Severity | Area | One-line |
|---|---|---|---|
| C1 | 🔴 Critical | Compliance / State | 'unsubscribed' status violates the CHECK constraint → unsubscribe never updates the enrollment row |
| C2 | 🔴 Critical | Execution | Daily send-limit returns skipped, which the engine treats as success and advances past the step → the message is permanently dropped (not "retried tomorrow" as the error claims) |
| H1 | 🟠 High | Anti-spam | WhatsApp Lite anti-burst stagger is a no-op; all due WA Lite enrollments fire in the same cron tick with zero spacing |
| H2 | 🟠 High | Security | Test mode trusts organization_id/contact_id/sequence_id from the request body with no caller-org check → cross-tenant send + PII read (IDOR) |
| H3 | 🟠 High | Compliance | Re-enroll upsert overwrites existing rows → restarts completed/removed/bounced contacts from step 0 |
| H4 | 🟠 High | Compliance | resubscribe_on_opt_in trigger silently reverses honored unsubscribes on any opt-in flip (import/bulk edit) |
| H5 | 🟠 High | Ingestion | pause_sequence_on_reply can insert NULL step_id → NOT NULL violation that blocks inbound message ingestion |
| M1–M9 | 🟡 Medium | Reporting / Execution | Inflated "Messages Sent", dead status filters, wrong unsub denominator, uncounted terminal states, no open/click/reply tracking, quota leak on failed sends, wrong WhatsApp template params (see §5) |
| L1–L8 | ⚪ Low | Hygiene | Missing GRANTs, FKs to auth.users, deprecated std serve, doc drift, non-self-contained migrations (see §6) |
2. Intended lifecycle (as built)¶
enroll (upsert active, current_step_index=0, next_step_at=now+step0.delay)
│
▼
cron every 5 min → sequence-executor
│ SELECT status='active' AND next_step_at<=now LIMIT 50
│ claim (optimistic lock: push next_step_at +10min)
│ guard: sequence still active? deal-exit? sending window? WA-Lite stagger?
│ executeStep(channel) → { sent | failed | skipped }
│ INSERT sequence_execution_log
▼
result=sent/skipped → advance current_step_index, next_step_at=now+nextStep.delay
result=failed → retry w/ backoff (5→10→20min), mark 'failed' after 3 tries
no more steps → 'completed'
Terminal / holding states written by the runtime: completed, failed, exited (deal-exit trigger + executor), paused (reply trigger), removed (UI). Constraint-permitted but effectively dead: bounced (only ever written to email_messages, never to sequence_enrollments). Written by code but constraint-forbidden: unsubscribed (C1).
Scheduling clock: a single next_step_at timestamptz per enrollment plus current_step_index into ordered sequence_steps; per-step relative delay_minutes. Timezone is resolved at send time from sequences.sending_window_timezone → org settings → UTC. There is no per-enrollment timezone column.
3. Critical & High findings (with fixes)¶
C1 — 'unsubscribed' enrollment status is rejected by the CHECK constraint 🔴¶
Where: the platform writes status: "unsubscribed". The constraint only permits active, completed, paused, removed, failed, exited, bounced.
Effect: The UPDATE fails with check_violation (23514). The function does not check that update's error (:152-157), so it is swallowed. The token is marked unsubscribed_at and contacts.opt_in_status is set false, but the enrollment row stays active. Unsubscribe only "works" transitively because the executor separately checks opt_in_status/the email token at send time. Enrollment state and unsubscribe reporting are broken, and if the opt_in_status fallback were ever removed, unsubscribe would stop working entirely.
Fix: Add 'unsubscribed' to the CHECK constraint (mirroring how failed/exited/bounced were added), and .throwOnError (or check the error) on that update so failures aren't silent.
C2 — Daily send-limit skipped silently drops the message 🔴¶
Where: sequence-executor/index.ts:560-563 (WA Lite) and :913-916 (email) return status: "skipped" with the message "Will retry tomorrow." But the main loop only reschedules on failed; skipped falls into the else branch (:238 vs :265-294) which advances current_step_index.
Effect: When a sender hits its daily cap, the step is marked skipped and the enrollment moves on to the next step. The capped message is never sent — the "retry tomorrow" promise is false. High-volume orgs silently lose sends.
Fix: Treat daily-limit skipped as a reschedule (set next_step_at to tomorrow's window, do not advance current_step_index). Distinguish "retryable skip" (daily limit) from "terminal skip" (opt-out, missing contact info).
H1 — WhatsApp Lite anti-burst stagger never triggers 🟠¶
Where: sequence-executor/index.ts:171-192. On the first WA Lite send it seeds waLiteRollingMs = nowMs (the current time, :191), then for later enrollments checks if (waLiteRollingMs > nowMs) (:175).
Effect: Because the clock is seeded to "now" rather than a future slot, waLiteRollingMs > nowMs is never true on subsequent iterations, so the jitter branch (:178-188) is dead code. Every due WA Lite enrollment in the batch fires in the same second — defeating the documented "random 1–19 min gap between contacts" anti-spam pacing (CLAUDE.md §12/§14). Blast is bounded only by the 50-row batch cap and the 200/day per-channel limit.
Fix: Seed the rolling clock to a future slot on the first send: waLiteRollingMs = nowMs + jitter, and reschedule (not send) any enrollment whose slot is in the future.
H2 — Test mode is a cross-tenant IDOR 🟠¶
Where: handleTestMode (sequence-executor/index.ts:321-421). It reads sequence_id, contact_id, organization_id from the request body and fetches steps/contact by id with no organization_id scoping and no verification that the caller's JWT belongs to that org, then sends using the service role. The function relies on the default verify_jwt=true, so the caller only needs any valid account.
Effect: A user in org A can invoke test mode with org B's sequence/contact/channel IDs and (a) read org B contact PII in the response and (b) send real messages through org B's channel accounts. Note test mode is not a dry run — it actually delivers (email/WA), only skipping the daily-limit increment.
Fix: In test mode, resolve the caller's org from their JWT and require organization_id/sequence_id/contact_id to belong to it; scope all fetches by organization_id. Consider a true dry-run mode that renders without sending.
H3 — Re-enroll upsert resurrects/restarts contacts 🟠¶
Where: all enroll paths use .upsert(..., { onConflict: "sequence_id,contact_id" }) with status:"active", current_step_index:0 — useSupabaseSequences.ts:355-367, SequenceEnrollmentMenu.tsx:113, BulkSequenceEnrollmentMenu.tsx:82-95, DealSequenceEnrollmentMenu.tsx:95-107.
Effect: Re-enrolling an existing contact overwrites prior state: a completed contact restarts from step 0, a removed/bounced/exited contact is reactivated, a mid-sequence contact jumps back to the start. Bulk/tag/deal enroll will restart everyone already in the sequence.
Fix: Use a conditional insert (ignoreDuplicates, or WHERE status NOT IN ('active','completed',...)) or route enrollment through an edge function that decides restart-vs-skip explicitly.
H4 — Opt-in trigger reverses honored unsubscribes 🟠¶
Where: resubscribe_contact_emails_on_opt_in nulls unsubscribed_at for all tokens matching the contact whenever opt_in_status flips false→true.
Effect: Any internal edit, CSV import, or bulk op that sets opt-in true silently re-subscribes people who used the unsubscribe link, and email sending resumes. This is a regulatory hazard (CAN-SPAM/GDPR: an unsubscribe must not be reversible by an internal action without fresh consent).
Fix: Do not auto-clear unsubscribed_at on opt-in flips; require an explicit, logged re-consent action.
H5 — Reply-pause trigger can block inbound message ingestion 🟠¶
Where: pause_sequence_on_reply inserts a sequence_execution_log row with step_id = (SELECT ss.id... WHERE ss.step_order = se.current_step_index LIMIT 1).
Effect: If no step matches that index (index past the last step — which happens because the executor advances current_step_index after a send — or steps were deleted), the subquery returns NULL, violating sequence_execution_log.step_id NOT NULL. Since this fires AFTER INSERT on messages, the failure aborts the inbound message insert for that contact — a webhook-path landmine. It also mis-attributes the pause to the next, not-yet-sent step, and its 'Auto-paused: contact replied' row is later rendered in the analytics "Errors & Warnings" panel as if a reply were a delivery failure.
Fix: Guard the subquery (skip the log insert when no step resolves, or make step_id nullable for system events), and exclude reply-pause rows from the error/deliverability aggregation.
4. Reporting audit (the "clear reporting" ask)¶
The analytics dashboard (the app + the app) is read-rich but has correctness bugs that make headline numbers wrong, and large coverage gaps. Reporting reads only sequence_execution_log (statuses sent/failed/skipped) + enrollment states + a client-side cross-reference to email_bounces.
Reporting correctness bugs¶
- M1 — "Messages Sent" & Success Rate use the wrong total.
totalExecutions = executions.length(useSequenceAnalytics.ts:249) includesskippedrows; the dashboard labels it "Messages Sent" (:170-171) and divides success by it (:194-196). Skipped rows (opt-outs, missing contact info, daily-limit) are neither sent nor failed → both the count and the denominator are inflated. Same flaw in per-sequence (:302) and per-channel (:281) rates. - M2 — Dead status filters. Analytics filters on
"success","error","bounced","pending"(hook:207,210,250,251,266,282,283,290,317; dashboard:50,55) — none of which the executor can write (StepResultissent|failed|skipped,:425-430) and the CHECK forbids. All permanently false; latent since the write/read contract was never reconciled. - M3 — Unsubscribe rate mixes date windows.
totalUnsubscribes/totalEnrollments(dashboard:215-217): the numerator filtersunsubscribed_atin-range while the denominator filtersenrolled_atin-range (hook:146-153vs:183-187), so a contact enrolled long ago but unsubscribing today inflates the rate — not a cohort rate. - M4 — Terminal states are uncounted. The overview buckets only active/completed/paused/removed (hook
:245-248), butfailed/exited/bouncedare counted intotalEnrollments(:244). The status breakdown never sums to 100%, and goal-exits/bounces are invisible. - M5 —
executed_atis the batch-start timestamp (:33reused at:234), not the per-step send time, so intra-tick ordering in "Recent Executions" is arbitrary. Timeline bucketing uses UTC days shown as local (hook:394,467) → off-by-one for non-UTC orgs.
Reporting coverage gaps (what users can't see but want)¶
- No open / click rate. Email opens/clicks land in
email_tracking_eventskeyed byemail_messages.tracking_id(email-tracking-pixel/index.ts:131-133) with no link to the sequence or enrollment. - No reply rate. Replies only flip the enrollment to
pausedvia trigger; never counted. - No true delivery/bounce funnel.
sentmeans "accepted by the send function," not delivered; a bounced email still counts assent/success in per-step and channel rates. - Skip reasons not categorized. ~8 distinct skip reasons collapse into
skipped, distinguishable only by free-texterror_message— no breakdown. - No cohort funnel. The step "funnel" is execution-count based, not "of N enrolled, X reached step 3"; because failed steps retry and skips advance, execution counts ≠ unique contacts per step.
- No conversion/goal metric, despite
exit_on_deal_stages/exit_on_deal_createdsupport in the executor (:115-137).
Recommended reporting model¶
- Add a typed
event_type/reasoncolumn (or a normalizedsequence_eventstable) so sends, delivery, opens, clicks, replies, bounces, unsubscribes, and skip-reasons are first-class — not inferred from a 4-value status + free text. - Join email tracking (
email_tracking_events) to the enrollment/step so open/click rates are attributable per sequence and step. - Fix denominators: "Messages Sent" =
sentonly; success rate =sent / (sent + failed); unsubscribe/reply rates as cohorts of the same enrolled population. - Surface all terminal states (failed/exited/bounced/unsubscribed) and a per-step cohort drop-off + goal-conversion view.
5. Other medium findings¶
- M6 — Quota leak on failed sends. Daily-limit RPCs (
increment_sequence_send_count,..._v2) increment before the fetch (sequence-executor/index.ts:554-558, 906-911); if the downstream send returnsfailed, the count was already consumed, and each retry re-increments. - M7 — WhatsApp Cloud template parameters are wrong.
buildTemplateComponents/contactFallbackfill every placeholder ({{1}},{{2}},…) with the contact's full name (:613-624,626-689), ignoring the placeholder's meaning. Templates with non-name or multiple variables get incorrect data. - M8 — Terminal-skip enrollments keep iterating. Opt-out / missing-contact-info return
skipped, which advances through every remaining step (each logging a skip) instead of terminating the enrollment — wasted cycles + log noise. - M9 — No enroll-time validation. Enrollment does not check unsubscribe/opt-out/valid-channel/bounce before writing, so junk enrollments inflate counts and rely entirely on send-time skips. A
whatsapp_cloudstep with no template silently skips forever without surfacing to the enroller.
6. Low / hygiene¶
- L1 — Missing GRANTs. None of the sequence tables carry the project-mandated
GRANT... TO authenticated/service_role(CLAUDE.md §4.4); they work only via an implicit base-schema grant not present in the migration tree. - L2 — FKs to
auth.users.sequence_enrollments.enrolled_by/removed_byreferenceauth.users(20260330100834_*.sql:35,38), against CLAUDE.md §4.3 (useprofiles(id)). - L3 — Deprecated
std@0.168.0/servein bothsequence-executorandsequence-unsubscribe(CLAUDE.md §13.4). - L4 — Doc drift. CLAUDE.md and
docs/02-user-guide/sequences.mdreference a non-existentsequence_unsubscribestable (real:sequence_unsubscribe_tokens) and claim PBKDF2 tokens; the token is actually a 256-bit random hex (token TEXT DEFAULT encode(gen_random_bytes(32),'hex')) — unguessable and fine, but not what the spec says, and it never expires despite "invalid or expired" copy. - L5 — Constraint/code shipped out of sync.
exited/failedwere written by triggers/executor ~26 days before the CHECK was expanded to allow them (20260526093935vs20260621120000); in that window deal-stage updates for enrolled contacts could have throwncheck_violation. - L6 — Non-self-contained migrations.
public.sequenceshas noCREATE TABLEin the repo (Lovable-managed base); a freshsupabase db resetwould fail on the firstALTER TABLE public.sequences. - L7 — Guest/demo RLS to
anonunqualified onpublic.sequences(20260102190930_*.sql:65-68) — limited to the demo org, but an unqualified-role SELECT on a business table. - L8 — Missing composite index on
sequence_execution_log(enrollment_id, step_id, status)for the per-step fail-count query (:240-245); acceptable now, latent at scale.
7. Prioritized remediation¶
Now (correctness/compliance/security):
1. C1 — add 'unsubscribed' to the CHECK; error-check the unsubscribe update.
2. C2 — reschedule (don't advance) on daily-limit skips.
3. H2 — org-scope test mode to the caller's JWT.
4. H4 — stop auto-reversing unsubscribes on opt-in.
5. H5 — guard the reply-pause step_id subquery.
Next (reliability/anti-spam): 6. H1 — fix the WA Lite stagger seed. 7. H3 — conditional enroll (no silent restart). 8. M6/M7/M8/M9 — quota accounting, template params, terminal-skip handling, enroll-time validation.
Reporting (the "clear reporting" deliverable): 9. M1–M5 + coverage gaps — typed event model, tracking join, corrected denominators, terminal-state + cohort + conversion views (see §4).
Hygiene: L1–L8.
All line references are against the repository state on branch claude/sequence-lifecycle-audit-n3ugik at audit time.