mokosh/.planning/phases/04-harden-clean-up-optional/04-04-PLAN.md
Mark 526ac78046 docs(04): create phase plan — 7 plans for Phase 4 hardening (audit P1 polish + flake stabilization + SW persistence + visual polish + closure)
Wave structure:
- W1 (parallel): 04-01 (Audit P1 polish #11/#14/#15 TDD) + 04-02 (build/CSP hygiene: setimmediate polyfill + dead-code + generate-icons.cjs)
- W2: 04-03 (A29 cs-injection-world rewrite; closes flake)
- W3: 04-04 (A33 SW state persistence; spike-first + CDP worker.close())
- W4: 04-05 (A34 fetch+XHR network_error; ROADMAP SC #2 + validates Plan 04-01 P1 #11 end-to-end)
- W5: 04-06 (dark-logo currentColor + cursor verification + 01-07-SUMMARY back-patch; operator empirical)
- W6: 04-07 (04-VERIFICATION.md aggregator + ROADMAP backfill + v1 close prep)

Honors locked decisions D-P4-01..05 (full Phase 4 + all 3 P1 polish + both visual items + alpha-independent + ROADMAP backfill).
Implements RESEARCH Q1 (setimmediate option a), Q2 (spike-first SW persistence), Q3 (A29 cs-injection-world), Finding 4 (cursor already shipped — verification only).
UI-SPEC dark-logo currentColor strategy with inline-SVG injection landed per UI-SPEC §"Implementation amendment".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 09:30:49 +02:00

---
phase: 04
slug: harden-clean-up-optional
plan: 04
type: execute
wave: 3
depends_on:
- 01
- 02
- 03
files_modified:
- tests/uat/extension-page-harness.ts
- tests/uat/lib/harness-page-driver.ts
- tests/uat/harness.test.ts
autonomous: true
requirements: []
tags:
- uat-harness
- a33
- sw-state-persistence
- sw-eviction
- spike-first
- cdp-worker-close
- roadmap-sc-1
- charter-d-p4-01
user_setup: []
must_haves:
  truths:
    - "Wave 0 spike verifies empirically whether the offscreen document survives 5-min SW idle + worker.close() while MediaRecorder is actively recording — informs whether A33 is verification-only OR needs IndexedDB persistence work"
    - "stopServiceWorker(browser, extensionId) helper exists in tests/uat/lib/harness-page-driver.ts using Puppeteer CDP browser.waitForTarget + worker.close() per Chrome devrel canonical pattern"
    - "assertA33 / driveA33 land per the spike outcome: if PASS (offscreen survives) → verification-only A33 that does a real 5-min wall-clock idle + SW kill + SAVE → asserts archive's video/last_30sec.webm size > 100 KB"
    - "A33 is env-gated by SKIP_LONG_UAT (default: RUN for closure + alpha gate; SKIP_LONG_UAT=1 to skip for per-commit iteration)"
    - "UAT harness count flips from 33 → 34 (A33 added); 34/34 GREEN when SKIP_LONG_UAT unset"
    - "ROADMAP SC #1 (SW state persistence) GREEN — A33 empirical evidence that a real-world 5-min idle + SAVE produces a non-empty video buffer"
  artifacts:
    - path: "tests/uat/extension-page-harness.ts"
      provides: "assertA33 page-side stub (or thin driver) for SAVE_ARCHIVE dispatch after host-side SW kill"
      contains: "assertA33"
    - path: "tests/uat/lib/harness-page-driver.ts"
      provides: "stopServiceWorker(browser, extensionId) NEW helper + driveA33 host-side CDP-kill + JSZip video-size check"
      contains: "worker.close()"
    - path: "tests/uat/harness.test.ts"
      provides: "driveA33 import + wrapped driver const (passes handles.browser + handles.extensionId + handles.downloadsDir) + drivers-array push + SKIP_LONG_UAT env-gate wrapper"
      contains: "driveA33Wrapped"
  key_links:
    - from: "tests/uat/harness.test.ts driveA33Wrapped"
      to: "tests/uat/lib/harness-page-driver.ts driveA33(page, browser, extensionId, downloadsDir)"
      via: "(page) => driveA33(page, handles.browser, handles.extensionId, handles.downloadsDir)"
      pattern: "handles\\.browser.*handles\\.extensionId"
    - from: "tests/uat/lib/harness-page-driver.ts stopServiceWorker"
      to: "Chrome MV3 SW target via Puppeteer CDP"
      via: "browser.waitForTarget(t => t.type()==='service_worker' && t.url().startsWith(`chrome-extension://${extensionId}`)) + target.worker().close()"
      pattern: "service_worker"
    - from: "tests/uat/lib/harness-page-driver.ts driveA33 video-size check"
      to: "zip.file('video/last_30sec.webm') → byteLength > 100_000"
      via: "JSZip.loadAsync + entry.async('uint8array')"
      pattern: "video/last_30sec\\.webm"
---
<objective>
Ship the A33 harness assertion that empirically verifies ROADMAP SC #1 (SW state persistence across the 30s idle unload edge cases). Per RESEARCH Q2: the current architecture stores segments only in offscreen-document RAM (src/offscreen/recorder.ts:91 `let segments: Blob[] = []`). The SW NEVER stores the buffer. So the actual question becomes: does the offscreen document survive 5 minutes of SW idle? Per Chrome docs, the offscreen has its own lifecycle independent of the SW, with active `MediaRecorder` being the canonical "compelling reason" to keep the offscreen alive.
This plan uses the SPIKE-FIRST approach to avoid over-engineering:
**Wave 0 (spike):** A 30-min empirical investigation. Start recording; wait 5 min real wall-clock; force-kill the SW via Puppeteer CDP `worker.close()` (the canonical Chrome devrel pattern; Puppeteer ≥22.1.0 supports it, and our pin ^25 is comfortably above); dispatch SAVE_ARCHIVE from page.evaluate; check the resulting archive's `video/last_30sec.webm` size.
**Wave 1 (impl):** Based on spike outcome:
- **If spike PASSES** (likely outcome per RESEARCH architecture analysis + Chrome docs): A33 is a VERIFICATION-ONLY harness assertion that wraps the spike methodology into a repeatable test. Ships the spike's exact pattern as `driveA33` + `assertA33` + orchestrator wiring + env-gate. ROADMAP SC #1 is satisfied by the CURRENT architecture; no persistence layer needed.
- **If spike FAILS** (the offscreen dies along with the SW, contrary to Chrome docs): A33 implementation expands per RESEARCH Q2 sub-question (b) recommendation (Option C: IndexedDB persistence in offscreen — Blobs serialize cleanly to IDB; structured-clone supports them natively; per-segment write ~3 MB; ~3 writes per 30s window). This is a wider plan rewrite; the plan-checker should flag for re-planning if it materializes. RESEARCH confidence on offscreen-surviving-SW-kill is MEDIUM; the spike-first approach is the canonical risk hedge per Plan 01-07 precedent.
Purpose: Provides the empirical evidence for ROADMAP SC #1 ("After running the extension idle for >5 minutes, then exporting, the archive still contains a non-empty video buffer"). The spike-first approach hedges against the RESEARCH MEDIUM-confidence assumption (A3: the offscreen document survives SW eviction). If the assumption holds, A33 is a verification gate; if not, persistence work is plainly needed.
Output: 1 NEW assertion (A33; harness count 33→34); 1 NEW helper (`stopServiceWorker` CDP wrapper); 3-file lockstep update per the Approach B pattern (extension-page-harness.ts + harness-page-driver.ts + harness.test.ts); env-gate via SKIP_LONG_UAT (default = RUN; set to '1' to skip for per-commit iteration).
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/04-harden-clean-up-optional/04-CONTEXT.md
@.planning/phases/04-harden-clean-up-optional/04-RESEARCH.md
@.planning/phases/04-harden-clean-up-optional/04-PATTERNS.md
# Source files — locus of the harness extension
@tests/uat/extension-page-harness.ts
@tests/uat/lib/harness-page-driver.ts
@tests/uat/harness.test.ts
@tests/uat/lib/launch.ts
@src/offscreen/recorder.ts
@src/background/index.ts
# Prior plan SUMMARYs to mirror — Approach B harness extension precedent
@.planning/phases/02-stabilize-export-pipeline/02-04-SUMMARY.md
@.planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-04-SUMMARY.md
<interfaces>
<!-- Key shapes the executor consumes directly. Extracted from codebase + RESEARCH 2026-05-21. -->
From tests/uat/lib/launch.ts:80-90 (HarnessHandles — already exposes browser + extensionId; no extension needed):
```typescript
export interface HarnessHandles {
  readonly browser: Browser; // ← already exposed; used by driveA33
  readonly extensionId: string; // ← already exposed; used by driveA33
  readonly harnessPage: Page;
  readonly victimPage: Page;
  readonly downloadsDir: string;
  readonly swConsole: string[];
  readonly offConsole: string[];
}
```
From RESEARCH Q2 Code Example Pattern 1 (stopServiceWorker — NEW helper; verbatim from Chrome devrel doc):
```typescript
import type { Browser } from 'puppeteer';
/**
 * Force-terminate the MV3 service worker via Puppeteer CDP. Required
 * because Puppeteer's persistent CDP attach keeps SWs alive indefinitely;
 * natural 30s idle eviction does NOT fire under test conditions per Chrome
 * docs (https://developer.chrome.com/docs/extensions/develop/concepts/service-workers/lifecycle).
 *
 * Reference: https://developer.chrome.com/docs/extensions/how-to/test/test-serviceworker-termination-with-puppeteer
 */
// Exported so the Task 1 spike script can import it from this module.
export async function stopServiceWorker(browser: Browser, extensionId: string): Promise<void> {
  const host = `chrome-extension://${extensionId}`;
  const target = await browser.waitForTarget(
    (t) => t.type() === 'service_worker' && t.url().startsWith(host),
  );
  const worker = await target.worker();
  if (worker !== null) {
    await worker.close();
  }
}
```
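The target-matching predicate passed to `browser.waitForTarget` can also be written as a standalone pure function, which makes it checkable without launching Chrome. The following is only a sketch: `TargetLike` and `isExtensionServiceWorker` are hypothetical names that do not exist in the codebase, and the structural type stands in for Puppeteer's `Target`.

```typescript
// Hypothetical structural stand-in for the two Puppeteer Target methods the predicate reads.
interface TargetLike {
  type(): string;
  url(): string;
}

// Same condition as the waitForTarget callback in stopServiceWorker:
// the target is a service worker AND its URL belongs to this extension.
export function isExtensionServiceWorker(t: TargetLike, extensionId: string): boolean {
  const host = `chrome-extension://${extensionId}`;
  return t.type() === 'service_worker' && t.url().startsWith(host);
}
```

With such a helper, the call site would read `browser.waitForTarget((t) => isExtensionServiceWorker(t, extensionId))`; whether the extraction is worth it is left to the executor.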
From RESEARCH Q2 Code Example Pattern 4 (driveA33 — host-side body):
```typescript
const A33_IDLE_WAIT_MS = 5 * 60 * 1000; // 300_000 — real wall-clock
const A33_NEW_SW_BOOT_MS = 500; // post-worker.close() settle
const A33_OVERALL_TIMEOUT_MS = A33_IDLE_WAIT_MS + 60_000; // 360_000
const A33_SAVE_ARCHIVE_TIMEOUT_MS = 15_000;
export async function driveA33(
  page: Page,
  browser: Browser,
  extensionId: string,
  downloadsDir: string,
): Promise<AssertionRecord> {
  const r: AssertionRecord = { name: 'A33', passed: false, checks: [], diagnostics: [] };
  // Step 1: prime recording on the probe tab
  await page.evaluate(() => (window as any).__mokoshHarness.setupFreshRecording());
  // Step 2: 5-min wall-clock idle
  r.diagnostics.push(`waiting ${A33_IDLE_WAIT_MS}ms for SW idle window`);
  await new Promise((res) => setTimeout(res, A33_IDLE_WAIT_MS));
  // Step 3: force SW termination via CDP
  await stopServiceWorker(browser, extensionId);
  r.diagnostics.push('SW terminated via worker.close()');
  // Step 4: brief settle for SW teardown
  await new Promise((res) => setTimeout(res, A33_NEW_SW_BOOT_MS));
  // Step 5: dispatch SAVE_ARCHIVE — wakes SW back up as an event
  // (event-driven respawn is the canonical MV3 wakeup path)
  const saveResult = await page.evaluate(() => (window as any).__mokoshHarness.dispatchSaveArchive());
  r.checks.push({
    name: 'A33.1: SAVE_ARCHIVE ack success after 5-min idle + SW kill',
    expected: true,
    actual: saveResult.success,
    passed: saveResult.success === true,
  });
  // Step 6: verify zip contains non-empty video buffer
  const zipPath = findLatestZip(downloadsDir);
  if (zipPath === null) {
    r.checks.push({ name: 'A33.0: zip present', expected: '>=1 zip', actual: 'none', passed: false });
    r.passed = false;
    return r;
  }
  const zip = await JSZip.loadAsync(readFileSync(zipPath));
  const videoEntry = zip.file('video/last_30sec.webm');
  const videoSize = videoEntry !== null
    ? (await videoEntry.async('uint8array')).byteLength
    : 0;
  r.checks.push({
    name: 'A33.2: video/last_30sec.webm size > 0 (buffer survived SW eviction)',
    expected: '>0',
    actual: String(videoSize),
    passed: videoSize > 0,
  });
  r.checks.push({
    name: 'A33.3: video size > 100 KB (sanity floor; real archives 1-3 MB)',
    expected: '>100000',
    actual: String(videoSize),
    passed: videoSize > 100_000,
  });
  r.passed = r.checks.every((c) => c.passed);
  return r;
}
```
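`driveA33` relies on `findLatestZip(downloadsDir)` returning `null` when no archive was downloaded. The real helper lives in `tests/uat/lib/harness-page-driver.ts` and is not reproduced in this plan; the sketch below only illustrates one plausible selection rule (newest `.zip` by modification time), and `FileStat` and `pickLatestZip` are hypothetical names introduced for illustration.

```typescript
// Hypothetical shape: a directory entry paired with its modification time.
interface FileStat {
  name: string;
  mtimeMs: number;
}

// Assumed rule (not confirmed by the plan): the "latest" zip is the .zip entry
// with the largest mtime; returns null when the directory holds no zip at all,
// which is the case driveA33 reports as the failed 'A33.0: zip present' check.
export function pickLatestZip(entries: readonly FileStat[]): string | null {
  const zips = entries.filter((e) => e.name.toLowerCase().endsWith('.zip'));
  if (zips.length === 0) {
    return null;
  }
  return zips.reduce((newest, e) => (e.mtimeMs > newest.mtimeMs ? e : newest)).name;
}
```

A filesystem-backed caller would build `entries` from a directory listing of `downloadsDir`; the pure form keeps the `null` branch easy to exercise.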
From RESEARCH Q2 sub-question (c) env-gate recommendation:
```typescript
// In tests/uat/harness.test.ts drivers-array entry:
{
  name: 'A33',
  drive: process.env.SKIP_LONG_UAT === '1'
    ? async (): Promise<AssertionRecord> => ({
        name: 'A33',
        passed: true,
        checks: [],
        diagnostics: ['A33 SKIPPED (SKIP_LONG_UAT=1; unset to run 5-min idle test)'],
      })
    : driveA33Wrapped,
},
```
Default polarity: SKIP_LONG_UAT unset → RUN A33 (this matches the closure + alpha-gate semantics; per-commit dev iteration uses SKIP_LONG_UAT=1).
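Because the env-gate is meant to be reused by future long-running assertions, the inline ternary could be factored into a small wrapper. This is a sketch under stated assumptions: `gateLongUat` is a hypothetical helper, the local `AssertionRecord` merely mirrors the literal used in `driveA33`, and the environment is passed in as a plain object so the function stays pure.

```typescript
// Local stand-in mirroring the record literal used by driveA33 (assumption).
interface AssertionRecord {
  name: string;
  passed: boolean;
  checks: unknown[];
  diagnostics: string[];
}

type Driver<P> = (page: P) => Promise<AssertionRecord>;

// Returns the real driver unless SKIP_LONG_UAT is exactly '1'; in that case it
// returns a placeholder that passes immediately and records why it was skipped.
export function gateLongUat<P>(
  name: string,
  drive: Driver<P>,
  env: Readonly<Record<string, string | undefined>>,
): Driver<P> {
  if (env.SKIP_LONG_UAT !== '1') {
    return drive; // default polarity: unset (or any other value) means RUN
  }
  return async () => ({
    name,
    passed: true,
    checks: [],
    diagnostics: [`${name} SKIPPED (SKIP_LONG_UAT=1; unset to run the long test)`],
  });
}
```

In the drivers array this would read `drive: gateLongUat('A33', driveA33Wrapped, process.env)`. Only the literal string `'1'` skips, matching the plan's `=== '1'` comparison.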
From src/offscreen/recorder.ts:91 (architecture invariant — segments only in offscreen RAM):
```typescript
let segments: Blob[] = []; // module-level state; NO chrome.storage.local persistence; NO IndexedDB
```
The spike's job: verify this RAM-only design survives a 5-min SW idle. If it does, ROADMAP SC #1 is satisfied with ZERO source-code changes — just a new harness assertion that exercises the path.
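The plan's threat model describes segments rotating every 10 s with a trim-to-last-3 policy. A minimal sketch of such a rolling buffer follows; it is not the code in `src/offscreen/recorder.ts`, and `RollingSegments` and `MAX_SEGMENTS` are hypothetical names. The element type is generic so the sketch runs without a browser `Blob`.

```typescript
// Assumed from the plan's figures: 3 segments × 10 s each = the last 30 s.
const MAX_SEGMENTS = 3;

export class RollingSegments<T> {
  private items: T[] = [];

  // Append the newest segment, then drop the oldest ones beyond the cap.
  push(segment: T): void {
    this.items.push(segment);
    if (this.items.length > MAX_SEGMENTS) {
      this.items.splice(0, this.items.length - MAX_SEGMENTS);
    }
  }

  // Copy of the retained segments, oldest first.
  snapshot(): T[] {
    return [...this.items];
  }
}

// Segments produced during the A33 idle window: 5 min × 60 s / 10 s per segment.
export const SEGMENTS_PRODUCED_IN_IDLE = (5 * 60) / 10;
```

Under these assumptions the recorder produces 30 segments during the idle window, but the buffer never holds more than 3 at once; that is the state the spike expects to find intact after the SW is killed.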
</interfaces>
</context>
<tasks>
<task type="auto">
<name>Task 1: Wave 0 SPIKE — empirical verification that offscreen survives 5-min SW idle</name>
<files>tests/uat/lib/harness-page-driver.ts</files>
<read_first>tests/uat/lib/harness-page-driver.ts (full; ~2200 lines — read selectively: imports lines 1-40, findLatestZip ~1395, driveA30 host-side filter ~2039-2148), tests/uat/extension-page-harness.ts:600-700 (setupFreshRecording helper), src/offscreen/recorder.ts:80-100 (segments array context), .planning/phases/04-harden-clean-up-optional/04-RESEARCH.md Q2 sub-question (b), .planning/phases/01-stabilize-video-pipeline/01-07-SUMMARY.md (Plan 07 spike precedent)</read_first>
<action>
1. Add the `stopServiceWorker(browser, extensionId)` helper to `tests/uat/lib/harness-page-driver.ts` per the Code Example in `<interfaces>` above. Place it near the top of the file (after existing imports + before existing driveA-* functions) and export it, since the spike script in step 2 imports it from this module. Add the `import type { Browser } from 'puppeteer';` if not already present.
2. Create a one-shot spike script `tests/uat/spike-a33-sw-persistence.ts` (NEW; treat as a scratch file for this spike; it may be deleted or kept once the spike concludes, see step 5; record the outcome in the plan SUMMARY). The script:
- Imports `launchHarnessBrowser` from `./lib/launch.ts`.
- Imports `stopServiceWorker` + `findLatestZip` from `./lib/harness-page-driver.ts`.
- Imports `JSZip` from `jszip` and `readFileSync` from `node:fs` (both are used for the zip check below).
- Launches the harness browser, primes recording via the harness page's setupFreshRecording method.
- `console.log('SPIKE: waiting 5 minutes for SW idle window...')`
- `await new Promise(r => setTimeout(r, 5 * 60 * 1000));`
- `await stopServiceWorker(handles.browser, handles.extensionId);`
- `await new Promise(r => setTimeout(r, 500));` (settle)
- Dispatch SAVE_ARCHIVE via `await handles.harnessPage.evaluate(() => (window as any).__mokoshHarness.dispatchSaveArchive());`
- `await new Promise(r => setTimeout(r, 5000));` (let download complete)
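- `const zipPath = findLatestZip(handles.downloadsDir);`
- If `zipPath` is `null` (no zip was downloaded), treat the spike as FAILED per step 4 instead of passing `null` to `readFileSync`.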
- `const zip = await JSZip.loadAsync(readFileSync(zipPath));`
- `const videoEntry = zip.file('video/last_30sec.webm');`
- `const videoSize = videoEntry ? (await videoEntry.async('uint8array')).byteLength : 0;`
- `console.log(\`SPIKE RESULT: videoSize=${videoSize} bytes (>0 = OFFSCREEN SURVIVED; =0 = OFFSCREEN DIED)\`);`
- `await handles.browser.close();`
3. Run the spike: `tsx tests/uat/spike-a33-sw-persistence.ts` with HEADLESS=1 (so it runs in CI mode; ~5 min wall-clock).
4. Record the result. If videoSize > 100_000 → SPIKE PASSED (offscreen survives) → proceed to Task 2 with verification-only A33. If videoSize ≤ 100_000 OR throw → SPIKE FAILED → SUMMARY documents the failure mode + flag to plan-checker for re-planning (IndexedDB persistence work would expand Plan 04-04 substantially; that's a planning event, not an execution event).
5. Commit the `stopServiceWorker` helper (Task 1's persisting artifact). The spike script is OK to delete OR keep committed as `tests/uat/spike-*.ts` for future SW-lifecycle investigations.
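The decision rule in step 4 can be captured in a few lines so the spike script and the SUMMARY use the same threshold. The sketch below is illustrative only; `classifySpike`, `SpikeOutcome`, and `SPIKE_SIZE_FLOOR_BYTES` are hypothetical names, and a `null` size is used here to represent "no zip found or the run threw".

```typescript
// Sanity floor from step 4: strictly more than 100_000 bytes counts as a pass.
export const SPIKE_SIZE_FLOOR_BYTES = 100_000;

export type SpikeOutcome = 'SPIKE PASSED' | 'SPIKE FAILED';

// videoSize === null models a missing zip or a thrown error (assumption).
export function classifySpike(videoSize: number | null): SpikeOutcome {
  if (videoSize !== null && videoSize > SPIKE_SIZE_FLOOR_BYTES) {
    return 'SPIKE PASSED'; // offscreen buffer survived; proceed to Task 2
  }
  return 'SPIKE FAILED'; // document in SUMMARY and flag for re-planning
}
```

Note that a size of exactly 100_000 bytes falls on the FAILED side, consistent with the `≤ 100_000` wording in step 4.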
</action>
<verify>
<automated>npx tsc --noEmit && HEADLESS=1 tsx tests/uat/spike-a33-sw-persistence.ts 2>&1 | tee /tmp/04-04-spike.log; grep -c 'SPIKE RESULT' /tmp/04-04-spike.log</automated>
</verify>
<acceptance_criteria>
- `stopServiceWorker(browser, extensionId)` helper exists at `tests/uat/lib/harness-page-driver.ts` with the canonical Chrome devrel signature (`Browser` + extensionId args; body awaits `target.worker()` and calls `close()` on the result only when it is non-null).
- Spike script ran to completion (no Puppeteer throw).
- Spike result logged with explicit `videoSize=<N> bytes` line.
- If videoSize > 100_000: spike PASSED; proceed to Task 2 with verification-only path.
- If videoSize ≤ 100_000: spike FAILED; pause the plan and document the failure mode in the SUMMARY so the plan-checker can flag it for re-planning (the re-planning itself is out of scope for this task).
- Total spike wall-clock: ~6-7 minutes (5 min idle + ~1-2 min orchestration).
</acceptance_criteria>
<done>Spike outcome recorded in plan SUMMARY; stopServiceWorker helper committed. Atomic commit: `feat(04-04): Wave 0 spike — stopServiceWorker helper + 5-min SW idle empirical result`.</done>
</task>
<task type="auto">
<name>Task 2: Wave 1 — A33 assertion + driveA33 + orchestrator wiring (assumes spike PASSED)</name>
<files>tests/uat/extension-page-harness.ts, tests/uat/lib/harness-page-driver.ts, tests/uat/harness.test.ts</files>
<read_first>tests/uat/extension-page-harness.ts:3517-3636 (assertA30 — canonical setupFreshRecording + SAVE pattern), tests/uat/extension-page-harness.ts:3971-4000 (__mokoshHarness global registration block), tests/uat/lib/harness-page-driver.ts:2039-2148 (driveA30 — host-side filter pattern), tests/uat/harness.test.ts:100-110 (import block), tests/uat/harness.test.ts:340-360 (wrapped-driver block), tests/uat/harness.test.ts:459-486 (drivers-array push block), tests/uat/harness.test.ts:225-240 (SKIP_PROD_REBUILD env-gate pattern)</read_first>
<action>
**GATING CONDITION:** Task 1 spike produced videoSize > 100_000. (If FAILED, this task is BLOCKED and the plan must be re-planned to add IndexedDB persistence work.)
3-file lockstep update per the Approach B harness extension pattern:
**File 1: tests/uat/extension-page-harness.ts**
- Locate the existing `__mokoshHarness` registration block (~line 3971) and the `__mokoshHarness` Window interface declaration (~line 3950).
- Add a thin `dispatchSaveArchive()` helper to `__mokoshHarness` if not already present (it may exist as `dispatchSaveArchiveForA33` or similar; reuse the existing SAVE_ARCHIVE dispatch). If the existing `setupFreshRecording` already covers Step 1 (priming the recording), no new page-side helper is needed — driveA33 calls it directly via `page.evaluate`.
- If a Step-1 page-side helper IS needed for driveA33: add a thin wrapper `setupFreshRecordingForA33` that's a 1-line forwarder to existing `setupFreshRecording`. Per RESEARCH note (FORBIDDEN_HOOK_STRINGS stays at 12): NO new test-only symbol needed — the new helper calls existing production-surface APIs.
- Add `assertA33` ENTRY to the `__mokoshHarness` window interface declaration (`assertA33: () => Promise<AssertionResult>;`) IF a page-side assertA33 is needed. Per RESEARCH driveA33 pattern: the host-side driveA33 owns the 5-min wait + SW kill + SAVE dispatch via the existing harness methods — likely NO new `assertA33` page-side function is needed; the host-side drives everything via existing primitives.
- If page-side function is NOT needed: just verify orchestrator uses host-only driveA33 (Step 1's setupFreshRecording is already there; Step 5's dispatchSaveArchive call uses existing SAVE_ARCHIVE messaging).
- Decision recorded in plan SUMMARY.
**File 2: tests/uat/lib/harness-page-driver.ts**
- Append `driveA33` function per RESEARCH Code Example Pattern 4 (full body in `<interfaces>` above).
- Place it after the existing driveA32 (which is the most-recent Phase 3 addition).
- Verify the `stopServiceWorker` helper from Task 1 is in scope (same file).
- Filter-pipeline form; no `continue`; typed function signature `(page, browser, extensionId, downloadsDir) => Promise<AssertionRecord>` per the new 4-arg shape.
- Add `import { readFileSync } from 'node:fs';` + `import JSZip from 'jszip';` if not already present (they should be — these are reused from driveA29/30/31).
**File 3: tests/uat/harness.test.ts**
- Import: add `driveA33,` to the import block at ~line 101 (alongside `driveA29`-`driveA32`).
- Wrapped-driver: add at ~line 357 (after `driveA31Wrapped`):
```typescript
// Plan 04-04 — driveA33 needs Browser + extensionId for CDP-based SW kill
// AND downloadsDir for host-side JSZip parse of post-restart zip.
const driveA33Wrapped: (page: import('puppeteer').Page) => Promise<AssertionRecord> =
  (page) => driveA33(page, handles.browser, handles.extensionId, handles.downloadsDir);
```
- Drivers-array push: add at ~line 486 (after the existing A32 entry):
```typescript
// Plan 04-04 A33: SW state persistence 5-min idle (ROADMAP SC #1; RESEARCH Q2).
// Forces SW eviction via Puppeteer CDP worker.close() per the canonical
// Chrome devrel pattern (RESEARCH Pattern 1). Verifies offscreen-RAM
// segments survive SW restart. Env-gated by SKIP_LONG_UAT for fast
// per-commit iteration; defaults to RUN for Phase 4 closure + alpha gate.
{
  name: 'A33',
  drive: process.env.SKIP_LONG_UAT === '1'
    ? async (): Promise<AssertionRecord> => ({
        name: 'A33',
        passed: true,
        checks: [],
        diagnostics: ['A33 SKIPPED (SKIP_LONG_UAT=1; unset to run 5-min idle test)'],
      })
    : driveA33Wrapped,
},
```
Verify:
- `npx tsc --noEmit` exits 0.
- `npm run build:test` exits 0.
- Quick UAT: `HEADLESS=1 SKIP_PROD_REBUILD=1 SKIP_LONG_UAT=1 npm run test:uat` exits 0 with 34/34 GREEN (A33 SKIPPED message visible; preserves baseline + adds A33 skip placeholder).
- Full UAT: `HEADLESS=1 SKIP_PROD_REBUILD=1 npm run test:uat` exits 0 with 34/34 GREEN (A33 actually runs ~6 min wall-clock; A33.1 SAVE ack + A33.2 size > 0 + A33.3 size > 100 KB all PASS).
- Tier-1 FORBIDDEN_HOOK_STRINGS check: `grep -c 'FORBIDDEN_HOOK_STRINGS' tests/uat/harness.test.ts tests/background/no-test-hooks-in-prod-bundle.test.ts` locates the inventory in both files (note that `grep -c` counts lines mentioning the identifier, not array entries); then confirm the number of entries in each inventory array is unchanged (preserves the 12-entry invariant per CONTEXT §"Claude's Discretion").
</action>
<verify>
<automated>npx tsc --noEmit && npm run build:test && HEADLESS=1 SKIP_PROD_REBUILD=1 SKIP_LONG_UAT=1 npm run test:uat 2>&1 | tail -5 | tee /tmp/04-04-task-2-skip.log; grep -c '34/34' /tmp/04-04-task-2-skip.log</automated>
</verify>
<acceptance_criteria>
- `npx tsc --noEmit` exits 0.
- `npm run build:test` exits 0.
- UAT harness count flips 33 → 34 (A33 added).
- Skip-mode run: `HEADLESS=1 SKIP_PROD_REBUILD=1 SKIP_LONG_UAT=1 npm run test:uat` GREEN 34/34 (A33 SKIPPED placeholder GREEN; total takes ~95s — unchanged).
- Full-mode run: `HEADLESS=1 SKIP_PROD_REBUILD=1 npm run test:uat` GREEN 34/34 (~6.5 min; A33 actually runs and passes A33.1 + A33.2 + A33.3).
- `grep -c 'A33' tests/uat/harness.test.ts` returns ≥ 4 (import + wrapped + push + comment banner).
- `grep -c 'SKIP_LONG_UAT' tests/uat/harness.test.ts` returns ≥ 2 (env-gate + comment).
- FORBIDDEN_HOOK_STRINGS count unchanged at 12 (no new test-only symbols introduced per CONTEXT §"Claude's Discretion"; verify by `wc -l` of the inventory arrays).
</acceptance_criteria>
<done>A33 lands; UAT 33→34 GREEN; SW persistence empirically verified at 5-min idle scale. Atomic commit: `feat(04-04): Wave 1 — A33 SW state persistence harness assertion (34/34 GREEN)`.</done>
</task>
</tasks>
<threat_model>
## Trust Boundaries
| Boundary | Description |
|----------|-------------|
| Puppeteer CDP → Chrome MV3 SW realm | `worker.close()` terminates the SW by closing its CDP target (for service workers, Puppeteer sends `Target.closeTarget` and detaches the session; it does not unregister the worker). This tears down the SW realm but does NOT touch the offscreen document's WebContents target. Native CDP surface; no untrusted input. |
| Test idle interval (5 min wall-clock) → MediaRecorder active segment buffer | The MediaRecorder is in `state === 'recording'` during the idle; segments rotate every 10s, so about 30 segments are produced over the window (5 min × 60 sec / 10 sec per segment); trim-to-last-3 means the offscreen-RAM array holds at most 3 of them at any time, keeping memory bounded ≤ 30 MB (well under CON-ram-ceiling) |
## STRIDE Threat Register
| Threat ID | Category | Component | Disposition | Mitigation Plan |
|-----------|----------|-----------|-------------|-----------------|
| T-04-04-01 | Tampering | a future architectural change might move the segments array to SW-side state, breaking the offscreen-survives-SW assumption A33 verifies | mitigate | A33 is a regression-catching gate; if a future PR moves segments off-offscreen, A33 fails fast (videoSize=0 after SW kill) |
| T-04-04-02 | DoS (CI) | A33's 5-min idle adds ~5 min to harness wall-clock (95s → 395s); per-commit CI lanes would suffer | mitigate | Env-gated by SKIP_LONG_UAT (default RUN for closure + alpha; documented per-commit SKIP_LONG_UAT=1 for dev iteration) |
| T-04-04-03 | Repudiation | natural 30s idle eviction does NOT fire under Puppeteer's CDP attach per Chrome docs; if a developer naively writes "wait 5 min and hope SW dies" the test silently passes via a SW that never died | mitigate | The CDP `worker.close()` call is explicit + cited in code comment; RESEARCH Pitfall 4 documents the misconception |
| T-04-04-04 | Spoofing | Puppeteer 25.x releases could in theory change `Worker.close()` semantics; the canonical Chrome devrel pattern requires Puppeteer ≥22.1.0 | accept | The project pin `puppeteer: ^25.0.2` is comfortably past the 22.1.0 floor; minor and patch drift within 25.x is expected to be backward-compatible per Puppeteer's semver discipline. If A33 ever fails post-Puppeteer-upgrade, the SUMMARY's commit ref provides the exact Puppeteer version where it was validated. |
</threat_model>
<verification>
- `npx tsc --noEmit` exits 0.
- `npm run build:test` exits 0.
- UAT harness count 33 → 34.
- Skip-mode: `HEADLESS=1 SKIP_LONG_UAT=1 npm run test:uat` GREEN 34/34 in ~95s.
- Full-mode: `HEADLESS=1 npm run test:uat` GREEN 34/34 in ~6.5 min.
- ROADMAP SC #1 GREEN — A33 empirical evidence: video buffer survived 5-min SW idle + worker.close().
- FORBIDDEN_HOOK_STRINGS count unchanged at 12.
- vitest baseline preserved (≥ 181 GREEN from Plans 04-01 + 04-02).
- A29 + A30 + A31 + A32 unchanged (no regression to existing assertions).
</verification>
<success_criteria>
- Wave 0 spike PASSED — empirical evidence that offscreen survives 5-min SW idle (Task 1).
- driveA33 + stopServiceWorker helper + harness orchestrator wiring landed, plus a page-side assertA33 only if Task 2 determined one is needed (Task 2).
- UAT harness count 33 → 34 GREEN.
- ROADMAP SC #1 (SW state persistence) GREEN.
- Env-gated long-test pattern established (SKIP_LONG_UAT) — pattern reused by any future ≥5-min test.
- Pre-checkpoint bundle gates 6/6 PASS unchanged (Plan 04-04 makes no source-code changes).
</success_criteria>
<output>
After completion, create `.planning/phases/04-harden-clean-up-optional/04-04-SUMMARY.md` capturing:
- Spike outcome (videoSize value + interpretation; SPIKE PASSED/FAILED tag)
- stopServiceWorker helper diff (full body)
- driveA33 diff (full body)
- Orchestrator wiring diff (3 sites in harness.test.ts)
- SKIP_LONG_UAT env-gate decision (default RUN; rationale)
- UAT before/after (33/33 → 34/34)
- Full-mode wall-clock benchmark (e.g., ~6.5 min)
- ROADMAP SC #1 closure evidence
- Commit refs (Task 1 spike + Task 2 impl)
- If spike FAILED: detailed failure mode + flag for re-planning (this branch is unlikely per RESEARCH MEDIUM-confidence; document as ALPHA-PATH-NOT-TAKEN)
</output>