Mark 76fffb35b9 fix(04): revise plans per checker iter-1 — 2 BLOCKERS + 2 WARNINGS fixed
Plan-checker iter-1 found 2 BLOCKERS + 4 WARNINGS. Iter-2 revision applies
surgical fixes to 4 plans + VALIDATION:

BLOCKER 1 (Plan 04-06 Task 4): wrong SW chunk glob `dist/assets/index*-bg.js`
matched zero files → Gates 2/3/4 silently PASSED. Replaced with canonical
`dist/assets/index.ts-*.js` (verified empirically: index.ts-8LkXuqac.js
on disk; RESEARCH Q1). Added glob-existence pre-gate `ls | wc -l >= 1`
to fail-loudly on future Vite chunk-naming shift.

BLOCKER 2 (Plan 04-04 Task 1): spike called non-existent
__mokoshHarness.dispatchSaveArchive (verified: harness surface is
assertA1..A31 + getManifestVersion only). Applied Option B — spike
+ driveA33 now dispatch SAVE_ARCHIVE via chrome.runtime.sendMessage
inline in page.evaluate (matches 9 existing assertA* methods:
A5/A11/A12/A13/A26/A28/A29/A30/A31). No new harness helper introduced.

WARNING 1 (Plan 04-02 Task 2): verify omitted UAT harness run. Added
`HEADLESS=1 SKIP_PROD_REBUILD=0 npm run test:uat 2>&1 | grep -c 'UAT
harness: 33/33 assertions passed'` to verify command (stdout format
confirmed at tests/uat/harness.test.ts:537).

WARNING 4 (Plan 04-07 Task 1): weak operator-ack gate (placeholder would
pass). Added `grep -cE 'approved|All good|APPROVED|approved by|operator
ack|all good' 04-VERIFICATION.md` to verify command. Covers both
canonical Plan 04-06 resume-signal ("approved" lowercase) AND prior-art
Plan 01-10 cycle-2 ack ("All good" titlecase).

WARNINGS 2 + 3 left as-is (truly advisory: scope-sanity threshold +
conservative dependency without file overlap).

04-VALIDATION.md per-task map rows updated for the 5 revised task entries
(04-02 T2 + 04-04 T1 + 04-04 T2 + 04-06 T4 + 04-07 T1). Frontmatter
adds `revised: 2026-05-21` + iter-2 notes block.

3 plans unchanged on disk (04-01, 04-03, 04-05).

Empirical confirmations used in revision:
- Harness surface: grep extension-page-harness.ts:4018 confirms
  __mokoshHarness.{assertA1..A31, getManifestVersion}; no dispatchSaveArchive
- SW chunk filename: ls dist/assets/ shows index.ts-8LkXuqac.js;
  no index*-bg.js matches
- SAVE_ARCHIVE precedent count: 9 existing assertA* methods use the
  chrome.runtime.sendMessage pattern
- UAT harness stdout format: harness.test.ts:537 emits canonical
  "UAT harness: N/N assertions passed"

Ready for plan-checker iter-3 re-verification.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 10:00:07 +02:00


phase slug plan type wave depends_on files_modified autonomous requirements tags user_setup must_haves
04 harden-clean-up-optional 04 execute 3
01
02
03
tests/uat/extension-page-harness.ts
tests/uat/lib/harness-page-driver.ts
tests/uat/harness.test.ts
true
uat-harness
a33
sw-state-persistence
sw-eviction
spike-first
cdp-worker-close
roadmap-sc-1
charter-d-p4-01
truths artifacts key_links
Wave 0 spike verifies empirically whether the offscreen document survives 5-min SW idle + worker.close() while MediaRecorder is actively recording — informs whether A33 is verification-only OR needs IndexedDB persistence work
stopServiceWorker(browser, extensionId) helper exists in tests/uat/lib/harness-page-driver.ts using Puppeteer CDP browser.waitForTarget + worker.close() per Chrome devrel canonical pattern
assertA33 / driveA33 land per the spike outcome: if PASS (offscreen survives) → verification-only A33 that does a real 5-min wall-clock idle + SW kill + SAVE → asserts archive's video/last_30sec.webm size > 100 KB
A33 is env-gated by SKIP_LONG_UAT (default: RUN for closure + alpha gate; SKIP_LONG_UAT=1 to skip for per-commit iteration)
UAT harness count flips from 33 → 34 (A33 added); 34/34 GREEN when SKIP_LONG_UAT unset
ROADMAP SC #1 (SW state persistence) GREEN — A33 empirical evidence that a real-world 5-min idle + SAVE produces a non-empty video buffer
SAVE_ARCHIVE dispatch reuses the canonical chrome.runtime.sendMessage({type: 'SAVE_ARCHIVE'}, ...) pattern from the harness page realm (NOT a new __mokoshHarness helper); matches the established assertA5/A11/A12/A13/A26/A28/A29/A30/A31 precedent
artifacts:

| path | provides | contains |
|---|---|---|
| tests/uat/extension-page-harness.ts | assertA33 page-side stub (or thin driver) for SAVE_ARCHIVE dispatch after host-side SW kill | `assertA33` |
| tests/uat/lib/harness-page-driver.ts | stopServiceWorker(browser, extensionId) NEW helper + driveA33 host-side CDP-kill + JSZip video-size check | `worker.close()` |
| tests/uat/harness.test.ts | driveA33 import + wrapped driver const (passes handles.browser + handles.extensionId + handles.downloadsDir) + drivers-array push + SKIP_LONG_UAT env-gate wrapper | `driveA33Wrapped` |

key_links:

| from | to | via | pattern |
|---|---|---|---|
| tests/uat/harness.test.ts driveA33Wrapped | tests/uat/lib/harness-page-driver.ts driveA33(page, browser, extensionId, downloadsDir) | `(page) => driveA33(page, handles.browser, handles.extensionId, handles.downloadsDir)` | `handles.browser.*handles.extensionId` |
| tests/uat/lib/harness-page-driver.ts stopServiceWorker | Chrome MV3 SW target via Puppeteer CDP | `` browser.waitForTarget(t => t.type()==='service_worker' && t.url().startsWith(`chrome-extension://${extensionId}`)) + target.worker().close() `` | `service_worker` |
| tests/uat/lib/harness-page-driver.ts driveA33 video-size check | zip.file('video/last_30sec.webm') → byteLength > 100_000 | `JSZip.loadAsync + entry.async('uint8array')` | `video/last_30sec.webm` |
Ship the A33 harness assertion that empirically verifies ROADMAP SC #1 (SW state persistence across the 30s idle unload edge cases). Per RESEARCH Q2: the current architecture stores segments only in offscreen-document RAM (src/offscreen/recorder.ts:91 `let segments: Blob[] = []`). The SW NEVER stores the buffer. So the actual question becomes: does the offscreen document survive 5 minutes of SW idle? Per Chrome docs, the offscreen has its own lifecycle independent of the SW, with active `MediaRecorder` being the canonical "compelling reason" to keep the offscreen alive.

This plan uses the SPIKE-FIRST approach to avoid over-engineering:

Wave 0 (spike): A 30-min empirical investigation. Start recording; wait 5 min of real wall-clock time; force-kill the SW via Puppeteer CDP worker.close() (Chrome devrel canonical pattern; Puppeteer ≥22.1.0 supports it, and our ^25 pin is comfortably above); dispatch SAVE_ARCHIVE from page.evaluate; check the size of video/last_30sec.webm in the resulting archive.

Wave 1 (impl): Based on spike outcome:

  • If spike PASSES (likely outcome per RESEARCH architecture analysis + Chrome docs): A33 is a VERIFICATION-ONLY harness assertion that wraps the spike methodology into a repeatable test. Ships the spike's exact pattern as driveA33 + assertA33 + orchestrator wiring + env-gate. ROADMAP SC #1 is satisfied by the CURRENT architecture; no persistence layer needed.
  • If spike FAILS (the offscreen dies along with the SW, contrary to Chrome docs): A33 implementation expands per RESEARCH Q2 sub-question (b) recommendation (Option C: IndexedDB persistence in offscreen — Blobs serialize cleanly to IDB; structured-clone supports them natively; per-segment write ~3 MB; ~3 writes per 30s window). This is a wider plan rewrite; the plan-checker should flag for re-planning if it materializes. RESEARCH confidence on offscreen-surviving-SW-kill is MEDIUM; the spike-first approach is the canonical risk hedge per Plan 01-07 precedent.
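
The branch between the two outcomes reduces to a single threshold check. A minimal sketch, assuming the 100 KB sanity floor used elsewhere in this plan; `classifySpike`, `SpikeOutcome`, and `SPIKE_MIN_VIDEO_BYTES` are hypothetical names introduced here for illustration and are not part of the harness:

```typescript
// Hypothetical helper (not in the harness): maps the spike's measured
// video size to the Wave 1 branch described above.
type SpikeOutcome =
  | { verdict: 'PASSED'; next: 'verification-only A33' }
  | { verdict: 'FAILED'; next: 're-plan: IndexedDB persistence' };

// Sanity floor taken from this plan (A33.3); real archives are 1-3 MB.
const SPIKE_MIN_VIDEO_BYTES = 100_000;

function classifySpike(videoSize: number): SpikeOutcome {
  // Strictly greater than the floor passes; exactly 100_000 still fails.
  return videoSize > SPIKE_MIN_VIDEO_BYTES
    ? { verdict: 'PASSED', next: 'verification-only A33' }
    : { verdict: 'FAILED', next: 're-plan: IndexedDB persistence' };
}
```

Keeping the threshold in one constant means the spike script and the later A33.3 check cannot drift apart silently.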

Purpose: Provides the empirical evidence for ROADMAP SC #1 ("After running the extension idle for >5 minutes, then exporting, the archive still contains a non-empty video buffer"). The spike-first approach hedges against the RESEARCH MEDIUM-confidence assumption A3 (offscreen survives SW eviction). If the assumption holds, A33 is a verification gate; if not, persistence work is clearly needed.

Output: 1 NEW assertion (A33; harness count 33→34); 1 NEW helper (stopServiceWorker CDP wrapper); 3-file lockstep update per the Approach B pattern (extension-page-harness.ts + harness-page-driver.ts + harness.test.ts); env-gate via SKIP_LONG_UAT (default = RUN; set to '1' to skip for per-commit iteration).

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/04-harden-clean-up-optional/04-CONTEXT.md @.planning/phases/04-harden-clean-up-optional/04-RESEARCH.md @.planning/phases/04-harden-clean-up-optional/04-PATTERNS.md

Source files — locus of the harness extension

@tests/uat/extension-page-harness.ts @tests/uat/lib/harness-page-driver.ts @tests/uat/harness.test.ts @tests/uat/lib/launch.ts @src/offscreen/recorder.ts @src/background/index.ts

Prior plan SUMMARYs to mirror — Approach B harness extension precedent

@.planning/phases/02-stabilize-export-pipeline/02-04-SUMMARY.md @.planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-04-SUMMARY.md

From tests/uat/lib/launch.ts:80-90 (HarnessHandles — already exposes browser + extensionId; no extension needed):

export interface HarnessHandles {
  readonly browser: Browser;             // ← already exposed; used by driveA33
  readonly extensionId: string;          // ← already exposed; used by driveA33
  readonly harnessPage: Page;
  readonly victimPage: Page;
  readonly downloadsDir: string;
  readonly swConsole: string[];
  readonly offConsole: string[];
}

From RESEARCH Q2 Code Example Pattern 1 (stopServiceWorker — NEW helper; verbatim from Chrome devrel doc):

import type { Browser } from 'puppeteer';

/**
 * Force-terminate the MV3 service worker via Puppeteer CDP. Required
 * because Puppeteer's persistent CDP attach keeps SWs alive indefinitely;
 * natural 30s idle eviction does NOT fire under test conditions per Chrome
 * docs (https://developer.chrome.com/docs/extensions/develop/concepts/service-workers/lifecycle).
 *
 * Reference: https://developer.chrome.com/docs/extensions/how-to/test/test-serviceworker-termination-with-puppeteer
 */
export async function stopServiceWorker(browser: Browser, extensionId: string): Promise<void> {
  const host = `chrome-extension://${extensionId}`;
  const target = await browser.waitForTarget(
    (t) => t.type() === 'service_worker' && t.url().startsWith(host),
  );
  const worker = await target.worker();
  if (worker !== null) {
    await worker.close();
  }
}
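
The predicate passed to `browser.waitForTarget` is the part of this helper that can be checked without a browser. A sketch under assumptions: `isExtensionServiceWorker` is a hypothetical extraction, not existing harness code, and it takes plain strings (the values `t.type()` and `t.url()` yield) instead of a Puppeteer `Target`. The trailing slash on the host is a small addition over the predicate above.

```typescript
// Hypothetical pure version of the waitForTarget predicate.
// Accepts the target's type and URL as strings so it runs without Chrome.
function isExtensionServiceWorker(
  targetType: string,
  targetUrl: string,
  extensionId: string,
): boolean {
  // Trailing slash (an addition, not in the original predicate) stops an ID
  // that happens to be a prefix of another ID from matching.
  const host = `chrome-extension://${extensionId}/`;
  return targetType === 'service_worker' && targetUrl.startsWith(host);
}
```

Inside the helper this would be called as `browser.waitForTarget((t) => isExtensionServiceWorker(t.type(), t.url(), extensionId))`.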

SAVE_ARCHIVE dispatch (REVISION iter-2 — Option B per plan-checker BLOCKER 2): The harness page realm has chrome.runtime available (it's an extension-page realm — chrome-extension://<id>/tests/uat/extension-page-harness.html). The canonical SAVE_ARCHIVE dispatch is chrome.runtime.sendMessage({type: 'SAVE_ARCHIVE'}, callback) — used by 9 existing assertions (A5/A11/A12/A13/A26/A28/A29/A30/A31; verified via grep "type: 'SAVE_ARCHIVE'" tests/uat/extension-page-harness.ts). The __mokoshHarness surface is assertA1..A31 + getManifestVersion; there is NO dispatchSaveArchive helper. We do NOT add one — driveA33 dispatches SAVE_ARCHIVE directly via page.evaluate + a promise-wrapped chrome.runtime.sendMessage callback. This matches Plan 04-05's approach (which dispatches SAVE_ARCHIVE from inside an existing assertA34 method that runs in the harness page realm).

From RESEARCH Q2 Code Example Pattern 4 (driveA33 — host-side body; REVISION iter-2 inline-dispatched SAVE):

const A33_IDLE_WAIT_MS = 5 * 60 * 1000;           // 300_000 — real wall-clock
const A33_NEW_SW_BOOT_MS = 500;                   // post-worker.close() settle
const A33_OVERALL_TIMEOUT_MS = A33_IDLE_WAIT_MS + 60_000;  // 360_000
const A33_SAVE_ARCHIVE_TIMEOUT_MS = 15_000;

export async function driveA33(
  page: Page,
  browser: Browser,
  extensionId: string,
  downloadsDir: string,
): Promise<AssertionRecord> {
  const r: AssertionRecord = { name: 'A33', passed: false, checks: [], diagnostics: [] };

  // Step 1: prime recording on the probe tab via the existing harness primitive.
  // setupFreshRecording is a module-internal helper inside extension-page-harness.ts
  // and is reachable from page.evaluate only if exposed; in practice driveA33 calls
  // assertA1 (or an equivalent existing harness method that primes a fresh recording)
  // OR a thin Plan-04-04 page-side wrapper if the prior arts don't suffice. Verify
  // in Task 2 read_first which existing assertA* method delivers a fresh-recording
  // SUT and reuse it directly (no new harness method needed).
  await page.evaluate(async () => {
    // Reuse the same fresh-recording primitive that A5/A26/A30/A31 use as their Step 1.
    // The exact call depends on whether setupFreshRecording is exposed; if not, A33's
    // first step calls __mokoshHarness.assertA1 (which is the canonical "fresh
    // recording bootstrap" assertion in the harness surface).
    const harness = (window as { __mokoshHarness: { assertA1: () => Promise<unknown> } }).__mokoshHarness;
    await harness.assertA1();
  });

  // Step 2: 5-min wall-clock idle
  r.diagnostics.push(`waiting ${A33_IDLE_WAIT_MS}ms for SW idle window`);
  await new Promise((res) => setTimeout(res, A33_IDLE_WAIT_MS));

  // Step 3: force SW termination via CDP
  await stopServiceWorker(browser, extensionId);
  r.diagnostics.push('SW terminated via worker.close()');

  // Step 4: brief settle for SW teardown
  await new Promise((res) => setTimeout(res, A33_NEW_SW_BOOT_MS));

  // Step 5: dispatch SAVE_ARCHIVE via chrome.runtime.sendMessage from the harness
  // page realm — matches the established A5/A11/A12/A13/A26/A28/A29/A30/A31 pattern.
  // The message wakes the SW back up (event-driven respawn is the canonical MV3 wakeup path).
  const saveResult = await page.evaluate(
    (timeoutMs: number) =>
      new Promise<{ success: boolean; error?: string }>((resolve) => {
        const timer = setTimeout(() => {
          resolve({ success: false, error: `SAVE_ARCHIVE timed out after ${timeoutMs}ms` });
        }, timeoutMs);
        chrome.runtime.sendMessage({ type: 'SAVE_ARCHIVE' }, (response: unknown) => {
          clearTimeout(timer);
          if (chrome.runtime.lastError !== undefined) {
            resolve({ success: false, error: String(chrome.runtime.lastError.message) });
            return;
          }
          resolve(response as { success: boolean; error?: string });
        });
      }),
    A33_SAVE_ARCHIVE_TIMEOUT_MS,
  );

  r.checks.push({
    name: 'A33.1: SAVE_ARCHIVE ack success after 5-min idle + SW kill',
    expected: true,
    actual: saveResult.success,
    passed: saveResult.success === true,
  });

  // Step 6: verify zip contains non-empty video buffer
  const zipPath = findLatestZip(downloadsDir);
  if (zipPath === null) {
    r.checks.push({ name: 'A33.0: zip present', expected: '>=1 zip', actual: 'none', passed: false });
    r.passed = false;
    return r;
  }
  const zip = await JSZip.loadAsync(readFileSync(zipPath));
  const videoEntry = zip.file('video/last_30sec.webm');
  const videoSize = videoEntry !== null
    ? (await videoEntry.async('uint8array')).byteLength
    : 0;
  r.checks.push({
    name: 'A33.2: video/last_30sec.webm size > 0 (buffer survived SW eviction)',
    expected: '>0',
    actual: String(videoSize),
    passed: videoSize > 0,
  });
  r.checks.push({
    name: 'A33.3: video size > 100 KB (sanity floor; real archives 1-3 MB)',
    expected: '>100000',
    actual: String(videoSize),
    passed: videoSize > 100_000,
  });

  r.passed = r.checks.every((c) => c.passed);
  return r;
}
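
`driveA33` relies on an `AssertionRecord` type that this plan does not reproduce. The sketch below infers a plausible shape from the fields used above (`name`, `passed`, `checks`, `diagnostics`); the real definition lives in the harness and may differ. `Check`, `rollUp`, and `videoSizeChecks` are hypothetical names. It also shows one property of the `every`-based roll-up: a record with an empty `checks` array rolls up to `true`, so a driver that returns before pushing any check has to set `passed` explicitly.

```typescript
// Inferred shapes (assumption): field names are taken from driveA33 above.
interface Check {
  name: string;
  expected: unknown;
  actual: unknown;
  passed: boolean;
}

interface AssertionRecord {
  name: string;
  passed: boolean;
  checks: Check[];
  diagnostics: string[];
}

// Same roll-up as the last lines of driveA33.
function rollUp(record: AssertionRecord): boolean {
  return record.checks.every((c) => c.passed);
}

// Hypothetical helper: builds the two video-size checks (A33.2 and A33.3).
function videoSizeChecks(videoSize: number): Check[] {
  return [
    { name: 'A33.2', expected: '>0', actual: String(videoSize), passed: videoSize > 0 },
    { name: 'A33.3', expected: '>100000', actual: String(videoSize), passed: videoSize > 100_000 },
  ];
}
```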

From RESEARCH Q2 sub-question (c) env-gate recommendation:

// In tests/uat/harness.test.ts drivers-array entry:
{
  name: 'A33',
  drive: process.env.SKIP_LONG_UAT === '1'
    ? async (): Promise<AssertionRecord> => ({
        name: 'A33',
        passed: true,
        checks: [],
        diagnostics: ['A33 SKIPPED (SKIP_LONG_UAT=1; unset to run 5-min idle test)'],
      })
    : driveA33Wrapped,
},

Default polarity: SKIP_LONG_UAT unset → RUN A33 (this matches the closure + alpha-gate semantics; per-commit dev iteration uses SKIP_LONG_UAT=1).
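
The polarity is easy to invert by accident, so it can help to isolate it in a pure function. A minimal sketch: `shouldRunLongUat` is a hypothetical name, and the only behavior assumed is what the plan states (unset means run; exactly `'1'` means skip).

```typescript
// Hypothetical helper; takes the env map as a parameter so it can be
// tested without mutating process.env.
function shouldRunLongUat(env: Record<string, string | undefined>): boolean {
  // Only the exact string '1' skips; unset, '', '0', and 'true' all run.
  return env['SKIP_LONG_UAT'] !== '1';
}
```

In the drivers-array entry this would replace the inline comparison, for example `shouldRunLongUat(process.env) ? driveA33Wrapped : <skip stub>`, where the skip stub is the placeholder record shown above.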

From src/offscreen/recorder.ts:91 (architecture invariant — segments only in offscreen RAM):

let segments: Blob[] = [];  // module-level state; NO chrome.storage.local persistence; NO IndexedDB

The spike's job: verify this RAM-only design survives a 5-min SW idle. If it does, ROADMAP SC #1 is satisfied with ZERO source-code changes — just a new harness assertion that exercises the path.

Task 1: Wave 0 SPIKE, empirical verification that offscreen survives 5-min SW idle

Files: tests/uat/lib/harness-page-driver.ts

Read first: tests/uat/lib/harness-page-driver.ts (full; ~2200 lines, so read selectively: imports lines 1-40, findLatestZip ~1395, driveA30 host-side filter ~2039-2148), tests/uat/extension-page-harness.ts:3932-4021 (__mokoshHarness global registration block; confirm available surface BEFORE writing spike), tests/uat/extension-page-harness.ts:600-700 (setupFreshRecording helper context), src/offscreen/recorder.ts:80-100 (segments array context), .planning/phases/04-harden-clean-up-optional/04-RESEARCH.md Q2 sub-question (b), .planning/phases/01-stabilize-video-pipeline/01-07-SUMMARY.md (Plan 07 spike precedent)

1. Add the `stopServiceWorker(browser, extensionId)` helper to `tests/uat/lib/harness-page-driver.ts` per the Code Example in `<interfaces>` above. Place it near the top of the file (after existing imports + before existing driveA-* functions). Add the `import type { Browser } from 'puppeteer';` if not already present.
2. Create a one-shot spike script `tests/uat/spike-a33-sw-persistence.ts` (NEW; treat as scratch file for this spike — delete after spike concludes; record outcome in plan SUMMARY). The script:
   - Imports `launchHarnessBrowser` from `./lib/launch.ts`.
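   - Imports `stopServiceWorker` + `findLatestZip` from `./lib/harness-page-driver.ts`, plus `JSZip` from `jszip` and `readFileSync` from `node:fs` (both are used in the zip check below).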
   - Launches the harness browser.
   - **Prime recording (REVISION iter-2 — Option B; no `dispatchSaveArchive` helper exists on `__mokoshHarness`):** call the existing fresh-recording primitive via `await handles.harnessPage.evaluate(() => (window as { __mokoshHarness: { assertA1: () => Promise<unknown> } }).__mokoshHarness.assertA1());`. The Task 1 read_first MUST verify that `__mokoshHarness.assertA1` is the canonical fresh-recording bootstrap (it is per the existing harness — `Harness ready. window.__mokoshHarness.{assertA1..A31, getManifestVersion} available.`); if a different assertA* method is more direct for "prime + leave recording active for 5 min", choose that instead and document in the spike script comment.
   - `console.log('SPIKE: waiting 5 minutes for SW idle window...')`
   - `await new Promise(r => setTimeout(r, 5 * 60 * 1000));`
   - `await stopServiceWorker(handles.browser, handles.extensionId);`
   - `await new Promise(r => setTimeout(r, 500));` (settle)
   - **Dispatch SAVE_ARCHIVE (REVISION iter-2 — Option B; canonical chrome.runtime.sendMessage from harness page realm):**
     ```typescript
     const saveResult = await handles.harnessPage.evaluate(
       (timeoutMs: number) =>
         new Promise<{ success: boolean; error?: string }>((resolve) => {
           const timer = setTimeout(() => {
             resolve({ success: false, error: `SAVE_ARCHIVE timed out after ${timeoutMs}ms` });
           }, timeoutMs);
           chrome.runtime.sendMessage({ type: 'SAVE_ARCHIVE' }, (response: unknown) => {
             clearTimeout(timer);
             if (chrome.runtime.lastError !== undefined) {
               resolve({ success: false, error: String(chrome.runtime.lastError.message) });
               return;
             }
             resolve(response as { success: boolean; error?: string });
           });
         }),
       15_000,
     );
     console.log(`SPIKE: SAVE_ARCHIVE ack -> ${JSON.stringify(saveResult)}`);
     ```
   - `await new Promise(r => setTimeout(r, 5000));` (let download complete)
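   - `const zipPath = findLatestZip(handles.downloadsDir);` followed by `if (zipPath === null) throw new Error('SPIKE: no zip in downloadsDir');` (`findLatestZip` can return `null`; a throw counts as SPIKE FAILED per step 4)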
   - `const zip = await JSZip.loadAsync(readFileSync(zipPath));`
   - `const videoEntry = zip.file('video/last_30sec.webm');`
   - `const videoSize = videoEntry ? (await videoEntry.async('uint8array')).byteLength : 0;`
   - `console.log(\`SPIKE RESULT: videoSize=${videoSize} bytes (>0 = OFFSCREEN SURVIVED; =0 = OFFSCREEN DIED)\`);`
   - `await handles.browser.close();`

3. Run the spike: `tsx tests/uat/spike-a33-sw-persistence.ts` with HEADLESS=1 (so it runs in CI mode; ~5 min wall-clock).

4. Record the result. If videoSize > 100_000 → SPIKE PASSED (offscreen survives) → proceed to Task 2 with verification-only A33. If videoSize ≤ 100_000 OR throw → SPIKE FAILED → SUMMARY documents the failure mode + flag to plan-checker for re-planning (IndexedDB persistence work would expand Plan 04-04 substantially; that's a planning event, not an execution event).

5. Commit the `stopServiceWorker` helper (Task 1's persisting artifact). The spike script is OK to delete OR keep committed as `tests/uat/spike-*.ts` for future SW-lifecycle investigations.

Verify: `npx tsc --noEmit && HEADLESS=1 tsx tests/uat/spike-a33-sw-persistence.ts 2>&1 | tee /tmp/04-04-spike.log; grep -c 'SPIKE RESULT' /tmp/04-04-spike.log`

Acceptance criteria:
- `stopServiceWorker(browser, extensionId)` helper exists at `tests/uat/lib/harness-page-driver.ts` with the canonical Chrome devrel signature (`Browser` + extensionId args; `target.worker()?.close()` body).
- Spike script ran to completion (no Puppeteer throw).
- Spike result logged with explicit `videoSize=<N> bytes` line.
- Spike SAVE_ARCHIVE dispatch uses `chrome.runtime.sendMessage({type: 'SAVE_ARCHIVE'}, ...)` directly (NOT a non-existent `__mokoshHarness.dispatchSaveArchive()` call); verify by `grep -c 'dispatchSaveArchive' tests/uat/spike-a33-sw-persistence.ts` returns 0 AND `grep -c "type: 'SAVE_ARCHIVE'" tests/uat/spike-a33-sw-persistence.ts` returns ≥ 1.
- If videoSize > 100_000: spike PASSED; proceed to Task 2 with verification-only path.
- If videoSize ≤ 100_000: spike FAILED; pause plan + flag to plan-checker for re-planning (escalation is out of scope for this task, but the SUMMARY documents it).
- Total spike wall-clock: ~6-7 minutes (5 min idle + ~1-2 min orchestration).

Done: Spike outcome recorded in plan SUMMARY; stopServiceWorker helper committed. Atomic commit: `feat(04-04): Wave 0 spike — stopServiceWorker helper + 5-min SW idle empirical result`.

Task 2: Wave 1, A33 assertion + driveA33 + orchestrator wiring (assumes spike PASSED)

Files: tests/uat/extension-page-harness.ts, tests/uat/lib/harness-page-driver.ts, tests/uat/harness.test.ts

Read first: tests/uat/extension-page-harness.ts:3517-3636 (assertA30: canonical setupFreshRecording + SAVE pattern), tests/uat/extension-page-harness.ts:3878-3917 (assertA31: most-recent chrome.runtime.sendMessage SAVE_ARCHIVE pattern; copy this), tests/uat/extension-page-harness.ts:3932-4021 (__mokoshHarness global registration block; confirm NO new method added per REVISION iter-2), tests/uat/lib/harness-page-driver.ts:2039-2148 (driveA30: host-side filter pattern), tests/uat/harness.test.ts:100-110 (import block), tests/uat/harness.test.ts:340-360 (wrapped-driver block), tests/uat/harness.test.ts:459-486 (drivers-array push block), tests/uat/harness.test.ts:225-240 (SKIP_PROD_REBUILD env-gate pattern)

**GATING CONDITION:** Task 1 spike produced videoSize > 100_000. (If FAILED, this task is BLOCKED and the plan must be re-planned to add IndexedDB persistence work.)

3-file lockstep update per the Approach B harness extension pattern:

**File 1: tests/uat/extension-page-harness.ts**
- REVISION iter-2 — Option B per plan-checker BLOCKER 2: the existing `__mokoshHarness` surface is `assertA1..A31 + getManifestVersion`; `dispatchSaveArchive` does NOT exist and we do NOT add it. SAVE_ARCHIVE dispatch happens directly via `chrome.runtime.sendMessage` inside driveA33's `page.evaluate` (matches the established assertA31 pattern at lines 3886-3890).
- Decision: NO new page-side function. driveA33 (host-side) drives Step 1 (prime) by calling an existing `__mokoshHarness.assertA<N>` method that bootstraps a fresh recording (confirm in read_first which existing assertA* is the canonical "prime fresh recording" entrypoint — `assertA1` is the leading candidate; falling back to `assertA5`/`assertA26` if a more direct method matches the spike's actual call site). Step 5 (SAVE) uses inline `chrome.runtime.sendMessage` per the `<interfaces>` block above.
- Verify no edits needed to `__mokoshHarness` registration block (lines 3932-4015): the surface stays at 31 assertA* + getManifestVersion. The Tier-1 FORBIDDEN_HOOK_STRINGS inventory stays at 12 entries (no new test-only symbol).
- If, during read_first, the planner determines that NONE of the existing assertA* methods deliver "prime + leave recording active for ≥5 min", THEN add a thin page-side primer `primeForA33` that calls existing production-surface APIs (REQUEST_PERMISSIONS → START_RECORDING via chrome.runtime.sendMessage); this is a deviation from Option B and must be flagged in the SUMMARY. Per RESEARCH note (FORBIDDEN_HOOK_STRINGS stays at 12): NO new test-only `__MOKOSH_UAT__`-gated symbol; any new page-side helper uses production APIs only.

**File 2: tests/uat/lib/harness-page-driver.ts**
- Append `driveA33` function per RESEARCH Code Example Pattern 4 (full body in `<interfaces>` above; REVISION iter-2 inline-dispatched SAVE).
- Place it after the existing driveA32 (which is the most-recent Phase 3 addition).
- Verify the `stopServiceWorker` helper from Task 1 is in scope (same file).
- Filter-pipeline form; no `continue`; typed function signature `(page, browser, extensionId, downloadsDir) => Promise<AssertionRecord>` per the new 4-arg shape.
- Add `import { readFileSync } from 'node:fs';` + `import JSZip from 'jszip';` if not already present (they should be — these are reused from driveA29/30/31).
- The Step-5 SAVE_ARCHIVE inline `page.evaluate` block uses `chrome.runtime.sendMessage({type: 'SAVE_ARCHIVE'}, callback)` per the `<interfaces>` Code Example; verify by `grep -c "type: 'SAVE_ARCHIVE'" tests/uat/lib/harness-page-driver.ts` increases by ≥ 1 vs pre-edit baseline (was 0 in driveA29/30/31 because those call sites are inside extension-page-harness.ts assertA* methods; A33 is unique in dispatching from the host-side via page.evaluate).

**File 3: tests/uat/harness.test.ts**
- Import: add `driveA33,` to the import block at ~line 101 (alongside `driveA29`-`driveA32`).
- Wrapped-driver: add at ~line 357 (after `driveA31Wrapped`):
  ```typescript
  // Plan 04-04 — driveA33 needs Browser + extensionId for CDP-based SW kill
  //             AND downloadsDir for host-side JSZip parse of post-restart zip.
  const driveA33Wrapped: (page: import('puppeteer').Page) => Promise<AssertionRecord> =
    (page) => driveA33(page, handles.browser, handles.extensionId, handles.downloadsDir);
  ```
- Drivers-array push: add at ~line 486 (after the existing A32 entry):
  ```typescript
  // Plan 04-04 A33: SW state persistence 5-min idle (ROADMAP SC #1; RESEARCH Q2).
  // Forces SW eviction via Puppeteer CDP worker.close() per the canonical
  // Chrome devrel pattern (RESEARCH Pattern 1). Verifies offscreen-RAM
  // segments survive SW restart. Env-gated by SKIP_LONG_UAT for fast
  // per-commit iteration; defaults to RUN for Phase 4 closure + alpha gate.
  {
    name: 'A33',
    drive: process.env.SKIP_LONG_UAT === '1'
      ? async (): Promise<AssertionRecord> => ({
          name: 'A33',
          passed: true,
          checks: [],
          diagnostics: ['A33 SKIPPED (SKIP_LONG_UAT=1; unset to run 5-min idle test)'],
        })
      : driveA33Wrapped,
  },
  ```

Verify:
- `npx tsc --noEmit` exits 0.
- `npm run build:test` exits 0.
- Quick UAT: `HEADLESS=1 SKIP_PROD_REBUILD=1 SKIP_LONG_UAT=1 npm run test:uat` exits 0 with 34/34 GREEN (A33 SKIPPED message visible; preserves baseline + adds A33 skip placeholder).
- Full UAT: `HEADLESS=1 SKIP_PROD_REBUILD=1 npm run test:uat` exits 0 with 34/34 GREEN (A33 actually runs ~6 min wall-clock; A33.1 SAVE ack + A33.2 size > 0 + A33.3 size > 100 KB all PASS).
- Tier-1 FORBIDDEN_HOOK_STRINGS check: `grep -c 'FORBIDDEN_HOOK_STRINGS' tests/uat/harness.test.ts tests/background/no-test-hooks-in-prod-bundle.test.ts` — verify the inventory count in both files unchanged (preserves the 12-entry invariant per CONTEXT §"Claude's Discretion").
- REVISION iter-2 gate: `grep -c 'dispatchSaveArchive' tests/uat/lib/harness-page-driver.ts tests/uat/extension-page-harness.ts tests/uat/harness.test.ts` returns 0 (the non-existent helper is NOT introduced).

Verify: `npx tsc --noEmit && npm run build:test && HEADLESS=1 SKIP_PROD_REBUILD=1 SKIP_LONG_UAT=1 npm run test:uat 2>&1 | tail -5 | tee /tmp/04-04-task-2-skip.log; grep -c '34/34' /tmp/04-04-task-2-skip.log; grep -c 'dispatchSaveArchive' tests/uat/lib/harness-page-driver.ts tests/uat/extension-page-harness.ts tests/uat/harness.test.ts`

Acceptance criteria:
- `npx tsc --noEmit` exits 0.
- `npm run build:test` exits 0.
- UAT harness count flips 33 → 34 (A33 added).
- Skip-mode run: `HEADLESS=1 SKIP_PROD_REBUILD=1 SKIP_LONG_UAT=1 npm run test:uat` GREEN 34/34 (A33 SKIPPED placeholder GREEN; total takes ~95s, unchanged).
- Full-mode run: `HEADLESS=1 SKIP_PROD_REBUILD=1 npm run test:uat` GREEN 34/34 (~6.5 min; A33 actually runs and passes A33.1 + A33.2 + A33.3).
- `grep -c 'A33' tests/uat/harness.test.ts` returns ≥ 4 (import + wrapped + push + comment banner).
- `grep -c 'SKIP_LONG_UAT' tests/uat/harness.test.ts` returns ≥ 2 (env-gate + comment).
- FORBIDDEN_HOOK_STRINGS count unchanged at 12 (no new test-only symbols introduced per CONTEXT §"Claude's Discretion"; verify by `wc -l` of the inventory arrays).
- REVISION iter-2 gate (Option B): `grep -r 'dispatchSaveArchive' tests/uat/ | wc -l` prints 0 (plain `grep -c` on a directory errors without `-r`); SAVE_ARCHIVE dispatched via `chrome.runtime.sendMessage({type: 'SAVE_ARCHIVE'}, ...)` only.

Done: A33 lands; UAT 33→34 GREEN; SW persistence empirically verified at 5-min idle scale. Atomic commit: `feat(04-04): Wave 1 — A33 SW state persistence harness assertion (34/34 GREEN)`.

<threat_model>

Trust Boundaries

| Boundary | Description |
|---|---|
| Puppeteer CDP → Chrome MV3 SW realm | worker.close() terminates the SW target over CDP (in current Puppeteer this closes the target and detaches from it, rather than calling self.close() or ServiceWorker.unregister, so the registration stays in place and the next extension event respawns the SW). This ends the SW realm but does NOT touch the offscreen document's WebContents target. Native CDP surface; no untrusted input. |
| Test idle interval (5 min wall-clock) → MediaRecorder active segment buffer | The MediaRecorder is in state === 'recording' during the idle; segments rotate every 10s, so the idle produces 30 segments (5 min × 60 sec / 10 sec per segment); trim-to-last-3 means the offscreen-RAM array holds only the newest 3 at any time, keeping memory bounded ≤ 30 MB (well under CON-ram-ceiling) |
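
The memory bound follows from the rotation arithmetic. A sketch using the figures given in this plan (10 s segments, last 3 retained, roughly 3 MB per segment); `trimToLast`, `simulateIdle`, and the constants are illustrative names, not the recorder's actual code:

```typescript
const SEGMENT_MS = 10_000;                    // rotation interval stated above
const RETAINED_SEGMENTS = 3;                  // trim-to-last-3
const APPROX_SEGMENT_BYTES = 3 * 1024 * 1024; // ~3 MB per segment (RESEARCH estimate)

// Keep only the newest `keep` items; returns a new array.
// The guard matters because slice(-0) would return the whole array.
function trimToLast<T>(items: readonly T[], keep: number): T[] {
  return keep <= 0 ? [] : items.slice(-keep);
}

// Simulate an idle window: one segment index per rotation, trimmed after each push.
function simulateIdle(idleMs: number): { produced: number; retained: number[] } {
  const produced = Math.floor(idleMs / SEGMENT_MS);
  let retained: number[] = [];
  for (let i = 0; i < produced; i += 1) {
    retained = trimToLast([...retained, i], RETAINED_SEGMENTS);
  }
  return { produced, retained };
}

const fiveMinIdle = simulateIdle(5 * 60 * 1000);
const approxRetainedBytes = fiveMinIdle.retained.length * APPROX_SEGMENT_BYTES;
```

Under these assumptions the retained footprint is about 9 MB, which leaves headroom below the 30 MB bound quoted in the table.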

STRIDE Threat Register

| Threat ID | Category | Component | Disposition | Mitigation Plan |
|---|---|---|---|---|
| T-04-04-01 | Tampering | a future architectural change might move the segments array to SW-side state, breaking the offscreen-survives-SW assumption A33 verifies | mitigate | A33 is a regression-catching gate; if a future PR moves segments off-offscreen, A33 fails fast (videoSize=0 after SW kill) |
| T-04-04-02 | DoS (CI) | A33's 5-min idle adds ~5 min to harness wall-clock (95s → 395s); per-commit CI lanes would suffer | mitigate | Env-gated by SKIP_LONG_UAT (default RUN for closure + alpha; documented per-commit SKIP_LONG_UAT=1 for dev iteration) |
| T-04-04-03 | Repudiation | natural 30s idle eviction does NOT fire under Puppeteer's CDP attach per Chrome docs; if a developer naively writes "wait 5 min and hope SW dies" the test silently passes via a SW that never died | mitigate | The CDP worker.close() call is explicit + cited in code comment; RESEARCH Pitfall 4 documents the misconception |
| T-04-04-04 | Spoofing | Puppeteer 25.x patch versions could in theory change Worker.close() semantics; the canonical Chrome devrel pattern is pinned at Puppeteer ≥22.1.0 | accept | The project pin puppeteer: ^25.0.2 is comfortably past the 22.1.0 floor; minor patch drift expected to be backward-compatible per Puppeteer's semver discipline. If A33 ever fails post-Puppeteer-upgrade, the SUMMARY's commit ref provides the exact Puppeteer version where it was validated. |
</threat_model>
- `npx tsc --noEmit` exits 0.
- `npm run build:test` exits 0.
- UAT harness count 33 → 34.
- Skip-mode: `HEADLESS=1 SKIP_LONG_UAT=1 npm run test:uat` GREEN 34/34 in ~95s.
- Full-mode: `HEADLESS=1 npm run test:uat` GREEN 34/34 in ~6.5 min.
- ROADMAP SC #1 GREEN (A33 empirical evidence: video buffer survived 5-min SW idle + worker.close()).
- FORBIDDEN_HOOK_STRINGS count unchanged at 12.
- vitest baseline preserved (≥ 181 GREEN from Plans 04-01 + 04-02).
- A29 + A30 + A31 + A32 unchanged (no regression to existing assertions).
- REVISION iter-2 invariant: `grep -r 'dispatchSaveArchive' tests/uat/ | wc -l` prints 0 (covers the spike script and the harness files; plain `grep -c` on a directory errors without `-r`).

<success_criteria>

  • Wave 0 spike PASSED — empirical evidence that offscreen survives 5-min SW idle (Task 1).
  • assertA33 + driveA33 + stopServiceWorker helper + harness orchestrator wiring landed (Task 2).
  • UAT harness count 33 → 34 GREEN.
  • ROADMAP SC #1 (SW state persistence) GREEN.
  • Env-gated long-test pattern established (SKIP_LONG_UAT) — pattern reused by any future ≥5-min test.
  • Pre-checkpoint bundle gates 6/6 PASS unchanged (Plan 04-04 makes no source-code changes). </success_criteria>
After completion, create `.planning/phases/04-harden-clean-up-optional/04-04-SUMMARY.md` capturing:
- Spike outcome (videoSize value + interpretation; SPIKE PASSED/FAILED tag)
- stopServiceWorker helper diff (full body)
- driveA33 diff (full body; inline chrome.runtime.sendMessage SAVE per REVISION iter-2 Option B)
- Orchestrator wiring diff (3 sites in harness.test.ts)
- SKIP_LONG_UAT env-gate decision (default RUN; rationale)
- UAT before/after (33/33 → 34/34)
- Full-mode wall-clock benchmark (e.g., ~6.5 min)
- ROADMAP SC #1 closure evidence
- Commit refs (Task 1 spike + Task 2 impl)
- If spike FAILED: detailed failure mode + flag for re-planning (this branch is unlikely per RESEARCH MEDIUM-confidence; document as ALPHA-PATH-NOT-TAKEN)