Mark 70f4f4136a docs(01-13): create UAT harness plan — Approach B (extension-internal page)
5 waves, 9 tasks. Inherits Plan 01-11 spike-pivot rationale per
01-11-SUMMARY (commit ba5474c). Implements full 14-assertion harness
via Approach B architecture, proven by prototype c647f61.

- Wave 0: clean broken Approach-A artifacts (sw-hooks.ts, SW dynamic
  import, popup-bridge lib, feasibility probes); update Tier-1 grep
  gate to 10-string Approach-B forbidden inventory.
- Wave 1: promote c647f61 prototype (extension-page-harness +
  a6.test.ts) to production paths; A6 stays GREEN.
- Wave 2: rebuild Approach-B driver utilities (launch.ts,
  assertions.ts, harness-page-driver.ts) replacing deleted
  popup-bridge primitives.
- Wave 3 (4 task bundles): wire A1-A13 functional assertions; canonical
  Bug B (A6) + Bug A (A8) RED-on-regression demos mandatory in commit
  bodies.
- Wave 4: append 01-09 Amendment 2; update STATE.md + ROADMAP.md;
  operator brand/design checkpoint.

Open questions resolved: Wave 3 granularity = 4 bundles; tabs
permission gap = workaround retained (Phase 5 hardening); failure
isolation = single browser + bail-on-first; CI plumbing = defer.

Frontmatter validation: valid=true. Plan structure: valid=true,
task_count=9, all tasks have files/action/verify/done.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 14:28:04 +02:00

---
phase: 01-stabilize-video-pipeline
plan: 13
type: tdd
wave: 5
depends_on:
- 01-08
- 01-09
- 01-11
files_modified:
- src/test-hooks/sw-hooks.ts
- src/test-hooks/offscreen-hooks.ts
- src/test-hooks/types.ts
- src/background/index.ts
- src/offscreen/recorder.ts
- vite.test.config.ts
- tests/uat/lib/launch.ts
- tests/uat/lib/extension.ts
- tests/uat/lib/sw.ts
- tests/uat/lib/offscreen.ts
- tests/uat/lib/assertions.ts
- tests/uat/lib/zip.ts
- tests/uat/lib/harness-page-driver.ts
- tests/uat/lib/test-hook-contract.d.ts
- tests/uat/extension-page-harness.html
- tests/uat/extension-page-harness.ts
- tests/uat/harness.test.ts
- tests/uat/a6.test.ts
- tests/uat/prototype/extension-page-harness.html
- tests/uat/prototype/extension-page-harness.ts
- tests/uat/prototype/a6.test.ts
- tests/uat/prototype/probe_offscreen.mjs
- tests/uat/prototype/probe_sw.mjs
- tests/uat/prototype/probe_tabs.mjs
- tests/uat/prototype/probe_tabs2.mjs
- tests/background/no-test-hooks-in-prod-bundle.test.ts
- .planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md
- .planning/STATE.md
- .planning/ROADMAP.md
autonomous: false
requirements:
- REQ-uat-harness-puppeteer
- REQ-uat-bug-A-coverage
- REQ-uat-bug-B-coverage
- REQ-uat-two-bundle
- REQ-uat-ci-friendly
- REQ-uat-13-assertions
- REQ-video-ring-buffer
tags:
- puppeteer
- uat
- harness
- e2e
- mv3-extension
- getDisplayMedia
- bug-A
- bug-B
- extension-internal-page
- synthetic-mediastream
- approach-b
- inherits-01-11-pivot
must_haves:
truths:
- "Baseline test bed is clean before any new task lands: `npm run build` exits 0; `npm run build:test` exits 0; full vitest passes (the two 01-11-aftermath failures — `sw-bundle-import.test.ts` and `no-test-hooks-in-prod-bundle.test.ts` — both flip GREEN after Wave 0 reverts the dynamic-import block in `src/background/index.ts` and deletes `src/test-hooks/sw-hooks.ts` + `tests/uat/lib/*.ts` popup-bridge scaffolding)."
- "Production bundle is byte-clean of test hooks: after `npm run build`, ZERO occurrences of `__mokoshTest`, `setCurrentStream`, `setSegmentCountGetter`, `installFakeDisplayMedia`, `uninstallFakeDisplayMedia`, `dispatchEndedOnTrack`, `getSegmentCount`, `__mokoshOffscreenQuery` in any file under `dist/`. Tier-1 grep gate `tests/background/no-test-hooks-in-prod-bundle.test.ts` enforces this (RED today against stale dist; GREEN after Wave 0)."
- "Test bundle wires hooks ONLY through the offscreen side: `npm run build:test` produces `dist-test/` where `__mokoshTest` appears in the offscreen chunk; SW chunk is hook-free (because the SW dynamic-import block is reverted to nothing — Approach B does not need SW hooks; chrome.action.* state is queried via the extension-internal page's full chrome.* surface, not via sw.evaluate)."
- "`npm run test:uat` exits 0 only when ALL 14 assertions GREEN (A0 production-bundle grep gate + A1-A13 functional contract). Each assertion drives Chrome FROM INSIDE the extension via `tests/uat/extension-page-harness.html`'s `window.__mokoshHarness.*` surface; Puppeteer is the trigger + result reader; the harness page calls chrome.* directly + uses `chrome.runtime.sendMessage` to bridge to the offscreen test hook (synthetic getDisplayMedia + dispatch-ended)."
- "Bug B (A6) regression rewind demonstrably catches a regression: locally applying `if (false)` to the `errorCode === 'user-stopped-sharing'` branch on `src/background/index.ts:776` turns A6 RED (badge='ERR' instead of ''); restoring turns A6 GREEN. Commit body for the T-Wave3B task documents the end-to-end demo cycle (already empirically proven by prototype c647f61; this plan inherits the proof and propagates the commit-body documentation contract)."
- "Bug A (A8) regression rewind demonstrably catches a regression: locally stubbing `NOTIFICATION_ICON_PATH = 'icons/missing.png'` on `src/background/index.ts:71` (or truncating `icons/icon128.png` to <100 bytes) turns A8 RED (notification create rejects via Chrome's imageUtil → notificationCount delta=0); restoring turns A8 GREEN. Commit body for the T-Wave3C task documents the end-to-end demo cycle."
- "Harness runs in `--headless=new` for CI portability (puppeteer 25 `headless: true` default); local-debug via `HEADLESS=0`. Synthetic MediaStream (Canvas.captureStream + monkey-patched `displaySurface: 'monitor'`) bypasses Chrome's screen-share picker — the `--auto-select-desktop-capture-source` flag (unreliable in `--headless=new`, per 01-11-SUMMARY falsification 4) is NOT used."
- "Plan 01-09 functional contract closes by harness PASS: Plan 01-09 Task 5 amended to redirect steps 4-13+15 to `npm run test:uat`; operator retains only step 1 (build verification) + step 14 (brand/design check). Amendment is APPENDED after the existing 2026-05-17 amendment block — preserves both the 01-11 amendment and the 01-13 update."
- "MV3 architectural constraints enforced (per 01-11-SUMMARY learnings): NO `await import(...)` anywhere in `src/background/index.ts`; `track.dispatchEvent(new Event('ended'))` is the ONLY path to simulate user-stopped-sharing (NOT `track.stop()`); `__MOKOSH_UAT__` Vite define-token gates ALL hook imports (NOT `import.meta.env.MODE === 'test'` which collides with vitest)."
- "Existing 89 vitest tests remain GREEN after Wave 0 cleanup AND every subsequent wave (no unit-test regression)."
artifacts:
- path: "src/test-hooks/sw-hooks.ts"
provides: "DELETED in Wave 0 — broken per 01-11-SUMMARY (MV3 SW blocks dynamic import); replaced by Approach B's extension-internal-page architecture which has full chrome.* API access without needing SW-side handler-capture monkey-patches."
- path: "src/background/index.ts"
provides: "Top-of-module gated dynamic import REVERTED in Wave 0 (no `if (__MOKOSH_UAT__) { await import(...) }` block — MV3 SW limit). Production logic at lines 75-108 (state machine), 415 (setRecordingMode call), 725-778 (RECORDING_ERROR + Bug B routing), 844-878 (listener registrations), 71 (NOTIFICATION_ICON_PATH) untouched. Wave 0 also strips the related comment block referencing the SW-hook gate."
contains: "isRecording"
- path: "src/test-hooks/offscreen-hooks.ts"
provides: "RETAINED from 01-11 Wave 1 and extended with the prototype's `installFakeDisplayMedia` + `dispatchEndedOnTrack` + `__mokoshOffscreenQuery` bridge + eager `installFakeDisplayMedia()` call at module load. Wave 1 carries it over unchanged (already production-quality as of c647f61); Wave 3 adds the `setSegmentCountGetter` wire for A11 (35s buffer continuity assertion)."
contains: "installFakeDisplayMedia"
- path: "src/test-hooks/types.ts"
provides: "RETAINED — type contracts for `globalThis.__mokoshTest`. Wave 3 extends with `installFakeDisplayMedia?`, `uninstallFakeDisplayMedia?`, `dispatchEndedOnTrack?` typed fields so the offscreen-hooks cross-cast (currently `as MokoshTestSurface & {...}`) collapses to a clean assignment."
contains: "MokoshTestSurface"
- path: "src/offscreen/recorder.ts"
provides: "PRESERVED top-of-module gated dynamic import block (offscreen IS a DOM document; dynamic import works). Lines 21-48 (module load gate), 277-285 (setCurrentStream wire), 509-515 (setCurrentStream null on teardown) UNCHANGED from 01-11 Wave 1. Wave 3 adds `setSegmentCountGetter(() => segments.length)` wire (gated, eager at startRecording) so A11 can query the live segment count via the offscreen test surface."
contains: "__MOKOSH_UAT__"
- path: "vite.test.config.ts"
provides: "Rollup input updated in Wave 1: `prototype_harness` removed (prototype tree deleted); `extension_page_harness` added pointing at `tests/uat/extension-page-harness.html` (production path). `modulePreload: { polyfill: false }` retained (CRITICAL SW FIX per 01-11-SUMMARY — disabling the polyfill is what makes the offscreen-side dynamic import work without crashing in non-DOM contexts that incorrectly try to call document.querySelector)."
contains: "extension_page_harness"
- path: "tests/uat/extension-page-harness.html"
provides: "PROMOTED from `tests/uat/prototype/extension-page-harness.html` in Wave 1 (file move; comments updated to remove 'PROTOTYPE' label and reference 01-13 instead of 01-11). The page lives at `chrome-extension://<id>/tests/uat/extension-page-harness.html` in the test bundle and is the architectural anchor for Approach B."
- path: "tests/uat/extension-page-harness.ts"
provides: "PROMOTED + EXTENDED in Waves 1-3. Wave 1 moves the prototype's `assertA6` from `tests/uat/prototype/extension-page-harness.ts` and updates comments. Waves 2-3 extend `window.__mokoshHarness` with 12 additional assertion methods: `assertA1` (SW bootstrap state), `assertA2` (toolbar onClicked → REC), `assertA3` (displaySurface monitor), `assertA4` (popup during recording), `assertA5` (SAVE_ARCHIVE download), `assertA7` (genuine error → ERR + recovery), `assertA8` (Bug A onStartup → notification creates), `assertA9` (icon file sizes), `assertA10` (manifest shape), `assertA11` (35s → ≥3 segments), `assertA12` (ffprobe-via-host), `assertA13` (zip structure + meta.json). Assertions that need host-side primitives (ffprobe, fs/zip parsing: A5+A12+A13) return raw bytes/paths via the harness surface; the host-side Puppeteer test does the file-system + ffprobe work."
min_lines: 600
- path: "tests/uat/harness.test.ts"
provides: "REWIRED in Waves 1-3 around extension-page architecture. Replaces the popup-bridge skeleton from 01-11 Wave 2 (dbd977c) with the prototype's a6.test.ts driver pattern, generalized to all 14 assertions. A0 (production-bundle grep gate) runs pre-flight before launching Chrome; A1-A13 are wired in Waves 3A-3D (4 task bundles). Bail-on-first-failure with structured diagnostic dump (matches prototype pattern). Exit 0 only when 14/14 GREEN. Optional `--only=A6` CLI arg for assertion-specific reruns during development."
min_lines: 450
- path: "tests/uat/lib/launch.ts"
provides: "REWRITTEN in Wave 2. Strips popup-bridge plumbing (deleted in Wave 0). Builds out: `launchHarnessBrowser({ headless?, downloadsDir? }) → { browser, extensionId, harnessPage, victimPage, downloadsDir, swConsole, offConsole }`. Wires Chrome flags (`--no-sandbox`, NO `--auto-select-desktop-capture-source`), opens the harness page from `chrome-extension://<id>/tests/uat/extension-page-harness.html`, opens a victim about:blank page + brings to front (so production startVideoCapture's chrome.tabs.query({active:true}) sees a real tab — workaround for missing `tabs` permission), wires SW + offscreen console capture (best-effort; offscreen target attach is opportunistic per prototype pattern)."
- path: "tests/uat/lib/harness-page-driver.ts"
provides: "NEW in Wave 2. Thin wrapper over `page.evaluate(() => window.__mokoshHarness.assertXX())` for each of the 13 harness-page assertions (A1-A13; A0 runs host-side before Chrome launches). Returns the structured `AssertionResult` from the page; the host-side test maps results to vitest-style assertions + diagnostic dumps. Centralizes the bridge contract between Puppeteer driver code and harness-page code so adding/renaming an assertion happens in two files (extension-page-harness.ts impl + this driver wrapper) rather than 13 places."
- path: "tests/uat/lib/assertions.ts"
provides: "REWRITTEN in Wave 2. Strips popup-bridge primitives; provides only host-side assertion plumbing: `runAssertion(name, fn, { consoleBuffers })` (wraps a single assertion attempt with diagnostic capture), `assertEqual`/`assertGte`/`assertMatch`/`assertTrue` (re-exports of node:assert/strict primitives with structured failure messages), `waitFor(probe, predicate, timeoutMs, description)` (mirrors the prototype's polling primitive). NO direct chrome.* helpers — all chrome.* work happens inside the harness page."
- path: "tests/uat/lib/zip.ts"
provides: "RETAINED — jszip-based archive shape assertions; reads downloaded `session_report_*.zip`, asserts `video/last_30sec.webm` present + non-zero + `meta.json` carries `version` matching the harness-supplied expectedVersion. Used by A12 + A13."
- path: "tests/uat/lib/extension.ts"
provides: "DELETED in Wave 0 — was popup-bridge scaffolding (resolveExtensionId via setPopup juggling). Approach B uses `browser.extensions()` directly per prototype a6.test.ts pattern."
- path: "tests/uat/lib/sw.ts"
provides: "DELETED in Wave 0 — Approach B reads SW state via the harness page's chrome.action.* calls (which work because the page is privileged), NOT via sw.evaluate (which only exposes chrome.{loadTimes,csi} per 01-11-SUMMARY falsification 2)."
- path: "tests/uat/lib/offscreen.ts"
provides: "DELETED in Wave 0 — simulateUserStop now lives inside the harness page as a `chrome.runtime.sendMessage({type:'__mokoshOffscreenQuery', op:'dispatch-ended'})` call routed through the offscreen-hooks bridge. The BLOCKER comment (dispatchEvent NOT track.stop) is preserved inside offscreen-hooks.ts's `dispatchEndedOnTrack` function."
- path: "tests/uat/lib/test-hook-contract.d.ts"
provides: "RETAINED — mirror of src/test-hooks/types.ts for harness-page-side type checks. Wave 3 update extends mirror to include the offscreen-side `installFakeDisplayMedia`/`dispatchEndedOnTrack`/`getSegmentCount` methods so the harness page's `window.__mokoshHarness` impl typechecks against the same surface."
- path: "tests/uat/prototype/extension-page-harness.html"
provides: "DELETED in Wave 1 — moved to `tests/uat/extension-page-harness.html`."
- path: "tests/uat/prototype/extension-page-harness.ts"
provides: "DELETED in Wave 1 — moved to `tests/uat/extension-page-harness.ts`."
- path: "tests/uat/prototype/a6.test.ts"
provides: "DELETED in Wave 1 — moved to `tests/uat/a6.test.ts` AND folded into `tests/uat/harness.test.ts` as the A6 assertion (Wave 3B). The standalone `a6.test.ts` is retained as a single-assertion entry point for quick TDD iteration (`npx tsx tests/uat/a6.test.ts`), but the production gate is `npm run test:uat` which runs all 14."
- path: "tests/uat/prototype/probe_offscreen.mjs"
provides: "DELETED in Wave 0 — feasibility-research probes (already-falsified hypothesis verification scripts); no longer needed."
- path: "tests/uat/prototype/probe_sw.mjs"
provides: "DELETED in Wave 0 — see above."
- path: "tests/uat/prototype/probe_tabs.mjs"
provides: "DELETED in Wave 0 — see above."
- path: "tests/uat/prototype/probe_tabs2.mjs"
provides: "DELETED in Wave 0 — see above."
- path: "tests/background/no-test-hooks-in-prod-bundle.test.ts"
provides: "PRESERVED Tier-1 grep gate. Wave 0 audit: ensure the asserted-forbidden-strings list covers ALL hook surface names landed by Wave 1-3: `__mokoshTest`, `setCurrentStream`, `setSegmentCountGetter`, `installFakeDisplayMedia`, `uninstallFakeDisplayMedia`, `dispatchEndedOnTrack`, `getSegmentCount`, `__mokoshOffscreenQuery`. RED today (stale dist contains 01-11 prototype hook strings); GREEN after Wave 0 production build."
contains: "__mokoshTest"
- path: ".planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md"
provides: "AMENDMENT block appended at the END (after the existing 2026-05-17 Plan-01-11 amendment block from commit 9d0313a): redirects 01-09 Task 5 functional steps to `npm run test:uat` (the NOW-WORKING harness from 01-13). Operator retains step 1 (build) + step 14 (brand/design accept). Amendment preserves the existing 01-11 amendment (which never closed because 01-11 pivoted) — this 01-13 amendment notes the inheritance + supersedes the now-stale 01-11 reference."
contains: "Plan 01-13 harness closes Plan 01-09 functional contract"
- path: ".planning/STATE.md"
provides: "Wave 4 update: `completed_plans` increments 1 (01-13 closes; 01-11 already closed as spike-pivot per ba5474c); decision-log appends '[Phase 01-13]: Approach-B UAT harness landed (14/14 GREEN); inherits 01-11 spike-pivot rationale; Plan 01-09 functional contract closes via npm run test:uat; Tier-1 grep gate hook-string inventory updated for Approach-B surface set'."
- path: ".planning/ROADMAP.md"
provides: "Wave 4 update: appends `- [x] 01-13-PLAN.md — UAT harness via Approach B (14 assertions, Plan 01-09 closure)` to Phase 1 Plans list (after `01-07-PLAN.md` and any 01-08/01-09/01-10/01-11/01-12 entries that exist — the current list ends at 01-07 per inspection, so the orchestrator may need to add the intermediate entries during Wave 4 or surface the gap)."
key_links:
- from: "tests/uat/harness.test.ts"
to: "tests/uat/extension-page-harness.html"
via: "page.goto(chrome-extension://<id>/tests/uat/extension-page-harness.html)"
pattern: "chrome-extension.*/tests/uat/extension-page-harness\\.html"
- from: "tests/uat/harness.test.ts"
to: "window.__mokoshHarness.assertXX"
via: "page.evaluate via harness-page-driver.ts wrappers"
pattern: "__mokoshHarness\\.assert"
- from: "tests/uat/extension-page-harness.ts"
to: "src/test-hooks/offscreen-hooks.ts:dispatchEndedOnTrack"
via: "chrome.runtime.sendMessage({type:'__mokoshOffscreenQuery', op:'dispatch-ended'})"
pattern: "__mokoshOffscreenQuery"
- from: "tests/uat/extension-page-harness.ts"
to: "src/test-hooks/offscreen-hooks.ts:installFakeDisplayMedia"
via: "eager call at offscreen module load (auto-install) + bridge op as fallback"
pattern: "installFakeDisplayMedia"
- from: "src/offscreen/recorder.ts"
to: "src/test-hooks/offscreen-hooks.ts"
via: "gated dynamic import under __MOKOSH_UAT__ flag at top-of-module"
pattern: "__MOKOSH_UAT__"
- from: "src/background/index.ts"
to: "(no test-hook import — by design)"
via: "MV3 SW blocks dynamic import; Approach B reads SW state via harness-page chrome.action.* calls instead"
pattern: "^(?!.*await import.*test-hooks).*$"
- from: "tests/background/no-test-hooks-in-prod-bundle.test.ts"
to: "dist/ artifact tree"
via: "post-build grep for hook surface inventory (__mokoshTest, installFakeDisplayMedia, dispatchEndedOnTrack, getSegmentCount, etc.)"
pattern: "grep.*__mokoshTest.*dist"
- from: ".planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md (amendment)"
to: "npm run test:uat"
via: "appended amendment block redirecting 01-09 Task 5 functional steps"
pattern: "Plan 01-13 harness closes Plan 01-09 functional contract"
---
## Scope Sanity Note
**5 waves, 9 tasks (incl. 1 closing checkpoint), ~30 file artifacts.** This is above the "split signal" thresholds in `<scope_estimation>`, but consolidating is the right call AND has architectural precedent: Plan 01-11 was a 4-wave / 9-task plan covering the SAME 14-assertion charter; the only reason 01-13 exists separately is the architectural pivot, not new scope.
**Why we accept the borderline rather than split further:**
1. **The architecture is now proven** (prototype c647f61). The risk profile of 01-13 is execution-driven, not architecture-driven — splitting would multiply the per-plan ceremony tax (each plan re-deriving the harness-page contract in its own must_haves frontmatter) for no risk reduction.
2. **Wave 0 cleanup is a single atomic prerequisite** for every subsequent wave. Splitting it out would create a window where the baseline is dirty AND the next wave is partially landed — exactly the failure mode 01-11's spike-pivot exposed.
3. **Wave 3's 4 task bundles ARE the natural split.** Each bundle clusters 3-4 assertions by subsystem (Wave 3A: state machine + UI; Wave 3B: data flow + Bug B; Wave 3C: notifications + manifest; Wave 3D: recording continuity + export). One assertion per task would yield 13 Wave 3 tasks: ceremony overhead with no atomicity benefit, since each assertion is independently testable via the standalone harness page surface.
4. **Context budget:** Wave 0 ~10%; Wave 1 ~15%; Wave 2 ~25%; Wave 3A-3D ~15% each = ~60%; Wave 4 ~5%. Total ~115% — ABOVE the planner's 50% target if a single executor ran the whole plan. Mitigation: each wave is intended for a fresh-context executor spawn (the GSD execute-phase pattern). Per-executor context: ~25-30%, well within budget. The executor spawn pattern is the load-bearing assumption; if it doesn't hold, the natural split line is Wave 0+1+2 = Plan 01-13A; Wave 3+4 = Plan 01-13B (with the harness-page contract duplicated as the price of split).
**If a future revision DOES force a split,** natural cut line: Plan 01-13A = Waves 0+1+2 (cleanup + prototype promotion + driver utilities); Plan 01-13B = Waves 3+4 (13 functional assertions + closure). Wave 3's bundling stays inside 01-13B as 4 sub-tasks.
<objective>
Land the full 14-assertion UAT harness via Approach B (extension-internal-page harness + offscreen-side synthetic MediaStream + chrome.runtime.sendMessage bridge), inheriting from Plan 01-11's spike-pivot rationale and the proven prototype architecture (c647f61: A6 5/5 GREEN, Bug-B regression rewind verified, ~7s end-to-end runtime).
Three coordinated changes from the 01-11 baseline:
1. **Wave 0 cleanup.** Delete the broken Approach-A artifacts: `src/test-hooks/sw-hooks.ts` (MV3 SW blocks dynamic import), the dynamic-import block in `src/background/index.ts` (same), `tests/uat/lib/{launch,extension,sw,offscreen,assertions}.ts` (popup-bridge architecture wrong per 01-11-SUMMARY falsification 3), and `tests/uat/prototype/probe_*.mjs` (already-resolved feasibility probes). Two vitest failures (`sw-bundle-import.test.ts` + `no-test-hooks-in-prod-bundle.test.ts`) flip GREEN as a side-effect. Atomic commit.
2. **Promote prototype to production paths + build out driver scaffolding** (Waves 1-2). Move `tests/uat/prototype/{extension-page-harness.html,extension-page-harness.ts,a6.test.ts}` to `tests/uat/`. Update `vite.test.config.ts` rollup inputs. Rebuild `tests/uat/lib/{launch,assertions,harness-page-driver}.ts` around the extension-page architecture (NO popup-bridge primitives). Verify A6 still GREEN from the new paths.
3. **Wire 13 functional assertions (A1-A13)** via 4 task bundles in Wave 3, each extending `window.__mokoshHarness` with its 3-4 assertion methods (A6 carries over from the prototype) and the corresponding Puppeteer driver wrappers. Each task delivers an atomic commit; the two TDD canon demos (A6 Bug B + A8 Bug A regression rewinds) are documented in their respective commit bodies.
**Wave 4 closure:** amend 01-09 Task 5 to point at `npm run test:uat`; update STATE.md + ROADMAP.md; operator brand/design checkpoint surfaces the 14/14 PASS report and asks "approved" (the only operator-facing gate in the new world).
Operator role retirement: Plan 01-09 closes when `npm run test:uat` exits 0 + operator confirms step 14 (brand/design). All functional gates move to CI-callable harness — exactly the goal Plan 01-11 set out to achieve but couldn't deliver due to Approach-A architectural infeasibility.
Output:
- Wave 0: clean baseline (5 deletions + 1 revert + Tier-1 gate flips GREEN).
- Wave 1: prototype promoted to `tests/uat/extension-page-harness.{html,ts}` + standalone A6 entry at `tests/uat/a6.test.ts`; A6 PASSES from new path.
- Wave 2: `tests/uat/lib/{launch,assertions,harness-page-driver}.ts` rebuilt; A6 still PASSES via the new driver wrappers.
- Wave 3: the remaining 12 assertion methods added to `window.__mokoshHarness` (A6 is already present from Wave 1); 14/14 GREEN in `npm run test:uat`.
- Wave 4: 01-09 amendment block appended; STATE.md decision logged; ROADMAP.md Phase 1 plan list updated; operator checkpoint confirms.
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/REQUIREMENTS.md
@.planning/phases/01-stabilize-video-pipeline/01-CONTEXT.md
@.planning/phases/01-stabilize-video-pipeline/01-08-PLAN.md
@.planning/phases/01-stabilize-video-pipeline/01-08-SUMMARY.md
@.planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md
@.planning/phases/01-stabilize-video-pipeline/01-09-SUMMARY.md
@.planning/phases/01-stabilize-video-pipeline/01-11-PLAN.md
@.planning/phases/01-stabilize-video-pipeline/01-11-RESEARCH.md
@.planning/phases/01-stabilize-video-pipeline/01-11-SUMMARY.md
@.planning/debug/resolved/01-09-recovery-flow.md
@src/background/index.ts
@src/offscreen/recorder.ts
@src/test-hooks/offscreen-hooks.ts
@src/test-hooks/types.ts
@vite.test.config.ts
@vite.config.ts
@manifest.json
@package.json
@tsconfig.json
@tests/uat/prototype/extension-page-harness.html
@tests/uat/prototype/extension-page-harness.ts
@tests/uat/prototype/a6.test.ts
@tests/background/no-test-hooks-in-prod-bundle.test.ts
@tests/background/sw-bundle-import.test.ts
<interfaces>
<!-- Key types, paths, and Chrome/Puppeteer API surfaces the executor uses -->
<!-- Embedded here so the executor needs no codebase exploration -->
### Approach-B architecture (ratified by prototype c647f61 — DO NOT DEVIATE)
```
┌──────────────────────────────────────────────────────────────────────────┐
│ Node host process (Puppeteer driver: tests/uat/harness.test.ts)          │
│   • launches Chrome with enableExtensions: [dist-test/]                  │
│   • opens chrome-extension://<id>/tests/uat/extension-page-harness.html  │
│   • opens about:blank victim page + bringToFront                         │
│   • page.evaluate(() => window.__mokoshHarness.assertXX())               │
│   • reads structured AssertionResult, drives bail-on-fail                │
│   • does host-side fs/zip/ffprobe work (A5, A12, A13)                    │
└────────────────────────┬─────────────────────────────────────────────────┘
                         │ Puppeteer CDP page.evaluate
┌──────────────────────────────────────────────────────────────────────────┐
│ Extension-internal harness page (PRIVILEGED: full chrome.* API)          │
│ tests/uat/extension-page-harness.{html,ts} → window.__mokoshHarness      │
│   .assertA1..A13 orchestrate each assertion using:                       │
│   • chrome.action.getBadgeText / getPopup / setBadgeText / setPopup      │
│   • chrome.runtime.sendMessage (REQUEST_PERMISSIONS, SAVE_ARCHIVE,       │
│       RECORDING_ERROR, __mokoshOffscreenQuery, START_RECORDING)          │
│   • chrome.notifications.getAll / .create / .clear                       │
│   • chrome.offscreen.createDocument / .hasDocument                       │
│   • chrome.runtime.getManifest / .getURL                                 │
│   • fetch(chrome.runtime.getURL('icons/icon{N}.png')) for size check     │
└─┬──────────────────────┬────────────────────────────────────┬────────────┘
  │ direct chrome.* calls│ chrome.runtime.sendMessage         │
  │ (page has privilege) │ (cross-isolate path)               │
  ▼                      ▼                                    ▼
┌─────────────────┐  ┌────────────────────────────────┐ ┌──────────────────┐
│ SW (production) │  │ Offscreen (production)         │ │ Browser APIs     │
│ src/background/ │  │ src/offscreen/recorder.ts      │ │ (chrome.action,  │
│ index.ts        │  │ + src/test-hooks/              │ │  notifications,  │
│ UNCHANGED       │  │   offscreen-hooks.ts           │ │  downloads)      │
│ (no hooks)      │  │ gated by __MOKOSH_UAT__        │ │                  │
│                 │  │ installFakeDisplayMedia(),     │ │                  │
│                 │  │ dispatchEndedOnTrack(),        │ │                  │
│                 │  │ setSegmentCountGetter()        │ │                  │
└─────────────────┘  └────────────────────────────────┘ └──────────────────┘
```
### Puppeteer extension API surface (per c647f61 prototype, verified)
```typescript
import puppeteer, { type Browser, type Page } from 'puppeteer';
const browser: Browser = await puppeteer.launch({
  enableExtensions: ['/abs/path/to/dist-test'],
  headless: process.env.HEADLESS !== '0',
  pipe: true,
  protocolTimeout: 90_000, // headroom for sendMessage round-trips
  args: ['--no-sandbox'],
  // DO NOT add --auto-select-desktop-capture-source: unreliable in
  // --headless=new per 01-11-SUMMARY falsification 4; synthetic stream
  // bypasses the picker entirely.
});
const extensions = await browser.extensions();
const [extensionId] = [...extensions][0];
const victimPage = await browser.newPage();
await victimPage.goto('about:blank');
const page: Page = await browser.newPage();
await page.goto(`chrome-extension://${extensionId}/tests/uat/extension-page-harness.html`);
await page.waitForFunction(() => (window as any).__mokoshHarness !== undefined);
await victimPage.bringToFront();
const result = await page.evaluate(async () => {
  const r = await (window as any).__mokoshHarness.assertA6();
  return r;
});
```
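Several assertions have to wait for asynchronous state, such as the badge text settling after a message round-trip. The plan names a host-side `waitFor(probe, predicate, timeoutMs, description)` primitive in `tests/uat/lib/assertions.ts`; the sketch below is one plausible shape for it, not the prototype's actual code. The `intervalMs` parameter and its 100 ms default are assumptions added for illustration.

```typescript
// Hypothetical sketch of the polling primitive named in the plan.
// `intervalMs` (default 100 ms) is an assumption, not part of the plan's signature.
async function waitFor<T>(
  probe: () => Promise<T> | T,
  predicate: (value: T) => boolean,
  timeoutMs: number,
  description: string,
  intervalMs = 100,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    // Always probe at least once, even when timeoutMs is 0.
    const value = await probe();
    if (predicate(value)) return value;
    if (Date.now() >= deadline) {
      throw new Error(
        `waitFor timed out after ${timeoutMs}ms: ${description} ` +
          `(last value: ${JSON.stringify(value)})`,
      );
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A caller might poll the badge with something like `waitFor(() => page.evaluate(() => chrome.action.getBadgeText({})), (text) => text === 'REC', 5_000, 'badge shows REC')`, where `page` is the harness page opened above and the 5-second budget is illustrative.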
### Harness-page surface (extends prototype's window.__mokoshHarness)
```typescript
// tests/uat/extension-page-harness.ts: Wave 1 PROMOTED + Wave 3 EXTENDED.
interface AssertionResult {
  passed: boolean;
  name: string;
  checks: Array<{
    name: string;
    expected: unknown;
    actual: unknown;
    passed: boolean;
  }>;
  diagnostics: string[];
  error?: string;
}
// Augmented globally:
interface Window {
  __mokoshHarness: {
    assertA1: () => Promise<AssertionResult>; // SW bootstrap state
    assertA2: () => Promise<AssertionResult>; // toolbar onClicked → REC
    assertA3: () => Promise<AssertionResult>; // displaySurface monitor
    assertA4: () => Promise<AssertionResult>; // popup during recording
    assertA5: () => Promise<{ // SAVE_ARCHIVE returns blob bytes
      passed: boolean;
      zipBytes?: string; // base64
      diagnostics: string[];
      error?: string;
    }>;
    assertA6: () => Promise<AssertionResult>; // Bug B canonical (proven)
    assertA7: () => Promise<AssertionResult>; // genuine error → ERR + recovery
    assertA8: () => Promise<AssertionResult>; // Bug A onStartup notification
    assertA9: () => Promise<AssertionResult>; // icon file sizes
    assertA10: () => Promise<AssertionResult>; // manifest shape
    assertA11: () => Promise<AssertionResult>; // 35s → ≥3 segments
    assertA12: () => Promise<{ // ffprobe (host-side; returns webm bytes)
      passed: boolean;
      webmBytes?: string; // base64
      diagnostics: string[];
      error?: string;
    }>;
    assertA13: () => Promise<{ // zip shape (host-side; returns zip bytes)
      passed: boolean;
      zipBytes?: string; // base64
      expectedVersion: string;
      diagnostics: string[];
      error?: string;
    }>;
  };
}
```
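On the host side, a failed `AssertionResult` has to become a readable diagnostic dump before the run bails. The following is a minimal sketch of that mapping, assuming the `AssertionResult` shape above; `formatFailure` is a hypothetical helper name, and the output layout is illustrative rather than the prototype's actual format.

```typescript
interface Check {
  name: string;
  expected: unknown;
  actual: unknown;
  passed: boolean;
}

interface AssertionResult {
  passed: boolean;
  name: string;
  checks: Check[];
  diagnostics: string[];
  error?: string;
}

// Hypothetical helper: returns null for a passing result, otherwise a
// multi-line report listing the error, each failed check, and diagnostics.
function formatFailure(result: AssertionResult): string | null {
  if (result.passed) return null;
  const lines: string[] = [`FAIL ${result.name}`];
  if (result.error !== undefined) lines.push(`  error: ${result.error}`);
  for (const check of result.checks) {
    if (!check.passed) {
      lines.push(
        `  check "${check.name}": expected ${JSON.stringify(check.expected)}, ` +
          `actual ${JSON.stringify(check.actual)}`,
      );
    }
  }
  for (const line of result.diagnostics) lines.push(`  diag: ${line}`);
  return lines.join('\n');
}
```

For the Bug B regression described in the plan (badge `'ERR'` instead of `''`), this would report the `badge` check with both values quoted, which keeps an empty expected string visible in the dump.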
### Offscreen-hooks bridge protocol (UNCHANGED from c647f61)
```typescript
// chrome.runtime.sendMessage payload:
{ type: '__mokoshOffscreenQuery', op: 'install-fake-display-media' } // → { ok, error? }
{ type: '__mokoshOffscreenQuery', op: 'dispatch-ended' } // → { ok, error? }
{ type: '__mokoshOffscreenQuery', op: 'has-stream' } // → { hasStream }
{ type: '__mokoshOffscreenQuery', op: 'get-segment-count' } // NEW in Wave 3 → { count }
```
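The four bridge ops above lend themselves to a discriminated union, so that the compiler flags an unhandled `op` when a new one is added. This is a sketch under assumptions: `HookSurface`, `handleQuery`, and `attempt` are hypothetical names, and the real handler in `offscreen-hooks.ts` may be structured differently.

```typescript
type OffscreenOp =
  | 'install-fake-display-media'
  | 'dispatch-ended'
  | 'has-stream'
  | 'get-segment-count';

interface OffscreenQuery {
  type: '__mokoshOffscreenQuery';
  op: OffscreenOp;
}

type OffscreenReply =
  | { ok: boolean; error?: string }
  | { hasStream: boolean }
  | { count: number };

// Hypothetical view of the offscreen test surface, for illustration only.
interface HookSurface {
  installFakeDisplayMedia(): void;
  dispatchEndedOnTrack(): void;
  hasStream(): boolean;
  getSegmentCount(): number;
}

// Runs an action op and converts a thrown error into the { ok, error } reply.
function attempt(action: () => void): { ok: boolean; error?: string } {
  try {
    action();
    return { ok: true };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Exhaustive switch: adding an op to OffscreenOp without a case is a type error.
function handleQuery(msg: OffscreenQuery, hooks: HookSurface): OffscreenReply {
  switch (msg.op) {
    case 'install-fake-display-media':
      return attempt(() => hooks.installFakeDisplayMedia());
    case 'dispatch-ended':
      return attempt(() => hooks.dispatchEndedOnTrack());
    case 'has-stream':
      return { hasStream: hooks.hasStream() };
    case 'get-segment-count':
      return { count: hooks.getSegmentCount() };
  }
}
```

Keeping the dispatcher free of `chrome.*` calls means it can be unit-tested with a fake surface; the `chrome.runtime.onMessage` listener would then only forward the message and pass the reply to `sendResponse`.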
### MV3 SW constraint enforcement (per 01-11-SUMMARY falsification 1)
```typescript
// src/background/index.ts — WAVE 0 STATE (after revert):
// NO top-of-module dynamic import. The 01-11 Wave 1 block
// let testHooks: ... = null;
// if (__MOKOSH_UAT__) { testHooks = await import('../test-hooks/sw-hooks'); }
// is REMOVED entirely. Approach B does not need SW-side hooks —
// chrome.action.* state is queried by the harness page directly.
```
### Offscreen-side gate (UNCHANGED — works because offscreen IS a DOM document)
```typescript
// src/offscreen/recorder.ts lines 21-48 (PRESERVED):
let testHooks: typeof import('../test-hooks/offscreen-hooks') | null = null;
if (__MOKOSH_UAT__) {
  testHooks = await import('../test-hooks/offscreen-hooks');
}
// Wave 3 add (after existing setCurrentStream wire):
if (__MOKOSH_UAT__) {
  testHooks?.setCurrentStream(stream);
  testHooks?.setSegmentCountGetter(() => segments.length);
}
```
### Tier-1 grep gate forbidden-string inventory (Wave 0 audit + Wave 3 extension)
```typescript
// tests/background/no-test-hooks-in-prod-bundle.test.ts FORBIDDEN strings:
const FORBIDDEN_STRINGS = [
'__mokoshTest',
'setCurrentStream',
'setSegmentCountGetter',
'installFakeDisplayMedia',
'uninstallFakeDisplayMedia',
'dispatchEndedOnTrack',
'getSegmentCount',
'__mokoshOffscreenQuery',
];
// Every entry must be absent from EVERY file under dist/ post-build.
```
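The "absent from every file under dist/" rule can be sketched as a recursive scan. This is an illustrative sketch only, not the body of `no-test-hooks-in-prod-bundle.test.ts`; `scanText`, `scanDir`, and `Offender` are hypothetical names.

```typescript
// Sketch only: helper names are assumptions, not the real gate test.
import { readdirSync, readFileSync } from 'node:fs';
import { join } from 'node:path';

export interface Offender {
  file: string;
  needle: string;
}

/** Pure check: which forbidden strings occur in one file's text? */
export function scanText(
  file: string,
  text: string,
  forbidden: readonly string[],
): Offender[] {
  return forbidden
    .filter((needle) => text.includes(needle))
    .map((needle) => ({ file, needle }));
}

/** Walk a build output directory and collect every forbidden-string hit. */
export function scanDir(dir: string, forbidden: readonly string[]): Offender[] {
  const offenders: Offender[] = [];
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    const full = join(dir, entry.name);
    if (entry.isDirectory()) {
      offenders.push(...scanDir(full, forbidden));
    } else {
      // latin1 maps each byte to one character, so binary assets are
      // scanned without decode errors.
      offenders.push(...scanText(full, readFileSync(full, 'latin1'), forbidden));
    }
  }
  return offenders;
}
```

A gate built this way would assert `scanDir('dist', FORBIDDEN_STRINGS).length === 0` after a production build, and could print the offender list on failure to show which chunk leaked which hook name.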
### Test bundle rollup input (vite.test.config.ts — Wave 1 update)
```typescript
rollupOptions: {
input: {
// Removed: prototype_harness → moved to extension_page_harness
extension_page_harness: 'tests/uat/extension-page-harness.html',
},
},
```
### npm scripts (UNCHANGED from 01-11 Wave 0)
```json
{
"scripts": {
"build:test": "tsc && vite build --mode test --config vite.test.config.ts",
"test:uat": "npm run build:test && tsx tests/uat/harness.test.ts"
}
}
```
### Resolved open questions
| # | Question | Resolution | Rationale |
|---|----------|------------|-----------|
| 1 | Task granularity for Wave 3: 4 bundles of 3-4 assertions, or 13 separate tasks? | **4 bundles** (T-Wave3A: A1+A2+A3+A4; T-Wave3B: A5+A6+A7; T-Wave3C: A8+A9+A10; T-Wave3D: A11+A12+A13). | Balances ceremony overhead (13 commits vs 4) against atomicity (per-assertion vs subsystem-cluster). The bundle boundaries follow subsystem coupling: each bundle's assertions share fixture state (e.g. T-Wave3A all run against a single launch; T-Wave3D all need the 35s recording). |
| 2 | Manifest `tabs` permission gap (per 01-11-SUMMARY) | **Workaround retained (no scope creep).** The prototype's A6 implementation sends `START_RECORDING` directly to the offscreen via `chrome.runtime.sendMessage`, bypassing the SW's `startVideoCapture` which requires `chrome.tabs.query({active:true})` to return a tab with `.url` (which it does NOT without `tabs` permission). Wave 3 A2 (toolbar onClicked) uses the same direct-offscreen path. Flagged for Phase 5 hardening: adding `tabs` permission to manifest would unlock testing the real toolbar onClicked → startVideoCapture path; out of scope for the harness plan. | Adding manifest permissions in a TEST plan is wrong on scope grounds (changes production attack surface). The harness verifies the contract that matters (recording starts; bug B routing works); the routing-via-startVideoCapture vs direct-offscreen distinction is orthogonal to the Bug B fix verification. |
| 3 | Failure isolation: single browser vs per-assertion restart? | **Single browser, serial assertions, bail-on-first-failure, structured diagnostic dump.** Matches prototype c647f61 pattern; matches 01-11 RESEARCH §5 recommendation + open-question resolution 4. | Per-assertion restart = 14 × ~3-5s ≈ 40-70s overhead. Single browser keeps total runtime in the 60-90s range. State bleed is acceptable for 14 deterministic assertions where each one's pre-condition is established by its own setup steps (e.g. A6's "wait for badge='REC'" pre-condition is independent of A5's state). The bail-on-fail + diagnostic dump preserves debugging value. |
| 4 | CI plumbing scope | **Defer.** No `.github/workflows/` directory exists; introducing CI tooling here would force a CI-tool decision (Actions vs self-hosted vs other) out of scope for the harness landing plan. The harness is CI-callable today (`npm run test:uat` exits 0 on pass, non-zero on fail, deterministic exit codes). | Matches 01-11 RESEARCH open-question resolution 3 verbatim. Phase 5 hardening backlog. |
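
Resolution 3 (serial assertions, bail on first failure) together with the exit-code contract in resolution 4 can be sketched as a small host-side loop. This is a sketch under assumptions: `runSerial` and the trimmed `StepResult` shape are hypothetical, and the real orchestrator would also dump the console buffers on failure.

```typescript
// Sketch only: names and the result shape are illustrative assumptions.
interface StepResult {
  name: string;
  passed: boolean;
  diagnostics: string[];
  error?: string;
}

type StepFn = () => Promise<StepResult>;

export async function runSerial(
  steps: ReadonlyArray<readonly [string, StepFn]>,
): Promise<{ results: StepResult[]; exitCode: 0 | 1 }> {
  const results: StepResult[] = [];
  for (const [name, fn] of steps) {
    let result: StepResult;
    try {
      result = await fn();
    } catch (err) {
      // A thrown error counts as a failure of that assertion.
      result = {
        name,
        passed: false,
        diagnostics: [],
        error: err instanceof Error ? err.message : String(err),
      };
    }
    results.push(result);
    // Bail on first failure: later assertions never run, so their state
    // cannot obscure the diagnostics of the one that failed.
    if (!result.passed) return { results, exitCode: 1 };
  }
  return { results, exitCode: 0 };
}
```

The caller would pass `exitCode` to `process.exit`, which is what keeps `npm run test:uat` usable from any future CI runner without further plumbing.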
### How A6 / A8 RED-on-regression demos work (commit body documentation contract)
**A6 (Bug B canonical) — T-Wave3B commit body MUST include:**
```
RED-on-regression demo (A6 Bug B regression rewind):
$ # Apply local-only revert of Bug B fix at b9eeeeb:
$ # On src/background/index.ts:776, change `if (errorCode === 'user-stopped-sharing')` to `if (false)`
$ npm run build:test
$ npm run test:uat # OR: npx tsx tests/uat/a6.test.ts
# A6 result: FAIL
# A6.1: badge text is '' (NOT 'ERR') after user-stop — expected '', actual 'ERR'
$ git checkout -- src/background/index.ts
$ npm run build:test
$ npm run test:uat
# A6 result: PASS 5/5
This proves the harness can catch a Bug B regression in the SW state machine.
```
**A8 (Bug A canonical) — T-Wave3C commit body MUST include:**
```
RED-on-regression demo (A8 Bug A regression rewind):
$ # Apply local-only icon stub on src/background/index.ts:71:
$ # const NOTIFICATION_ICON_PATH = 'icons/missing.png';
$ # OR: truncate icons/icon128.png to <100 bytes
$ npm run build:test
$ npm run test:uat
# A8 result: FAIL
# A8.1: notification count delta === 1 — expected 1, actual 0 (Chrome imageUtil rejected create)
$ git checkout -- src/background/index.ts icons/icon128.png
$ npm run build:test
$ npm run test:uat
# A8 result: PASS
This proves the harness can catch a Bug A regression in the notification icon path.
```
</interfaces>
</context>
<tasks>
<task type="auto" tdd="true">
<name>Task 1 (Wave 0): Clean broken Approach-A artifacts per 01-11-SUMMARY; restore baseline GREEN.</name>
<read_first>
- .planning/phases/01-stabilize-video-pipeline/01-11-SUMMARY.md (the spec for what 01-13 inherits + what gets deleted)
- src/background/index.ts lines 13-29 (the dynamic-import block to revert)
- src/test-hooks/sw-hooks.ts (the broken file to delete — read once to confirm what's being removed, don't try to fix it)
- tests/uat/lib/{launch,extension,sw,offscreen,assertions}.ts (popup-bridge scaffolding to delete)
- tests/uat/prototype/probe_*.mjs (feasibility probes to delete)
- tests/background/no-test-hooks-in-prod-bundle.test.ts (Tier-1 grep gate — confirm forbidden-string list covers the post-Wave-0 surface)
- tests/background/sw-bundle-import.test.ts (the second currently-failing test; understand why it fails — likely needs fresh dist/)
</read_first>
<files>src/test-hooks/sw-hooks.ts, src/background/index.ts, tests/uat/lib/launch.ts, tests/uat/lib/extension.ts, tests/uat/lib/sw.ts, tests/uat/lib/offscreen.ts, tests/uat/lib/assertions.ts, tests/uat/prototype/probe_offscreen.mjs, tests/uat/prototype/probe_sw.mjs, tests/uat/prototype/probe_tabs.mjs, tests/uat/prototype/probe_tabs2.mjs, tests/background/no-test-hooks-in-prod-bundle.test.ts</files>
<behavior>
- DELETE `src/test-hooks/sw-hooks.ts` (broken per 01-11-SUMMARY: MV3 SW blocks dynamic import; the file's monkey-patches never run because the await import that loads them never resolves; the entire file is dead-on-arrival).
- CONFIRM the revert of `src/background/index.ts` lines 13-29. At current head ba5474c (per inspection) these lines are already a COMMENT block stating that no dynamic import lands here; verify the comment is accurate and concise. If any actual `if (__MOKOSH_UAT__) { await import(...) }` block exists, REMOVE it. This task only ensures the state is clean.
- DELETE the popup-bridge scaffolding under `tests/uat/lib/`:
- `tests/uat/lib/launch.ts` (popup-bridge launch helper; will be rewritten in Wave 2)
- `tests/uat/lib/extension.ts` (popup-bridge extension-id resolver)
- `tests/uat/lib/sw.ts` (sw.evaluate helpers — falsified per SUMMARY §2)
- `tests/uat/lib/offscreen.ts` (popup-bridge offscreen helpers)
- `tests/uat/lib/assertions.ts` (will be rewritten in Wave 2 with Approach-B primitives)
Keep `tests/uat/lib/zip.ts` (still valid — host-side jszip work).
Keep `tests/uat/lib/test-hook-contract.d.ts` (still valid — type contract mirror).
- DELETE the feasibility-research probes under `tests/uat/prototype/`:
- `tests/uat/prototype/probe_offscreen.mjs`
- `tests/uat/prototype/probe_sw.mjs`
- `tests/uat/prototype/probe_tabs.mjs`
- `tests/uat/prototype/probe_tabs2.mjs`
KEEP `tests/uat/prototype/extension-page-harness.{html,ts}` + `tests/uat/prototype/a6.test.ts` for Wave 1 promotion.
- AUDIT `tests/background/no-test-hooks-in-prod-bundle.test.ts` forbidden-string list. Current list (per the file's preamble) covers `__mokoshTest`, `simulateUserStop`, `getSegmentCount`, `setCurrentStream`, `setSegmentCountGetter`. Add: `installFakeDisplayMedia`, `uninstallFakeDisplayMedia`, `dispatchEndedOnTrack`, `__mokoshOffscreenQuery`. Remove: `simulateUserStop` (was Approach-A naming; Approach B uses `dispatchEndedOnTrack`). The total forbidden inventory after this audit: 8 strings.
- DELETE `tests/uat/harness.test.ts`. It currently imports from `./lib/assertions` and other modules deleted above, so leaving it in place would break typecheck on stale imports. Deleting it is the simplest path because it will be entirely rewritten in Waves 1-3 around the extension-page architecture. (The standalone harness entry will land in Wave 1; the orchestrator-bundled harness in Waves 2-3.) Note that the `test:uat` npm script points at this file, so that script has no entry point until the rewrite lands. Document the deletion in the commit body.
- After deletions: `npm run build` exits 0; `npm run build:test` exits 0; `dist/` and `dist-test/` both populated. The Tier-1 grep gate test passes against the new forbidden-string list (which it should — no production code references any hook string after the SW-side revert). The `sw-bundle-import.test.ts` flips GREEN once `npm run build` runs (it was failing because `dist/service-worker-loader.js` was stale/missing).
- Full vitest suite: 89 GREEN. The two tests that were failing on stale state now pass: the Tier-1 gate (with the updated forbidden list) and `sw-bundle-import.test.ts` (against a fresh `dist/`).
- `npx tsc --noEmit` exits 0 (the deletions remove stale imports; the harness.test.ts deletion removes its stale import chain).
</behavior>
<action>
1. Read `src/background/index.ts` lines 13-29; confirm the current state is comment-only (no `await import` block). If a dynamic-import block exists, edit to remove it; if only the comment block exists, edit to refine the comment to reflect 01-13 status: "Plan 01-13: NO SW-side test hook gate. MV3 SW blocks dynamic import (01-11 falsification 1 / Chromium es_modules.md). Approach B reads SW state via extension-internal harness page's chrome.action.* calls — see tests/uat/extension-page-harness.ts." Keep the existing Tier-1 grep gate citation.
2. Delete files:
```
git rm src/test-hooks/sw-hooks.ts
git rm tests/uat/lib/launch.ts
git rm tests/uat/lib/extension.ts
git rm tests/uat/lib/sw.ts
git rm tests/uat/lib/offscreen.ts
git rm tests/uat/lib/assertions.ts
git rm tests/uat/prototype/probe_offscreen.mjs
git rm tests/uat/prototype/probe_sw.mjs
git rm tests/uat/prototype/probe_tabs.mjs
git rm tests/uat/prototype/probe_tabs2.mjs
git rm tests/uat/harness.test.ts
```
Use `git rm` to keep the index in sync. The harness.test.ts deletion is intentional — it will be reborn in Wave 1.
3. Edit `tests/background/no-test-hooks-in-prod-bundle.test.ts` FORBIDDEN_STRINGS array (or whatever the existing constant is named):
- Remove: `simulateUserStop`
- Add: `installFakeDisplayMedia`, `uninstallFakeDisplayMedia`, `dispatchEndedOnTrack`, `__mokoshOffscreenQuery`
Update the file preamble to cite 01-13 (replace any "Plan 01-11 Task 1" references with "Plan 01-13 Wave 0" where the description was about the gate's CURRENT scope, NOT historical provenance — preserve historical provenance for traceability).
4. Run `npm run build` (production); confirm exit 0; confirm `dist/service-worker-loader.js` exists.
5. Run `grep -rln '__mokoshTest\|installFakeDisplayMedia\|uninstallFakeDisplayMedia\|dispatchEndedOnTrack\|getSegmentCount\|setCurrentStream\|setSegmentCountGetter\|__mokoshOffscreenQuery' dist/` → 0 matches.
6. Run `npm run build:test` (test); confirm exit 0; confirm `dist-test/` populated. Confirm `grep -rln 'installFakeDisplayMedia\|dispatchEndedOnTrack' dist-test/` → ≥1 match (offscreen-hooks chunk).
7. Run `npx tsc --noEmit` → exit 0 (no stale imports left in tests/).
8. Run `npx vitest run --reporter=dot` → 89 GREEN (the two prior failures flip; no new failures).
9. Commit atomically: `chore(01-13): wave-0 — clean broken Approach A artifacts per 01-11-SUMMARY`. Commit body cites: (a) sw-hooks.ts deletion + SW dynamic-import revert + falsification reference; (b) popup-bridge tests/uat/lib/* deletions + falsification reference; (c) feasibility probe deletions; (d) Tier-1 gate forbidden-string list update + rationale; (e) harness.test.ts deletion (will be rewritten in Wave 1).
Per project style: NO `as any`; NO `@ts-ignore`; absolute imports; extensive comments for the Tier-1 gate edit explaining the surface-inventory expansion.
</action>
<verify>
<automated>npm run build && grep -rln '__mokoshTest\|installFakeDisplayMedia\|uninstallFakeDisplayMedia\|dispatchEndedOnTrack\|getSegmentCount\|setCurrentStream\|setSegmentCountGetter\|__mokoshOffscreenQuery' dist/ | wc -l | grep -q '^0$' && npm run build:test && test -d dist-test && npx tsc --noEmit && npx vitest run --reporter=dot</automated>
</verify>
<acceptance_criteria>
- `src/test-hooks/sw-hooks.ts` does not exist.
- `src/background/index.ts` lines 13-29 are comment-only (NO `await import` block); comment text references 01-13 + the architectural constraint.
- `tests/uat/lib/{launch,extension,sw,offscreen,assertions}.ts` do not exist; `tests/uat/lib/{zip.ts,test-hook-contract.d.ts}` retained.
- `tests/uat/prototype/probe_*.mjs` do not exist; `tests/uat/prototype/{extension-page-harness.{html,ts},a6.test.ts}` retained.
- `tests/uat/harness.test.ts` does not exist (will be reborn in Wave 1).
- `tests/background/no-test-hooks-in-prod-bundle.test.ts` FORBIDDEN_STRINGS list contains exactly the 8 hooks per the inventory in interfaces; preamble updated to cite 01-13.
- `npm run build` exit 0; `grep -rln ...` returns 0 matches in `dist/`.
- `npm run build:test` exit 0; `dist-test/` populated.
- `npx tsc --noEmit` exit 0.
- `npx vitest run` exit 0 with 89 GREEN.
- Commit message follows Mark's `<type>(<scope>): <subject>` style with em-dash separator + `Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>` trailer.
</acceptance_criteria>
<done>Baseline GREEN; broken Approach-A artifacts deleted; Tier-1 grep gate updated for Approach-B surface inventory; ready for Wave 1 prototype promotion.</done>
</task>
<task type="auto" tdd="true">
<name>Task 2 (Wave 1): Promote c647f61 prototype to production paths; A6 stays GREEN from new path.</name>
<read_first>
- tests/uat/prototype/extension-page-harness.html (current PROTOTYPE; will be moved)
- tests/uat/prototype/extension-page-harness.ts (current PROTOTYPE; comments need 01-13 update)
- tests/uat/prototype/a6.test.ts (current PROTOTYPE; comments need 01-13 update)
- vite.test.config.ts (rollup input update needed)
- src/test-hooks/offscreen-hooks.ts (extending in Wave 3; for now confirm it works with the promoted paths — the bridge protocol is path-agnostic)
- src/test-hooks/types.ts (will be extended in Wave 3 with installFakeDisplayMedia/dispatchEndedOnTrack/uninstallFakeDisplayMedia typed fields; in Wave 1 just confirm the cross-cast in offscreen-hooks.ts still works)
</read_first>
<files>tests/uat/extension-page-harness.html, tests/uat/extension-page-harness.ts, tests/uat/a6.test.ts, tests/uat/prototype/extension-page-harness.html, tests/uat/prototype/extension-page-harness.ts, tests/uat/prototype/a6.test.ts, vite.test.config.ts</files>
<behavior>
- Move (via git mv) `tests/uat/prototype/extension-page-harness.html` → `tests/uat/extension-page-harness.html`.
- Move (via git mv) `tests/uat/prototype/extension-page-harness.ts` → `tests/uat/extension-page-harness.ts`.
- Move (via git mv) `tests/uat/prototype/a6.test.ts` → `tests/uat/a6.test.ts`.
- The `tests/uat/prototype/` directory is now EMPTY. Git tracks files, not directories, so there is nothing left to `git rm`; delete the leftover directory from the working tree if it still exists on disk.
- Update comments in the moved files: replace "PROTOTYPE" / "Plan 01-11 PROTOTYPE" references with "Plan 01-13 production harness" where the comment was describing the file's CURRENT role (NOT its historical provenance — preserve "originally landed as 01-11 prototype at c647f61" where the comment was describing provenance).
- Update `vite.test.config.ts` rollup inputs: replace `prototype_harness: 'tests/uat/prototype/extension-page-harness.html'` with `extension_page_harness: 'tests/uat/extension-page-harness.html'`. Update the inline comment to reflect the new path (no more "prototype" reference).
- Update path references in the moved files:
- `tests/uat/extension-page-harness.html` line 9 (HTML preamble): change `chrome-extension://&lt;id&gt;/tests/uat/prototype/extension-page-harness.html` → `chrome-extension://&lt;id&gt;/tests/uat/extension-page-harness.html`.
- `tests/uat/extension-page-harness.ts`: update the file-header docstring's path reference from the prototype path to the production path. Keep the architectural narrative + research findings intact.
- `tests/uat/a6.test.ts`: update the `harnessUrl` constant (line ~176): `chrome-extension://${extensionId}/tests/uat/extension-page-harness.html` (drop `/prototype/`).
- After moves + comment updates + config update: `npm run build:test` exits 0 + emits `dist-test/tests/uat/extension-page-harness.html` (or whatever path crxjs picks; verify by `ls dist-test/`). Run `npx tsx tests/uat/a6.test.ts` → exits 0 with "A6 result: PASS" (5/5 checks GREEN).
- Full vitest suite: 89 GREEN (no unit-test regression — the moves don't touch any vitest-discovered files).
- `npx tsc --noEmit` exit 0.
</behavior>
<action>
1. `git mv tests/uat/prototype/extension-page-harness.html tests/uat/extension-page-harness.html`
2. `git mv tests/uat/prototype/extension-page-harness.ts tests/uat/extension-page-harness.ts`
3. `git mv tests/uat/prototype/a6.test.ts tests/uat/a6.test.ts`
4. After the moves, `ls tests/uat/prototype/` should show nothing. Git does not track empty directories, so the commit needs no extra step; if the empty directory is still on disk, `rmdir tests/uat/prototype` removes it.
5. Edit `tests/uat/extension-page-harness.html`:
- Update the `<p>` line referencing the file path: change `/tests/uat/prototype/extension-page-harness.html` → `/tests/uat/extension-page-harness.html`.
- Page title: keeping "(extension-internal page)" is recommended for clarity per project verbosity style; dropping it as redundant is at planner discretion.
6. Edit `tests/uat/extension-page-harness.ts`:
- File-header docstring: change "Plan 01-11 PROTOTYPE" → "Plan 01-13 production UAT harness (inherited from 01-11 prototype c647f61 per 01-11-SUMMARY architectural pivot)".
- Update the path reference in the docstring from `tests/uat/prototype/extension-page-harness.html` to `tests/uat/extension-page-harness.html`.
- Keep ALL the existing assertA6 implementation, the helper functions (waitFor, sendMessageWithTimeout, ensureOffscreen, startRecording, offscreenQuery, getActiveNotificationCount), the architectural-finding comment block, and the global Window augmentation. These are the load-bearing code; do not modify their logic.
- The `window.__mokoshHarness` install at the bottom should already only expose `assertA6` — leave as-is; Wave 3 will extend it.
7. Edit `tests/uat/a6.test.ts`:
- File-header docstring: "Plan 01-11 PROTOTYPE" → "Plan 01-13 standalone A6 entry point for TDD iteration".
- Update `harnessUrl` constant (line ~176): drop `/prototype/`.
- Keep ALL the puppeteer launch + page + result-print + main entry logic. These are the load-bearing test plumbing.
8. Edit `vite.test.config.ts`:
- Replace `prototype_harness: 'tests/uat/prototype/extension-page-harness.html'` with `extension_page_harness: 'tests/uat/extension-page-harness.html'`.
- Update the surrounding comment to reflect the new path + rename.
- Preserve the `modulePreload: { polyfill: false }` line (CRITICAL SW FIX per 01-11-SUMMARY).
9. Run `npm run build:test` → exits 0; verify `ls dist-test/` shows the harness HTML emitted under the expected path (likely `dist-test/tests/uat/extension-page-harness.html` per crxjs conventions; the exact path is verified by inspection).
10. Run `npx tsx tests/uat/a6.test.ts` → exits 0 with PASS report. (If FAIL: triage immediately — the move broke something. Most likely culprit: the harness page can't load because the rollup emission path differs from the URL the test fetches; cross-check `ls dist-test/` against the URL in a6.test.ts:176 and align.)
11. Run `npx tsc --noEmit` → exit 0.
12. Run `npx vitest run --reporter=dot` → 89 GREEN.
13. Run `npm run build && grep -rln '__mokoshTest\|installFakeDisplayMedia\|dispatchEndedOnTrack' dist/ | wc -l` → 0 (Tier-1 grep gate stays GREEN; the moves don't touch production code).
14. Commit atomically: `feat(01-13): wave-1 — promote c647f61 prototype to production paths; A6 GREEN`. Commit body: lists each file move, the comment updates, the vite config update, and the verification that A6 still passes 5/5 from the new path.
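
The triage in step 10 amounts to checking that the URL requested by `a6.test.ts` maps onto a file that `npm run build:test` actually emitted. A minimal sketch of that mapping, assuming the emitted layout mirrors the source path (the plan leaves the exact path to be verified with `ls dist-test/`); `harnessUrl` and `expectedEmittedPath` are hypothetical helper names:

```typescript
// Sketch only: helper names are assumptions; the emitted path must still be
// confirmed by inspecting dist-test/ after a test build.

/** Build the extension-internal URL for a file emitted under dist-test/. */
export function harnessUrl(extensionId: string, emittedRelPath: string): string {
  const clean = emittedRelPath.replace(/^\/+/, '');
  return `chrome-extension://${extensionId}/${clean}`;
}

/** Inverse mapping: the dist-test/ relative path a harness URL expects. */
export function expectedEmittedPath(url: string): string {
  const parsed = new URL(url);
  if (parsed.protocol !== 'chrome-extension:') {
    throw new Error(`not an extension URL: ${url}`);
  }
  return parsed.pathname.replace(/^\/+/, '');
}
```

If `expectedEmittedPath(harnessUrl(id, p))` names a file that is missing from `dist-test/`, the rollup input key or the `harnessUrl` constant is the first place to look.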
</action>
<verify>
<automated>npm run build:test && npx tsc --noEmit && npx tsx tests/uat/a6.test.ts && npx vitest run --reporter=dot && npm run build && test "$(grep -rln '__mokoshTest\|installFakeDisplayMedia\|dispatchEndedOnTrack' dist/ 2>/dev/null | wc -l)" = "0"</automated>
</verify>
<acceptance_criteria>
- `tests/uat/extension-page-harness.html` + `tests/uat/extension-page-harness.ts` + `tests/uat/a6.test.ts` exist at production paths; comments updated to reference 01-13.
- `tests/uat/prototype/` is empty/removed.
- `vite.test.config.ts` `rollupOptions.input.extension_page_harness` points at the new path.
- `npx tsx tests/uat/a6.test.ts` exits 0 with "A6 result: PASS" + 5/5 checks GREEN.
- `npm run build:test` exit 0; `npm run build` exit 0; production grep gate stays GREEN.
- `npx tsc --noEmit` exit 0; `npx vitest run` 89 GREEN.
- Commit message follows Mark's style.
</acceptance_criteria>
<done>Prototype promoted to production paths; A6 functional; baseline preserved; ready for Wave 2 driver scaffolding.</done>
</task>
<task type="auto" tdd="true">
<name>Task 3 (Wave 2): Build out Approach-B harness driver utilities (launch + assertions + harness-page-driver); A6 still GREEN via new driver.</name>
<read_first>
- tests/uat/a6.test.ts (the standalone driver — the model for what launch.ts + harness-page-driver.ts will abstract)
- tests/uat/extension-page-harness.ts (the surface to call via harness-page-driver)
- tests/uat/lib/zip.ts (kept from 01-11; harness-side jszip work — confirm compat)
- tests/uat/lib/test-hook-contract.d.ts (kept from 01-11; type mirror)
- src/test-hooks/offscreen-hooks.ts (the bridge protocol — confirm the harness-page-driver's `evaluate` calls match the offscreen-hooks bridge ops)
</read_first>
<files>tests/uat/lib/launch.ts, tests/uat/lib/assertions.ts, tests/uat/lib/harness-page-driver.ts</files>
<behavior>
- `tests/uat/lib/launch.ts` (NEW): exports `launchHarnessBrowser(options?: { headless?: boolean; downloadsDir?: string }): Promise<HarnessHandles>` returning `{ browser, extensionId, harnessPage, victimPage, downloadsDir, swConsole, offConsole }`. Implementation mirrors `tests/uat/a6.test.ts` launchChrome + victim/harness page setup verbatim, refactored to a reusable helper. `downloadsDir` defaults to `mkdtempSync(join(tmpdir(), 'mokosh-uat-'))`. Wires Chrome download path via CDP `Browser.setDownloadBehavior` so A5 SAVE_ARCHIVE downloads land in `downloadsDir`. `swConsole`/`offConsole` are accumulating string[] buffers populated by `worker.on('console', ...)` for the SW and by a `browser.on('targetcreated', ...)` listener for the offscreen document (best-effort offscreen attach per prototype pattern).
- `tests/uat/lib/assertions.ts` (REWRITTEN): exports `runAssertion(name, fn, buffers)` (wraps a single assertion with try/catch + diagnostic dump on failure; `buffers` is a `ConsoleBuffers` value), `assertEqual`/`assertGte`/`assertMatch`/`assertTrue` (structured failure messages; use `node:assert/strict` under the hood), and `waitFor(probe, predicate, timeoutMs, description)` (a host-side copy of the prototype's polling primitive, taken verbatim from extension-page-harness.ts; the page-side copy stays bundled into the harness page, so the two are identical implementations with no shared module). Defines the `CheckRecord`, `AssertionRecord`, and `ConsoleBuffers` types.
- `tests/uat/lib/harness-page-driver.ts` (NEW): exports one driver function per assertion: `driveA1(page)`, `driveA2(page)`, ..., `driveA13(page)`. Each is a thin wrapper around `page.evaluate(() => window.__mokoshHarness.assertXX())` that returns the structured `AssertionResult` (or the extended shape for A5/A12/A13 with `zipBytes`/`webmBytes`). Centralizing this means adding/renaming an assertion = two-file edit (extension-page-harness.ts impl + this driver wrapper) instead of touching every place that calls it.
- Wave 2 ONLY wires `driveA6`. Driver wrappers for A1-A5, A7-A13 are stubbed (`throw new Error('NOT YET IMPLEMENTED — Wave 3 wires this')`) so Wave 3 fills them in.
- Rewrite `tests/uat/a6.test.ts` to use `launchHarnessBrowser` + `driveA6` (drops ~80 LoC of plumbing duplicated from launch.ts). The test stays GREEN — same A6 5/5 PASS outcome, but via the shared lib.
- `npx tsc --noEmit` exit 0; `npx tsx tests/uat/a6.test.ts` exit 0 with PASS report.
- `npm run build:test` exit 0; `npm run build` exit 0; Tier-1 grep gate GREEN.
- Full vitest suite: 89 GREEN.
</behavior>
<action>
1. Create `tests/uat/lib/launch.ts`:
```typescript
// tests/uat/lib/launch.ts — Plan 01-13 Wave 2.
//
// Approach-B harness launch helper. Inherits the Puppeteer launch +
// victim-page-bringToFront + harness-page-open pattern from the proven
// tests/uat/a6.test.ts prototype (commit c647f61). Refactored into a
// reusable helper so Wave 3's 13 assertion drivers share the same
// setup overhead (one Chrome launch + one harness page + one victim
// page per `npm run test:uat` run).
//
// Architectural commitments (per 01-11-SUMMARY):
// - Drive Chrome FROM INSIDE: harnessPage runs at
// chrome-extension://<id>/tests/uat/extension-page-harness.html
// with full chrome.* API access.
// - victimPage is about:blank brought to front so production
// chrome.tabs.query({active:true}) sees a real tab (the harness
// page itself is a chrome-extension:// URL with no .url surfaced
// without `tabs` permission — workaround for the missing-permission
// gap; flagged for Phase 5 hardening).
// - Downloads land in a per-run tmp dir (mkdtempSync) so A5 polling
// does not collide with operator downloads.
// - SW + offscreen consoles forwarded to swConsole/offConsole
// accumulating buffers (best-effort; offscreen attach via
// targetcreated listener — opportunistic per prototype pattern).
//
// References:
// - puppeteer.launch options: https://pptr.dev/api/puppeteer.launchoptions
// - CDP Browser.setDownloadBehavior:
// https://chromedevtools.github.io/devtools-protocol/tot/Browser/#method-setDownloadBehavior
import { mkdtempSync, existsSync, statSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { dirname, join, resolve as resolvePath } from 'node:path';
import { fileURLToPath } from 'node:url';
import puppeteer, { type Browser, type Page } from 'puppeteer';
const HARNESS_FILE_DIR = dirname(fileURLToPath(import.meta.url));
const REPO_ROOT = resolvePath(HARNESS_FILE_DIR, '..', '..', '..');
const DIST_TEST_DIR = resolvePath(REPO_ROOT, 'dist-test');
export interface HarnessHandles {
browser: Browser;
extensionId: string;
harnessPage: Page;
victimPage: Page;
downloadsDir: string;
swConsole: string[];
offConsole: string[];
}
export interface LaunchOptions {
headless?: boolean;
downloadsDir?: string;
}
export async function launchHarnessBrowser(opts: LaunchOptions = {}): Promise<HarnessHandles> {
// ... implementation per the a6.test.ts pattern, refactored.
// 1. assertBundlePresent() — fail loudly if dist-test/ missing.
// 2. puppeteer.launch with enableExtensions + protocolTimeout + args.
// 3. resolve extensionId from browser.extensions() (poll up to 5s).
// 4. mkdtempSync the downloadsDir (if not provided).
// 5. open victimPage about:blank + bringToFront.
// 6. open harnessPage at chrome-extension://<id>/tests/uat/extension-page-harness.html.
// 7. page.waitForFunction for window.__mokoshHarness presence (5s timeout).
// 8. wire SW console listener (worker.on('console', ...)) into swConsole buffer.
// 9. wire offscreen console listener via browser.on('targetcreated', ...) opportunistically.
// 10. configure Chrome to use downloadsDir via CDP Browser.setDownloadBehavior on harnessPage's CDPSession.
// 11. return HarnessHandles.
}
```
Implement the function body per the in-comment plan. Extract verbatim from `tests/uat/a6.test.ts` lines 60-265 (the launch + victim + harness setup + console wiring blocks). Add the CDP Browser.setDownloadBehavior call (NEW — not in prototype which doesn't need downloads). Use absolute imports per project style; extensive docstrings; named callbacks for the on('console') / on('targetcreated') listeners.
2. Create `tests/uat/lib/assertions.ts`:
```typescript
// tests/uat/lib/assertions.ts — Plan 01-13 Wave 2.
// Host-side assertion primitives. Re-exports of node:assert/strict
// with structured failure messages + diagnostic-dump wrappers.
//
// NO chrome.* helpers — all chrome.* work happens inside the
// extension-internal harness page (see tests/uat/extension-page-harness.ts).
// This module is host-side ONLY.
import * as assert from 'node:assert/strict';
export interface CheckRecord {
name: string;
expected: unknown;
actual: unknown;
passed: boolean;
}
export interface AssertionRecord {
passed: boolean;
name: string;
checks: CheckRecord[];
diagnostics: string[];
error?: string;
}
export interface ConsoleBuffers {
swConsole: string[];
offConsole: string[];
}
export async function runAssertion(
name: string,
fn: () => Promise<AssertionRecord>,
buffers: ConsoleBuffers,
): Promise<AssertionRecord> { /* ... try/catch + diagnostic dump ... */ }
export function assertEqual(actual: unknown, expected: unknown, msg: string): void { /* assert.deepStrictEqual wrapper */ }
export function assertGte(actual: number, expected: number, msg: string): void { /* ... */ }
export function assertMatch(actual: string, regex: RegExp, msg: string): void { /* ... */ }
export function assertTrue(cond: boolean, msg: string): void { /* ... */ }
export async function waitFor<T>(
probe: () => Promise<T> | T,
predicate: (v: T) => boolean,
timeoutMs: number,
description: string,
): Promise<T> { /* mirrors prototype's waitFor verbatim — poll every 100ms */ }
```
Implement per the surface description. Extract `waitFor` verbatim from `tests/uat/extension-page-harness.ts`'s implementation (lines ~84-103). The host-side `waitFor` and the harness-page-side `waitFor` will be IDENTICAL implementations — that's fine; the page-side is bundled into the harness HTML, the host-side runs in the Node process. No shared module between them.
3. Create `tests/uat/lib/harness-page-driver.ts`:
```typescript
// tests/uat/lib/harness-page-driver.ts — Plan 01-13 Wave 2.
// Driver wrappers — one per assertion. Each wraps a
// page.evaluate(() => window.__mokoshHarness.assertXX()) call.
//
// Wave 2 wires driveA6 (the proven assertion from c647f61).
// Wave 3 wires driveA1..A5, A7..A13 (replaces NOT YET IMPLEMENTED stubs).
import type { Page } from 'puppeteer';
import type { AssertionRecord } from './assertions';
// For A5/A12/A13 the page side returns extra fields beyond AssertionRecord:
export interface AssertionWithBytes {
passed: boolean;
name: string;
checks: Array<{ name: string; expected: unknown; actual: unknown; passed: boolean }>;
diagnostics: string[];
error?: string;
bytesBase64?: string;
expectedVersion?: string;
}
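export async function driveA6(page: Page): Promise<AssertionRecord> {
  return page.evaluate(async () => {
    // Typed cast instead of `as any` (project style forbids `as any`).
    // This callback runs in the browser context, where the harness page
    // has installed window.__mokoshHarness.
    const harness = (window as unknown as {
      __mokoshHarness: { assertA6: () => Promise<AssertionRecord> };
    }).__mokoshHarness;
    return harness.assertA6();
  });
}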
export async function driveA1(page: Page): Promise<AssertionRecord> {
throw new Error('NOT YET IMPLEMENTED — Wave 3A wires this');
}
// ... similarly: driveA2, driveA3, driveA4, driveA5, driveA7, driveA8, driveA9, driveA10, driveA11, driveA12, driveA13
```
Each Wave-3 stub throws "NOT YET IMPLEMENTED — Wave 3<X> wires this" where <X> is the bundle letter (A/B/C/D).
4. Rewrite `tests/uat/a6.test.ts` to use the new lib:
```typescript
import { launchHarnessBrowser } from './lib/launch';
import { driveA6 } from './lib/harness-page-driver';
import { runAssertion } from './lib/assertions';
async function main(): Promise<number> {
  const handles = await launchHarnessBrowser();
  try {
    const result = await runAssertion('A6 — Bug B canonical', () => driveA6(handles.harnessPage), {
      swConsole: handles.swConsole,
      offConsole: handles.offConsole,
    });
    // ... pretty-print the result here ...
    return result.passed ? 0 : 1; // exit code 0 on PASS, 1 on FAIL
  } finally {
    await handles.browser.close();
  }
}

const code = await main();
process.exit(code);
```
Preserve the printResult helper from the original (or move it into lib/assertions.ts as a shared `printAssertionResult` function — planner discretion; planner recommends moving it to lib for Wave 3 reuse).
5. Run `npx tsc --noEmit` → exit 0 (the new lib files typecheck against puppeteer + node types).
6. Run `npx tsx tests/uat/a6.test.ts` → exits 0 with "A6 result: PASS 5/5" (the rewrite is behavior-preserving).
7. Run `npm run build` → exit 0; `grep -rln 'launchHarnessBrowser\|driveA6\|runAssertion' dist/ | wc -l` → 0 (lib files are tests-only, not bundled into dist/).
8. Run `npm run build:test` → exit 0; the lib files are NOT bundled (they're host-side; `vite.test.config.ts` only includes the extension-page-harness.html as rollup input).
9. Run `npx vitest run --reporter=dot` → 89 GREEN.
10. Commit atomically (or as 3-4 sub-commits — planner discretion):
- `feat(01-13): wave-2 — launchHarnessBrowser + assertions + harness-page-driver scaffolding` (single commit recommended; the three files form one coherent unit).
Commit body: lists each new file's surface; documents the a6.test.ts rewrite as behavior-preserving; cites Wave 3 wiring contract (`driveAXX` stubs throw "NOT YET IMPLEMENTED — Wave 3<X> wires this").
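
The host-side `waitFor` from step 2 is specified only by its signature and its 100ms poll cadence. A minimal sketch of such a poller is given here for orientation; it is not the prototype's code (which this plan says to copy verbatim), and the timeout message format is an assumption.

```typescript
// Sketch only. The real body is copied from the prototype, which is not reproduced in this plan.
const POLL_INTERVAL_MS = 100; // poll cadence stated in the plan

export async function waitFor<T>(
  probe: () => Promise<T> | T,
  predicate: (v: T) => boolean,
  timeoutMs: number,
  description: string,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await probe();
    if (predicate(value)) return value;
    if (Date.now() >= deadline) {
      // Assumption: the error names the awaited condition and the last observed value.
      throw new Error(
        `waitFor timed out after ${timeoutMs}ms: ${description} (last: ${JSON.stringify(value)})`,
      );
    }
    await new Promise<void>((resolve) => setTimeout(resolve, POLL_INTERVAL_MS));
  }
}
```

The probe always runs at least once, so a condition that already holds returns immediately even with `timeoutMs` of 0.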
</action>
<verify>
<automated>npx tsc --noEmit && npx tsx tests/uat/a6.test.ts && npm run build && test "$(grep -rln 'launchHarnessBrowser\|driveA6\|runAssertion' dist/ 2>/dev/null | wc -l)" = "0" && npm run build:test && npx vitest run --reporter=dot</automated>
</verify>
<acceptance_criteria>
- `tests/uat/lib/launch.ts` exists with `launchHarnessBrowser` per the surface description; uses CDP Browser.setDownloadBehavior for downloads dir.
- `tests/uat/lib/assertions.ts` exists with `runAssertion`, `assertEqual`/`Gte`/`Match`/`True`, `waitFor`, and `AssertionRecord`/`ConsoleBuffers` types.
- `tests/uat/lib/harness-page-driver.ts` exists with `driveA6` wired + 12 Wave-3 stubs throwing "NOT YET IMPLEMENTED — Wave 3<X> wires this".
- `tests/uat/a6.test.ts` rewritten to use the new lib; PASSES 5/5.
- `npx tsc --noEmit` exit 0; `npx tsx tests/uat/a6.test.ts` exit 0; full vitest 89 GREEN.
- `npm run build` exit 0; production bundle does NOT contain any of the new lib symbol names (Tier-1 grep gate GREEN).
- Commit message follows Mark's style.
</acceptance_criteria>
<done>Approach-B driver scaffolding live; A6 still PASSES through the new lib; Wave 3 stubs ready to be filled in.</done>
</task>
<task type="auto" tdd="true">
<name>Task 4 (Wave 3A): Wire A1+A2+A3+A4 (SW bootstrap + toolbar onClicked + displaySurface monitor + popup during recording); + create harness.test.ts orchestrator with A0 grep gate.</name>
<read_first>
- tests/uat/extension-page-harness.ts (the surface where A1-A4 impl lands)
- tests/uat/lib/harness-page-driver.ts (the driver stubs to wire)
- tests/uat/lib/launch.ts (HarnessHandles shape — what the orchestrator gets)
- tests/uat/lib/assertions.ts (runAssertion + printAssertionResult)
- src/background/index.ts lines 75-108 (state machine — A1+A2+A4 contract)
- src/background/index.ts lines 411-415 (setRecordingMode call inside startVideoCapture)
- src/background/index.ts lines 844-878 (chrome.action.onClicked + onStartup listener registrations)
- src/offscreen/recorder.ts lines 270-296 (getDisplayMedia + post-grant displaySurface monitor enforcement — A3 contract)
- tests/background/no-test-hooks-in-prod-bundle.test.ts (the grep gate the harness.test.ts A0 re-verifies as belt-and-suspenders)
</read_first>
<files>tests/uat/extension-page-harness.ts, tests/uat/lib/harness-page-driver.ts, tests/uat/harness.test.ts</files>
<behavior>
- Extend `window.__mokoshHarness` with `assertA1`, `assertA2`, `assertA3`, `assertA4` methods, each returning a structured `AssertionResult`.
- A1 (SW bootstrap state): query `chrome.action.getBadgeText({})` (expect '') and `chrome.action.getPopup({})` (expect '', idle mode per src/background/index.ts:110). For the isRecording check there are two options: send `chrome.runtime.sendMessage({type:'PING_STATE'})` to a NEW handler added for the purpose, or infer it from badge state (badge==='' implies idle implies isRecording=false per the state machine). Recommend the badge-proxy approach (no production code change; the state-machine contract makes the badge an accurate proxy). PASSES today.
- A2 (toolbar onClicked → REC): send `START_RECORDING` directly to offscreen (workaround for the missing `tabs` permission, per the prototype pattern). The production chrome.action.onClicked → startVideoCapture path needs the `tabs` permission to resolve a real tab via `chrome.tabs.query({active:true,...})`; the harness page bypasses this by sending START_RECORDING to offscreen and manually setting badge='REC' and popup='src/popup/index.html' (mimicking what setRecordingMode would do). Then assert getBadgeText==='REC' and getPopup==='src/popup/index.html'. The contract verified is: when START_RECORDING reaches offscreen, recording starts. The SW-side production state-machine transitions (setBadgeState, setPopup) are covered by unit tests (badge-state-machine.test.ts) and don't need re-verification here. Document the workaround clearly in the assertA2 impl comment and flag it for Phase 5 hardening (tabs permission addition).
- A3 (displaySurface monitor): with A2's recording active, read displaySurface via `chrome.runtime.sendMessage({type:'__mokoshOffscreenQuery', op:'get-display-surface'})`. The harness page has no direct access to the offscreen document's stream, so a page-side getter such as `window.__mokoshHarness.getCurrentDisplaySurface()` cannot work. Instead, add the new bridge op `get-display-surface` to offscreen-hooks.ts in THIS task (Wave 3A, not 3D). The op returns the value of `currentStream.getVideoTracks()[0].getSettings().displaySurface`. Assert === 'monitor' (per src/offscreen/recorder.ts:296 enforcement: production code throws and tears down the stream if observed !== 'monitor', so if recording is live, displaySurface is guaranteed monitor; the assertion confirms the offscreen-hooks fake stream's monkey-patched getSettings() correctly returns 'monitor').
- A4 (popup during recording): with A2's recording active, read getPopup (expect 'src/popup/index.html', as set in A2). Trigger NOTHING that would create a second offscreen document (no second START_RECORDING). Verify that getPopup is unchanged and that `chrome.offscreen.hasDocument()` resolves to true (the recording's offscreen document is the only one).
- Wire `driveA1`/`driveA2`/`driveA3`/`driveA4` in `tests/uat/lib/harness-page-driver.ts` (replace the NOT YET IMPLEMENTED stubs).
- Create `tests/uat/harness.test.ts` (NEW — was deleted in Wave 0):
```typescript
// tests/uat/harness.test.ts — Plan 01-13 Wave 3.
// Top-to-bottom orchestrator for all 14 assertions (A0 + A1..A13).
// ...
```
Wave 3A wires A0+A1+A2+A3+A4; stubs A5+A7+A8+A9+A10+A11+A12+A13 as `throw new Error('NOT YET IMPLEMENTED — Wave 3<X> wires this')`. A6 uses the proven `driveA6` from Wave 2. Bail-on-first-failure; exit 0 only when 14/14 GREEN.
A0 (production-bundle grep gate): pre-flight. Run `npm run build` (or skip via `SKIP_PROD_REBUILD=1`); grep `dist/` for every string in FORBIDDEN_HOOK_STRINGS (the 8 hook strings plus the `get-display-surface` op added in this task); assert 0 matches. This runs BEFORE Chrome launches.
- Add bridge op `get-display-surface` to `src/test-hooks/offscreen-hooks.ts` (Wave 3A scope creep, BUT necessary for A3 — alternative is duplicating get-current-stream + .getSettings() work in the harness page which is uglier). Document the addition; update the offscreen-hooks comment block to reflect the protocol expansion.
- Also extend `MokoshTestSurface` in `src/test-hooks/types.ts` to include typed fields `installFakeDisplayMedia?`, `uninstallFakeDisplayMedia?`, `dispatchEndedOnTrack?` so the offscreen-hooks `as MokoshTestSurface & {...}` cross-cast collapses to a clean assignment. (Carries the type-cleanup that Wave 1 didn't get to because Wave 1 was move-only.)
- Update `tests/uat/lib/test-hook-contract.d.ts` to mirror the type extension.
- Tier-1 grep gate: ensure `dist/` stays clean of the new bridge op string `get-display-surface` (add to FORBIDDEN_STRINGS list).
- After this task: `npm run test:uat` exits non-zero. The orchestrator run itself gets A0, A1, A2, A3, A4 GREEN and then bails at the A5 stub, so its own summary line reads 5/14. A6 sits after A5 in the driver order and is not reached in that run, but it PASSES via the proven driveA6 when run standalone (a6.test.ts); counting it, the tally for the diagnostic is "6/14 GREEN, 8 NOT YET IMPLEMENTED".
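
The `get-display-surface` bridge op described for A3 can be sketched as a small pure handler. This is an illustration under assumptions: the function name `handleGetDisplaySurface` and the structural `StreamLike` types are hypothetical, and the real branch lives inside the existing `chrome.runtime.onMessage.addListener` block in offscreen-hooks.ts.

```typescript
// Minimal structural types so the sketch compiles outside a browser (hypothetical names).
interface VideoTrackLike {
  getSettings(): { displaySurface?: string };
}
interface StreamLike {
  getVideoTracks(): VideoTrackLike[];
}

// Returns the reply payload for a get-display-surface query, or undefined when the
// message is not addressed to this op (so other branches can handle it).
export function handleGetDisplaySurface(
  msg: { type: string; op: string },
  currentStream: StreamLike | null,
): { displaySurface: string | null } | undefined {
  if (msg.type !== '__mokoshOffscreenQuery' || msg.op !== 'get-display-surface') {
    return undefined;
  }
  // No stream, no video track, or no displaySurface setting all collapse to null.
  const displaySurface =
    currentStream?.getVideoTracks()[0]?.getSettings().displaySurface ?? null;
  return { displaySurface };
}
```

Returning `null` rather than throwing when no stream exists lets assertA3 report a plain failed check (`expected 'monitor', actual null`) instead of an unhandled error.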
</behavior>
<action>
1. Add `get-display-surface` bridge op to `src/test-hooks/offscreen-hooks.ts`:
Inside the existing `chrome.runtime.onMessage.addListener` block, add a new `if (op === 'get-display-surface')` branch. Returns `{ displaySurface: currentStream?.getVideoTracks()[0]?.getSettings().displaySurface ?? null }`. Document the op in the protocol comment block at lines ~297-303.
2. Extend `MokoshTestSurface` in `src/test-hooks/types.ts`:
Add `installFakeDisplayMedia?: () => void;`, `uninstallFakeDisplayMedia?: () => void;`, `dispatchEndedOnTrack?: () => { ok: boolean; error?: string };` as typed fields. Update the JSDoc to note these are offscreen-only (undefined in SW isolate — but the SW isolate doesn't get hooks in Approach B anyway; the fields are present-but-inert just like the existing handlers fields).
3. Update `tests/uat/lib/test-hook-contract.d.ts` to mirror the type extension.
4. Collapse the cross-cast in `src/test-hooks/offscreen-hooks.ts` lines ~284-288 (the `as MokoshTestSurface & {...}` block) to a clean `as MokoshTestSurface` since the type now includes the methods.
5. Extend `tests/uat/extension-page-harness.ts` `window.__mokoshHarness` with `assertA1`, `assertA2`, `assertA3`, `assertA4` methods. Each follows the assertA6 pattern: AssertionResult shape with `passed`, `name`, `checks[]`, `diagnostics[]`, `error?`. Specifically:
- **assertA1**: queries `chrome.action.getBadgeText({})` + `chrome.action.getPopup({})` + verifies `isRecording=false` via badge-proxy (`badge !== 'REC'` implies isRecording=false). Each check is a CheckRecord. PASS if all 3 checks pass.
- **assertA2**: ensure offscreen + send START_RECORDING + manually setBadge('REC') + setPopup('src/popup/index.html') + waitFor getBadgeText==='REC' + assert popup==='src/popup/index.html'. Document workaround inline (chrome.tabs permission gap).
- **assertA3**: assumes A2 left recording active. Bridge-query `get-display-surface`. Assert === 'monitor'.
- **assertA4**: assumes A2 left recording active. Snapshot getPopup (expect 'src/popup/index.html'). Verify chrome.offscreen.hasDocument === true (recording's offscreen is the only one). No new offscreen creation attempted (the production toolbar-click-during-recording path is no-op per src/background/index.ts:863-866).
6. Wire `driveA1`/`driveA2`/`driveA3`/`driveA4` in `tests/uat/lib/harness-page-driver.ts` (replace stubs).
7. Create `tests/uat/harness.test.ts`:
```typescript
// tests/uat/harness.test.ts — Plan 01-13 Wave 3 orchestrator.
// ...
import { execFileSync } from 'node:child_process';
import { readdirSync, readFileSync, statSync } from 'node:fs';
import { dirname, join, resolve as resolvePath } from 'node:path';
import { fileURLToPath } from 'node:url';
import { launchHarnessBrowser } from './lib/launch';
import {
  driveA1, driveA2, driveA3, driveA4, driveA5, driveA6, driveA7,
  driveA8, driveA9, driveA10, driveA11, driveA12, driveA13,
} from './lib/harness-page-driver';
import { runAssertion } from './lib/assertions';

// FORBIDDEN_STRINGS used by A0 (mirror of tests/background/no-test-hooks-in-prod-bundle.test.ts inventory):
const FORBIDDEN_HOOK_STRINGS = [
  '__mokoshTest', 'setCurrentStream', 'setSegmentCountGetter',
  'installFakeDisplayMedia', 'uninstallFakeDisplayMedia',
  'dispatchEndedOnTrack', 'getSegmentCount', '__mokoshOffscreenQuery',
  'get-display-surface',
];

async function assertA0_GrepGate(): Promise<{ passed: boolean; matches: string[] }> {
  // Skip prod rebuild if SKIP_PROD_REBUILD=1; otherwise run `npm run build`.
  if (process.env.SKIP_PROD_REBUILD !== '1') {
    execFileSync('npm', ['run', 'build'], { stdio: 'inherit' });
  }
  const distDir = resolvePath(dirname(fileURLToPath(import.meta.url)), '..', '..', 'dist');
  const matches: string[] = [];
  // Recursive grep walk; for each file under dist/, check each forbidden string.
  // Implementation per tests/background/no-test-hooks-in-prod-bundle.test.ts pattern.
  // ...
  return { passed: matches.length === 0, matches };
}

async function main(): Promise<number> {
  // Pre-flight A0:
  const a0 = await assertA0_GrepGate();
  if (!a0.passed) {
    console.error(`A0 FAIL: production bundle hook-string leak. Matches:\n${a0.matches.join('\n')}`);
    return 1;
  }
  console.log('A0: GREEN (production bundle hook-free)');

  const handles = await launchHarnessBrowser();
  const buffers = { swConsole: handles.swConsole, offConsole: handles.offConsole };
  const results: Array<{ name: string; passed: boolean }> = [];
  const drivers = [
    { name: 'A1', drive: driveA1 },
    { name: 'A2', drive: driveA2 },
    { name: 'A3', drive: driveA3 },
    { name: 'A4', drive: driveA4 },
    { name: 'A5', drive: driveA5 },
    { name: 'A6', drive: driveA6 },
    { name: 'A7', drive: driveA7 },
    { name: 'A8', drive: driveA8 },
    { name: 'A9', drive: driveA9 },
    { name: 'A10', drive: driveA10 },
    { name: 'A11', drive: driveA11 },
    { name: 'A12', drive: driveA12 },
    { name: 'A13', drive: driveA13 },
  ];
  try {
    for (const { name, drive } of drivers) {
      try {
        const result = await runAssertion(name, () => drive(handles.harnessPage), buffers);
        results.push({ name, passed: result.passed });
        if (!result.passed) {
          // bail-on-first-failure
          break;
        }
      } catch (err) {
        // NOT YET IMPLEMENTED is the Wave-stub error; counts as a fail.
        // Log the message so the stop point is visible in the run output.
        console.error(`${name}: ${err instanceof Error ? err.message : String(err)}`);
        results.push({ name, passed: false });
        break;
      }
    }
  } finally {
    await handles.browser.close();
  }
  const passed = results.filter(r => r.passed).length;
  const total = drivers.length + 1; // +1 for A0
  console.log(`\nUAT harness: ${passed + 1}/${total} assertions passed`);
  return passed === drivers.length ? 0 : 1;
}

const code = await main();
process.exit(code);
```
Implementation per the pseudocode. NO `as any`; absolute imports; extensive comments. The bail-on-first-failure semantics + structured diagnostic dump matches the prototype pattern. The optional `--only=A6` CLI arg (planner's discretion to include or defer to Wave 3D) lets developers run a single assertion for iteration.
8. Verify Tier-1 grep gate updates: edit `tests/background/no-test-hooks-in-prod-bundle.test.ts` FORBIDDEN_STRINGS list to add `get-display-surface` (the new bridge op).
9. Run `npm run build` → exit 0; grep gate stays GREEN (the new offscreen-hooks bridge op is gated behind `__MOKOSH_UAT__`; tree-shaken from production).
10. Run `npm run build:test` → exit 0; the offscreen chunk in dist-test/ contains `get-display-surface`.
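11. Run `npx tsx tests/uat/harness.test.ts` → exits 1 (Wave 3B+ stubs throw). The first NOT YET IMPLEMENTED stop is A5; with bail-on-first-failure the catch in main() records the failure and stops gracefully, so the orchestrator's own summary covers A0+A1+A2+A3+A4 (5/14). A6 is counted GREEN on the strength of the standalone a6.test.ts run (step 12), which gives the "6/14 GREEN: A0+A1+A2+A3+A4+A6; 8 NOT YET IMPLEMENTED" tally; later Wave 3 tasks wire the rest.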
12. Run `npx tsx tests/uat/a6.test.ts` standalone → still exits 0 (5/5 PASS) — proves the standalone iteration entry still works.
13. Run `npx tsc --noEmit` → exit 0.
14. Run `npx vitest run --reporter=dot` → 89 GREEN.
15. RED-on-regression demos (commit body — light-touch since these aren't the canonical TDD demos; those land in 3B+3C):
- A1: locally `chrome.action.setPopup({popup: 'foo.html'})` from a probe before launching harness → A1 should FAIL on the getPopup==='' check. Revert; PASS.
- A2: locally short-circuit START_RECORDING in offscreen → A2 should FAIL with timeout. Revert; PASS.
- A3: locally remove the displaySurface monkey-patch in offscreen-hooks.ts:179-186 → A3 should FAIL (displaySurface is undefined for raw canvas captureStream tracks). Revert; PASS.
- A4: no RED demo. The assertion is essentially a no-op verification, so there is no meaningful regression to inject.
Document at least 2 of the remaining 3 (A1-A3) in the commit body.
16. Commit atomically: `feat(01-13): wave-3A — A1+A2+A3+A4 + harness orchestrator + A0 grep gate`. Body lists assertions wired, the bridge op addition, the type extension, the harness orchestrator structure, RED demos cited.
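
The A0 pseudocode in step 7 elides the recursive walk over `dist/`. One way it might be written is sketched below; it does not reproduce the pattern in tests/background/no-test-hooks-in-prod-bundle.test.ts (not shown in this plan), and the helper names `matchForbidden`, `listFiles`, and `grepDist` are hypothetical.

```typescript
import { readdirSync, readFileSync, statSync } from 'node:fs';
import { join } from 'node:path';

/** Pure matcher: one "path: needle" entry per forbidden string found in the text. */
export function matchForbidden(path: string, text: string, forbidden: readonly string[]): string[] {
  return forbidden
    .filter((needle) => text.includes(needle))
    .map((needle) => `${path}: ${needle}`);
}

/** Recursively lists every non-directory entry under rootDir. */
export function listFiles(rootDir: string): string[] {
  const out: string[] = [];
  for (const entry of readdirSync(rootDir)) {
    const full = join(rootDir, entry);
    if (statSync(full).isDirectory()) out.push(...listFiles(full));
    else out.push(full);
  }
  return out;
}

/** Scans a build directory and returns every forbidden-string hit. */
export function grepDist(distDir: string, forbidden: readonly string[]): string[] {
  // latin1 decodes any byte sequence without throwing, so binary assets are scanned too.
  // The forbidden strings are plain ASCII, so matching is unaffected by the decoding.
  return listFiles(distDir).flatMap((file) =>
    matchForbidden(file, readFileSync(file, 'latin1'), forbidden),
  );
}
```

Inside `assertA0_GrepGate` the elided section would then reduce to something like `matches.push(...grepDist(distDir, FORBIDDEN_HOOK_STRINGS))`. Keeping the matcher separate from the file walk means the matching rule can be unit-tested without touching the file system.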
</action>
<verify>
<automated>npx tsc --noEmit &amp;&amp; npm run build &amp;&amp; test "$(grep -rln 'get-display-surface' dist/ 2&gt;/dev/null | wc -l)" = "0" &amp;&amp; npm run build:test &amp;&amp; (set +e; npx tsx tests/uat/harness.test.ts; test $? -ne 0) &amp;&amp; npx tsx tests/uat/a6.test.ts &amp;&amp; npx vitest run --reporter=dot</automated>
</verify>
<acceptance_criteria>
- `window.__mokoshHarness` exposes assertA1/A2/A3/A4 (plus the existing assertA6).
- `tests/uat/lib/harness-page-driver.ts` wires driveA1/A2/A3/A4 (driveA6 still wired; A5+A7..A13 stay stubbed).
- `tests/uat/harness.test.ts` exists; A0 + A1 + A2 + A3 + A4 + A6 GREEN (= 6/14); A5/A7..A13 throw NOT-YET-IMPLEMENTED; bail-on-first-failure stops at A5.
- `tests/uat/a6.test.ts` standalone still PASSES (5/5).
- `src/test-hooks/offscreen-hooks.ts` adds `get-display-surface` bridge op.
- `src/test-hooks/types.ts` extends MokoshTestSurface with installFakeDisplayMedia / uninstallFakeDisplayMedia / dispatchEndedOnTrack typed fields; offscreen-hooks.ts cross-cast collapsed.
- `tests/uat/lib/test-hook-contract.d.ts` mirrors the type extension.
- `tests/background/no-test-hooks-in-prod-bundle.test.ts` FORBIDDEN_STRINGS list includes `get-display-surface`.
- `npm run build` exit 0; Tier-1 grep gate GREEN (no hook strings in dist/).
- `npm run build:test` exit 0; offscreen chunk in dist-test/ contains the new bridge op.
- `npx tsc --noEmit` exit 0; vitest 89 GREEN.
- At least 2 RED-on-regression demos documented in commit body.
</acceptance_criteria>
<done>Wave 3A landed: 6/14 GREEN; state-machine + recording + display-surface + popup contracts verified; ready for Wave 3B (Bug B canonical).</done>
</task>
<task type="auto" tdd="true">
<name>Task 5 (Wave 3B): Wire A5+A6+A7 (SAVE_ARCHIVE download + Bug B canonical regression rewind + genuine error path).</name>
<read_first>
- tests/uat/extension-page-harness.ts (the surface where A5/A7 land; A6 already wired)
- tests/uat/lib/harness-page-driver.ts (driveA5/A6/A7 stubs to wire)
- tests/uat/lib/zip.ts (host-side jszip work for A5 archive validation)
- tests/uat/lib/launch.ts (downloadsDir from HarnessHandles)
- src/background/index.ts lines 725-794 (RECORDING_ERROR handler + Bug B routing — A6+A7 contract)
- src/background/index.ts lines 730-734 (SAVE_ARCHIVE handler — A5 contract)
- src/offscreen/recorder.ts lines 489-525 (onUserStoppedSharing — A6's dispatch-ended target)
- .planning/debug/resolved/01-09-recovery-flow.md (Bug B canonical debug record — A6's exact contract)
</read_first>
<files>tests/uat/extension-page-harness.ts, tests/uat/lib/harness-page-driver.ts</files>
<behavior>
- A5 (SAVE_ARCHIVE download): with recording active from A2, send `chrome.runtime.sendMessage({type:'SAVE_ARCHIVE'})`. The SW handler triggers the production save-archive flow (saveArchive in src/background/index.ts:731), which calls `chrome.downloads.download(...)`. The download lands in `handles.downloadsDir` (configured at launch via CDP Browser.setDownloadBehavior). The page cannot read the downloads dir directly, so the check is split in two: the page-side assertA5 method returns `{passed: true}` once the SW sendMessage resolves, and the host side polls `handles.downloadsDir` for `session_report_*.zip` for up to 15s, then reads the bytes from disk via fs.readFileSync. The driveA5 wrapper combines both halves and returns an AssertionRecord that includes the bytes. Rejected alternative: a new SW bridge op (`__mokoshSwQuery` with op `save-archive-to-bytes`) that reruns the archive creation logic and returns the bytes via sendMessage instead of triggering the download; keeping the production saveArchive path is simpler.
- **A6 (BUG B canonical) — ALREADY PROVEN**: leave the existing assertA6 implementation untouched. It works (c647f61 5/5 GREEN). Wave 3B's commit body documents the RED-on-regression demo cycle per the contract.
- A7 (genuine error → ERR + recovery notification): start a fresh recording (A6 stopped it). Snapshot notificationCount via `chrome.notifications.getAll(...)`. Send `chrome.runtime.sendMessage({type:'RECORDING_ERROR', error: 'codec-unsupported'})`. Wait 200ms. Assert: badge='ERR'; popup='src/popup/index.html'; notificationCount delta === 1; the last notification id starts with `mokosh-recovery-`. PASSES today.
- Wire driveA5/A7 (A6 already wired); harness.test.ts orchestrator advances through A5+A6+A7 GREEN (= 8/14 with A0+A1+A2+A3+A4+A5+A6+A7); A8..A13 still stubbed.
- **MANDATORY commit-body documentation: A6 RED-on-regression demo cycle.** The executor LOCALLY (not committed): edits `src/background/index.ts:776` from `if (errorCode === 'user-stopped-sharing')` to `if (false)`. Rebuilds `npm run build:test`. Runs `npm run test:uat` (or `npx tsx tests/uat/a6.test.ts`). A6 FAILS with diagnostic: "A6.1: badge text is '' (NOT 'ERR') after user-stop — expected '', actual 'ERR'". Reverts `git checkout -- src/background/index.ts`. Rebuilds. Re-runs. A6 PASSES 5/5. Documents the exact diagnostic + cycle in the commit body. This is the canonical Bug B regression catch — load-bearing for the plan's success criteria.
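
For the A7 notification check, comparing two snapshots of notification ids avoids relying on key ordering. The helper below is a sketch under assumptions: `diffNotificationIds` is a hypothetical name, and the id arrays are assumed to come from `Object.keys` of the `chrome.notifications.getAll` result taken before and after RECORDING_ERROR is sent.

```typescript
export interface NotificationDelta {
  delta: number;         // how many ids appeared between the two snapshots
  recoveryIds: string[]; // the new ids that carry the recovery prefix
}

// Set-membership diff: order of the ids in either snapshot does not matter.
export function diffNotificationIds(
  before: readonly string[],
  after: readonly string[],
  prefix = 'mokosh-recovery-',
): NotificationDelta {
  const seen = new Set(before);
  const added = after.filter((id) => !seen.has(id));
  return {
    delta: added.length,
    recoveryIds: added.filter((id) => id.startsWith(prefix)),
  };
}
```

With this shape, A7's two notification checks become `delta === 1` and `recoveryIds.length === 1`.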
</behavior>
<action>
1. Extend `window.__mokoshHarness` in `tests/uat/extension-page-harness.ts` with `assertA5` and `assertA7` methods.
- **assertA5**: returns `{ passed: boolean; diagnostics: string[]; error?: string; }`. Implementation: ensureOffscreen + startRecording (reuses existing helpers); wait for badge='REC'; send `chrome.runtime.sendMessage({type:'SAVE_ARCHIVE'})` with timeout 15s; on resp.success === true, return `{passed: true, diagnostics: ['saveArchive resp.success=true']}`. The host-side driver does the dir-polling + file-read + zip-bytes capture.
- **assertA7**: standard AssertionResult shape. Implementation: ensure recording fresh (if A6 stopped it, restart via assertA2's helpers — refactor common setup into a shared `setupFreshRecording()` helper inside extension-page-harness.ts). Snapshot notif count via getActiveNotificationCount (existing helper). Send RECORDING_ERROR via chrome.runtime.sendMessage. Wait 200ms. Assert badge='ERR' + popup='src/popup/index.html' + notif delta===1 + last id startsWith 'mokosh-recovery-' (read via `chrome.notifications.getAll`; iterate keys; check the most-recent one — note that Object.keys ordering is not strictly guaranteed but Chrome appends in insertion order in practice; if flaky, use a set-membership check: assert ANY id startsWith the prefix).
2. Wire `driveA5` in `tests/uat/lib/harness-page-driver.ts`:
```typescript
export async function driveA5(page: Page, downloadsDir: string): Promise<AssertionWithBytes> {
  // Trigger save via page-side method.
  const pageResp = await page.evaluate(async () => {
    const r = await (window as any).__mokoshHarness.assertA5();
    return r;
  });
  if (!pageResp.passed) {
    return { passed: false, name: 'A5', checks: [], diagnostics: pageResp.diagnostics, error: pageResp.error };
  }
  // Host-side: poll downloadsDir for session_report_*.zip.
  // ... using fs.readdirSync + waitFor pattern ...
  // Returns the zipBytes (base64) on success.
}
```
Note: driveA5 signature now takes `downloadsDir` — update harness.test.ts orchestrator to pass it. Or: refactor so all drivers take a `harnessCtx: { page, downloadsDir, ... }` object. Planner discretion; planner recommends the harnessCtx pattern (single arg, future-proof).
3. Wire `driveA7` in `tests/uat/lib/harness-page-driver.ts` (standard one-line page.evaluate wrapper).
4. Update `tests/uat/harness.test.ts` to thread `handles.downloadsDir` into driveA5 (or pass full `harnessCtx`).
5. Run `npm run test:uat` → A0+A1+A2+A3+A4+A5+A6+A7 GREEN (8/14); A8..A13 stubs. Exit non-zero (bail-on-first-failure at A8).
6. Run `npx tsx tests/uat/a6.test.ts` → 5/5 PASS (regression check; A6 unchanged).
7. **EXECUTE the A6 Bug B RED-on-regression demo** (locally, do NOT commit):
- Edit `src/background/index.ts:776`: change `if (errorCode === 'user-stopped-sharing') {` to `if (false) {`.
- `npm run build:test` (rebuild test bundle).
- `npx tsx tests/uat/a6.test.ts`.
- Observe FAIL with diagnostic: "A6.1: badge text is '' (NOT 'ERR') after user-stop — expected '', actual 'ERR'" (and likely the other 3 checks also FAIL).
- `git checkout -- src/background/index.ts` (revert).
- `npm run build:test`.
- `npx tsx tests/uat/a6.test.ts`.
- Observe PASS 5/5.
- CAPTURE the exact diagnostic lines from the FAIL run for the commit body.
8. Run `npx tsc --noEmit` → exit 0.
9. Run `npx vitest run --reporter=dot` → 89 GREEN.
10. Run `npm run build` → grep gate stays GREEN.
11. Commit atomically: `feat(01-13): wave-3B — A5+A6+A7 + Bug B regression rewind demonstrated`. Commit body MUST include the verbatim A6 RED-on-regression cycle (per the contract in the interfaces block "How A6 / A8 RED-on-regression demos work" section). Also notes A5 + A7 wiring + RED-on-regression demos for A5 (locally comment out chrome.downloads.download → A5 FAIL on timeout; revert → PASS) and A7 (locally short-circuit RECORDING_ERROR handler → A7 FAIL; revert → PASS). At least the A6 demo is MANDATORY; A5+A7 are recommended but not blocking.
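
The host-side half of driveA5 (step 2) is left as comments in the pseudocode. A possible shape for the directory poll is sketched here; `waitForDownloadedZip` is a hypothetical helper name, and the 100ms poll interval is an assumption carried over from `waitFor`. Only the 15s budget and the `session_report_*.zip` pattern come from the plan.

```typescript
import { readdirSync, readFileSync } from 'node:fs';
import { join } from 'node:path';

// Polls downloadsDir until a session_report_*.zip file exists, then returns its bytes.
export async function waitForDownloadedZip(
  downloadsDir: string,
  timeoutMs = 15_000,
  pollMs = 100,
): Promise<{ path: string; bytesBase64: string }> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    // The anchored pattern skips in-progress names such as "*.zip.crdownload".
    const hit = readdirSync(downloadsDir).find((f) => /^session_report_.*\.zip$/.test(f));
    if (hit) {
      const path = join(downloadsDir, hit);
      return { path, bytesBase64: readFileSync(path).toString('base64') };
    }
    if (Date.now() >= deadline) {
      throw new Error(`no session_report_*.zip in ${downloadsDir} after ${timeoutMs}ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, pollMs));
  }
}
```

driveA5 could then set `bytesBase64` on its `AssertionWithBytes` result from the returned value, and convert a timeout into a failed record instead of letting the error escape.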
</action>
<verify>
<automated>npx tsc --noEmit &amp;&amp; (set +e; npm run test:uat; test $? -ne 0) &amp;&amp; npx tsx tests/uat/a6.test.ts &amp;&amp; npx vitest run --reporter=dot</automated>
</verify>
<acceptance_criteria>
- `window.__mokoshHarness` exposes assertA5 + assertA7 (in addition to A1-A4, A6).
- driveA5 + driveA7 wired in harness-page-driver.ts.
- `npm run test:uat` advances through 8/14 GREEN (A0+A1-A7); bails at A8.
- A6 standalone still 5/5 PASS via `npx tsx tests/uat/a6.test.ts`.
- Commit body contains the verbatim A6 RED-on-regression demo cycle (MANDATORY per success criteria).
- `npx tsc --noEmit` exit 0; vitest 89 GREEN; Tier-1 grep gate GREEN.
</acceptance_criteria>
<done>Bug B canonical regression rewind demonstrably catches a regression; SAVE_ARCHIVE + ERROR-path coverage live; 8/14 GREEN.</done>
</task>
<task type="auto" tdd="true">
<name>Task 6 (Wave 3C): Wire A8+A9+A10 (Bug A onStartup notification regression rewind + icon file sizes + manifest shape).</name>
<read_first>
- tests/uat/extension-page-harness.ts (the surface where A8/A9/A10 land)
- tests/uat/lib/harness-page-driver.ts (driveA8/A9/A10 stubs)
- src/background/index.ts line 71 (NOTIFICATION_ICON_PATH constant; A8's regression target)
- src/background/index.ts lines 877-898 (chrome.runtime.onStartup handler — A8's trigger target)
- manifest.json (icons + notifications permission — A10 contract)
- icons/icon{16,48,128}.png (file sizes — A9 contract; floors per orchestrator brief: 16→200B, 48→500B, 128→1024B)
</read_first>
<files>tests/uat/extension-page-harness.ts, tests/uat/lib/harness-page-driver.ts</files>
<behavior>
- A8 (BUG A onStartup notification): the challenge is that Approach B has no SW-side handler-capture hook (sw-hooks.ts was deleted in Wave 0; Approach A relied on monkey-patching chrome.runtime.onStartup.addListener to capture the handler). Triggering the handler through a new message such as `chrome.runtime.sendMessage({type:'__mokoshTriggerStartup'})` would require a production code change, so that route is rejected. Workaround: invoke chrome.notifications.create directly from the page with the SAME options the production onStartup handler uses (iconUrl: chrome.runtime.getURL('icons/icon128.png'), title: 'Mokosh ready', type: 'basic'). If chrome.notifications.create RESOLVES (no rejection from Chrome's imageUtil because the icon is valid), the contract is verified. This is the SAME promise-resolution path Bug A would break. CAVEAT: this verifies that Chrome's imageUtil accepts the icon, NOT that the SW onStartup handler runs. The SW handler is unit-tested in tests/background/onstartup-notification.test.ts; the harness's role is end-to-end icon-acceptance verification, which is what Bug A regressed on. Document the workaround prominently. PASSES today.
- A9 (icon file sizes meet floors): `fetch(chrome.runtime.getURL('icons/icon16.png'))` + read content-length (or blob.size). Floors: 16→200B, 48→500B, 128→1024B. Assert each ≥ floor. PASSES today.
- A10 (manifest shape): `chrome.runtime.getManifest()`. Assert: `permissions.includes('notifications')`; `icons['16']`, `icons['48']`, `icons['128']` all defined. Also assert `default_icon` paths (manifest.action.default_icon) match. PASSES today.
- Wire driveA8/A9/A10; harness.test.ts advances to 11/14 GREEN; A11..A13 stubbed.
- **MANDATORY commit-body documentation: A8 Bug A RED-on-regression demo cycle.** Executor LOCALLY (not committed): truncates `icons/icon128.png` to 0 bytes via `: > icons/icon128.png`. Rebuilds. Runs `npm run test:uat`. A8 FAILS (Chrome's imageUtil rejects the create → notif count delta=0). Restores via `git checkout -- icons/icon128.png`. Rebuilds. Re-runs. A8 PASSES. CAPTURE diagnostic lines for commit body. Note on the alternative trigger: editing `src/background/index.ts:71` from `const NOTIFICATION_ICON_PATH = 'icons/icon128.png';` to `const NOTIFICATION_ICON_PATH = 'icons/missing.png';` changes only the SW-side onStartup handler. Because the A8 workaround calls chrome.notifications.create from the page with its own iconUrl, that edit turns A8 RED only if assertA8 takes its icon path from the same constant; otherwise A8 will not observe it.
</behavior>
<action>
1. Extend `window.__mokoshHarness` in `tests/uat/extension-page-harness.ts` with `assertA8`, `assertA9`, `assertA10`:
- **assertA8**: snapshot notif count. Call `chrome.notifications.create('mokosh-startup-' + Date.now(), {type:'basic', iconUrl: chrome.runtime.getURL('icons/icon128.png'), title:'Mokosh ready', message:'Click here to start recording your session.', priority:1}, (id) => {...})`. Wait 100ms. Re-snapshot. Assert delta===1. Document workaround inline.
- **assertA9**: for each (16, 200), (48, 500), (128, 1024), `fetch(chrome.runtime.getURL('icons/icon{N}.png'))` + check `(await response.blob()).size` ≥ floor. Or use content-length header.
- **assertA10**: `const m = chrome.runtime.getManifest();` Assert: `m.permissions?.includes('notifications')` true; `m.icons?.['16']`, `['48']`, `['128']` all truthy.
2. Wire driveA8/A9/A10 in harness-page-driver.ts (standard one-line page.evaluate wrappers).
3. Run `npm run test:uat` → 11/14 GREEN (A0+A1-A10); bail at A11.
4. **EXECUTE the A8 Bug A RED-on-regression demo** (locally, do NOT commit) per the behavior description. Capture diagnostic.
5. Run `npx tsc --noEmit` exit 0; vitest 89 GREEN; Tier-1 grep gate GREEN.
6. Commit atomically: `feat(01-13): wave-3C — A8+A9+A10 + Bug A regression rewind demonstrated`. Commit body MUST include the verbatim A8 RED-on-regression cycle.
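
The A9 and A10 checks from step 1 are simple enough to express as pure functions that take already-fetched data. The sketch below assumes the check shape used by `AssertionWithBytes.checks` (`name`, `expected`, `actual`, `passed`); the names `checkIconSizes`, `checkManifestShape`, and `ManifestLike` are hypothetical, and the `default_icon` comparison is left out.

```typescript
interface CheckRecord { name: string; expected: unknown; actual: unknown; passed: boolean }

// Only the manifest fields A10 inspects (hypothetical narrow type).
interface ManifestLike {
  permissions?: string[];
  icons?: Record<string, string>;
}

// Floors from the plan: 16 -> 200 B, 48 -> 500 B, 128 -> 1024 B.
export const ICON_FLOORS: ReadonlyArray<readonly [number, number]> = [
  [16, 200], [48, 500], [128, 1024],
];

// sizes maps icon edge length in px to the observed byte size (e.g. from blob.size).
export function checkIconSizes(sizes: Readonly<Record<number, number>>): CheckRecord[] {
  return ICON_FLOORS.map(([px, floor]) => {
    const actual = sizes[px] ?? 0; // a missing icon counts as 0 bytes
    return { name: `icon${px}.png size`, expected: `>= ${floor}`, actual, passed: actual >= floor };
  });
}

export function checkManifestShape(m: ManifestLike): CheckRecord[] {
  const hasNotif = m.permissions?.includes('notifications') ?? false;
  const checks: CheckRecord[] = [
    { name: "permissions includes 'notifications'", expected: true, actual: hasNotif, passed: hasNotif },
  ];
  for (const size of ['16', '48', '128']) {
    const path = m.icons?.[size];
    checks.push({ name: `icons['${size}'] defined`, expected: 'non-empty path', actual: path ?? null, passed: Boolean(path) });
  }
  return checks;
}
```

On the harness page, assertA9 would feed `checkIconSizes` with sizes read via `fetch(chrome.runtime.getURL(...))`, and assertA10 would pass `chrome.runtime.getManifest()` to `checkManifestShape`; each assertion passes when every returned check has `passed: true`.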
</action>
<verify>
<automated>npx tsc --noEmit &amp;&amp; (set +e; npm run test:uat; test $? -ne 0) &amp;&amp; npx vitest run --reporter=dot</automated>
</verify>
<acceptance_criteria>
- assertA8/A9/A10 wired on `window.__mokoshHarness`.
- driveA8/A9/A10 wired in harness-page-driver.ts.
- `npm run test:uat` advances through 11/14 GREEN; bails at A11.
- Commit body contains verbatim A8 Bug A RED-on-regression demo cycle (MANDATORY per success criteria).
- `npx tsc --noEmit` exit 0; vitest 89 GREEN; Tier-1 grep gate GREEN.
</acceptance_criteria>
<done>Bug A canonical regression rewind demonstrably catches a regression; icon + manifest contracts live; 11/14 GREEN; both Phase-1-escapee bug classes now CI-callable.</done>
</task>
<task type="auto" tdd="true">
<name>Task 7 (Wave 3D): Wire A11+A12+A13 (35s buffer continuity + ffprobe gate + zip shape); 14/14 GREEN.</name>
<read_first>
- tests/uat/extension-page-harness.ts (the surface where A11/A12/A13 land)
- tests/uat/lib/harness-page-driver.ts (driveA11/A12/A13 stubs)
- tests/uat/lib/zip.ts (host-side jszip work for A13)
- tests/offscreen/webm-playback.test.ts (FFPROBE_BIN constant + skip-gate pattern — A12 mirrors this)
- src/offscreen/recorder.ts (segments array, MAX_SEGMENTS, SEGMENT_DURATION_MS — A11 contract via the existing get-segment-count bridge op)
- src/test-hooks/offscreen-hooks.ts (setSegmentCountGetter wire to add)
- src/background/index.ts lines 730-734 + the saveArchive impl (A13 contract: meta.json version field)
</read_first>
<files>tests/uat/extension-page-harness.ts, tests/uat/lib/harness-page-driver.ts, src/test-hooks/offscreen-hooks.ts, src/offscreen/recorder.ts</files>
<behavior>
- Add `get-segment-count` bridge op to `src/test-hooks/offscreen-hooks.ts` (mirror the existing `dispatch-ended` / `has-stream` ops). Returns `{ count: segmentCountGetter() }`.
- Add the segment-count wire to `src/offscreen/recorder.ts` (gated by __MOKOSH_UAT__): inside startRecording (after the existing setCurrentStream wire at lines ~277-285), add `testHooks?.setSegmentCountGetter(() => segments.length);`. The `segments` module-level array is in scope at recorder.ts:91.
- A11 (35s buffer continuity): start fresh recording. Wait 35 seconds. Query `chrome.runtime.sendMessage({type:'__mokoshOffscreenQuery', op:'get-segment-count'})`. Assert count ≥ 3 (per D-13: 10s segments × MAX_SEGMENTS=3). The 35s wait is real wall-clock time; document the long runtime impact in the commit body. Keepalive: send a periodic `chrome.runtime.sendMessage({type:'OFFSCREEN_READY'})` or similar light query every 20s to keep the SW from going idle (per RESEARCH §2 Pitfall 5).
- A12 (ffprobe gate): trigger SAVE_ARCHIVE (reuse assertA5's helpers). Page-side returns the archive bytes (or success ack). Host-side driveA12 reads the zip, extracts `video/last_30sec.webm` via jszip, writes to a tmpfile, spawns `ffprobe -v error -f matroska -i <tmpfile>` via execFileSync. Asserts exit 0. Skip-gate pattern: if `!existsSync(FFPROBE_BIN)`, print "SKIPPED: ffprobe not available" + return passed=true (mirrors webm-playback.test.ts pattern). The unit-level webm-playback.test.ts gates the same contract; A12 is end-to-end belt + suspenders.
- A13 (zip shape): host-side jszip parse of the zip from A12 (reuse). Assert: `video/last_30sec.webm` entry exists + size > 0. Parse `meta.json`; assert `version === chrome.runtime.getManifest().version` (queried at harness setup or from the page side via `__mokoshHarness.getManifestVersion()`).
- Update `tests/background/no-test-hooks-in-prod-bundle.test.ts` FORBIDDEN_STRINGS list to add `get-segment-count` (new bridge op string). Total: 10 forbidden strings.
- After this task: `npm run test:uat` exits 0 with 14/14 GREEN. Total runtime ~50-90s (dominated by A11's 35s wait + A0's `npm run build` ~10s, skippable via SKIP_PROD_REBUILD=1).
- Production bundle: `grep -rln 'get-segment-count\|setSegmentCountGetter' dist/` → 0 (Tier-1 gate GREEN).
</behavior>
<action>
1. Add `get-segment-count` bridge op to `src/test-hooks/offscreen-hooks.ts`:
In the `chrome.runtime.onMessage.addListener` block (after the existing `if (op === 'has-stream')` branch), add:
```typescript
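if (op === 'get-segment-count') {
  try {
    sendResponse({ count: segmentCountGetter() });
  } catch (err) {
    // Report -1 plus the error text so the harness can tell a throwing getter from a real count.
    sendResponse({ count: -1, error: err instanceof Error ? err.message : String(err) });
  }
  // The response was sent synchronously, so the message channel does not need to stay open.
  return false;
}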
```
Update the protocol-comment block at lines ~297-303 to include the new op.
2. Add segment-count wire to `src/offscreen/recorder.ts`:
Inside startRecording, locate the existing `if (__MOKOSH_UAT__) { testHooks?.setCurrentStream(stream); ... }` block (ends at line ~285). The line `testHooks?.setSegmentCountGetter(() => segments.length);` may already be inside that block (the planner's read found a wire at line 284). Verify; if it is missing, add it inside the gated block, after the `setCurrentStream` call. Comment per project style.
3. Extend `window.__mokoshHarness` in `tests/uat/extension-page-harness.ts` with `assertA11`, `assertA12`, `assertA13`:
   - **assertA11**: ensure fresh recording (helper from Wave 3A). Wait 35000ms with intermittent keepalive pings every 20000ms. Query bridge `get-segment-count`. Assert count ≥ 3.
   - **assertA12**: ensure fresh recording. Trigger SAVE_ARCHIVE (reuse). Return only a save-success ack as `{passed: ack-status}`. The page does not return the WebM bytes, because extraction happens host-side in driveA12.
   - **assertA13**: same split as A12. The page returns the save-success ack + version metadata; the host does zip parsing + meta.json validation.
   - Add `getManifestVersion(): string` helper on `__mokoshHarness` for A13.
4. Wire driveA11/A12/A13 in harness-page-driver.ts. driveA12 + driveA13 do host-side fs/jszip/ffprobe work (extract from `handles.downloadsDir`).
5. Update `tests/background/no-test-hooks-in-prod-bundle.test.ts` FORBIDDEN_STRINGS list: add `get-segment-count`. Total inventory: 10 strings.
6. Run `npm run build` → exit 0; `grep -rln 'get-segment-count\|setSegmentCountGetter\|...' dist/` → 0.
7. Run `npm run build:test` → exit 0; offscreen chunk contains new bridge op.
8. Run `npm run test:uat` → exit 0; final line: "UAT harness: 14/14 assertions passed". Runtime ~50-90s.
9. Run `npx tsc --noEmit` exit 0; vitest 89 GREEN.
10. RED-on-regression demos (commit body):
    - A11: locally edit `src/offscreen/recorder.ts:52` `SEGMENT_DURATION_MS = 10_000` → `SEGMENT_DURATION_MS = 30_000`; rebuild; A11 FAIL (count=1 not ≥3 after 35s). Revert; PASS.
    - A12: locally inject corruption into webm-remux output OR truncate the produced webm in saveArchive to <100 bytes; rebuild; A12 FAIL (ffprobe error). Revert; PASS.
    - A13: locally drop `version` field from meta.json writer in saveArchive; rebuild; A13 FAIL. Revert; PASS.

    Document at least 1 of the 3 in the commit body.
11. Commit atomically: `feat(01-13): wave-3D — A11+A12+A13 + segment-count bridge; 14/14 GREEN`. Body lists assertions wired, the bridge op + recorder wire additions, the FORBIDDEN_STRINGS update, the total runtime range.
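
The host-side A13 shape check can be expressed independently of the zip library. A minimal sketch, assuming the zip has already been unpacked into a path-to-bytes map and that `meta.json` sits at the archive root; `validateArchiveShape` and `ShapeResult` are hypothetical names, and the map is a stand-in shape, not jszip's API:

```typescript
interface ShapeResult {
  passed: boolean;
  failures: string[];
}

// `entries` maps zip entry paths to raw bytes (hypothetical shape).
function validateArchiveShape(
  entries: Map<string, Uint8Array>,
  manifestVersion: string,
): ShapeResult {
  const failures: string[] = [];

  // The video entry must exist and be non-empty.
  const webm = entries.get('video/last_30sec.webm');
  if (!webm || webm.byteLength === 0) {
    failures.push('video/last_30sec.webm is missing or empty');
  }

  // meta.json must parse and carry the manifest version.
  const metaBytes = entries.get('meta.json');
  if (!metaBytes) {
    failures.push('meta.json is missing');
  } else {
    try {
      const meta: unknown = JSON.parse(new TextDecoder().decode(metaBytes));
      const version =
        typeof meta === 'object' && meta !== null
          ? (meta as { version?: unknown }).version
          : undefined;
      if (version !== manifestVersion) {
        failures.push(
          `meta.json version ${String(version)} does not match manifest version ${manifestVersion}`,
        );
      }
    } catch {
      failures.push('meta.json is not valid JSON');
    }
  }

  return { passed: failures.length === 0, failures };
}
```

With this shape, the A13 RED demo (dropping the `version` field) would surface as a `version undefined does not match` failure string rather than a thrown exception.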
</action>
<verify>
<automated>npx tsc --noEmit &amp;&amp; npm run build &amp;&amp; test "$(grep -rln 'get-segment-count\|setSegmentCountGetter' dist/ 2&gt;/dev/null | wc -l)" = "0" &amp;&amp; npm run test:uat &amp;&amp; npx vitest run --reporter=dot</automated>
</verify>
<acceptance_criteria>
- assertA11/A12/A13 wired on `window.__mokoshHarness`; driveA11/A12/A13 wired in harness-page-driver.ts.
- `get-segment-count` bridge op + `setSegmentCountGetter` wire added (offscreen-only, gated).
- `tests/background/no-test-hooks-in-prod-bundle.test.ts` FORBIDDEN_STRINGS list = 10 strings (added `get-segment-count`).
- `npm run test:uat` exit 0; final line: "UAT harness: 14/14 assertions passed".
- `npm run build` exit 0; `grep -rln ... dist/` → 0 (Tier-1 grep gate GREEN).
- `npx tsc --noEmit` exit 0; vitest 89 GREEN.
- At least 1 of A11/A12/A13 RED-on-regression demo documented in commit body.
</acceptance_criteria>
<done>14-assertion charter complete; harness exits 0 against current bundle; production bundle byte-clean of hook strings; both Phase-1-escapee bug regressions catchable.</done>
</task>
<task type="auto">
<name>Task 8 (Wave 4): Append 01-09 amendment block; update STATE.md + ROADMAP.md; final smoke before checkpoint.</name>
<read_first>
- .planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md (find end of file + existing amendment block from commit 9d0313a)
- .planning/STATE.md (Decisions section to append to)
- .planning/ROADMAP.md Phase 1 Plans list (current ends at 01-07; check if 01-08/01-09/01-10/01-11/01-12 entries need to be added — surface the gap if found)
- tests/uat/harness.test.ts (the harness that closes the contract)
</read_first>
<files>.planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md, .planning/STATE.md, .planning/ROADMAP.md</files>
<behavior>
- APPEND to `01-09-PLAN.md` an amendment block at the END (after any existing amendment from 9d0313a). The amendment block:
```
---
## Amendment 2 (Phase 01-stabilize-video-pipeline, 2026-05-18) — Plan 01-13 harness closes Plan 01-09 functional contract
The 2026-05-17 Plan-01-11 amendment block above referenced `npm run test:uat`
as the closure target, but Plan 01-11 pivoted to a spike-then-pivot
(see 01-11-SUMMARY.md commit ba5474c) — the harness never landed under
01-11. Plan 01-13 delivered the harness via Approach B (extension-internal-
page architecture + offscreen-side synthetic MediaStream). The closure
contract from Amendment 1 still applies; this Amendment 2 confirms the
target is now operational:
- **Step 1 (build):** unchanged — `npm run build` must exit 0.
- **Steps 2-13 + 15:** REDIRECTED to `npm run test:uat` (Plan 01-13's
Approach-B harness; 14 assertions A0..A13).
- **Step 14 (brand/design):** RETAINED for operator. The harness verifies
functional contracts (displaySurface, notification fires, badge state
machine, Bug A + Bug B regression catches) but does NOT verify the
human-readable copy is aesthetically correct OR that the badge color
reads cleanly against the operator's OS theme.
**Closure gate:** Plan 01-09 closes when `npm run test:uat` exits 0 (14/14
GREEN, verified by Plan 01-13 Task 7) AND operator confirms step 14
(brand/design) via Plan 01-13 Task 9.
```
- APPEND to `STATE.md` Decisions section (after the most recent entry):
```
- [Phase 01-13]: Approach-B UAT harness landed (14/14 GREEN). Inherits 01-11 spike-pivot rationale. Plan 01-09 functional contract closes via `npm run test:uat`. Tier-1 grep gate forbidden-string inventory expanded to 10 hook strings covering the Approach-B surface (__mokoshTest, setCurrentStream, setSegmentCountGetter, installFakeDisplayMedia, uninstallFakeDisplayMedia, dispatchEndedOnTrack, getSegmentCount, __mokoshOffscreenQuery, get-display-surface, get-segment-count). Standalone A6 entry at `tests/uat/a6.test.ts` for quick TDD iteration; orchestrated 14-assertion run via `tests/uat/harness.test.ts`. Operator role reduced to step 14 (brand/design) of original 01-09 Task 5.
```
- APPEND to `ROADMAP.md` Phase 1 Plans list. Current list (per inspection) ends at:
```
- [x] 01-07-PLAN.md — Manual smoke + ffprobe D-12 acceptance gate ...
```
Plans 01-08, 01-09, 01-10, 01-11, 01-12 entries are MISSING from ROADMAP.md (planner-detected gap). Wave 4 surfaces this gap to the orchestrator — does NOT silently inject 5 plan entries (out of scope for the 01-13 plan execution). Wave 4 appends ONLY the 01-13 entry:
```
- [x] 01-13-PLAN.md — UAT harness via Approach B (14 assertions; inherits 01-11 spike-pivot; Plan 01-09 functional closure)
```
Add a flag in the commit body: "ROADMAP.md Phase 1 Plans list is missing entries for 01-08, 01-09, 01-10, 01-11, 01-12 — orchestrator should address as separate cleanup; out of scope for 01-13."
- Final smoke: `npm run test:uat` → exit 0 (14/14 GREEN); `npx vitest run` → 89 GREEN; `npm run build` → exit 0.
</behavior>
<action>
1. Read `01-09-PLAN.md` end (verify the 9d0313a amendment block exists; if it doesn't, the previous amendment may have been folded into a different file — note it in the commit body but proceed with appending the Amendment-2 block).
2. Append the Amendment-2 block per behavior. Use the same `---` separator + `## Amendment N` heading pattern.
3. Read `STATE.md` Decisions section (lines 72-108 per inspection). Append the new entry after the most recent entry (currently the `[Phase 01-07-deferred-to-5]` line).
4. Read `ROADMAP.md` Phase 1 Plans list (lines 73-80 per inspection). Append the 01-13 entry. Surface the gap (01-08..01-12 missing) in the commit body.
5. Run `npm run test:uat` → exit 0 (final smoke).
6. Run `npm run build` → exit 0; Tier-1 grep gate GREEN.
7. Run `npx vitest run --reporter=dot` → 89 GREEN.
8. Run `npx tsc --noEmit` → exit 0.
9. Commit atomically: `docs(01-13): wave-4 — 01-09 amendment + STATE/ROADMAP updates; harness closes 01-09 functional contract`. Body: amendment text, STATE decision, ROADMAP append, ROADMAP gap surfaced.
</action>
<verify>
<automated>npx tsc --noEmit &amp;&amp; grep -q 'Plan 01-13 harness closes Plan 01-09 functional contract' .planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md &amp;&amp; grep -q 'Approach-B UAT harness landed' .planning/STATE.md &amp;&amp; grep -q '01-13-PLAN.md' .planning/ROADMAP.md &amp;&amp; npm run test:uat &amp;&amp; npx vitest run --reporter=dot</automated>
</verify>
<acceptance_criteria>
- `01-09-PLAN.md` ends with the Amendment-2 block.
- `STATE.md` Decisions section carries the 01-13 entry as the last item.
- `ROADMAP.md` Phase 1 Plans list contains `01-13-PLAN.md` entry; commit body surfaces the 01-08..01-12 gap.
- `npm run test:uat` exit 0 (14/14 GREEN).
- `npx tsc --noEmit` exit 0; vitest 89 GREEN; Tier-1 grep gate GREEN.
- Commit message follows Mark's style.
</acceptance_criteria>
<done>01-09 redirected to harness; STATE + ROADMAP updated; ROADMAP gap flagged for orchestrator; ready for closing checkpoint.</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<name>Task 9 (Wave 4): Operator confirms `npm run test:uat` exits 0 against current bundle AND confirms brand/design step 14 — closes Plan 01-09 + Plan 01-13.</name>
<files>(operator-driven; no files modified by this checkpoint)</files>
<action>See &lt;how-to-verify&gt; below — operator-driven empirical check. The executor must NOT bypass this checkpoint by stubbing harness output.</action>
<verify>
<automated>echo "checkpoint:human-verify — see how-to-verify section; resume signal is the gate"</automated>
</verify>
<done>Operator types "approved" after running the how-to-verify steps. See &lt;resume-signal&gt; for the exact gate.</done>
<what-built>
Tasks 1-8 landed: Approach-A artifacts cleaned, c647f61 prototype promoted to production paths, Approach-B driver scaffolding rebuilt, all 14 assertions wired across Waves 3A-3D, 14/14 GREEN against current Plan 01-08/01-09 bundle (Bug B fix b9eeeeb + Bug A fix a881bf0 both verified by canonical RED-on-regression demos). Plan 01-09 Task 5 amended (Amendment 2) to point at `npm run test:uat` for functional steps. This checkpoint validates the harness end-to-end against real Chrome AND captures operator's brand/design acceptance for Plan 01-09's retained step 14.
</what-built>
<how-to-verify>
1. **Pre-flight cleanliness:** run `git status` — confirm working tree clean. Any uncommitted local hacks (RED-demo reverts) MUST be reverted BEFORE this step.
2. **Build production:** `npm run build` (must exit 0; this is Plan 01-09 Task 5 step 1).
3. **Build test bundle:** `npm run build:test` (must exit 0).
4. **Run harness:** `npm run test:uat` (must exit 0; runtime ~50-90s). Final output line MUST be exactly `UAT harness: 14/14 assertions passed`. If exit non-zero, paste the structured diagnostic + harness console dump + relevant SW/offscreen console logs; the plan iterates (likely a real bug surfaced).
5. **Re-run for stability:** `npm run test:uat` a second time. Same outcome.
6. **Tier-1 hook-leak verification:** `grep -rln '__mokoshTest\|installFakeDisplayMedia\|dispatchEndedOnTrack\|getSegmentCount\|setCurrentStream\|setSegmentCountGetter\|uninstallFakeDisplayMedia\|__mokoshOffscreenQuery\|get-display-surface\|get-segment-count' dist/` must return 0 matches. If ANY match, the gate failed silently — STOP and triage.
7. **Local-debug mode smoke:** `HEADLESS=0 npm run test:uat`. Watch real Chrome window: see the harness page load (chrome-extension://&lt;id&gt;/tests/uat/extension-page-harness.html), see badge state transitions across A2/A4/A6/A7. Same exit 0 outcome.
8. **Standalone A6 quick check:** `npx tsx tests/uat/a6.test.ts` → exits 0 with "A6 result: PASS" 5/5. (Smoke for the TDD iteration entry.)
9. **Brand/design acceptance (Plan 01-09 Task 5 step 14 — retained for operator):**
(a) Badge color readability against your OS theme (red OFF, green REC, yellow ERR).
(b) Notification copy ("Mokosh ready — Click here to start recording your session.") reads naturally.
(c) Picker UX confirms in headful mode that Chrome's screen-share picker would surface at the expected moment in production (the harness uses synthetic stream + bypasses the picker; the operator's manual run confirms the production picker still works).
10. **If steps 4, 5, 6 all PASS:** Plan 01-09 + Plan 01-13 both close. Type "approved" with any brand/design notes appended.
11. **If step 4 OR 5 FAIL:** paste the failure diagnostic. Likely culprits: state-bleed between assertions (try `--only=A<N>` if that CLI arg landed in Wave 3D); race window in A11's 35s wait or A6's 500ms settle (try bumping); offscreen target attach flakiness (browser.on('targetcreated') is opportunistic).
12. **If step 6 FAILS:** STOP. The Tier-1 hook-leak gate failing means the production bundle contains test code — security regression (T-1-13-01). Open a debug session.
13. **If step 7 or 9 surfaces a real UX issue:** document as a P1/P2 item in STATE.md or Phase 5 backlog; closure can still proceed IF non-blocking.
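
The "final output line" requirement from step 4 can be checked mechanically when capturing harness output. A minimal sketch with hypothetical names (`parseHarnessSummary`, `isFullPass`); it trims surrounding whitespace before matching, which is slightly more lenient than an exact byte comparison:

```typescript
interface HarnessSummary {
  passed: number;
  total: number;
}

// Parse a line of the form "UAT harness: 14/14 assertions passed".
function parseHarnessSummary(line: string): HarnessSummary | null {
  const m = /^UAT harness: (\d+)\/(\d+) assertions passed$/.exec(line.trim());
  if (!m) return null;
  return { passed: Number(m[1]), total: Number(m[2]) };
}

// True only when the last non-empty output line reports a full pass.
function isFullPass(output: string, expectedTotal = 14): boolean {
  const lines = output.split(/\r?\n/).filter((l) => l.trim() !== '');
  const last = lines[lines.length - 1];
  if (last === undefined) return false;
  const summary = parseHarnessSummary(last);
  return (
    summary !== null &&
    summary.passed === expectedTotal &&
    summary.total === expectedTotal
  );
}
```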
</how-to-verify>
<resume-signal>
Type "approved" after step 9 lands (all gates GREEN + brand/design accepted). If step 11, 12, or 13 applies, paste failure mode + operator's Chrome version + locale + OS theme; the plan iterates on the failing piece.
</resume-signal>
</task>
</tasks>
<threat_model>
## Trust Boundaries
| Boundary | Description |
|----------|-------------|
| Puppeteer driver ↔ Chrome (CDP) | Host-side Node process pipes CDP commands to Chrome; only invokes page.evaluate on the extension-internal harness page (NOT direct extension chrome.* manipulation). The page runs INSIDE the extension privilege boundary. |
| Extension-internal harness page ↔ SW/offscreen | The harness page has FULL chrome.* API access (it's a privileged extension context). It can read/write chrome.action.* state, invoke chrome.notifications.create directly, call chrome.offscreen.createDocument, send chrome.runtime.sendMessage to SW + offscreen. THIS IS THE PRIVILEGE BOUNDARY — the page is trusted because it ships in the test bundle, not the production bundle. |
| Test hook surface (`__mokoshTest`) in production bundle | NEW: SAME security-critical threat as 01-11. If tree-shaking fails OR the `__MOKOSH_UAT__` define-token gate is misconfigured, hook surface ships to production — exposing installFakeDisplayMedia, dispatchEndedOnTrack, getCurrentStream, getSegmentCount, the offscreen-bridge handler to any page that can communicate with the extension. Mitigation: Tier-1 grep gate enforces zero hook strings in dist/ (10-string inventory after Wave 3D). |
| Offscreen bridge (`__mokoshOffscreenQuery`) onMessage listener | NEW: the offscreen-hooks bridge listens on chrome.runtime.onMessage for `__mokoshOffscreenQuery` typed messages and exposes dispatch-ended / install-fake-display-media / has-stream / get-display-surface / get-segment-count ops. If shipped to production, ANY chrome.runtime.sendMessage with this type triggers the ops — dispatch-ended could be used to remotely kill an active recording. Mitigation: same as above — the listener is registered ONLY when __MOKOSH_UAT__ is true (gated by the offscreen-side dynamic import). Tier-1 grep gate verifies the surface is absent from dist/. |
| dev-dependency Chromium binary | UNCHANGED from 01-11: Puppeteer downloads ~150 MB Chromium at npm install. Mitigation: package-lock.json integrity check. |
## STRIDE Threat Register
| Threat ID | Category | Component | Disposition | Mitigation Plan |
|-----------|----------|-----------|-------------|-----------------|
| T-1-13-01 | Elevation of Privilege | Hook surface (10-string inventory) leaking into production dist/ would expose installFakeDisplayMedia, dispatchEndedOnTrack, getSegmentCount, getCurrentStream, and the __mokoshOffscreenQuery bridge to any context with chrome.runtime.sendMessage access | mitigate | Two layers: (a) `__MOKOSH_UAT__` Vite define-token gate makes the entire offscreen-hooks import + onMessage listener a static dead branch in production builds (vite.config.ts sets it false; vite.test.config.ts sets it true); Rollup tree-shakes the dead branch. (b) Tier-1 grep gate `tests/background/no-test-hooks-in-prod-bundle.test.ts` greps the BUILT artifact tree for the 10 forbidden strings — ZERO matches required for GREEN. Belt + suspenders catches both tree-shake regression AND new hook-name additions. The 10-string inventory is the authoritative surface contract; new ops MUST be added to the FORBIDDEN_STRINGS list when introduced. |
| T-1-13-02 | Spoofing | Harness page sends `__mokoshOffscreenQuery` messages to offscreen; if production code accidentally also registers a __mokoshOffscreenQuery handler (e.g. typo in a future refactor), it could be invoked by harness messages | accept | The offscreen-hooks onMessage handler returns `{ok: false, error: 'unknown-op'}` for unrecognized ops, so a typo-collision wouldn't accidentally trigger a production state mutation. Detection: any new chrome.runtime.onMessage listener in production code is reviewed for collision with the `__mokoshOffscreenQuery` type sentinel. |
| T-1-13-03 | Information Disclosure | A11's 35-second wait on a CI runner could capture system state via the synthetic stream's canvas, but the canvas is a 320x180 frame-counter pattern (constant content; no environmental data) | accept | Per 01-11-SUMMARY: the synthetic stream is canvas-driven with displayed content limited to a frame counter. No actual screen content is captured. CI isolation requirement (per 01-11 threat T-1-11-02) is REMOVED in 01-13: Approach B's synthetic stream eliminates the entire class of "what's on the screen" threats. |
| T-1-13-04 | Denial of Service | A11's 35s wall-clock wait dominates harness runtime; combined with the build steps, total runtime ~90s ties up CI runner slot | accept | 90s is well within typical CI per-job budgets. Local-dev runs use `SKIP_PROD_REBUILD=1` to drop A0's npm run build cost (~10s). Out of scope: parallelizing assertions (would require multi-browser instances; defeats failure-isolation choice). |
| T-1-13-05 | Tampering | Puppeteer downloads Chromium binary at npm install; supply-chain compromise of download endpoint | accept | UNCHANGED from 01-11: package-lock.json pins hashes via Puppeteer's @puppeteer/browsers machinery. Phase 5 SCA work covers periodic re-verification. |
| T-1-13-06 | Repudiation | A8 verifies Chrome's imageUtil accepts the icon via the harness page calling chrome.notifications.create directly — this DOES NOT verify the SW onStartup handler runs the same code path | mitigate | Documented workaround: the unit test `tests/background/onstartup-notification.test.ts` covers the SW handler invocation; A8 covers the end-to-end icon-acceptance contract (which is what Bug A regressed on). Together they cover both halves of the contract. The harness's role is the icon-acceptance gate; the unit test is the handler-invocation gate. Defense in depth via tier separation (unit + e2e). |
| T-1-13-07 | Elevation of Privilege (additional) | The harness page's chrome.action.setBadgeText / setPopup calls in A2 (workaround for missing 'tabs' permission) MUTATE production state | accept | The mutations are bounded to the harness page's lifetime; the SW state machine reverts on the next setIdleMode / setRecordingMode call. The harness page does NOT persist any mutation. In production (without the harness page being loadable — it's not in dist/), the mutations are impossible. Real-world impact: zero. |
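
The bridge behaviour described in T-1-13-01 and T-1-13-02 (type sentinel, per-op handling, `unknown-op` fallback) can be sketched as a pure dispatcher. This is an illustration only, assuming hypothetical names (`handleBridgeMessage`, `BridgeHooks`, `BridgeResponse`) and modelling just the `get-segment-count` op; the real listener lives in `src/test-hooks/offscreen-hooks.ts` and responds via `sendResponse`:

```typescript
const BRIDGE_TYPE = '__mokoshOffscreenQuery';

type BridgeResponse =
  | { count: number; error?: string }
  | { ok: false; error: string };

// Hypothetical hook surface: only the getter this sketch needs.
interface BridgeHooks {
  segmentCountGetter: () => number;
}

// Returns undefined for messages that do not carry the bridge type,
// so other listeners are unaffected.
function handleBridgeMessage(
  msg: unknown,
  hooks: BridgeHooks,
): BridgeResponse | undefined {
  if (typeof msg !== 'object' || msg === null) return undefined;
  const { type, op } = msg as { type?: unknown; op?: unknown };
  if (type !== BRIDGE_TYPE) return undefined;

  if (op === 'get-segment-count') {
    try {
      return { count: hooks.segmentCountGetter() };
    } catch (err) {
      return { count: -1, error: err instanceof Error ? err.message : String(err) };
    }
  }

  // Unrecognized ops never mutate state; they only report the mismatch.
  return { ok: false, error: 'unknown-op' };
}
```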
</threat_model>
<verification>
- `npm run test:uat` exits 0 against the current bundle; final line is exactly `UAT harness: 14/14 assertions passed`.
- `npm run build` exit 0; for each of the 10 FORBIDDEN_STRINGS, `grep -rln '<string>' dist/` returns 0.
- `npm run build:test` exit 0; `dist-test/` populated; `grep -rln '__mokoshTest' dist-test/` returns ≥1.
- `npx vitest run` exit 0; 89 GREEN across all test files (no regression to unit-test bed).
- `npx tsc --noEmit` exit 0 across `src/` + `tests/`.
- Tier-1 SW-bundle-import gate (`tests/background/sw-bundle-import.test.ts`) GREEN.
- Tier-1 hook-leak gate (`tests/background/no-test-hooks-in-prod-bundle.test.ts`) GREEN with the 10-string inventory.
- Bug B canonical RED-on-regression demo documented in Wave 3B commit body (locally `if (false)` on src/background/index.ts:776 makes A6 RED; revert makes GREEN).
- Bug A canonical RED-on-regression demo documented in Wave 3C commit body (locally stub NOTIFICATION_ICON_PATH or truncate icon128.png makes A8 RED; revert makes GREEN).
- Plan 01-09 Task 5 amended (Amendment 2) at the end of its PLAN.md; preserves Amendment 1 from 9d0313a.
- STATE.md Decisions log carries the new 01-13 entry as the last item.
- ROADMAP.md Phase 1 Plans list carries the 01-13 entry; commit body surfaces the 01-08..01-12 gap for orchestrator follow-up.
- Operator confirms brand/design step 14 + types "approved" in Task 9.
- Standalone `npx tsx tests/uat/a6.test.ts` exit 0 (5/5 PASS) — TDD iteration entry preserved.
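
The hook-leak check in the list above is, at its core, a substring scan of built files against the 10-string inventory. A minimal sketch of that scan, assuming file contents have already been read into memory; `findLeaks` and `Leak` are hypothetical names, and the actual gate lives in `tests/background/no-test-hooks-in-prod-bundle.test.ts`:

```typescript
// The 10-string Approach-B inventory as listed in this plan.
const FORBIDDEN_STRINGS: readonly string[] = [
  '__mokoshTest',
  'setCurrentStream',
  'setSegmentCountGetter',
  'installFakeDisplayMedia',
  'uninstallFakeDisplayMedia',
  'dispatchEndedOnTrack',
  'getSegmentCount',
  '__mokoshOffscreenQuery',
  'get-display-surface',
  'get-segment-count',
];

interface Leak {
  file: string;
  needle: string;
}

// `files` maps a built file path to its text content.
function findLeaks(
  files: Record<string, string>,
  forbidden: readonly string[] = FORBIDDEN_STRINGS,
): Leak[] {
  const leaks: Leak[] = [];
  for (const [file, content] of Object.entries(files)) {
    for (const needle of forbidden) {
      if (content.includes(needle)) leaks.push({ file, needle });
    }
  }
  return leaks;
}
```

An empty result corresponds to the gate being GREEN for `dist/`; for `dist-test/` the expectation is inverted, since at least `__mokoshTest` should be present.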
</verification>
<success_criteria>
Plan 01-13 is complete when:
1. **Wave 0 baseline cleanup landed.** sw-hooks.ts deleted; SW dynamic-import block reverted; popup-bridge lib deleted; feasibility probes deleted. Tier-1 grep gate's FORBIDDEN_STRINGS list updated to the Approach-B 10-string inventory. Baseline 89/89 vitest GREEN restored.
2. **Approach-B architecture proven in production paths.** Prototype c647f61 promoted to `tests/uat/{extension-page-harness.html,extension-page-harness.ts,a6.test.ts}` via Wave 1 git mv + comment updates. Standalone A6 entry PASSES 5/5 from new path.
3. **All 14 harness assertions pass against the current bundle.** `npm run test:uat` exit 0; final line `UAT harness: 14/14 assertions passed`. Runtime ~50-90s.
4. **Both Phase-1-escapee bugs are CI-callable.** Wave 3B commit body documents A6 (Bug B) RED-on-regression cycle; Wave 3C commit body documents A8 (Bug A) RED-on-regression cycle. Both demonstrably catch their respective regressions.
5. **Operator role retired for functional verification.** Plan 01-09 Task 5 redirects to `npm run test:uat` via Amendment 2 (which inherits + supersedes the now-stale 01-11 Amendment 1). Operator retains only step 1 (build) + step 14 (brand/design).
6. **Existing 89 vitest tests remain GREEN after every wave.** No regression to unit-test bed.
7. **`npx tsc --noEmit` exit 0; `npm run build` exit 0; Tier-1 grep gate GREEN.** Production bundle byte-clean of hook strings.
8. **MV3 architectural constraints respected.** NO `await import(...)` in `src/background/index.ts`. `dispatchEvent(new Event('ended'))` for user-stopped simulation. `__MOKOSH_UAT__` define-token gate (NOT `import.meta.env.MODE`).
9. **Plan 01-09 + Plan 01-13 close together.** Wave 4 closing checkpoint: operator confirms harness PASS + brand/design + types "approved".
</success_criteria>
<output>
After completion, create `.planning/phases/01-stabilize-video-pipeline/01-13-SUMMARY.md` per the standard template. Cite:
- The 14 assertions landed GREEN (A0 production-bundle grep gate; A1-A13 functional contract from 01-11 orchestrator brief).
- Both RED-on-regression canonical demos documented in commit bodies (A6 Bug B in Wave 3B; A8 Bug A in Wave 3C).
- Approach-B architecture proven (extension-internal-page harness + offscreen-side synthetic MediaStream + chrome.runtime.sendMessage bridge); inherits c647f61 prototype proof.
- Two-bundle separation (dist/ vs dist-test/) verified by Tier-1 grep gate with 10-string FORBIDDEN_STRINGS inventory.
- Bridge protocol op set (install-fake-display-media, dispatch-ended, has-stream, get-display-surface, get-segment-count) + the cross-isolate boundary it crosses.
- Plan 01-09 Amendment 2 landed (inherits + supersedes Amendment 1 from 9d0313a).
- STATE.md decision logged + ROADMAP.md Phase 1 plan list updated.
- ROADMAP gap flagged (01-08..01-12 entries missing — orchestrator follow-up).
- Open questions resolved (4 from this plan's interfaces block) + resolutions.
- Total harness runtime ranges observed (~50-90s; A11's 35s wait dominates; A0 prod rebuild ~10s skippable via SKIP_PROD_REBUILD=1).
- Standalone A6 entry preserved as TDD iteration tool.
</output>