5 waves, 9 tasks. Inherits Plan 01-11 spike-pivot rationale per 01-11-SUMMARY (commit ba5474c). Implements full 14-assertion harness via Approach B architecture, proven by prototype c647f61.
- Wave 0: clean broken Approach-A artifacts (sw-hooks.ts, SW dynamic import, popup-bridge lib, feasibility probes); update Tier-1 grep gate to 10-string Approach-B forbidden inventory.
- Wave 1: promote c647f61 prototype (extension-page-harness + a6.test.ts) to production paths; A6 stays GREEN.
- Wave 2: rebuild Approach-B driver utilities (launch.ts, assertions.ts, harness-page-driver.ts) replacing deleted popup-bridge primitives.
- Wave 3 (4 task bundles): wire A1-A13 functional assertions; canonical Bug B (A6) + Bug A (A8) RED-on-regression demos mandatory in commit bodies.
- Wave 4: append 01-09 Amendment 2; update STATE.md + ROADMAP.md; operator brand/design checkpoint.
Open questions resolved: Wave 3 granularity = 4 bundles; tabs permission gap = workaround retained (Phase 5 hardening); failure isolation = single browser + bail-on-first; CI plumbing = defer.
Frontmatter validation: valid=true. Plan structure: valid=true, task_count=9, all tasks have files/action/verify/done.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| phase | plan | type | wave | depends_on | files_modified | autonomous | requirements | tags | must_haves |
|---|---|---|---|---|---|---|---|---|---|
| 01-stabilize-video-pipeline | 13 | tdd | 5 | | | false | | | |
Scope Sanity Note
5 waves, 11 tasks (incl. 1 closing checkpoint), ~30 file artifacts. This is above the "split signal" thresholds in <scope_estimation>, but consolidating is the right call AND has architectural precedent: Plan 01-11 was a 4-wave / 9-task plan covering the SAME 14-assertion charter; the only reason 01-13 exists separately is the architectural pivot, not new scope.
Why we accept the borderline rather than split further:
- The architecture is now proven (prototype c647f61). The risk profile of 01-13 is execution-driven, not architecture-driven; splitting would multiply the per-plan ceremony tax (each plan re-deriving the harness-page contract in its own must_haves frontmatter) for no risk reduction.
- Wave 0 cleanup is a single atomic prerequisite for every subsequent wave. Splitting it out would create a window where the baseline is dirty AND the next wave is partially landed, exactly the failure mode 01-11's spike-pivot exposed.
- Wave 3's 4 task bundles ARE the natural split. Each bundle clusters 3 or 4 assertions by subsystem (Wave 3A: state machine + UI; Wave 3B: data flow + Bug B; Wave 3C: notifications + manifest; Wave 3D: recording continuity + export). Single-assertion-per-task would yield 13 wave-3 tasks, which is ceremony overhead with no atomicity benefit since each assertion is independently testable via the standalone harness page surface.
- Context budget: Wave 0 ~10%; Wave 1 ~15%; Wave 2 ~25%; Wave 3A-3D ~15% each = ~60%; Wave 4 ~5%. Total ~115% — ABOVE the planner's 50% target if a single executor ran the whole plan. Mitigation: each wave is intended for a fresh-context executor spawn (the GSD execute-phase pattern). Per-executor context: ~25-30%, well within budget. The executor spawn pattern is the load-bearing assumption; if it doesn't hold, the natural split line is Wave 0+1+2 = Plan 01-13A; Wave 3+4 = Plan 01-13B (with the harness-page contract duplicated as the price of split).
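The budget arithmetic in the last bullet can be checked with a short sketch. The per-wave percentages are copied from that bullet; the variable names are illustrative only, and the sketch assumes each wave (and each Wave 3 bundle) maps to one fresh-context executor spawn.

```typescript
// Per-wave context estimates from the bullet above, in percent of one
// executor's context window. Names here are illustrative, not project code.
const waveBudget: Record<string, number> = {
  'Wave 0': 10,
  'Wave 1': 15,
  'Wave 2': 25,
  'Wave 3A': 15,
  'Wave 3B': 15,
  'Wave 3C': 15,
  'Wave 3D': 15,
  'Wave 4': 5,
};

const estimates = Object.values(waveBudget);

// Cost if a single executor ran every wave back to back.
const singleExecutorTotal = estimates.reduce((sum, pct) => sum + pct, 0);

// Cost of the most expensive spawn when every wave gets a fresh context
// (assumption: one spawn per wave or bundle).
const largestSpawn = Math.max(...estimates);

console.log(`single-executor total: ~${singleExecutorTotal}%`);
console.log(`largest fresh-context spawn: ~${largestSpawn}%`);
```

The raw per-spawn maximum is Wave 2's estimate; the ~25-30% figure quoted above presumably leaves some headroom on top of it.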
If a future revision DOES force a split, natural cut line: Plan 01-13A = Waves 0+1+2 (cleanup + prototype promotion + driver utilities); Plan 01-13B = Waves 3+4 (13 functional assertions + closure). Wave 3's bundling stays inside 01-13B as 4 sub-tasks.
Land the full 14-assertion UAT harness via Approach B (extension-internal-page harness + offscreen-side synthetic MediaStream + chrome.runtime.sendMessage bridge), inheriting from Plan 01-11's spike-pivot rationale and the proven prototype architecture (c647f61: A6 5/5 GREEN, Bug-B regression rewind verified, ~7s end-to-end runtime).
Three coordinated changes from the 01-11 baseline:
- Wave 0 cleanup. Delete the broken Approach-A artifacts: `src/test-hooks/sw-hooks.ts` (MV3 SW blocks dynamic import), the dynamic-import block in `src/background/index.ts` (same), `tests/uat/lib/{launch,extension,sw,offscreen,assertions}.ts` (popup-bridge architecture wrong per 01-11-SUMMARY falsification 3), and `tests/uat/prototype/probe_*.mjs` (already-resolved feasibility probes). Two vitest failures (`sw-bundle-import.test.ts` + `no-test-hooks-in-prod-bundle.test.ts`) flip GREEN as a side-effect. Atomic commit.
- Promote prototype to production paths + build out driver scaffolding (Waves 1-2). Move `tests/uat/prototype/{extension-page-harness.html,extension-page-harness.ts,a6.test.ts}` to `tests/uat/`. Update `vite.test.config.ts` rollup inputs. Rebuild `tests/uat/lib/{launch,assertions,harness-page-driver}.ts` around the extension-page architecture (NO popup-bridge primitives). Verify A6 still GREEN from the new paths.
- Wire 13 functional assertions (A1-A13) via 4 task bundles in Wave 3, each extending `window.__mokoshHarness` with 3 or 4 assertion methods and the corresponding Puppeteer driver wrappers. Each task delivers an atomic commit; the two TDD canon demos (A6 Bug B + A8 Bug A regression rewinds) are documented in their respective commit bodies.
Wave 4 closure: amend 01-09 Task 5 to point at npm run test:uat; update STATE.md + ROADMAP.md; operator brand/design checkpoint surfaces the 14/14 PASS report and asks "approved" (the only operator-facing gate in the new world).
Operator role retirement: Plan 01-09 closes when npm run test:uat exits 0 + operator confirms step 14 (brand/design). All functional gates move to CI-callable harness — exactly the goal Plan 01-11 set out to achieve but couldn't deliver due to Approach-A architectural infeasibility.
Output:
- Wave 0: clean baseline (5 deletions + 1 revert + Tier-1 gate stays GREEN).
- Wave 1: prototype promoted to `tests/uat/extension-page-harness.{html,ts}` + standalone A6 entry at `tests/uat/a6.test.ts`; A6 PASSES from new path.
- Wave 2: `tests/uat/lib/{launch,assertions,harness-page-driver}.ts` rebuilt; A6 still PASSES via the new driver wrappers.
- Wave 3: 13 assertion methods added to `window.__mokoshHarness`; 14/14 GREEN in `npm run test:uat`.
- Wave 4: 01-09 amendment block appended; STATE.md decision logged; ROADMAP.md Phase 1 plan list updated; operator checkpoint confirms.
<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/REQUIREMENTS.md
@.planning/phases/01-stabilize-video-pipeline/01-CONTEXT.md
@.planning/phases/01-stabilize-video-pipeline/01-08-PLAN.md
@.planning/phases/01-stabilize-video-pipeline/01-08-SUMMARY.md
@.planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md
@.planning/phases/01-stabilize-video-pipeline/01-09-SUMMARY.md
@.planning/phases/01-stabilize-video-pipeline/01-11-PLAN.md
@.planning/phases/01-stabilize-video-pipeline/01-11-RESEARCH.md
@.planning/phases/01-stabilize-video-pipeline/01-11-SUMMARY.md
@.planning/debug/resolved/01-09-recovery-flow.md
@src/background/index.ts
@src/offscreen/recorder.ts
@src/test-hooks/offscreen-hooks.ts
@src/test-hooks/types.ts
@vite.test.config.ts
@vite.config.ts
@manifest.json
@package.json
@tsconfig.json
@tests/uat/prototype/extension-page-harness.html
@tests/uat/prototype/extension-page-harness.ts
@tests/uat/prototype/a6.test.ts
@tests/background/no-test-hooks-in-prod-bundle.test.ts
@tests/background/sw-bundle-import.test.ts
Approach-B architecture (ratified by prototype c647f61; DO NOT DEVIATE)
┌──────────────────────────────────────────────────────────────────────────┐
│ Node host process (Puppeteer driver — tests/uat/harness.test.ts) │
│ • launches Chrome with enableExtensions: [dist-test/] │
│ • opens chrome-extension://<id>/tests/uat/extension-page-harness.html │
│ • opens about:blank victim page + bringToFront │
│ • page.evaluate(() => window.__mokoshHarness.assertXX()) │
│ • reads structured AssertionResult, drives bail-on-fail │
│ • does host-side fs/zip/ffprobe work (A5, A12, A13) │
└────────────────────────┬─────────────────────────────────────────────────┘
│ Puppeteer CDP page.evaluate
▼
┌──────────────────────────────────────────────────────────────────────────┐
│ Extension-internal harness page (PRIVILEGED — full chrome.* API) │
│ tests/uat/extension-page-harness.{html,ts} → window.__mokoshHarness │
│ .assertA1..A13 — orchestrate each assertion using: │
│ • chrome.action.getBadgeText / getPopup / setBadgeText / setPopup │
│ • chrome.runtime.sendMessage (REQUEST_PERMISSIONS, SAVE_ARCHIVE, │
│ RECORDING_ERROR, __mokoshOffscreenQuery, START_RECORDING) │
│ • chrome.notifications.getAll / .create / .clear │
│ • chrome.offscreen.createDocument / .hasDocument │
│ • chrome.runtime.getManifest / .getURL │
│ • fetch(chrome.runtime.getURL('icons/icon{N}.png')) for size check │
└─┬──────────────────────┬────────────────────────────────────┬────────────┘
│ direct chrome.* calls │ chrome.runtime.sendMessage │
│ (page has privilege) │ (cross-isolate path) │
▼ ▼ ▼
┌─────────────────┐ ┌────────────────────────────────┐ ┌──────────────────┐
│ SW (production) │ │ Offscreen (production) │ │ Browser APIs │
│ src/background/ │ │ src/offscreen/recorder.ts │ │ (chrome.action, │
│ index.ts │ │ + src/test-hooks/ │ │ notifications, │
│ UNCHANGED │ │ offscreen-hooks.ts │ │ downloads) │
│ (no hooks) │ │ gated by __MOKOSH_UAT__ │ │ │
│ │ │ installFakeDisplayMedia(), │ │ │
│ │ │ dispatchEndedOnTrack(), │ │ │
│ │ │ setSegmentCountGetter() │ │ │
└─────────────────┘ └────────────────────────────────┘ └──────────────────┘
Puppeteer extension API surface (per c647f61 prototype, verified)
import puppeteer, { type Browser, type Page } from 'puppeteer';
const browser: Browser = await puppeteer.launch({
enableExtensions: ['/abs/path/to/dist-test'],
headless: process.env.HEADLESS !== '0',
pipe: true,
protocolTimeout: 90_000, // headroom for sendMessage round-trips
args: ['--no-sandbox'],
// DO NOT add --auto-select-desktop-capture-source — unreliable in
// --headless=new per 01-11-SUMMARY falsification 4; synthetic stream
// bypasses the picker entirely.
});
const extensions = await browser.extensions();
const [extensionId] = [...extensions][0];
const victimPage = await browser.newPage();
await victimPage.goto('about:blank');
const page: Page = await browser.newPage();
await page.goto(`chrome-extension://${extensionId}/tests/uat/extension-page-harness.html`);
await page.waitForFunction(() => (window as any).__mokoshHarness !== undefined);
await victimPage.bringToFront();
const result = await page.evaluate(async () => {
const r = await (window as any).__mokoshHarness.assertA6();
return r;
});
Harness-page surface (extends prototype's window.__mokoshHarness)
// tests/uat/extension-page-harness.ts — Wave 1 PROMOTED + Wave 3 EXTENDED.
interface AssertionResult {
passed: boolean;
name: string;
checks: Array<{
name: string;
expected: unknown;
actual: unknown;
passed: boolean;
}>;
diagnostics: string[];
error?: string;
}
// Augmented globally:
interface Window {
__mokoshHarness: {
assertA1: () => Promise<AssertionResult>; // SW bootstrap state
assertA2: () => Promise<AssertionResult>; // toolbar onClicked → REC
assertA3: () => Promise<AssertionResult>; // displaySurface monitor
assertA4: () => Promise<AssertionResult>; // popup during recording
assertA5: () => Promise<{ // SAVE_ARCHIVE returns blob bytes
passed: boolean;
zipBytes?: string; // base64
diagnostics: string[];
error?: string;
}>;
assertA6: () => Promise<AssertionResult>; // Bug B canonical (proven)
assertA7: () => Promise<AssertionResult>; // genuine error → ERR + recovery
assertA8: () => Promise<AssertionResult>; // Bug A onStartup notification
assertA9: () => Promise<AssertionResult>; // icon file sizes
assertA10: () => Promise<AssertionResult>; // manifest shape
assertA11: () => Promise<AssertionResult>; // 35s → ≥3 segments
assertA12: () => Promise<{ // ffprobe (host-side; returns webm bytes)
passed: boolean;
webmBytes?: string; // base64
diagnostics: string[];
error?: string;
}>;
assertA13: () => Promise<{ // zip shape (host-side; returns zip bytes)
passed: boolean;
zipBytes?: string; // base64
expectedVersion: string;
diagnostics: string[];
error?: string;
}>;
};
}
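A minimal sketch of how an `AssertionResult` could be assembled from individual checks, using the A6.1 badge-text check as the example. The `check` and `buildResult` helpers are hypothetical (they do not appear in the prototype), and the rule that an assertion with zero checks fails is an assumption of this sketch.

```typescript
interface Check {
  name: string;
  expected: unknown;
  actual: unknown;
  passed: boolean;
}

interface AssertionResult {
  passed: boolean;
  name: string;
  checks: Check[];
  diagnostics: string[];
  error?: string;
}

// Hypothetical helper: record one expected/actual comparison.
// Object.is is used so that primitives compare by value.
function check(name: string, expected: unknown, actual: unknown): Check {
  return { name, expected, actual, passed: Object.is(expected, actual) };
}

// Hypothetical helper: an assertion passes only if it ran at least one
// check and every check passed (assumption of this sketch).
function buildResult(
  name: string,
  checks: Check[],
  diagnostics: string[] = [],
): AssertionResult {
  const passed = checks.length > 0 && checks.every((c) => c.passed);
  return { passed, name, checks, diagnostics };
}

// A6.1 with the Bug B fix reverted: the badge shows 'ERR' after user-stop.
const regressed = buildResult('A6', [
  check("A6.1: badge text is '' after user-stop", '', 'ERR'),
]);

// A6.1 with the fix in place: the badge is cleared.
const healthy = buildResult('A6', [
  check("A6.1: badge text is '' after user-stop", '', ''),
]);

console.log(regressed.passed, healthy.passed);
```

Keeping `expected` and `actual` on every check is what lets the host-side driver print a structured diagnostic dump when it bails.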
Offscreen-hooks bridge protocol (UNCHANGED from c647f61)
// chrome.runtime.sendMessage payload:
{ type: '__mokoshOffscreenQuery', op: 'install-fake-display-media' } // → { ok, error? }
{ type: '__mokoshOffscreenQuery', op: 'dispatch-ended' } // → { ok, error? }
{ type: '__mokoshOffscreenQuery', op: 'has-stream' } // → { hasStream }
{ type: '__mokoshOffscreenQuery', op: 'get-segment-count' } // NEW in Wave 3 → { count }
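The four payloads above can be captured as a typed union with a runtime guard, so the offscreen listener ignores unrelated messages such as `START_RECORDING`. This is a sketch only: `OffscreenOp`, `isOffscreenQuery`, and `makeQuery` are illustrative names, not identifiers from the prototype.

```typescript
// Ops listed in the bridge protocol above.
type OffscreenOp =
  | 'install-fake-display-media'
  | 'dispatch-ended'
  | 'has-stream'
  | 'get-segment-count';

interface OffscreenQuery {
  type: '__mokoshOffscreenQuery';
  op: OffscreenOp;
}

const KNOWN_OPS: readonly string[] = [
  'install-fake-display-media',
  'dispatch-ended',
  'has-stream',
  'get-segment-count',
];

// Hypothetical guard: true only for well-formed bridge queries.
function isOffscreenQuery(message: unknown): message is OffscreenQuery {
  if (typeof message !== 'object' || message === null) {
    return false;
  }
  const candidate = message as { type?: unknown; op?: unknown };
  return (
    candidate.type === '__mokoshOffscreenQuery' &&
    typeof candidate.op === 'string' &&
    KNOWN_OPS.includes(candidate.op)
  );
}

// Hypothetical constructor used on the harness-page side.
function makeQuery(op: OffscreenOp): OffscreenQuery {
  return { type: '__mokoshOffscreenQuery', op };
}

console.log(isOffscreenQuery(makeQuery('has-stream')));
console.log(isOffscreenQuery({ type: 'START_RECORDING' }));
```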
MV3 SW constraint enforcement (per 01-11-SUMMARY falsification 1)
// src/background/index.ts — WAVE 0 STATE (after revert):
// NO top-of-module dynamic import. The 01-11 Wave 1 block
// let testHooks: ... = null;
// if (__MOKOSH_UAT__) { testHooks = await import('../test-hooks/sw-hooks'); }
// is REMOVED entirely. Approach B does not need SW-side hooks —
// chrome.action.* state is queried by the harness page directly.
Offscreen-side gate (UNCHANGED — works because offscreen IS a DOM document)
// src/offscreen/recorder.ts lines 21-48 — PRESERVED:
let testHooks: typeof import('../test-hooks/offscreen-hooks') | null = null;
if (__MOKOSH_UAT__) {
testHooks = await import('../test-hooks/offscreen-hooks');
}
// Wave 3 add (after existing setCurrentStream wire):
if (__MOKOSH_UAT__) {
testHooks?.setCurrentStream(stream);
testHooks?.setSegmentCountGetter(() => segments.length);
}
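The reason the recorder passes a getter rather than a number is that the hook module then always reads the live segment count. A minimal sketch of the hook side, assuming the module keeps the getter in a module-level variable; `handleGetSegmentCount` is a hypothetical name for whatever answers the `get-segment-count` op, and returning 0 when no getter is registered is an assumption.

```typescript
type SegmentCountGetter = () => number;

// Module-level slot; null until the recorder registers a getter.
let segmentCountGetter: SegmentCountGetter | null = null;

// Mirrors the setSegmentCountGetter call shown in the recorder snippet above.
function setSegmentCountGetter(getter: SegmentCountGetter): void {
  segmentCountGetter = getter;
}

// Hypothetical responder for { op: 'get-segment-count' }.
// Assumption: report 0 when no recording has registered a getter yet.
function handleGetSegmentCount(): { count: number } {
  return { count: segmentCountGetter ? segmentCountGetter() : 0 };
}

// Stand-in for the recorder's segment buffer.
const segments: Uint8Array[] = [];
const countBeforeWiring = handleGetSegmentCount().count;

setSegmentCountGetter(() => segments.length);
segments.push(new Uint8Array(1), new Uint8Array(1));
const countAfterTwoSegments = handleGetSegmentCount().count;

segments.push(new Uint8Array(1));
const countAfterThreeSegments = handleGetSegmentCount().count;

console.log(countBeforeWiring, countAfterTwoSegments, countAfterThreeSegments);
```

Because the closure reads `segments.length` at query time, A11's "35s, at least 3 segments" check can poll the op without the recorder pushing updates.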
Tier-1 grep gate forbidden-string inventory (Wave 0 audit + Wave 3 extension)
// tests/background/no-test-hooks-in-prod-bundle.test.ts FORBIDDEN strings:
const FORBIDDEN_STRINGS = [
'__mokoshTest',
'setCurrentStream',
'setSegmentCountGetter',
'installFakeDisplayMedia',
'uninstallFakeDisplayMedia',
'dispatchEndedOnTrack',
'getSegmentCount',
'__mokoshOffscreenQuery',
];
// Every entry must be absent from EVERY file under dist/ post-build.
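A minimal sketch of the gate's scan logic, assuming a plain recursive walk with substring matching over every file. The real test's structure is not shown in this plan, so `listFiles` and `findViolations` are illustrative names, and the demo runs against a throwaway temp directory rather than `dist/`.

```typescript
import { mkdirSync, mkdtempSync, readdirSync, readFileSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

const FORBIDDEN_STRINGS: readonly string[] = [
  '__mokoshTest',
  'setCurrentStream',
  'setSegmentCountGetter',
  'installFakeDisplayMedia',
  'uninstallFakeDisplayMedia',
  'dispatchEndedOnTrack',
  'getSegmentCount',
  '__mokoshOffscreenQuery',
];

// Recursively collect every file path under a directory.
function listFiles(dir: string): string[] {
  return readdirSync(dir, { withFileTypes: true }).flatMap((entry) => {
    const full = join(dir, entry.name);
    return entry.isDirectory() ? listFiles(full) : [full];
  });
}

interface Violation {
  file: string;
  forbidden: string;
}

// Report each (file, string) pair where a forbidden string appears.
function findViolations(root: string, forbidden: readonly string[]): Violation[] {
  const violations: Violation[] = [];
  for (const file of listFiles(root)) {
    const text = readFileSync(file, 'utf8');
    for (const needle of forbidden) {
      if (text.includes(needle)) {
        violations.push({ file, forbidden: needle });
      }
    }
  }
  return violations;
}

// Demo: a fake build output with one clean file and one leaking file.
const demoRoot = mkdtempSync(join(tmpdir(), 'tier1-gate-'));
mkdirSync(join(demoRoot, 'assets'));
writeFileSync(join(demoRoot, 'clean.js'), 'console.log("ok");');
writeFileSync(join(demoRoot, 'assets', 'leaky.js'), 'globalThis.installFakeDisplayMedia = 1;');

const demoViolations = findViolations(demoRoot, FORBIDDEN_STRINGS);
console.log(demoViolations);
```

In a vitest test the equivalent assertion would be that `findViolations('dist', FORBIDDEN_STRINGS)` is empty after `npm run build`.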
Test bundle rollup input (vite.test.config.ts — Wave 1 update)
rollupOptions: {
input: {
// Removed: prototype_harness → moved to extension_page_harness
extension_page_harness: 'tests/uat/extension-page-harness.html',
},
},
npm scripts (UNCHANGED from 01-11 Wave 0)
{
"scripts": {
"build:test": "tsc && vite build --mode test --config vite.test.config.ts",
"test:uat": "npm run build:test && tsx tests/uat/harness.test.ts"
}
}
Resolved open questions
| # | Question | Resolution | Rationale |
|---|---|---|---|
| 1 | Task granularity for Wave 3: 4 bundles of 3-4 assertions, OR 13 separate tasks? | 4 bundles (T-Wave3A: A1+A2+A3+A4; T-Wave3B: A5+A6+A7; T-Wave3C: A8+A9+A10; T-Wave3D: A11+A12+A13). | Balance ceremony overhead (13 commits vs 4) vs atomicity (per-assertion vs subsystem-cluster). The bundle boundaries follow subsystem coupling: each bundle's assertions share fixture state (e.g. T-Wave3A all run against a single launch; T-Wave3D all need the 35s recording). |
| 2 | Manifest tabs permission gap (per 01-11-SUMMARY) | Workaround retained (no scope creep). The prototype's A6 implementation sends START_RECORDING directly to the offscreen via chrome.runtime.sendMessage, bypassing the SW's startVideoCapture which requires chrome.tabs.query({active:true}) to return a tab with .url (which it does NOT without tabs permission). Wave 3 A2 (toolbar onClicked) uses the same direct-offscreen path. Flagged for Phase 5 hardening: adding tabs permission to manifest would unlock testing the real toolbar onClicked → startVideoCapture path; out of scope for the harness plan. | Adding manifest permissions in a TEST plan is wrong on scope grounds (changes production attack surface). The harness verifies the contract that matters (recording starts; bug B routing works); the routing-via-startVideoCapture vs direct-offscreen distinction is orthogonal to the Bug B fix verification. |
| 3 | Failure isolation: single browser vs per-assertion restart? | Single browser, serial assertions, bail-on-first-failure, structured diagnostic dump. Matches prototype c647f61 pattern; matches 01-11 RESEARCH §5 recommendation + open-question resolution 4. | Per-assertion restart = 14 × ~3-5s = ~60s overhead. Single browser keeps total runtime in the 60-90s range. State bleed is acceptable for 14 deterministic assertions where each one's pre-condition is established by its own setup steps (e.g. A6's "wait for badge='REC'" pre-condition runs independent of A5's state). The bail-on-fail + diagnostic dump preserves debugging value. |
| 4 | CI plumbing scope | Defer. No .github/workflows/ directory exists; introducing CI tooling here would force a CI-tool decision (Actions vs self-hosted vs other) out of scope for the harness landing plan. The harness is CI-callable today (npm run test:uat exits 0 on pass, non-zero on fail, deterministic exit codes). | Matches 01-11 RESEARCH open-question resolution 3 verbatim. Phase 5 hardening backlog. |
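Resolution 3 (single browser, serial assertions, bail-on-first-failure) can be sketched as a small host-side runner. This is an assumption-laden illustration: `runSerialBailOnFirst`, `AssertionStep`, and `RunReport` are hypothetical names, and each step stands in for a `page.evaluate(() => window.__mokoshHarness.assertXX())` call.

```typescript
interface AssertionOutcome {
  name: string;
  passed: boolean;
  diagnostics: string[];
  error?: string;
}

// One step = a name plus a thunk that drives one harness-page assertion.
type AssertionStep = [name: string, run: () => Promise<AssertionOutcome>];

interface RunReport {
  outcomes: AssertionOutcome[];
  skipped: string[];
  exitCode: 0 | 1;
}

// Run steps in order; stop at the first failure (or thrown error) and
// report which steps were never attempted.
async function runSerialBailOnFirst(steps: readonly AssertionStep[]): Promise<RunReport> {
  const outcomes: AssertionOutcome[] = [];
  let attempted = 0;
  for (const [name, run] of steps) {
    attempted += 1;
    let outcome: AssertionOutcome;
    try {
      outcome = await run();
    } catch (err) {
      outcome = { name, passed: false, diagnostics: [], error: String(err) };
    }
    outcomes.push(outcome);
    if (!outcome.passed) {
      const skipped = steps.slice(attempted).map(([skippedName]) => skippedName);
      return { outcomes, skipped, exitCode: 1 };
    }
  }
  return { outcomes, skipped: [], exitCode: 0 };
}

// Demo with stubbed assertions: A2 fails, so A3 is never run.
const demoSteps: AssertionStep[] = [
  ['A1', async () => ({ name: 'A1', passed: true, diagnostics: [] })],
  ['A2', async () => ({ name: 'A2', passed: false, diagnostics: ['badge stayed empty'] })],
  ['A3', async () => ({ name: 'A3', passed: true, diagnostics: [] })],
];

const demoRun = runSerialBailOnFirst(demoSteps);
demoRun.then((report) => console.log(report.exitCode, report.skipped));
```

A real entry point would set `process.exitCode` from `report.exitCode`, which gives `npm run test:uat` the deterministic exit codes described in resolution 4.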
How A6 / A8 RED-on-regression demos work (commit body documentation contract)
A6 (Bug B canonical) — T-Wave3B commit body MUST include:
RED-on-regression demo (A6 Bug B regression rewind):
$ # Apply local-only revert of Bug B fix at b9eeeeb:
$ # On src/background/index.ts:776, change `if (errorCode === 'user-stopped-sharing')` to `if (false)`
$ npm run build:test
$ npm run test:uat # OR: npx tsx tests/uat/a6.test.ts
# A6 result: FAIL
# A6.1: badge text is '' (NOT 'ERR') after user-stop — expected '', actual 'ERR'
$ git checkout -- src/background/index.ts
$ npm run build:test
$ npm run test:uat
# A6 result: PASS 5/5
This proves the harness can catch a Bug B regression in the SW state machine.
A8 (Bug A canonical) — T-Wave3C commit body MUST include:
RED-on-regression demo (A8 Bug A regression rewind):
$ # Apply local-only icon stub on src/background/index.ts:71:
$ # const NOTIFICATION_ICON_PATH = 'icons/missing.png';
$ # OR: truncate icons/icon128.png to <100 bytes
$ npm run build:test
$ npm run test:uat
# A8 result: FAIL
# A8.1: notification count delta === 1 — expected 1, actual 0 (Chrome imageUtil rejected create)
$ git checkout -- src/background/index.ts icons/icon128.png
$ npm run build:test
$ npm run test:uat
# A8 result: PASS
This proves the harness can catch a Bug A regression in the notification icon path.
ba5474c already has the comment-only form per inspection; this task ensures the state is clean.)
- DELETE the popup-bridge scaffolding under `tests/uat/lib/`:
- `tests/uat/lib/launch.ts` (popup-bridge launch helper; will be rewritten in Wave 2)
- `tests/uat/lib/extension.ts` (popup-bridge extension-id resolver)
- `tests/uat/lib/sw.ts` (sw.evaluate helpers — falsified per SUMMARY §2)
- `tests/uat/lib/offscreen.ts` (popup-bridge offscreen helpers)
- `tests/uat/lib/assertions.ts` (will be rewritten in Wave 2 with Approach-B primitives)
Keep `tests/uat/lib/zip.ts` (still valid — host-side jszip work).
Keep `tests/uat/lib/test-hook-contract.d.ts` (still valid — type contract mirror).
- DELETE the feasibility-research probes under `tests/uat/prototype/`:
- `tests/uat/prototype/probe_offscreen.mjs`
- `tests/uat/prototype/probe_sw.mjs`
- `tests/uat/prototype/probe_tabs.mjs`
- `tests/uat/prototype/probe_tabs2.mjs`
KEEP `tests/uat/prototype/extension-page-harness.{html,ts}` + `tests/uat/prototype/a6.test.ts` for Wave 1 promotion.
- AUDIT `tests/background/no-test-hooks-in-prod-bundle.test.ts` forbidden-string list. Current list (per the file's preamble) covers `__mokoshTest`, `simulateUserStop`, `getSegmentCount`, `setCurrentStream`, `setSegmentCountGetter`. Add: `installFakeDisplayMedia`, `uninstallFakeDisplayMedia`, `dispatchEndedOnTrack`, `__mokoshOffscreenQuery`. Remove: `simulateUserStop` (was Approach-A naming; Approach B uses `dispatchEndedOnTrack`). The total forbidden inventory after this audit: 8 strings.
- VERIFY harness.test.ts is still loadable but stale-imports do not block typecheck: it currently imports from `./lib/assertions` (deleted) etc. Wave 0 needs to handle this — simplest path: also DELETE `tests/uat/harness.test.ts` in Wave 0 since it will be entirely rewritten in Waves 1-3 around the extension-page architecture. (The standalone harness entry will land in Wave 1; the orchestrator-bundled harness in Waves 2-3.) Document the deletion in the commit body.
- After deletions: `npm run build` exits 0; `npm run build:test` exits 0; `dist/` and `dist-test/` both populated. The Tier-1 grep gate test passes against the new forbidden-string list (which it should — no production code references any hook string after the SW-side revert). The `sw-bundle-import.test.ts` flips GREEN once `npm run build` runs (it was failing because `dist/service-worker-loader.js` was stale/missing).
- Full vitest suite: 89 GREEN (88 pre-existing + 1 Tier-1 gate that was failing on stale state, now passing with updated forbidden list).
- `npx tsc --noEmit` exits 0 (the deletions remove stale imports; the harness.test.ts deletion removes its stale import chain).
1. Read `src/background/index.ts` lines 13-29; confirm the current state is comment-only (no `await import` block). If a dynamic-import block exists, edit to remove it; if only the comment block exists, edit to refine the comment to reflect 01-13 status: "Plan 01-13: NO SW-side test hook gate. MV3 SW blocks dynamic import (01-11 falsification 1 / Chromium es_modules.md). Approach B reads SW state via extension-internal harness page's chrome.action.* calls — see tests/uat/extension-page-harness.ts." Keep the existing Tier-1 grep gate citation.
2. Delete files:
```
rm src/test-hooks/sw-hooks.ts
rm tests/uat/lib/launch.ts
rm tests/uat/lib/extension.ts
rm tests/uat/lib/sw.ts
rm tests/uat/lib/offscreen.ts
rm tests/uat/lib/assertions.ts
rm tests/uat/prototype/probe_offscreen.mjs
rm tests/uat/prototype/probe_sw.mjs
rm tests/uat/prototype/probe_tabs.mjs
rm tests/uat/prototype/probe_tabs2.mjs
rm tests/uat/harness.test.ts
```
Use `git rm` to keep the index in sync. The harness.test.ts deletion is intentional — it will be reborn in Wave 1.
3. Edit `tests/background/no-test-hooks-in-prod-bundle.test.ts` FORBIDDEN_STRINGS array (or whatever the existing constant is named):
- Remove: `simulateUserStop`
- Add: `installFakeDisplayMedia`, `uninstallFakeDisplayMedia`, `dispatchEndedOnTrack`, `__mokoshOffscreenQuery`
Update the file preamble to cite 01-13 (replace any "Plan 01-11 Task 1" references with "Plan 01-13 Wave 0" where the description was about the gate's CURRENT scope, NOT historical provenance — preserve historical provenance for traceability).
4. Run `npm run build` (production); confirm exit 0; confirm `dist/service-worker-loader.js` exists.
5. Run `grep -rln '__mokoshTest\|installFakeDisplayMedia\|uninstallFakeDisplayMedia\|dispatchEndedOnTrack\|getSegmentCount\|setCurrentStream\|setSegmentCountGetter\|__mokoshOffscreenQuery' dist/` → 0 matches.
6. Run `npm run build:test` (test); confirm exit 0; confirm `dist-test/` populated. Confirm `grep -rln 'installFakeDisplayMedia\|dispatchEndedOnTrack' dist-test/` → ≥1 match (offscreen-hooks chunk).
7. Run `npx tsc --noEmit` → exit 0 (no stale imports left in tests/).
8. Run `npx vitest run --reporter=dot` → 89 GREEN (the two prior failures flip; no new failures).
9. Commit atomically: `chore(01-13): wave-0 — clean broken Approach A artifacts per 01-11-SUMMARY`. Commit body cites: (a) sw-hooks.ts deletion + SW dynamic-import revert + falsification reference; (b) popup-bridge tests/uat/lib/* deletions + falsification reference; (c) feasibility probe deletions; (d) Tier-1 gate forbidden-string list update + rationale; (e) harness.test.ts deletion (will be rewritten in Wave 1).
Per project style: NO `as any`; NO `@ts-ignore`; absolute imports; extensive comments for the Tier-1 gate edit explaining the surface-inventory expansion.
npm run build && grep -rln '__mokoshTest\|installFakeDisplayMedia\|uninstallFakeDisplayMedia\|dispatchEndedOnTrack\|getSegmentCount\|setCurrentStream\|setSegmentCountGetter\|__mokoshOffscreenQuery' dist/ | wc -l | grep -q '^0$' && npm run build:test && test -d dist-test && npx tsc --noEmit && npx vitest run --reporter=dot
- `src/test-hooks/sw-hooks.ts` does not exist.
- `src/background/index.ts` lines 13-29 are comment-only (NO `await import` block); comment text references 01-13 + the architectural constraint.
- `tests/uat/lib/{launch,extension,sw,offscreen,assertions}.ts` do not exist; `tests/uat/lib/{zip.ts,test-hook-contract.d.ts}` retained.
- `tests/uat/prototype/probe_*.mjs` do not exist; `tests/uat/prototype/{extension-page-harness.{html,ts},a6.test.ts}` retained.
- `tests/uat/harness.test.ts` does not exist (will be reborn in Wave 1).
- `tests/background/no-test-hooks-in-prod-bundle.test.ts` FORBIDDEN_STRINGS list contains exactly the 8 hooks per the inventory in interfaces; preamble updated to cite 01-13.
- `npm run build` exit 0; `grep -rln ...` returns 0 matches in `dist/`.
- `npm run build:test` exit 0; `dist-test/` populated.
- `npx tsc --noEmit` exit 0.
- `npx vitest run` exit 0 with 89 GREEN.
- Commit message follows Mark's `<type>(<scope>): <subject>` style with em-dash separator + `Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>` trailer.
Baseline GREEN; broken Approach-A artifacts deleted; Tier-1 grep gate updated for Approach-B surface inventory; ready for Wave 1 prototype promotion.
Task 2 (Wave 1): Promote c647f61 prototype to production paths; A6 stays GREEN from new path.
- tests/uat/prototype/extension-page-harness.html (current PROTOTYPE; will be moved)
- tests/uat/prototype/extension-page-harness.ts (current PROTOTYPE; comments need 01-13 update)
- tests/uat/prototype/a6.test.ts (current PROTOTYPE; comments need 01-13 update)
- vite.test.config.ts (rollup input update needed)
- src/test-hooks/offscreen-hooks.ts (extending in Wave 3; for now confirm it works with the promoted paths — the bridge protocol is path-agnostic)
- src/test-hooks/types.ts (will be extended in Wave 3 with installFakeDisplayMedia/dispatchEndedOnTrack/uninstallFakeDisplayMedia typed fields; in Wave 1 just confirm the cross-cast in offscreen-hooks.ts still works)
tests/uat/extension-page-harness.html, tests/uat/extension-page-harness.ts, tests/uat/a6.test.ts, tests/uat/prototype/extension-page-harness.html, tests/uat/prototype/extension-page-harness.ts, tests/uat/prototype/a6.test.ts, vite.test.config.ts
- Move (via git mv) `tests/uat/prototype/extension-page-harness.html` → `tests/uat/extension-page-harness.html`.
- Move (via git mv) `tests/uat/prototype/extension-page-harness.ts` → `tests/uat/extension-page-harness.ts`.
- Move (via git mv) `tests/uat/prototype/a6.test.ts` → `tests/uat/a6.test.ts`.
- The `tests/uat/prototype/` directory is now EMPTY — delete it (git rm -r if needed; usually `git mv` of the contents leaves the dir untracked, in which case it's a no-op).
- Update comments in the moved files: replace "PROTOTYPE" / "Plan 01-11 PROTOTYPE" references with "Plan 01-13 production harness" where the comment was describing the file's CURRENT role (NOT its historical provenance — preserve "originally landed as 01-11 prototype at c647f61" where the comment was describing provenance).
- Update `vite.test.config.ts` rollup inputs: replace `prototype_harness: 'tests/uat/prototype/extension-page-harness.html'` with `extension_page_harness: 'tests/uat/extension-page-harness.html'`. Update the inline comment to reflect the new path (no more "prototype" reference).
- Update path references in the moved files:
- `tests/uat/extension-page-harness.html` line 9 (HTML preamble): change `chrome-extension://<id>/tests/uat/prototype/extension-page-harness.html` → `chrome-extension://<id>/tests/uat/extension-page-harness.html`.
- `tests/uat/extension-page-harness.ts`: update the file-header docstring's path reference from the prototype path to the production path. Keep the architectural narrative + research findings intact.
- `tests/uat/a6.test.ts`: update the `harnessUrl` constant (line ~176): `chrome-extension://${extensionId}/tests/uat/extension-page-harness.html` (drop `/prototype/`).
- After moves + comment updates + config update: `npm run build:test` exits 0 + emits `dist-test/tests/uat/extension-page-harness.html` (or whatever path crxjs picks; verify by `ls dist-test/`). Run `npx tsx tests/uat/a6.test.ts` → exits 0 with "A6 result: PASS" (5/5 checks GREEN).
- Full vitest suite: 89 GREEN (no unit-test regression — the moves don't touch any vitest-discovered files).
- `npx tsc --noEmit` exit 0.
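The main risk in this wave is that the URL the driver opens and the path the build emits drift apart. A minimal sketch of that cross-check, assuming the build output mirrors the source path under `dist-test/` (the plan notes this still has to be verified by inspection). `harnessUrl` and `emittedPathFor` are illustrative helpers, the extension id is a placeholder, and the demo uses a temp directory in place of a real build.

```typescript
import { existsSync, mkdirSync, mkdtempSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

const HARNESS_PAGE = 'tests/uat/extension-page-harness.html';

// Build the extension-internal URL the Puppeteer driver navigates to.
function harnessUrl(extensionId: string, pagePath: string = HARNESS_PAGE): string {
  return `chrome-extension://${extensionId}/${pagePath}`;
}

// Map an extension page URL back to the file expected under the build root.
function emittedPathFor(url: string, distRoot: string): string {
  const match = /^chrome-extension:\/\/[^/]+\/(.+)$/.exec(url);
  if (match === null) {
    throw new Error(`not an extension page URL: ${url}`);
  }
  return join(distRoot, ...match[1].split('/'));
}

// Demo: fake build output containing only the promoted harness page.
const fakeDist = mkdtempSync(join(tmpdir(), 'dist-test-'));
mkdirSync(join(fakeDist, 'tests', 'uat'), { recursive: true });
writeFileSync(join(fakeDist, 'tests', 'uat', 'extension-page-harness.html'), '<!doctype html>');

const placeholderId = 'abcdefghijklmnop'; // hypothetical extension id
const promotedUrl = harnessUrl(placeholderId);
const staleUrl = harnessUrl(placeholderId, 'tests/uat/prototype/extension-page-harness.html');

const promotedExists = existsSync(emittedPathFor(promotedUrl, fakeDist));
const staleExists = existsSync(emittedPathFor(staleUrl, fakeDist));
console.log({ promotedExists, staleExists });
```

Running the same check against the real `dist-test/` before launching Chrome turns a confusing navigation failure into an immediate, readable error.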
1. `git mv tests/uat/prototype/extension-page-harness.html tests/uat/extension-page-harness.html`
2. `git mv tests/uat/prototype/extension-page-harness.ts tests/uat/extension-page-harness.ts`
3. `git mv tests/uat/prototype/a6.test.ts tests/uat/a6.test.ts`
4. After moves, `ls tests/uat/prototype/` should be empty. If empty, the directory is implicitly removed by git on next commit; no explicit `rmdir` needed.
5. Edit `tests/uat/extension-page-harness.html`:
- Update the HTML preamble line (line 9) referencing the file path: change `/tests/uat/prototype/extension-page-harness.html` → `/tests/uat/extension-page-harness.html`.
- Update the page title to drop "(extension-internal page)" if redundant; keep it for clarity per project verbosity style. (Planner discretion: keep the existing title or refine.)
6. Edit `tests/uat/extension-page-harness.ts`:
- File-header docstring: change "Plan 01-11 PROTOTYPE" → "Plan 01-13 production UAT harness (inherited from 01-11 prototype c647f61 per 01-11-SUMMARY architectural pivot)".
- Update the path reference in the docstring from `tests/uat/prototype/extension-page-harness.html` to `tests/uat/extension-page-harness.html`.
- Keep ALL the existing assertA6 implementation, the helper functions (waitFor, sendMessageWithTimeout, ensureOffscreen, startRecording, offscreenQuery, getActiveNotificationCount), the architectural-finding comment block, and the global Window augmentation. These are the load-bearing code; do not modify their logic.
- The `window.__mokoshHarness` install at the bottom should already only expose `assertA6` — leave as-is; Wave 3 will extend it.
7. Edit `tests/uat/a6.test.ts`:
- File-header docstring: "Plan 01-11 PROTOTYPE" → "Plan 01-13 standalone A6 entry point for TDD iteration".
- Update `harnessUrl` constant (line ~176): drop `/prototype/`.
- Keep ALL the puppeteer launch + page + result-print + main entry logic. These are the load-bearing test plumbing.
8. Edit `vite.test.config.ts`:
- Replace `prototype_harness: 'tests/uat/prototype/extension-page-harness.html'` with `extension_page_harness: 'tests/uat/extension-page-harness.html'`.
- Update the surrounding comment to reflect the new path + rename.
- Preserve the `modulePreload: { polyfill: false }` line (CRITICAL SW FIX per 01-11-SUMMARY).
9. Run `npm run build:test` → exits 0; verify `ls dist-test/` shows the harness HTML emitted under the expected path (likely `dist-test/tests/uat/extension-page-harness.html` per crxjs conventions; the exact path is verified by inspection).
10. Run `npx tsx tests/uat/a6.test.ts` → exits 0 with PASS report. (If FAIL: triage immediately — the move broke something. Most likely culprit: the harness page can't load because the rollup emission path differs from the URL the test fetches; cross-check `ls dist-test/` against the URL in a6.test.ts:176 and align.)
11. Run `npx tsc --noEmit` → exit 0.
12. Run `npx vitest run --reporter=dot` → 89 GREEN.
13. Run `npm run build && grep -rln '__mokoshTest\|installFakeDisplayMedia\|dispatchEndedOnTrack' dist/ | wc -l` → 0 (Tier-1 grep gate stays GREEN; the moves don't touch production code).
14. Commit atomically: `feat(01-13): wave-1 — promote c647f61 prototype to production paths; A6 GREEN`. Commit body: lists each file move, the comment updates, the vite config update, and the verification that A6 still passes 5/5 from the new path.
npm run build:test && npx tsc --noEmit && npx tsx tests/uat/a6.test.ts && npx vitest run --reporter=dot && npm run build && test "$(grep -rln '__mokoshTest\|installFakeDisplayMedia\|dispatchEndedOnTrack' dist/ 2>/dev/null | wc -l)" = "0"
- `tests/uat/extension-page-harness.html` + `tests/uat/extension-page-harness.ts` + `tests/uat/a6.test.ts` exist at production paths; comments updated to reference 01-13.
- `tests/uat/prototype/` is empty/removed.
- `vite.test.config.ts` `rollupOptions.input.extension_page_harness` points at the new path.
- `npx tsx tests/uat/a6.test.ts` exits 0 with "A6 result: PASS" + 5/5 checks GREEN.
- `npm run build:test` exit 0; `npm run build` exit 0; production grep gate stays GREEN.
- `npx tsc --noEmit` exit 0; `npx vitest run` 89 GREEN.
- Commit message follows Mark's style.
Prototype promoted to production paths; A6 functional; baseline preserved; ready for Wave 2 driver scaffolding.
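The triage hint in step 10 (cross-check what `dist-test/` actually contains against the URL the test opens) can be done mechanically. The following is a minimal sketch, not part of the prototype: `findEmitted` and `harnessUrl` are hypothetical helper names, and the emitted layout is whatever the build produces.

```typescript
import { readdirSync } from 'node:fs';
import { join, relative, sep } from 'node:path';

// Hypothetical helper: walk a build output directory and return every path
// (relative to that directory, forward slashes) whose basename matches.
export function findEmitted(distDir: string, basename: string): string[] {
  const hits: string[] = [];
  const walk = (dir: string): void => {
    for (const entry of readdirSync(dir, { withFileTypes: true })) {
      const full = join(dir, entry.name);
      if (entry.isDirectory()) {
        walk(full);
      } else if (entry.name === basename) {
        hits.push(relative(distDir, full).split(sep).join('/'));
      }
    }
  };
  walk(distDir);
  return hits.sort();
}

// Hypothetical helper: the URL a test would open for a given emitted path.
export function harnessUrl(extensionId: string, emittedPath: string): string {
  return `chrome-extension://${extensionId}/${emittedPath}`;
}
```

If `findEmitted('dist-test', 'extension-page-harness.html')` returns a path other than the one hard-coded in `a6.test.ts`, that mismatch is the likely cause of a harness page that fails to load.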
Task 3 (Wave 2): Build out Approach-B harness driver utilities (launch + assertions + harness-page-driver); A6 still GREEN via new driver.
- tests/uat/a6.test.ts (the standalone driver — the model for what launch.ts + harness-page-driver.ts will abstract)
- tests/uat/extension-page-harness.ts (the surface to call via harness-page-driver)
- tests/uat/lib/zip.ts (kept from 01-11; harness-side jszip work — confirm compat)
- tests/uat/lib/test-hook-contract.d.ts (kept from 01-11; type mirror)
- src/test-hooks/offscreen-hooks.ts (the bridge protocol — confirm the harness-page-driver's `evaluate` calls match the offscreen-hooks bridge ops)
tests/uat/lib/launch.ts, tests/uat/lib/assertions.ts, tests/uat/lib/harness-page-driver.ts
- `tests/uat/lib/launch.ts` (NEW): exports `launchHarnessBrowser(options?: { headless?: boolean; downloadsDir?: string }): Promise<HarnessHandles>`, where `HarnessHandles` is `{ browser, extensionId, harnessPage, victimPage, downloadsDir, swConsole, offConsole }`. Implementation mirrors `tests/uat/a6.test.ts` launchChrome + victim/harness page setup verbatim, refactored to a reusable helper. `downloadsDir` defaults to `mkdtempSync(join(tmpdir(), 'mokosh-uat-'))`. Wires Chrome download path via CDP `Browser.setDownloadBehavior` so A5 SAVE_ARCHIVE downloads land in `downloadsDir`. `swConsole`/`offConsole` are accumulating string[] buffers populated by `worker.on('console', ...)` + `browser.on('targetcreated', ...)` (best-effort offscreen attach per prototype pattern).
- `tests/uat/lib/assertions.ts` (REWRITTEN): exports `runAssertion(name, fn, buffers)` (wraps a single assertion with try/catch + diagnostic dump of the console buffers on failure), `assertEqual`/`assertGte`/`assertMatch`/`assertTrue` (structured failure messages; use `node:assert/strict` under the hood), `waitFor(probe, predicate, timeoutMs, description)` (mirrors the prototype's polling primitive verbatim: copy the implementation from extension-page-harness.ts so the host side has its own identical copy, while the page-side copy stays bundled into the harness page). Define `AssertionRecord` + `ConsoleBuffers` types.
- `tests/uat/lib/harness-page-driver.ts` (NEW): exports one driver function per assertion: `driveA1(page)`, `driveA2(page)`, ..., `driveA13(page)`. Each is a thin wrapper around `page.evaluate(() => window.__mokoshHarness.assertXX())` that returns the structured `AssertionResult` (or the extended shape for A5/A12/A13 with `zipBytes`/`webmBytes`). Centralizing this means adding/renaming an assertion = two-file edit (extension-page-harness.ts impl + this driver wrapper) instead of touching every place that calls it.
- Wave 2 ONLY wires `driveA6`. Driver wrappers for A1-A5, A7-A13 are stubbed (`throw new Error('NOT YET IMPLEMENTED — Wave 3<X> wires this')`, where <X> is the bundle letter) so Wave 3 fills them in.
- Rewrite `tests/uat/a6.test.ts` to use `launchHarnessBrowser` + `driveA6` (drops ~80 LoC of plumbing duplicated from launch.ts). The test stays GREEN — same A6 5/5 PASS outcome, but via the shared lib.
- `npx tsc --noEmit` exit 0; `npx tsx tests/uat/a6.test.ts` exit 0 with PASS report.
- `npm run build:test` exit 0; `npm run build` exit 0; Tier-1 grep gate GREEN.
- Full vitest suite: 89 GREEN.
1. Create `tests/uat/lib/launch.ts`:
```typescript
// tests/uat/lib/launch.ts — Plan 01-13 Wave 2.
//
// Approach-B harness launch helper. Inherits the Puppeteer launch +
// victim-page-bringToFront + harness-page-open pattern from the proven
// tests/uat/a6.test.ts prototype (commit c647f61). Refactored into a
// reusable helper so Wave 3's 13 assertion drivers share the same
// setup overhead (one Chrome launch + one harness page + one victim
// page per `npm run test:uat` run).
//
// Architectural commitments (per 01-11-SUMMARY):
// - Drive Chrome FROM INSIDE: harnessPage runs at
// chrome-extension://<id>/tests/uat/extension-page-harness.html
// with full chrome.* API access.
// - victimPage is about:blank brought to front so production
// chrome.tabs.query({active:true}) sees a real tab (the harness
// page itself is a chrome-extension:// URL with no .url surfaced
// without `tabs` permission — workaround for the missing-permission
// gap; flagged for Phase 5 hardening).
// - Downloads land in a per-run tmp dir (mkdtempSync) so A5 polling
// does not collide with operator downloads.
// - SW + offscreen consoles forwarded to swConsole/offConsole
// accumulating buffers (best-effort; offscreen attach via
// targetcreated listener — opportunistic per prototype pattern).
//
// References:
// - puppeteer.launch options: https://pptr.dev/api/puppeteer.launchoptions
// - CDP Browser.setDownloadBehavior:
// https://chromedevtools.github.io/devtools-protocol/tot/Browser/#method-setDownloadBehavior
import { mkdtempSync, existsSync, statSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { dirname, join, resolve as resolvePath } from 'node:path';
import { fileURLToPath } from 'node:url';
import puppeteer, { type Browser, type Page } from 'puppeteer';
const HARNESS_FILE_DIR = dirname(fileURLToPath(import.meta.url));
const REPO_ROOT = resolvePath(HARNESS_FILE_DIR, '..', '..', '..');
const DIST_TEST_DIR = resolvePath(REPO_ROOT, 'dist-test');
export interface HarnessHandles {
browser: Browser;
extensionId: string;
harnessPage: Page;
victimPage: Page;
downloadsDir: string;
swConsole: string[];
offConsole: string[];
}
export interface LaunchOptions {
headless?: boolean;
downloadsDir?: string;
}
export async function launchHarnessBrowser(opts: LaunchOptions = {}): Promise<HarnessHandles> {
// ... implementation per the a6.test.ts pattern, refactored.
// 1. assertBundlePresent() — fail loudly if dist-test/ missing.
// 2. puppeteer.launch with enableExtensions + protocolTimeout + args.
// 3. resolve extensionId from browser.extensions() (poll up to 5s).
// 4. mkdtempSync the downloadsDir (if not provided).
// 5. open victimPage about:blank + bringToFront.
// 6. open harnessPage at chrome-extension://<id>/tests/uat/extension-page-harness.html.
// 7. page.waitForFunction for window.__mokoshHarness presence (5s timeout).
// 8. wire SW console listener (worker.on('console', ...)) into swConsole buffer.
// 9. wire offscreen console listener via browser.on('targetcreated', ...) opportunistically.
// 10. configure Chrome to use downloadsDir via CDP Browser.setDownloadBehavior on harnessPage's CDPSession.
// 11. return HarnessHandles.
}
```
Implement the function body per the in-comment plan. Extract verbatim from `tests/uat/a6.test.ts` lines 60-265 (the launch + victim + harness setup + console wiring blocks). Add the CDP Browser.setDownloadBehavior call (NEW — not in prototype which doesn't need downloads). Use absolute imports per project style; extensive docstrings; named callbacks for the on('console') / on('targetcreated') listeners.
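Two of the smaller pieces named above (the default `downloadsDir` and the named console listeners) can be sketched without a browser. This is a hedged illustration: `resolveDownloadsDir`, `makeConsoleAppender`, the `ConsoleLike` shape, and the `[source:type] text` line format are all assumptions made for the example, not taken from the prototype.

```typescript
import { mkdtempSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Structural stand-in for the two ConsoleMessage accessors used here.
// Assumption: the real listener receives an object with type() and text().
interface ConsoleLike {
  type(): string;
  text(): string;
}

// Returns the caller-supplied directory, or creates a fresh per-run temp dir
// with the 'mokosh-uat-' prefix described in the plan.
export function resolveDownloadsDir(provided?: string): string {
  return provided ?? mkdtempSync(join(tmpdir(), 'mokosh-uat-'));
}

// Builds a NAMED listener (not an inline arrow) that appends one formatted
// line per console message to the given accumulating buffer.
export function makeConsoleAppender(buffer: string[], source: string) {
  return function onConsoleMessage(msg: ConsoleLike): void {
    buffer.push(`[${source}:${msg.type()}] ${msg.text()}`);
  };
}
```

Under these assumptions, the SW wiring in step 8 would read roughly `worker.on('console', makeConsoleAppender(swConsole, 'sw'))`, with the offscreen buffer wired the same way once its target appears.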
2. Create `tests/uat/lib/assertions.ts`:
```typescript
// tests/uat/lib/assertions.ts — Plan 01-13 Wave 2.
// Host-side assertion primitives. Re-exports of node:assert/strict
// with structured failure messages + diagnostic-dump wrappers.
//
// NO chrome.* helpers — all chrome.* work happens inside the
// extension-internal harness page (see tests/uat/extension-page-harness.ts).
// This module is host-side ONLY.
import * as assert from 'node:assert/strict';
export interface CheckRecord {
name: string;
expected: unknown;
actual: unknown;
passed: boolean;
}
export interface AssertionRecord {
passed: boolean;
name: string;
checks: CheckRecord[];
diagnostics: string[];
error?: string;
}
export interface ConsoleBuffers {
swConsole: string[];
offConsole: string[];
}
export async function runAssertion(
name: string,
fn: () => Promise<AssertionRecord>,
buffers: ConsoleBuffers,
): Promise<AssertionRecord> { /* ... try/catch + diagnostic dump ... */ }
export function assertEqual(actual: unknown, expected: unknown, msg: string): void { /* assert.deepStrictEqual wrapper */ }
export function assertGte(actual: number, expected: number, msg: string): void { /* ... */ }
export function assertMatch(actual: string, regex: RegExp, msg: string): void { /* ... */ }
export function assertTrue(cond: boolean, msg: string): void { /* ... */ }
export async function waitFor<T>(
probe: () => Promise<T> | T,
predicate: (v: T) => boolean,
timeoutMs: number,
description: string,
): Promise<T> { /* mirrors prototype's waitFor verbatim — poll every 100ms */ }
```
Implement per the surface description. Extract `waitFor` verbatim from `tests/uat/extension-page-harness.ts`'s implementation (lines ~84-103). The host-side `waitFor` and the harness-page-side `waitFor` will be IDENTICAL implementations — that's fine; the page-side is bundled into the harness HTML, the host-side runs in the Node process. No shared module between them.
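For orientation, here is one way the two elided bodies (`waitFor` and `runAssertion`) could look. It is a sketch under stated assumptions, not the prototype's code: the 100ms interval comes from the surface comment above, while the error wording, the 20-line console tail, and the `[sw]`/`[off]` prefixes are invented for illustration.

```typescript
export interface CheckRecord { name: string; expected: unknown; actual: unknown; passed: boolean }
export interface AssertionRecord {
  passed: boolean;
  name: string;
  checks: CheckRecord[];
  diagnostics: string[];
  error?: string;
}
export interface ConsoleBuffers { swConsole: string[]; offConsole: string[] }

const POLL_INTERVAL_MS = 100;

// Polls probe() until predicate() accepts the value or the timeout elapses.
export async function waitFor<T>(
  probe: () => Promise<T> | T,
  predicate: (v: T) => boolean,
  timeoutMs: number,
  description: string,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  let last = await probe();
  while (!predicate(last)) {
    if (Date.now() >= deadline) {
      throw new Error(`waitFor timed out after ${timeoutMs}ms: ${description} (last: ${String(last)})`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, POLL_INTERVAL_MS));
    last = await probe();
  }
  return last;
}

// Runs one assertion; a thrown error becomes a failed record, and any failure
// gets the tail of both console buffers appended as diagnostics.
export async function runAssertion(
  name: string,
  fn: () => Promise<AssertionRecord>,
  buffers: ConsoleBuffers,
): Promise<AssertionRecord> {
  let record: AssertionRecord;
  try {
    record = await fn();
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    record = { passed: false, name, checks: [], diagnostics: [], error: message };
  }
  if (record.passed) return record;
  const tail = (label: string, lines: string[]): string[] =>
    lines.slice(-20).map((line) => `${label} ${line}`);
  return {
    ...record,
    diagnostics: [
      ...record.diagnostics,
      ...tail('[sw]', buffers.swConsole),
      ...tail('[off]', buffers.offConsole),
    ],
  };
}
```

Converting a throw into a failed record means a `NOT YET IMPLEMENTED` stub surfaces as an ordinary failure with its message preserved in `error`.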
3. Create `tests/uat/lib/harness-page-driver.ts`:
```typescript
// tests/uat/lib/harness-page-driver.ts — Plan 01-13 Wave 2.
// Driver wrappers — one per assertion. Each wraps a
// page.evaluate(() => window.__mokoshHarness.assertXX()) call.
//
// Wave 2 wires driveA6 (the proven assertion from c647f61).
// Wave 3 wires driveA1..A5, A7..A13 (replaces NOT YET IMPLEMENTED stubs).
import type { Page } from 'puppeteer';
import type { AssertionRecord } from './assertions';
// For A5/A12/A13 the page side returns extra fields beyond AssertionRecord:
export interface AssertionWithBytes {
passed: boolean;
name: string;
checks: Array<{ name: string; expected: unknown; actual: unknown; passed: boolean }>;
diagnostics: string[];
error?: string;
bytesBase64?: string;
expectedVersion?: string;
}
export async function driveA6(page: Page): Promise<AssertionRecord> {
return page.evaluate(async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any -- browser context
const r = await (window as any).__mokoshHarness.assertA6();
return r;
}) as Promise<AssertionRecord>;
}
export async function driveA1(page: Page): Promise<AssertionRecord> {
throw new Error('NOT YET IMPLEMENTED — Wave 3A wires this');
}
// ... similarly: driveA2, driveA3, driveA4, driveA5, driveA7, driveA8, driveA9, driveA10, driveA11, driveA12, driveA13
```
Each Wave-3 stub throws "NOT YET IMPLEMENTED — Wave 3<X> wires this" where <X> is the bundle letter (A/B/C/D).
4. Rewrite `tests/uat/a6.test.ts` to use the new lib:
```typescript
import { launchHarnessBrowser } from './lib/launch';
import { driveA6 } from './lib/harness-page-driver';
import { runAssertion } from './lib/assertions';
async function main(): Promise<number> {
const handles = await launchHarnessBrowser();
try {
const result = await runAssertion('A6 — Bug B canonical', () => driveA6(handles.harnessPage), {
swConsole: handles.swConsole,
offConsole: handles.offConsole,
});
// ... pretty-print + exit code 0 on PASS, 1 on FAIL ...
} finally {
await handles.browser.close();
}
}
const code = await main();
process.exit(code);
```
Preserve the printResult helper from the original (or move it into lib/assertions.ts as a shared `printAssertionResult` function — planner discretion; planner recommends moving it to lib for Wave 3 reuse).
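If the helper does move into the lib, a pure formatter keeps it easy to reuse in Wave 3. The sketch below is illustrative only: `formatAssertionResult` and `exitCodeFor` are hypothetical names, and the line layout is an assumption rather than the prototype's actual output.

```typescript
interface CheckRecord { name: string; expected: unknown; actual: unknown; passed: boolean }
interface AssertionRecord {
  passed: boolean;
  name: string;
  checks: CheckRecord[];
  diagnostics: string[];
  error?: string;
}

// Renders a record as plain text: a summary line, then one line per check,
// then the error (if any) and each diagnostic.
export function formatAssertionResult(r: AssertionRecord): string {
  const passedChecks = r.checks.filter((c) => c.passed).length;
  const lines: string[] = [
    `${r.name} result: ${r.passed ? 'PASS' : 'FAIL'} (${passedChecks}/${r.checks.length} checks)`,
  ];
  for (const c of r.checks) {
    const mark = c.passed ? 'ok' : 'FAIL';
    lines.push(
      `  [${mark}] ${c.name}: expected ${JSON.stringify(c.expected)}, actual ${JSON.stringify(c.actual)}`,
    );
  }
  if (r.error !== undefined) lines.push(`  error: ${r.error}`);
  for (const d of r.diagnostics) lines.push(`  diag: ${d}`);
  return lines.join('\n');
}

// Maps a record to the process exit code used by the entry points.
export function exitCodeFor(r: AssertionRecord): number {
  return r.passed ? 0 : 1;
}
```

Returning a string instead of printing directly lets both `a6.test.ts` and the orchestrator decide where the output goes.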
5. Run `npx tsc --noEmit` → exit 0 (the new lib files typecheck against puppeteer + node types).
6. Run `npx tsx tests/uat/a6.test.ts` → exits 0 with "A6 result: PASS 5/5" (the rewrite is behavior-preserving).
7. Run `npm run build` → exit 0; `grep -rln 'launchHarnessBrowser\|driveA6\|runAssertion' dist/ | wc -l` → 0 (lib files are tests-only, not bundled into dist/).
8. Run `npm run build:test` → exit 0; the lib files are NOT bundled (they're host-side; vite-test-config only includes the extension-page-harness.html as rollup input).
9. Run `npx vitest run --reporter=dot` → 89 GREEN.
10. Commit atomically (or as 3-4 sub-commits — planner discretion):
- `feat(01-13): wave-2 — launchHarnessBrowser + assertions + harness-page-driver scaffolding` (single commit recommended; the three files form one coherent unit).
Commit body: lists each new file's surface; documents the a6.test.ts rewrite as behavior-preserving; cites Wave 3 wiring contract (`driveAXX` stubs throw "NOT YET IMPLEMENTED — Wave 3<X> wires this").
npx tsc --noEmit && npx tsx tests/uat/a6.test.ts && npm run build && test "$(grep -rln 'launchHarnessBrowser\|driveA6\|runAssertion' dist/ 2>/dev/null | wc -l)" = "0" && npm run build:test && npx vitest run --reporter=dot
- `tests/uat/lib/launch.ts` exists with `launchHarnessBrowser` per the surface description; uses CDP Browser.setDownloadBehavior for downloads dir.
- `tests/uat/lib/assertions.ts` exists with `runAssertion`, `assertEqual`/`Gte`/`Match`/`True`, `waitFor`, and `AssertionRecord`/`ConsoleBuffers` types.
- `tests/uat/lib/harness-page-driver.ts` exists with `driveA6` wired + 12 Wave-3 stubs throwing "NOT YET IMPLEMENTED — Wave 3<X> wires this".
- `tests/uat/a6.test.ts` rewritten to use the new lib; PASSES 5/5.
- `npx tsc --noEmit` exit 0; `npx tsx tests/uat/a6.test.ts` exit 0; full vitest 89 GREEN.
- `npm run build` exit 0; production bundle does NOT contain any of the new lib symbol names (Tier-1 grep gate GREEN).
- Commit message follows Mark's style.
Approach-B driver scaffolding live; A6 still PASSES through the new lib; Wave 3 stubs ready to be filled in.
Task 4 (Wave 3A): Wire A1+A2+A3+A4 (SW bootstrap + toolbar onClicked + displaySurface monitor + popup during recording); + create harness.test.ts orchestrator with A0 grep gate.
- tests/uat/extension-page-harness.ts (the surface where A1-A4 impl lands)
- tests/uat/lib/harness-page-driver.ts (the driver stubs to wire)
- tests/uat/lib/launch.ts (HarnessHandles shape — what the orchestrator gets)
- tests/uat/lib/assertions.ts (runAssertion + printAssertionResult)
- src/background/index.ts lines 75-108 (state machine — A1+A2+A4 contract)
- src/background/index.ts lines 411-415 (setRecordingMode call inside startVideoCapture)
- src/background/index.ts lines 844-878 (chrome.action.onClicked + onStartup listener registrations)
- src/offscreen/recorder.ts lines 270-296 (getDisplayMedia + post-grant displaySurface monitor enforcement — A3 contract)
- tests/background/no-test-hooks-in-prod-bundle.test.ts (the grep gate the harness.test.ts A0 re-verifies as belt-and-suspenders)
tests/uat/extension-page-harness.ts, tests/uat/lib/harness-page-driver.ts, tests/uat/harness.test.ts
- Extend `window.__mokoshHarness` with `assertA1`, `assertA2`, `assertA3`, `assertA4` methods, each returning a structured `AssertionResult`.
- A1 (SW bootstrap state): query `chrome.action.getBadgeText({})` (expect ''), `chrome.action.getPopup({})` (expect '' — idle mode per src/background/index.ts:110). isRecording check: send `chrome.runtime.sendMessage({type:'PING_STATE'})` to a NEW handler we add — OR — infer from badge state (badge==='' implies idle implies isRecording=false per the state machine). Recommend the badge-proxy approach (no production code change; the state-machine contract makes badge an accurate proxy). PASSES today.
- A2 (toolbar onClicked → REC): send `START_RECORDING` directly to offscreen (workaround for missing `tabs` permission, per prototype pattern). The production chrome.action.onClicked → startVideoCapture path needs `tabs` permission to query `chrome.tabs.query({active:true,...})` for a real tab; the harness page bypasses this by sending START_RECORDING to offscreen + manually setting badge='REC' + popup=popup.html (mimicking what setRecordingMode would do). Then assert getBadgeText==='REC' + getPopup==='src/popup/index.html'. The contract verified is: when START_RECORDING reaches offscreen, recording starts; the SW-side production state-machine transitions (setBadgeState, setPopup) are tested in unit tests (badge-state-machine.test.ts) and don't need re-verification here. Document the workaround clearly in the assertA2 impl comment + flag for Phase 5 hardening (tabs permission addition).
- A3 (displaySurface monitor): with A2's recording active, read displaySurface via `chrome.runtime.sendMessage({type:'__mokoshOffscreenQuery', op:'get-display-surface'})`. The harness page has no direct access to the offscreen document's stream, so this needs a new bridge op `get-display-surface` in offscreen-hooks.ts, added in THIS task (Wave 3A, not 3D, even though A11 also needs it). The op returns the value of `currentStream.getVideoTracks()[0].getSettings().displaySurface`. Assert === 'monitor' (per src/offscreen/recorder.ts:296 enforcement: production code throws and tears down the stream if observed !== 'monitor', so if recording is live, displaySurface is guaranteed monitor; the assertion confirms the offscreen-hooks fake stream's monkey-patched getSettings() correctly returns 'monitor').
- A4 (popup during recording): with A2's recording active, read getPopup (expect 'src/popup/index.html', as set in A2). Trigger NOTHING that would create a second offscreen document (no second START_RECORDING). Verify that getPopup is unchanged and that `chrome.offscreen.hasDocument()` returns true, i.e. the recording's offscreen document is still present.
- Wire `driveA1`/`driveA2`/`driveA3`/`driveA4` in `tests/uat/lib/harness-page-driver.ts` (replace the NOT YET IMPLEMENTED stubs).
- Create `tests/uat/harness.test.ts` (NEW — was deleted in Wave 0):
```typescript
// tests/uat/harness.test.ts — Plan 01-13 Wave 3.
// Top-to-bottom orchestrator for all 14 assertions (A0 + A1..A13).
// ...
```
Wave 3A wires A0+A1+A2+A3+A4; A5+A7+A8+A9+A10+A11+A12+A13 remain stubs (`throw new Error('NOT YET IMPLEMENTED — Wave 3<X> wires this')`). A6 uses the proven `driveA6` from Wave 2. Bail-on-first-failure; exit 0 only when 14/14 GREEN.
A0 (production-bundle grep gate): pre-flight. Run `npm run build` (or skip via `SKIP_PROD_REBUILD=1`); grep `dist/` for every entry in `FORBIDDEN_HOOK_STRINGS` (the 8 existing hook strings plus the new `get-display-surface` bridge op); assert 0 matches. This runs BEFORE Chrome launches.
- Add bridge op `get-display-surface` to `src/test-hooks/offscreen-hooks.ts` (Wave 3A scope creep, BUT necessary for A3 — alternative is duplicating get-current-stream + .getSettings() work in the harness page which is uglier). Document the addition; update the offscreen-hooks comment block to reflect the protocol expansion.
- Also extend `MokoshTestSurface` in `src/test-hooks/types.ts` to include typed fields `installFakeDisplayMedia?`, `uninstallFakeDisplayMedia?`, `dispatchEndedOnTrack?` so the offscreen-hooks `as MokoshTestSurface & {...}` cross-cast collapses to a clean assignment. (Carries the type-cleanup that Wave 1 didn't get to because Wave 1 was move-only.)
- Update `tests/uat/lib/test-hook-contract.d.ts` to mirror the type extension.
- Tier-1 grep gate: ensure `dist/` stays clean of the new bridge op string `get-display-surface` (add to FORBIDDEN_STRINGS list).
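- After this task: `npm run test:uat` exits non-zero. With the orchestrator loop as sketched (drivers run in order, bail on first failure), the run stops at the A5 stub, so the reported tally is "5/14 passed: A0, A1, A2, A3, A4 GREEN; stopped at A5 (NOT YET IMPLEMENTED)". A6 is not reached inside the orchestrator at this stage; it still PASSES via the standalone `tests/uat/a6.test.ts` entry point, which keeps Bug B coverage live until Wave 3B.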
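The A1 contract above (badge empty, popup unset, isRecording inferred from the badge) can be sketched with the `chrome.action` calls injected, so the logic runs outside the browser. This is an assumption-laden illustration: the `ActionApi` interface, the check names, and the injection itself are hypothetical; in the harness page the argument would be `chrome.action`.

```typescript
interface CheckRecord { name: string; expected: unknown; actual: unknown; passed: boolean }
interface AssertionResult {
  passed: boolean;
  name: string;
  checks: CheckRecord[];
  diagnostics: string[];
  error?: string;
}

// Structural subset of chrome.action used by A1 (promise-returning forms).
interface ActionApi {
  getBadgeText(details: { tabId?: number }): Promise<string>;
  getPopup(details: { tabId?: number }): Promise<string>;
}

function check(name: string, expected: unknown, actual: unknown): CheckRecord {
  return { name, expected, actual, passed: Object.is(expected, actual) };
}

export async function assertA1(action: ActionApi): Promise<AssertionResult> {
  const name = 'A1: SW bootstrap state';
  try {
    const badge = await action.getBadgeText({});
    const popup = await action.getPopup({});
    const checks = [
      check('A1.1 badge text is empty', '', badge),
      check('A1.2 popup is unset (idle mode)', '', popup),
      // Badge proxy: the state machine shows 'REC' only while recording.
      check('A1.3 isRecording=false (badge proxy)', false, badge === 'REC'),
    ];
    return { passed: checks.every((c) => c.passed), name, checks, diagnostics: [] };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { passed: false, name, checks: [], diagnostics: [], error: message };
  }
}
```

Injecting the API is a choice made for this sketch so it can be exercised with fakes; an implementation that reads `chrome.action` directly satisfies the same contract.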
1. Add `get-display-surface` bridge op to `src/test-hooks/offscreen-hooks.ts`:
Inside the existing `chrome.runtime.onMessage.addListener` block, add a new `if (op === 'get-display-surface')` branch. Returns `{ displaySurface: currentStream?.getVideoTracks()[0]?.getSettings().displaySurface ?? null }`. Document the op in the protocol comment block at lines ~297-303.
2. Extend `MokoshTestSurface` in `src/test-hooks/types.ts`:
Add `installFakeDisplayMedia?: () => void;`, `uninstallFakeDisplayMedia?: () => void;`, `dispatchEndedOnTrack?: () => { ok: boolean; error?: string };` as typed fields. Update the JSDoc to note these are offscreen-only (undefined in SW isolate — but the SW isolate doesn't get hooks in Approach B anyway; the fields are present-but-inert just like the existing handlers fields).
3. Update `tests/uat/lib/test-hook-contract.d.ts` to mirror the type extension.
4. Collapse the cross-cast in `src/test-hooks/offscreen-hooks.ts` lines ~284-288 (the `as MokoshTestSurface & {...}` block) to a clean `as MokoshTestSurface` since the type now includes the methods.
5. Extend `tests/uat/extension-page-harness.ts` `window.__mokoshHarness` with `assertA1`, `assertA2`, `assertA3`, `assertA4` methods. Each follows the assertA6 pattern: AssertionResult shape with `passed`, `name`, `checks[]`, `diagnostics[]`, `error?`. Specifically:
- **assertA1**: queries `chrome.action.getBadgeText({})` + `chrome.action.getPopup({})` + verifies `isRecording=false` via badge-proxy (`badge !== 'REC'` implies isRecording=false). Each check is a CheckRecord. PASS if all 3 checks pass.
- **assertA2**: ensure offscreen + send START_RECORDING + manually setBadge('REC') + setPopup('src/popup/index.html') + waitFor getBadgeText==='REC' + assert popup==='src/popup/index.html'. Document workaround inline (chrome.tabs permission gap).
- **assertA3**: assumes A2 left recording active. Bridge-query `get-display-surface`. Assert === 'monitor'.
- **assertA4**: assumes A2 left recording active. Snapshot getPopup (expect 'src/popup/index.html'). Verify chrome.offscreen.hasDocument === true (recording's offscreen is the only one). No new offscreen creation attempted (the production toolbar-click-during-recording path is no-op per src/background/index.ts:863-866).
6. Wire `driveA1`/`driveA2`/`driveA3`/`driveA4` in `tests/uat/lib/harness-page-driver.ts` (replace stubs).
7. Create `tests/uat/harness.test.ts`:
```typescript
// tests/uat/harness.test.ts — Plan 01-13 Wave 3 orchestrator.
// ...
import { execFileSync } from 'node:child_process';
import { readdirSync, readFileSync, statSync } from 'node:fs';
import { dirname, join, resolve as resolvePath } from 'node:path';
import { fileURLToPath } from 'node:url';
import { launchHarnessBrowser } from './lib/launch';
import { driveA1, driveA2, driveA3, driveA4, driveA5, driveA6, driveA7, driveA8, driveA9, driveA10, driveA11, driveA12, driveA13 } from './lib/harness-page-driver';
import { runAssertion } from './lib/assertions';
// FORBIDDEN_STRINGS used by A0 (mirror of tests/background/no-test-hooks-in-prod-bundle.test.ts inventory):
const FORBIDDEN_HOOK_STRINGS = [
'__mokoshTest', 'setCurrentStream', 'setSegmentCountGetter',
'installFakeDisplayMedia', 'uninstallFakeDisplayMedia',
'dispatchEndedOnTrack', 'getSegmentCount', '__mokoshOffscreenQuery',
'get-display-surface',
];
async function assertA0_GrepGate(): Promise<{passed: boolean; matches: string[]}> {
// Skip prod rebuild if SKIP_PROD_REBUILD=1; otherwise run `npm run build`.
if (process.env.SKIP_PROD_REBUILD !== '1') {
execFileSync('npm', ['run', 'build'], { stdio: 'inherit' });
}
const distDir = resolvePath(dirname(fileURLToPath(import.meta.url)), '..', '..', 'dist');
const matches: string[] = [];
// Recursive grep walk; for each file under dist/, check each forbidden string.
// Implementation per tests/background/no-test-hooks-in-prod-bundle.test.ts pattern.
// ...
return { passed: matches.length === 0, matches };
}
async function main(): Promise<number> {
// Pre-flight A0:
const a0 = await assertA0_GrepGate();
if (!a0.passed) {
console.error(`A0 FAIL: production bundle hook-string leak. Matches:\n${a0.matches.join('\n')}`);
return 1;
}
console.log('A0: GREEN (production bundle hook-free)');
const handles = await launchHarnessBrowser();
const buffers = { swConsole: handles.swConsole, offConsole: handles.offConsole };
const results: Array<{ name: string; passed: boolean; }> = [];
const drivers = [
{ name: 'A1', drive: driveA1 },
{ name: 'A2', drive: driveA2 },
{ name: 'A3', drive: driveA3 },
{ name: 'A4', drive: driveA4 },
{ name: 'A5', drive: driveA5 },
{ name: 'A6', drive: driveA6 },
{ name: 'A7', drive: driveA7 },
{ name: 'A8', drive: driveA8 },
{ name: 'A9', drive: driveA9 },
{ name: 'A10', drive: driveA10 },
{ name: 'A11', drive: driveA11 },
{ name: 'A12', drive: driveA12 },
{ name: 'A13', drive: driveA13 },
];
try {
for (const { name, drive } of drivers) {
try {
const result = await runAssertion(name, () => drive(handles.harnessPage), buffers);
results.push({ name, passed: result.passed });
if (!result.passed) {
// bail-on-first-failure
break;
}
} catch (err) {
// NOT YET IMPLEMENTED is the Wave-stub error; counts as a fail
results.push({ name, passed: false });
break;
}
}
} finally {
await handles.browser.close();
}
const passed = results.filter(r => r.passed).length;
const total = drivers.length + 1; // +1 for A0
console.log(`\nUAT harness: ${passed + 1}/${total} assertions passed`);
return passed === drivers.length ? 0 : 1;
}
const code = await main();
process.exit(code);
```
Implementation per the pseudocode. NO `as any`; absolute imports; extensive comments. The bail-on-first-failure semantics + structured diagnostic dump matches the prototype pattern. The optional `--only=A6` CLI arg (planner's discretion to include or defer to Wave 3D) lets developers run a single assertion for iteration.
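The recursive walk that the A0 pseudocode elides might look like the following. This is a sketch, assuming a plain substring match is what the existing Tier-1 gate does; `findForbiddenStrings` is a hypothetical name and the `path: needle` match format is invented for the example.

```typescript
import { readdirSync, readFileSync } from 'node:fs';
import { join } from 'node:path';

// Walks rootDir recursively and returns one "file: needle" entry for every
// forbidden string found in every regular file.
export function findForbiddenStrings(rootDir: string, forbidden: readonly string[]): string[] {
  const matches: string[] = [];
  const walk = (dir: string): void => {
    for (const entry of readdirSync(dir, { withFileTypes: true })) {
      const full = join(dir, entry.name);
      if (entry.isDirectory()) {
        walk(full);
        continue;
      }
      if (!entry.isFile()) continue;
      // latin1 maps every byte to one character, so binary assets cannot
      // trigger decode errors and ASCII needles still match byte-for-byte.
      const content = readFileSync(full, 'latin1');
      for (const needle of forbidden) {
        if (content.includes(needle)) matches.push(`${full}: ${needle}`);
      }
    }
  };
  walk(rootDir);
  return matches;
}
```

Inside `assertA0_GrepGate` the call would be roughly `findForbiddenStrings(distDir, FORBIDDEN_HOOK_STRINGS)`. Note that `readdirSync` throws if `dist/` is missing, which is the desired loud failure when the build was skipped and never run.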
8. Verify Tier-1 grep gate updates: edit `tests/background/no-test-hooks-in-prod-bundle.test.ts` FORBIDDEN_STRINGS list to add `get-display-surface` (the new bridge op).
9. Run `npm run build` → exit 0; grep gate stays GREEN (the new offscreen-hooks bridge op is gated behind `__MOKOSH_UAT__`; tree-shaken from production).
10. Run `npm run build:test` → exit 0; the offscreen chunk in dist-test/ contains `get-display-surface`.
11. Run `npx tsx tests/uat/harness.test.ts` → exits 1 (Wave 3B+ stubs throw). The first NOT YET IMPLEMENTED stop is A5; because of bail-on-first-failure the loop never reaches A6, so the summary line reads "5/14 assertions passed" (A0+A1+A2+A3+A4). The catch in main() handles the stub error gracefully.
12. Run `npx tsx tests/uat/a6.test.ts` standalone → still exits 0 (5/5 PASS) — proves the standalone iteration entry still works.
13. Run `npx tsc --noEmit` → exit 0.
14. Run `npx vitest run --reporter=dot` → 89 GREEN.
15. RED-on-regression demos (commit body — light-touch since these aren't the canonical TDD demos; those land in 3B+3C):
- A1: locally `chrome.action.setPopup({popup: 'foo.html'})` from a probe before launching harness → A1 should FAIL on the getPopup==='' check. Revert; PASS.
- A2: locally short-circuit START_RECORDING in offscreen → A2 should FAIL with timeout. Revert; PASS.
- A3: locally remove the displaySurface monkey-patch in offscreen-hooks.ts:179-186 → A3 should FAIL (displaySurface is undefined for raw canvas captureStream tracks). Revert; PASS.
- A4: no RED demo; the assertion is essentially a no-op verification.
Document at least 2 of the remaining 3 (A1-A3) in the commit body.
16. Commit atomically: `feat(01-13): wave-3A — A1+A2+A3+A4 + harness orchestrator + A0 grep gate`. Body lists assertions wired, the bridge op addition, the type extension, the harness orchestrator structure, RED demos cited.
npx tsc --noEmit && npm run build && test "$(grep -rln 'get-display-surface' dist/ 2>/dev/null | wc -l)" = "0" && npm run build:test && (set +e; npx tsx tests/uat/harness.test.ts; test $? -ne 0) && npx tsx tests/uat/a6.test.ts && npx vitest run --reporter=dot
- `window.__mokoshHarness` exposes assertA1/A2/A3/A4 (plus the existing assertA6).
- `tests/uat/lib/harness-page-driver.ts` wires driveA1/A2/A3/A4 (driveA6 still wired; A5+A7..A13 stay stubbed).
- `tests/uat/harness.test.ts` exists; A0 + A1 + A2 + A3 + A4 GREEN (= 5/14 reported); A5/A7..A13 throw NOT-YET-IMPLEMENTED; bail-on-first-failure stops at A5, so at this stage A6 is verified by the standalone entry point rather than by the orchestrator.
- `tests/uat/a6.test.ts` standalone still PASSES (5/5).
- `src/test-hooks/offscreen-hooks.ts` adds `get-display-surface` bridge op.
- `src/test-hooks/types.ts` extends MokoshTestSurface with installFakeDisplayMedia / uninstallFakeDisplayMedia / dispatchEndedOnTrack typed fields; offscreen-hooks.ts cross-cast collapsed.
- `tests/uat/lib/test-hook-contract.d.ts` mirrors the type extension.
- `tests/background/no-test-hooks-in-prod-bundle.test.ts` FORBIDDEN_STRINGS list includes `get-display-surface`.
- `npm run build` exit 0; Tier-1 grep gate GREEN (no hook strings in dist/).
- `npm run build:test` exit 0; offscreen chunk in dist-test/ contains the new bridge op.
- `npx tsc --noEmit` exit 0; vitest 89 GREEN.
- At least 2 RED-on-regression demos documented in commit body.
Wave 3A landed: 5/14 GREEN in the orchestrator (bail at the A5 stub) plus A6 GREEN standalone; state-machine + recording + display-surface + popup contracts verified; ready for Wave 3B (Bug B canonical).
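The interaction between driver order and bail-on-first-failure is easy to misjudge: the orchestrator loop visits drivers in order and stops at the first failure, so while `driveA5` is still a stub the loop never reaches `driveA6`. The sketch below reproduces just that control flow with fake drivers; `runBailOnFirst` is a hypothetical name.

```typescript
interface Driver {
  name: string;
  drive: () => Promise<{ passed: boolean }>;
}

// Runs drivers in order; a thrown error or passed=false stops the run.
export async function runBailOnFirst(
  drivers: Driver[],
): Promise<{ passed: string[]; stoppedAt?: string }> {
  const passed: string[] = [];
  for (const { name, drive } of drivers) {
    let ok = false;
    try {
      ok = (await drive()).passed;
    } catch {
      ok = false; // a NOT YET IMPLEMENTED stub lands here
    }
    if (!ok) return { passed, stoppedAt: name };
    passed.push(name);
  }
  return { passed };
}
```

With A1 to A4 passing and A5 throwing, the result is four passed drivers and `stoppedAt: 'A5'`; adding the pre-flight A0 gives the 5/14 tally, regardless of whether A6 would pass.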
Task 5 (Wave 3B): Wire A5+A6+A7 (SAVE_ARCHIVE download + Bug B canonical regression rewind + genuine error path).
- tests/uat/extension-page-harness.ts (the surface where A5/A7 land; A6 already wired)
- tests/uat/lib/harness-page-driver.ts (driveA5/A6/A7 stubs to wire)
- tests/uat/lib/zip.ts (host-side jszip work for A5 archive validation)
- tests/uat/lib/launch.ts (downloadsDir from HarnessHandles)
- src/background/index.ts lines 725-794 (RECORDING_ERROR handler + Bug B routing — A6+A7 contract)
- src/background/index.ts lines 730-734 (SAVE_ARCHIVE handler — A5 contract)
- src/offscreen/recorder.ts lines 489-525 (onUserStoppedSharing — A6's dispatch-ended target)
- .planning/debug/resolved/01-09-recovery-flow.md (Bug B canonical debug record — A6's exact contract)
tests/uat/extension-page-harness.ts, tests/uat/lib/harness-page-driver.ts
- A5 (SAVE_ARCHIVE download): with recording active from A2, send `chrome.runtime.sendMessage({type:'SAVE_ARCHIVE'})`. The SW handler triggers the production save-archive flow (saveArchive in src/background/index.ts:731), which produces the zip via JSZip and calls `chrome.downloads.download(...)`. The download lands in `handles.downloadsDir` (configured at launch via CDP Browser.setDownloadBehavior). The harness page cannot read the downloads dir directly, so the work is split: the page-side assertA5 returns `{passed: true}` once the SW sendMessage resolves (trigger ack only); the host side polls `handles.downloadsDir` for `session_report_*.zip` for up to 15s and reads the bytes from disk via fs.readFileSync. The driveA5 wrapper handles both and returns an AssertionRecord that includes the bytes. (Considered and rejected as more complex: a new SW bridge op `__mokoshSwQuery` with op `save-archive-to-bytes` that runs the same archive creation logic but returns the bytes via sendMessage instead of triggering a download. Keeping the production saveArchive path is simpler.)
- **A6 (BUG B canonical) — ALREADY PROVEN**: leave the existing assertA6 implementation untouched. It works (c647f61 5/5 GREEN). Wave 3B's commit body documents the RED-on-regression demo cycle per the contract.
- A7 (genuine error → ERR + recovery notification): start a fresh recording (A6 stopped it). Snapshot notificationCount via `chrome.notifications.getAll(...)`. Send `chrome.runtime.sendMessage({type:'RECORDING_ERROR', error: 'codec-unsupported'})`. Wait 200ms. Assert: badge='ERR'; popup='src/popup/index.html'; notificationCount delta === 1; the last notification id starts with `mokosh-recovery-`. PASSES today.
- Wire driveA5/A7 (A6 already wired); harness.test.ts orchestrator advances through A5+A6+A7 GREEN (= 8/14 with A0+A1+A2+A3+A4+A5+A6+A7); A8..A13 still stubbed.
- **MANDATORY commit-body documentation: A6 RED-on-regression demo cycle.** The executor LOCALLY (not committed): edits `src/background/index.ts:776` from `if (errorCode === 'user-stopped-sharing')` to `if (false)`. Rebuilds `npm run build:test`. Runs `npm run test:uat` (or `npx tsx tests/uat/a6.test.ts`). A6 FAILS with diagnostic: "A6.1: badge text is '' (NOT 'ERR') after user-stop — expected '', actual 'ERR'". Reverts `git checkout -- src/background/index.ts`. Rebuilds. Re-runs. A6 PASSES 5/5. Documents the exact diagnostic + cycle in the commit body. This is the canonical Bug B regression catch — load-bearing for the plan's success criteria.
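The host-side half of A5 described above (poll `handles.downloadsDir` for `session_report_*.zip`, then read the bytes) could look roughly like the sketch below. It is an illustration only: `findArchive` and `waitForArchiveBytes` are hypothetical names, the 250ms poll interval is an assumption, and it assumes Chrome only gives the file its final `.zip` name once the download has completed.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Archive name pattern quoted in the plan: session_report_*.zip.
const ARCHIVE_PATTERN = /^session_report_.*\.zip$/;

/** Returns the most recently modified matching archive in `dir`, or null if none exists. */
export function findArchive(dir: string): string | null {
  if (!fs.existsSync(dir)) return null;
  const matches = fs
    .readdirSync(dir)
    .filter((name) => ARCHIVE_PATTERN.test(name))
    .map((name) => path.join(dir, name))
    .sort((a, b) => fs.statSync(b).mtimeMs - fs.statSync(a).mtimeMs);
  return matches[0] ?? null;
}

/** Polls until an archive appears or the timeout elapses; resolves to its bytes or null. */
export async function waitForArchiveBytes(
  dir: string,
  timeoutMs = 15_000, // the plan's 15s budget
  intervalMs = 250, // assumed poll interval
): Promise<Buffer | null> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const found = findArchive(dir);
    if (found !== null) return fs.readFileSync(found);
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return null;
}
```

A `null` result maps naturally onto an A5 timeout failure; a Buffer can be base64-encoded into the returned AssertionRecord.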
1. Extend `window.__mokoshHarness` in `tests/uat/extension-page-harness.ts` with `assertA5` and `assertA7` methods.
- **assertA5**: returns `{ passed: boolean; diagnostics: string[]; error?: string; }`. Implementation: ensureOffscreen + startRecording (reuses existing helpers); wait for badge='REC'; send `chrome.runtime.sendMessage({type:'SAVE_ARCHIVE'})` with timeout 15s; on resp.success === true, return `{passed: true, diagnostics: ['saveArchive resp.success=true']}`. The host-side driver does the dir-polling + file-read + zip-bytes capture.
- **assertA7**: standard AssertionResult shape. Implementation: ensure recording fresh (if A6 stopped it, restart via assertA2's helpers — refactor common setup into a shared `setupFreshRecording()` helper inside extension-page-harness.ts). Snapshot notif count via getActiveNotificationCount (existing helper). Send RECORDING_ERROR via chrome.runtime.sendMessage. Wait 200ms. Assert badge='ERR' + popup='src/popup/index.html' + notif delta===1 + last id startsWith 'mokosh-recovery-' (read via `chrome.notifications.getAll`; iterate keys; check the most-recent one — note that Object.keys ordering is not strictly guaranteed but Chrome appends in insertion order in practice; if flaky, use a set-membership check: assert ANY id startsWith the prefix).
2. Wire `driveA5` in `tests/uat/lib/harness-page-driver.ts`:
```typescript
export async function driveA5(page: Page, downloadsDir: string): Promise<AssertionRecord> {
  // Trigger save via page-side method.
  const pageResp = await page.evaluate(async () => {
    const r = await (window as any).__mokoshHarness.assertA5();
    return r;
  });
  if (!pageResp.passed) {
    return { passed: false, name: 'A5', checks: [], diagnostics: pageResp.diagnostics, error: pageResp.error };
  }
  // Host-side: poll downloadsDir for session_report_*.zip.
  // ... using fs.readdirSync + waitFor pattern ...
  // Returns the zipBytes (base64) on success.
}
```
Note: driveA5 signature now takes `downloadsDir` — update harness.test.ts orchestrator to pass it. Or: refactor so all drivers take a `harnessCtx: { page, downloadsDir, ... }` object. Planner discretion; planner recommends the harnessCtx pattern (single arg, future-proof).
3. Wire `driveA7` in `tests/uat/lib/harness-page-driver.ts` (standard one-line page.evaluate wrapper).
4. Update `tests/uat/harness.test.ts` to thread `handles.downloadsDir` into driveA5 (or pass full `harnessCtx`).
5. Run `npm run test:uat` → A0+A1+A2+A3+A4+A5+A6+A7 GREEN (8/14); A8..A13 stubs. Exit non-zero (bail-on-first-failure at A8).
6. Run `npx tsx tests/uat/a6.test.ts` → 5/5 PASS (regression check; A6 unchanged).
7. **EXECUTE the A6 Bug B RED-on-regression demo** (locally, do NOT commit):
- Edit `src/background/index.ts:776`: change `if (errorCode === 'user-stopped-sharing') {` to `if (false) {`.
- `npm run build:test` (rebuild test bundle).
- `npx tsx tests/uat/a6.test.ts`.
- Observe FAIL with diagnostic: "A6.1: badge text is '' (NOT 'ERR') after user-stop — expected '', actual 'ERR'" (and likely the other 3 checks also FAIL).
- `git checkout -- src/background/index.ts` (revert).
- `npm run build:test`.
- `npx tsx tests/uat/a6.test.ts`.
- Observe PASS 5/5.
- CAPTURE the exact diagnostic lines from the FAIL run for the commit body.
8. Run `npx tsc --noEmit` → exit 0.
9. Run `npx vitest run --reporter=dot` → 89 GREEN.
10. Run `npm run build` → grep gate stays GREEN.
11. Commit atomically: `feat(01-13): wave-3B — A5+A6+A7 + Bug B regression rewind demonstrated`. Commit body MUST include the verbatim A6 RED-on-regression cycle (per the contract in the interfaces block "How A6 / A8 RED-on-regression demos work" section). Also notes A5 + A7 wiring + RED-on-regression demos for A5 (locally comment out chrome.downloads.download → A5 FAIL on timeout; revert → PASS) and A7 (locally short-circuit RECORDING_ERROR handler → A7 FAIL; revert → PASS). At least the A6 demo is MANDATORY; A5+A7 are recommended but not blocking.
npx tsc --noEmit && (set +e; npm run test:uat; test $? -ne 0) && npx tsx tests/uat/a6.test.ts && npx vitest run --reporter=dot
- `window.__mokoshHarness` exposes assertA5 + assertA7 (in addition to A1-A4, A6).
- driveA5 + driveA7 wired in harness-page-driver.ts.
- `npm run test:uat` advances through 8/14 GREEN (A0+A1-A7); bails at A8.
- A6 standalone still 5/5 PASS via `npx tsx tests/uat/a6.test.ts`.
- Commit body contains the verbatim A6 RED-on-regression demo cycle (MANDATORY per success criteria).
- `npx tsc --noEmit` exit 0; vitest 89 GREEN; Tier-1 grep gate GREEN.
Bug B canonical regression rewind demonstrably catches a regression; SAVE_ARCHIVE + ERROR-path coverage live; 8/14 GREEN.
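The A7 notification check from step 1 (delta of exactly one, plus the set-membership fallback on the `mokosh-recovery-` prefix) can be kept as a pure function so that key ordering never matters. This is a sketch under assumptions: `checkRecoveryNotification` is a hypothetical helper, and the id lists are assumed to come from the keys of `chrome.notifications.getAll` snapshots taken before and after the RECORDING_ERROR message.

```typescript
// Prefix quoted in the plan for recovery notifications.
const RECOVERY_PREFIX = "mokosh-recovery-";

export interface NotificationCheck {
  passed: boolean;
  diagnostics: string[];
}

/**
 * Compares notification ids captured before and after the error message.
 * Uses set membership rather than "last key", so Object.keys ordering is irrelevant.
 */
export function checkRecoveryNotification(before: string[], after: string[]): NotificationCheck {
  const seen = new Set(before);
  const added = after.filter((id) => !seen.has(id));
  const diagnostics: string[] = [];
  if (added.length !== 1) {
    diagnostics.push(`expected notification delta 1, got ${added.length}`);
  }
  if (!added.some((id) => id.startsWith(RECOVERY_PREFIX))) {
    diagnostics.push(`no new notification id starts with '${RECOVERY_PREFIX}'`);
  }
  return { passed: diagnostics.length === 0, diagnostics };
}
```

Diffing ids instead of counts also gives a useful diagnostic when an unrelated notification fires inside the 200ms window.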
Task 6 (Wave 3C): Wire A8+A9+A10 (Bug A onStartup notification regression rewind + icon file sizes + manifest shape).
- tests/uat/extension-page-harness.ts (the surface where A8/A9/A10 land)
- tests/uat/lib/harness-page-driver.ts (driveA8/A9/A10 stubs)
- src/background/index.ts line 71 (NOTIFICATION_ICON_PATH constant — A8's regression target)
- src/background/index.ts lines 877-898 (chrome.runtime.onStartup handler — A8's trigger target)
- manifest.json (icons + notifications permission — A10 contract)
- icons/icon{16,48,128}.png (file sizes — A9 contract; floors per orchestrator brief: 16→200B, 48→500B, 128→1024B)
tests/uat/extension-page-harness.ts, tests/uat/lib/harness-page-driver.ts
- A8 (BUG A onStartup notification): the challenge is that Approach B has no SW-side handler-capture hook (sw-hooks.ts deleted in Wave 0; Approach A relied on monkey-patching chrome.runtime.onStartup.addListener to capture the handler). Triggering the handler through a new message such as `chrome.runtime.sendMessage({type:'__mokoshTriggerStartup'})` is rejected because it would require a production-side test hook. Workaround: invoke chrome.notifications.create directly from the page with the SAME options the production onStartup handler uses (iconUrl: chrome.runtime.getURL('icons/icon128.png'), title: 'Mokosh ready', type: 'basic'). If chrome.notifications.create RESOLVES (no rejection from Chrome's imageUtil because the icon is valid), the contract is verified. This is the SAME promise-resolution path Bug A would break. CAVEAT: this verifies that Chrome's imageUtil accepts the icon, NOT that the SW onStartup handler runs. The SW handler is unit-tested in tests/background/onstartup-notification.test.ts; the harness's role is end-to-end icon-acceptance verification, which is what Bug A regressed on. Document the workaround prominently. PASSES today.
- A9 (icon file sizes meet floors): `fetch(chrome.runtime.getURL('icons/icon16.png'))` + read content-length (or blob.size). Floors: 16→200B, 48→500B, 128→1024B. Assert each ≥ floor. PASSES today.
- A10 (manifest shape): `chrome.runtime.getManifest()`. Assert: `permissions.includes('notifications')`; `icons['16']`, `icons['48']`, `icons['128']` all defined. Also assert `default_icon` paths (manifest.action.default_icon) match. PASSES today.
- Wire driveA8/A9/A10; harness.test.ts advances to 11/14 GREEN; A11..A13 stubbed.
- **MANDATORY commit-body documentation: A8 Bug A RED-on-regression demo cycle.** Executor LOCALLY (not committed): edits `src/background/index.ts:71` from `const NOTIFICATION_ICON_PATH = 'icons/icon128.png';` to `const NOTIFICATION_ICON_PATH = 'icons/missing.png';`. Rebuilds. Runs `npm run test:uat`. A8 FAILS (Chrome's imageUtil rejects the create → notif count delta=0). Reverts. Rebuilds. Re-runs. A8 PASSES. CAPTURE diagnostic lines for commit body. Alternative regression trigger: truncate `icons/icon128.png` to 0 bytes via `: > icons/icon128.png`, rebuild (then `git checkout -- icons/icon128.png` to restore). Either trigger acceptable. Note: assertA8 passes a literal `icons/icon128.png` URL from the page, so the constant edit turns A8 RED only if the page-side call picks up the same path; if A8 stays GREEN after the constant edit, use the truncation trigger, which changes the file the page-side call loads.
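The A9 floor comparison can be separated from the fetch so it is checkable without a browser. A minimal sketch, assuming the page has already measured each icon via `(await response.blob()).size`; `checkIconFloors` is a hypothetical helper name, and the floors are the ones quoted in the plan.

```typescript
// Floors from the plan: 16 → 200B, 48 → 500B, 128 → 1024B.
const ICON_FLOORS: ReadonlyArray<readonly [number, number]> = [
  [16, 200],
  [48, 500],
  [128, 1024],
];

/**
 * Returns one failure message per icon that is missing or below its floor.
 * `actualBytes` maps icon edge size (16/48/128) to the measured byte size.
 */
export function checkIconFloors(actualBytes: Partial<Record<number, number>>): string[] {
  const failures: string[] = [];
  for (const [size, floor] of ICON_FLOORS) {
    const bytes = actualBytes[size];
    if (bytes === undefined) {
      failures.push(`icon${size}.png: size not measured`);
    } else if (bytes < floor) {
      failures.push(`icon${size}.png: ${bytes}B is below floor ${floor}B`);
    }
  }
  return failures;
}
```

An empty result means A9 passes; the messages double as the assertion diagnostics.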
1. Extend `window.__mokoshHarness` in `tests/uat/extension-page-harness.ts` with `assertA8`, `assertA9`, `assertA10`:
- **assertA8**: snapshot notif count. Call `chrome.notifications.create('mokosh-startup-' + Date.now(), {type:'basic', iconUrl: chrome.runtime.getURL('icons/icon128.png'), title:'Mokosh ready', message:'Click here to start recording your session.', priority:1}, (id) => {...})`. Wait 100ms. Re-snapshot. Assert delta===1. Document workaround inline.
- **assertA9**: for each (16, 200), (48, 500), (128, 1024), `fetch(chrome.runtime.getURL('icons/icon{N}.png'))` + check `(await response.blob()).size` ≥ floor. Or use content-length header.
- **assertA10**: `const m = chrome.runtime.getManifest();` Assert: `m.permissions?.includes('notifications')` true; `m.icons?.['16']`, `['48']`, `['128']` all truthy.
2. Wire driveA8/A9/A10 in harness-page-driver.ts (standard one-line page.evaluate wrappers).
3. Run `npm run test:uat` → 11/14 GREEN (A0+A1-A10); bail at A11.
4. **EXECUTE the A8 Bug A RED-on-regression demo** (locally, do NOT commit) per the behavior description. Capture diagnostic.
5. Run `npx tsc --noEmit` exit 0; vitest 89 GREEN; Tier-1 grep gate GREEN.
6. Commit atomically: `feat(01-13): wave-3C — A8+A9+A10 + Bug A regression rewind demonstrated`. Commit body MUST include the verbatim A8 RED-on-regression cycle.
npx tsc --noEmit && (set +e; npm run test:uat; test $? -ne 0) && npx vitest run --reporter=dot
- assertA8/A9/A10 wired on `window.__mokoshHarness`.
- driveA8/A9/A10 wired in harness-page-driver.ts.
- `npm run test:uat` advances through 11/14 GREEN; bails at A11.
- Commit body contains verbatim A8 Bug A RED-on-regression demo cycle (MANDATORY per success criteria).
- `npx tsc --noEmit` exit 0; vitest 89 GREEN; Tier-1 grep gate GREEN.
Bug A canonical regression rewind demonstrably catches a regression; icon + manifest contracts live; 11/14 GREEN; both Phase-1-escapee bug classes now CI-callable.
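For reference, the A10 manifest-shape contract can be expressed as a pure check over the object returned by `chrome.runtime.getManifest()`. This is a sketch, not the harness implementation: `ManifestLike` and `checkManifestShape` are hypothetical names, and "default_icon paths match" is interpreted here as "equal to the `icons` path for the same size when both are defined", which is an assumption.

```typescript
// Minimal structural view of the manifest fields A10 cares about (hypothetical type).
export interface ManifestLike {
  permissions?: string[];
  icons?: Record<string, string>;
  action?: { default_icon?: Record<string, string> };
}

const REQUIRED_ICON_SIZES = ["16", "48", "128"];

/** Returns a list of contract violations; an empty list means the manifest shape is acceptable. */
export function checkManifestShape(m: ManifestLike): string[] {
  const failures: string[] = [];
  if (!m.permissions?.includes("notifications")) {
    failures.push("permissions does not include 'notifications'");
  }
  for (const size of REQUIRED_ICON_SIZES) {
    const icon = m.icons?.[size];
    if (!icon) {
      failures.push(`icons['${size}'] is not defined`);
      continue;
    }
    // Assumed meaning of "match": same path as the top-level icon of that size.
    const actionIcon = m.action?.default_icon?.[size];
    if (actionIcon !== undefined && actionIcon !== icon) {
      failures.push(`action.default_icon['${size}'] (${actionIcon}) differs from icons['${size}'] (${icon})`);
    }
  }
  return failures;
}
```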
Task 7 (Wave 3D): Wire A11+A12+A13 (35s buffer continuity + ffprobe gate + zip shape); 14/14 GREEN.
- tests/uat/extension-page-harness.ts (the surface where A11/A12/A13 land)
- tests/uat/lib/harness-page-driver.ts (driveA11/A12/A13 stubs)
- tests/uat/lib/zip.ts (host-side jszip work for A13)
- tests/offscreen/webm-playback.test.ts (FFPROBE_BIN constant + skip-gate pattern — A12 mirrors this)
- src/offscreen/recorder.ts (segments array, MAX_SEGMENTS, SEGMENT_DURATION_MS — A11 contract via the existing get-segment-count bridge op)
- src/test-hooks/offscreen-hooks.ts (setSegmentCountGetter wire to add)
- src/background/index.ts lines 730-734 + the saveArchive impl (A13 contract: meta.json version field)
tests/uat/extension-page-harness.ts, tests/uat/lib/harness-page-driver.ts, src/test-hooks/offscreen-hooks.ts, src/offscreen/recorder.ts
- Add `get-segment-count` bridge op to `src/test-hooks/offscreen-hooks.ts` (mirror the existing `dispatch-ended` / `has-stream` ops). Returns `{ count: segmentCountGetter() }`.
- Add the segment-count wire to `src/offscreen/recorder.ts` (gated by __MOKOSH_UAT__): inside startRecording (after the existing setCurrentStream wire at lines ~277-285), add `testHooks?.setSegmentCountGetter(() => segments.length);`. The `segments` module-level array is in scope at recorder.ts:91.
- A11 (35s buffer continuity): start fresh recording. Wait 35 seconds. Query `chrome.runtime.sendMessage({type:'__mokoshOffscreenQuery', op:'get-segment-count'})`. Assert count ≥ 3 (per D-13: 10s segments × MAX_SEGMENTS=3). The 35s wait is real wall-clock time; document the long runtime impact in the commit body. Keepalive: send a periodic `chrome.runtime.sendMessage({type:'OFFSCREEN_READY'})` or similar light query every 20s to keep the SW from going idle (per RESEARCH §2 Pitfall 5).
- A12 (ffprobe gate): trigger SAVE_ARCHIVE (reuse assertA5's helpers). Page-side returns the archive bytes (or success ack). Host-side driveA12 reads the zip, extracts `video/last_30sec.webm` via jszip, writes to a tmpfile, spawns `ffprobe -v error -f matroska -i <tmpfile>` via execFileSync. Asserts exit 0. Skip-gate pattern: if `!existsSync(FFPROBE_BIN)`, print "SKIPPED: ffprobe not available" + return passed=true (mirrors webm-playback.test.ts pattern). The unit-level webm-playback.test.ts gates the same contract; A12 is end-to-end belt + suspenders.
- A13 (zip shape): host-side jszip parse of the zip from A12 (reuse). Assert: `video/last_30sec.webm` entry exists + size > 0. Parse `meta.json`; assert `version === chrome.runtime.getManifest().version` (queried at harness setup or from the page side via `__mokoshHarness.getManifestVersion()`).
- Update `tests/background/no-test-hooks-in-prod-bundle.test.ts` FORBIDDEN_STRINGS list to add `get-segment-count` (new bridge op string). Total: 10 forbidden strings.
- After this task: `npm run test:uat` exits 0 with 14/14 GREEN. Total runtime ~50-90s (dominated by A11's 35s wait + A0's `npm run build` ~10s, skippable via SKIP_PROD_REBUILD=1).
- Production bundle: `grep -rln 'get-segment-count\|setSegmentCountGetter' dist/` → 0 (Tier-1 gate GREEN).
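The A12 skip-gate plus ffprobe invocation described above could be sketched as follows. The ffprobe arguments are the ones quoted in the plan; `probeWebm` and `ProbeResult` are hypothetical names, and the binary location is passed in as a parameter because the actual `FFPROBE_BIN` value lives in webm-playback.test.ts and is not reproduced here.

```typescript
import { execFileSync } from "node:child_process";
import { existsSync } from "node:fs";

export interface ProbeResult {
  passed: boolean;
  skipped: boolean;
  detail: string;
}

/**
 * Runs ffprobe against an extracted webm file.
 * Skips (passed=true) when the binary is absent, mirroring the skip-gate pattern.
 */
export function probeWebm(ffprobeBin: string, webmPath: string): ProbeResult {
  if (!existsSync(ffprobeBin)) {
    return { passed: true, skipped: true, detail: "SKIPPED: ffprobe not available" };
  }
  try {
    // execFileSync throws when the process exits non-zero.
    execFileSync(ffprobeBin, ["-v", "error", "-f", "matroska", "-i", webmPath], { stdio: "pipe" });
    return { passed: true, skipped: false, detail: "ffprobe exit 0" };
  } catch (err) {
    return { passed: false, skipped: false, detail: err instanceof Error ? err.message : String(err) };
  }
}
```

Surfacing `skipped` separately from `passed` lets the orchestrator print the skip notice without counting the assertion as a failure.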
1. Add `get-segment-count` bridge op to `src/test-hooks/offscreen-hooks.ts`:
In the `chrome.runtime.onMessage.addListener` block (after the existing `if (op === 'has-stream')` branch), add:
```typescript
if (op === 'get-segment-count') {
try {
sendResponse({ count: segmentCountGetter() });
} catch (err) {
sendResponse({ count: -1, error: err instanceof Error ? err.message : String(err) });
}
return false;
}
```
Update the protocol-comment block at lines ~297-303 to include the new op.
2. Add segment-count wire to `src/offscreen/recorder.ts`:
Inside startRecording, within the existing `if (__MOKOSH_UAT__) { testHooks?.setCurrentStream(stream); ... }` block (ends at line ~285), the line `testHooks?.setSegmentCountGetter(() => segments.length);` should sit directly after the setCurrentStream wire. Per the planner's read of the file it may already be present at line 284. Verify; if missing, add. Comment per project style.
3. Extend `window.__mokoshHarness` in `tests/uat/extension-page-harness.ts` with `assertA11`, `assertA12`, `assertA13`:
- **assertA11**: ensure fresh recording (helper from Wave 3A). Wait 35000ms with intermittent keepalive pings every 20000ms. Query bridge `get-segment-count`. Assert count ≥ 3.
- **assertA12**: ensure fresh recording. Trigger SAVE_ARCHIVE (reuse). Return only the save-success ack; the host side does the webm extraction, so the page does not need to return `webmBytes`.
- **assertA13**: similar — page returns save-success ack + version metadata; host does zip parsing + meta.json validation.
- Add `getManifestVersion(): string` helper on `__mokoshHarness` for A13.
4. Wire driveA11/A12/A13 in harness-page-driver.ts. driveA12 + driveA13 do host-side fs/jszip/ffprobe work (extract from `handles.downloadsDir`).
5. Update `tests/background/no-test-hooks-in-prod-bundle.test.ts` FORBIDDEN_STRINGS list: add `get-segment-count`. Total inventory: 10 strings.
6. Run `npm run build` → exit 0; `grep -rln 'get-segment-count\|setSegmentCountGetter\|...' dist/` → 0.
7. Run `npm run build:test` → exit 0; offscreen chunk contains new bridge op.
8. Run `npm run test:uat` → exit 0; final line: "UAT harness: 14/14 assertions passed". Runtime ~50-90s.
9. Run `npx tsc --noEmit` exit 0; vitest 89 GREEN.
10. RED-on-regression demos (commit body):
- A11: locally edit `src/offscreen/recorder.ts:52` `SEGMENT_DURATION_MS = 10_000` → `SEGMENT_DURATION_MS = 30_000`; rebuild; A11 FAIL (count=1 not ≥3 after 35s). Revert; PASS.
- A12: locally inject corruption into webm-remux output OR truncate the produced webm in saveArchive to <100 bytes; rebuild; A12 FAIL (ffprobe error). Revert; PASS.
- A13: locally drop `version` field from meta.json writer in saveArchive; rebuild; A13 FAIL. Revert; PASS.
Document at least 1 of the 3 in the commit body.
11. Commit atomically: `feat(01-13): wave-3D — A11+A12+A13 + segment-count bridge; 14/14 GREEN`. Body lists assertions wired, the bridge op + recorder wire additions, the FORBIDDEN_STRINGS update, the total runtime range.
npx tsc --noEmit && npm run build && test "$(grep -rln 'get-segment-count\|setSegmentCountGetter' dist/ 2>/dev/null | wc -l)" = "0" && npm run test:uat && npx vitest run --reporter=dot
- assertA11/A12/A13 wired on `window.__mokoshHarness`; driveA11/A12/A13 wired in harness-page-driver.ts.
- `get-segment-count` bridge op + `setSegmentCountGetter` wire added (offscreen-only, gated).
- `tests/background/no-test-hooks-in-prod-bundle.test.ts` FORBIDDEN_STRINGS list = 10 strings (added `get-segment-count`).
- `npm run test:uat` exit 0; final line: "UAT harness: 14/14 assertions passed".
- `npm run build` exit 0; `grep -rln ... dist/` → 0 (Tier-1 grep gate GREEN).
- `npx tsc --noEmit` exit 0; vitest 89 GREEN.
- At least 1 of A11/A12/A13 RED-on-regression demo documented in commit body.
14-assertion charter complete; harness exits 0 against current bundle; production bundle byte-clean of hook strings; both Phase-1-escapee bug regressions catchable.
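The A13 zip-shape contract (non-empty `video/last_30sec.webm` plus a `meta.json` whose `version` equals the manifest version) can be checked after the host has listed the zip entries and read `meta.json` as text. A minimal sketch; `ZipEntryInfo` and `checkArchiveShape` are hypothetical names, and the jszip parsing that produces the inputs is deliberately left out.

```typescript
// Name and uncompressed size of one zip entry, as gathered by the host-side zip parser.
export interface ZipEntryInfo {
  name: string;
  size: number;
}

/** Returns contract violations for the archive; an empty list means A13 passes. */
export function checkArchiveShape(
  entries: ZipEntryInfo[],
  metaJsonText: string,
  manifestVersion: string,
): string[] {
  const failures: string[] = [];
  const video = entries.find((e) => e.name === "video/last_30sec.webm");
  if (!video) {
    failures.push("video/last_30sec.webm entry is missing");
  } else if (video.size <= 0) {
    failures.push("video/last_30sec.webm is empty");
  }
  let meta: unknown;
  try {
    meta = JSON.parse(metaJsonText);
  } catch {
    failures.push("meta.json is not valid JSON");
    return failures;
  }
  const version = (meta as { version?: unknown } | null)?.version;
  if (version !== manifestVersion) {
    failures.push(`meta.json version ${String(version)} does not equal manifest version ${manifestVersion}`);
  }
  return failures;
}
```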
Task 8 (Wave 4): Append 01-09 amendment block; update STATE.md + ROADMAP.md; final smoke before checkpoint.
- .planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md (find end of file + existing amendment block from commit 9d0313a)
- .planning/STATE.md (Decisions section to append to)
- .planning/ROADMAP.md Phase 1 Plans list (current ends at 01-07; check if 01-08/01-09/01-10/01-11/01-12 entries need to be added — surface the gap if found)
- tests/uat/harness.test.ts (the harness that closes the contract)
.planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md, .planning/STATE.md, .planning/ROADMAP.md
- APPEND to `01-09-PLAN.md` an amendment block at the END (after any existing amendment from 9d0313a). The amendment block:
```
---
## Amendment 2 (Phase 01-stabilize-video-pipeline, 2026-05-18) — Plan 01-13 harness closes Plan 01-09 functional contract
The 2026-05-17 Plan-01-11 amendment block above referenced `npm run test:uat`
as the closure target, but Plan 01-11 pivoted to a spike-then-pivot
(see 01-11-SUMMARY.md commit ba5474c) — the harness never landed under
01-11. Plan 01-13 delivered the harness via Approach B (extension-internal-
page architecture + offscreen-side synthetic MediaStream). The closure
contract from Amendment 1 still applies; this Amendment 2 confirms the
target is now operational:
- **Step 1 (build):** unchanged — `npm run build` must exit 0.
- **Steps 2-13 + 15:** REDIRECTED to `npm run test:uat` (Plan 01-13's
Approach-B harness; 14 assertions A0..A13).
- **Step 14 (brand/design):** RETAINED for operator. The harness verifies
functional contracts (displaySurface, notification fires, badge state
machine, Bug A + Bug B regression catches) but does NOT verify the
human-readable copy is aesthetically correct OR that the badge color
reads cleanly against the operator's OS theme.
**Closure gate:** Plan 01-09 closes when `npm run test:uat` exits 0 (14/14
GREEN, verified by Plan 01-13 Task 7) AND operator confirms step 14
(brand/design) via Plan 01-13 Task 9.
```
- APPEND to `STATE.md` Decisions section (after the most recent entry):
```
- [Phase 01-13]: Approach-B UAT harness landed (14/14 GREEN). Inherits 01-11 spike-pivot rationale. Plan 01-09 functional contract closes via `npm run test:uat`. Tier-1 grep gate forbidden-string inventory expanded to 10 hook strings covering the Approach-B surface (__mokoshTest, setCurrentStream, setSegmentCountGetter, installFakeDisplayMedia, uninstallFakeDisplayMedia, dispatchEndedOnTrack, getSegmentCount, __mokoshOffscreenQuery, get-display-surface, get-segment-count). Standalone A6 entry at `tests/uat/a6.test.ts` for quick TDD iteration; orchestrated 14-assertion run via `tests/uat/harness.test.ts`. Operator role reduced to step 14 (brand/design) of original 01-09 Task 5.
```
- APPEND to `ROADMAP.md` Phase 1 Plans list. Current list (per inspection) ends at:
```
- [x] 01-07-PLAN.md — Manual smoke + ffprobe D-12 acceptance gate ...
```
Plans 01-08, 01-09, 01-10, 01-11, 01-12 entries are MISSING from ROADMAP.md (planner-detected gap). Wave 4 surfaces this gap to the orchestrator — does NOT silently inject 5 plan entries (out of scope for the 01-13 plan execution). Wave 4 appends ONLY the 01-13 entry:
```
- [x] 01-13-PLAN.md — UAT harness via Approach B (14 assertions; inherits 01-11 spike-pivot; Plan 01-09 functional closure)
```
Add a flag in the commit body: "ROADMAP.md Phase 1 Plans list is missing entries for 01-08, 01-09, 01-10, 01-11, 01-12 — orchestrator should address as separate cleanup; out of scope for 01-13."
- Final smoke: `npm run test:uat` → exit 0 (14/14 GREEN); `npx vitest run` → 89 GREEN; `npm run build` → exit 0.
1. Read `01-09-PLAN.md` end (verify the 9d0313a amendment block exists; if it doesn't, the previous amendment may have been folded into a different file — note it in the commit body but proceed with appending the Amendment-2 block).
2. Append the Amendment-2 block per behavior. Use the same `---` separator + `## Amendment N` heading pattern.
3. Read `STATE.md` Decisions section (lines 72-108 per inspection). Append the new entry after the most recent entry (currently the `[Phase 01-07-deferred-to-5]` line).
4. Read `ROADMAP.md` Phase 1 Plans list (lines 73-80 per inspection). Append the 01-13 entry. Surface the gap (01-08..01-12 missing) in the commit body.
5. Run `npm run test:uat` → exit 0 (final smoke).
6. Run `npm run build` → exit 0; Tier-1 grep gate GREEN.
7. Run `npx vitest run --reporter=dot` → 89 GREEN.
8. Run `npx tsc --noEmit` → exit 0.
9. Commit atomically: `docs(01-13): wave-4 — 01-09 amendment + STATE/ROADMAP updates; harness closes 01-09 functional contract`. Body: amendment text, STATE decision, ROADMAP append, ROADMAP gap surfaced.
npx tsc --noEmit && grep -q 'Plan 01-13 harness closes Plan 01-09 functional contract' .planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md && grep -q 'Approach-B UAT harness landed' .planning/STATE.md && grep -q '01-13-PLAN.md' .planning/ROADMAP.md && npm run test:uat && npx vitest run --reporter=dot
- `01-09-PLAN.md` ends with the Amendment-2 block.
- `STATE.md` Decisions section carries the 01-13 entry as the last item.
- `ROADMAP.md` Phase 1 Plans list contains `01-13-PLAN.md` entry; commit body surfaces the 01-08..01-12 gap.
- `npm run test:uat` exit 0 (14/14 GREEN).
- `npx tsc --noEmit` exit 0; vitest 89 GREEN; Tier-1 grep gate GREEN.
- Commit message follows Mark's style.
01-09 redirected to harness; STATE + ROADMAP updated; ROADMAP gap flagged for orchestrator; ready for closing checkpoint.
Task 9 (Wave 4): Operator confirms `npm run test:uat` exits 0 against current bundle AND confirms brand/design step 14 — closes Plan 01-09 + Plan 01-13.
(operator-driven; no files modified by this checkpoint)
See <how-to-verify> below — operator-driven empirical check. The executor must NOT bypass this checkpoint by stubbing harness output.
echo "checkpoint:human-verify — see how-to-verify section; resume signal is the gate"
Operator types "approved" after running the how-to-verify steps. See <resume-signal> for the exact gate.
Tasks 1-8 landed: Approach-A artifacts cleaned, c647f61 prototype promoted to production paths, Approach-B driver scaffolding rebuilt, all 14 assertions wired across Waves 3A-3D, 14/14 GREEN against current Plan 01-08/01-09 bundle (Bug B fix b9eeeeb + Bug A fix a881bf0 both verified by canonical RED-on-regression demos). Plan 01-09 Task 5 amended (Amendment 2) to point at `npm run test:uat` for functional steps. This checkpoint validates the harness end-to-end against real Chrome AND captures operator's brand/design acceptance for Plan 01-09's retained step 14.
1. **Pre-flight cleanliness:** run `git status` — confirm working tree clean. Any uncommitted local hacks (RED-demo reverts) MUST be reverted BEFORE this step.
2. **Build production:** `npm run build` (must exit 0; this is Plan 01-09 Task 5 step 1).
3. **Build test bundle:** `npm run build:test` (must exit 0).
4. **Run harness:** `npm run test:uat` (must exit 0; runtime ~50-90s). Final output line MUST be exactly `UAT harness: 14/14 assertions passed`. If exit non-zero, paste the structured diagnostic + harness console dump + relevant SW/offscreen console logs; the plan iterates (likely a real bug surfaced).
5. **Re-run for stability:** `npm run test:uat` a second time. Same outcome.
6. **Tier-1 hook-leak verification:** `grep -rln '__mokoshTest\|installFakeDisplayMedia\|dispatchEndedOnTrack\|getSegmentCount\|setCurrentStream\|setSegmentCountGetter\|uninstallFakeDisplayMedia\|__mokoshOffscreenQuery\|get-display-surface\|get-segment-count' dist/` must return 0 matches. If ANY match, the gate failed silently — STOP and triage.
7. **Local-debug mode smoke:** `HEADLESS=0 npm run test:uat`. Watch real Chrome window: see the harness page load (chrome-extension://<id>/tests/uat/extension-page-harness.html), see badge state transitions across A2/A4/A6/A7. Same exit 0 outcome.
8. **Standalone A6 quick check:** `npx tsx tests/uat/a6.test.ts` → exits 0 with "A6 result: PASS" 5/5. (Smoke for the TDD iteration entry.)
9. **Brand/design acceptance (Plan 01-09 Task 5 step 14 — retained for operator):**
(a) Badge color readability against your OS theme (red OFF, green REC, yellow ERR).
(b) Notification copy ("Mokosh ready — Click here to start recording your session.") reads naturally.
(c) Picker UX confirms in headful mode that Chrome's screen-share picker would surface at the expected moment in production (the harness uses synthetic stream + bypasses the picker; the operator's manual run confirms the production picker still works).
10. **If steps 4, 5, 6 all PASS:** Plan 01-09 + Plan 01-13 both close. Type "approved" with any brand/design notes appended.
11. **If step 4 OR 5 FAIL:** paste the failure diagnostic. Likely culprits: state-bleed between assertions (try `--only=A` if that CLI arg landed in Wave 3D); race window in A11's 35s wait or A6's 500ms settle (try bumping); offscreen target attach flakiness (browser.on('targetcreated') is opportunistic).
12. **If step 6 FAILS:** STOP. The Tier-1 hook-leak gate failing means the production bundle contains test code — security regression (T-1-13-01). Open a debug session.
13. **If step 7 or 9 surfaces a real UX issue:** document as a P1/P2 item in STATE.md or Phase 5 backlog; closure can still proceed IF non-blocking.
Type "approved" after step 9 lands (all gates GREEN + brand/design accepted). If steps 11/12/13 hit, paste failure mode + operator's Chrome version + locale + OS theme; the plan iterates on the failing piece.
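The Tier-1 hook-leak check from step 6 is a recursive string scan over the built artifact tree. The sketch below shows that idea in TypeScript; it is not the contents of `no-test-hooks-in-prod-bundle.test.ts`, and `findHookLeaks` is a hypothetical name. The forbidden list is the 10-string inventory quoted in this plan.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// 10-string Approach-B inventory as listed in the plan.
const FORBIDDEN_STRINGS: readonly string[] = [
  "__mokoshTest",
  "setCurrentStream",
  "setSegmentCountGetter",
  "installFakeDisplayMedia",
  "uninstallFakeDisplayMedia",
  "dispatchEndedOnTrack",
  "getSegmentCount",
  "__mokoshOffscreenQuery",
  "get-display-surface",
  "get-segment-count",
];

/** Walks `root` recursively and reports "<file>: <string>" for the first forbidden hit in each file. */
export function findHookLeaks(root: string, forbidden: readonly string[] = FORBIDDEN_STRINGS): string[] {
  const leaks: string[] = [];
  const walk = (dir: string): void => {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const full = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        walk(full);
      } else if (entry.isFile()) {
        // latin1 keeps a 1:1 byte mapping, so binary assets are scanned safely too.
        const text = fs.readFileSync(full, "latin1");
        const hit = forbidden.find((s) => text.includes(s));
        if (hit !== undefined) leaks.push(`${full}: ${hit}`);
      }
    }
  };
  walk(root);
  return leaks;
}
```

Calling `findHookLeaks('dist')` after `npm run build` should return an empty array when the gate is GREEN; any entry names the file and the leaked string to triage.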
<threat_model>
Trust Boundaries
| Boundary | Description |
|---|---|
| Puppeteer driver ↔ Chrome (CDP) | Host-side Node process pipes CDP commands to Chrome; only invokes page.evaluate on the extension-internal harness page (NOT direct extension chrome.* manipulation). The page runs INSIDE the extension privilege boundary. |
| Extension-internal harness page ↔ SW/offscreen | The harness page has FULL chrome.* API access (it's a privileged extension context). It can read/write chrome.action.* state, invoke chrome.notifications.create directly, call chrome.offscreen.createDocument, send chrome.runtime.sendMessage to SW + offscreen. THIS IS THE PRIVILEGE BOUNDARY — the page is trusted because it ships in the test bundle, not the production bundle. |
| Test hook surface (__mokoshTest) in production bundle | NEW: SAME security-critical threat as 01-11. If tree-shaking fails OR the __MOKOSH_UAT__ define-token gate is misconfigured, hook surface ships to production, exposing installFakeDisplayMedia, dispatchEndedOnTrack, getCurrentStream, getSegmentCount, the offscreen-bridge handler to any page that can communicate with the extension. Mitigation: Tier-1 grep gate enforces zero hook strings in dist/ (10-string inventory after Wave 3D). |
| Offscreen bridge (__mokoshOffscreenQuery) onMessage listener | NEW: the offscreen-hooks bridge listens on chrome.runtime.onMessage for __mokoshOffscreenQuery typed messages and exposes dispatch-ended / install-fake-display-media / has-stream / get-display-surface / get-segment-count ops. If shipped to production, ANY chrome.runtime.sendMessage with this type triggers the ops; dispatch-ended could be used to remotely kill an active recording. Mitigation: same as above. The listener is registered ONLY when __MOKOSH_UAT__ is true (gated by the offscreen-side dynamic import). Tier-1 grep gate verifies the surface is absent from dist/. |
| dev-dependency Chromium binary | UNCHANGED from 01-11: Puppeteer downloads ~150 MB Chromium at npm install. Mitigation: package-lock.json integrity check. |
STRIDE Threat Register
| Threat ID | Category | Component | Disposition | Mitigation Plan |
|---|---|---|---|---|
| T-1-13-01 | Elevation of Privilege | Hook surface (10-string inventory) leaking into production dist/ would expose installFakeDisplayMedia, dispatchEndedOnTrack, getSegmentCount, getCurrentStream, and the __mokoshOffscreenQuery bridge to any context with chrome.runtime.sendMessage access | mitigate | Two layers: (a) __MOKOSH_UAT__ Vite define-token gate makes the entire offscreen-hooks import + onMessage listener a static dead branch in production builds (vite.config.ts sets it false; vite.test.config.ts sets it true); Rollup tree-shakes the dead branch. (b) Tier-1 grep gate tests/background/no-test-hooks-in-prod-bundle.test.ts greps the BUILT artifact tree for the 10 forbidden strings — ZERO matches required for GREEN. Belt + suspenders catches both tree-shake regression AND new hook-name additions. The 10-string inventory is the authoritative surface contract; new ops MUST be added to the FORBIDDEN_STRINGS list when introduced. |
| T-1-13-02 | Spoofing | Harness page sends __mokoshOffscreenQuery messages to offscreen; if production code accidentally also registers a __mokoshOffscreenQuery handler (e.g. typo in a future refactor), it could be invoked by harness messages | accept | The offscreen-hooks onMessage handler returns {ok: false, error: 'unknown-op'} for unrecognized ops, so a typo-collision wouldn't accidentally trigger a production state mutation. Detection: any new chrome.runtime.onMessage listener in production code is reviewed for collision with the __mokoshOffscreenQuery type sentinel. |
| T-1-13-03 | Information Disclosure | A11's 35-second wait on a CI runner could capture system state via the synthetic stream's canvas, BUT the canvas is a 320x180 frame-counter pattern (constant content; no environmental data) | accept | Per 01-11-SUMMARY: the synthetic stream is canvas-driven with displayed content limited to a frame counter. No actual screen content is captured. CI isolation requirement (per 01-11 threat T-1-11-02) is REMOVED in 01-13 because Approach B's synthetic stream eliminates the entire class of "what's on the screen" threats. |
| T-1-13-04 | Denial of Service | A11's 35s wall-clock wait dominates harness runtime; combined with the build steps, total runtime ~90s ties up CI runner slot | accept | 90s is well within typical CI per-job budgets. Local-dev runs use SKIP_PROD_REBUILD=1 to drop A0's npm run build cost (~10s). Out of scope: parallelizing assertions (would require multi-browser instances; defeats failure-isolation choice). |
| T-1-13-05 | Tampering | Puppeteer downloads Chromium binary at npm install; supply-chain compromise of download endpoint | accept | UNCHANGED from 01-11: package-lock.json pins hashes via Puppeteer's @puppeteer/browsers machinery. Phase 5 SCA work covers periodic re-verification. |
| T-1-13-06 | Repudiation | A8 verifies Chrome's imageUtil accepts the icon via the harness page calling chrome.notifications.create directly — this DOES NOT verify the SW onStartup handler runs the same code path | mitigate | Documented workaround: the unit test tests/background/onstartup-notification.test.ts covers the SW handler invocation; A8 covers the end-to-end icon-acceptance contract (which is what Bug A regressed on). Together they cover both halves of the contract. The harness's role is the icon-acceptance gate; the unit test is the handler-invocation gate. Defense in depth via tier separation (unit + e2e). |
| T-1-13-07 | Elevation of Privilege (additional) | The harness page's chrome.action.setBadgeText / setPopup calls in A2 (workaround for missing 'tabs' permission) MUTATE production state | accept | The mutations are bounded to the harness page's lifetime; the SW state machine reverts on the next setIdleMode / setRecordingMode call. The harness page does NOT persist any mutation. In production (without the harness page being loadable — it's not in dist/), the mutations are impossible. Real-world impact: zero. |
| </threat_model> |
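The grep-gate layer described in T-1-13-01 can be sketched as a pure scan over built file contents. This is a minimal illustration, not the actual `tests/background/no-test-hooks-in-prod-bundle.test.ts`: the function name `findForbiddenHits`, the `ForbiddenHit` type, and the in-memory path-to-contents input are assumptions, and the list holds only the five hook strings named in that row, not the full 10-string inventory.

```typescript
// Hypothetical sketch of the Tier-1 grep gate's core check.
// Only the five hook names quoted in T-1-13-01 appear here; the real
// FORBIDDEN_STRINGS inventory has 10 entries.
const FORBIDDEN_STRINGS_PARTIAL: readonly string[] = [
  "installFakeDisplayMedia",
  "dispatchEndedOnTrack",
  "getSegmentCount",
  "getCurrentStream",
  "__mokoshOffscreenQuery",
];

interface ForbiddenHit {
  file: string;
  needle: string;
}

// Scans built artifacts (path -> file contents) and reports every
// forbidden string found. An empty result means the gate is GREEN.
function findForbiddenHits(
  builtFiles: Record<string, string>,
  forbidden: readonly string[] = FORBIDDEN_STRINGS_PARTIAL,
): ForbiddenHit[] {
  const hits: ForbiddenHit[] = [];
  for (const [file, contents] of Object.entries(builtFiles)) {
    for (const needle of forbidden) {
      if (contents.includes(needle)) {
        hits.push({ file, needle });
      }
    }
  }
  return hits;
}

// Example inputs (made-up file contents, for illustration only).
const cleanHits = findForbiddenHits({
  "dist/offscreen.js": "console.log('recorder ready');",
});
const leakyHits = findForbiddenHits({
  "dist/offscreen.js": "globalThis.getSegmentCount = () => 0;",
});
console.log(cleanHits.length, leakyHits.length);
```

Reporting each file/needle pair, rather than a single boolean, lets a failing gate say which hook leaked and where, which helps distinguish a tree-shake regression from a newly added hook name.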
9d0313a.
- STATE.md Decisions log carries the new 01-13 entry as the last item.
- ROADMAP.md Phase 1 Plans list carries the 01-13 entry; commit body surfaces the 01-08..01-12 gap for orchestrator follow-up.
- Operator confirms brand/design step 14 + types "approved" in Task 9.
- Standalone `npx tsx tests/uat/a6.test.ts` exit 0 (5/5 PASS) — TDD iteration entry preserved.
<success_criteria>
Plan 01-13 is complete when:
- Wave 0 baseline cleanup landed. sw-hooks.ts deleted; SW dynamic-import block reverted; popup-bridge lib deleted; feasibility probes deleted. Tier-1 grep gate's FORBIDDEN_STRINGS list updated to the Approach-B 10-string inventory. Baseline 89/89 vitest GREEN restored.
- Approach-B architecture proven in production paths. Prototype `c647f61` promoted to `tests/uat/{extension-page-harness.html,extension-page-harness.ts,a6.test.ts}` via Wave 1 git mv + comment updates. Standalone A6 entry PASSES 5/5 from new path.
- All 14 harness assertions pass against the current bundle. `npm run test:uat` exit 0; final line `UAT harness: 14/14 assertions passed`. Runtime ~50-90s.
- Both Phase-1-escapee bugs are CI-callable. Wave 3B commit body documents A6 (Bug B) RED-on-regression cycle; Wave 3C commit body documents A8 (Bug A) RED-on-regression cycle. Both demonstrably catch their respective regressions.
- Operator role retired for functional verification. Plan 01-09 Task 5 redirects to `npm run test:uat` via Amendment 2 (which inherits + supersedes the now-stale 01-11 Amendment 1). Operator retains only step 1 (build) + step 14 (brand/design).
- Existing 89 vitest tests remain GREEN after every wave. No regression to unit-test bed. `npx tsc --noEmit` exit 0; `npm run build` exit 0; Tier-1 grep gate GREEN. Production bundle byte-clean of hook strings.
- MV3 architectural constraints respected. NO `await import(...)` in `src/background/index.ts`. `dispatchEvent(new Event('ended'))` for user-stopped simulation. `__MOKOSH_UAT__` define-token gate (NOT `import.meta.env.MODE`).
- Plan 01-09 + Plan 01-13 close together. Wave 4 closing checkpoint: operator confirms harness PASS + brand/design + types "approved".
</success_criteria>
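The `dispatchEvent(new Event('ended'))` technique named in the MV3 constraints can be illustrated without a browser. In the sketch below a plain `EventTarget` stands in for a `MediaStreamTrack`, and `watchForEnded` and `simulateUserStop` are hypothetical names, not the plan's helpers. It relies on one platform fact: calling `stop()` on a real track does not fire `ended`, so code that wants an `ended` listener to run has to dispatch the event itself.

```typescript
// Hypothetical stand-in: a plain EventTarget plays the role of the
// MediaStreamTrack that production code listens on.
function watchForEnded(track: EventTarget): { ended: boolean } {
  const state = { ended: false };
  // Production-style listener: reacts when the track reports 'ended'.
  track.addEventListener("ended", () => {
    state.ended = true;
  });
  return state;
}

// Simulates the user stopping the share by dispatching a synthetic
// 'ended' event directly on the track.
function simulateUserStop(track: EventTarget): void {
  track.dispatchEvent(new Event("ended"));
}

const fakeTrack = new EventTarget();
const recorderState = watchForEnded(fakeTrack);
simulateUserStop(fakeTrack);
console.log(recorderState.ended);
```

Because the listener is attached through `addEventListener`, the same dispatch works on any `EventTarget`, which keeps the simulation independent of how the stream was produced.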
c647f61 prototype proof.
- Two-bundle separation (dist/ vs dist-test/) verified by Tier-1 grep gate with 10-string FORBIDDEN_STRINGS inventory.
- Bridge protocol op set (install-fake-display-media, dispatch-ended, has-stream, get-display-surface, get-segment-count) + the cross-isolate boundary it crosses.
- Plan 01-09 Amendment 2 landed (inherits + supersedes Amendment 1 from 9d0313a).
- STATE.md decision logged + ROADMAP.md Phase 1 plan list updated.
- ROADMAP gap flagged (01-08..01-12 entries missing — orchestrator follow-up).
- Open questions resolved (4 from this plan's interfaces block) + resolutions.
- Total harness runtime ranges observed (~50-90s; A11's 35s wait dominates; A0 prod rebuild ~10s skippable via SKIP_PROD_REBUILD=1).
- Standalone A6 entry preserved as TDD iteration tool.
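The bridge protocol op set listed above, together with the `unknown-op` reply from T-1-13-02, can be sketched as a small dispatcher. The message shape `{ type, op }`, the `BridgeReply` type, the function names, and the stubbed success reply are assumptions for illustration; only the five op names, the `__mokoshOffscreenQuery` sentinel, and the `{ok: false, error: 'unknown-op'}` reply come from the plan.

```typescript
// Type sentinel and op names as listed in the plan.
const BRIDGE_TYPE = "__mokoshOffscreenQuery";

const KNOWN_OPS = [
  "install-fake-display-media",
  "dispatch-ended",
  "has-stream",
  "get-display-surface",
  "get-segment-count",
] as const;

type BridgeOp = (typeof KNOWN_OPS)[number];

// Hypothetical reply shape; the success variant is a stub.
type BridgeReply =
  | { ok: true; op: BridgeOp }
  | { ok: false; error: "unknown-op" };

function isKnownOp(op: unknown): op is BridgeOp {
  return typeof op === "string" && (KNOWN_OPS as readonly string[]).includes(op);
}

// Returns undefined for messages that are not bridge queries, so other
// listeners can handle them; unrecognized ops get an explicit error.
function handleBridgeMessage(message: unknown): BridgeReply | undefined {
  if (typeof message !== "object" || message === null) return undefined;
  const { type, op } = message as { type?: unknown; op?: unknown };
  if (type !== BRIDGE_TYPE) return undefined;
  if (!isKnownOp(op)) return { ok: false, error: "unknown-op" };
  // A real handler would run the hook for `op`; this stub only echoes it.
  return { ok: true, op };
}

console.log(handleBridgeMessage({ type: BRIDGE_TYPE, op: "has-stream" }));
console.log(handleBridgeMessage({ type: BRIDGE_TYPE, op: "has-straem" }));
```

Checking the op against a closed list before doing any work is what keeps a misspelled op from reaching a state-mutating code path.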