Architectural pivot triggered by feasibility research prototype (commit c647f61, A6 PASS 5/5 + Bug-B regression rewind verified).

Two empirical findings invalidate the original architecture:

1. MV3 service workers BLOCK dynamic import. await import('test-hooks/sw-hooks') in src/background/index.ts silently kills the SW: the chunk loads, the await never resolves, and no listeners register. Cited: Chromium es_modules.md + w3c/webextensions#212.
2. Puppeteer WebWorker.evaluate against an MV3 SW only exposes chrome.{loadTimes,csi}, not the extension chrome.* API surface.

The Wave 1 (cb1a729) SW-side hooks are fundamentally broken in test builds (production unaffected: gated by __MOKOSH_UAT__, which is false in prod). Executor must DELETE the SW-side dynamic import + sw-hooks.ts entirely; offscreen-side hooks stay (offscreen IS a DOM document; dynamic import works there).

Replacement (Verdict-A architecture, proven by prototype):

- Extension-internal harness page at chrome-extension://<id>/tests/uat/extension-page-harness.html: privileged extension context with FULL chrome.* API access
- Puppeteer drives the page via page.goto + page.evaluate
- For SW state: page calls chrome.runtime.sendMessage; SW responds via production messaging
- For getDisplayMedia: offscreen-side installFakeDisplayMedia() patches navigator.mediaDevices.getDisplayMedia → Canvas captureStream synthetic MediaStream

A6 (Bug B regression catch) PROVEN. Industry-standard pattern (MetaMask, eyeo, Chrome MV3 official testing docs all converge).

Effort remaining: ~7-10h subagent budget (Wave 0 + bonus debug-import commits keep; Wave 1 hooks rewire 30min; Wave 2 scaffolding 1-2h; Wave 3 13 more assertions 4-6h; Wave 4 closure 1h).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1073 lines
97 KiB
Markdown
---
amended_at: "2026-05-18"
amendment: "A"
amendment_summary: "Wave 1 T2 + Wave 2 T3 + Wave 3 method guidance superseded. MV3 SW blocks dynamic import (verified empirically); SW-side test hooks DROPPED. Replaced with extension-internal harness page architecture (proven by c647f61 prototype). See 01-11-PLAN-AMENDMENT-A.md."
phase: 01-stabilize-video-pipeline
plan: 11
type: tdd
wave: 4
depends_on:
  - 01-08
  - 01-09
files_modified:
  - package.json
  - package-lock.json
  - vite.test.config.ts
  - tsconfig.json
  - src/background/index.ts
  - src/offscreen/recorder.ts
  - src/test-hooks/sw-hooks.ts
  - src/test-hooks/offscreen-hooks.ts
  - src/test-hooks/types.ts
  - tests/uat/harness.test.ts
  - tests/uat/lib/launch.ts
  - tests/uat/lib/extension.ts
  - tests/uat/lib/sw.ts
  - tests/uat/lib/offscreen.ts
  - tests/uat/lib/assertions.ts
  - tests/uat/lib/zip.ts
  - tests/uat/lib/test-hook-contract.d.ts
  - tests/uat/README.md
  - tests/background/no-test-hooks-in-prod-bundle.test.ts
  - .planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md
autonomous: false
requirements:
  - REQ-uat-harness-puppeteer
  - REQ-uat-bug-A-coverage
  - REQ-uat-bug-B-coverage
  - REQ-uat-two-bundle
  - REQ-uat-ci-friendly
  - REQ-uat-13-assertions
  - REQ-video-ring-buffer
tags:
  - puppeteer
  - uat
  - harness
  - e2e
  - mv3-extension
  - getDisplayMedia
  - bug-B
  - bug-A
  - tier-1
  - two-bundle

must_haves:
  truths:
    - "`npm run build:test` produces `dist-test/` with `__mokoshTest` hook surfaces injected into SW + offscreen contexts; `npm run build` produces `dist/` with ZERO occurrences of `__mokoshTest` (grep-verifiable)."
    - "`npm run test:uat` orchestrates `build:test` + the Puppeteer harness end-to-end; exits 0 only when ALL 14 assertions pass (13 from the brief + assertion 0 = production-bundle hook-leak grep gate)."
    - "Bug B harness assertion (track.dispatchEvent('ended') → badge OFF + popup '' + isRecording=false + NO recovery notification) demonstrably catches a regression: rewinding the b9eeeeb conditional routing locally turns this assertion RED; reapplying turns it GREEN."
    - "Bug A harness assertion (onStartup → chrome.notifications.create resolves cleanly with the manifest's icon48.png iconUrl) demonstrably catches a regression: stubbing the icon48 file to <100 bytes turns this assertion RED; restoring turns it GREEN."
    - "Harness runs in `--headless=new` for CI portability; local-debug mode supported via `HEADLESS=0`; no Xvfb required (per RESEARCH §3 empirical probes against Chrome 148)."
    - "Test hooks live ONLY behind `import.meta.env.MODE === 'test'` guarded dynamic imports; Vite tree-shakes them from the production bundle; the no-test-hooks-in-prod-bundle.test.ts unit gate enforces this in the existing vitest suite (Tier-1 alongside sw-bundle-import.test.ts)."
    - "Existing 83 vitest tests remain GREEN after this plan lands (no regression to the unit test bed)."
    - "Plan 01-09 functional contract closes by harness PASS: its Task 5 operator-checkpoint amendment redirects to `npm run test:uat` for steps 4-13 + 15; operator retains only step 1 (build) + step 14 (brand/design check)."
  artifacts:
    - path: "vite.test.config.ts"
      provides: "Vite config extending the production config; sets `mode: 'test'`, `build.outDir: 'dist-test'`, `build.emptyOutDir: true`."
      contains: "dist-test"
    - path: "src/test-hooks/types.ts"
      provides: "Shared TS type declaring `globalThis.__mokoshTest` shape (handlers, getCurrentStream, simulateUserStop, notificationCount, lastNotificationOptions). Single source of truth for SW + offscreen + harness."
      contains: "__mokoshTest"
    - path: "src/test-hooks/sw-hooks.ts"
      provides: "SW-side test hook: captures chrome.action.onClicked / chrome.runtime.onStartup / chrome.notifications.onClicked handler refs; wraps chrome.notifications.create to record notificationCount + lastNotificationOptions. Imported dynamically from src/background/index.ts under `import.meta.env.MODE === 'test'` guard."
      contains: "handlers"
    - path: "src/test-hooks/offscreen-hooks.ts"
      provides: "Offscreen-side test hook: exposes the current MediaStream via getter; provides simulateUserStop wrapping `track.dispatchEvent(new Event('ended'))` per RESEARCH §7. Imported dynamically from src/offscreen/recorder.ts under `import.meta.env.MODE === 'test'` guard."
      contains: "simulateUserStop"
    - path: "src/background/index.ts"
      provides: "Adds a single `if (import.meta.env.MODE === 'test') { await import('../test-hooks/sw-hooks'); }` block at top-of-module so the hook registration runs BEFORE any production addListener calls (capturing every handler)."
      contains: "import.meta.env.MODE"
    - path: "src/offscreen/recorder.ts"
      provides: "Adds an `if (import.meta.env.MODE === 'test') { __sharedRefs.setMediaStreamGetter(() => mediaStream); }` block (the import itself is gated; the getter wires the runtime mediaStream reference into the hook surface). Same guard pattern as SW."
      contains: "import.meta.env.MODE"
    - path: "tests/uat/harness.test.ts"
      provides: "Single Node script (run under tsx) implementing all 14 assertions sequentially. ~400 LoC. Top-to-bottom narrative: launch, click, assert, simulate Bug B, simulate Bug A, etc. Returns exit 0 on full pass, non-zero on any failure with structured diagnostic dump."
      min_lines: 350
    - path: "tests/uat/lib/launch.ts"
      provides: "puppeteer.launch wrapper: builds args, sets enableExtensions to absolute dist-test path, chooses headless mode per CI env, configures downloads dir, exports a single launchHarnessBrowser() function."
    - path: "tests/uat/lib/extension.ts"
      provides: "Helpers to resolve the extension id, attach to the SW target, attach to the offscreen target (background_page type per RESEARCH §4 / Pitfall 1)."
    - path: "tests/uat/lib/sw.ts"
      provides: "SW context helpers: getBadgeText, getPopup, getManifestIcons, fireOnStartup (via captured handler ref), sendSyntheticRecordingError, keepalivePing."
    - path: "tests/uat/lib/offscreen.ts"
      provides: "Offscreen context helpers: waitForOffscreenTarget, getDisplaySurface, simulateUserStop (the dispatchEvent('ended') path per RESEARCH §7 BLOCKER finding)."
    - path: "tests/uat/lib/assertions.ts"
      provides: "Per-assertion helpers (assertEqual + structured diagnostic on failure); a runWithStartupDiagnostics wrapper that captures SW + offscreen console logs and dumps them on assertion failure for triage."
    - path: "tests/uat/lib/zip.ts"
      provides: "jszip-based archive shape assertions; reads downloaded `session_report_*.zip`, asserts `video/last_30sec.webm` present + `meta.json` carries `version === chrome.runtime.getManifest().version` (extension-side version read passed in)."
    - path: "tests/uat/lib/test-hook-contract.d.ts"
      provides: "Mirror of src/test-hooks/types.ts in TS-declaration form for the harness side; documents the wire contract between hook injector and harness consumer."
    - path: "tests/uat/README.md"
      provides: "How to run: `npm run test:uat`; local-debug headful mode via `HEADLESS=0`; CI semantics; troubleshooting (locale-specific picker string, Xvfb fallback if a future Chrome regresses headless, dev-dependency Chromium binary size note)."
    - path: "tests/background/no-test-hooks-in-prod-bundle.test.ts"
      provides: "Tier-1 unit-level grep gate (cousin of sw-bundle-import.test.ts): runs `npm run build` then asserts ZERO occurrences of `__mokoshTest` and ZERO occurrences of `simulateUserStop` in any file under `dist/`. GREEN today (the gate is committed in Task 1 before any hooks exist, so nothing can leak yet); it must REMAIN GREEN after Task 2 lands the hooks, which is what proves the hook gating is correct."
    - path: "package.json"
      provides: "Adds `puppeteer` ^25.0.2 + `tsx` ^4 to devDependencies; adds two npm scripts: `build:test` (`tsc && vite build --mode test --config vite.test.config.ts`) and `test:uat` (`npm run build:test && tsx tests/uat/harness.test.ts`)."
      contains: "test:uat"
    - path: "tsconfig.json"
      provides: "Includes `src/test-hooks/**/*` in compilation surface (so tsc validates the hook code). NO change to emit (vite handles bundling)."
    - path: ".planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md"
      provides: "AMENDMENT block at the end of the file: redirects Plan 01-09 Task 5 operator-checkpoint steps 4-13 + 15 to `npm run test:uat` (this plan's harness). Operator retains step 1 (build) + step 14 (brand/design accept) only. Plan 01-09 closes when `npm run test:uat` exits 0 AND operator confirms brand/design step 14."
      contains: "Plan 01-11 amendment"
  key_links:
    - from: "tests/uat/harness.test.ts"
      to: "tests/uat/lib/launch.ts:launchHarnessBrowser"
      via: "import"
      pattern: "import.*from.*lib/launch"
    - from: "tests/uat/lib/launch.ts"
      to: "puppeteer.launch"
      via: "enableExtensions + headless + autoSelect flag"
      pattern: "enableExtensions"
    - from: "src/background/index.ts"
      to: "src/test-hooks/sw-hooks.ts"
      via: "guarded dynamic import"
      pattern: "import\\.meta\\.env\\.MODE === ['\"]test['\"]"
    - from: "src/offscreen/recorder.ts"
      to: "src/test-hooks/offscreen-hooks.ts"
      via: "guarded dynamic import + setMediaStreamGetter wire"
      pattern: "import\\.meta\\.env\\.MODE === ['\"]test['\"]"
    - from: "tests/uat/lib/offscreen.ts:simulateUserStop"
      to: "track.dispatchEvent(new Event('ended'))"
      via: "evaluate-in-offscreen-page on __mokoshTest.getCurrentStream().getVideoTracks()[0]"
      pattern: "dispatchEvent\\(new Event\\(['\"]ended['\"]"
    - from: "tests/background/no-test-hooks-in-prod-bundle.test.ts"
      to: "dist/ artifact tree"
      via: "post-build grep for __mokoshTest + simulateUserStop"
      pattern: "grep.*__mokoshTest.*dist"
---

## Scope Sanity Note

**4 waves, 8 tasks, 18 file artifacts.** This sits at the upper end of the "split signal" threshold but consolidating is the right call:

1. The test infrastructure (Wave 0), the hook gating (Wave 1), the harness scaffolding (Wave 2), and the 14 assertions (Wave 3) are tightly coupled at the contract level; splitting them into separate plans would force the harness contract (the `__mokoshTest` shape) to be re-derived in each plan's frontmatter `must_haves`, multiplying the duplication tax.
2. Per RESEARCH §6, the two-bundle gate (`__mokoshTest` ABSENT in production) is the security-critical mitigation for shipping test hooks. That gate MUST be wired in the same plan that adds the hooks; splitting would create a window where the hooks exist but the gate doesn't.
3. Wave 4 (closure) is a single checkpoint task; bundling it with Wave 3 wouldn't change context cost meaningfully, and separating it keeps the operator-checkpoint scope visible in the wave structure.
4. Context budget: Wave 0 + Wave 1 + Wave 2 ~30%; Wave 3 ~35%; Wave 4 ~5% (checkpoint). Total ~70%. That is above the 50% target, but the 14 assertions are deterministic and template-shaped, so per-assertion authoring cost is sub-linear once Wave 2 lands.

**If a future revision DOES force a split,** natural cut line: Plan 01-11A = Waves 0+1+2 (infrastructure + first 4 assertions as smoke); Plan 01-11B = Waves 3+4 (remaining 10 assertions + closure). This split incurs the contract-duplication tax and is NOT recommended absent a context-cost regression.

<objective>
Build a Puppeteer-driven Node UAT harness that retires the operator-as-assertion-library role. Plan 01-09's Task 5 took 4-6 hours of operator empirical UAT cycles (Bug A icons + Bug B state routing both escaped vitest unit coverage); every "visual" check in that task has a CDP-callable equivalent. This plan automates them.

Three coordinated changes:

1. **Two-bundle separation** via `vite.test.config.ts` extending the production config with `mode: 'test'` + `outDir: 'dist-test'`. Production builds stay hook-free.
2. **Test hooks** in `src/test-hooks/` consumed via guarded dynamic imports from SW + offscreen. The dynamic-import-inside-MODE-guard pattern (RESEARCH §6) lets Vite tree-shake the hook MODULES entirely from production, with a Tier-1 grep gate (`tests/background/no-test-hooks-in-prod-bundle.test.ts`) verifying the absence.
3. **Puppeteer harness** at `tests/uat/harness.test.ts` (plus a `lib/` helper split following MetaMask's POM shape per RESEARCH §5) implementing 14 assertions: assertion 0 (production-bundle hook-leak grep gate) + assertions 1-13 from the orchestrator brief. Bug B uses `track.dispatchEvent(new Event('ended'))` per RESEARCH §7 BLOCKER, NOT `track.stop()`, which silently invalidates the assertion.

Operator role retirement: Plan 01-09's Task 5 is amended to redirect steps 4-13 + 15 to `npm run test:uat`. Operator retains only step 1 (build verification) + step 14 (brand/design acceptance). All functional gates move to the CI-callable harness.

Output:

- `vite.test.config.ts`: production config extension with `mode: 'test'` + `outDir: 'dist-test'`.
- `src/test-hooks/{sw-hooks,offscreen-hooks,types}.ts`: gated hook modules.
- `src/background/index.ts` + `src/offscreen/recorder.ts`: gated dynamic import block (one line each + a `setMediaStreamGetter` wire in offscreen).
- `tests/uat/harness.test.ts` + `tests/uat/lib/*.ts` + `tests/uat/README.md`: harness + helpers.
- `tests/background/no-test-hooks-in-prod-bundle.test.ts`: Tier-1 unit-level hook-leak gate.
- `package.json`: `puppeteer`, `tsx` devDeps + `build:test`, `test:uat` scripts.
- `tsconfig.json`: includes `src/test-hooks/**/*` for type-checking.
- `.planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md`: amendment block redirecting Task 5 functional steps to `npm run test:uat`.
</objective>

<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/REQUIREMENTS.md
@.planning/phases/01-stabilize-video-pipeline/01-CONTEXT.md
@.planning/phases/01-stabilize-video-pipeline/01-08-PLAN.md
@.planning/phases/01-stabilize-video-pipeline/01-08-SUMMARY.md
@.planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md
@.planning/phases/01-stabilize-video-pipeline/01-09-SUMMARY.md
@.planning/phases/01-stabilize-video-pipeline/01-11-RESEARCH.md
@.planning/debug/resolved/01-09-recovery-flow.md
@src/background/index.ts
@src/offscreen/recorder.ts
@manifest.json
@vite.config.ts
@tsconfig.json
@package.json
@tests/background/sw-bundle-import.test.ts

<interfaces>
<!-- Key types, paths, and Chrome/Puppeteer API surfaces the executor uses -->
<!-- Embedded here so the executor needs no codebase exploration -->

### Puppeteer 25.0.2 extension API surface (RESEARCH §1, empirically verified)

```typescript
import puppeteer, { Browser, Extension, Page, Target } from 'puppeteer';

const browser: Browser = await puppeteer.launch({
  pipe: true,
  enableExtensions: ['/abs/path/to/dist-test'], // string[] or true
  headless: process.env.HEADLESS !== '0', // default headless=true; local debug HEADLESS=0
  args: [
    '--no-sandbox',
    '--auto-select-desktop-capture-source=Entire screen', // RESEARCH §9: locale-specific
    // DO NOT add --use-fake-ui-for-media-stream (per RESEARCH §9 Pitfall, conflicts with auto-select)
  ],
});

const extensions = await browser.extensions(); // Map<id, Extension>
const [extId, ext] = [...extensions][0];

const swTarget = await browser.waitForTarget(
  (t: Target) => t.type() === 'service_worker',
  { timeout: 10_000 },
);
const sw = await swTarget.worker(); // WebWorker; has .evaluate()

const page = await browser.newPage();
await page.goto('about:blank');
await page.triggerExtensionAction(ext); // simulates toolbar click (NEEDS popup === '')

// Offscreen page. RESEARCH §4 / Pitfall 1: target type 'background_page' NOT 'page'
const offTarget = browser.targets().find((t) =>
  t.type() === 'background_page' && t.url().includes('offscreen'),
);
// find() returns undefined when the offscreen document has not been created yet.
if (!offTarget) throw new Error('offscreen target not found');
const offPage = await offTarget.asPage(); // NOT .page(); only .asPage() works
```
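
The launch arguments above could be assembled by a small pure function so that the `HEADLESS=0` switch and the absolute-path requirement can be checked without starting a browser. This is only a sketch: `buildLaunchOptions` and `HarnessLaunchOptions` are hypothetical names (the plan names only `launchHarnessBrowser()`), and the option shape mirrors just the fields used in the snippet above.

```typescript
import { isAbsolute, resolve } from 'node:path';

// Hypothetical option shape: only the fields used in the launch snippet above.
interface HarnessLaunchOptions {
  pipe: boolean;
  enableExtensions: string[];
  headless: boolean;
  args: string[];
}

// Hypothetical helper: builds the options without touching Puppeteer, so the
// environment handling can be verified in isolation.
function buildLaunchOptions(
  env: Record<string, string | undefined>,
  distTestDir: string,
): HarnessLaunchOptions {
  // enableExtensions receives an absolute path, per the launch.ts description.
  const extensionPath = isAbsolute(distTestDir) ? distTestDir : resolve(distTestDir);
  return {
    pipe: true,
    enableExtensions: [extensionPath],
    // Headless unless the developer opts out with HEADLESS=0.
    headless: env.HEADLESS !== '0',
    args: [
      '--no-sandbox',
      // Locale-specific picker string (RESEARCH §9).
      '--auto-select-desktop-capture-source=Entire screen',
    ],
  };
}

const ciOptions = buildLaunchOptions({}, 'dist-test');
const debugOptions = buildLaunchOptions({ HEADLESS: '0' }, 'dist-test');
console.log(ciOptions.headless, debugOptions.headless);
```

A wrapper such as `launchHarnessBrowser()` could then pass the result straight to `puppeteer.launch`.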

### Chrome SW state surface (read via sw.evaluate)

```typescript
// Read badge text
const badge = await sw.evaluate(() => chrome.action.getBadgeText({}));

// Read popup
const popup = await sw.evaluate(() => chrome.action.getPopup({}));

// Read manifest
const manifest = await sw.evaluate(() => chrome.runtime.getManifest());
// manifest.icons === { '16': 'icons/icon16.png', '48': '...', '128': '...' }
// manifest.permissions includes 'notifications', etc.

// Synthesize RECORDING_ERROR (no hook needed; goes through onMessage handler)
await sw.evaluate(() =>
  chrome.runtime.sendMessage({ type: 'RECORDING_ERROR', error: 'codec-unsupported' }),
);

// Invoke onStartup via captured handler ref (needs hook; see sw-hooks.ts)
await sw.evaluate(() => globalThis.__mokoshTest!.handlers.onStartup?.());

// Fetch an extension file and check size
const iconSize = await sw.evaluate(async () => {
  const r = await fetch(chrome.runtime.getURL('icons/icon48.png'));
  return r.ok ? Number(r.headers.get('content-length') ?? '0') : -1;
});
```
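
The `iconSize` value returned by the last snippet still has to be turned into a pass/fail verdict for the Bug A check. A minimal sketch of that step follows; `classifyIconSize` and `IconCheck` are hypothetical names, and the 100-byte threshold is an assumption borrowed from the Bug A truth ("stubbing the icon48 file to <100 bytes").

```typescript
type IconCheck = { ok: true } | { ok: false; reason: string };

// `size` follows the snippet above: -1 when the fetch failed, otherwise the
// content-length value (0 when the header is missing).
function classifyIconSize(size: number, minBytes = 100): IconCheck {
  if (size < 0) {
    return { ok: false, reason: 'icon fetch failed' };
  }
  if (size < minBytes) {
    return { ok: false, reason: `icon is ${size} bytes, expected at least ${minBytes}` };
  }
  return { ok: true };
}

const stubbed = classifyIconSize(42);
const healthy = classifyIconSize(2048);
console.log(stubbed, healthy);
```

Note that a missing `content-length` header is reported as 0 by the snippet above and would therefore be classified as a failure here; whether that is the desired behaviour is left to the executor.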

### Offscreen surface (read via offPage.evaluate)

```typescript
// Read displaySurface (RESEARCH §11 Req 3)
const ds = await offPage.evaluate(() =>
  globalThis.__mokoshTest!.getCurrentStream!()?.getVideoTracks()[0]?.getSettings().displaySurface ?? null,
);

// Simulate user-stopped. RESEARCH §7 BLOCKER: MUST be dispatchEvent, NOT track.stop().
await offPage.evaluate(() => {
  const stream = globalThis.__mokoshTest!.getCurrentStream!();
  if (stream === null) throw new Error('no current stream: recording must be active');
  const track = stream.getVideoTracks()[0];
  track.dispatchEvent(new Event('ended'));
  // Track still readyState 'live' after dispatch; production handler will
  // call stream.getTracks().forEach(t => t.stop()) which DOES release the
  // capture (just doesn't refire 'ended' on the same track, per spec).
});
```
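
Why `dispatchEvent` rather than `track.stop()` can be illustrated outside a browser. The sketch below uses a stand-in `FakeTrack` class (an assumption: it models only `readyState`, `stop()`, and `EventTarget` behaviour, not a real `MediaStreamTrack`) and a hypothetical `attachUserStopHandler` that plays the role of the production `onUserStoppedSharing` listener.

```typescript
// Stand-in for MediaStreamTrack. Per the Media Capture and Streams spec,
// stop() ends the track WITHOUT firing 'ended'; the event fires only when the
// source itself ends the track (for example the user clicking "Stop sharing").
class FakeTrack extends EventTarget {
  readyState: 'live' | 'ended' = 'live';

  stop(): void {
    this.readyState = 'ended'; // deliberately no event dispatched
  }
}

// Hypothetical listener wiring, shaped like a production 'ended' handler.
function attachUserStopHandler(track: FakeTrack, log: string[]): void {
  track.addEventListener('ended', () => {
    log.push('onUserStoppedSharing');
    track.stop(); // cleanup; does not refire 'ended'
  });
}

// Path 1: calling stop() directly. The handler never runs.
const viaStop: string[] = [];
const stoppedTrack = new FakeTrack();
attachUserStopHandler(stoppedTrack, viaStop);
stoppedTrack.stop();

// Path 2: dispatching 'ended'. The handler runs once and then cleans up.
const viaDispatch: string[] = [];
const dispatchedTrack = new FakeTrack();
attachUserStopHandler(dispatchedTrack, viaDispatch);
dispatchedTrack.dispatchEvent(new Event('ended'));

console.log(viaStop.length, viaDispatch.length);
```

In the first path the handler under test is never exercised, which is how a `track.stop()`-based assertion would pass or fail for reasons unrelated to the Bug B routing.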

### Test hook contract (NEW: src/test-hooks/types.ts)

```typescript
// src/test-hooks/types.ts
// SINGLE SOURCE OF TRUTH for the __mokoshTest wire shape.
// Imported by sw-hooks.ts (registers), offscreen-hooks.ts (registers),
// and tests/uat/lib/test-hook-contract.d.ts (consumes; mirror).

export interface MokoshTestSurface {
  // SW handler refs (captured by sw-hooks.ts monkey-patching addListener)
  handlers: {
    onClicked: ((tab: chrome.tabs.Tab) => void | Promise<void>) | null;
    onStartup: (() => void | Promise<void>) | null;
    notificationOnClicked: ((notificationId: string) => void | Promise<void>) | null;
  };
  // SW notification observability
  notificationCount: number;
  lastNotificationOptions: chrome.notifications.NotificationOptions | null;
  notificationIds: ReadonlyArray<string>;
  // Offscreen getCurrentStream: undefined in SW context; defined in offscreen.
  // Declared optional so one type serves both contexts and the harness side
  // stays simple; when defined, a null return is the "not currently recording"
  // signal.
  getCurrentStream?: () => MediaStream | null;
}

declare global {
  // eslint-disable-next-line no-var
  var __mokoshTest: MokoshTestSurface | undefined;
}

export {};
```

### Production hook-gate pattern (src/background/index.ts top-of-module)

```typescript
// AT THE VERY TOP of src/background/index.ts, BEFORE any addListener calls.
// import.meta.env.MODE is statically replaced at build time by Vite (RESEARCH §6);
// the entire `if` block + its dynamic import are tree-shaken from production bundles
// because the literal === comparison resolves to `false` and Rollup deletes the
// unreachable branch.
if (import.meta.env.MODE === 'test') {
  await import('../test-hooks/sw-hooks');
}
```

**CRITICAL ORDERING:** the hook import MUST run BEFORE any production `addListener` calls so the monkey-patches catch the handlers as they register. Top-of-module placement satisfies this.

### Production hook-gate pattern (src/offscreen/recorder.ts)

```typescript
// Top-of-module: register the hook.
if (import.meta.env.MODE === 'test') {
  await import('../test-hooks/offscreen-hooks');
}

// Later, INSIDE startRecording after `mediaStream = stream;` (line ~247):
// Wire the runtime mediaStream reference into the hook. Gated identically so
// the production bundle has zero hook reference at this site.
if (import.meta.env.MODE === 'test') {
  // offscreen-hooks.ts installs __mokoshTest.getCurrentStream at registration
  // time as a closure over a mutable `currentStream` cell; update that cell here.
  // The module is already loaded, so this import resolves from the module cache.
  const hooks = await import('../test-hooks/offscreen-hooks');
  hooks.setCurrentStream(stream);
}
```

(Note: the executor may flatten this. The simpler shape is to expose a `setCurrentStream` function from offscreen-hooks.ts that the recorder calls after assignment. The hook side closes over a mutable `currentStream` variable. See Task 2 step 5.)
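
The closure-over-a-cell shape described in the note could look roughly like the following. This is a sketch under stated assumptions, not the contents of `offscreen-hooks.ts`: `StreamLike` stands in for `MediaStream` so the code runs outside a browser, `registerOffscreenHooks` is a hypothetical name, and a plain object stands in for `globalThis`.

```typescript
// Stand-in for MediaStream (assumption) so the sketch runs under Node.
type StreamLike = { id: string };

interface OffscreenSurface {
  getCurrentStream: () => StreamLike | null;
}

// Mutable cell closed over by the getter; null means "not currently recording".
let currentStream: StreamLike | null = null;

// Called by the recorder right after it assigns its own mediaStream.
function setCurrentStream(stream: StreamLike | null): void {
  currentStream = stream;
}

// Registration step: runs once, when the hook module is first imported.
function registerOffscreenHooks(target: { __mokoshTest?: OffscreenSurface }): void {
  target.__mokoshTest = { getCurrentStream: () => currentStream };
}

const fakeGlobal: { __mokoshTest?: OffscreenSurface } = {};
registerOffscreenHooks(fakeGlobal);

const beforeRecording = fakeGlobal.__mokoshTest!.getCurrentStream();
setCurrentStream({ id: 'capture-1' });
const whileRecording = fakeGlobal.__mokoshTest!.getCurrentStream();
console.log(beforeRecording, whileRecording);
```

Because the getter reads the cell lazily, the harness sees whichever stream the recorder most recently registered, without the hook module holding any reference into the recorder itself.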

### Vite test config skeleton (vite.test.config.ts)

```typescript
import { defineConfig, mergeConfig } from 'vite';
import baseConfig from './vite.config';

export default defineConfig(() =>
  mergeConfig(baseConfig, {
    mode: 'test',
    build: {
      outDir: 'dist-test',
      emptyOutDir: true,
    },
  }),
);
```

### npm scripts to add (package.json)

```jsonc
{
  "scripts": {
    "dev": "vite",
    "build": "tsc && vite build",
    "build:test": "tsc && vite build --mode test --config vite.test.config.ts",
    "preview": "vite preview",
    "test": "vitest run",
    "test:uat": "npm run build:test && tsx tests/uat/harness.test.ts"
  }
}
```

### Existing surfaces the executor must NOT alter (regression risk)

- `src/background/index.ts` lines 725-778 (RECORDING_ERROR conditional routing): Bug B fix landed at b9eeeeb; harness asserts this is intact.
- `src/offscreen/recorder.ts` lines 451-480 (`onUserStoppedSharing`): Bug B handler; harness assertion 6 verifies the dispatchEvent path reaches it.
- `tests/background/sw-bundle-import.test.ts`: Tier-1 gate; the new `no-test-hooks-in-prod-bundle.test.ts` follows the same pattern but inspects the BUILT artifact for hook leaks.
- `manifest.json`: already declares `notifications` permission + all 3 icon sizes; harness assertions 8, 9, 10 read these as-is.
- ALL existing 83 vitest tests: must remain GREEN.
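
The built-artifact inspection that `no-test-hooks-in-prod-bundle.test.ts` performs can be sketched as a recursive directory walk plus a substring check. The helper names `listFiles` and `filesContaining` are hypothetical, and the demo runs against a throwaway temp directory standing in for `dist/` (the real gate would first spawn `npm run build`).

```typescript
import { mkdirSync, mkdtempSync, readdirSync, readFileSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Recursively lists every file under `dir`.
function listFiles(dir: string): string[] {
  return readdirSync(dir, { withFileTypes: true }).flatMap((entry) => {
    const full = join(dir, entry.name);
    return entry.isDirectory() ? listFiles(full) : [full];
  });
}

// Returns the files under `dir` whose contents include `needle`.
// 'latin1' decoding keeps the byte-for-byte search safe for binary assets.
function filesContaining(dir: string, needle: string): string[] {
  return listFiles(dir).filter((file) => readFileSync(file, 'latin1').includes(needle));
}

// Demo: a fake dist/ tree (hypothetical layout), first clean, then leaky.
const fakeDist = mkdtempSync(join(tmpdir(), 'fake-dist-'));
mkdirSync(join(fakeDist, 'assets'));
writeFileSync(join(fakeDist, 'assets', 'background.js'), 'chrome.action.setBadgeText({text:""});');
const cleanHits = filesContaining(fakeDist, '__mokoshTest');

writeFileSync(join(fakeDist, 'assets', 'leaky.js'), 'globalThis.__mokoshTest={};');
const leakHits = filesContaining(fakeDist, '__mokoshTest');

console.log(cleanHits.length, leakHits.length);
```

Returning the offending file paths rather than a bare count gives the gate a more useful failure message when a leak does occur.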

### Resolved open questions from RESEARCH (5)

| # | Question | Resolution | Rationale |
|---|----------|------------|-----------|
| 1 | Where does `simulateUserStop` shim live? | `src/test-hooks/offscreen-hooks.ts` exports a `setCurrentStream(stream: MediaStream)` setter the recorder calls after assignment. The hook's `__mokoshTest.getCurrentStream` is a getter over the captured cell. `simulateUserStop` is harness-side (in `tests/uat/lib/offscreen.ts`), calling `dispatchEvent` directly on the track returned by `getCurrentStream()`: the offscreen-hooks side just exposes the stream; the simulate function is harness-side. | Minimum surface in production tree; the dispatchEvent invocation is harness-side so it's never bundled. |
| 2 | Notification assertions: count vs set-membership? | **Count + set-membership combined.** notificationCount asserts on TOTAL count (e.g. assertion 8: exactly 1 startup notification). notificationIds asserts on prefix membership (e.g. "an id starting 'mokosh-startup-' was created"). lastNotificationOptions asserts on iconUrl shape. | Pure count is brittle (retries inflate); pure set-membership misses overcount regressions. Combined assertions catch both. |
| 3 | CI plumbing scope: include or defer? | **Defer to Phase 5 (P1/P2 hardening) or its own Plan 01-12.** This plan ships a CI-callable harness (`npm run test:uat` exits 0 on pass, non-zero on fail) but no GitHub Actions wiring. Rationale: no existing CI infrastructure in the repo (verified: no `.github/workflows/` directory); adding CI here would force a CI-tool decision (Actions vs self-hosted) that is out of scope for Phase 1 stabilization. | Lowest-friction shipping; CI tool selection deserves its own plan. |
| 4 | Failure isolation: single browser vs per-assertion restart? | **Single browser, serial assertions.** Restart between assertions = ~3-5 s × 14 ≈ 42-70 s overhead per run. Single browser keeps total runtime under 60 s. Mitigation: structured diagnostic dump on first failure (SW console logs + offscreen console logs + screenshot) + `--bail` semantics (abort remaining assertions to keep failure mode unambiguous). | RESEARCH §5 recommendation matches; cost of state bleed is much lower than cost of state isolation overhead for 14 deterministic checks. |
| 5 | Test-hook contract location? | **Both.** Production-side canonical: `src/test-hooks/types.ts` (the file that ships with the test bundle and is type-checked by tsc). Harness-side mirror: `tests/uat/lib/test-hook-contract.d.ts` (decoupled from the production tree so the harness has no `import` reaching into `src/`). The mirror file's preamble cites the production-side file as the canonical source. Drift detection: a Tier-1-style test could later snapshot-diff the two; out of scope here, but documented as a follow-up note. | Type duplication is a small price for keeping `tests/` and `src/` import-separable. The drift risk is low because the shape is small (5 top-level fields). |
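
Resolution 2 (count + set-membership + iconUrl shape) could be expressed as one check that returns every problem it finds. A minimal sketch, assuming a hypothetical `checkStartupNotification` helper; the field names come from the hook contract, while the sample id and icon URL values are illustrative only.

```typescript
// Subset of the hook surface used by the check (field names from types.ts).
interface NotificationObservations {
  notificationCount: number;
  notificationIds: ReadonlyArray<string>;
  lastNotificationOptions: { iconUrl?: string } | null;
}

// Returns a list of problems; an empty list means the combined check passed.
function checkStartupNotification(
  obs: NotificationObservations,
  expectedCount: number,
  idPrefix: string,
  iconSuffix: string,
): string[] {
  const problems: string[] = [];
  // Count: catches overcount regressions (for example duplicate notifications).
  if (obs.notificationCount !== expectedCount) {
    problems.push(`expected ${expectedCount} notification(s) but got ${obs.notificationCount}`);
  }
  // Set-membership: at least one id must carry the expected prefix.
  if (!obs.notificationIds.some((id) => id.startsWith(idPrefix))) {
    problems.push(`no notification id starts with '${idPrefix}'`);
  }
  // Shape: the last notification must point at the expected icon.
  const iconUrl = obs.lastNotificationOptions?.iconUrl ?? '';
  if (!iconUrl.endsWith(iconSuffix)) {
    problems.push(`iconUrl '${iconUrl}' does not end with '${iconSuffix}'`);
  }
  return problems;
}

const icon = 'chrome-extension://abc/icons/icon48.png'; // illustrative value
const passing = checkStartupNotification(
  { notificationCount: 1, notificationIds: ['mokosh-startup-1'], lastNotificationOptions: { iconUrl: icon } },
  1, 'mokosh-startup-', 'icons/icon48.png',
);
const overcount = checkStartupNotification(
  { notificationCount: 2, notificationIds: ['mokosh-startup-1', 'mokosh-startup-2'], lastNotificationOptions: { iconUrl: icon } },
  1, 'mokosh-startup-', 'icons/icon48.png',
);
console.log(passing, overcount);
```

The overcount case passes the membership test but fails the count test, which is the regression a pure set-membership assertion would miss.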

### How to test Bug B without committing the revert

Per orchestrator brief ("rewinding the b9eeeeb conditional routing locally turns this assertion RED"):

1. Locally apply: `git apply <<'EOF' ... EOF` containing a temporary patch that reverts the `if (errorCode === 'user-stopped-sharing')` branch (so all errors route through `setErrorMode`).
2. Run `npm run test:uat`; assertion 6 (Bug B) MUST fail with a specific diagnostic (`expected badge text '' but got 'ERR'`).
3. Revert the local patch (`git checkout -- src/background/index.ts`).
4. Re-run `npm run test:uat`; assertion 6 MUST pass.

This RED-on-known-broken / GREEN-on-known-good cycle is the TDD discipline for the harness ITSELF. Each assertion in Task 5/6/7 includes this self-verification step in its action block.
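
The diagnostic wording in step 2 and the bail-on-first-failure behaviour could be produced by helpers along these lines. This is a simplified, synchronous sketch: `assertEqual` is named in the plan, but its signature here, the `runSerial` runner, and the `StepResult` shape are assumptions, and the real harness steps would be asynchronous.

```typescript
// Throws with the diagnostic format quoted in step 2 above.
function assertEqual<T>(label: string, actual: T, expected: T): void {
  if (actual !== expected) {
    throw new Error(`expected ${label} '${String(expected)}' but got '${String(actual)}'`);
  }
}

interface StepResult {
  name: string;
  ok: boolean;
  message?: string;
}

// Runs steps in order and stops at the first failure (bail semantics), so the
// report never contains follow-on failures caused by earlier state damage.
function runSerial(steps: Array<{ name: string; run: () => void }>): StepResult[] {
  const results: StepResult[] = [];
  for (const step of steps) {
    try {
      step.run();
      results.push({ name: step.name, ok: true });
    } catch (err) {
      results.push({ name: step.name, ok: false, message: (err as Error).message });
      break;
    }
  }
  return results;
}

// Simulated run with the Bug B revert applied: the badge reads 'ERR', not ''.
const results = runSerial([
  { name: 'popup cleared', run: () => assertEqual('popup', '', '') },
  { name: 'assertion 6 (Bug B)', run: () => assertEqual('badge text', 'ERR', '') },
  { name: 'never reached', run: () => assertEqual('popup', '', '') },
]);
console.log(results);
```

With the revert in place the run stops at assertion 6 and the third step is never attempted; the process exit code would then be derived from whether every result has `ok: true`.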

</interfaces>
</context>

<tasks>
|
||
|
||
<task type="auto" tdd="true">
|
||
<name>Task 1 (Wave 0): Install Puppeteer + tsx; add `vite.test.config.ts`; add `build:test` + `test:uat` npm scripts; commit Tier-1 hook-leak grep gate as RED.</name>
|
||
<read_first>
|
||
- package.json (existing scripts + devDeps — confirm puppeteer + tsx absent)
|
||
- vite.config.ts (the base config the new test config will merge over)
|
||
- tests/background/sw-bundle-import.test.ts (Tier-1 gate pattern to mirror)
|
||
- tsconfig.json (confirm `include` covers `src/**/*` — needed for src/test-hooks/)
|
||
- .planning/phases/01-stabilize-video-pipeline/01-11-RESEARCH.md §10 (two-bundle build orchestration)
|
||
</read_first>
|
||
<files>package.json, package-lock.json, vite.test.config.ts, tsconfig.json, tests/background/no-test-hooks-in-prod-bundle.test.ts</files>
|
||
<behavior>
|
||
- `npm install --save-dev puppeteer@^25.0.2 tsx@^4` lands cleanly. Both publish to npm registry as MIT-licensed packages with active maintenance windows (puppeteer 25.0.2 published 2025; tsx 4.x current). Pin both with caret ranges per project convention.
|
||
- `vite.test.config.ts` exists, extends `./vite.config.ts` via `mergeConfig`, sets `mode: 'test'` + `build.outDir: 'dist-test'` + `build.emptyOutDir: true`. Running `npx vite build --config vite.test.config.ts --mode test` produces `dist-test/` (verifiable via `test -d dist-test`).
|
||
- `package.json` `scripts` block adds `build:test` and `test:uat` per the interfaces block. `npm run build:test` exits 0 and produces `dist-test/`.
|
||
- `tsconfig.json` `include` covers `src/test-hooks/**/*` (verify it does already via the `src/**/*` glob; no edit needed if `include` is already that wildcard — check first and only add if absent).
|
||
- `tests/background/no-test-hooks-in-prod-bundle.test.ts` exists with TWO `it` blocks:
|
||
(a) After `npm run build`, ZERO occurrences of `__mokoshTest` in any file under `dist/`. RED today because the gate test is committed BEFORE the hooks land — the gate is asserting on a not-yet-extant invariant. **CORRECTION:** RED-then-GREEN polarity here is inverted vs typical TDD: the gate ITSELF is GREEN today (no hooks → no leak), but the GATE must REMAIN GREEN after Task 2 lands the hooks. The test is committed in this task so the gate is operational BEFORE the hooks ship, eliminating the window-of-vulnerability where the production bundle could contain leaked hooks. Document this polarity in the test file preamble.
|
||
(b) After `npm run build`, ZERO occurrences of `simulateUserStop` in any file under `dist/`. Same polarity: GREEN today, must remain GREEN after hooks land.
- Both `it` blocks run a fresh `npm run build` as part of their setup (spawned via `child_process.execFile`, mirroring sw-bundle-import.test.ts's spawn pattern). They then `readdir`+`readFileSync` walk `dist/` and assert grep counts are zero. Skip the build spawn if `process.env.SKIP_BUILD === '1'` (developer escape hatch when running the test repeatedly during this task's iteration).
- The 83 baseline vitest tests + 2 new gate tests = 85 tests, ALL GREEN. (The Tier-1 gate is committed in a working state from day one.)
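The `readdir` + `readFileSync` walk that both gate tests rely on could be sketched as follows. This is a minimal illustration, not the project's implementation: the helper name `grepRecursive` is borrowed from the Task 3 harness snippet, its synchronous signature is an assumption, and the fixture directory stands in for `dist/`.

```typescript
import { mkdirSync, mkdtempSync, readdirSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

/** Returns the paths of all files under `root` whose contents include `needle`. */
function grepRecursive(root: string, needle: string): string[] {
  const hits: string[] = [];
  for (const entry of readdirSync(root, { withFileTypes: true })) {
    const full = join(root, entry.name);
    if (entry.isDirectory()) {
      // Descend into nested chunk/asset directories.
      hits.push(...grepRecursive(full, needle));
    } else if (entry.isFile() && readFileSync(full, "utf8").includes(needle)) {
      hits.push(full);
    }
  }
  return hits;
}

// Usage against a throwaway fixture (hypothetical stand-in for dist/).
const fixture = mkdtempSync(join(tmpdir(), "hook-gate-"));
mkdirSync(join(fixture, "assets"));
writeFileSync(join(fixture, "assets", "sw.js"), "console.log('clean bundle');");
writeFileSync(join(fixture, "leaky.js"), "globalThis.__mokoshTest = {};");
const leaks = grepRecursive(fixture, "__mokoshTest");
```

In the real gate the expectation would be an empty result for both `__mokoshTest` and `simulateUserStop`; the fixture above deliberately contains one leak so the helper's positive path is visible.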
</behavior>
<action>
1. Read `package.json` to confirm `puppeteer` + `tsx` absent.
2. `npm install --save-dev puppeteer@^25.0.2 tsx@^4` — observe versions resolve correctly. Document the actually-resolved versions in the commit message body.
3. Update `package.json` `scripts` block per the interfaces section — add `build:test` and `test:uat`. Leave existing scripts (`dev`, `build`, `preview`, `test`) untouched.
4. Create `vite.test.config.ts` at repo root per the interfaces skeleton.
5. Verify `tsconfig.json` `include` covers `src/test-hooks/**/*`: if `include` is `["src/**/*"]` and no `exclude` entry blocks that path, no edit is needed. Document the actual `tsconfig.json` shape in the commit message body so reviewers see the verification ran.
6. Run `npm run build:test` → exit 0; `ls dist-test/` confirms emission. Run `npm run build` → exit 0; `ls dist/` confirms separate output.
7. Create `tests/background/no-test-hooks-in-prod-bundle.test.ts` with the two `it` blocks per behavior (a) + (b). Preamble docstring per project style: extensive (Google Python style mandate carries over — keep mirroring sw-bundle-import.test.ts's docstring density). Cite that this is a Tier-1 gate per `feedback-pre-checkpoint-bundle-gates.md` (the auto-loaded memory item).
8. Run `npx vitest run tests/background/no-test-hooks-in-prod-bundle.test.ts` → both GREEN (no hooks landed yet, nothing leaks).
9. Run `npx vitest run` (full suite) → 83 baseline + 2 new = 85 GREEN. Document the baseline + delta in the commit message body.
10. Run `npx tsc --noEmit` → exit 0.
11. Verify there is no `npm test` regression: rerun `npm test` → 85 GREEN.
Per project style: extensive docstrings; absolute imports; no `as any`; no `@ts-ignore`. The new test file is the first one to touch `child_process.execFile` since `sw-bundle-import.test.ts` — mirror that file's pattern verbatim (execFile + maxBuffer + timeout + stdout sentinel scheme). Do NOT introduce a new pattern.
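The `SKIP_BUILD` escape hatch combined with a bounded child-process spawn might look like the sketch below. It is an illustration only: the authoritative spawn pattern is the one in `sw-bundle-import.test.ts`, which is not reproduced here and takes precedence. The helper name `runBuildUnlessSkipped`, the synchronous `execFileSync` call, and the buffer and timeout values are all assumptions.

```typescript
import { execFileSync } from "node:child_process";

interface SpawnResult {
  skipped: boolean;
  stdout: string;
}

/**
 * Runs `command args...` unless SKIP_BUILD=1 is set in `env`.
 * maxBuffer and timeout bound the child so a hung build fails the test
 * instead of stalling the suite (values here are placeholders).
 */
function runBuildUnlessSkipped(
  command: string,
  args: readonly string[],
  env: Record<string, string | undefined> = process.env,
): SpawnResult {
  if (env.SKIP_BUILD === "1") {
    return { skipped: true, stdout: "" };
  }
  const stdout = execFileSync(command, [...args], {
    encoding: "utf8",
    maxBuffer: 64 * 1024 * 1024,
    timeout: 120_000,
  });
  return { skipped: false, stdout };
}

// Usage with a trivial child process standing in for `npm run build`.
const demo = runBuildUnlessSkipped(process.execPath, ["-e", "console.log('BUILD_OK')"], {});
const skippedDemo = runBuildUnlessSkipped(process.execPath, ["-e", "process.exit(1)"], { SKIP_BUILD: "1" });
```

Passing the environment as a parameter keeps the skip decision testable without mutating `process.env`.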
</action>
<verify>
<automated>npm run build:test && npm run build && test -d dist-test && test -d dist && npx vitest run tests/background/no-test-hooks-in-prod-bundle.test.ts && npx tsc --noEmit</automated>
</verify>
<acceptance_criteria>
- `package.json` devDeps include `puppeteer` + `tsx` at the pinned versions; `scripts` block carries `build:test` + `test:uat`.
- `vite.test.config.ts` exists, extends base config, emits to `dist-test/`.
- `npm run build:test` exits 0; `dist-test/` populated.
- `npm run build` exits 0; `dist/` populated separately (no clobber).
- `tests/background/no-test-hooks-in-prod-bundle.test.ts` exists with 2 tests; both GREEN.
- Full vitest suite: 83 baseline + 2 new = 85 GREEN.
- `npx tsc --noEmit` exit 0.
</acceptance_criteria>
<done>Two-bundle infrastructure landed; Tier-1 hook-leak gate operational (GREEN, will remain GREEN after Task 2 hooks land); npm scripts wired; baseline preserved.</done>
</task>
<task type="auto" tdd="true">
<name>Task 2 (Wave 1): Add gated test hooks to SW + offscreen; verify production bundle remains hook-free (Tier-1 gate stays GREEN).</name>
<read_first>
- src/background/index.ts (top-of-module — where the import.meta.env.MODE guard lands; lines 1-50)
- src/offscreen/recorder.ts (top-of-module + line ~247 where mediaStream is assigned)
- tests/background/sw-bundle-import.test.ts (the Tier-1 SW-bundle-loadability gate — confirm it still passes after hooks land in test bundle)
- tests/background/no-test-hooks-in-prod-bundle.test.ts (the gate from Task 1)
- .planning/phases/01-stabilize-video-pipeline/01-11-RESEARCH.md §6 (Vite tree-shaking gotchas)
- vite.test.config.ts (from Task 1)
</read_first>
<files>src/test-hooks/types.ts, src/test-hooks/sw-hooks.ts, src/test-hooks/offscreen-hooks.ts, src/background/index.ts, src/offscreen/recorder.ts, tests/uat/lib/test-hook-contract.d.ts</files>
<behavior>
- `src/test-hooks/types.ts` exports `MokoshTestSurface` + declares `globalThis.__mokoshTest` per the interfaces block.
- `src/test-hooks/sw-hooks.ts` registers the SW-side hook at module-load: monkey-patches `chrome.action.onClicked.addListener`, `chrome.runtime.onStartup.addListener`, `chrome.notifications.onClicked.addListener` to capture handler refs while still calling the originals. Wraps `chrome.notifications.create` to increment `notificationCount`, push id to `notificationIds`, save `lastNotificationOptions`. Initializes `globalThis.__mokoshTest = { handlers: {...}, notificationCount: 0, lastNotificationOptions: null, notificationIds: [] }`. NO `getCurrentStream` in SW (the field is optional per type — undefined in SW context).
- `src/test-hooks/offscreen-hooks.ts` registers the offscreen-side hook: exposes a mutable `currentStream: MediaStream | null` cell + `setCurrentStream(s)` setter + `__mokoshTest.getCurrentStream = () => currentStream` getter. The recorder calls `setCurrentStream` after the `mediaStream = stream` assignment (gated by the same MODE check).
- `src/background/index.ts` top-of-module gets:
```typescript
if (import.meta.env.MODE === 'test') {
  await import('../test-hooks/sw-hooks');
}
```
Placement: BEFORE any `addListener` calls in the file so the monkey-patches catch every handler. This is a top-level `await` — supported in SW context per crxjs/Vite's MV3 module emission.
- `src/offscreen/recorder.ts` top-of-module gets the symmetric gated import; the `setCurrentStream` call lands inside `startRecording` right after `mediaStream = stream;` (line 247), also gated.
- `tests/uat/lib/test-hook-contract.d.ts` mirrors `MokoshTestSurface` for harness-side consumption (it's a declaration file; not bundled, only used at type-check time on the harness).
- After all changes, `npm run build` exits 0 AND `tests/background/no-test-hooks-in-prod-bundle.test.ts` REMAINS GREEN (the literal `__mokoshTest` does NOT appear in any file under `dist/`). `npm run build:test` exits 0 AND ONE OR MORE files under `dist-test/` contain `__mokoshTest` (verifiable by `grep -rl __mokoshTest dist-test/`).
- `tests/background/sw-bundle-import.test.ts` REMAINS GREEN (Layer 1 + Layer 2; the gated dynamic import does not break the production bundle's module init).
- Full vitest suite: 85 GREEN (no regression).
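The handler-capturing monkey-patch described for `sw-hooks.ts` can be illustrated without the `chrome.*` API. The sketch below is an assumption-laden stand-in: `ListenerRegistry`, `captureListener`, and `FakeEvent` are hypothetical names, and the fake event only mimics the `addListener` shape of a Chrome event object.

```typescript
/** Minimal structural stand-in for an event object exposing addListener. */
interface ListenerRegistry<F> {
  addListener(callback: F): void;
}

/**
 * Replaces `event.addListener` with a wrapper that records the callback
 * and then forwards to the original, so production registration still happens.
 */
function captureListener<F>(
  event: ListenerRegistry<F>,
  onCapture: (callback: F) => void,
): void {
  const original = event.addListener.bind(event);
  event.addListener = (callback: F): void => {
    onCapture(callback);
    original(callback);
  };
}

/** Fake event used only for this illustration. */
class FakeEvent<F> implements ListenerRegistry<F> {
  readonly registered: F[] = [];
  addListener(callback: F): void {
    this.registered.push(callback);
  }
}

const handlers: { onStartup: (() => void) | null } = { onStartup: null };
const fakeOnStartup = new FakeEvent<() => void>();
captureListener(fakeOnStartup, (cb) => {
  handlers.onStartup = cb;
});

// "Production" code registers its listener after the patch is installed.
let fired = 0;
fakeOnStartup.addListener(() => {
  fired += 1;
});

// A test can now invoke the captured handler directly.
handlers.onStartup?.();
```

The ordering matters in the same way the plan describes: the patch must be installed before the first `addListener` call, otherwise the handler reference is never captured.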
</behavior>
<action>
1. Create `src/test-hooks/types.ts` per the interfaces block. Extensive JSDoc; cite this plan's Task 2 + RESEARCH §6 (gating mechanism) + RESEARCH §7 (Bug B BLOCKER context for getCurrentStream's role).
2. Create `src/test-hooks/sw-hooks.ts`. Monkey-patch pattern follows RESEARCH §6 Pattern 1. Wrap `chrome.notifications.create` so all four shape fields update (count, last options, ids array, no-op chain to the original create). Use absolute Chrome types from `@types/chrome` — no `as any`. Initialization at module load:
```typescript
const handlers: MokoshTestSurface['handlers'] = {
  onClicked: null, onStartup: null, notificationOnClicked: null,
};
const notificationIds: string[] = [];

const origActionAdd = chrome.action.onClicked.addListener.bind(chrome.action.onClicked);
chrome.action.onClicked.addListener = (cb) => {
  handlers.onClicked = cb;
  return origActionAdd(cb);
};
// ... similarly for onStartup, notifications.onClicked ...

const origNotifCreate = chrome.notifications.create.bind(chrome.notifications);
(chrome.notifications.create as unknown) = (
  idOrOptions: string | chrome.notifications.NotificationOptions,
  optionsOrCb?: chrome.notifications.NotificationOptions | ((id: string) => void),
  maybeCb?: (id: string) => void,
) => {
  // Handle both (id, options, cb) and (options, cb) overloads;
  // surface the resolved id in notificationIds.
  // Call origNotifCreate with the same args; wrap the callback to push id.
  // Increment notificationCount; save lastNotificationOptions.
  // Return the original return value (Chrome 88+ also Promise-returning).
};

globalThis.__mokoshTest = {
  handlers,
  notificationCount: 0,
  lastNotificationOptions: null,
  get notificationIds() { return notificationIds.slice(); },
};
```
The `as unknown` cast in the `create` reassignment is unavoidable because Chrome's `create` is typed as overloaded callable; document this explicitly with a comment citing the overload variance issue. NO `as any` — the `as unknown` + downstream typed body is the project-style escape hatch.
3. Create `src/test-hooks/offscreen-hooks.ts`:
```typescript
let currentStream: MediaStream | null = null;
export function setCurrentStream(stream: MediaStream | null): void {
  currentStream = stream;
}
globalThis.__mokoshTest = {
  // ...inherit SW's surface if it was set first; in offscreen context
  // sw-hooks.ts did NOT run because this is a different document.
  // So we initialize a fresh shape with only the offscreen-relevant fields:
  handlers: { onClicked: null, onStartup: null, notificationOnClicked: null },
  notificationCount: 0,
  lastNotificationOptions: null,
  notificationIds: [],
  getCurrentStream: () => currentStream,
};
```
Note: the SW and offscreen are DIFFERENT JS isolates with DIFFERENT `globalThis`. The harness reads each surface via the appropriate `sw.evaluate` or `offPage.evaluate`. No cross-context shared state.
4. Edit `src/background/index.ts` — add the gated dynamic import at the TOP of the file (after any necessary type imports but BEFORE the existing logger initialization + addListener calls). Document inline that the placement is load-order-critical: this MUST run before any addListener.
5. Edit `src/offscreen/recorder.ts`:
(a) Top-of-module: gated dynamic import per the SW pattern.
(b) Inside `startRecording`, immediately after `mediaStream = stream;` (line ~247): gated `setCurrentStream(stream)` call. Use a top-level captured reference to the hooks module (set during the top-of-module import via a module-scoped `let hooks: typeof import('../test-hooks/offscreen-hooks') | null = null;` plus assignment in the import block). This avoids re-import per startRecording call.
6. Create `tests/uat/lib/test-hook-contract.d.ts`. Mirror `MokoshTestSurface`. Add a preamble docstring citing `src/test-hooks/types.ts` as the canonical source AND noting the drift-risk (manual sync) + the rationale for decoupling (no `import` from `tests/` into `src/`).
7. Run `npx tsc --noEmit` → exit 0 (all hook code typechecks).
8. Run `npm run build` (production). Then check `grep -rln __mokoshTest dist/` → ZERO matches. The Tier-1 gate test `tests/background/no-test-hooks-in-prod-bundle.test.ts` MUST stay GREEN.
9. Run `npm run build:test`. Then check `grep -rln __mokoshTest dist-test/` → ONE OR MORE matches (the hook code is bundled into the test build).
10. Run `npx vitest run` (full suite). 85 GREEN. The SW-bundle-import test must also be GREEN — verifies the gated dynamic import does NOT break production module init.
11. Sanity-check: open one of the production bundle's chunk files (the SW chunk via `dist/service-worker-loader.js` → its imported chunk) and confirm by eye that no `__mokoshTest` string is present. The grep gate is authoritative, but a manual eyeball ensures the gate isn't fooled by some bundler renaming.
DESIGN NOTE: the gated dynamic import IS the tree-shake trigger. If Vite ever fails to tree-shake a dynamic import behind a literal-comparison guard (which it shouldn't per RESEARCH §6 — the literal `'test'` !== `'production'` comparison is a static dead branch in production), the Tier-1 gate fails LOUDLY at CI time. The gate is THE mitigation for assumption A3 in RESEARCH §6.
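The overload handling that step 2 leaves as comments in the `chrome.notifications.create` wrapper could be filled in roughly as below. This is a sketch under stated assumptions: it uses locally defined stand-in types (`NotificationOptions`, `CreateFn`) instead of `@types/chrome`, covers only the callback form (not the Promise-returning form), and `wrapCreate`, `NotificationSnapshot`, and `fakeCreate` are hypothetical names.

```typescript
interface NotificationOptions {
  title: string;
  message: string;
  iconUrl?: string;
}
type CreateCallback = (id: string) => void;
type CreateFn = (
  idOrOptions: string | NotificationOptions,
  optionsOrCb?: NotificationOptions | CreateCallback,
  maybeCb?: CreateCallback,
) => void;

interface NotificationSnapshot {
  notificationCount: number;
  lastNotificationOptions: NotificationOptions | null;
  notificationIds: string[];
}

/** Wraps `original` so every call updates `snapshot`, then forwards unchanged. */
function wrapCreate(original: CreateFn, snapshot: NotificationSnapshot): CreateFn {
  return (idOrOptions, optionsOrCb, maybeCb) => {
    let options: NotificationOptions | null = null;
    let userCb: CreateCallback | undefined;
    if (typeof idOrOptions === "string") {
      // (id, options, cb?) overload
      if (typeof optionsOrCb === "object") options = optionsOrCb;
      userCb = maybeCb;
    } else {
      // (options, cb?) overload
      options = idOrOptions;
      if (typeof optionsOrCb === "function") userCb = optionsOrCb;
    }
    snapshot.notificationCount += 1;
    snapshot.lastNotificationOptions = options;
    const recordId: CreateCallback = (id) => {
      snapshot.notificationIds.push(id); // resolved id, explicit or generated
      userCb?.(id);
    };
    if (typeof idOrOptions === "string") {
      original(idOrOptions, options ?? undefined, recordId);
    } else {
      original(idOrOptions, recordId);
    }
  };
}

// Fake `create` for illustration: generates an id when none is supplied.
let nextId = 0;
const fakeCreate: CreateFn = (idOrOptions, optionsOrCb, maybeCb) => {
  const id = typeof idOrOptions === "string" ? idOrOptions : `auto-${nextId++}`;
  const cb = typeof optionsOrCb === "function" ? optionsOrCb : maybeCb;
  cb?.(id);
};

const snapshot: NotificationSnapshot = {
  notificationCount: 0,
  lastNotificationOptions: null,
  notificationIds: [],
};
const create = wrapCreate(fakeCreate, snapshot);
create("recovery", { title: "Mokosh", message: "Recording stopped" });
create({ title: "Mokosh", message: "Second" }, () => {});
```

Keeping the snapshot as a plain mutable object mirrors the `{ count, lastOptions, ids }` shape that the harness-side `getNotificationSnapshot` helper is expected to read.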
</action>
<verify>
<automated>npx tsc --noEmit && npm run build && test "$(grep -rln __mokoshTest dist/ | wc -l)" -eq 0 && npm run build:test && test "$(grep -rln __mokoshTest dist-test/ | wc -l)" -ge 1 && npx vitest run --reporter=dot</automated>
</verify>
<acceptance_criteria>
- `src/test-hooks/{types,sw-hooks,offscreen-hooks}.ts` exist with the contracts described.
- `src/background/index.ts` + `src/offscreen/recorder.ts` carry the gated dynamic import block; in offscreen, also the `setCurrentStream(stream)` call inside `startRecording`.
- `tests/uat/lib/test-hook-contract.d.ts` mirrors the type.
- `npm run build` exits 0; `grep -rln __mokoshTest dist/` → 0 matches.
- `npm run build:test` exits 0; `grep -rln __mokoshTest dist-test/` → ≥1 match.
- Tier-1 grep gate (`tests/background/no-test-hooks-in-prod-bundle.test.ts`) GREEN.
- Tier-1 SW-bundle-import gate (`tests/background/sw-bundle-import.test.ts`) GREEN.
- Full vitest suite: 85 GREEN.
- `npx tsc --noEmit` exit 0.
</acceptance_criteria>
<done>Hook surfaces live in test bundle; absent in production bundle (Tier-1 grep gate verifies); SW + offscreen module init unchanged for production; baseline preserved.</done>
</task>
<task type="auto" tdd="true">
<name>Task 3 (Wave 2): Build harness scaffolding — `tests/uat/lib/{launch,extension,sw,offscreen,assertions,zip}.ts` + `harness.test.ts` skeleton with assertion 0 wired and assertions 1-13 stubbed as failing.</name>
<read_first>
- .planning/phases/01-stabilize-video-pipeline/01-11-RESEARCH.md §1 (Puppeteer extension API patterns)
- .planning/phases/01-stabilize-video-pipeline/01-11-RESEARCH.md §4 (target type quirk for offscreen)
- .planning/phases/01-stabilize-video-pipeline/01-11-RESEARCH.md §7 (Bug B dispatchEvent contract — BLOCKER)
- .planning/phases/01-stabilize-video-pipeline/01-11-RESEARCH.md §11 (per-assertion implementation hints)
- src/test-hooks/types.ts (from Task 2)
- tests/uat/lib/test-hook-contract.d.ts (from Task 2)
- tests/background/sw-bundle-import.test.ts (execFile child-process pattern — only relevant for assertion 0 which uses fs.readdir directly, not a spawned child)
</read_first>
<files>tests/uat/lib/launch.ts, tests/uat/lib/extension.ts, tests/uat/lib/sw.ts, tests/uat/lib/offscreen.ts, tests/uat/lib/assertions.ts, tests/uat/lib/zip.ts, tests/uat/harness.test.ts, tests/uat/README.md</files>
<behavior>
- `tests/uat/lib/launch.ts` exports `launchHarnessBrowser(options?: HarnessOptions): Promise<HarnessHandles>` returning `{ browser, sw, ext, page, downloadsDir }`. Reads `HEADLESS` env var (`'0'` = headful for debug, anything else = headless). Wires Chrome args per RESEARCH §1 + §9.
- `tests/uat/lib/extension.ts` exports `attachToSw`, `attachToOffscreen`, `waitForOffscreen` per the RESEARCH §4 patterns. The offscreen attach uses the `background_page` target type + `.asPage()` (Pitfall 1).
- `tests/uat/lib/sw.ts` exports `getBadgeText(sw)`, `getPopup(sw)`, `getManifest(sw)`, `getIconSize(sw, path)`, `fireOnStartup(sw)`, `sendSyntheticRecordingError(sw, errorCode)`, `keepalivePing(sw)`, `getNotificationSnapshot(sw)`.
- `tests/uat/lib/offscreen.ts` exports `getDisplaySurface(offPage)`, `simulateUserStop(offPage)` (the dispatchEvent path per RESEARCH §7 BLOCKER — with an inline comment block citing the BLOCKER reasoning so future readers don't refactor it to `track.stop()`).
- `tests/uat/lib/assertions.ts` exports `assertEqual(actual, expected, msg)` + `assertMatch(actual, regex, msg)` + `assertTrue(cond, msg)` + a structured `runAssertion(name, fn)` wrapper that runs a single assertion, captures any SW/offscreen console logs since the last assertion, and dumps them to stderr on failure. Uses `node:assert/strict` per RESEARCH §4.
- `tests/uat/lib/zip.ts` exports `assertArchiveShape(zipBuf, expectedVersion)` — opens with jszip, asserts `video/last_30sec.webm` present + `meta.json` carries `version === expectedVersion`. The meta.json shape is per Plan 01-07 (existing archive contract — read once at the start of the harness and pass through).
- `tests/uat/harness.test.ts` is the single Node script (tsx-runnable). Top-to-bottom narrative:
```
0. Pre-flight grep gate (filesystem readdir on dist/) — assertion 0.
1. launchHarnessBrowser → attachToSw → attachToOffscreen-when-ready.
2. Assertion 1: SW bootstrap → setIdleMode (badge '', popup '', isRecording=false).
3. Assertion 2: triggerExtensionAction → wait → badge 'REC' + popup === src/popup/index.html + isRecording=true.
4. Assertion 3: offscreen track displaySurface === 'monitor'.
5. Assertion 4: triggerExtensionAction (while recording) → popup opens, NO new offscreen target.
6. Assertion 5: sendMessage SAVE_ARCHIVE → wait for download → check downloadsDir contains session_report_*.zip.
7. Assertion 6 (BUG B): simulateUserStop → wait 300ms → badge '' + popup '' + isRecording=false + notificationCount delta = 0.
8. Assertion 7 (ERROR path): sendSyntheticRecordingError('codec-unsupported') → badge 'ERR' + notificationCount delta = 1.
9. Assertion 8 (BUG A + onStartup): fireOnStartup → notifications.create called once with iconUrl matching icons/icon48.png (or icon128.png — verify which one the production code uses; the badge_state_machine plan uses icon128, but the test asserts whichever the production code actually invokes per the lastNotificationOptions snapshot).
10. Assertion 9: icon file sizes via sw.evaluate(fetch) ≥ floors (16: 200B, 48: 500B, 128: 1024B).
11. Assertion 10: manifest has 'notifications' permission + icons.16 + icons.48 + icons.128 declared.
12. Assertion 11 (35s record): start a fresh recording, wait 35s, query SW (via runtime message → offscreen → segments count) → segments.length >= 3.
13. Assertion 12 (ffprobe gate): trigger SAVE_ARCHIVE, extract video/last_30sec.webm, spawn ffprobe → exit 0.
14. Assertion 13 (zip shape): assertArchiveShape on the latest session_report_*.zip.
15. Final summary: `console.log('UAT harness: 14/14 assertions passed')`; exit 0.
```
Assertions 1-13 are stubbed today as `runAssertion('N: title', async () => { throw new Error('NOT YET IMPLEMENTED — Task 5+ wires this'); });` so the harness exits non-zero with a clear "N assertions failed" diagnostic. Assertion 0 (filesystem-only) is wired in this task.
- `tests/uat/README.md` documents:
- How to run: `npm run test:uat` (build + harness).
- Local-debug headful mode: `HEADLESS=0 npm run test:uat`.
- Skipping the build (developer iteration): `SKIP_BUILD=1 npx tsx tests/uat/harness.test.ts` (the build is the npm-script wrapper; the harness itself can run against an existing `dist-test/`).
- Locale gotcha: `--auto-select-desktop-capture-source="Entire screen"` works on en_US; other locales need the locale-equivalent string. Fallback to operator-pick + `KEEP_PROFILE=1` documented as the Plan 01-09 fallback.
- dev-dep size: puppeteer pulls ~150MB Chromium binary; CI must accept this. Production `npm install --omit=dev` skips it cleanly.
- Xvfb is NOT required (per RESEARCH §3 empirical probes on Chrome 148).
- Failure isolation choice: single browser, serial assertions, bail on first failure (RESEARCH §5 + open-question resolution 4).
- Running `npm run test:uat` exits NON-ZERO today (the 13 stubbed assertions all throw); the diagnostic clearly identifies which assertion failed AND why ("NOT YET IMPLEMENTED — Task 5+ wires this"). Assertion 0 (the grep gate) PASSES — confirming the harness scaffolding wires correctly and the only failures are intentional stubs.
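The `runAssertion` wrapper with console sinks, as described for `assertions.ts`, might be structured like the sketch below. The `ConsoleSinks` shape, the `formatSinkDump` helper, and the decision to drain the buffers after every assertion are assumptions made for illustration; in the real harness the buffers would be fed by the SW and offscreen `console` listeners.

```typescript
/** Accumulating log buffers, one per extension context (assumed shape). */
interface ConsoleSinks {
  sw: string[];
  offscreen: string[];
}

/** Renders both buffers with the structured preambles named in the plan. */
function formatSinkDump(sinks: ConsoleSinks): string {
  return [
    `SW console (last ${sinks.sw.length}):`,
    ...sinks.sw.map((line) => `  ${line}`),
    `Offscreen console (last ${sinks.offscreen.length}):`,
    ...sinks.offscreen.map((line) => `  ${line}`),
  ].join("\n");
}

/**
 * Runs one assertion. On failure, dumps the buffered logs to stderr and
 * rethrows; in every case the buffers are drained afterwards so the next
 * assertion only sees logs produced since this one.
 */
async function runAssertion(
  name: string,
  fn: () => Promise<void>,
  sinks: ConsoleSinks,
): Promise<void> {
  try {
    await fn();
    console.log(`PASS ${name}`);
  } catch (error) {
    console.error(`FAIL ${name}`);
    console.error(formatSinkDump(sinks));
    throw error;
  } finally {
    sinks.sw.length = 0;
    sinks.offscreen.length = 0;
  }
}

// Usage with hand-filled buffers standing in for real console listeners.
const sinks: ConsoleSinks = { sw: ["setIdleMode()"], offscreen: [] };
const dumpPreview = formatSinkDump(sinks);
const stubOutcome = runAssertion(
  "1: SW bootstrap",
  async () => {
    throw new Error("NOT YET IMPLEMENTED");
  },
  sinks,
).then(
  () => "passed",
  () => "failed",
);
```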
</behavior>
<action>
1. Create the `tests/uat/lib/` directory + all 6 helper files. Use absolute imports per project style. NO `as any`; type each helper's surface explicitly. Each helper file gets a top-of-file docstring per project style (extensive Google-style).
2. `launch.ts`: implementation uses `puppeteer.launch({ enableExtensions: [absolutePath], headless: ..., args: [...] })`. The absolutePath is computed via `path.resolve(__dirname, '../../../dist-test')` (`launch.ts` lives at `tests/uat/lib/launch.ts`, so `../../../` lands at repo root). Use `fileURLToPath` + `import.meta.url` for the `__dirname` shim (the harness runs as ESM under tsx).
3. `extension.ts`: implementation per RESEARCH §1 + §4 patterns. The offscreen attach uses `browser.waitForTarget(t => t.type() === 'background_page' && t.url().includes('offscreen'), { timeout: 5_000 })`. After getting the target, `.asPage()` returns the Page.
4. `sw.ts`: each helper is one or two lines of `sw.evaluate(...)`. The `getNotificationSnapshot` helper returns a structured `{ count, lastOptions, ids }` to keep the harness's reasoning unified.
5. `offscreen.ts` `simulateUserStop`:
```typescript
export async function simulateUserStop(offPage: Page): Promise<void> {
  // RESEARCH §7 BLOCKER — DO NOT REFACTOR to track.stop().
  // track.stop() does NOT fire 'ended' per W3C spec (verified probe7);
  // dispatchEvent IS the only path that triggers our production
  // onUserStoppedSharing handler. A test that calls track.stop() would
  // silently pass while production reality fails — exactly the trap
  // Bug B fix (commit b9eeeeb) addresses.
  await offPage.evaluate(() => {
    const stream = globalThis.__mokoshTest?.getCurrentStream?.();
    if (!stream) throw new Error('no current MediaStream — recording must be active');
    const track = stream.getVideoTracks()[0];
    if (!track) throw new Error('no video track in stream');
    track.dispatchEvent(new Event('ended'));
  });
}
```
6. `assertions.ts`: `runAssertion(name, fn)` captures `console.log`/`console.error` from the harness's own process; for SW + offscreen console logs, accept an optional `consoleSinks` parameter — the harness wires SW.on('console', ...) + offPage.on('console', ...) listeners at launch and passes their accumulating buffers to runAssertion. On assertion failure: dump buffers to stderr with structured "SW console (last N):" + "Offscreen console (last N):" preambles; rethrow.
7. `zip.ts`: jszip-based reader. The `expectedVersion` comes from `chrome.runtime.getManifest().version` (queried once at the start of the harness via `sw.evaluate`). Assertion is exact equality.
8. `harness.test.ts`: the top-to-bottom narrative. Wrap the whole thing in a top-level `try/finally`; the `finally` always calls `browser.close()`. The 13 stubs for assertions 1-13 all throw the "NOT YET IMPLEMENTED" Error. Assertion 0 is wired in this task:
```typescript
await runAssertion('0: production bundle has no __mokoshTest leak', async () => {
  // Filesystem-only — does not require the browser.
  // `npm run test:uat` only runs `npm run build:test`, so the production
  // bundle is rebuilt here before grepping. The no-test-hooks-in-prod-bundle
  // unit test already covers this invariant as part of `npm test`; the rebuild
  // re-verifies it for E2E robustness against the case where the unit test
  // passed against a stale dist/.
  const { execFileSync } = await import('node:child_process');
  execFileSync('npm', ['run', 'build'], { stdio: 'inherit' });
  const distDir = path.resolve(__dirname, '../../dist');
  const matches = await grepRecursive(distDir, '__mokoshTest');
  assertEqual(matches.length, 0, 'production dist/ must not contain __mokoshTest');
});
```
NOTE: assertion 0 spawns `npm run build` from inside the harness, which costs ~10s. The unit test (Task 1) makes this somewhat redundant — but the unit test runs in the vitest pass; the harness runs separately. Belt + suspenders. Alternative: skip the spawn if `process.env.SKIP_PROD_REBUILD === '1'` for developer iteration.
9. `README.md`: per the behavior list.
10. Run `npm run test:uat`. Expected output:
- `npm run build:test` runs first (succeeds; emits dist-test/).
- `tsx tests/uat/harness.test.ts` runs.
- Assertion 0 PASSES (filesystem grep gate).
- Assertions 1-13 all THROW "NOT YET IMPLEMENTED".
- Exit code: non-zero.
- Diagnostic line: "UAT harness: 1/14 assertions passed, 13 failed (first failure: Assertion 1)".
11. Run `npx tsc --noEmit` → exit 0 (all harness code type-clean against `@types/chrome` + puppeteer types + `tests/uat/lib/test-hook-contract.d.ts`).
12. Run `npx vitest run` (full suite) → 85 GREEN (no regression to unit tests; the harness lives outside vitest's discovery).
Per project style: extensive docstrings; absolute imports; no `as any`; no `@ts-ignore`; named callbacks (the runAssertion lambdas are short enough to be acceptable as inline arrows). Use if-else chains over early returns where the assertion logic has multi-arm branching; guard-clause early returns are fine for null-checks per established project exception.
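The two summary lines quoted in this task ("14/14 assertions passed" and "1/14 assertions passed, 13 failed (first failure: Assertion 1)") suggest a small tally step at the end of the harness. A minimal sketch follows; `AssertionResult` and `summarize` are hypothetical names, and the sketch assumes every assertion's outcome is recorded rather than the run stopping at the first failure.

```typescript
interface AssertionResult {
  name: string;
  passed: boolean;
}

/** Builds the final diagnostic line and the process exit code. */
function summarize(results: readonly AssertionResult[]): { line: string; exitCode: number } {
  const passed = results.filter((r) => r.passed).length;
  const failed = results.length - passed;
  if (failed === 0) {
    return {
      line: `UAT harness: ${passed}/${results.length} assertions passed`,
      exitCode: 0,
    };
  }
  const first = results.find((r) => !r.passed);
  return {
    line:
      `UAT harness: ${passed}/${results.length} assertions passed, ` +
      `${failed} failed (first failure: ${first?.name ?? "unknown"})`,
    exitCode: 1,
  };
}

// State expected at the end of this task: assertion 0 wired, 1-13 stubbed.
const stubRun: AssertionResult[] = Array.from({ length: 14 }, (_, i) => ({
  name: `Assertion ${i}`,
  passed: i === 0,
}));
const stubSummary = summarize(stubRun);

// State expected once every assertion is wired and GREEN.
const greenSummary = summarize(stubRun.map((r) => ({ ...r, passed: true })));
```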
</action>
<verify>
<automated>npx tsc --noEmit && (set +e; npm run test:uat; test $? -ne 0) && npx vitest run --reporter=dot</automated>
</verify>
<acceptance_criteria>
- All 6 helper files exist with the contracts described.
- `harness.test.ts` exists with assertion 0 wired (GREEN) + assertions 1-13 stubbed (RED).
- `README.md` documents the runtime + local-debug + CI semantics.
- `npm run test:uat` exits non-zero today; diagnostic clearly identifies assertion 0 as PASS + assertions 1-13 as "NOT YET IMPLEMENTED".
- `npx tsc --noEmit` exit 0 across both `src/` and `tests/` trees.
- Full vitest suite: 85 GREEN.
- No file under `src/` modified by this task (the harness is purely under `tests/`).
</acceptance_criteria>
<done>Harness scaffolding live with assertion 0 wired GREEN; assertions 1-13 staged as RED stubs for Tasks 4-7; baseline preserved.</done>
</task>
<task type="auto" tdd="true">
<name>Task 4 (Wave 3 — bundle 1/4): Wire assertions 1, 2, 3, 4 (SW bootstrap + toolbar onClicked + displaySurface + popup-during-recording).</name>
<read_first>
- tests/uat/harness.test.ts (skeleton from Task 3)
- tests/uat/lib/{launch,extension,sw,offscreen}.ts (helpers from Task 3)
- src/background/index.ts lines 75-108 (setIdleMode/setRecordingMode state machine — the production code these assertions verify)
- src/background/index.ts lines 411-415 (setRecordingMode call site inside startVideoCapture)
- src/background/index.ts lines 844-858 (chrome.action.onClicked listener registration)
- src/offscreen/recorder.ts lines 241-247 (getDisplayMedia call + mediaStream assignment)
- .planning/phases/01-stabilize-video-pipeline/01-11-RESEARCH.md §1 (triggerExtensionAction + the popup-vs-onClicked MV3 contract)
- .planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md (the must-haves these assertions are verifying)
</read_first>
<files>tests/uat/harness.test.ts, tests/uat/lib/sw.ts</files>
<behavior>
- Assertion 1 (SW bootstrap): after `launchHarnessBrowser` + attach SW, query `getBadgeText` (empty), `getPopup` (empty), `getIsRecording` (false — exposed via a new helper that reads `globalThis.isRecording` from the SW context via `sw.evaluate`; the SW production code has `isRecording` as a module-level let, accessible from the SW global). PASSES today against current bundle.
- Assertion 2 (onClicked-idle): `page.triggerExtensionAction(ext)` → `await waitFor(() => getBadgeText(sw), (text) => text === 'REC', 5_000)` (poll up to 5s; the picker auto-selects the screen so getDisplayMedia resolves fast). Then assert popup === 'src/popup/index.html' + getIsRecording === true. PASSES today.
- Assertion 3 (displaySurface): after assertion 2 leaves recording active, attach to offscreen via `waitForOffscreen` + `attachToOffscreen`. Then `offPage.evaluate(() => __mokoshTest.getCurrentStream().getVideoTracks()[0].getSettings().displaySurface)` === 'monitor'. PASSES today (per Plan 01-09 D-15-display-surface; the post-grant validation in recorder.ts ensures monitor-only).
- Assertion 4 (click-during-recording): record the current offscreen target count, then `page.triggerExtensionAction(ext)` again. Assert: popup state unchanged (still 'src/popup/index.html'); NO new offscreen target spawned (count unchanged). The toolbar click with popup set opens the popup (which the harness can verify via `browser.targets().find(t => t.url().includes('popup/index.html'))` — the popup target appears as a `page` type briefly). PASSES today.
- All 4 assertions wired; each carries an inline RED-on-regression demonstration step in its action block: the executor must locally demonstrate the assertion CAN catch a regression before marking the assertion GREEN.
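Assertion 4's "no new offscreen target" check reduces to counting targets before and after the second click. The sketch below illustrates that comparison with a structural `TargetLike` type, so it runs without a browser; in the harness the arrays would come from `browser.targets()`. The helper names and the example URLs are hypothetical.

```typescript
/** The two Target accessors this check relies on (structural stand-in). */
interface TargetLike {
  type(): string;
  url(): string;
}

/** Counts targets whose URL points at the offscreen document. */
function countOffscreenTargets(targets: readonly TargetLike[]): number {
  return targets.filter((t) => t.url().includes("offscreen")).length;
}

/** True when a popup page target is currently present. */
function hasPopupTarget(targets: readonly TargetLike[]): boolean {
  return targets.some((t) => t.url().includes("popup/index.html"));
}

/** Helper to build fake targets for the illustration. */
function fakeTarget(type: string, url: string): TargetLike {
  return { type: () => type, url: () => url };
}

// Snapshot before the second toolbar click (recording already active).
const before: TargetLike[] = [
  fakeTarget("service_worker", "chrome-extension://abc/service-worker-loader.js"),
  fakeTarget("background_page", "chrome-extension://abc/src/offscreen/index.html"),
];

// Snapshot after the click: the popup appears, no second offscreen document.
const after: TargetLike[] = [
  ...before,
  fakeTarget("page", "chrome-extension://abc/src/popup/index.html"),
];

const offscreenDelta = countOffscreenTargets(after) - countOffscreenTargets(before);
const popupOpened = hasPopupTarget(after);
```

A regression that re-ran the start-recording path on the second click would show up as a positive `offscreenDelta`.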
</behavior>
<action>
1. Wire assertion 1: replace the "NOT YET IMPLEMENTED" stub with the real logic per behavior. Add a `getIsRecording(sw)` helper to `tests/uat/lib/sw.ts`:
```typescript
export async function getIsRecording(sw: WebWorker): Promise<boolean> {
  return await sw.evaluate(() => (globalThis as any).isRecording as boolean);
}
```
NOTE: this is the ONE site where `as any` is unavoidable — the production code declares `isRecording` as a module-level `let` in `src/background/index.ts:36`, which is NOT exposed on globalThis directly. To read it, we need to evaluate in the SW context AS the SW (which has implicit globalThis access to module-top let-bindings — verify this is true in MV3 SW context; if not, expose `isRecording` via a getter on `__mokoshTest` in `sw-hooks.ts`). Document the choice + rationale inline.
(Per RESEARCH §6 contract verification: SW module-level `let` IS accessible as `globalThis.isRecording` in MV3 SW context — verified by probe2. If the executor sees `undefined` returned, fall back to exposing via `__mokoshTest.isRecording` getter from sw-hooks.ts and document the SW-isolation finding.)
2. Wire assertion 2: implementation per behavior. After `triggerExtensionAction`, poll `getBadgeText` for up to 5 seconds — the badge transition is async (offscreen creation + getDisplayMedia + post-grant validation + setRecordingMode all happen in sequence). Use a polling helper from `assertions.ts` or inline:
```typescript
async function waitFor<T>(probe: () => Promise<T>, predicate: (v: T) => boolean, timeoutMs: number): Promise<T> {
  const start = Date.now();
  while (Date.now() - start < timeoutMs) {
    const v = await probe();
    if (predicate(v)) return v;
    await new Promise(r => setTimeout(r, 100));
  }
  throw new Error(`waitFor timeout ${timeoutMs}ms`);
}
```
Use this in assertion 2 + 3 + 4.
3. Wire assertion 3: per behavior. The `waitForOffscreen` helper already handles the target wait + asPage; attach once after assertion 2 sets recording=true, then offPage.evaluate the displaySurface read.
4. Wire assertion 4: per behavior. Count `browser.targets()` filtered to offscreen-url-containing BEFORE the second click, then AFTER; assert equality. Also assert popup state unchanged.
5. RED-on-regression demonstration:
- For assertion 2: locally insert `chrome.action.onClicked.addListener(async () => { return; })` BEFORE the production listener and re-build:test; assertion 2 should FAIL (badge stays empty). Revert the hack; assertion 2 PASSES.
|
||
- For assertion 3: locally alter `recorder.ts` to call `getDisplayMedia({ video: true, audio: false })` (without displaySurface constraint) and rebuild; assertion 3 should FAIL (displaySurface defaults to 'browser' OR is undefined depending on Chrome behavior). Revert; PASSES.
|
||
- The executor commits ONLY the working assertions; the RED demos are local-only verifications. Document each RED demo's outcome in the commit message body.
|
||
6. Run `npm run test:uat`: assertions 0+1+2+3+4 PASS; assertions 5-13 still stubbed as RED. Exit non-zero. Diagnostic: "5/14 passed, 9 failed".
|
||
7. Run `npx tsc --noEmit` → exit 0.
|
||
8. Run full vitest suite → 85 GREEN.
|
||
</action>
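The before/after target count for assertion 4 could be sketched as follows. This is a minimal sketch, not the harness's actual helper: `TargetLike` is a structural stand-in for the part of Puppeteer's `Target` that is used, and `OFFSCREEN_MARKER`, `countOffscreenTargets`, and `assertNoNewOffscreen` are hypothetical names.

```typescript
// Structural stand-in for the subset of Puppeteer's Target used here (assumption).
interface TargetLike {
  url(): string;
}

// Hypothetical marker; adjust to the real offscreen document path.
const OFFSCREEN_MARKER = '/offscreen';

// Counts targets whose URL contains the offscreen marker.
function countOffscreenTargets(targets: TargetLike[], marker: string = OFFSCREEN_MARKER): number {
  return targets.filter(t => t.url().includes(marker)).length;
}

// Throws when the second click created (or destroyed) an offscreen document.
function assertNoNewOffscreen(before: TargetLike[], after: TargetLike[]): void {
  const b = countOffscreenTargets(before);
  const a = countOffscreenTargets(after);
  if (a !== b) {
    throw new Error(`assertion 4: offscreen target count changed ${b} -> ${a}`);
  }
}
```

In the harness the two arrays would come from calling `browser.targets()` once before and once after the second `triggerExtensionAction`.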
|
||
<verify>
|
||
<automated>npx tsc --noEmit && (set +e; npm run test:uat; test $? -ne 0)</automated>
|
||
</verify>
|
||
<acceptance_criteria>
|
||
- Assertions 0, 1, 2, 3, 4 all PASS in `npm run test:uat`.
|
||
- Assertions 5-13 still throw "NOT YET IMPLEMENTED".
|
||
- `npm run test:uat` exits non-zero (because 9 stubs remain).
|
||
- Diagnostic shows 5/14 passed.
|
||
- `npx tsc --noEmit` exit 0.
|
||
- Full vitest suite: 85 GREEN.
|
||
- Each wired assertion's commit message body cites the RED-demonstration outcome.
|
||
</acceptance_criteria>
|
||
<done>First 4 functional assertions live and GREEN; harness proves it can verify toolbar + displaySurface + popup-state via CDP.</done>
|
||
</task>
|
||
|
||
<task type="auto" tdd="true">
|
||
<name>Task 5 (Wave 3 — bundle 2/4): Wire assertions 5, 6, 7 (SAVE_ARCHIVE download + Bug B user-stopped routing + ERROR-path).</name>
|
||
<read_first>
|
||
- tests/uat/harness.test.ts (assertions 1-4 GREEN from Task 4)
|
||
- tests/uat/lib/{sw,offscreen,zip}.ts (helpers; especially simulateUserStop's BLOCKER-citing comment)
|
||
- src/background/index.ts lines 725-778 (RECORDING_ERROR handler — Bug B conditional routing)
|
||
- src/offscreen/recorder.ts lines 451-480 (onUserStoppedSharing — the handler simulateUserStop must trigger)
|
||
- .planning/debug/resolved/01-09-recovery-flow.md (Bug B debug record — the exact contract assertion 6 verifies)
|
||
- .planning/phases/01-stabilize-video-pipeline/01-11-RESEARCH.md §7 (BLOCKER analysis — track.dispatchEvent is the ONLY valid path)
|
||
</read_first>
|
||
<files>tests/uat/harness.test.ts, tests/uat/lib/sw.ts</files>
|
||
<behavior>
|
||
- Assertion 5 (SAVE_ARCHIVE download): with recording active from prior assertions, `sw.evaluate(() => chrome.runtime.sendMessage({type: 'SAVE_ARCHIVE'}))` triggers the save flow. The download lands in `downloadsDir` (configured at launch via `--user-data-dir` + per-page download behavior, OR via the CDP command `Browser.setDownloadBehavior` sent over a session from `createCDPSession()` rather than the private `page._client()` accessor; RESEARCH didn't deep-dive this, so the executor researches the cleanest path). Poll for a `session_report_*.zip` file to appear in downloadsDir for up to 15s. PASSES today.
|
||
- Assertion 6 (BUG B): snapshot `notificationCount` via `getNotificationSnapshot(sw)`. Then `simulateUserStop(offPage)`. Wait 300ms (offscreen handler → runtime message → SW handler → state transition is async). Assert: badge text === '' (NOT 'ERR'); popup === '' (NOT 'src/popup/index.html'); isRecording === false; notificationCount delta === 0 (no recovery notification fired for deliberate stop). PASSES today against b9eeeeb.
|
||
- Assertion 7 (ERROR-path preserved): start a fresh recording (since assertion 6 stopped it). Snapshot notificationCount. Then `sw.evaluate(() => chrome.runtime.sendMessage({type: 'RECORDING_ERROR', error: 'codec-unsupported'}))`. Wait 200ms. Assert: badge text === 'ERR'; notificationCount delta === 1; last notification id starts with 'mokosh-recovery-'. PASSES today.
|
||
- Each assertion carries the RED-on-regression demonstration; assertion 6's RED demo is the canonical "rewinding b9eeeeb" cycle from the orchestrator brief.
|
||
</behavior>
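The assertion 6 contract (no ERR badge, no popup, no recovery notification after a deliberate stop) can be expressed as a pure check over two snapshots. The sketch below assumes a snapshot shape of `{ notificationCount, notificationIds }` and a hypothetical `SwState` record; the real return shape of `getNotificationSnapshot` may differ.

```typescript
// Assumed snapshot shape; the real getNotificationSnapshot may return more fields.
interface NotificationSnapshot {
  notificationCount: number;
  notificationIds: string[];
}

// Hypothetical bundle of the SW-side values assertion 6 reads.
interface SwState {
  badgeText: string;
  popup: string;
  isRecording: boolean;
}

// Returns a list of human-readable failures; an empty list means PASS.
function checkUserStopRouting(
  before: NotificationSnapshot,
  after: NotificationSnapshot,
  state: SwState,
): string[] {
  const failures: string[] = [];
  if (state.badgeText !== '') failures.push(`expected badge text '' but got '${state.badgeText}'`);
  if (state.popup !== '') failures.push(`expected popup '' but got '${state.popup}'`);
  if (state.isRecording) failures.push('expected isRecording === false');
  const delta = after.notificationCount - before.notificationCount;
  if (delta !== 0) failures.push(`expected notification delta 0 but got ${delta}`);
  // Second line of defense: no recovery-prefixed id among the ids added since the snapshot.
  const newIds = after.notificationIds.slice(before.notificationIds.length);
  if (newIds.some(id => id.startsWith('mokosh-recovery-'))) {
    failures.push('unexpected mokosh-recovery- notification after deliberate stop');
  }
  return failures;
}
```

Returning a failure list rather than throwing on the first mismatch keeps the diagnostic complete when several fields regress at once, which is the expected shape of a Bug B rewind.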
|
||
<action>
|
||
1. Wire assertion 5. Investigate Puppeteer's download path config. The likely route is CDP `Browser.setDownloadBehavior` with `behavior: 'allow'` + `downloadPath: downloadsDir`; `browser.defaultBrowserContext().overridePermissions(...)` governs permission grants rather than the download directory, so confirm it is relevant before relying on it. The harness creates `downloadsDir` in the launch helper (e.g. `os.tmpdir() + '/mokosh-uat-downloads-' + Date.now()`). After `sendMessage({type:'SAVE_ARCHIVE'})`, poll the dir for ~15s for any `session_report_*.zip`. Save the path for assertion 13. PASS = file appears + non-zero size.
|
||
2. Wire assertion 6 per behavior. Use the existing `simulateUserStop` helper (with its BLOCKER comment intact). The 300ms wait is the propagation budget: the offscreen handler calls sendMessage synchronously and the SW handler calls setIdleMode synchronously, so 300ms is generous but not extravagant. If the assertion flakes intermittently, bump the wait to 500ms.
|
||
3. Wire assertion 7 per behavior. Read `lastNotificationOptions.title` (or the equivalent field) to verify the "Mokosh stopped" recovery copy AND check `notificationIds[notificationIds.length-1].startsWith('mokosh-recovery-')`.
|
||
4. RED-on-regression demonstrations (recorded in commit body):
|
||
- **Assertion 6 RED demo (THE canonical Bug B regression check)**: run `git diff b9eeeeb~1 b9eeeeb -- src/background/index.ts` to recover the pre-b9eeeeb shape of the RECORDING_ERROR handler (unconditional setErrorMode); APPLY the inverse patch locally, e.g. by piping that diff into `git apply -R` (do NOT commit). Rebuild test bundle. Run `npm run test:uat`. Assertion 6 MUST FAIL with diagnostic: "expected badge text '' but got 'ERR'". Revert (`git checkout -- src/background/index.ts`). Rebuild. Re-run. Assertion 6 PASSES. This proves the harness assertion CAN catch a Bug B regression. **Document this end-to-end demo in the commit message body.**
|
||
- Assertion 5 RED demo: locally comment out the `chrome.downloads.download(...)` call in `src/background/index.ts:saveArchive` and rebuild; assertion 5 should FAIL (timeout waiting for zip). Revert; PASSES.
|
||
- Assertion 7 RED demo: locally short-circuit the RECORDING_ERROR case to return without calling setErrorMode for codec-unsupported (e.g. early-return on case entry); assertion 7 should FAIL. Revert; PASSES.
|
||
5. Run `npm run test:uat`: 8/14 PASS, 6 stubs remain. Exit non-zero.
|
||
6. Run `npx tsc --noEmit` → exit 0. Vitest 85 GREEN.
|
||
</action>
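The download poll for assertion 5 might look like the sketch below. It uses only Node's standard library; `pickArchive`, `listDir`, and `waitForArchive` are hypothetical names, and the file-name pattern assumes the `session_report_*.zip` convention named in the action step. Requiring the name to end in `.zip` is meant to skip Chrome's in-progress `.crdownload` partials.

```typescript
import { readdirSync, statSync } from 'node:fs';
import { join } from 'node:path';

interface DirEntry {
  name: string;
  size: number;
}

// Assumed naming convention for the saved archive.
const ARCHIVE_PATTERN = /^session_report_.*\.zip$/;

// Pure selection step: first finished, non-empty archive, or null.
function pickArchive(entries: DirEntry[]): string | null {
  const hit = entries.find(e => ARCHIVE_PATTERN.test(e.name) && e.size > 0);
  return hit ? hit.name : null;
}

// Lists a directory with sizes; a file that vanishes mid-listing counts as size 0.
function listDir(dir: string): DirEntry[] {
  return readdirSync(dir).map(name => {
    try {
      return { name, size: statSync(join(dir, name)).size };
    } catch {
      return { name, size: 0 };
    }
  });
}

// Polls the downloads dir until an archive appears or the budget runs out.
async function waitForArchive(dir: string, timeoutMs = 15_000, intervalMs = 250): Promise<string> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const name = pickArchive(listDir(dir));
    if (name) return join(dir, name);
    await new Promise(r => setTimeout(r, intervalMs));
  }
  throw new Error(`assertion 5: no session_report_*.zip in ${dir} after ${timeoutMs}ms`);
}
```

The returned path is the value to keep for assertion 13.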
|
||
<verify>
|
||
<automated>npx tsc --noEmit && (set +e; npm run test:uat; test $? -ne 0)</automated>
|
||
</verify>
|
||
<acceptance_criteria>
|
||
- Assertions 0-7 all PASS.
|
||
- Assertions 8-13 still stubbed RED.
|
||
- `npm run test:uat` exits non-zero; diagnostic 8/14 passed.
|
||
- Bug B RED-on-regression demo documented in commit body (mandatory).
|
||
- `npx tsc --noEmit` exit 0; vitest 85 GREEN.
|
||
</acceptance_criteria>
|
||
<done>Bug B harness assertion live AND demonstrably catches regression; SAVE_ARCHIVE + ERROR-path coverage live; bug-class root cause (state-machine routing) now CI-callable.</done>
|
||
</task>
|
||
|
||
<task type="auto" tdd="true">
|
||
<name>Task 6 (Wave 3 — bundle 3/4): Wire assertions 8, 9, 10 (Bug A onStartup notification + icon file sizes + manifest shape).</name>
|
||
<read_first>
|
||
- tests/uat/harness.test.ts (assertions 1-7 GREEN from Tasks 4-5)
|
||
- src/background/index.ts lines 860-881 (chrome.runtime.onStartup handler — the path Bug A's recovery notification was failing on before a881bf0)
|
||
- manifest.json (icons declared + notifications permission)
|
||
- .planning/phases/01-stabilize-video-pipeline/01-11-RESEARCH.md §11 (per-assertion implementation hints)
|
||
- icons/icon{16,48,128}.png (verify presence + size — the floors are 200/500/1024 bytes from the orchestrator brief)
|
||
</read_first>
|
||
<files>tests/uat/harness.test.ts</files>
|
||
<behavior>
|
||
- Assertion 8 (BUG A + onStartup): snapshot notificationCount. Then `sw.evaluate(() => globalThis.__mokoshTest!.handlers.onStartup?.())`. Wait 100ms (synchronous handler, but allow microtask drain). Assert: notificationCount delta === 1; `lastNotificationOptions.iconUrl` matches `/icons\/icon(?:128|48)\.png$/` (the production code uses NOTIFICATION_ICON_PATH = 'icons/icon128.png'); `lastNotificationOptions.title === 'Mokosh ready'`; `notificationIds[notificationIds.length-1].startsWith('mokosh-startup-')`. The PASS condition is intended to imply that chrome.notifications.create's promise resolved cleanly: if Bug A regressed (icon below floor), Chrome's imageUtil throws and the create call REJECTS. Because the hook's notificationCount wrapper may increment before the rejection surfaces, the count delta alone is not sufficient evidence; the icon-size cross-check in the action list covers that case. PASSES today against a881bf0.
|
||
- Assertion 9 (icon files present + sized): for each of (16, 200), (48, 500), (128, 1024), `sw.evaluate` a fetch of `chrome.runtime.getURL('icons/icon{N}.png')` and read `content-length`. Assert >= floor. PASSES today.
|
||
- Assertion 10 (manifest shape): `getManifest(sw)`. Assert: `permissions.includes('notifications')`; `icons['16']`, `icons['48']`, `icons['128']` all defined and equal to expected paths. PASSES today.
|
||
- Each assertion's RED-on-regression demo documented in commit body.
|
||
</behavior>
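The manifest-shape check for assertion 10 reduces to a pure function over the object returned by `chrome.runtime.getManifest()`. In this sketch `ManifestLike` is a structural subset of the manifest, `checkManifestShape` is a hypothetical name, and the expected icon paths are an assumption taken from the `icons/icon{16,48,128}.png` files listed in read_first.

```typescript
// Structural subset of the manifest fields assertion 10 inspects.
interface ManifestLike {
  permissions?: string[];
  icons?: Record<string, string>;
}

// Assumed expected paths; confirm against manifest.json.
const EXPECTED_ICONS: Record<string, string> = {
  '16': 'icons/icon16.png',
  '48': 'icons/icon48.png',
  '128': 'icons/icon128.png',
};

// Returns one message per mismatch; an empty list means PASS.
function checkManifestShape(manifest: ManifestLike): string[] {
  const failures: string[] = [];
  if (!(manifest.permissions ?? []).includes('notifications')) {
    failures.push("permissions is missing 'notifications'");
  }
  for (const [size, expected] of Object.entries(EXPECTED_ICONS)) {
    const actual = manifest.icons?.[size];
    if (actual !== expected) {
      failures.push(`icons['${size}'] expected '${expected}' but got '${String(actual)}'`);
    }
  }
  return failures;
}
```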
|
||
<action>
|
||
1. Wire assertion 8 per behavior. The `onStartup` handler in production carries inline try/catch around the `chrome.notifications.create` call (per src/background/index.ts:868-877); the hook's notificationCount wrapper increments regardless of create's resolution path. To verify Bug A specifically, ALSO assert that the iconUrl in lastNotificationOptions points to a file that resolves to >= 1024 bytes (cross-check with assertion 9's floor). This catches the Bug A regression EVEN IF a future change wraps the create call in a swallowing try/catch.
|
||
2. Wire assertion 9 per behavior. The fetch via sw.evaluate is the cleanest path — Chrome serves extension files from `chrome-extension://<id>/...` and fetch with a `chrome-extension://` URL works in SW context.
|
||
3. Wire assertion 10 per behavior. Direct `chrome.runtime.getManifest()` read.
|
||
4. RED-on-regression demos (commit body):
|
||
- **Assertion 8 RED demo (Bug A canonical)**: locally `: > icons/icon128.png` (truncate to 0 bytes). Rebuild test bundle. Run `npm run test:uat`. Assertion 8 should FAIL: Chrome's imageUtil rejects the create call (or the wrapper's lastNotificationOptions snapshot has wrong shape). Restore (`git checkout -- icons/icon128.png`). Rebuild. Re-run. Assertion 8 PASSES. **Document in commit body.**
|
||
- Assertion 9 RED demo: same truncate; rebuild; assertion 9 should FAIL with "content-length 0 < floor 1024". Restore; PASSES.
|
||
- Assertion 10 RED demo: locally remove "notifications" from manifest.json permissions and rebuild test bundle; assertion 10 should FAIL. Restore; PASSES.
|
||
5. Run `npm run test:uat`: 11/14 PASS, 3 stubs remain (11, 12, 13).
|
||
6. `npx tsc --noEmit` exit 0; vitest 85 GREEN.
|
||
</action>
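The size floors shared by assertions 8 and 9 can live in one table so both assertions compare against the same numbers. A minimal sketch, assuming the byte sizes have already been read inside the extension context (for example from the fetched response body, which avoids depending on a `content-length` header being present); `ICON_FLOORS` and `checkIconFloors` are hypothetical names.

```typescript
// Floors from the orchestrator brief: [icon edge in px, minimum file size in bytes].
const ICON_FLOORS: ReadonlyArray<readonly [number, number]> = [
  [16, 200],
  [48, 500],
  [128, 1024],
];

// `sizes` maps icon edge (px) to the measured byte size; missing entries are failures.
function checkIconFloors(sizes: Record<number, number | undefined>): string[] {
  const failures: string[] = [];
  for (const [px, floor] of ICON_FLOORS) {
    const bytes = sizes[px];
    if (bytes === undefined) {
      failures.push(`icon${px}.png: size unavailable`);
    } else if (bytes < floor) {
      failures.push(`icon${px}.png: content-length ${bytes} < floor ${floor}`);
    }
  }
  return failures;
}
```

With the truncated icon from the RED demo, the 128px entry is the one that fails.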
|
||
<verify>
|
||
<automated>npx tsc --noEmit && (set +e; npm run test:uat; test $? -ne 0)</automated>
|
||
</verify>
|
||
<acceptance_criteria>
|
||
- Assertions 0-10 all PASS.
|
||
- Assertions 11-13 still stubbed RED.
|
||
- `npm run test:uat` exits non-zero; diagnostic 11/14 passed.
|
||
- Bug A RED-on-regression demo documented in commit body (mandatory).
|
||
- `npx tsc --noEmit` exit 0; vitest 85 GREEN.
|
||
</acceptance_criteria>
|
||
<done>Bug A harness assertion live AND demonstrably catches regression; icon + manifest coverage live; both Phase-1-escapee bug classes (Bug A + Bug B) now CI-callable.</done>
|
||
</task>
|
||
|
||
<task type="auto" tdd="true">
|
||
<name>Task 7 (Wave 3 — bundle 4/4): Wire assertions 11, 12, 13 (35s buffer continuity + ffprobe gate + zip shape) — closes the 13-assertion charter.</name>
|
||
<read_first>
|
||
- tests/uat/harness.test.ts (assertions 1-10 GREEN from Tasks 4-6)
|
||
- tests/uat/lib/zip.ts (the jszip-based archive shape helper)
|
||
- tests/offscreen/webm-playback.test.ts (the existing ffprobe pattern — FFPROBE_BIN constant, skip-gate helper)
|
||
- src/background/webm-remux.ts (Plan 01-08's remux helper — what the harness's ffprobe gate validates)
|
||
- .planning/phases/01-stabilize-video-pipeline/01-11-RESEARCH.md §11 (per-assertion implementation hints for 11, 12, 13)
|
||
</read_first>
|
||
<files>tests/uat/harness.test.ts, tests/uat/lib/zip.ts</files>
|
||
<behavior>
|
||
- Assertion 11 (35s buffer continuity): start a fresh recording. Wait 35 seconds (with keepalive pings every 20s per RESEARCH §2). Query the offscreen segments count via offPage.evaluate (the offscreen recorder maintains a `segments` ring; expose it via a `__mokoshTest.getSegmentCount()` getter — ADD this to offscreen-hooks.ts in this task). Assert: segmentCount >= 3 (per D-13: 10s segments × MAX_SEGMENTS=3 = 30s window). PASSES today.
|
||
- Assertion 12 (ffprobe gate): trigger SAVE_ARCHIVE (reusing the assertion 5 helper). Extract `video/last_30sec.webm` from the produced zip via jszip. Write to a tmpfile. Spawn `ffprobe -v error -f matroska -i <tmpfile>` via execFileSync. Assert exit code 0. (Skip-gate this assertion with a clear "SKIPPED: ffprobe binary not available" diagnostic if `which ffprobe` fails — matches the existing webm-playback.test.ts pattern.)
|
||
- Assertion 13 (zip shape): jszip parse the same zip. Assert: `video/last_30sec.webm` entry exists + has non-zero size. Assert: `meta.json` entry exists + parsed JSON has `version === <chrome.runtime.getManifest().version>` (read via sw.evaluate at the start of the harness or this assertion).
|
||
- The 35-second wait pushes the harness runtime past 60s. Add keepalive ping infrastructure (one ping every 20s during the wait) to avoid SW eviction per RESEARCH §2 / Pitfall 5.
|
||
</behavior>
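The 35-second wait with keepalive pings can be split into a pure schedule and a thin async driver. This is a sketch under stated assumptions: `keepaliveOffsets` and `waitWithKeepalive` are hypothetical names, and `ping` stands in for whatever `keepalivePing(sw)` ends up being.

```typescript
// Offsets (ms from start) at which a ping is due, strictly before the end of the wait.
function keepaliveOffsets(totalMs: number, intervalMs: number): number[] {
  if (intervalMs <= 0) throw new Error('intervalMs must be positive');
  const offsets: number[] = [];
  for (let t = intervalMs; t < totalMs; t += intervalMs) offsets.push(t);
  return offsets;
}

const sleep = (ms: number): Promise<void> => new Promise(r => setTimeout(r, ms));

// Waits totalMs overall, calling `ping` at each scheduled offset; resolves to the ping count.
async function waitWithKeepalive(
  totalMs: number,
  intervalMs: number,
  ping: () => Promise<void>,
): Promise<number> {
  let elapsed = 0;
  let pings = 0;
  for (const at of keepaliveOffsets(totalMs, intervalMs)) {
    await sleep(at - elapsed);
    elapsed = at;
    await ping(); // keeps the service worker from being evicted mid-wait
    pings++;
  }
  await sleep(totalMs - elapsed);
  return pings;
}
```

For the plan's numbers, `waitWithKeepalive(35_000, 20_000, () => keepalivePing(sw))` issues a single ping at the 20s mark. Time spent inside `ping` is not subtracted from the remaining wait, so the total slightly exceeds 35s; for a lower-bound wait that is the safe direction.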
|
||
<action>
|
||
1. ADD a `__mokoshTest.getSegmentCount()` getter to `src/test-hooks/offscreen-hooks.ts`. The offscreen recorder has a module-level `segments` array (from D-13 restart-segments); expose a function-level setter alongside `setCurrentStream`:
|
||
```typescript
// src/test-hooks/offscreen-hooks.ts
let currentStream: MediaStream | null = null;
// Returns 0 until recorder.ts registers the real getter.
let segmentCountGetter: () => number = () => 0;

export function setCurrentStream(s: MediaStream | null) { currentStream = s; }
export function setSegmentCountGetter(g: () => number) { segmentCountGetter = g; }

globalThis.__mokoshTest = {
  // ...
  getCurrentStream: () => currentStream,
  getSegmentCount: () => segmentCountGetter(),
};
```
|
||
Update `src/test-hooks/types.ts` to add `getSegmentCount?: () => number` to MokoshTestSurface.
|
||
In `src/offscreen/recorder.ts`, after the existing `setCurrentStream(stream)` call, add (gated):
|
||
```typescript
if (import.meta.env.MODE === 'test') {
  const hooks = await import('../test-hooks/offscreen-hooks');
  hooks.setSegmentCountGetter(() => segments.length);
}
```
|
||
(Where `segments` is the module-level array. If the variable name differs, adapt. Read the file to confirm; commonly named `videoSegments` or `segments`.)
|
||
2. Wire assertion 11 per behavior. The 35s wait uses `await new Promise(r => setTimeout(r, 35_000))` with intermittent `await keepalivePing(sw)` every 20s. Use `setInterval` or a polling loop; document the keepalive purpose per RESEARCH §2.
|
||
3. Wire assertion 12 per behavior. Reuse the `FFPROBE_BIN` constant pattern from `tests/offscreen/webm-playback.test.ts`. Skip-gate: `if (!existsSync(FFPROBE_BIN)) { console.warn('Assertion 12: ffprobe not available — SKIPPED'); return; }`. The skip-gate is acceptable for assertion 12 because the unit-level tests (Plan 01-08's `tests/background/webm-remux.test.ts`) also have ffprobe gates that cover the same contract — the harness's ffprobe assertion is end-to-end validation, not the primary gate.
|
||
4. Wire assertion 13. Pass `expectedVersion = await sw.evaluate(() => chrome.runtime.getManifest().version)` into `assertArchiveShape`.
|
||
5. Update Tier-1 grep gate test (`tests/background/no-test-hooks-in-prod-bundle.test.ts`) to ALSO assert ZERO `getSegmentCount` in dist/ (new hook surface added in this task — confirm gate stays GREEN).
|
||
6. RED-on-regression demos (commit body):
|
||
- Assertion 11 RED demo: locally hack `SEGMENT_DURATION_MS = 30_000` in recorder.ts so 35s yields only 1 segment; rebuild; assertion 11 should FAIL. Revert; PASSES.
|
||
- Assertion 12 RED demo: locally inject a corrupted byte into the remux output (e.g. zero the EBML magic in webm-remux.ts before return); rebuild; assertion 12 should FAIL (ffprobe error). Revert; PASSES.
|
||
- Assertion 13 RED demo: locally drop `version` from the `meta.json` writer in saveArchive; rebuild; assertion 13 should FAIL. Revert; PASSES.
|
||
7. Run `npm run test:uat`: ALL 14 assertions PASS. Exit 0. Diagnostic: "UAT harness: 14/14 assertions passed".
|
||
8. `npx tsc --noEmit` → exit 0. `npx vitest run` → 85 GREEN.
|
||
9. **Verify Tier-1 grep gate updates:** `npm run build && grep -rln 'getSegmentCount' dist/` → 0 matches.
|
||
</action>
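Before spawning ffprobe for assertion 12, a cheap header check can give a clearer diagnostic when the extracted file is not a Matroska/WebM container at all (the case the RED demo produces by zeroing the EBML magic). The sketch below is a complement to the ffprobe gate, not a replacement; `hasEbmlMagic` and `ffprobeArgs` are hypothetical names, and the argument list mirrors the command given in the behavior section.

```typescript
// EBML header ID that opens every Matroska/WebM file.
const EBML_MAGIC = [0x1a, 0x45, 0xdf, 0xa3];

// True when the buffer starts with the 4-byte EBML header ID.
function hasEbmlMagic(bytes: Uint8Array): boolean {
  return bytes.length >= EBML_MAGIC.length && EBML_MAGIC.every((b, i) => bytes[i] === b);
}

// Argument vector for `ffprobe -v error -f matroska -i <file>`.
function ffprobeArgs(file: string): string[] {
  return ['-v', 'error', '-f', 'matroska', '-i', file];
}
```

The harness would then pass `ffprobeArgs(tmpfile)` to `execFileSync(FFPROBE_BIN, ...)` only after `hasEbmlMagic` returns true, and report "EBML magic missing" otherwise.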
|
||
<verify>
|
||
<automated>npx tsc --noEmit && npm run test:uat && npx vitest run --reporter=dot && test "$(grep -rln getSegmentCount dist/ 2>/dev/null | wc -l)" = "0"</automated>
|
||
</verify>
|
||
<acceptance_criteria>
|
||
- All 14 assertions PASS in `npm run test:uat`; exit 0.
|
||
- `npm run test:uat` total runtime ~70-90s (dominated by the 35s assertion 11 wait + the harness setup ~10s + assertion 0's `npm run build` ~10s; set `SKIP_PROD_REBUILD=1` to skip the rebuild and land near the lower end).
|
||
- `npx tsc --noEmit` exit 0; vitest 85 GREEN.
|
||
- Production bundle (`npm run build`): `grep -rln __mokoshTest dist/` → 0; `grep -rln simulateUserStop dist/` → 0; `grep -rln getSegmentCount dist/` → 0. Tier-1 gate remains GREEN.
|
||
- Each new assertion's RED-on-regression demo documented in commit body.
|
||
</acceptance_criteria>
|
||
<done>13-assertion charter complete; harness exits 0 against current Plan 01-09 bundle; Phase 1 functional contract fully CI-callable.</done>
|
||
</task>
|
||
|
||
<task type="auto" tdd="true">
|
||
<name>Task 8 (Wave 4): Amend Plan 01-09 Task 5 operator checkpoint to redirect functional steps to `npm run test:uat`; update STATE.md decisions; close Plan 01-09 via this plan's harness PASS.</name>
|
||
<read_first>
|
||
- .planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md lines 519-549 (the operator checkpoint that gets amended)
|
||
- .planning/phases/01-stabilize-video-pipeline/01-09-SUMMARY.md (current closure state)
|
||
- .planning/STATE.md (Decisions section + Phase 1 Closure Notes)
|
||
- tests/uat/harness.test.ts (the harness that NOW closes the functional contract)
|
||
</read_first>
|
||
<files>.planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md, .planning/STATE.md</files>
|
||
<behavior>
|
||
- `.planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md` gets an AMENDMENT block at the END of the file (does NOT rewrite the original Task 5 — preserves provenance per project convention from D-A1..D-A6 cascade pattern):
|
||
```
---

## Amendment (Phase 01-stabilize-video-pipeline, 2026-05-17): Plan 01-11 harness retires operator functional steps

Plan 01-11 (Puppeteer UAT harness) lands a CI-callable replacement for the
functional verification work in this plan's Task 5. The operator's role is
reduced to:

- **Step 1 (build):** unchanged; `npm run build` must exit 0.
- **Steps 2-13:** REDIRECTED, replaced by `npm run test:uat` exit 0. The
  Puppeteer harness implements 14 assertions (assertion 0 = production-bundle
  hook-leak grep; assertions 1-13 = the original Task 5 functional checks).
- **Step 14 (brand/design, implicit in steps 4, 5, 6 of original task):**
  RETAINED for operator. The harness verifies displaySurface === 'monitor'
  + notification fires; it does NOT verify the human-readable copy is
  aesthetically correct OR that the badge color reads cleanly against the
  operator's OS theme. Operator confirms.
- **Step 15 (genuine error UX):** REDIRECTED; assertion 7 verifies the
  ERROR-path behavior.

**New closure gate:** Plan 01-09 closes when `npm run test:uat` exits 0
AND operator confirms step 14 (brand/design). The harness's 14/14 PASS
against current bundle (verified by this plan's Task 7) supplies the
first half today.
```
|
||
- `.planning/STATE.md` Decisions section gains a new entry (preserves the existing log; appends rather than rewriting):
|
||
```
- [Phase 01-11]: Operator role retirement landed via Puppeteer UAT harness. 14 assertions cover Plan 01-08/01-09 functional contract; operator retained only for brand/design step. `npm run test:uat` = the new CI gate for any Phase 1 SW/offscreen/manifest change. Tier-1 grep gate `tests/background/no-test-hooks-in-prod-bundle.test.ts` enforces zero `__mokoshTest` / `simulateUserStop` / `getSegmentCount` in production `dist/`.
```
|
||
- This task does NOT modify Plan 01-09's status fields, frontmatter, or original Task 5 body. The amendment is appended after the original `<output>` block (mirroring the CONTEXT.md amendment-append pattern from 2026-05-16).
|
||
- Operator (in the closing checkpoint below) confirms brand/design step 14 manually and types "approved" — at which point Plan 01-09 + Plan 01-11 close together.
|
||
</behavior>
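The "append, never rewrite" rule for both edited files can be checked mechanically by comparing the file content before and after the edit. A minimal sketch with hypothetical helper names (`isAppendOnly`, `appendedText`); how the two strings are obtained (for example, the committed version versus the working copy) is left to the executor.

```typescript
// True when `after` is `before` plus extra text at the end and nothing else changed.
function isAppendOnly(before: string, after: string): boolean {
  return after.length > before.length && after.startsWith(before);
}

// Returns only the appended text, or throws when earlier content was modified.
function appendedText(before: string, after: string): string {
  if (!isAppendOnly(before, after)) {
    throw new Error('edit is not append-only: earlier content was modified or nothing was added');
  }
  return after.slice(before.length);
}
```

A check of this kind would complement the two `grep -q` gates in the verify step, which confirm that the new text exists but not that the old text is untouched.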
|
||
<action>
|
||
1. Read `.planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md` to confirm the file structure ends with the `<output>` block (line ~596 based on the file's current shape).
|
||
2. Append the amendment block per the behavior description, AFTER the closing `</output>` tag. Use the same horizontal-rule + ## heading + AMENDED-BY metadata convention from CONTEXT.md amendments. Cite the harness path (`tests/uat/harness.test.ts`) and the npm script (`npm run test:uat`).
|
||
3. Read `.planning/STATE.md` Decisions section (lines 72-109).
|
||
4. Append the new entry to the Decisions list (after the most recent `[Phase 01-07-deferred-to-5]` entry per the convention). Do NOT modify any existing entry.
|
||
5. Verify both edits are content-only (no frontmatter changes; no status flips — those happen in the closing checkpoint).
|
||
6. Run `npx tsc --noEmit` → exit 0 (paranoia — neither edit touches TS, but baseline).
|
||
7. Run `npm run test:uat` → exit 0 (final smoke before the closing checkpoint).
|
||
8. Run `npx vitest run` → 85 GREEN.
|
||
</action>
|
||
<verify>
|
||
<automated>npx tsc --noEmit && grep -q 'Plan 01-11 harness retires operator functional steps' .planning/phases/01-stabilize-video-pipeline/01-09-PLAN.md && grep -q 'Operator role retirement landed via Puppeteer UAT harness' .planning/STATE.md && npm run test:uat && npx vitest run --reporter=dot</automated>
|
||
</verify>
|
||
<acceptance_criteria>
|
||
- `01-09-PLAN.md` ends with the appended amendment block (no edits to the original Task 5 body).
|
||
- `STATE.md` Decisions section carries the new entry as the last item (no edits to prior entries).
|
||
- `npm run test:uat` exits 0 (14/14 GREEN).
|
||
- `npx tsc --noEmit` exit 0; vitest 85 GREEN.
|
||
</acceptance_criteria>
|
||
<done>Plan 01-09 functional contract redirected to harness; STATE.md decisions log updated; ready for closing checkpoint.</done>
|
||
</task>
|
||
|
||
<task type="checkpoint:human-verify" gate="blocking">
|
||
<name>Task 9 (Wave 4): Operator confirms `npm run test:uat` exits 0 against current bundle AND confirms brand/design step 14 (Plan 01-09 Task 5 retained step) — closes Plan 01-09 + Plan 01-11.</name>
|
||
<files>(operator-driven; no files modified by this checkpoint)</files>
|
||
<action>See <how-to-verify> below — operator-driven empirical check. The executor must NOT bypass this checkpoint by stubbing harness output.</action>
|
||
<verify>
|
||
<automated>echo "checkpoint:human-verify — see how-to-verify section; resume signal is the gate"</automated>
|
||
</verify>
|
||
<done>Operator types "approved" after running the how-to-verify steps. See <resume-signal> for the exact gate.</done>
|
||
<what-built>
|
||
Tasks 1-8 landed: Puppeteer + tsx installed, vite.test.config.ts produces dist-test/, gated test hooks in src/test-hooks/ ship in test bundle and NOT in production bundle (Tier-1 grep gate verifies), Puppeteer harness at tests/uat/harness.test.ts implements 14 assertions, all 14 GREEN against current Plan 01-09 bundle (b9eeeeb Bug B fix + a881bf0 Bug A fix both verified by Bug B + Bug A canonical RED-on-regression demos). Plan 01-09 Task 5 redirected to `npm run test:uat` for functional steps. This checkpoint validates the harness end-to-end against real Chrome AND captures operator's brand/design acceptance for Plan 01-09's retained step 14.
|
||
</what-built>
|
||
<how-to-verify>
|
||
1. **Pre-flight cleanliness:** run `git status` — confirm working tree clean. Any uncommitted local hacks (RED-demo reverts) MUST be reverted BEFORE this step.
|
||
2. **Build production:** `npm run build` (must exit 0; this is Plan 01-09 Task 5 step 1).
|
||
3. **Build test bundle:** `npm run build:test` (must exit 0).
|
||
4. **Run harness:** `npm run test:uat` (must exit 0; runtime ~70-90s). Final output line MUST be exactly `UAT harness: 14/14 assertions passed`. If exit non-zero, paste the structured diagnostic + harness console dump + relevant SW/offscreen console logs; the plan iterates (likely a real bug surfaced).
|
||
5. **Re-run for stability:** `npm run test:uat` a second time. Same outcome. (Eliminates first-run flakiness from cold Chrome / cold dist-test cache.)
|
||
6. **Tier-1 hook-leak verification:** `grep -rln __mokoshTest dist/` must return 0 matches. Same for `simulateUserStop`, `getSegmentCount`, `setCurrentStream`, `setSegmentCountGetter`. If ANY match, the gate failed silently — STOP and triage.
|
||
7. **Local-debug mode smoke:** `HEADLESS=0 npm run test:uat`. Watch the real Chrome window: see the toolbar icon, see the picker auto-accept, see the badge transitions. Same exit 0 outcome. (This is the operator's chance to spot any visual oddity the automated assertions miss.)
|
||
8. **Brand/design acceptance (Plan 01-09 Task 5 step 14 — retained for operator):**
|
||
(a) Badge color readability against your OS theme: red OFF, green REC, yellow ERR should each contrast clearly with the toolbar background. If any is hard to see in light AND dark mode, document for Phase 5 hardening (do NOT block closure on this — file as a deferred item).
|
||
(b) Notification copy: "Mokosh ready — Click here to start recording your session." reads naturally in en_US. Russian operators may want a localized variant — document for Phase 5 (do NOT block closure on this).
|
||
(c) Picker UX: confirm Chrome's screen-share picker still surfaces (in headful mode) at the expected moment + with the correct monitor-only options.
|
||
9. **If steps 4, 5, 6 all PASS:** Plan 01-09 + Plan 01-11 both close. Type "approved" with any brand/design notes appended.
|
||
10. **If step 4 OR 5 FAIL:** paste the failure diagnostic. Likely culprits: locale-specific picker string mismatch (RESEARCH §9 — operator's Chrome may need a different `--auto-select-desktop-capture-source` value); race window in assertion 6 / 11 (try bumping the wait in the relevant assertion).
|
||
11. **If step 6 FAILS:** STOP. The Tier-1 hook-leak gate failing means the production bundle contains test code — this is a security regression (T-1-11-01). Do NOT proceed to closure. Open a debug session.
|
||
12. **If step 7 surfaces a real UX issue (not just a deferral):** document as a P1/P2 item in STATE.md or a phase-5 backlog file; closure can still proceed IF the issue is non-blocking.
|
||
</how-to-verify>
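The hook-leak scan from step 6 can also be run as a small Node script, which is roughly what the Tier-1 gate needs to do internally. This is a sketch, not the contents of `no-test-hooks-in-prod-bundle.test.ts`; `FORBIDDEN`, `findLeaks`, and `scanDir` are hypothetical names, and the identifier list is the one enumerated in step 6.

```typescript
import { readdirSync, readFileSync } from 'node:fs';
import { join } from 'node:path';

// Identifiers that must never appear in the production bundle.
const FORBIDDEN: readonly string[] = [
  '__mokoshTest',
  'simulateUserStop',
  'getSegmentCount',
  'setCurrentStream',
  'setSegmentCountGetter',
];

// Pure check: which forbidden identifiers occur in the given text.
function findLeaks(text: string, names: readonly string[] = FORBIDDEN): string[] {
  return names.filter(n => text.includes(n));
}

// Recursively scans a build directory; returns file path -> leaked identifiers.
function scanDir(dir: string, names: readonly string[] = FORBIDDEN): Map<string, string[]> {
  const leaks = new Map<string, string[]>();
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    const full = join(dir, entry.name);
    if (entry.isDirectory()) {
      for (const [file, hit] of scanDir(full, names)) leaks.set(file, hit);
    } else if (entry.isFile()) {
      // latin1 decodes any byte sequence, so binary assets do not break the scan.
      const hit = findLeaks(readFileSync(full, 'latin1'), names);
      if (hit.length > 0) leaks.set(full, hit);
    }
  }
  return leaks;
}
```

A non-empty map from `scanDir('dist')` corresponds to the STOP condition in step 11.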
|
||
<resume-signal>
|
||
Type "approved" after step 9 lands (all gates GREEN + brand/design accepted). If steps 10/11/12 hit, paste the failure mode + operator's Chrome version + locale + OS theme; the plan iterates on the failing piece (likely Task 4-7 for assertion-specific issues; Task 1-2 for hook-leak issues; a fresh debug session for novel failures).
|
||
</resume-signal>
|
||
</task>
|
||
|
||
</tasks>
|
||
|
||
<threat_model>
|
||
## Trust Boundaries
|
||
|
||
| Boundary | Description |
|----------|-------------|
| Puppeteer driver ↔ Chrome SW (via CDP) | The harness pipes CDP commands to the SW context via `sw.evaluate`. Trust boundary is unchanged at runtime (the SW only accepts the harness's commands because the harness runs inside the Puppeteer-launched Chrome process); but the harness CAN invoke any production SW code path via `sw.evaluate`, so a malicious or buggy harness could in principle exfiltrate buffered video. Mitigation: harness code is in-tree, code-reviewed via the same pipeline as production. |
| Test hook surface (`__mokoshTest`) in production bundle | NEW: if tree-shaking fails or the MODE guard is misconfigured, the hook surface ships to production, exposing simulateUserStop, getCurrentStream, captured handler refs to any page that can `eval` against the SW. THIS IS THE SECURITY-CRITICAL THREAT. Mitigation: Tier-1 grep gate (`tests/background/no-test-hooks-in-prod-bundle.test.ts`) enforces zero `__mokoshTest` in `dist/`; runs as part of `npm test` so any CI pipeline picks it up. |
| dev-dependency Chromium binary | NEW: Puppeteer downloads ~150 MB Chromium binary at `npm install` time. Supply-chain compromise of the Chrome download endpoint would inject malicious code into developer machines. Mitigation: `package-lock.json` integrity check (Puppeteer pins the Chromium download hash via its `@puppeteer/browsers` dependency). Out of scope: separate SCA for Puppeteer itself. |
| --auto-select-desktop-capture-source flag in CI | NEW: in a CI container, the flag auto-accepts the "Entire screen" source, which is whatever Xvfb (or modern headless surface) presents. If a CI runner is shared with sensitive workloads, the 35-second recording assertion captures whatever is on screen during that window. Mitigation: document that CI MUST run the harness in an isolated container with no concurrent workload; local-dev runs capture the operator's real screen for 35s during assertion 11, documented in README.md. |
|
||
|
||
## STRIDE Threat Register
|
||
|
||
| Threat ID | Category | Component | Disposition | Mitigation Plan |
|
||
|-----------|----------|-----------|-------------|-----------------|
|
||
| T-1-11-01 | Elevation of Privilege | `__mokoshTest` surface leaking into production `dist/` would expose simulateUserStop, captured chrome.* handler refs, and stream getter to any code with access to the SW context | mitigate | Two layers: (a) gated dynamic import per RESEARCH §6 (the literal `'test' !== 'production'` comparison is a static dead branch that Vite/Rollup tree-shake); (b) Tier-1 unit gate `tests/background/no-test-hooks-in-prod-bundle.test.ts` greps the BUILT artifact for `__mokoshTest` / `simulateUserStop` / `getSegmentCount` / `setCurrentStream` / `setSegmentCountGetter` — ZERO matches required for GREEN. Belt + suspenders catches both tree-shake regression AND new hook-name additions. |
|
||
| T-1-11-02 | Information Disclosure | 35-second recording assertion captures whatever is on the operator's screen during local-dev runs | accept | Operator-facing — local-dev runs are by definition under operator control; the recording is consumed only by ffprobe + jszip inside the harness process and is deleted with the temp downloads dir at process exit. CI runs document the isolated-container requirement in README.md. |
|
||
| T-1-11-03 | Tampering | Puppeteer downloads Chromium binary at `npm install`; supply-chain compromise of the download endpoint | accept | `package-lock.json` pins resolved hashes via Puppeteer's `@puppeteer/browsers` machinery. Same risk surface as any npm dependency. Phase 5 SCA work (out of scope here) covers periodic re-verification. |
|
||
| T-1-11-04 | Denial of Service | A pathological assertion 11 (35s wait) ties up CI runner time; combined with 14 sequential assertions, total runtime ~90s ties up a runner slot | accept | 90s is well within typical CI per-job budgets. Local-dev runs use `SKIP_PROD_REBUILD=1` to drop assertion 0's `npm run build` cost (~10s). Out of scope: parallelizing assertions (would require multi-browser instances, defeating the failure-isolation choice). |
| T-1-11-05 | Repudiation | The harness asserts the absence of recovery notification (Bug B path), but the assertion is a count-delta check — a notification fired BEFORE the snapshot would be invisible | mitigate | Each assertion snapshots `notificationCount` IMMEDIATELY before the trigger event AND immediately after the propagation wait. The delta is checked, not the absolute count. The `notificationIds` array is also asserted on for ID-prefix membership — even if delta counting were fooled by some interleaving, the absence of a 'mokosh-recovery-' prefix in the post-snapshot ids array catches the same regression. |
| T-1-11-06 | Spoofing | Harness reads `__mokoshTest.handlers.onStartup` and invokes it; a hostile production change could swap in a no-op handler that registers AFTER the hook captures the real handler | mitigate | The hook monkey-patches `addListener` AT THE TOP OF THE MODULE (before any production addListener calls). Any later addListener invocation still goes through the patched function and would OVERWRITE handlers.onStartup rather than bypass it. A malicious bypass would require directly calling `chrome.runtime.onStartup.addListener.call(...)` via a saved bound reference; none exist in the production tree (verified: grep for `addListener.call\|.bind(chrome.runtime.onStartup)` returns 0 matches). Defense in depth: the assertion verifies the captured handler actually fires the notification side-effect; a stub handler would fail assertion 8's notificationCount check. |
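
A minimal sketch of how the T-1-11-05 delta check and the T-1-11-06 capture-by-patching mitigation fit together. It runs outside the browser against a hand-rolled fake event, so `makeFakeEvent`, `installCapture`, `notificationDelta`, and the `TestHooks` shape are illustrative assumptions, not the real hook module. Real handlers are likely asynchronous; this synchronous version ignores that.

```typescript
// All names below are hypothetical stand-ins for the real hook module.
type Listener = () => void;

interface EventLike {
  addListener(cb: Listener): void;
}

// Minimal fake of chrome.runtime.onStartup so the sketch runs outside a browser.
function makeFakeEvent(): EventLike {
  const listeners: Listener[] = [];
  return {
    addListener: (cb: Listener) => {
      listeners.push(cb);
    },
  };
}

interface TestHooks {
  handlers: { onStartup?: Listener };
  notificationIds: string[];
}

// Wrap addListener so every registration is recorded and still forwarded.
function installCapture(event: EventLike, hooks: TestHooks): void {
  const original = event.addListener.bind(event);
  event.addListener = (cb: Listener) => {
    hooks.handlers.onStartup = cb; // a later registration overwrites, never bypasses
    original(cb);
  };
}

// Delta check: compare counts immediately around the trigger, not absolutes.
function notificationDelta(hooks: TestHooks, trigger: () => void): number {
  const before = hooks.notificationIds.length;
  trigger();
  return hooks.notificationIds.length - before;
}

const hooks: TestHooks = { handlers: {}, notificationIds: [] };
const onStartup = makeFakeEvent();
installCapture(onStartup, hooks); // must run before any production addListener

// Stand-in for the production registration and its notification side-effect.
onStartup.addListener(() => {
  hooks.notificationIds.push("mokosh-recovery-1");
});

const delta = notificationDelta(hooks, () => hooks.handlers.onStartup?.());
console.log(delta); // 1: the captured handler fired its side-effect
```

A stub handler that registers but creates no notification would yield a delta of 0 here, which is the condition the defense-in-depth check relies on.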
</threat_model>
<verification>
- `npm run test:uat` exits 0 against the current Plan 01-09 bundle; final line is exactly `UAT harness: 14/14 assertions passed`.
- `npm run build` exit 0; `grep -rln __mokoshTest dist/` returns 0; `grep -rln simulateUserStop dist/` returns 0; `grep -rln getSegmentCount dist/` returns 0.
- `npm run build:test` exit 0; `dist-test/` populated; `grep -rln __mokoshTest dist-test/` returns ≥1.
- `npx vitest run` exit 0; 85 GREEN across all test files (83 baseline + 2 from Task 1's Tier-1 grep gate).
- `npx tsc --noEmit` exit 0 across `src/` + `tests/`.
- Tier-1 SW-bundle-import gate (`tests/background/sw-bundle-import.test.ts`) GREEN — verifies the gated dynamic import does not break production module init.
- Tier-1 hook-leak gate (`tests/background/no-test-hooks-in-prod-bundle.test.ts`) GREEN — verifies the production bundle is hook-free.
- Bug B canonical RED-on-regression demo documented in Task 5's commit body (locally reverting b9eeeeb makes assertion 6 RED; re-applying makes GREEN).
- Bug A canonical RED-on-regression demo documented in Task 6's commit body (locally truncating icons/icon128.png makes assertions 8 + 9 RED; restoring makes GREEN).
- Plan 01-09 Task 5 amended at the end of its PLAN.md (no rewrite of the original body); STATE.md Decisions log carries the new Plan 01-11 entry.
- Operator confirms brand/design step 14 + types "approved" in Task 9.
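
The hook-leak checks above amount to a recursive scan of the built artifact for the hook identifiers. A rough sketch of such a scan, assuming Node's `fs` and `path` modules; `leaksInText` and `findHookLeaks` are hypothetical helper names, and the identifier list is copied from T-1-11-01:

```typescript
import { readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";

// Hook identifiers listed in T-1-11-01; extend this when new hooks are added.
const FORBIDDEN: string[] = [
  "__mokoshTest",
  "simulateUserStop",
  "getSegmentCount",
  "setCurrentStream",
  "setSegmentCountGetter",
];

// Pure check: which forbidden identifiers occur in one file's text?
function leaksInText(file: string, text: string, names: string[] = FORBIDDEN): string[] {
  return names.filter((name) => text.includes(name)).map((name) => `${file}: ${name}`);
}

// Recursively scan a build directory; an empty result means the bundle is hook-free.
function findHookLeaks(dir: string, names: string[] = FORBIDDEN): string[] {
  const leaks: string[] = [];
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    const full = join(dir, entry.name);
    if (entry.isDirectory()) {
      leaks.push(...findHookLeaks(full, names));
    } else {
      // latin1 decodes any byte sequence, so binary assets do not throw.
      leaks.push(...leaksInText(full, readFileSync(full, "latin1"), names));
    }
  }
  return leaks;
}
```

Under these assumptions, a Tier-1 test would expect `findHookLeaks("dist")` to be empty and `findHookLeaks("dist-test")` to be non-empty.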
</verification>
<success_criteria>
Plan 01-11 is complete when:
1. **Two-bundle separation is in place.** `npm run build` produces hook-free `dist/`; `npm run build:test` produces hook-enabled `dist-test/`. The Tier-1 grep gate enforces the production bundle's hook absence.
2. **All 14 harness assertions pass against the current Plan 01-09 bundle.** `npm run test:uat` exits 0; final line is `UAT harness: 14/14 assertions passed`.
3. **Both Phase-1-escapee bugs are now CI-callable.** Assertion 6 (Bug B state-machine routing) and Assertion 8 (Bug A icon-promoted notification) each have a RED-on-regression demo documented in their respective task's commit body, proving the harness assertion CAN catch a regression — not just pass under current conditions.
4. **Operator role retired for functional verification.** Plan 01-09 Task 5 steps 4-13 + 15 redirect to `npm run test:uat`; only step 1 (build) + step 14 (brand/design) retained. The amendment block in 01-09-PLAN.md preserves provenance (no rewrite of the original task).
5. **Existing 83 vitest tests remain GREEN.** Plus the 2 new Tier-1 gate tests in this plan = 85 total. No regression.
6. **`npx tsc --noEmit` exit 0.** All harness code + hook code type-clean.
7. **`npm run build` exit 0; `npm run build:test` exit 0.** Both production and test bundles emit cleanly.
8. **Operator confirms Task 9 brand/design acceptance + types "approved".** Plan 01-09 + Plan 01-11 close together.
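
Criterion 2 depends on the harness running its assertions in sequence and ending with an exact summary line. One way this could look, as a sketch only: `Assertion`, `RunReport`, `summaryLine`, and `runSequentially` are hypothetical names, and continuing past a failed assertion is an assumption about the failure-isolation choice, not something this plan spells out.

```typescript
// Hypothetical types; the real harness entry point is not shown in this plan.
interface Assertion {
  name: string;
  run: () => Promise<void>; // rejects on failure
}

interface RunReport {
  failures: string[];
  summary: string;
  exitCode: 0 | 1;
}

// Builds the final line in the format the verification section expects.
function summaryLine(passed: number, total: number): string {
  return `UAT harness: ${passed}/${total} assertions passed`;
}

// Runs assertions one at a time, in order; a failure is recorded and the run continues.
async function runSequentially(assertions: Assertion[]): Promise<RunReport> {
  const failures: string[] = [];
  for (const assertion of assertions) {
    try {
      await assertion.run();
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${assertion.name}: ${reason}`);
    }
  }
  const passed = assertions.length - failures.length;
  return {
    failures,
    summary: summaryLine(passed, assertions.length),
    exitCode: failures.length === 0 ? 0 : 1,
  };
}
```

A caller would print `summary` as the last line of output and pass `exitCode` to `process.exit`, so CI can rely on either signal.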
</success_criteria>
<output>
After completion, create `.planning/phases/01-stabilize-video-pipeline/01-11-SUMMARY.md` per the standard template. Cite:
- The 14 assertions landed GREEN (0: prod-bundle hook-leak grep gate; 1-13: functional contract from orchestrator brief).
- Both RED-on-regression canonical demos documented in commit bodies (Bug B for assertion 6; Bug A for assertion 8).
- The two-bundle separation (`dist/` vs `dist-test/`) verified by Tier-1 grep gate.
- npm script additions (`build:test` + `test:uat`); dev-dep additions (puppeteer + tsx) with resolved versions.
- Hook surface inventory (`__mokoshTest`: handlers, notification observables, getCurrentStream, getSegmentCount) + the gated dynamic import sites in `src/background/index.ts` + `src/offscreen/recorder.ts`.
- Plan 01-09 amendment block landed (Task 5 functional steps redirected; brand/design step retained).
- STATE.md decision log updated with the operator-retirement decision.
- Open questions resolved (5 from RESEARCH) + their resolutions; any new open questions surfaced during execution.
- Bundle-size delta (`dist/` before vs after; should be near-zero since gated dynamic imports tree-shake cleanly).
- Total harness runtime ranges observed (cold: ~90s including build steps; warm with SKIP_PROD_REBUILD=1: ~70s; the 35-second assertion 11 wait dominates).
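
For the bundle-size delta item, a small helper along these lines could produce the before/after figure. It is an illustrative sketch, assuming Node's `fs` and `path` modules; `dirSizeBytes` and `describeDelta` are hypothetical names, not existing project scripts.

```typescript
import { readdirSync, statSync } from "node:fs";
import { join } from "node:path";

// Sum the sizes of all regular files under a directory, recursively.
function dirSizeBytes(dir: string): number {
  let total = 0;
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    const full = join(dir, entry.name);
    total += entry.isDirectory() ? dirSizeBytes(full) : statSync(full).size;
  }
  return total;
}

// Format the change between two measurements as signed bytes plus a percentage.
function describeDelta(beforeBytes: number, afterBytes: number): string {
  const delta = afterBytes - beforeBytes;
  const sign = delta > 0 ? "+" : "";
  const pct = beforeBytes === 0 ? "n/a" : `${((delta / beforeBytes) * 100).toFixed(2)}%`;
  return `${sign}${delta} bytes (${pct})`;
}

console.log(describeDelta(1000, 1010)); // +10 bytes (1.00%)
```

Measuring `dirSizeBytes("dist")` on the commit before the plan and again after it gives the two inputs for `describeDelta`.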
</output>