feat(01-11): wave-2 — Puppeteer harness scaffolding + A0 GREEN, popup-bridge architecture

Task 3 of Plan 01-11 (Puppeteer UAT harness).

Harness file tree (tests/uat/):
- harness.test.ts: tsx-runnable top-to-bottom harness entry point. Runs
  A0 inline (filesystem grep gate, abort-on-fail T-1-11-01), then
  launches Chrome + opens popup bridge + queries manifest, then
  iterates A1-A13 stubs. Each stub throws
  "NOT YET IMPLEMENTED — Plan 01-11 Task N wires this assertion".
  Exit code = 0 on full pass, 1 otherwise. Final line:
  "UAT harness: N/14 assertions passed".
- lib/launch.ts: launchHarnessBrowser() — wraps puppeteer.launch with
  enableExtensions:[dist-test/], headless default (HEADLESS=0
  override), --no-sandbox + --auto-select-desktop-capture-source flags.
  Polls browser.extensions() until the extension registers (empirically
  ~100ms but the first call right after launch returns Map(0)). Opens
  both a blank page (for triggerExtensionAction) AND the popup page
  (the bridge surface). Returns { browser, extension, extensionId, sw,
  downloadsDir, page, popup }.
- lib/extension.ts: waitForOffscreenTarget + attachToOffscreen +
  countOffscreenTargets. Offscreen attach uses
  target.type() === 'background_page' + .asPage() (NOT .page() —
  RESEARCH §4 Pitfall 1).
- lib/sw.ts: chrome.* state queries via the POPUP page handle (NOT the
  WebWorker handle — see architecture note below). getBadgeText,
  getPopup, getManifest, getIconSize, getIsRecording (side-channeled
  through badge text), fireOnStartup (via __mokoshTestQuery bridge),
  sendSyntheticRecordingError, getNotificationSnapshot (via bridge),
  keepalivePing (no-op message to wake SW for ~30s).
- lib/offscreen.ts: getDisplaySurface, simulateUserStop (the
  dispatchEvent('ended') path per RESEARCH §7 BLOCKER — DO NOT REFACTOR
  to track.stop()), getSegmentCount.
- lib/assertions.ts: runAssertion(idx, name, buffers, fn) wrapper —
  records pass/fail/duration; on failure dumps last 30 lines of SW +
  offscreen console buffers to stderr before returning the failure
  record. assertEqual / assertMatch / assertTrue / assertGte / waitFor
  polling helper.
- lib/zip.ts: jszip-based assertArchiveShape + extractEntryToFile for
  assertions 12 + 13.
- README.md: runtime + local-debug + CI semantics + locale gotcha +
  dev-dep size note + assertion catalog table.
- tsconfig.json: per-tree type-check config (mirrors root tsconfig.json
  compiler options but includes the harness tree explicitly).

Architecture refinement (DEVIATION from RESEARCH §1 — Rule 1+3 inline fix):
- RESEARCH §1 sketched `sw.evaluate(() => chrome.action.getBadgeText({}))`
  as the chrome.* query path. Empirical probes during Task 3 execution
  against Puppeteer 25.0.2 + Chrome 148 + --headless=true revealed two
  blockers:
  1. Puppeteer's WebWorker.evaluate runs in an ISOLATED WORLD that
     carries SW globals (clients, registration, ...) but NOT the
     extension's full chrome.* API surface. Object.keys(chrome) inside
     sw.evaluate returns ["loadTimes","csi"] — the public webpage
     chrome, not the extension chrome.
  2. Chrome 148's headless mode aggressively suspends MV3 service
     workers; subsequent swTarget.worker() calls return
     "Protocol error: No target with given id found".
- WORKAROUND: open the popup page
  (chrome-extension://<id>/src/popup/index.html) as a separate
  Puppeteer Page. The popup has full chrome.* access (it's an extension
  context with same privileges as the SW) AND stable Puppeteer
  lifetime. For SW-globalThis state (__mokoshTest in the SW isolate,
  NOT in the popup), bridge via chrome.runtime.sendMessage. The popup
  sends { type: '__mokoshTestQuery', op: 'snapshot' | 'fire-on-startup'
  | 'handler-types' }; the SW hook's onMessage handler responds.
- Bridge implementation added to src/test-hooks/sw-hooks.ts — registers
  AFTER the production listeners so it never intercepts production
  messages (__mokoshTest* type is unambiguously test-only). Tier-1 grep
  gate (no-test-hooks-in-prod-bundle.test.ts) continues to enforce ZERO
  __mokoshTest occurrences in dist/ — the bridge handler is tree-shaken
  alongside the rest of the hook module via the __MOKOSH_UAT__ gate.
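
The bridge protocol described above can be modeled without Chrome as a pure function, which may help when reasoning about the message shapes. This is a minimal sketch under stated assumptions: `BridgeState`, `BridgeResponse`, and `answerBridgeQuery` are hypothetical names that do not appear in this commit, the startup handler is invoked synchronously here (the real hook defers it), and `handler-types` is reduced to a single field.

```typescript
// Hypothetical, Chrome-free model of the popup-to-SW bridge protocol.
// None of these names exist in the commit; they are for illustration only.

interface BridgeState {
  notificationCount: number;
  lastNotificationOptions: unknown;
  notificationIds: string[];
  onStartup: (() => unknown) | null;
}

type BridgeResponse =
  | { count: number; lastOptions: unknown; ids: string[] }
  | { onStartup: string }
  | { ok: true }
  | { ok: false; error: string };

/**
 * Returns the response for a bridge query, or undefined when the message
 * is not a bridge query and should be left to production listeners.
 */
function answerBridgeQuery(raw: unknown, state: BridgeState): BridgeResponse | undefined {
  if (raw === null || typeof raw !== 'object') return undefined;
  const message = raw as { type?: unknown; op?: unknown };
  if (message.type !== '__mokoshTestQuery') return undefined;

  switch (String(message.op ?? '')) {
    case 'snapshot':
      return {
        count: state.notificationCount,
        lastOptions: state.lastNotificationOptions,
        ids: state.notificationIds.slice(), // copy so callers cannot mutate state
      };
    case 'fire-on-startup':
      if (state.onStartup === null) return { ok: false, error: 'no-handler' };
      state.onStartup(); // simplification: the real hook schedules this call
      return { ok: true };
    case 'handler-types':
      return { onStartup: typeof state.onStartup };
    default:
      return { ok: false, error: 'unknown-op' };
  }
}
```

Returning `undefined` for foreign messages mirrors the "never intercepts production messages" guarantee: the caller can skip `sendResponse` entirely in that case.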

Other configuration changes:
- vitest.config.ts: exclude tests/uat/** from vitest discovery. The
  Puppeteer harness is invoked via `npm run test:uat` (not vitest);
  running it under vitest would try to launch real Chrome inside a
  vitest worker. The .test.ts suffix is retained for editor +
  naming-convention consistency with the rest of the tree.

Verification:
- npx tsc --noEmit (src/): exit 0
- npx tsc --noEmit -p tests/uat: exit 0
- npm run build: exit 0
- grep -rln '__mokoshTest|simulateUserStop|getSegmentCount|setCurrentStream|setSegmentCountGetter|__mokoshTestQuery|__mokoshKeepalive' dist/: ZERO matches
- npm run build:test: exit 0; dist-test/ populated with the new bridge code
- SKIP_BUILD=1 npx vitest run: 89/89 GREEN
- SKIP_PROD_REBUILD=1 npx tsx tests/uat/harness.test.ts:
  → A0 [PASS]: production bundle has no test-hook leaks (19ms)
  → Browser launches; popup opens; manifest read succeeds
  → A1-A13 [FAIL]: NOT YET IMPLEMENTED — Plan 01-11 Task N wires this
  → "UAT harness: 1/14 assertions passed, 13 failed (first failure: A1)"
  → Exit code: 1 (expected — 13 RED stubs intentional)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 09:14:58 +02:00
parent cb1a729962
commit dbd977c815
11 changed files with 1705 additions and 0 deletions
--- a/src/test-hooks/sw-hooks.ts
+++ b/src/test-hooks/sw-hooks.ts
@@ -227,4 +227,85 @@ globalThis.__mokoshTest = {
  },
 } as MokoshTestSurface;
 // ─── Harness message bridge ───────────────────────────────────────────
 // EMPIRICAL ARCHITECTURE NOTE: Puppeteer 25 + Chrome 148 + headless
 // cannot reliably evaluate against the SW directly. The harness
 // queries chrome.* state through the popup page (which has full
 // chrome.* API access) but cannot read the SW's globalThis.__mokoshTest
 // because the popup is a SEPARATE V8 isolate. So we bridge: the popup
 // sends chrome.runtime.sendMessage queries; this handler responds with
 // the queried state.
 //
 // Protocol — popup → SW message: { type: '__mokoshTestQuery', op: <string> }
 // Response shapes:
 //   op='snapshot' → { count, lastOptions, ids }
 //   op='fire-on-startup' → { ok: true } OR { ok: false, error: 'no-handler' }
 //   op='handler-types' → { onClicked, onStartup, notificationOnClicked }
 // Unknown ops respond { ok: false, error: 'unknown-op' }.
 //
 // Every branch below calls sendResponse synchronously and returns
 // `false`. Returning `true` would tell Chrome to keep the message
 // channel open for a later, asynchronous sendResponse; this handler
 // never needs that.
 // The bridge handler is registered AFTER the production listeners so
 // the hook never accidentally intercepts a production message —
 // __mokoshTest* messages are unambiguously test-only.
 chrome.runtime.onMessage.addListener((rawMessage, _sender, sendResponse) => {
  // Narrow the message — we accept ANY shape but only act on our type.
  if (rawMessage === null || typeof rawMessage !== 'object') {
    return false;
  }
  const message = rawMessage as { type?: unknown; op?: unknown };
  if (message.type !== '__mokoshTestQuery') {
    // Not our message — production handler will take it.
    return false;
  }
  const op = String(message.op ?? '');
  if (op === 'snapshot') {
    sendResponse({
      count: notificationCount,
      lastOptions: lastNotificationOptions,
      ids: notificationIds.slice(),
    });
    return false; // Sync response — return false per Chrome onMessage contract.
  }
  if (op === 'fire-on-startup') {
    const h = handlers.onStartup;
    if (h === null) {
      sendResponse({ ok: false, error: 'no-handler' });
      return false;
    }
    // Fire-and-respond. The handler may be async; we don't await it
    // for the response. Handler failures, synchronous or asynchronous,
    // are logged rather than reported back, because the response has
    // already been sent by the time the handler runs.
    try {
      // Schedule on microtask so the response goes out first; the
      // handler's side effects (notifications.create) happen right
      // after, before the next harness assertion polls.
      queueMicrotask(() => {
        // Call h() inside .then() so a synchronous throw becomes a
        // rejection and reaches the .catch below instead of escaping
        // the microtask as an uncaught exception.
        Promise.resolve()
          .then(() => h())
          .catch((err) => {
            // Swallow errors: the assertion 8 check is on the
            // notification side effect, not the handler's return value.
            console.warn('[mokoshTest bridge] onStartup handler threw:', err);
          });
      });
      sendResponse({ ok: true });
    } catch (err) {
      sendResponse({
        ok: false,
        error: err instanceof Error ? err.message : String(err),
      });
    }
    return false;
  }
  if (op === 'handler-types') {
    sendResponse({
      onClicked: typeof handlers.onClicked,
      onStartup: typeof handlers.onStartup,
      notificationOnClicked: typeof handlers.notificationOnClicked,
    });
    return false;
  }
  sendResponse({ ok: false, error: 'unknown-op' });
  return false;
 });
 export {};
--- a/tests/uat/README.md
+++ b/tests/uat/README.md
@@ -0,0 +1,106 @@
 # Mokosh UAT harness (Plan 01-11)
 Puppeteer-driven Node script that runs 14 assertions end-to-end against a
 real Chrome instance loaded with the Mokosh extension. Replaces Plan 01-09
 Task 5's operator-empirical functional verification (the operator retains
 only step 1 — build — and step 14 — brand/design acceptance).
 ## Quick start
 ```bash
 npm run test:uat
 ```
 This builds `dist-test/` (the hook-enabled bundle) and runs the harness.
 Exit 0 means all 14 assertions passed. Final line: `UAT harness: 14/14
 assertions passed`.
 ## Local-debug mode
 ```bash
 HEADLESS=0 npm run test:uat
 ```
 Opens a real Chrome window so you can watch the picker auto-accept, the
 badge transitions, the popup appear, etc.
 ## Developer iteration tricks
 ```bash
 # Skip the production build inside assertion 0 (uses existing dist/):
 SKIP_PROD_REBUILD=1 npm run test:uat
 # Run the harness against an existing dist-test/ (skip npm run build:test):
 npx tsx tests/uat/harness.test.ts
 ```
 ## Assertion catalog
 | # | Title | Bug class | Hook used |
 |---|-------|-----------|-----------|
 | 0 | Production bundle has no test-hook leaks | T-1-11-01 | filesystem grep |
 | 1 | SW bootstrap → setIdleMode | — | popup page (lib/sw.ts) |
 | 2 | Toolbar onClicked-idle → REC + popup | — | triggerExtensionAction |
 | 3 | Offscreen displaySurface === monitor | D-15 | __mokoshTest.getCurrentStream |
 | 4 | Toolbar onClicked-recording → popup, no new offscreen | — | targets count |
 | 5 | SAVE_ARCHIVE → download fires | — | downloads polling |
 | 6 | **BUG B**: simulateUserStop → badge OFF + no recovery notif | b9eeeeb | dispatchEvent('ended') |
 | 7 | RECORDING_ERROR codec-unsupported → ERR + recovery notif | — | sendMessage |
 | 8 | **BUG A**: onStartup → mokosh-startup- notification creates | a881bf0 | __mokoshTest.handlers.onStartup |
 | 9 | Icon file sizes meet floors | Bug A precondition | popup page (lib/sw.ts getIconSize) |
 | 10 | Manifest has notifications + 3 icons | Bug A precondition | chrome.runtime.getManifest |
 | 11 | 35s recording → segments.length >= 3 | D-13 | __mokoshTest.getSegmentCount |
 | 12 | ffprobe on extracted webm exits 0 | Plan 01-08 | jszip + execFile |
 | 13 | Archive shape — video + meta.json version match | Plan 01-07 | jszip |
 ## Failure isolation
 Single browser, serial assertions, bail on first failure for
 setup-dependent assertions (assertion 0 abort means refusing to launch
 a potentially-leaky bundle). Per-assertion bail keeps the diagnostic
 output unambiguous — see RESEARCH §5 + Plan 01-11 open-question
 resolution 4.
 On failure, the harness dumps the last 30 lines of SW console + last 30
 lines of offscreen console (captured live during the run) to stderr
 before recording the failure. This gives you contextual triage without
 needing to re-run with debug logging.
 ## Known gotchas
 ### Locale-specific picker auto-accept
 The `--auto-select-desktop-capture-source=Entire screen` Chrome flag
 auto-accepts the screen-share picker. The string `"Entire screen"` is
 en_US-specific. If your Chrome is set to a non-English locale, the
 picker option label will differ and the auto-accept will silently fail
 (picker stays open; assertion 2 times out).
 Fallback: switch your Chrome user-data-dir's locale to en_US for
 harness runs, OR adjust the launch arg in `tests/uat/lib/launch.ts` to
 match your locale's equivalent string.
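
The second fallback amounts to swapping one string in the launch args. A minimal sketch, assuming a hypothetical `captureSourceFlag` helper (the contents of `tests/uat/lib/launch.ts` are not shown here, so this is not the harness's actual code) that builds the flag from a picker label:

```typescript
// Hypothetical helper: build the picker auto-accept flag for a given label.
// The en_US default comes from this README; any other label is an
// assumption you must verify against your own Chrome locale.
const DEFAULT_SOURCE_LABEL = 'Entire screen';

function captureSourceFlag(label: string = DEFAULT_SOURCE_LABEL): string {
  const trimmed = label.trim();
  if (trimmed === '') {
    throw new Error('capture source label must not be empty');
  }
  // No shell quoting is needed when the flag is passed as one element of
  // an args array, because each element reaches Chrome as a single argument.
  return `--auto-select-desktop-capture-source=${trimmed}`;
}

// Example: the flag as it would appear in a launch args array.
const launchArgs: string[] = ['--no-sandbox', captureSourceFlag()];
```

Keeping the label in one place makes a locale switch a one-line change rather than an edit to the launch call itself.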
 ### dev-dep Chromium binary size
 `puppeteer` pulls a ~150 MB Chromium binary at `npm install` time. CI
 must accept this. Production `npm install --omit=dev` skips it cleanly.
 ### Xvfb is NOT required
 Per Plan 01-11 RESEARCH §3 empirical probes against Chrome 148, the
 `--headless=new` mode handles screen capture without Xvfb on Linux CI
 runners. If a future Chrome regresses this, `Xvfb :99 & DISPLAY=:99
 npm run test:uat` is the fallback.
 ### CI runner screen-capture concern
 The 35s recording assertion (A11) captures whatever is on screen during
 that window. CI MUST run the harness in an isolated container with no
 concurrent workload — see T-1-11-02 in Plan 01-11's threat model.
 ### Real Chrome download (assertion 5 → A12)
 The harness configures per-page download behavior via CDP to a fresh
 `os.tmpdir()/mokosh-uat-downloads-*` directory; downloads are NOT
 written to your real ~/Downloads. The temp directory is deleted by OS
 tmpdir GC.
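
If relying on OS tmpdir cleanup is ever not enough (for example on a long-lived CI runner), the directory can also be removed explicitly. The following is a sketch only: `createDownloadsDir` and `removeDownloadsDir` are hypothetical names, and the harness in this commit does not perform this cleanup.

```typescript
import { existsSync, mkdtempSync, rmSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { basename, join } from 'node:path';

// Prefix taken from this README; mkdtempSync appends random characters.
const DOWNLOADS_PREFIX = 'mokosh-uat-downloads-';

/** Create a fresh, uniquely named downloads directory under the OS tmpdir. */
function createDownloadsDir(): string {
  return mkdtempSync(join(tmpdir(), DOWNLOADS_PREFIX));
}

/** Remove a directory created by createDownloadsDir, refusing anything else. */
function removeDownloadsDir(dir: string): void {
  if (!basename(dir).startsWith(DOWNLOADS_PREFIX)) {
    throw new Error(`refusing to delete non-harness directory: ${dir}`);
  }
  rmSync(dir, { recursive: true, force: true });
}

// Example: create, use, then clean up in a finally block.
const exampleDir = createDownloadsDir();
try {
  // ... point the browser's download behavior at exampleDir here ...
} finally {
  removeDownloadsDir(exampleDir);
}
```

The prefix guard is a deliberate safety choice: a recursive delete should never run against a path the harness did not create.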
--- a/tests/uat/harness.test.ts
+++ b/tests/uat/harness.test.ts
@@ -0,0 +1,394 @@
 // tests/uat/harness.test.ts — Plan 01-11 Puppeteer UAT harness entry point.
 //
 // Runs end-to-end via `npm run test:uat` (build:test + tsx tests/uat/harness.test.ts).
 // Top-to-bottom narrative: launch Chrome with dist-test loaded as
 // MV3 extension, attach to SW + offscreen, run 14 assertions
 // sequentially with bail-on-first-fail semantics + structured
 // diagnostic dump on failure (RESEARCH §5 + open-question resolution 4).
 //
 // Exit code:
 //   0 — all 14 assertions passed
 //   1 — at least one assertion failed
 //
 // Local-debug mode: `HEADLESS=0 npm run test:uat` (opens real Chrome)
 // Skip prod rebuild: `SKIP_PROD_REBUILD=1` (assertion 0 still verifies
 // the EXISTING dist/ rather than spawning npm run build).
 //
 // Assertion catalog (14 total):
 //   0  — Production bundle grep gate (filesystem-only; pre-flight).
 //   1  — SW bootstrap → setIdleMode (badge '', popup '', isRecording=false).
 //   2  — Toolbar onClicked-idle → badge 'REC' + popup popup.html + isRecording=true.
 //   3  — Offscreen displaySurface === 'monitor' (post-grant validation).
 //   4  — Toolbar onClicked while recording → popup, NO new offscreen.
 //   5  — SAVE_ARCHIVE → download fires + session_report_*.zip appears.
 //   6  — BUG B (canonical): simulateUserStop → badge '' + popup '' + NO recovery notif.
 //   7  — RECORDING_ERROR codec-unsupported → badge 'ERR' + recovery notif.
 //   8  — BUG A (canonical): onStartup → mokosh-startup- notification creates cleanly.
 //   9  — Icon file sizes meet floors (16→200, 48→500, 128→1024).
 //   10 — Manifest has notifications permission + all three icons declared.
 //   11 — 35s recording yields >= 3 segments per D-13.
 //   12 — ffprobe -v error -f matroska on extracted webm exits 0.
 //   13 — Archive shape (video/last_30sec.webm + meta.json with version match).
 import { execFileSync, execSync } from 'node:child_process';
 import { existsSync, readdirSync, readFileSync, statSync, mkdtempSync } from 'node:fs';
 import { tmpdir } from 'node:os';
 import { dirname, join, resolve as resolvePath } from 'node:path';
 import { fileURLToPath } from 'node:url';
 import type { Page } from 'puppeteer';
 import {
  type AssertionRecord,
  type ConsoleBuffers,
  assertEqual,
  assertGte,
  assertMatch,
  assertTrue,
  runAssertion,
  waitFor,
 } from './lib/assertions';
 import {
  attachToOffscreen,
  countOffscreenTargets,
  waitForOffscreenTarget,
 } from './lib/extension';
 import {
  getDisplaySurface,
  getSegmentCount,
  simulateUserStop,
 } from './lib/offscreen';
 import {
  fireOnStartup,
  getBadgeText,
  getIconSize,
  getIsRecording,
  getManifest,
  getNotificationSnapshot,
  getPopup,
  keepalivePing,
  sendSyntheticRecordingError,
 } from './lib/sw';
 import { assertArchiveShape, extractEntryToFile } from './lib/zip';
 import { launchHarnessBrowser, type HarnessHandles } from './lib/launch';
 const HARNESS_FILE_DIR = dirname(fileURLToPath(import.meta.url));
 const REPO_ROOT = resolvePath(HARNESS_FILE_DIR, '..', '..');
 const DIST_DIR = resolvePath(REPO_ROOT, 'dist');
 const FFPROBE_BIN = '/usr/bin/ffprobe';
 const TOTAL_ASSERTIONS = 14;
 /**
 * Forbidden hook surface strings — assertion 0 verifies absence
 * in production dist/. Mirrors the Tier-1 unit gate's surface list
 * (tests/background/no-test-hooks-in-prod-bundle.test.ts) but runs
 * against the SAME dist/ as the live harness for E2E parity.
 */
 const FORBIDDEN_HOOK_STRINGS: ReadonlyArray<string> = [
  '__mokoshTest',
  'simulateUserStop',
  'getSegmentCount',
  'setCurrentStream',
  'setSegmentCountGetter',
 ];
 /** Icon-size floors per assertion 9 (per orchestrator brief). */
 const ICON_SIZE_FLOORS: ReadonlyArray<readonly [string, number]> = [
  ['icons/icon16.png', 200],
  ['icons/icon48.png', 500],
  ['icons/icon128.png', 1024],
 ];
 /**
 * Recursively list all files under a root directory (sync). Used by
 * assertion 0 to walk dist/. Symlinks are skipped defensively.
 *
 * @param root - Absolute directory path.
 * @returns Sorted list of absolute file paths.
 */
 function listAllFilesRecursive(root: string): ReadonlyArray<string> {
  const acc: string[] = [];
  const stack: string[] = [root];
  while (stack.length > 0) {
    const dir = stack.pop()!;
    const entries = readdirSync(dir, { withFileTypes: true });
    for (const entry of entries) {
      const fullPath = resolvePath(dir, entry.name);
      if (entry.isSymbolicLink()) continue;
      if (entry.isDirectory()) {
        stack.push(fullPath);
      } else if (entry.isFile()) {
        acc.push(fullPath);
      }
    }
  }
  return acc.sort();
 }
 /**
 * Grep `needle` across every text-like file under `root`. Returns
 * file paths that contain at least one occurrence.
 *
 * @param root - Absolute directory path.
 * @param needle - Literal substring to find.
 * @returns Paths containing `needle`.
 */
 function grepRecursive(root: string, needle: string): ReadonlyArray<string> {
  const binaryExt = new Set(['.png', '.jpg', '.jpeg', '.gif', '.ico', '.webp', '.woff', '.woff2', '.ttf']);
  const out: string[] = [];
  for (const filePath of listAllFilesRecursive(root)) {
    const dotIdx = filePath.lastIndexOf('.');
    const ext = dotIdx >= 0 ? filePath.substring(dotIdx).toLowerCase() : '';
    if (binaryExt.has(ext)) continue;
    if (statSync(filePath).size === 0) continue;
    const text = readFileSync(filePath, 'utf8');
    if (text.includes(needle)) {
      out.push(filePath);
    }
  }
  return out;
 }
 /**
 * Poll `downloadsDir` for any *session_report*.zip file. Returns the
 * absolute path of the first match. Used by assertion 5.
 *
 * @param downloadsDir - Absolute downloads directory path.
 * @param timeoutMs - Maximum wait time.
 * @returns Absolute path to the matched .zip.
 * @throws On timeout.
 */
 async function waitForDownloadedZip(
  downloadsDir: string,
  timeoutMs: number,
 ): Promise<string> {
  const start = Date.now();
  while (Date.now() - start < timeoutMs) {
    const entries = readdirSync(downloadsDir);
    for (const name of entries) {
      if (name.includes('session_report') && name.endsWith('.zip')) {
        const full = join(downloadsDir, name);
        // Make sure write completed (size stabilized).
        const size1 = statSync(full).size;
        await new Promise((r) => setTimeout(r, 200));
        const size2 = statSync(full).size;
        if (size1 === size2 && size1 > 0) {
          return full;
        }
      }
    }
    await new Promise((r) => setTimeout(r, 200));
  }
  throw new Error(
    `waitForDownloadedZip: no session_report_*.zip appeared in ${downloadsDir} within ${timeoutMs}ms`,
  );
 }
 /**
 * Run a production build of dist/ unless SKIP_PROD_REBUILD=1.
 * Assertion 0 reads dist/, so this guarantees the gate runs against
 * a fresh artifact.
 */
 function ensureProductionBuild(): void {
  if (process.env.SKIP_PROD_REBUILD === '1') {
    process.stdout.write('  (SKIP_PROD_REBUILD=1 — using existing dist/)\n');
    return;
  }
  process.stdout.write('  Running `npm run build` (assertion 0 pre-flight)...\n');
  execFileSync('npm', ['run', 'build'], {
    stdio: 'inherit',
    cwd: REPO_ROOT,
  });
 }
 /**
 * Stub placeholder for assertions Task 4+ wires. Each stub throws so
 * the harness exits non-zero today; the diagnostic clearly identifies
 * the assertion as un-implemented vs failing-in-production.
 *
 * @param taskNumber - The plan task number that will wire this assertion.
 * @returns A function that always throws.
 */
 function notYetImplemented(taskNumber: number): () => Promise<void> {
  return async () => {
    throw new Error(
      `NOT YET IMPLEMENTED — Plan 01-11 Task ${taskNumber} wires this assertion`,
    );
  };
 }
 /**
 * Main harness entry point. Runs all 14 assertions sequentially with
 * bail-on-first-fail semantics for the SETUP-dependent assertions
 * (we still record every assertion's outcome — bail only stops
 * subsequent FUNCTIONAL assertions from running).
 */
 async function main(): Promise<number> {
  const results: AssertionRecord[] = [];
  const buffers: ConsoleBuffers = { swLines: [], offscreenLines: [] };
  let handles: HarnessHandles | null = null;
  process.stdout.write('\nMokosh UAT harness — Plan 01-11 Puppeteer-driven 14-assertion suite\n');
  process.stdout.write('='.repeat(72) + '\n\n');
  try {
    // ─── Assertion 0: Pre-flight grep gate ──────────────────────────
    process.stdout.write('Assertion 0 (pre-flight, filesystem-only):\n');
    ensureProductionBuild();
    const a0 = await runAssertion(
      0,
      'production bundle has no test-hook leaks (T-1-11-01)',
      buffers,
      async () => {
        for (const needle of FORBIDDEN_HOOK_STRINGS) {
          const matches = grepRecursive(DIST_DIR, needle);
          assertEqual(
            matches.length,
            0,
            `production dist/ contains '${needle}' in: ${JSON.stringify(matches)}`,
          );
        }
      },
    );
    results.push(a0);
    if (!a0.passed) {
      // Hook leak is security-critical (T-1-11-01) — abort immediately.
      process.stderr.write(
        '\n*** ABORT: assertion 0 (hook leak gate) FAILED — refusing to ' +
          'continue with potentially-leaky production bundle. ***\n',
      );
      return 1;
    }
    // ─── Setup: launch browser, attach to SW + open popup bridge ───
    process.stdout.write('\nLaunching Chrome + opening popup bridge...\n');
    handles = await launchHarnessBrowser();
    const { browser, sw, page, popup, extensionId, downloadsDir } = handles;
    process.stdout.write(`  extensionId: ${extensionId}\n`);
    process.stdout.write(`  downloadsDir: ${downloadsDir}\n`);
    process.stdout.write(`  popup: chrome-extension://${extensionId}/src/popup/index.html\n\n`);
    // Wire console buffers. The popup carries the chrome.* queries;
    // the SW handle is kept for diagnostic console capture (when the
    // SW is alive). Both feed buffers for failure dumps.
    const popupPage: Page = popup;
    popupPage.on('console', (msg) => {
      buffers.swLines.push(`[Popup:${msg.type()}] ${msg.text()}`);
    });
    sw.on('console', (msg) => {
      buffers.swLines.push(`[SW:${msg.type()}] ${msg.text()}`);
    });
    // Read the manifest version once for assertion 13.
    const manifest = await getManifest(popupPage);
    const expectedVersion = manifest.version;
    // ─── Wave 3 stubbed assertions (Tasks 4-7 will wire these) ──────
    const stubs: Array<{
      index: number;
      name: string;
      taskNumber: number;
    }> = [
      { index: 1, name: 'SW bootstrap → setIdleMode', taskNumber: 4 },
      { index: 2, name: 'toolbar onClicked-idle → badge REC + popup', taskNumber: 4 },
      { index: 3, name: 'offscreen displaySurface === monitor', taskNumber: 4 },
      { index: 4, name: 'toolbar onClicked-recording → popup, no new offscreen', taskNumber: 4 },
      { index: 5, name: 'SAVE_ARCHIVE → download fires + zip appears', taskNumber: 5 },
      { index: 6, name: 'BUG B canonical: simulateUserStop → badge OFF + no recovery notif', taskNumber: 5 },
      { index: 7, name: 'RECORDING_ERROR codec-unsupported → badge ERR + recovery notif', taskNumber: 5 },
      { index: 8, name: 'BUG A canonical: onStartup → notification creates cleanly', taskNumber: 6 },
      { index: 9, name: 'icon file sizes meet floors', taskNumber: 6 },
      { index: 10, name: 'manifest has notifications + 3 icons', taskNumber: 6 },
      { index: 11, name: '35s recording → segments.length >= 3', taskNumber: 7 },
      { index: 12, name: 'ffprobe on extracted webm exits 0', taskNumber: 7 },
      { index: 13, name: 'archive shape — video + meta.json version match', taskNumber: 7 },
    ];
    for (const s of stubs) {
      const rec = await runAssertion(
        s.index,
        s.name,
        buffers,
        notYetImplemented(s.taskNumber),
      );
      results.push(rec);
    }
    // Suppress unused-warning placeholders — Tasks 4-7 will use these
    // imports + handles directly. Reference them here for type-clean.
    void browser;
    void page;
    void popupPage;
    void expectedVersion;
    void waitForOffscreenTarget;
    void attachToOffscreen;
    void countOffscreenTargets;
    void waitFor;
    void getBadgeText;
    void getPopup;
    void getIsRecording;
    void getIconSize;
    void fireOnStartup;
    void sendSyntheticRecordingError;
    void getNotificationSnapshot;
    void keepalivePing;
    void getDisplaySurface;
    void simulateUserStop;
    void getSegmentCount;
    void assertArchiveShape;
    void extractEntryToFile;
    void assertMatch;
    void assertTrue;
    void assertGte;
    void waitForDownloadedZip;
    void mkdtempSync;
    void existsSync;
    void execSync;
    void tmpdir;
    void FFPROBE_BIN;
    void ICON_SIZE_FLOORS;
    return finalize(results);
  } catch (setupErr) {
    process.stderr.write(`\n*** Harness setup error: ${String(setupErr)}\n`);
    return finalize(results);
  } finally {
    if (handles !== null) {
      try {
        await handles.browser.close();
      } catch (closeErr) {
        process.stderr.write(`(non-fatal: browser close threw: ${String(closeErr)})\n`);
      }
    }
  }
 }
 /**
 * Print the final summary line + return the exit code.
 *
 * @param results - All assertion records collected during the run.
 * @returns 0 if all 14 passed, 1 otherwise.
 */
 function finalize(results: ReadonlyArray<AssertionRecord>): number {
  const passCount = results.filter((r) => r.passed).length;
  const failCount = results.length - passCount;
  process.stdout.write('\n' + '='.repeat(72) + '\n');
  if (passCount === TOTAL_ASSERTIONS) {
    process.stdout.write(`UAT harness: ${passCount}/${TOTAL_ASSERTIONS} assertions passed\n`);
    return 0;
  }
  const firstFail = results.find((r) => !r.passed);
  process.stdout.write(
    `UAT harness: ${passCount}/${TOTAL_ASSERTIONS} assertions passed, ${failCount} failed`,
  );
  if (firstFail !== undefined) {
    process.stdout.write(` (first failure: A${firstFail.index} ${firstFail.name})`);
  }
  process.stdout.write('\n');
  return 1;
 }
 // Run + exit. Top-level await + explicit exit code so tsx returns
 // the right status without leaving unhandled-promise spew on stderr.
 const exitCode = await main();
 process.exit(exitCode);
--- a/tests/uat/lib/assertions.ts
+++ b/tests/uat/lib/assertions.ts
@@ -0,0 +1,199 @@
 // tests/uat/lib/assertions.ts — Plan 01-11 harness assertion runner.
 //
 // Centralizes:
 //   - `assertEqual` / `assertMatch` / `assertTrue` / `assertGte` — thin
 //     wrappers over `node:assert/strict` with explicit Plan 01-11
 //     diagnostic framing (cite the bug-class on Bug A / Bug B assertions).
 //   - `runAssertion(index, name, buffers, fn)` — wraps each assertion in
 //     a try/catch so the harness can collect a per-assertion pass/fail
 //     record AND dump SW/offscreen console buffers on failure (bail
 //     semantics per RESEARCH §5).
 //   - `waitFor(probe, predicate, timeoutMs, description)` — polling
 //     helper used by assertions that need to wait for async state
 //     transitions (badge changes, downloads, etc.).
 //
 // References:
 //   - node:assert/strict: https://nodejs.org/api/assert.html#strict-assertion-mode
 import { strict as assert } from 'node:assert';
 /**
 * Per-assertion outcome record. Accumulated by runAssertion + flushed
 * to the harness's final summary line.
 */
 export interface AssertionRecord {
  readonly index: number;
  readonly name: string;
  readonly passed: boolean;
  readonly errorMessage: string;
  readonly durationMs: number;
 }
 /**
 * Console buffers captured from SW + offscreen contexts. The harness
 * wires `sw.on('console', ...)` + `offPage.on('console', ...)` at
 * launch + before each assertion-relevant phase; on failure these
 * buffers are dumped to stderr for triage.
 */
 export interface ConsoleBuffers {
  swLines: string[];
  offscreenLines: string[];
 }
 /**
 * Run a single assertion, capturing its outcome + duration. On error,
 * dump the per-context console buffers to stderr, then return a failed
 * record (the error is NOT rethrown) so the caller decides whether to bail.
 *
 * @param index - 0-13 (0 = grep gate, 1-13 = functional).
 * @param name - Human-readable assertion title.
 * @param buffers - Console buffers to dump on failure (may be empty).
 * @param fn - Async assertion body.
 * @returns Outcome record.
 */
 export async function runAssertion(
  index: number,
  name: string,
  buffers: ConsoleBuffers,
  fn: () => Promise<void>,
 ): Promise<AssertionRecord> {
  const start = Date.now();
  try {
    await fn();
    const durationMs = Date.now() - start;
    process.stdout.write(`  [PASS] A${index}: ${name} (${durationMs}ms)\n`);
    return {
      index,
      name,
      passed: true,
      errorMessage: '',
      durationMs,
    };
  } catch (err) {
    const durationMs = Date.now() - start;
    const errorMessage =
      err instanceof Error ? `${err.name}: ${err.message}` : String(err);
    process.stderr.write(`  [FAIL] A${index}: ${name} (${durationMs}ms)\n`);
    process.stderr.write(`         ${errorMessage}\n`);
    dumpBuffers(buffers, index);
    return {
      index,
      name,
      passed: false,
      errorMessage,
      durationMs,
    };
  }
 }
 /**
 * Dump SW + offscreen console buffers to stderr with structured framing.
 * Cap at the last 30 lines per context to keep failure output readable.
 *
 * @param buffers - The accumulating buffers.
 * @param assertionIndex - For framing the dump preamble.
 */
 function dumpBuffers(buffers: ConsoleBuffers, assertionIndex: number): void {
  const TAIL = 30;
  const swTail = buffers.swLines.slice(-TAIL);
  const offTail = buffers.offscreenLines.slice(-TAIL);
  if (swTail.length > 0) {
    process.stderr.write(
      `         --- SW console (last ${swTail.length} lines, assertion A${assertionIndex}) ---\n`,
    );
    for (const line of swTail) {
      process.stderr.write(`           ${line}\n`);
    }
  }
  if (offTail.length > 0) {
    process.stderr.write(
      `         --- Offscreen console (last ${offTail.length} lines, assertion A${assertionIndex}) ---\n`,
    );
    for (const line of offTail) {
      process.stderr.write(`           ${line}\n`);
    }
  }
 }
 /**
 * Strict equality with a context-bearing message. Wraps
 * `assert.strictEqual` so the failure surface is uniform across
 * assertions.
 *
 * @param actual - Observed value.
 * @param expected - Expected value.
 * @param msg - Context for the failure diagnostic.
 */
 export function assertEqual<T>(actual: T, expected: T, msg: string): void {
  assert.strictEqual(actual, expected, msg);
 }
 /**
 * Assert that `actual` matches `regex`. Wraps `assert.match`.
 *
 * @param actual - String to test.
 * @param regex - Pattern.
 * @param msg - Context for the failure diagnostic.
 */
 export function assertMatch(actual: string, regex: RegExp, msg: string): void {
  assert.match(actual, regex, msg);
 }
 /**
 * Assert that `cond` is truthy. Wraps `assert.ok`.
 *
 * @param cond - Boolean expression.
 * @param msg - Context for the failure diagnostic.
 */
 export function assertTrue(cond: boolean, msg: string): void {
  assert.ok(cond, msg);
 }
 /**
 * Assert that the actual value is greater than or equal to expected.
 * Used by assertion 9 (icon size floors) + assertion 11 (segment count).
 *
 * @param actual - Observed value.
 * @param expected - Minimum acceptable value.
 * @param msg - Context for the failure diagnostic.
 */
 export function assertGte(actual: number, expected: number, msg: string): void {
  assert.ok(
    actual >= expected,
    `${msg} — expected >= ${expected}, got ${actual}`,
  );
 }
 /**
 * Poll `probe` until `predicate(await probe())` returns true or
 * `timeoutMs` elapses. Throws on timeout with a structured diagnostic.
 *
 * @param probe - Async function producing a value to test.
 * @param predicate - Returns true when the value satisfies the wait.
 * @param timeoutMs - Maximum wait time.
 * @param description - Human-readable description for the diagnostic.
 * @param pollIntervalMs - Interval between probe calls (default 100ms).
 * @returns The first probed value that satisfies the predicate.
 * @throws If timeoutMs elapses without predicate satisfaction.
 */
 export async function waitFor<T>(
  probe: () => Promise<T>,
  predicate: (v: T) => boolean,
  timeoutMs: number,
  description: string,
  pollIntervalMs: number = 100,
 ): Promise<T> {
  const start = Date.now();
  let lastValue: T | undefined;
  while (Date.now() - start < timeoutMs) {
    lastValue = await probe();
    if (predicate(lastValue)) {
      return lastValue;
    }
    await new Promise((r) => setTimeout(r, pollIntervalMs));
  }
  throw new Error(
    `waitFor timeout ${timeoutMs}ms — ${description}; ` +
      `last probed value: ${JSON.stringify(lastValue)}`,
  );
 }
--- a/tests/uat/lib/extension.ts
+++ b/tests/uat/lib/extension.ts
@@ -0,0 +1,93 @@
 // tests/uat/lib/extension.ts — Plan 01-11 harness extension/offscreen helpers.
 //
 // The offscreen-document attach uses a CDP-level target type that
 // Puppeteer 25 surfaces as `'background_page'` — NOT `'page'`. Per
 // Plan 01-11 RESEARCH §4 / Pitfall 1, finding the offscreen via
 // `t.type() === 'page'` returns no matches; `'background_page'` is
 // the right discriminator. After getting the target, `.asPage()`
 // returns a Page-like handle (NOT `.page()`, which resolves to null).
 //
 // References:
 //   - Puppeteer Target types:
 //     https://pptr.dev/api/puppeteer.targettype
 //   - Chrome offscreen document:
 //     https://developer.chrome.com/docs/extensions/reference/api/offscreen
 import type { Browser, Page, Target } from 'puppeteer';
 /** How long to wait for the offscreen document target to appear. */
 const OFFSCREEN_TARGET_TIMEOUT_MS = 5_000;
 /**
 * Poll the browser's target list for the offscreen document. The
 * offscreen is created lazily — only when the SW issues
 * `chrome.offscreen.createDocument(...)`. Caller MUST invoke a flow
 * that triggers offscreen creation (e.g. start a recording) BEFORE
 * calling this helper.
 *
 * @param browser - Puppeteer Browser handle.
 * @param extensionId - The extension's runtime id (for URL filtering).
 * @returns Resolved Target whose URL contains 'offscreen'.
 * @throws If no offscreen target appears within OFFSCREEN_TARGET_TIMEOUT_MS.
 */
 export async function waitForOffscreenTarget(
  browser: Browser,
  extensionId: string,
 ): Promise<Target> {
  const predicate = (t: Target): boolean => {
    const url = t.url();
    // Offscreen documents are loaded as chrome-extension://<id>/...
    // with a path containing 'offscreen' (matches both 'src/offscreen/'
    // and the bundled equivalents). Target type 'background_page' per
    // RESEARCH §4 Pitfall 1.
    return (
      t.type() === 'background_page' &&
      url.startsWith(`chrome-extension://${extensionId}`) &&
      url.includes('offscreen')
    );
  };
  return await browser.waitForTarget(predicate, {
    timeout: OFFSCREEN_TARGET_TIMEOUT_MS,
  });
 }
 /**
 * Attach to the offscreen document as a Page-like handle. Uses
 * `.asPage()` (NOT `.page()` — Puppeteer 25 returns null for
 * `.page()` on background_page-type targets).
 *
 * @param target - The offscreen Target from waitForOffscreenTarget.
 * @returns Page handle for evaluate/expose/etc.
 */
 export async function attachToOffscreen(target: Target): Promise<Page> {
  const page = await target.asPage();
  return page;
 }
 /**
 * Count the offscreen targets currently in the browser. Used by
 * assertion 4 to verify that a toolbar click while recording does
 * NOT spawn a second offscreen document.
 *
 * @param browser - Puppeteer Browser handle.
 * @param extensionId - The extension's runtime id.
 * @returns Integer count of offscreen targets.
 */
 export function countOffscreenTargets(
  browser: Browser,
  extensionId: string,
 ): number {
  const targets = browser.targets();
  let count = 0;
  for (const t of targets) {
    if (
      t.type() === 'background_page' &&
      t.url().startsWith(`chrome-extension://${extensionId}`) &&
      t.url().includes('offscreen')
    ) {
      count += 1;
    }
  }
  return count;
 }
--- a/tests/uat/lib/launch.ts
+++ b/tests/uat/lib/launch.ts
@@ -0,0 +1,314 @@
 // tests/uat/lib/launch.ts — Plan 01-11 harness launch helper.
 //
 // Wraps puppeteer.launch with the project's invariants:
 //   - enableExtensions points at the absolute path to dist-test/ (the
 //     test bundle that carries the gated test hooks per Plan 01-11
 //     Task 2). NOT dist/ — that would defeat the harness entirely.
 //   - headless defaults to true (CI-friendly); HEADLESS=0 env opens a
 //     real Chrome window for local debugging.
 //   - --auto-select-desktop-capture-source="Entire screen" auto-accepts
 //     the screen-share picker so getDisplayMedia resolves without
 //     operator interaction (RESEARCH §9). The literal string is
 //     en_US-locale-sensitive; document the fallback in tests/uat/README.md.
 //   - Downloads land in a fresh per-run temp dir so assertion 5
 //     (SAVE_ARCHIVE) can poll for session_report_*.zip without
 //     colliding with operator downloads.
 //
 // References:
 //   - puppeteer.launch options: https://pptr.dev/api/puppeteer.launchoptions
 //   - puppeteer extension API: https://pptr.dev/guides/extensions
 //   - Chrome --auto-select-desktop-capture-source (a command-line switch,
 //     not a chrome://flags entry; declared in chrome/common/chrome_switches.cc):
 //     https://source.chromium.org/chromium/chromium/src/+/main:chrome/common/chrome_switches.cc
 /// <reference path="./test-hook-contract.d.ts" />
 import { execSync } from 'node:child_process';
 import { existsSync, mkdtempSync, statSync } from 'node:fs';
 import { tmpdir } from 'node:os';
 import { dirname, join, resolve as resolvePath } from 'node:path';
 import { fileURLToPath } from 'node:url';
 import puppeteer, {
  type Browser,
  type CDPSession,
  type Extension,
  type Page,
  type WebWorker,
 } from 'puppeteer';
 const HARNESS_FILE_DIR = dirname(fileURLToPath(import.meta.url));
 const REPO_ROOT = resolvePath(HARNESS_FILE_DIR, '..', '..', '..');
 const DIST_TEST_DIR = resolvePath(REPO_ROOT, 'dist-test');
 /**
 * Handles returned from `launchHarnessBrowser`. All references are
 * live for the lifetime of the browser; the caller MUST close the
 * browser to release them.
 */
 export interface HarnessHandles {
  readonly browser: Browser;
  readonly extension: Extension;
  readonly extensionId: string;
  /**
   * Service worker handle (for completeness / future use). NOTE: per
   * the architecture refinement documented in tests/uat/lib/sw.ts,
   * the harness's chrome.* state queries go through the `popup` page
   * (which has full extension chrome.* access AND a stable Puppeteer
   * lifetime). Direct sw.evaluate is unreliable in Chrome 148 +
   * headless + Puppeteer 25 (the SW suspends and worker() then fails
   * with "Protocol error: No target with given id found"). The SW handle
   * is kept here for harness wave-3 assertion 11 / 12 (where we may
   * need a worker reference for diagnostics).
   */
  readonly sw: WebWorker;
  readonly downloadsDir: string;
  /**
   * A pre-opened blank page the harness can use to invoke
   * `triggerExtensionAction` (Puppeteer requires a page in the active
   * tab for the toolbar-click simulation).
   */
  readonly page: Page;
  /**
   * The extension popup page, opened at
   * chrome-extension://<extensionId>/src/popup/index.html. This page
   * is the harness's primary chrome.* query surface (see
   * tests/uat/lib/sw.ts file header for rationale).
   */
  readonly popup: Page;
 }
 /**
 * Optional launch overrides. Defaults are CI-friendly; HEADLESS=0
 * environment variable flips to headful for local debugging.
 */
 export interface LaunchOptions {
  /** Override the dist-test directory (test isolation). */
  readonly distTestDir?: string;
  /** Override the downloads directory (default: fresh tempdir per call). */
  readonly downloadsDir?: string;
  /** Force headless / headful regardless of HEADLESS env. */
  readonly headless?: boolean;
 }
 /**
 * Create a per-run downloads directory under the OS tmpdir. Caller is
 * responsible for cleanup (typically deferred to OS tmpdir GC).
 *
 * @returns Absolute path to the freshly-created downloads directory.
 */
 function makeDownloadsDir(): string {
  return mkdtempSync(join(tmpdir(), 'mokosh-uat-downloads-'));
 }
 /**
 * Verify the dist-test directory exists and is a directory. Fails
 * loudly with an actionable message — the caller likely forgot to
 * run `npm run build:test` before invoking the harness.
 *
 * @param distTestDir - Absolute path to dist-test.
 * @throws If the directory does not exist or is not a directory.
 */
 function assertDistTestPresent(distTestDir: string): void {
  if (!existsSync(distTestDir)) {
    throw new Error(
      `dist-test/ missing at ${distTestDir}. ` +
        `Run \`npm run build:test\` before launching the harness ` +
        `(or invoke via \`npm run test:uat\` which does it for you).`,
    );
  }
  const stat = statSync(distTestDir);
  if (!stat.isDirectory()) {
    throw new Error(
      `dist-test/ exists at ${distTestDir} but is not a directory.`,
    );
  }
 }
 /**
 * Resolve whether to run headless. HEADLESS=0 forces headful;
 * anything else (including undefined) is headless. Explicit
 * `options.headless` overrides the env entirely.
 *
 * @param options - Optional launch overrides.
 * @returns true for headless, false for headful.
 */
 function resolveHeadless(options: LaunchOptions): boolean {
  if (options.headless !== undefined) {
    return options.headless;
  }
  return process.env.HEADLESS !== '0';
 }
 /**
 * Locate the SW target via the extension ID. Polls puppeteer's target
 * list because the SW is registered asynchronously after the extension
 * loads. Times out at 10s — if the SW is missing after that, either
 * dist-test/ is corrupted or the SW bundle threw at module init (which
 * would be caught by sw-bundle-import.test.ts BEFORE the harness ever
 * runs; but defensively, we surface a clear diagnostic here).
 *
 * @param browser - Puppeteer Browser handle.
 * @param extensionId - The extension's runtime id.
 * @returns The SW WebWorker handle.
 * @throws If no SW target appears within 10s.
 */
 async function waitForSwTarget(
  browser: Browser,
  extensionId: string,
 ): Promise<WebWorker> {
  const target = await browser.waitForTarget(
    (t) =>
      t.type() === 'service_worker' &&
      t.url().startsWith(`chrome-extension://${extensionId}`),
    { timeout: 10_000 },
  );
  const sw = await target.worker();
  if (sw === null) {
    throw new Error(
      `Service worker target found for extension ${extensionId} but ` +
        `its worker() returned null — the SW likely crashed at init.`,
    );
  }
  return sw;
 }
 /**
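 * Configure download behavior via CDP so files land in our temp
 * downloadsDir. `Browser.setDownloadBehavior` is sent over a session
 * attached to `page`, but with no `browserContextId` it applies to the
 * default browser context rather than to that page alone. Puppeteer
 * 25's high-level downloads API is still in flux; the raw CDP call is
 * stable across versions.
 *
 * @param page - Page used to open the CDP session.
 * @param downloadsDir - Absolute path to capture downloads.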
 */
 async function setDownloadBehavior(
  page: Page,
  downloadsDir: string,
 ): Promise<void> {
  const cdpClient: CDPSession = await page.target().createCDPSession();
  await cdpClient.send('Browser.setDownloadBehavior', {
    behavior: 'allow',
    downloadPath: downloadsDir,
    eventsEnabled: true,
  });
 }
 /**
 * Launch a Chrome instance with the test bundle loaded as an unpacked
 * MV3 extension; wire downloads to a per-run temp dir; return all
 * handles the harness needs. Caller MUST `await handles.browser.close()`.
 *
 * @param options - Optional overrides (mostly for isolation in tests).
 * @returns Resolved handles to browser, extension, SW, page, popup, downloadsDir.
 * @throws If dist-test/ is missing, no extension registers, OR the SW target never appears.
 */
 export async function launchHarnessBrowser(
  options: LaunchOptions = {},
 ): Promise<HarnessHandles> {
  const distTestDir = options.distTestDir ?? DIST_TEST_DIR;
  assertDistTestPresent(distTestDir);
  const downloadsDir = options.downloadsDir ?? makeDownloadsDir();
  const headless = resolveHeadless(options);
  // Best-effort pre-flight: confirm the Puppeteer CLI is runnable from
  // the current working directory. This does NOT verify that a Chrome
  // binary is installed, that it supports the auto-select picker flag,
  // or that the locale matches ("Entire screen" is the en_US label).
  // The path is relative to process.cwd(), so this assumes the harness
  // is invoked from the repo root. Output is suppressed; if the check
  // fails, the launch below surfaces the actionable diagnostic (a
  // missing binary fails puppeteer.launch loudly).
  try {
    execSync('node ./node_modules/puppeteer/lib/cjs/puppeteer/node/cli.js --help', {
      stdio: 'ignore',
      timeout: 5_000,
    });
  } catch {
    // Best-effort. The actual launch will fail loudly if the binary is
    // truly missing.
  }
  const browser = await puppeteer.launch({
    enableExtensions: [distTestDir],
    headless,
    pipe: true,
    args: [
      '--no-sandbox',
      // RESEARCH §9: auto-accept the screen-share picker so
      // getDisplayMedia resolves without operator interaction. The
      // literal string is en_US-locale-sensitive; tests/uat/README.md
      // documents the fallback for other locales.
      '--auto-select-desktop-capture-source=Entire screen',
      // DO NOT add --use-fake-ui-for-media-stream (RESEARCH §9 Pitfall:
      // conflicts with auto-select).
    ],
  });
  // Resolve the extension ID. Puppeteer 25's browser.extensions() returns
  // a Map<id, Extension> with all enabled extensions — BUT the map is
  // populated asynchronously after the extension's manifest loads.
  // Empirically: extension appears within ~100ms on local hardware but
  // the very first call right after launch returns Map(0). Poll until
  // extension registers OR 5s elapses; surface a clear diagnostic on
  // timeout (probably means dist-test/ is malformed).
  let extensionsMap = await browser.extensions();
  const POLL_TIMEOUT_MS = 5_000;
  const POLL_INTERVAL_MS = 100;
  const pollStart = Date.now();
  while (extensionsMap.size === 0 && Date.now() - pollStart < POLL_TIMEOUT_MS) {
    await new Promise((r) => setTimeout(r, POLL_INTERVAL_MS));
    extensionsMap = await browser.extensions();
  }
  const entries = [...extensionsMap];
  if (entries.length === 0) {
    await browser.close();
    throw new Error(
      `Puppeteer launched Chrome but no extensions loaded after ${POLL_TIMEOUT_MS}ms — ` +
        `verify enableExtensions path points at a valid unpacked extension: ${distTestDir}. ` +
        `Common causes: dist-test/ missing the manifest.json, manifest version mismatch ` +
        `(Chrome requires MV3 — verify "manifest_version": 3), or chrome binary ` +
        `incompatible with the unpacked extension shape.`,
    );
  }
  const [extensionId, extension] = entries[0];
  // Wait for the SW target to appear + capture its worker handle.
  const sw = await waitForSwTarget(browser, extensionId);
  // Give the SW's module init a tick to complete. Empirically the
  // service-worker-loader.js → assets/index-*.js dynamic import
  // resolves quickly, but `chrome.action.onClicked.addListener` (and
  // the gated test-hook addListener monkey-patches) all run inside
  // the module body. A brief settle ensures the hook surface is
  // installed BEFORE the harness's first `__mokoshTestQuery` bridge
  // message, which is sent from the popup page (see tests/uat/lib/sw.ts).
  await new Promise((r) => setTimeout(r, 500));
  // Pre-open a blank page; configure downloads. The blank page is
  // also the page the harness uses for triggerExtensionAction.
  const page = await browser.newPage();
  await page.goto('about:blank');
  await setDownloadBehavior(page, downloadsDir);
  // Open the extension popup as a separate Page. This is the harness's
  // primary chrome.* query surface — see tests/uat/lib/sw.ts file
  // header for the architecture rationale. The popup page has full
  // extension chrome.* access AND a stable Puppeteer lifetime. Loading
  // the URL also wakes the SW (chrome-extension:// page load IS a SW
  // wake-up event in MV3).
  const popup = await browser.newPage();
  await popup.goto(
    `chrome-extension://${extensionId}/src/popup/index.html`,
    { waitUntil: 'domcontentloaded', timeout: 10_000 },
  );
  return {
    browser,
    extension,
    extensionId,
    sw,
    downloadsDir,
    page,
    popup,
  };
 }
--- a/tests/uat/lib/offscreen.ts
+++ b/tests/uat/lib/offscreen.ts
@@ -0,0 +1,107 @@
 // tests/uat/lib/offscreen.ts — Plan 01-11 harness offscreen-context helpers.
 //
 // Each helper is a thin wrapper over `offPage.evaluate(() => ...)`.
 // The Bug B BLOCKER (RESEARCH §7) lives in simulateUserStop —
 // DO NOT REFACTOR to track.stop().
 //
 // References:
 //   - MediaStreamTrack 'ended' event:
 //     https://developer.mozilla.org/docs/Web/API/MediaStreamTrack/ended_event
 //   - MediaStreamTrack.stop spec note (stop does NOT fire 'ended' on the same track):
 //     https://www.w3.org/TR/mediacapture-streams/#dom-mediastreamtrack-stop
 /// <reference path="./test-hook-contract.d.ts" />
 import type { Page } from 'puppeteer';
 /**
 * Read the displaySurface from the active MediaStream's video track.
 * Used by assertion 3 to verify monitor-only enforcement (the
 * post-grant validation in src/offscreen/recorder.ts).
 *
 * Returns null when there is no active recording (the harness MUST
 * start a recording before calling this).
 *
 * @param offPage - Offscreen Page handle.
 * @returns 'monitor' on success, other strings on regression, null when no stream.
 */
 export async function getDisplaySurface(offPage: Page): Promise<string | null> {
  return await offPage.evaluate(() => {
    const hook = globalThis.__mokoshTest;
    if (hook === undefined || hook.getCurrentStream === undefined) {
      return null;
    }
    const stream = hook.getCurrentStream();
    if (stream === null) {
      return null;
    }
    const track = stream.getVideoTracks()[0];
    if (track === undefined) {
      return null;
    }
    const ds = track.getSettings().displaySurface;
    return typeof ds === 'string' ? ds : null;
  });
 }
 /**
 * Simulate the operator clicking Chrome's "Stop sharing" overlay.
 *
 * **BLOCKER (RESEARCH §7) — DO NOT REFACTOR to `track.stop()`.**
 *
 * `track.stop()` releases the capture but does NOT fire the 'ended'
 * event on the same track per the W3C Media Capture and Streams spec. The
 * production `onUserStoppedSharing` handler (src/offscreen/recorder.ts:
 * 451) is wired to 'ended' — using `track.stop()` would silently bypass
 * the entire Bug B fix path that this assertion exists to verify.
 *
 * `track.dispatchEvent(new Event('ended'))` IS the only path that
 * triggers our handler. After dispatch, the production handler calls
 * `stream.getTracks().forEach(t => t.stop())` which DOES release the
 * capture (just doesn't refire 'ended' on the same track — spec-correct).
 *
 * @param offPage - Offscreen Page handle.
 * @throws If no active MediaStream OR no video track in the stream.
 */
 export async function simulateUserStop(offPage: Page): Promise<void> {
  await offPage.evaluate(() => {
    const hook = globalThis.__mokoshTest;
    if (hook === undefined || hook.getCurrentStream === undefined) {
      throw new Error('simulateUserStop: __mokoshTest.getCurrentStream missing');
    }
    const stream = hook.getCurrentStream();
    if (stream === null) {
      throw new Error(
        'simulateUserStop: no current MediaStream — recording must be active',
      );
    }
    const track = stream.getVideoTracks()[0];
    if (track === undefined) {
      throw new Error('simulateUserStop: no video track in stream');
    }
    // CRITICAL: dispatchEvent, NOT track.stop(). See preamble for the
    // BLOCKER analysis (RESEARCH §7).
    track.dispatchEvent(new Event('ended'));
  });
 }
 /**
 * Read the current segment count from the offscreen recorder's ring
 * buffer. Used by assertion 11 to verify the 30s window per D-13
 * (3 × 10s segments expected after 35s of recording).
 *
 * Returns -1 when the hook is not installed (defensive — should
 * never happen against a dist-test/ bundle).
 *
 * @param offPage - Offscreen Page handle.
 * @returns Current segment count.
 */
 export async function getSegmentCount(offPage: Page): Promise<number> {
  return await offPage.evaluate(() => {
    const hook = globalThis.__mokoshTest;
    if (hook === undefined || hook.getSegmentCount === undefined) {
      return -1;
    }
    return hook.getSegmentCount();
  });
 }
--- a/tests/uat/lib/sw.ts
+++ b/tests/uat/lib/sw.ts
@@ -0,0 +1,262 @@
 // tests/uat/lib/sw.ts — Plan 01-11 harness SW-state helpers.
 //
 // IMPLEMENTATION ARCHITECTURE (refined during Task 3 execution):
 //
 //   The original Plan 01-11 RESEARCH §1 sketch assumed `sw.evaluate(() =>
 //   chrome.action.getBadgeText({}))` would work directly against the
 //   service worker via Puppeteer's WebWorker.evaluate. Empirical probes
 //   during Task 3 execution against Puppeteer 25.0.2 + Chrome 148 +
 //   --headless=true revealed two blockers:
 //     1. Puppeteer's `WebWorker.evaluate` runs in an ISOLATED WORLD that
 //        carries SW globals (clients, registration, ...) but NOT the
 //        extension's full `chrome.*` API surface. `Object.keys(chrome)`
 //        returns `["loadTimes", "csi"]` — the public webpage chrome,
 //        not the extension chrome.
 //     2. Chrome 148's headless mode aggressively suspends MV3 service
 //        workers; subsequent `swTarget.worker()` calls fail with
 //        `Protocol error: No target with given id found`.
 //
 //   The popup page (chrome-extension://<id>/src/popup/index.html) has:
 //     - Full `chrome.*` API access (it's an extension context — same
 //       privileges as the SW for chrome.action, chrome.runtime,
 //       chrome.notifications, chrome.runtime.getManifest, etc.)
 //     - Stable lifetime (it's a regular Page; Puppeteer keeps it alive)
 //     - Natural SW wake-up via message passing (chrome.runtime
 //       .sendMessage from popup wakes the SW for 30s)
 //
 //   So this module's helpers use a Puppeteer Page handle pointing at
 //   the popup URL — NOT a WebWorker handle. The harness opens the popup
 //   page during setup (tests/uat/lib/launch.ts) and passes it here.
 //
 //   For SW-isolate-specific state (`globalThis.__mokoshTest` lives in
 //   the SW's globalThis, not the popup's), the SW hook exposes a
 //   `chrome.runtime.onMessage` bridge: the popup sends
 //   `{ type: '__mokoshTestQuery', op: '...' }` messages; the hook
 //   responds with the queried state. Bridge implementation is in
 //   src/test-hooks/sw-hooks.ts; this file invokes it via popup.evaluate
 //   wrapping `chrome.runtime.sendMessage`.
 //
 // References:
 //   - Chrome extension pages share chrome.* API:
 //     https://developer.chrome.com/docs/extensions/develop/concepts/popup
 //   - Puppeteer Page.evaluate: https://pptr.dev/api/puppeteer.page.evaluate
 //   - Service worker wake-up on chrome.runtime message:
 //     https://developer.chrome.com/docs/extensions/develop/concepts/service-workers/lifecycle
 /// <reference path="./test-hook-contract.d.ts" />
 import type { Page } from 'puppeteer';
 /**
 * Structured snapshot of the SW's notification observability state
 * (Plan 01-11 Task 2 sw-hooks.ts surfaces). Used by assertions 7 + 8
 * to verify count-deltas + last-options-shape + id-prefix membership.
 */
 export interface NotificationSnapshot {
  readonly count: number;
  readonly lastOptions: chrome.notifications.NotificationOptions<true> | null;
  readonly ids: ReadonlyArray<string>;
 }
 /**
 * The SW hook's bridge message type. The popup sends one of these
 * shapes via chrome.runtime.sendMessage; the SW's onMessage handler
 * (extended by sw-hooks.ts) responds with the queried state. See
 * src/test-hooks/sw-hooks.ts for the SW-side dispatch.
 */
 interface BridgeQuery {
  type: '__mokoshTestQuery';
  op:
    | 'snapshot'
    | 'fire-on-startup'
    | 'handler-types';
 }
 /**
 * Get the toolbar badge text. Empty string means OFF or initial state;
 * 'REC' means recording; 'ERR' means error per Plan 01-09 badge state
 * machine.
 *
 * @param popup - The extension popup page handle (open against
 *   chrome-extension://<id>/src/popup/index.html).
 * @returns Current badge text.
 */
 export async function getBadgeText(popup: Page): Promise<string> {
  return await popup.evaluate(async () => await chrome.action.getBadgeText({}));
 }
 /**
 * Get the current popup URL. Empty string means popup is not set
 * (toolbar click fires onClicked instead). The chrome-extension://
 * URL means recording (popup hosts SAVE button).
 *
 * @param popup - The extension popup page handle.
 * @returns Current popup URL (full chrome-extension:// form OR '').
 */
 export async function getPopup(popup: Page): Promise<string> {
  return await popup.evaluate(async () => await chrome.action.getPopup({}));
 }
 /**
 * Read the runtime manifest. Used by assertion 10 to verify
 * permissions + icons shape, and by assertion 13 to obtain the
 * version string for archive shape matching.
 *
 * @param popup - The extension popup page handle.
 * @returns The chrome.runtime.getManifest() result.
 */
 export async function getManifest(popup: Page): Promise<chrome.runtime.Manifest> {
  return await popup.evaluate(() => chrome.runtime.getManifest());
 }
 /**
 * Fetch an extension-relative file via popup context and return its
 * size in bytes. Used by assertion 9 to verify icon files meet the
 * size floors that Chrome's imageUtil requires for notifications.create
 * (Bug A regression class — too-small icon → create rejects).
 *
 * @param popup - The extension popup page handle.
 * @param relativePath - Path under the extension root (e.g. 'icons/icon128.png').
 * @returns Byte size on success, -1 when the response is not OK (a rejected fetch propagates as a thrown error).
 */
 export async function getIconSize(
  popup: Page,
  relativePath: string,
 ): Promise<number> {
  return await popup.evaluate(async (path: string) => {
    const url = chrome.runtime.getURL(path);
    const r = await fetch(url);
    if (!r.ok) {
      return -1;
    }
    const cl = r.headers.get('content-length');
    if (cl !== null) {
      const n = Number(cl);
      if (Number.isFinite(n) && n > 0) {
        return n;
      }
    }
    const buf = await r.arrayBuffer();
    return buf.byteLength;
  }, relativePath);
 }
 /**
 * Read whether the SW thinks a recording is active. Side-channeled
 * through the badge text — 'REC' ↔ recording; '' ↔ idle; 'ERR' ↔
 * error state — to avoid needing a dedicated hook field.
 *
 * @param popup - The extension popup page handle.
 * @returns true when badge === 'REC'.
 */
 export async function getIsRecording(popup: Page): Promise<boolean> {
  const badge = await getBadgeText(popup);
  return badge === 'REC';
 }
 /**
 * Fire the captured chrome.runtime.onStartup handler via the test
 * hook's chrome.runtime.sendMessage bridge. Used by assertion 8 to
 * verify the Bug A path (icon-promoted notification fires cleanly).
 *
 * Bridge protocol: popup sends `{ type: '__mokoshTestQuery', op: 'fire-on-startup' }`;
 * SW responds with `{ ok: true }` after invoking the handler, OR
 * `{ ok: false, error: 'no-handler' }` if the production listener
 * was never registered (means the SW module init failed — a
 * different bug class).
 *
 * @param popup - The extension popup page handle.
 * @throws If the bridge response indicates the handler is missing.
 */
 export async function fireOnStartup(popup: Page): Promise<void> {
  const response = await popup.evaluate(async () => {
    const msg = {
      type: '__mokoshTestQuery',
      op: 'fire-on-startup',
    };
    return new Promise<{ ok: boolean; error?: string }>((resolve) => {
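      chrome.runtime.sendMessage(msg, (r) => {
        // `r` is undefined when no listener responded; map that to ok=false.
        const fallback = { ok: false, error: chrome.runtime.lastError?.message ?? 'no-response' };
        resolve((r as { ok: boolean; error?: string } | undefined) ?? fallback);
      });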
    });
  });
  if (!response.ok) {
    throw new Error(
      `fireOnStartup bridge returned ok=false: ${response.error ?? '(no error message)'}`,
    );
  }
 }
 /**
 * Inject a synthetic RECORDING_ERROR message into the SW's
 * chrome.runtime.onMessage handler. Used by assertion 7 to verify
 * the error path is preserved (badge 'ERR' + recovery notification).
 * Goes through the popup's chrome.runtime.sendMessage — a real
 * production code path (sw onMessage handler).
 *
 * @param popup - The extension popup page handle.
 * @param errorCode - The error code to inject (e.g. 'codec-unsupported').
 */
 export async function sendSyntheticRecordingError(
  popup: Page,
  errorCode: string,
 ): Promise<void> {
  await popup.evaluate(async (code: string) => {
    await chrome.runtime.sendMessage({
      type: 'RECORDING_ERROR',
      error: code,
    });
  }, errorCode);
 }
 /**
 * Snapshot the current notification observability state from the SW
 * hook via the bridge.
 *
 * @param popup - The extension popup page handle.
 * @returns Snapshot — count, last options, ids array.
 */
 export async function getNotificationSnapshot(
  popup: Page,
 ): Promise<NotificationSnapshot> {
  const response = await popup.evaluate(async () => {
    const msg = { type: '__mokoshTestQuery', op: 'snapshot' };
    return new Promise<{
      count: number;
      lastOptions: chrome.notifications.NotificationOptions<true> | null;
      ids: string[];
    }>((resolve) => {
      chrome.runtime.sendMessage(msg, (r) => {
        resolve(r as {
          count: number;
          lastOptions: chrome.notifications.NotificationOptions<true> | null;
          ids: string[];
        });
      });
    });
  });
  return {
    count: response.count,
    lastOptions: response.lastOptions,
    ids: response.ids,
  };
 }
 /**
 * Send a no-op keepalive ping to the SW so Chrome's ~30s idle timer
 * does not evict the worker during long waits (assertion 11's 35s
 * recording window). Uses chrome.runtime.sendMessage as the cheapest
 * wake-up signal; the SW's onMessage handler treats unknown messages
 * as a warning-log no-op.
 *
 * @param popup - The extension popup page handle.
 */
 export async function keepalivePing(popup: Page): Promise<void> {
  await popup.evaluate(async () => {
    await chrome.runtime.sendMessage({ type: '__mokoshKeepalive' });
  });
 }
 // Re-export the BridgeQuery type for sw-hooks.ts side reference
 // (the SW hook implements the message dispatch using the same shape).
 export type { BridgeQuery };
--- a/tests/uat/lib/zip.ts
+++ b/tests/uat/lib/zip.ts
@@ -0,0 +1,121 @@
 // tests/uat/lib/zip.ts — Plan 01-11 harness archive-shape helper.
 //
 // Assertion 13 verifies the session_report_*.zip produced by the SW's
 // saveArchive contains:
 //   - `video/last_30sec.webm` (non-zero size)
 //   - `meta.json` whose parsed JSON has `version === <manifest.version>`
 //
 // References:
 //   - JSZip: https://stuk.github.io/jszip/documentation/api_jszip.html
 //   - Plan 01-07 archive shape (session_report contract):
 //     .planning/phases/01-stabilize-video-pipeline/01-07-PLAN.md
 import { readFileSync } from 'node:fs';
 import JSZip from 'jszip';
 /**
 * Outcome of an archive shape inspection. `errors` lists every
 * missing-file / wrong-size / version-mismatch finding.
 */
 export interface ArchiveShapeResult {
  readonly hasVideoEntry: boolean;
  readonly videoSizeBytes: number;
  readonly hasMetaEntry: boolean;
  readonly metaJson: { version?: unknown } | null;
  readonly errors: ReadonlyArray<string>;
 }
 /**
 * Open a downloaded session_report_*.zip and verify its shape.
 *
 * @param zipPath - Absolute path to the downloaded .zip file.
 * @param expectedVersion - The version string from chrome.runtime.getManifest().version.
 * @returns Structured shape result; a non-empty `errors` array means the assertion failed.
 */
 export async function assertArchiveShape(
  zipPath: string,
  expectedVersion: string,
 ): Promise<ArchiveShapeResult> {
  const zipBuf = readFileSync(zipPath);
  const zip = await JSZip.loadAsync(zipBuf);
  const errors: string[] = [];
  // video/last_30sec.webm presence + size
  const videoEntry = zip.file('video/last_30sec.webm');
  let hasVideoEntry = false;
  let videoSizeBytes = 0;
  if (videoEntry === null) {
    errors.push('video/last_30sec.webm entry missing from archive');
  } else {
    hasVideoEntry = true;
    const videoBuf = await videoEntry.async('uint8array');
    videoSizeBytes = videoBuf.byteLength;
    if (videoSizeBytes === 0) {
      errors.push('video/last_30sec.webm entry is zero bytes (no captured video)');
    }
  }
  // meta.json presence + version match
  const metaEntry = zip.file('meta.json');
  let hasMetaEntry = false;
  let metaJson: { version?: unknown } | null = null;
  if (metaEntry === null) {
    errors.push('meta.json entry missing from archive');
  } else {
    hasMetaEntry = true;
    const metaText = await metaEntry.async('string');
    try {
      metaJson = JSON.parse(metaText) as { version?: unknown };
    } catch (parseErr) {
      const msg = parseErr instanceof Error ? parseErr.message : String(parseErr);
      errors.push(`meta.json failed to parse as JSON: ${msg}`);
    }
    if (metaJson !== null) {
      if (typeof metaJson.version !== 'string') {
        errors.push(
          `meta.json.version expected string, got ${typeof metaJson.version} (${JSON.stringify(metaJson.version)})`,
        );
      } else if (metaJson.version !== expectedVersion) {
        errors.push(
          `meta.json.version mismatch — expected "${expectedVersion}", got "${metaJson.version}"`,
        );
      }
    }
  }
  return {
    hasVideoEntry,
    videoSizeBytes,
    hasMetaEntry,
    metaJson,
    errors,
  };
 }
 /**
 * Extract a single named entry from a .zip to an absolute filesystem
 * path. Used by assertion 12 (ffprobe gate on video/last_30sec.webm).
 *
 * @param zipPath - Absolute path to the .zip.
 * @param entryName - Name of the entry inside the zip (e.g. 'video/last_30sec.webm').
 * @param outPath - Absolute filesystem path to write the entry to.
 * @returns The number of bytes written.
 * @throws If the entry is missing from the zip.
 */
 export async function extractEntryToFile(
  zipPath: string,
  entryName: string,
  outPath: string,
 ): Promise<number> {
  const { writeFileSync } = await import('node:fs');
  const zipBuf = readFileSync(zipPath);
  const zip = await JSZip.loadAsync(zipBuf);
  const entry = zip.file(entryName);
  if (entry === null) {
    throw new Error(`extractEntryToFile: entry '${entryName}' missing in ${zipPath}`);
  }
  const buf = await entry.async('nodebuffer');
  writeFileSync(outPath, buf);
  return buf.byteLength;
 }
--- a/tests/uat/tsconfig.json
+++ b/tests/uat/tsconfig.json
@@ -0,0 +1,20 @@
 {
  "_comment": "Plan 01-11 — type-check config for the Puppeteer UAT harness. Mirrors the root tsconfig.json's compiler options but `include`s the harness tree explicitly so `npx tsc --noEmit -p tests/uat` validates the harness in isolation. Used by Task 3 to verify the scaffolding type-checks before tsx runs it. The root tsconfig.json's `include: ['src']` does NOT pick up tests/, so this file is necessary for the type-check verification step.",
  "compilerOptions": {
    "target": "ES2022",
    "module": "ESNext",
    "lib": ["ES2022", "DOM", "DOM.Iterable"],
    "skipLibCheck": true,
    "moduleResolution": "bundler",
    "allowImportingTsExtensions": true,
    "resolveJsonModule": true,
    "isolatedModules": true,
    "noEmit": true,
    "strict": true,
    "noUnusedLocals": true,
    "noUnusedParameters": true,
    "noFallthroughCasesInSwitch": true,
    "types": ["chrome", "node"]
  },
  "include": ["**/*.ts", "**/*.d.ts"]
 }
--- a/vitest.config.ts
+++ b/vitest.config.ts
@@ -17,6 +17,14 @@ export default defineConfig({
  test: {
    environment: 'node',
    include: ['tests/**/*.test.ts'],
    // Plan 01-11: exclude the Puppeteer harness from vitest's discovery.
    // tests/uat/harness.test.ts is a tsx-runnable Node script invoked
    // via `npm run test:uat`; running it under vitest would try to
    // launch a real Chrome inside the vitest worker (browser-driven UAT
    // does not belong in the unit-test pass). The .test.ts suffix is
    // retained for editor + naming-convention consistency with the
    // rest of the tests/ tree.
    exclude: ['node_modules/**', 'tests/uat/**'],
    reporters: 'dot',
    typecheck: {
      enabled: false,