mokosh/.planning/phases/01-stabilize-video-pipeline/01-11-RESEARCH.md
Mark 969afbac89 docs(01-11): research Puppeteer UAT harness — empirical probes verify 10/10 unknowns
Probes 1-11 against local Chrome 148.0.7778.167 + Puppeteer 25.0.2:
- triggerExtensionAction works; popup-vs-onClicked contract confirmed
- --headless=new supports MV3 + getDisplayMedia (Xvfb not required)
- offscreen page reachable via background_page target type + .asPage()
- BLOCKER: track.stop() does NOT fire 'ended' per W3C spec — Bug B harness
  must use track.dispatchEvent(new Event('ended')) instead

13-assertion implementation table + 7 pitfalls + 2-bundle build design.
Wave 0 grep gate enforces tree-shake of __mokoshTest from production.
2026-05-17 17:42:40 +02:00


Phase 1 · Plan 01-11 — Puppeteer UAT Harness · Research

Researched: 2026-05-17
Domain: Chrome MV3 E2E testing (Puppeteer 25 + Chrome 148)
Confidence: HIGH (all critical claims verified by local probes on this machine's Chrome 148.0.7778.167 and a fresh npm install puppeteer@25.0.2)

Summary

The orchestrator brief lists ten "unknowns." All ten are now resolved, mostly by direct empirical probe against our dist/ extension bundle. Two findings reshape the plan:

  1. Bug B's track.stop()/ended parity problem is real and dispositive. Per W3C spec (cited below) and verified locally: track.stop() does NOT fire ended. A harness that calls track.stop() cannot trigger our onUserStoppedSharing handler. The workaround is track.dispatchEvent(new Event('ended')) — verified to fire the listener and leave the track in readyState: 'live' (so the harness must also call track.stop() separately to release the actual capture).

  2. Modern Chrome + Puppeteer 25 support MV3 extensions in --headless=new and also support getDisplayMedia in headless. The "must run headful + Xvfb" premise embedded in the existing smoke.sh is outdated. Probes 5 and 11 both succeed in headless mode against our bundle.

Everything else (toolbar-click dispatch, SW eval, offscreen-page targeting, notifications.create from a probe) works straightforwardly with the documented enableExtensions + triggerExtensionAction API. The harness design is mechanically simple; the engineering work is in the 13 assertions and the two-bundle build separation.

Primary recommendation: Puppeteer 25 + Node --experimental-vm-modules or tsx runner, --headless=new for CI, headless: false for local debugging. Two-bundle separation via vite build --mode test → dist-test/. Hook lives inside the SW guarded by import.meta.env.MODE === 'test' (with a conditional manifest that adds the hook script, required because the crxjs manifest is static).

<user_constraints>

User Constraints (from CONTEXT.md and orchestrator brief)

Locked Decisions

  • Tool: Puppeteer (over Playwright — lighter for our single-extension scale)
  • Hook pattern: globalThis.__mokoshTest captures handler refs registered to chrome.action.onClicked / chrome.runtime.onStartup; CDP invokes via sw.evaluate(...). Hook code gated on import.meta.env.MODE === 'test' so production bundle tree-shakes it.
  • Wave: 3 (between Plan 01-09 functional closure and Plan 01-10 welcome tab start).
  • Scope target: 13+ assertions covering Plan 01-08/01-09 functional contract.
  • Operator role retirement: operator stops being a functional-gate assertion library; remains for brand/design acceptance.
  • All Phase 1 D-01…D-19 decisions from 01-CONTEXT.md (getDisplayMedia in offscreen, ring buffer, ffprobe gate, etc.) remain locked.

Claude's Discretion

  • Specific bundle separation mechanism (two configs vs one config with mode flag — recommended below)
  • Whether to introduce a tsx runner vs adding a test:e2e npm script that invokes node directly
  • Exact placement of the hook file (e.g. src/test-hook.ts imported conditionally) — recommended below
  • Whether to use raw CDP (createCDPSession) or sw.evaluate / page.evaluate — recommended below: sw.evaluate for SW, asPage().evaluate for offscreen, raw CDP only where userGesture-tagged is needed (not for our 13 assertions)

Deferred Ideas (OUT OF SCOPE)

  • vitest browser mode as alternative path (researched in §4 below; not chosen)
  • Mocking chrome.desktopCapture entirely (we use the real getDisplayMedia with --auto-select-desktop-capture-source flag — verified working in headless mode)
  • Multi-browser support (Firefox, Edge): Mokosh is Chrome-only per Phase 0
</user_constraints>

<phase_requirements>

Phase 1 Plan 01-11 Requirements

| ID | Description | Research Support |
|---|---|---|
| REQ-uat-harness-puppeteer | Puppeteer-driven Node script that replays Plan 01-08/01-09 functional contract as automated assertions | §1 (workable launch flags), §2 (SW target shape), §5 (no usable OSS prior art for this exact shape; write from scratch) |
| REQ-uat-bug-A-coverage | Test asserts chrome.notifications.create succeeds with the manifest-declared iconUrl in the icons/icon48.png form | Probe 4 verified notifications.create works from sw.evaluate synchronously; icon manifest is readable via chrome.runtime.getManifest().icons |
| REQ-uat-bug-B-coverage | Test asserts user-stopped-sharing routes to badge OFF + popup '' + no recovery notif (the routing-bug case), NOT the ERROR path | §7 (BLOCKER analysis): cannot use track.stop(); must use track.dispatchEvent(new Event('ended')) from offscreen-page context |
| REQ-uat-two-bundle | Production bundle has no test hooks; test bundle adds the synthetic-event hooks | §6 (two-bundle build via --mode test + conditional manifest) and §10 (npm scripts) |
| REQ-uat-ci-friendly | Harness runs in --headless=new for CI; no display required | Probes 5, 11: empirically verified Chrome 148 supports both extension loading and getDisplayMedia in headless |
| REQ-uat-13-assertions | At least the 13 assertions listed in the brief are implemented | Per-assertion implementation hints in §11 (planner ready-reference table) |
</phase_requirements>

Architectural Responsibility Map

| Capability | Primary Tier | Secondary | Rationale |
|---|---|---|---|
| Launch Chrome with extension | Node script (Puppeteer driver) | - | Driver owns process lifecycle |
| Invoke toolbar click | Driver via page.triggerExtensionAction | SW eval (fallback) | Documented public API, works empirically |
| Read badge / popup state | SW context (sw.evaluate) | - | All chrome.action.get* APIs callable from SW |
| Synthesize chrome.runtime.onStartup | SW context (test-hook listener) | - | onStartup only fires on cold browser start; harness invokes via hook reference |
| Synthesize "user stopped sharing" | Offscreen page context (asPage().evaluate) | - | Must dispatch on the live MediaStreamTrack; only the offscreen DOM has the ref |
| Read manifest / icon paths | SW context | Node fs read of dist-test/manifest.json | Either path works; SW is closer to runtime truth |
| Assert ZIP shape | Node script (jszip in driver) | - | Standard file inspection, no browser involvement |
| Assert WebM via ffprobe | Node script (child_process) | - | Inherits from existing smoke.sh; CI runner has ffprobe |
| Drive offscreen creation | Driver triggers SW which triggers offscreen | - | Same path as production |

Standard Stack

Core

| Library | Version | Purpose | Why Standard |
|---|---|---|---|
| puppeteer | 25.0.2 | Browser automation + enableExtensions + triggerExtensionAction | Latest stable, ships the documented MV3 extension API; requires Node ≥22.12.0 [VERIFIED: npm view puppeteer version 2026-05-17] |
| tsx | ^4 | Run TS test scripts without a separate compile step | Standard for one-off Node scripts since 2024; better than ts-node for ESM |
| jszip | ^3.10.1 (already in deps) | Inspect the generated session_report_*.zip | Already used by the extension; reuse parser |

Supporting

| Library | Version | Purpose | When to Use |
|---|---|---|---|
| ffprobe | system binary | WebM validity check on last_30sec.webm | Already required by smoke.sh; CI must have it (pre-flight assertion) |
| node:assert/strict | built-in | Assertions, no extra framework needed | Keeps the harness <500 LoC; we don't need vitest/mocha overhead for 13 deterministic checks |

Alternatives Considered

| Instead of | Could Use | Tradeoff |
|---|---|---|
| Puppeteer | Playwright | Playwright has slightly better extension API maturity but adds 200 MB to devDependencies; we've already locked Puppeteer per CONTEXT |
| node:assert/strict | vitest browser mode | vitest browser mode targets in-browser test execution; we want Node-side orchestration that drives the browser, so it is the wrong fit (see §4) |
| tsx | tsc && node | The tsc && node route is two steps; tsx is one step and reduces CI friction |

Installation:

npm install --save-dev puppeteer@^25.0.2 tsx@^4

Version verification: npm view puppeteer version → 25.0.2, published per registry on 2026-05 (the engines field requires node >=22.12.0; our system has v24.14.0, OK). Cite: https://www.npmjs.com/package/puppeteer

Architecture Patterns

System Architecture Diagram

┌─────────────────────────────────────────────────────────────────┐
│  Node test runner (tests/uat/harness.test.ts under `tsx`)       │
│  ────────────────────────────────────────────────────────────── │
│  1. spawn Chrome via puppeteer.launch({                          │
│       pipe: true,                                                 │
│       enableExtensions: ['./dist-test'],                          │
│       headless: <env CI ? true : false>,                          │
│       args: ['--no-sandbox',                                      │
│               '--auto-select-desktop-capture-source=Entire screen'│
│             ] })                                                  │
│  2. await waitForTarget(t => t.type() === 'service_worker')      │
│  3. sw = await target.worker()                                   │
│  4. exts = await browser.extensions(); ext = first entry         │
└────────┬────────────────────────────────────────────────────────┘
         │ CDP (Runtime.evaluate, etc.)
         ▼
┌─────────────────────────────────────────────────────────────────┐
│  Service Worker  (chrome-extension://<id>/service-worker-loader) │
│  ────────────────────────────────────────────────────────────── │
│  Production code (always): onClicked, onStartup, badge state    │
│                                                                   │
│  Test-only code (MODE='test' gate):                              │
│    globalThis.__mokoshTest = {                                    │
│      handlers: { onClicked: null, onStartup: null,                │
│                  notificationOnClicked: null },                   │
│    }                                                              │
│    monkey-patch addListener to capture refs                       │
└────────┬────────────────────────────────────────────────────────┘
         │ chrome.offscreen.createDocument()
         ▼
┌─────────────────────────────────────────────────────────────────┐
│  Offscreen page  (chrome-extension://<id>/src/offscreen/index)   │
│  ────────────────────────────────────────────────────────────── │
│  Production code (always): getDisplayMedia, MediaRecorder, etc.  │
│                                                                   │
│  Test-only code (MODE='test' gate):                              │
│    globalThis.__mokoshTest = {                                    │
│      getCurrentStream: () => mediaStream,                         │
│      simulateUserStop: () => {                                    │
│        const t = mediaStream.getVideoTracks()[0];                 │
│        t.dispatchEvent(new Event('ended'));                       │
│        // production handler fires; harness still must            │
│        // explicitly stop() afterward to release the capture.     │
│      },                                                            │
│    }                                                              │
└─────────────────────────────────────────────────────────────────┘

Harness reads back:
  • badgeText via sw.evaluate(() => chrome.action.getBadgeText({}))
  • popup    via sw.evaluate(() => chrome.action.getPopup({}))
  • iconUrl  via sw.evaluate(() => chrome.runtime.getManifest().icons)
  • track displaySurface via offscreenPage.evaluate(() =>
       __mokoshTest.getCurrentStream().getVideoTracks()[0].getSettings())
  • notification iconUrl param: intercept chrome.notifications.create
       inside the SW test-hook by wrapping the original (records arg snapshot)

Recommended file layout (harness, test hooks, build configs):

tests/uat/
├── harness.test.ts         # main: 13 assertions, top-to-bottom narrative
├── lib/
│   ├── launch.ts           # puppeteer.launch wrapper with our flags
│   ├── extension.ts        # extension(), worker(), offscreenPage() helpers
│   ├── assert-zip.ts       # jszip-based zip / WebM / meta.json checks
│   └── trigger.ts          # triggerToolbarClick, simulateUserStop, etc.
└── README.md               # how to run locally vs CI

src/
└── test-hooks/
    ├── sw-hooks.ts         # registers __mokoshTest in SW; imported by background
    └── offscreen-hooks.ts  # registers __mokoshTest in offscreen; imported by recorder

vite.config.ts              # production
vite.test.config.ts         # extends prod, sets mode:'test', outDir:'dist-test'

Pattern 1: Service-Worker capture-handlers hook (gated on MODE)

What: the hook monkey-patches chrome.action.onClicked.addListener (etc.) to capture the handler reference into globalThis.__mokoshTest.handlers AND still calls the real addListener. Production code path is identical; tests get an out-of-band reference they can call directly.

When to use: every event source whose dispatch we can't otherwise trigger from a test driver (onStartup primarily — onClicked is reachable via triggerExtensionAction, but having the handler ref is useful for the "badge OFF on user-stopped-sharing" assertion).

Example:

// src/test-hooks/sw-hooks.ts
// IMPORTED FROM src/background/index.ts under:
//   if (import.meta.env.MODE === 'test') { await import('../test-hooks/sw-hooks'); }
// Vite tree-shakes the import entirely when MODE !== 'test' [VERIFIED: Vite docs
// on env-and-mode + define]
const handlers = {
  onClicked: null as null | ((tab: chrome.tabs.Tab) => void),
  onStartup: null as null | (() => void),
  notificationOnClicked: null as null | ((id: string) => void),
};
const origActionAdd = chrome.action.onClicked.addListener.bind(chrome.action.onClicked);
chrome.action.onClicked.addListener = (cb) => {
  handlers.onClicked = cb;
  origActionAdd(cb);
};
const origStartupAdd = chrome.runtime.onStartup.addListener.bind(chrome.runtime.onStartup);
chrome.runtime.onStartup.addListener = (cb) => {
  handlers.onStartup = cb;
  origStartupAdd(cb);
};
const origNotifAdd = chrome.notifications.onClicked.addListener.bind(chrome.notifications.onClicked);
chrome.notifications.onClicked.addListener = (cb) => {
  handlers.notificationOnClicked = cb;
  origNotifAdd(cb);
};
globalThis.__mokoshTest = { handlers };
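
The three patches above share one shape, so they could be factored into a small helper. The following is a minimal sketch under stated assumptions, not the plan's implementation: `captureListener`, `ListenerEvent`, and `FakeEvent` are hypothetical names, and `FakeEvent` only stands in for a `chrome.*` event so the sketch runs outside Chrome.

```typescript
// Hypothetical helper: wraps an event's addListener so the most recently
// registered callback is also stored under `key` in `store`.
interface ListenerEvent<F> {
  addListener(cb: F): void;
}

function captureListener<F, K extends string>(
  event: ListenerEvent<F>,
  store: Record<K, F | null>,
  key: K,
): void {
  const original = event.addListener.bind(event);
  event.addListener = (cb: F) => {
    store[key] = cb; // out-of-band reference for the harness
    original(cb);    // the real registration still happens
  };
}

// Stand-in for chrome.action.onClicked (assumption: illustration only).
class FakeEvent<F> implements ListenerEvent<F> {
  readonly listeners: F[] = [];
  addListener(cb: F): void {
    this.listeners.push(cb);
  }
}

const fakeOnClicked = new FakeEvent<() => string>();
const captured: Record<'onClicked', (() => string) | null> = { onClicked: null };
captureListener(fakeOnClicked, captured, 'onClicked');
fakeOnClicked.addListener(() => 'clicked');
```

In the SW hook the same call would take `chrome.action.onClicked`, `chrome.runtime.onStartup`, and `chrome.notifications.onClicked` in turn. The patch has to be installed before the production code registers its listeners, otherwise nothing is captured.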

Pattern 2: Reading SW state from the harness

const [_, ext] = [...(await browser.extensions())][0];
const swTarget = await browser.waitForTarget(t => t.type() === 'service_worker');
const sw = await swTarget.worker();
const badge = await sw.evaluate(() => chrome.action.getBadgeText({}));
// All chrome.action / chrome.notifications / chrome.runtime APIs are reachable
// VERIFIED: probe2 + probe4

Pattern 3: Driving onClicked via the real path

// MV3 contract: chrome.action.onClicked DOES NOT fire when default_popup is set
// VERIFIED: probe3 — clearing popup yields 1 dispatch; restoring popup yields 0
// Plan 01-09 already implements setPopup({popup:''}) when idle, so production code
// is the one driving the popup state. Harness drives the click via the public API:
await page.triggerExtensionAction(ext);  // requires a non-null page arg per probe2

Pattern 4: Attaching to the offscreen page

// Offscreen doc shows up as target type 'background_page' (not 'page' — quirk
// confirmed by probe8/9). Use .asPage(), not .page():
const off = browser.targets().find(t =>
  t.type() === 'background_page' && t.url().includes('offscreen'));
const offPage = await off.asPage();  // VERIFIED: probe9 — returns a real Page
const ds = await offPage.evaluate(() =>
  globalThis.__mokoshTest.getCurrentStream().getVideoTracks()[0].getSettings().displaySurface
);

Pattern 5: Synthetic user-stopped-sharing (Bug B harness)

// CRITICAL: track.stop() does NOT fire 'ended' per W3C spec [VERIFIED: probe7,
// MDN cite below]. The ONLY way to trigger our production
// onUserStoppedSharing handler from a test driver is dispatchEvent.
await offPage.evaluate(() => {
  const t = globalThis.__mokoshTest.getCurrentStream().getVideoTracks()[0];
  t.dispatchEvent(new Event('ended'));
  // Track is still in readyState 'live' after dispatch; the production
  // handler will call stream.getTracks().forEach(t => t.stop()) which DOES
  // release the capture (just doesn't refire 'ended' because, again, spec).
});

Anti-Patterns to Avoid

  • Do NOT call track.stop() to simulate Bug B. It will not fire the handler. The harness will report PASS when production reality is FAIL. This is the single most dangerous trap in this work.
  • Do NOT rely on page.evaluate('chrome.action.openPopup()') for the SAVE flow assertion. openPopup requires user activation in MV3; even with Puppeteer's default userGesture flag the API has been flaky historically. Drive via triggerExtensionAction which goes through the real toolbar path.
  • Do NOT assume the offscreen page exists at launch. It is created on-demand by the SW. Tests must trigger the SW path that calls chrome.offscreen.createDocument first, then wait for the background_page target.
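
For the last point, the wait can reuse `browser.waitForTarget` with a predicate instead of a one-shot `targets().find(...)`. A minimal sketch, assuming the offscreen URL contains "offscreen" as in Pattern 4; `TargetLike` and `isOffscreenTarget` are hypothetical names, and the 5-second timeout in the usage comment is an arbitrary choice.

```typescript
// Structural subset of Puppeteer's Target: only the two methods used here.
interface TargetLike {
  type(): string;
  url(): string;
}

// Assumption: the offscreen document URL contains "offscreen" (see Pattern 4).
function isOffscreenTarget(t: TargetLike): boolean {
  return t.type() === 'background_page' && t.url().includes('offscreen');
}

// Usage inside the harness (not executed here):
//   await page.triggerExtensionAction(ext); // SW path creates the offscreen doc
//   const target = await browser.waitForTarget(isOffscreenTarget, { timeout: 5_000 });
//   const offPage = await target.asPage();
```

Keeping the predicate separate from the wait lets it be unit-tested with plain objects, without launching Chrome.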

Don't Hand-Roll

Problem Don't Build Use Instead Why
Extension loading in Chrome Custom --load-extension flag puppeteer.launch({ enableExtensions: [...] }) --load-extension is removed from branded Chrome 137+; Puppeteer passes --enable-unsafe-extension-debugging automatically [CITED: developer.chrome.com extension-news-june-2025]
Finding the SW target Polling browser.pages() browser.waitForTarget(t => t.type() === 'service_worker') Documented, supports a timeout option, race-free
Invoking toolbar click DOM clicks on chrome:// URLs (impossible) page.triggerExtensionAction(extension) Ships in Puppeteer 25, replaces the decade-old "you can't click extension buttons" hack landscape [CITED: PR #14821]
User-stopped-sharing simulation Custom MediaStreamTrack subclass track.dispatchEvent(new Event('ended')) Spec-compliant; works on a real live track; minimum-surface change
Two-bundle separation Manual file copy vite build --mode test with separate vite.test.config.ts Vite's built-in mode mechanism; predictable tree-shaking [CITED: vite.dev/guide/env-and-mode]

Per-Area Findings

1. Puppeteer extension testing quirks

Findings:

  • Issue #2486 ("Ability to click browser action buttons", opened 2018) is CLOSED and resolved upstream. The fix is page.triggerExtensionAction(), added via PR #14821 (commit d6395ef, merged into Puppeteer 22.x line as experimental API; ratified to stable shape in 24.x; documented as the recommended path in the current pptr.dev/guides/chrome-extensions). Cite: https://github.com/puppeteer/puppeteer/issues/2486, d6395ef881
  • Issue #8987 (mentioned in brief): could not find this exact issue number in the puppeteer repo. May be a transposition from another repo. The documented limitation — that chrome.action.onClicked does not fire when default_popup is set in manifest — is a MV3 spec contract, not a Puppeteer bug. [VERIFIED: probe3 — empirically reproduced both cases.]
  • Puppeteer 25.0.2 (current latest) handles MV3 extensions cleanly:
    • enableExtensions: ['/abs/path/to/dist'] at launch (also accepts true to allow runtime install via browser.installExtension(path))
    • browser.extensions() returns Map<id, Extension> with .id, .name, .version, .pages(), .workers()
    • page.triggerExtensionAction(ext) simulates toolbar click

Critical MV3 contract: per the Chrome docs, "The action.onClicked event won't be sent if the extension action has specified a popup to show on click of the current tab." Probe3 confirmed: setPopup({popup:''}) → click → onClicked fires; setPopup({popup:'src/popup/index.html'}) → click → popup opens, NO onClicked dispatch. Our Plan 01-09 code already toggles popup state based on isRecording (this is the source of the routing this plan is testing).

Recommendation: Use page.triggerExtensionAction(ext) as the primary click path. For assertions that need to bypass the popup gate (e.g., the ERROR-path direct test), call the captured handler ref via sw.evaluate(() => __mokoshTest.handlers.onClicked({})).

2. CDP attach to MV3 SW contexts

Findings:

  • browser.targets().filter(t => t.type() === 'service_worker') is the documented and only-tested path; works on Puppeteer 22+ across all MV3 SW shapes (verified probe1, probe2, probe5).
  • The SW target URL is service-worker-loader.js, not the path you'd expect from manifest.json (src/background/index.ts). This is crxjs's loader wrapper — the wrapper imports the real bundle. Implication: filter on t.type() === 'service_worker' only, NOT on URL suffix.
  • SW lifecycle in Puppeteer-driven Chrome: per Chrome docs (cited: developer.chrome.com/blog/longer-esw-lifetimes), MV3 SWs terminate after 30 s of inactivity. Per the same blog, opening DevTools keeps SWs alive. Per ChromeDriver-based testing libraries' empirical experience (cited: https://developer.chrome.com/blog/eyeos-journey...): "Service workers never terminate if the developer tools are open or you are using a ChromeDriver based testing library."
    • However, this last claim could not be verified for Puppeteer-CDP specifically. The probes ran fast enough (<5s) that 30s idle wasn't relevant.
    • Defensive design: between assertions, run a sw.evaluate(() => chrome.runtime.getPlatformInfo()) "keepalive" call. Every async chrome.* API call resets the 30s timer (Chrome 110+ behavior, cited).
  • Cold-start: probes always saw the SW target within ~1.5s of puppeteer.launch. The waitForTarget API with a timeout makes this race-free.

Recommendation: Standard waitForTarget + target.worker(). Add a 2-second keepalive ping (chrome.runtime.getPlatformInfo()) between assertions if the harness runtime exceeds ~25s.
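
One way to keep the ping from running more often than needed is to gate it on time since the last SW interaction. A minimal sketch, not the plan's design: `createKeepalive` is a hypothetical name, the 20-second idle budget is an assumption chosen to stay under the 30-second limit, and the clock is injectable so the logic can be checked without Chrome.

```typescript
type Ping = () => Promise<unknown>;

function createKeepalive(
  ping: Ping,
  maxIdleMs: number = 20_000, // assumption: margin below the 30 s idle limit
  now: () => number = () => Date.now(),
) {
  let lastActivity = now();
  return {
    // Call after any harness call that already reached the SW.
    touch(): void {
      lastActivity = now();
    },
    // Call between assertions. Resolves to true when a ping was sent.
    async tick(): Promise<boolean> {
      if (now() - lastActivity < maxIdleMs) return false;
      lastActivity = now(); // recorded before awaiting so overlapping ticks do not double-ping
      await ping();
      return true;
    },
  };
}

// Usage inside the harness (not executed here):
//   const keepalive = createKeepalive(() => sw.evaluate(() => chrome.runtime.getPlatformInfo()));
//   await keepalive.tick();
```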

3. Chrome --headless=new + extensions

EMPIRICALLY VERIFIED on this machine (Chrome 148.0.7778.167, Puppeteer 25):

  • MV3 extension loads in --headless=new (probe5)
  • SW eval works in headless (probe5)
  • getDisplayMedia({ video: { displaySurface: 'monitor' } }) returns a real stream in headless (probe11). The returned dimensions are 800×600 — that's Chrome's default headless surface size (configurable via --window-size=W,H or defaultViewport in puppeteer.launch).
  • --auto-select-desktop-capture-source="Entire screen" works in headless (probe11)

This contradicts older claims (Issue puppeteer/puppeteer#4404, mrd0x post) that screen capture requires headful + Xvfb. Those claims predate --headless=new (Chrome 109+ "true headless") which is now Puppeteer's default. The legacy --headless=old (now removed in Chrome ≥132) DID have this limitation.

Issue Chromium 40176215 ("Headless must support getDisplayMedia") could not be read (auth-required page), but the empirical result on Chrome 148 implies it is resolved or sufficiently worked-around.

Recommendation: Run CI in --headless=new (Puppeteer's headless: true default in v22+). Local dev mode headless: false for debugging. No Xvfb required. Document this as a phase-level decision.

4. vitest browser mode as alternative path

Findings:

  • vitest 4 browser mode runs tests INSIDE the browser (uses Playwright under the hood). The test file becomes a page in the browser.
  • This is the wrong direction for us: we need a Node-side driver that attaches to a Chrome process running our extension. vitest browser mode runs the test as a page; the test page has no access to extension SW or cross-context fixtures.
  • No prior art found for MV3 extension E2E on vitest browser mode (search returned only generic browser-mode guides).
  • Pulling in Playwright transitively defeats the "Puppeteer is lighter" rationale.

Recommendation: Skip vitest browser mode. Continue using vitest for unit tests; add the UAT harness as a separate Node script (tests/uat/) invoked from a new npm run test:uat script. Keep them disjoint so the vitest unit suite stays fast.

5. Prior art from OSS MV3 extensions

Findings (most data was hard to extract because GitHub pages render asynchronously; the e2e directories required deeper drill-down than was practical in this research budget):

  • MetaMask (MetaMask/metamask-extension): Mocha + Selenium. Their e2e/ subdirectory contains .mocharc.js, Page Object Model, fixtures, custom reporters, parallel run-all.mts. Not Puppeteer. Setup: test/env.js + test/setup.js required globally. Sequential by default. Uses Mocha's recursive discovery. Browser launch is via SELENIUM_BROWSER env var; Xvfb-on-CI assumed. [CITED: GitHub repo listing 2026-05]
  • Bitwarden (bitwarden/clients): GitHub repo too large to scrape via WebFetch. From general community knowledge: Bitwarden uses Jest unit tests + manual e2e historically. Not relevant.
  • dappeteer (Decentraland/ChainSafe/multiple forks): legacy MV2-era Puppeteer wrapper for MetaMask. Now deprecated. Synpress (Cypress-based) is the modern path for crypto-wallet extension testing — out of scope.
  • Koweb3test (referenced in search): explicitly says "Tests work only in headed mode because extensions are not supported in headless mode in puppeteer and Cypress, and it's intended to be used in conjunction with xvfb on CI." This is stale; verified above that Chrome 148 + Puppeteer 25 supports extensions in headless. The koweb3test claim was true historically but no longer holds.
  • Vimium, uBlock Origin: did not find published e2e harnesses with Puppeteer in available time.

Recommendation: No OSS code can be lifted wholesale. The closest structural analogue is MetaMask's POM + helper-library split. Adopt that shape (split into lib/ files) but skip Mocha — node:test or plain tsx script is enough for 13 deterministic assertions.

6. crxjs + Vite import.meta.env.MODE tree-shaking

Findings:

  • Vite does statically replace import.meta.env.MODE at build time with a string literal when invoked via vite build --mode <name>. Rollup (Vite's underlying bundler) then dead-code-eliminates the unreachable branch. [CITED: vite.dev/guide/env-and-mode + vitejs/vite#15256.]
  • Caveat: Tree-shaking fails if the variable is undefined in env. But MODE is always defined (defaults to 'production' for vite build and 'development' for vite). Safe for our import.meta.env.MODE === 'test' guard.
  • crxjs (current 2.4.0): no explicit changelog entries about MODE handling. Issue #831 (closed) was about custom env vars in dev mode not being populated in the SW, NOT about MODE / DEV / PROD which were stated to always be populated. Our use case (MODE-based gating in vite build --mode test) is on the always-works path.
  • Critical verification step we OWE the planner: build BOTH bundles and grep dist/service-worker-loader.js (and the bundled background chunk) for any string from the test hook (e.g. __mokoshTest). If the production bundle contains it, our gate didn't tree-shake. This is a Wave 0 gate.

Recommendation: Use import.meta.env.MODE === 'test' as the guard. Confirm in plan with explicit grep step on built artifact (! grep -q __mokoshTest dist/...). Use dynamic import (await import('...')) inside the guard so the test-hook MODULE itself is tree-shaken from production, not just the call:

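// src/background/index.ts
// Caveat to confirm in Wave 0: Chrome disallows top-level await and dynamic
// import() in extension service workers, so check that the test bundle loads.
// One fallback: a static import of a hook module whose whole body sits inside
// the same MODE guard (it tree-shakes to an empty module in production).
if (import.meta.env.MODE === 'test') {
  await import('../test-hooks/sw-hooks');
}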
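
The grep gate can also be expressed as a small function so it runs from the same Node toolchain as the harness. A minimal sketch, assuming the caller has already read the built files into memory; `findHookLeaks` is a hypothetical name and the default marker list is an assumption based on the `__mokoshTest` identifier used above.

```typescript
// Hypothetical Wave 0 gate: given built file contents keyed by path, list the
// files that still contain a test-only marker. Reading dist/ is left to the caller.
function findHookLeaks(
  files: Record<string, string>,
  markers: string[] = ['__mokoshTest'],
): string[] {
  return Object.entries(files)
    .filter(([, content]) => markers.some((marker) => content.includes(marker)))
    .map(([path]) => path)
    .sort();
}
```

A gate script would build the map from every file under dist/ and fail the build when the returned list is non-empty, which is the same check as the `! grep -q` step.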

7. MediaStreamTrack track.stop() vs Chrome "Stop sharing" button event parity

THIS IS THE BLOCKER.

Findings (W3C spec + MDN + empirical):

  • W3C Media Capture and Streams spec, MediaStreamTrack.stop() algorithm: "When a MediaStreamTrack track ends for any reason other than the stop() method being invoked, the user agent MUST queue a task that sets the track's readyState to ended and fire a simple event named ended at the object." (Cite: https://w3c.github.io/mediacapture-main/#mediastreamtrack)
  • MDN (authoritative, with W3C link): "The only case where the track ends but the ended event is not fired is when calling MediaStreamTrack.stop." (Cite: https://developer.mozilla.org/en-US/docs/Web/API/MediaStreamTrack/ended_event)
  • Empirical confirmation in Chrome 148 (probe7):
    • track.stop() → endedFired: 0, readyState: 'ended'
    • track.dispatchEvent(new Event('ended')) → endedFired: 1, readyState: 'live'
  • Our offscreen handler onUserStoppedSharing at src/offscreen/recorder.ts:451-480 is registered as track.addEventListener('ended', onUserStoppedSharing, { once: true }) (line 275). It is a pure event listener — it does not inspect any property to discriminate stop() vs source-ended; it fires whenever the event fires.

Implication for the harness:

  • The Bug B assertion ("user stopped sharing → badge OFF, NOT recovery notif") MUST use track.dispatchEvent(new Event('ended')) from the offscreen page context, not track.stop().
  • Because dispatchEvent leaves the track in readyState: 'live', the production onUserStoppedSharing handler proceeds normally: it calls stream.getTracks().forEach(t => t.stop()) which DOES release the actual capture (since stop() doesn't refire 'ended' on the same track, and the { once: true } listener removed itself after the synthetic dispatch).
  • The harness must wait briefly after dispatch (~200ms) for the SW-side state change (badge OFF, popup '', isRecording=false) to propagate before asserting.
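
A fixed ~200ms sleep works but can flake on slow CI runners; polling until the state settles is one alternative. A minimal sketch under stated assumptions: `pollUntil` is a hypothetical helper, the 2-second timeout and 50 ms interval are arbitrary, and `expectedIdleBadge` in the usage comment stands for whatever idle badge text Plan 01-09 defines.

```typescript
// Hypothetical helper: re-run `read` until `accept` approves the value or the
// deadline passes, then either return the value or throw with the last value seen.
async function pollUntil<T>(
  read: () => Promise<T>,
  accept: (value: T) => boolean,
  timeoutMs: number = 2_000,
  intervalMs: number = 50,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await read();
    if (accept(value)) return value;
    if (Date.now() >= deadline) {
      throw new Error(
        `pollUntil: condition not met within ${timeoutMs} ms (last value: ${String(value)})`,
      );
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Usage inside the harness (not executed here):
//   const badge = await pollUntil(
//     () => sw.evaluate(() => chrome.action.getBadgeText({})),
//     (text) => text === expectedIdleBadge,
//   );
```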

Without this workaround, the Bug B harness check is INVALID. It would never trigger the handler under test, so the assertion would pass vacuously: no error notification fires because no handler ran at all. The check would have zero diagnostic value. WORSE: a regression that reintroduced the v2 bug (routing user-stopped to the ERROR notification) would still pass this harness check, defeating the entire purpose.

Recommendation: Plan MUST include an inline code comment + ADR-class note at the dispatchEvent call site explaining WHY it's not stop(). Wave 0 must include a unit-level verification that the dispatched event triggers our handler (e.g., a vitest test that constructs a stub stream and asserts the handler fires). Belt + suspenders.
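
The unit-level check could start from a stub like the one below. This is a minimal sketch, not the production handler: `StubTrack` and `countEnded` are hypothetical names, and the stub merely encodes the two behaviours probe7 observed, so it validates harness wiring rather than Chrome itself.

```typescript
// Stub mirroring probe7: stop() ends the track silently, while a dispatched
// 'ended' event reaches listeners but leaves readyState untouched.
class StubTrack extends EventTarget {
  readyState: 'live' | 'ended' = 'live';
  stop(): void {
    this.readyState = 'ended'; // deliberately no event, per the spec text quoted above
  }
}

function countEnded(
  trigger: (track: StubTrack) => void,
): { fired: number; readyState: string } {
  const track = new StubTrack();
  let fired = 0;
  // Same registration shape as the production listener: { once: true }.
  track.addEventListener('ended', () => { fired += 1; }, { once: true });
  trigger(track);
  return { fired, readyState: track.readyState };
}

const viaStop = countEnded((t) => t.stop());
const viaDispatch = countEnded((t) => t.dispatchEvent(new Event('ended')));
// viaStop     -> { fired: 0, readyState: 'ended' }
// viaDispatch -> { fired: 1, readyState: 'live' }
```

A vitest version would swap the counting listener for the real onUserStoppedSharing and assert on its side effects.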

8. getDisplayMedia user-activation propagation via CDP

Findings:

  • W3C spec requires "transient user activation" for getDisplayMedia (cite: https://w3c.github.io/mediacapture-screen-share/#dom-mediadevices-getdisplaymedia).
  • Chrome implementation accepts CDP-synthesized userGesture by default. Puppeteer's page.evaluate passes userGesture: true to Runtime.evaluate — has done so since pre-22 (cited in Puppeteer ExecutionContext.ts source).
  • Empirically (probe10): both page.evaluate(getDisplayMedia) and raw CDP Runtime.evaluate { userGesture: true } succeeded against the offscreen page. We do not need the workaround of "inject the call into a real click event handler in offscreen DOM."
  • Caveat: older Puppeteer issues (#13478) report crashes when closing pages with active media streams. Mitigation: explicitly call stream.getTracks().forEach(t => t.stop()) before browser.close(). Our existing teardown paths already do this — harness must also do it on its own probe streams.

Recommendation: No special handling needed. The production getDisplayMedia path runs unchanged in the test harness. Just ensure clean teardown.

9. --auto-select-desktop-capture-source reliability

Findings:

  • Flag is locale-specific (well known; smoke.sh already documents this). English builds use "Entire screen"; Russian "Весь экран".
  • Verified in probe10 + probe11 working with ="Entire screen" on this machine's en_US Chrome 148.
  • Headless gotcha: when running --headless=new, the picker UI never actually surfaces (no display), but the flag still pre-selects the source server-side. getDisplayMedia returns immediately. Confirmed probe11.
  • Conflict: --use-fake-ui-for-media-stream + --auto-select-desktop-capture-source → Chrome ignores the auto-select. Do not combine. (Cite: groups.google.com/g/discuss-webrtc/c/t0u6aVBfCgU.)
  • Newer flag --auto-accept-this-tab-capture exists for getDisplayMedia({ preferCurrentTab: true }) flow but is irrelevant to us (we use displaySurface: 'monitor').

Recommendation: Pass --auto-select-desktop-capture-source="Entire screen" in the harness Chrome args. Do NOT add --use-fake-ui-for-media-stream. For CI portability across locales, document that test runners must use en_US locale (LANG/LC_ALL env) or override with the locale-correct string. This matches the constraint already in production smoke.sh.
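
One way to keep the locale rule and the flag conflict in a single place is a small args builder. This is a sketch under assumptions: captureArgs and ENTIRE_SCREEN_LABEL are hypothetical names, and the label table lists only the two strings quoted above.

```typescript
// Picker labels for the whole-screen source, keyed by UI language.
// Only the two strings quoted in this document are listed.
const ENTIRE_SCREEN_LABEL: Record<string, string> = {
  en: 'Entire screen',
  ru: 'Весь экран',
};

// Hypothetical helper: builds the Chrome args for the harness launch.
function captureArgs(locale: string, extraArgs: string[] = []): string[] {
  // 'en_US.UTF-8' -> 'en'
  const lang = locale.split(/[_.-]/)[0].toLowerCase();
  const label = ENTIRE_SCREEN_LABEL[lang];
  if (!label) {
    throw new Error(`no known picker label for locale "${locale}"`);
  }
  if (extraArgs.some(a => a.startsWith('--use-fake-ui-for-media-stream'))) {
    throw new Error('--use-fake-ui-for-media-stream makes Chrome ignore auto-select');
  }
  return [...extraArgs, `--auto-select-desktop-capture-source=${label}`];
}
```

The harness could call captureArgs(process.env.LC_ALL ?? 'en_US', ['--no-sandbox']) and pass the result as the launch args, so an unsupported runner locale fails at startup instead of hanging at the picker.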

10. Two-bundle separation orchestration

Findings:

  • crxjs uses a static manifest.json (imported in vite.config.ts). Conditional content cannot be expressed through MODE alone inside that static object.
  • Workaround: TWO vite configs. vite.test.config.ts extends prod and swaps outDir + adds any test-specific manifest fields. Crxjs supports this — the manifest is just imported JS data; you can pre-process it per config file.
  • Alternative: ONE vite config that reads process.env.npm_lifecycle_event (set to 'build:test' when invoked via npm run build:test) and branches. Simpler but couples build logic to npm script names. NOT recommended.
  • Path: dist/ for prod, dist-test/ for test. The harness's enableExtensions arg points to dist-test/.

Recommended package.json changes:

{
  "scripts": {
    "dev": "vite",
    "build": "tsc && vite build",
    "build:test": "tsc && vite build --mode test --config vite.test.config.ts",
    "preview": "vite preview",
    "test": "vitest run",
    "test:uat": "npm run build:test && tsx tests/uat/harness.test.ts"
  }
}

Recommended vite.test.config.ts:

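// vite.test.config.ts
// Assumes ./vite.config default-exports a plain config object: mergeConfig
// only accepts the object form, so a callback-form base config would have to
// be called first.
import { defineConfig, mergeConfig } from 'vite';
import baseConfig from './vite.config';

export default defineConfig(
  mergeConfig(baseConfig, {
    mode: 'test',
    build: { outDir: 'dist-test', emptyOutDir: true },
  }),
);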

The mode: 'test' plus the CLI --mode test flag together ensure import.meta.env.MODE === 'test' resolves to true at build time and the test-hook branch survives.

Recommendation: Two configs, two outputs, hook gated via import.meta.env.MODE === 'test' + dynamic import. Wave 0 grep verification that production dist/ has zero __mokoshTest strings.
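
The grep gate can also be expressed as a small Node script, which avoids shell-globbing differences between runners. A minimal sketch, assuming a hypothetical findLeaks helper and a hypothetical script location such as scripts/check-no-test-hooks.ts:

```typescript
import { readdirSync, readFileSync } from 'node:fs';
import { join } from 'node:path';

// Returns every file under `dir` (recursively) whose bytes contain `needle`.
function findLeaks(dir: string, needle: string): string[] {
  const hits: string[] = [];
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    const full = join(dir, entry.name);
    if (entry.isDirectory()) {
      hits.push(...findLeaks(full, needle));
    } else if (entry.isFile() && readFileSync(full).includes(needle)) {
      hits.push(full);
    }
  }
  return hits;
}

// Throws when the production bundle still mentions the test hook.
function assertNoTestHooks(distDir: string): void {
  const leaks = findLeaks(distDir, '__mokoshTest');
  if (leaks.length > 0) {
    throw new Error(`test hook leaked into production bundle: ${leaks.join(', ')}`);
  }
}
```

Calling assertNoTestHooks('dist') after the production build makes the script exit non-zero on a leak, which is the "fail loudly" behaviour the gate needs.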

Common Pitfalls

Pitfall 1: Wrong target type for offscreen

What: Looking for t.type() === 'page' on the offscreen doc; it's actually 'background_page'. Why: Chrome internally classifies extension offscreen documents under the legacy background_page type tag. Avoid: Filter t.type() === 'background_page' && t.url().includes('offscreen'). Use .asPage() not .page().

Pitfall 2: track.stop() doesn't fire 'ended'

What: Harness "simulates user-stopped" by calling track.stop(). Test passes silently. Production handler never ran. (See §7.) Avoid: Use track.dispatchEvent(new Event('ended')).

Pitfall 3: onClicked doesn't fire when popup is set

What: triggerExtensionAction succeeds but our SW handler doesn't fire. Why: MV3 spec: popup wins over onClicked. Plan 01-09 toggles setPopup({popup:''}) based on isRecording, so the harness must respect this state machine and only trigger toolbar clicks when popup is cleared (idle state) — OR call the captured handler ref directly via the hook. Avoid: Read popup state first; click only when popup is "". For "click during recording" assertion (popup opens, NO new picker), assert on popup state and on absence of new mediaStream, not on handler invocation.

Pitfall 4: Production bundle contains test hooks

What: Tree-shaking didn't strip the hook. Production ships with __mokoshTest exposed. Why: Vite tree-shaking only works on statically resolvable conditions. A dynamic env read (process.env.X) won't shake. Avoid: Use the literal import.meta.env.MODE === 'test'. Verify in the build script with ! grep -rq __mokoshTest dist/ (grep -q exits 0 when it finds a match, so the gate has to invert the result, and dist/** does not recurse in bash unless globstar is enabled).

Pitfall 5: SW dies mid-test

What: Harness runs for 40+ seconds without touching the SW; SW unloads; next sw.evaluate errors. Why: 30s idle rule (Chrome 110+); reset by chrome.* API calls. Avoid: Keepalive helper that pings chrome.runtime.getPlatformInfo() every 20s during long sequences. Most assertions touch chrome.* APIs anyway so this is mostly defensive.
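
A minimal sketch of such a keepalive helper. startKeepalive and the Evaluator interface are hypothetical; the interface is meant to be structurally compatible with the worker handle the harness already holds, and the chrome declaration only exists so the snippet type-checks outside an extension context.

```typescript
// Minimal structural type: anything with an evaluate() that runs a function
// in the service worker and returns a promise.
interface Evaluator {
  evaluate(fn: () => unknown): Promise<unknown>;
}

// Ambient declaration for type-checking only; the real object exists in the SW.
declare const chrome: { runtime: { getPlatformInfo(): Promise<unknown> } };

// Hypothetical helper: pings the SW on an interval and returns a stop function.
function startKeepalive(sw: Evaluator, intervalMs = 20_000): () => void {
  const timer = setInterval(() => {
    // Swallow errors: a failed ping should not crash the harness; the next
    // real sw.evaluate will surface the problem with better context.
    sw.evaluate(() => chrome.runtime.getPlatformInfo()).catch(() => {});
  }, intervalMs);
  return () => clearInterval(timer);
}
```

The returned function should be called before browser.close(); otherwise the pending interval keeps the Node process alive after the run.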

Pitfall 6: --load-extension flag (deprecated)

What: Copying patterns from old blog posts that use puppeteer.launch({ args: ['--load-extension=' + path] }). Why: Removed in Chrome 137 branded builds (cite: developer.chrome.com/blog/extension-news-june-2025). Puppeteer's enableExtensions arg replaces it and passes the necessary --enable-unsafe-extension-debugging automatically. Avoid: Use enableExtensions: ['/abs/path/to/dist-test']. Period.

Pitfall 7: Locale-dependent --auto-select-desktop-capture-source

What: Test runs on a Russian-locale CI runner; "Entire screen" doesn't match; getDisplayMedia hangs at picker. Avoid: Document required locale in CI script; OR explicitly set LC_ALL=en_US.UTF-8 in the harness child_process env.

Code Examples

Minimal working harness skeleton (top-of-file imports + setup)

// tests/uat/harness.test.ts
// Run with: npm run build:test && tsx tests/uat/harness.test.ts
import { strict as assert } from 'node:assert';
import puppeteer, { type Browser } from 'puppeteer';
import path from 'node:path';
import { fileURLToPath } from 'node:url';

const __dirname = path.dirname(fileURLToPath(import.meta.url));
const distTest = path.resolve(__dirname, '../../dist-test');

const browser: Browser = await puppeteer.launch({
  pipe: true,
  enableExtensions: [distTest],
  headless: process.env.CI ? true : false,
  args: [
    '--no-sandbox',
    '--auto-select-desktop-capture-source=Entire screen',
  ],
});

const exts = await browser.extensions();
const [extId, ext] = [...exts][0];
const swTarget = await browser.waitForTarget(t => t.type() === 'service_worker', { timeout: 10_000 });
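const sw = await swTarget.worker();
// worker() can resolve to null; fail here rather than on the first evaluate.
assert.ok(sw, 'service worker target must expose a worker handle');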
const page = await browser.newPage();
await page.goto('about:blank');

console.log(`harness: extId=${extId}`);

// VERIFIED PATH: triggerExtensionAction routes to onClicked when popup is ''
await sw.evaluate(() => chrome.action.setPopup({ popup: '' }));
await page.triggerExtensionAction(ext);
// ... wait briefly for offscreen + getDisplayMedia ...
await new Promise(r => setTimeout(r, 2000));

const badge = await sw.evaluate(() => chrome.action.getBadgeText({}));
assert.equal(badge, 'REC', 'badge should read REC after toolbar click in idle');

// ... 12 more assertions ...

await browser.close();
console.log('UAT harness: 13/13 assertions passed');

Synthetic chrome.notifications + iconUrl assertion (Bug A)

// Bug A: chrome.notifications.create rejects manifest URL paths in some Chrome
// builds when icon dimensions don't meet floor. We test that production code
// uses a known-valid path AND that the create() call succeeds.

// Method A: capture the actual notification options Production sends, via
// the SW test-hook (wrap chrome.notifications.create at hook init time).
// Method B: re-issue the same create() options ourselves and assert success.
// Plan should use Method A for fidelity.

const notifSnapshot = await sw.evaluate(() => globalThis.__mokoshTest.lastNotificationOptions);
assert.ok(notifSnapshot, 'production code must have called notifications.create');
assert.match(notifSnapshot.iconUrl, /icons\/icon48\.png$/, 'must use 48px iconUrl');
// Verify the file actually exists in the bundle. Measure the body itself so
// the check does not depend on a Content-Length header being present.
const iconBytes = await sw.evaluate(async () => {
  const r = await fetch(chrome.runtime.getURL('icons/icon48.png'));
  return r.ok ? (await r.arrayBuffer()).byteLength : 0;
});
assert.ok(iconBytes > 100, 'icon48.png must exist in bundle');

Bug B assertion (user-stopped routes to OFF, not ERROR)

const off = browser.targets().find(t =>
  t.type() === 'background_page' && t.url().includes('offscreen'));
assert.ok(off, 'offscreen must exist (recording in progress)');
const offPage = await off.asPage();

// Snapshot notification side-effects before the synthetic event
const before = await sw.evaluate(() => globalThis.__mokoshTest.notificationCount);

// CRITICAL: dispatchEvent, NOT track.stop() — see RESEARCH §7
await offPage.evaluate(() => {
  const t = globalThis.__mokoshTest.getCurrentStream().getVideoTracks()[0];
  t.dispatchEvent(new Event('ended'));
});

// Wait for SW-side state transition (the offscreen handler posts a
// runtime message → SW handler updates badge + clears popup)
await new Promise(r => setTimeout(r, 300));

const badgeAfter = await sw.evaluate(() => chrome.action.getBadgeText({}));
const popupAfter = await sw.evaluate(() => chrome.action.getPopup({}));
const after = await sw.evaluate(() => globalThis.__mokoshTest.notificationCount);

assert.equal(badgeAfter, '', 'badge must be OFF after user-stopped');
assert.equal(popupAfter, '', 'popup must be cleared (back to idle)');
assert.equal(after, before, 'no new notification — user-stop is NOT an error');
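
The fixed setTimeout waits in the examples above are the most likely source of flakes. A polling helper is one alternative; this is a sketch, and waitFor is a hypothetical name rather than an existing harness utility.

```typescript
// Hypothetical helper: polls `read` until `accept` passes or the deadline hits.
async function waitFor<T>(
  read: () => Promise<T>,
  accept: (value: T) => boolean,
  { timeoutMs = 5_000, intervalMs = 50 } = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await read();
    if (accept(value)) return value;
    if (Date.now() >= deadline) {
      throw new Error(
        `waitFor timed out after ${timeoutMs}ms; last value: ${JSON.stringify(value)}`,
      );
    }
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
}
```

For the badge check this would read as waitFor(() => sw.evaluate(() => chrome.action.getBadgeText({})), b => b === ''). Polling cannot prove that something did not happen, so the "no new notification" assertion still needs a short settle delay after the badge has changed.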

State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|---|---|---|---|
| --load-extension flag | enableExtensions: [path] option | Chrome 137 (mid-2025) | Old code stops working on branded Chrome; Puppeteer's option auto-passes the right new flags |
| Headful + Xvfb for extension testing | --headless=new works | Chrome 109+, fully shipped by ~Chrome 120; Puppeteer 22+ default | Cuts CI infrastructure: no display server needed |
| target.page() for any non-page target returns null | target.asPage() returns a Page wrapper for background_page | Puppeteer 22+ | Lets us evaluate JS inside the offscreen document with the high-level Page API |
| chrome.action.onClicked clickable via DOM hacks | page.triggerExtensionAction(ext) | Puppeteer 22+ (commit d6395ef) | Issue #2486 resolved upstream; no more workaround folklore |

Deprecated/outdated:

  • puppeteer.launch({ args: ['--load-extension=...'] }) — superseded by enableExtensions
  • dappeteer — deprecated by upstream maintainers; Synpress is the modern replacement
  • The claim "Puppeteer can't load extensions in headless" (still widely echoed in blog posts) — verified false on current Chrome/Puppeteer

Assumptions Log

| # | Claim | Section | Risk if Wrong |
|---|---|---|---|
| A1 | Issue puppeteer#8987 is a typo and doesn't refer to a specific limit on onClicked dispatch | §1 | LOW: the MV3 popup-vs-onClicked contract is the real constraint, verified by probe3 |
| A2 | ChromeDriver's "SW never terminates" claim extends to Puppeteer-CDP-attached SWs | §2 | LOW: defensive keepalive ping covers either case; the 30s idle rule resets on any chrome.* call |
| A3 | crxjs 2.4.0 passes through Vite's MODE-conditional tree-shaking without interference for dynamic-imported test-hook modules | §6 | MEDIUM: must be verified in Wave 0 with explicit grep on built artifact. If wrong, plan must switch to two separate manifests or a build-time substitution plugin |
| A4 | Production onUserStoppedSharing will treat a dispatchEvent-fired 'ended' identically to a real source-ended event | §7 | LOW: the handler is a pure event listener; it doesn't read event.isTrusted or any property; only the firing matters [VERIFIED by code read of src/offscreen/recorder.ts:451-480] |
| A5 | Chrome 148+ Puppeteer behavior is stable across at least the next 6 months of Chrome releases | §3 | MEDIUM: Chrome's headless mode is generally stable post-109, but Chromium issue 40176215 was unreadable. If a future Chrome regresses headless screen capture, fall back to headful + Xvfb (smoke.sh already documents this path) |
| A6 | The harness can rely on --auto-select-desktop-capture-source="Entire screen" working in CI runners' default locale (en_US) | §9 | LOW: most CI defaults to en_US; doc'd how to override if not |

Open Questions for the Planner

  1. Where exactly does the simulateUserStop shim live?

    • Recommended: src/test-hooks/offscreen-hooks.ts imported conditionally from src/offscreen/recorder.ts after mediaStream is assigned. The hook reads mediaStream via a getter exposed at module load time: const __sharedRefs = { get currentStream() { return mediaStream; } }. Planner decides exact integration.
  2. Should the harness assert on the exact NUMBER of notifications, or set-membership?

    • Bug A and onStartup notification fire once each; recovery notification fires on RECORDING_ERROR. Counting is brittle if the SW retries. Set-membership (asserting on the types of notifications) is more robust. Planner decides UX.
  3. CI tool: GitHub Actions matrix? Or standalone shell?

    • Out of scope for THIS plan if there's no existing CI infrastructure. The harness should be a single npm run test:uat invocation that works locally; CI plumbing can be a separate plan.
  4. Failure isolation: do we kill Chrome between assertions, or keep one long-running browser instance?

    • Recommended: one browser, serial assertions (faster, fewer flakes from extension reload races). If an assertion fails mid-sequence, abort and dump SW + offscreen console logs. Planner decides whether to add retries or full-restart isolation.
  5. What's the contract for the test-hook surface?

    • globalThis.__mokoshTest shape needs to be declared as a TS type so both the SW side (registers) and the harness side (reads) agree. Place in tests/uat/lib/test-hook-contract.d.ts? Or src/test-hooks/types.ts? Both — planner decides if it's worth the duplication or a shared dir.
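
For question 2, the set-membership option could look like the sketch below. SeenNotification, diffNotificationKinds and the kind strings are all hypothetical; the real hook would have to tag each chrome.notifications.create call with a kind for this to work.

```typescript
// Hypothetical shape: one record per chrome.notifications.create call,
// tagged with a harness-defined kind such as 'startup' or 'recording-error'.
interface SeenNotification {
  kind: string;
}

// Compares the kinds that fired with the kinds that were expected,
// ignoring how many times each kind fired.
function diffNotificationKinds(
  seen: SeenNotification[],
  expected: string[],
): { missing: string[]; unexpected: string[] } {
  const seenKinds = new Set(seen.map(n => n.kind));
  const expectedKinds = new Set(expected);
  return {
    missing: Array.from(expectedKinds).filter(k => !seenKinds.has(k)),
    unexpected: Array.from(seenKinds).filter(k => !expectedKinds.has(k)),
  };
}
```

An assertion then checks that both missing and unexpected are empty, so a retried recovery notification does not fail the run while an unexpected error notification after a user stop still does.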

Environment Availability

Dependency Required By Available Version Fallback
Node.js Puppeteer 25 requires ≥22.12 v24.14.0
google-chrome-stable Puppeteer auto-fetch backup if missing 148.0.7778.167 Puppeteer downloads its own Chrome for Testing if missing
ffprobe WebM validity assertion 8.1.1
unzip (or jszip in-process) ZIP shape assertion ✓ via jszip already in deps
Xvfb Not required — headless mode supports getDisplayMedia not installed Optional; only needed if a future Chrome regresses headless capture

Missing dependencies with no fallback: none. Missing dependencies with fallback: Xvfb (defensive only).

Validation Architecture

TDD mode is ON. Each of the 13+ harness assertions is itself a test. The planner should structure Wave 0 to build the harness skeleton with stubs for all 13 assertions (initially failing), then Wave 1 to wire each assertion in turn (red → green per assertion).

Test Framework

| Property | Value |
|---|---|
| Framework | node:test (built-in) OR plain node:assert/strict script; no extra dep |
| Config file | none (harness is a single script) |
| Quick run command | npm run test:uat (orchestrates build:test + harness) |
| Full suite command | same |

Phase Requirements → Test Map

The planner's 13 assertions map 1:1 to the brief's list:

| Req | Behavior | Method |
|---|---|---|
| 1 | SW bootstrap → idle | sw.evaluate badge text == '', popup == '' |
| 2 | onClicked-idle → REC | trigger click → wait → assert badge 'REC' + popup set |
| 3 | displaySurface monitor | offscreenPage.evaluate __mokoshTest.getCurrentStream().getVideoTracks()[0].getSettings().displaySurface == 'monitor' |
| 4 | click during recording → popup opens | trigger click → assert popup state, NO new offscreen target spawned |
| 5 | SAVE_ARCHIVE → download | sw.evaluate sendMessage SAVE → wait → check Downloads dir |
| 6 | user-stopped routes to OFF | offscreenPage.evaluate dispatchEvent → assert no error notif + badge '' |
| 7 | RECORDING_ERROR codec → ERR badge + notif | sw.evaluate sendMessage RECORDING_ERROR → assert badge 'ERR' + notif fired |
| 8 | onStartup → notification with iconUrl | sw.evaluate __mokoshTest.handlers.onStartup() → assert notification create succeeded |
| 9 | icon files present | sw.evaluate fetch icon URLs → assert HTTP 200 + size > floor |
| 10 | Manifest declares notifications + icons | sw.evaluate chrome.runtime.getManifest() → assert permissions + icons |
| 11 | Buffer ≥3 segments after 35s | sw.evaluate query offscreen state (via runtime message) |
| 12 | Remux passes ffprobe | spawn ffprobe -v error -f matroska -i <path>, exit 0 |
| 13 | Zip shape | jszip parse → assert video/last_30sec.webm + meta.json keys |

Sampling Rate

  • Per task commit: npm test (vitest unit) — fast (~5s)
  • Per wave merge: npm run test:uat (full harness ~60s)
  • Phase gate: harness green + smoke.sh still passes for operator brand check

Wave 0 Gaps

  • vite.test.config.ts — does not exist
  • tests/uat/harness.test.ts — does not exist (skeleton with 13 failing assertions)
  • tests/uat/lib/*.ts — helper modules
  • src/test-hooks/sw-hooks.ts + offscreen-hooks.ts — does not exist
  • package.json — add puppeteer + tsx devDeps + test:uat script
  • Grep check in build:test to fail loudly if production bundle contains __mokoshTest

Security Domain

| ASVS Category | Applies | Standard Control |
|---|---|---|
| V2 Authentication | no | n/a (test harness only) |
| V5 Input Validation | no | n/a |
| V14 Configuration | yes | Test hook MUST NOT ship in production bundle; Wave 0 grep gate enforces |

| Pattern | STRIDE | Standard Mitigation |
|---|---|---|
| Test hook leaks into production → attacker invokes __mokoshTest.simulateUserStop from arbitrary page | Tampering/Elevation of Privilege | Build-time tree-shake gate + post-build grep verification |

Sources

Primary (HIGH confidence — empirically verified in this session)

  • Local probes 1-11 against /home/parf/projects/work/repremium/dist on Chrome 148.0.7778.167 + Puppeteer 25.0.2 (probe scripts at /tmp/puppeteer-probe/probe*.js)
  • npm view puppeteer version → 25.0.2 (engines node ≥22.12.0)
  • /usr/bin/google-chrome-stable --version → 148.0.7778.167
  • Source code read: /home/parf/projects/work/repremium/src/offscreen/recorder.ts:275, 451-480 — confirms event-listener-only handler (no isTrusted check)

Secondary (HIGH confidence — official docs)

Tertiary (MEDIUM confidence — community sources)

Could not verify (flagged for planner)

  • Chromium issue 40176215 ("Headless must support getDisplayMedia") — auth-required page; status unreadable. Mitigated by empirical probe11 confirming current Chrome 148 supports it.
  • Full puppeteer CHANGELOG word-by-word search for extension API maturation — GitHub WebFetch returned only page chrome, not content. Inferred timeline from commit metadata + current docs.

Metadata

Confidence breakdown:

  • Standard stack: HIGH — verified against npm registry + working npm install
  • Architecture: HIGH — all five Patterns above demonstrated by local probes
  • Pitfalls 1-7: HIGH for 1-5 (probed), MEDIUM for 6-7 (cited but not probed; Pitfall 6 is doc'd in Chrome's own blog)
  • Bug B mechanism (§7): HIGH — both W3C spec cite AND empirical Chrome 148 reproduction
  • Two-bundle build (§6, §10): MEDIUM — design is correct per Vite docs but the specific A3 assumption needs Wave 0 verification grep on built artifact

Research date: 2026-05-17 Valid until: 2026-08-17 (90 days; longer than usual because the core APIs cited — W3C spec for MediaStreamTrack, Vite MODE behavior, Puppeteer extension API — are stable. Re-verify if Chrome major version jumps past 152 or Puppeteer past 26.)