feat(01-11): wave-2 — Puppeteer harness scaffolding + A0 GREEN, popup-bridge architecture

Task 3 of Plan 01-11 (Puppeteer UAT harness).

Harness file tree (tests/uat/):
- harness.test.ts: tsx-runnable top-to-bottom harness entry point.
  Runs A0 inline (filesystem grep gate, abort-on-fail T-1-11-01),
  then launches Chrome + opens popup bridge + queries manifest, then
  iterates A1-A13 stubs. Each stub throws "NOT YET IMPLEMENTED —
  Plan 01-11 Task N wires this assertion". Exit code = 0 on full
  pass, 1 otherwise. Final line: "UAT harness: N/14 assertions passed".
- lib/launch.ts: launchHarnessBrowser() — wraps puppeteer.launch with
  enableExtensions:[dist-test/], headless default (HEADLESS=0
  override), --no-sandbox + --auto-select-desktop-capture-source flags.
  Polls browser.extensions() until the extension registers (empirically
  ~100ms but the first call right after launch returns Map(0)).
  Opens both a blank page (for triggerExtensionAction) AND the popup
  page (the bridge surface). Returns { browser, extension, extensionId,
  sw, downloadsDir, page, popup }.
- lib/extension.ts: waitForOffscreenTarget + attachToOffscreen +
  countOffscreenTargets. Offscreen attach uses target.type() ===
  'background_page' + .asPage() (NOT .page() — RESEARCH §4 Pitfall 1).
- lib/sw.ts: chrome.* state queries via the POPUP page handle (NOT
  the WebWorker handle — see architecture note below). getBadgeText,
  getPopup, getManifest, getIconSize, getIsRecording (side-channeled
  through badge text), fireOnStartup (via __mokoshTestQuery bridge),
  sendSyntheticRecordingError, getNotificationSnapshot (via bridge),
  keepalivePing (no-op message to wake SW for ~30s).
- lib/offscreen.ts: getDisplaySurface, simulateUserStop (the
  dispatchEvent('ended') path per RESEARCH §7 BLOCKER — DO NOT REFACTOR
  to track.stop()), getSegmentCount.
- lib/assertions.ts: runAssertion(idx, name, buffers, fn) wrapper that
  records pass/fail/duration; on failure dumps last 30 lines of SW
  + offscreen console buffers to stderr and returns a failed record
  (it does not rethrow). assertEqual / assertMatch / assertTrue /
  assertGte / waitFor polling helper.
- lib/zip.ts: jszip-based assertArchiveShape + extractEntryToFile for
  assertions 12 + 13.
- README.md: runtime + local-debug + CI semantics + locale gotcha
  + dev-dep size note + assertion catalog table.
- tsconfig.json: per-tree type-check config (mirrors root tsconfig.json
  compiler options but includes the harness tree explicitly).

Architecture refinement (DEVIATION from RESEARCH §1 — Rule 1+3 inline fix):
- RESEARCH §1 sketched `sw.evaluate(() => chrome.action.getBadgeText({}))`
  as the chrome.* query path. Empirical probes during Task 3 execution
  against Puppeteer 25.0.2 + Chrome 148 + --headless=true revealed two
  blockers:
    1. Puppeteer's WebWorker.evaluate runs in an ISOLATED WORLD that
       carries SW globals (clients, registration, ...) but NOT the
       extension's full chrome.* API surface. Object.keys(chrome) inside
       sw.evaluate returns ["loadTimes","csi"] — the public webpage
       chrome, not the extension chrome.
    2. Chrome 148's headless mode aggressively suspends MV3 service
       workers; subsequent swTarget.worker() calls return
       "Protocol error: No target with given id found".
- WORKAROUND: open the popup page (chrome-extension://<id>/src/popup/
  index.html) as a separate Puppeteer Page. The popup has full
  chrome.* access (it's an extension context with same privileges as
  the SW) AND stable Puppeteer lifetime. For SW-globalThis state
  (__mokoshTest in the SW isolate, NOT in the popup), bridge via
  chrome.runtime.sendMessage. The popup sends
  { type: '__mokoshTestQuery', op: 'snapshot' | 'fire-on-startup' |
  'handler-types' }; the SW hook's onMessage handler responds.
- Bridge implementation added to src/test-hooks/sw-hooks.ts — registers
  AFTER the production listeners so it never intercepts production
  messages (__mokoshTest* type is unambiguously test-only). Tier-1
  grep gate (no-test-hooks-in-prod-bundle.test.ts) continues to enforce
  ZERO __mokoshTest occurrences in dist/ — the bridge handler is
  tree-shaken alongside the rest of the hook module via the
  __MOKOSH_UAT__ gate.
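
The popup-to-SW bridge described above can be sketched as a typed request/response pair. This is only an illustration under stated assumptions, not the code in src/test-hooks/sw-hooks.ts: the message type and the three op names come from this commit, while `BridgeTransport`, `queryBridge` and `fakeServiceWorker` are hypothetical names, and the fake stands in for `chrome.runtime.sendMessage` so the snippet runs outside Chrome.

```typescript
// Hypothetical sketch of the popup -> SW test bridge protocol.
// Only '__mokoshTestQuery' and the op names are taken from the commit;
// every other identifier here is illustrative.
type BridgeOp = 'snapshot' | 'fire-on-startup' | 'handler-types';

interface BridgeQuery {
  type: '__mokoshTestQuery';
  op: BridgeOp;
}

// Stand-in for the real message channel (assumed to be promise-returning).
type BridgeTransport = (message: BridgeQuery) => Promise<unknown>;

// Popup side: wrap an op in the bridge envelope and send it.
async function queryBridge(send: BridgeTransport, op: BridgeOp): Promise<unknown> {
  return send({ type: '__mokoshTestQuery', op });
}

// In-memory fake of the SW side, mirroring the documented response shapes.
function fakeServiceWorker(hasStartupHandler: boolean): BridgeTransport {
  return async (message) => {
    switch (message.op) {
      case 'snapshot':
        return { count: 0, lastOptions: null, ids: [] };
      case 'fire-on-startup':
        return hasStartupHandler
          ? { ok: true }
          : { ok: false, error: 'no-handler' };
      case 'handler-types':
        return {
          onClicked: 'function',
          // typeof null is 'object', which is what an unset handler reports.
          onStartup: hasStartupHandler ? 'function' : 'object',
          notificationOnClicked: 'function',
        };
      default:
        return { ok: false, error: 'unknown-op' };
    }
  };
}
```

Inside the popup page the transport would presumably be something like `(m) => chrome.runtime.sendMessage(m)`; injecting it as a parameter keeps the protocol testable without a browser.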

Other configuration changes:
- vitest.config.ts: exclude tests/uat/** from vitest discovery. The
  Puppeteer harness is invoked via `npm run test:uat` (not vitest);
  running it under vitest would try to launch real Chrome inside a
  vitest worker. The .test.ts suffix is retained for editor +
  naming-convention consistency with the rest of the tree.

Verification:
- npx tsc --noEmit (src/): exit 0
- npx tsc --noEmit -p tests/uat: exit 0
- npm run build: exit 0
- grep -rlnE '__mokoshTest|simulateUserStop|getSegmentCount|setCurrentStream|setSegmentCountGetter|__mokoshTestQuery|__mokoshKeepalive' dist/: ZERO matches
- npm run build:test: exit 0; dist-test/ populated with the new bridge code
- SKIP_BUILD=1 npx vitest run: 89/89 GREEN
- SKIP_PROD_REBUILD=1 npx tsx tests/uat/harness.test.ts:
  → A0 [PASS]: production bundle has no test-hook leaks (19ms)
  → Browser launches; popup opens; manifest read succeeds
  → A1-A13 [FAIL]: NOT YET IMPLEMENTED — Plan 01-11 Task N wires this
  → "UAT harness: 1/14 assertions passed, 13 failed (first failure: A1)"
  → Exit code: 1 (expected — 13 RED stubs intentional)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 09:14:58 +02:00
parent cb1a729962
commit dbd977c815
11 changed files with 1705 additions and 0 deletions

src/test-hooks/sw-hooks.ts

@@ -227,4 +227,85 @@ globalThis.__mokoshTest = {
},
} as MokoshTestSurface;
// ─── Harness message bridge ───────────────────────────────────────────
// EMPIRICAL ARCHITECTURE NOTE: Puppeteer 25 + Chrome 148 + headless
// cannot reliably evaluate against the SW directly. The harness
// queries chrome.* state through the popup page (which has full
// chrome.* API access) but cannot read the SW's globalThis.__mokoshTest
// because the popup is a SEPARATE V8 isolate. So we bridge: the popup
// sends chrome.runtime.sendMessage queries; this handler responds with
// the queried state.
//
// Protocol — popup → SW message: { type: '__mokoshTestQuery', op: <string> }
// Response shapes:
// op='snapshot' → { count, lastOptions, ids }
// op='fire-on-startup' → { ok: true } OR { ok: false, error: 'no-handler' }
// op='handler-types' → { onClicked, onStartup, notificationOnClicked }
// Unknown ops respond { ok: false, error: 'unknown-op' }.
//
// Every branch below calls sendResponse synchronously, so the handler
// always returns `false`. Returning `true` (which tells Chrome the
// response will arrive asynchronously) is never needed here.
// The bridge handler is registered AFTER the production listeners so
// the hook never accidentally intercepts a production message —
// __mokoshTest* messages are unambiguously test-only.
chrome.runtime.onMessage.addListener((rawMessage, _sender, sendResponse) => {
// Narrow the message — we accept ANY shape but only act on our type.
if (rawMessage === null || typeof rawMessage !== 'object') {
return false;
}
const message = rawMessage as { type?: unknown; op?: unknown };
if (message.type !== '__mokoshTestQuery') {
// Not our message — production handler will take it.
return false;
}
const op = String(message.op ?? '');
if (op === 'snapshot') {
sendResponse({
count: notificationCount,
lastOptions: lastNotificationOptions,
ids: notificationIds.slice(),
});
return false; // Sync response — return false per Chrome onMessage contract.
}
if (op === 'fire-on-startup') {
const h = handlers.onStartup;
if (h === null) {
sendResponse({ ok: false, error: 'no-handler' });
return false;
}
// Fire-and-respond. The handler may be async and is not awaited
// for the response; the assertion 8 check is on the notification
// side effect, not the handler's return value.
try {
// Schedule on a microtask so the response goes out first; the
// handler's side effects (notifications.create) happen right
// after, before the next harness assertion polls. Calling h()
// inside .then() routes both synchronous throws and async
// rejections to the .catch() below.
Promise.resolve()
.then(() => h())
.catch((err) => {
console.warn('[mokoshTest bridge] onStartup handler threw:', err);
});
sendResponse({ ok: true });
} catch (err) {
// Only reached if scheduling or sendResponse itself throws.
sendResponse({
ok: false,
error: err instanceof Error ? err.message : String(err),
});
}
return false;
}
if (op === 'handler-types') {
sendResponse({
onClicked: typeof handlers.onClicked,
onStartup: typeof handlers.onStartup,
notificationOnClicked: typeof handlers.notificationOnClicked,
});
return false;
}
sendResponse({ ok: false, error: 'unknown-op' });
return false;
});
export {};

tests/uat/README.md (new file, 106 lines)

@@ -0,0 +1,106 @@
# Mokosh UAT harness (Plan 01-11)
Puppeteer-driven Node script that runs 14 assertions end-to-end against a
real Chrome instance loaded with the Mokosh extension. Replaces Plan 01-09
Task 5's operator-empirical functional verification (the operator retains
only step 1 — build — and step 14 — brand/design acceptance).
## Quick start
```bash
npm run test:uat
```
This builds `dist-test/` (the hook-enabled bundle) and runs the harness.
Exit 0 means all 14 assertions passed. Final line: `UAT harness: 14/14
assertions passed`.
## Local-debug mode
```bash
HEADLESS=0 npm run test:uat
```
Opens a real Chrome window so you can watch the picker auto-accept, the
badge transitions, the popup appear, etc.
## Developer iteration tricks
```bash
# Skip the production build inside assertion 0 (uses existing dist/):
SKIP_PROD_REBUILD=1 npm run test:uat
# Run the harness against an existing dist-test/ (skip npm run build:test):
npx tsx tests/uat/harness.test.ts
```
## Assertion catalog
| # | Title | Bug class | Hook used |
|---|-------|-----------|-----------|
| 0 | Production bundle has no test-hook leaks | T-1-11-01 | filesystem grep |
| 1 | SW bootstrap → setIdleMode | — | chrome.* query via popup page |
| 2 | Toolbar onClicked-idle → REC + popup | — | triggerExtensionAction |
| 3 | Offscreen displaySurface === monitor | D-15 | __mokoshTest.getCurrentStream |
| 4 | Toolbar onClicked-recording → popup, no new offscreen | — | targets count |
| 5 | SAVE_ARCHIVE → download fires | — | downloads polling |
| 6 | **BUG B**: simulateUserStop → badge OFF + no recovery notif | b9eeeeb | dispatchEvent('ended') |
| 7 | RECORDING_ERROR codec-unsupported → ERR + recovery notif | — | sendMessage |
| 8 | **BUG A**: onStartup → mokosh-startup- notification creates | a881bf0 | __mokoshTest.handlers.onStartup |
| 9 | Icon file sizes meet floors | Bug A precondition | fetch via popup page |
| 10 | Manifest has notifications + 3 icons | Bug A precondition | chrome.runtime.getManifest |
| 11 | 35s recording → segments.length >= 3 | D-13 | __mokoshTest.getSegmentCount |
| 12 | ffprobe on extracted webm exits 0 | Plan 01-08 | jszip + execFile |
| 13 | Archive shape — video + meta.json version match | Plan 01-07 | jszip |
## Failure isolation
Single browser, serial assertions, bail on first failure for setup-
dependent assertions (assertion 0 abort means refusing to launch a
potentially-leaky bundle). Per-assertion bail keeps the diagnostic
output unambiguous — see RESEARCH §5 + Plan 01-11 open-question
resolution 4.
On failure, the harness dumps the last 30 lines of SW console + last 30
lines of offscreen console (captured live during the run) to stderr and
records the assertion as failed, which gives you contextual triage
without needing to re-run with debug logging.
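
The tail-dump behavior described above can be sketched in a few lines. This is a minimal illustration, not the harness's own `dumpBuffers`; `formatTail` is a hypothetical name, and the default of 30 simply mirrors the cap stated in this README.

```typescript
// Hypothetical helper: keep only the last `tail` lines of a console
// buffer and prefix them with a framing header. An empty buffer
// produces no output at all, so quiet contexts add no noise.
function formatTail(
  label: string,
  lines: readonly string[],
  tail: number = 30,
): string[] {
  // Guard tail <= 0 explicitly: slice(-0) would return the WHOLE array.
  const kept = tail <= 0 ? [] : lines.slice(-tail);
  if (kept.length === 0) {
    return [];
  }
  return [`--- ${label} console (last ${kept.length} lines) ---`, ...kept];
}
```

Capping per context keeps a noisy service worker from pushing the offscreen output off the screen.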
## Known gotchas
### Locale-specific picker auto-accept
The `--auto-select-desktop-capture-source=Entire screen` Chrome flag
auto-accepts the screen-share picker. The string `"Entire screen"` is
en_US-specific. If your Chrome is set to a non-English locale, the
picker option label will differ and the auto-accept will silently fail
(picker stays open; assertion 2 times out).
Fallback: switch your Chrome user-data-dir's locale to en_US for
harness runs, OR adjust the launch arg in `tests/uat/lib/launch.ts` to
match your locale's equivalent string.
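
One way to avoid editing `launch.ts` for each locale is to make the label overridable. The sketch below is an assumption, not something the harness currently does: `UAT_CAPTURE_SOURCE_LABEL` and `captureSourceFlag` are hypothetical names, and only the flag name and the en_US label come from this README.

```typescript
// en_US picker label, as documented above.
const DEFAULT_CAPTURE_LABEL = 'Entire screen';

// Hypothetical: build the picker auto-accept flag, letting an
// environment variable (name assumed) supply a locale-specific label.
function captureSourceFlag(env: Record<string, string | undefined>): string {
  const override = env.UAT_CAPTURE_SOURCE_LABEL?.trim();
  const label = override ? override : DEFAULT_CAPTURE_LABEL;
  return `--auto-select-desktop-capture-source=${label}`;
}
```

The result would be passed as one element of the Chrome launch `args` array, for example `captureSourceFlag(process.env)`.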
### dev-dep Chromium binary size
`puppeteer` pulls a ~150 MB Chromium binary at `npm install` time. CI
must accept this. Production `npm install --omit=dev` skips it cleanly.
### Xvfb is NOT required
Per Plan 01-11 RESEARCH §3 empirical probes against Chrome 148, the
`--headless=new` mode handles screen capture without Xvfb on Linux CI
runners. If a future Chrome regresses this, `Xvfb :99 & DISPLAY=:99
npm run test:uat` is the fallback.
### CI runner screen-capture concern
The 35s recording assertion (A11) captures whatever is on screen during
that window. CI MUST run the harness in an isolated container with no
concurrent workload — see T-1-11-02 in Plan 01-11's threat model.
### Real Chrome download (assertion 5 → A12)
The harness configures per-page download behavior via CDP to a fresh
`os.tmpdir()/mokosh-uat-downloads-*` directory; downloads are NOT
written to your real ~/Downloads. The harness does not delete the temp
directory itself; cleanup is left to the OS's tmpdir housekeeping.
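
If leftover directories become a nuisance during local iteration, an explicit cleanup is easy to add. The following is a sketch under assumptions, not the harness's actual launch code: `makeDownloadsDir` is a hypothetical helper, and only the `mokosh-uat-downloads-` prefix is taken from this README.

```typescript
import { existsSync, mkdtempSync, rmSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical helper: create a fresh, uniquely named downloads
// directory under the OS temp dir and hand back a cleanup callback.
function makeDownloadsDir(): { dir: string; cleanup: () => void } {
  const dir = mkdtempSync(join(tmpdir(), 'mokosh-uat-downloads-'));
  return {
    dir,
    // force: true makes cleanup a no-op if the directory is already gone.
    cleanup: () => rmSync(dir, { recursive: true, force: true }),
  };
}
```

Calling `cleanup()` in the same `finally` block that closes the browser would remove the directory even when an assertion fails.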

tests/uat/harness.test.ts (new file, 394 lines)

@@ -0,0 +1,394 @@
// tests/uat/harness.test.ts — Plan 01-11 Puppeteer UAT harness entry point.
//
// Runs end-to-end via `npm run test:uat` (build:test + tsx tests/uat/harness.test.ts).
// Top-to-bottom narrative: launch Chrome with dist-test loaded as
// MV3 extension, attach to SW + offscreen, run 14 assertions
// sequentially with bail-on-first-fail semantics + structured
// diagnostic dump on failure (RESEARCH §5 + open-question resolution 4).
//
// Exit code:
// 0 — all 14 assertions passed
// 1 — at least one assertion failed
//
// Local-debug mode: `HEADLESS=0 npm run test:uat` (opens real Chrome)
// Skip prod rebuild: `SKIP_PROD_REBUILD=1` (assertion 0 still verifies
// the EXISTING dist/ rather than spawning npm run build).
//
// Assertion catalog (14 total):
// 0 — Production bundle grep gate (filesystem-only; pre-flight).
// 1 — SW bootstrap → setIdleMode (badge '', popup '', isRecording=false).
// 2 — Toolbar onClicked-idle → badge 'REC' + popup popup.html + isRecording=true.
// 3 — Offscreen displaySurface === 'monitor' (post-grant validation).
// 4 — Toolbar onClicked while recording → popup, NO new offscreen.
// 5 — SAVE_ARCHIVE → download fires + session_report_*.zip appears.
// 6 — BUG B (canonical): simulateUserStop → badge '' + popup '' + NO recovery notif.
// 7 — RECORDING_ERROR codec-unsupported → badge 'ERR' + recovery notif.
// 8 — BUG A (canonical): onStartup → mokosh-startup- notification creates cleanly.
// 9 — Icon file sizes meet floors (16→200, 48→500, 128→1024).
// 10 — Manifest has notifications permission + all three icons declared.
// 11 — 35s recording yields >= 3 segments per D-13.
// 12 — ffprobe -v error -f matroska on extracted webm exits 0.
// 13 — Archive shape (video/last_30sec.webm + meta.json with version match).
import { execFileSync, execSync } from 'node:child_process';
import { existsSync, readdirSync, readFileSync, statSync, mkdtempSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { dirname, join, resolve as resolvePath } from 'node:path';
import { fileURLToPath } from 'node:url';
import type { Page } from 'puppeteer';
import {
type AssertionRecord,
type ConsoleBuffers,
assertEqual,
assertGte,
assertMatch,
assertTrue,
runAssertion,
waitFor,
} from './lib/assertions';
import {
attachToOffscreen,
countOffscreenTargets,
waitForOffscreenTarget,
} from './lib/extension';
import {
getDisplaySurface,
getSegmentCount,
simulateUserStop,
} from './lib/offscreen';
import {
fireOnStartup,
getBadgeText,
getIconSize,
getIsRecording,
getManifest,
getNotificationSnapshot,
getPopup,
keepalivePing,
sendSyntheticRecordingError,
} from './lib/sw';
import { assertArchiveShape, extractEntryToFile } from './lib/zip';
import { launchHarnessBrowser, type HarnessHandles } from './lib/launch';
const HARNESS_FILE_DIR = dirname(fileURLToPath(import.meta.url));
const REPO_ROOT = resolvePath(HARNESS_FILE_DIR, '..', '..');
const DIST_DIR = resolvePath(REPO_ROOT, 'dist');
const FFPROBE_BIN = '/usr/bin/ffprobe';
const TOTAL_ASSERTIONS = 14;
/**
* Forbidden hook surface strings — assertion 0 verifies absence
* in production dist/. Mirrors the Tier-1 unit gate's surface list
* (tests/background/no-test-hooks-in-prod-bundle.test.ts) but runs
* against the SAME dist/ as the live harness for E2E parity.
*/
const FORBIDDEN_HOOK_STRINGS: ReadonlyArray<string> = [
'__mokoshTest',
'simulateUserStop',
'getSegmentCount',
'setCurrentStream',
'setSegmentCountGetter',
];
/** Icon-size floors per assertion 9 (per orchestrator brief). */
const ICON_SIZE_FLOORS: ReadonlyArray<readonly [string, number]> = [
['icons/icon16.png', 200],
['icons/icon48.png', 500],
['icons/icon128.png', 1024],
];
/**
* Recursively list all files under a root directory (sync). Used by
* assertion 0 to walk dist/. Symlinks are skipped defensively.
*
* @param root - Absolute directory path.
* @returns Sorted list of absolute file paths.
*/
function listAllFilesRecursive(root: string): ReadonlyArray<string> {
const acc: string[] = [];
const stack: string[] = [root];
while (stack.length > 0) {
const dir = stack.pop()!;
const entries = readdirSync(dir, { withFileTypes: true });
for (const entry of entries) {
const fullPath = resolvePath(dir, entry.name);
if (entry.isSymbolicLink()) continue;
if (entry.isDirectory()) {
stack.push(fullPath);
} else if (entry.isFile()) {
acc.push(fullPath);
}
}
}
return acc.sort();
}
/**
* Grep `needle` across every text-like file under `root`. Returns
* file paths that contain at least one occurrence.
*
* @param root - Absolute directory path.
* @param needle - Literal substring to find.
* @returns Paths containing `needle`.
*/
function grepRecursive(root: string, needle: string): ReadonlyArray<string> {
const binaryExt = new Set(['.png', '.jpg', '.jpeg', '.gif', '.ico', '.webp', '.woff', '.woff2', '.ttf']);
const out: string[] = [];
for (const filePath of listAllFilesRecursive(root)) {
const dotIdx = filePath.lastIndexOf('.');
const ext = dotIdx >= 0 ? filePath.substring(dotIdx).toLowerCase() : '';
if (binaryExt.has(ext)) continue;
if (statSync(filePath).size === 0) continue;
const text = readFileSync(filePath, 'utf8');
if (text.includes(needle)) {
out.push(filePath);
}
}
return out;
}
/**
* Poll `downloadsDir` for any *session_report*.zip file. Returns the
* absolute path of the first match. Used by assertion 5.
*
* @param downloadsDir - Absolute downloads directory path.
* @param timeoutMs - Maximum wait time.
* @returns Absolute path to the matched .zip.
* @throws On timeout.
*/
async function waitForDownloadedZip(
downloadsDir: string,
timeoutMs: number,
): Promise<string> {
const start = Date.now();
while (Date.now() - start < timeoutMs) {
const entries = readdirSync(downloadsDir);
for (const name of entries) {
if (name.includes('session_report') && name.endsWith('.zip')) {
const full = join(downloadsDir, name);
// Make sure write completed (size stabilized).
const size1 = statSync(full).size;
await new Promise((r) => setTimeout(r, 200));
const size2 = statSync(full).size;
if (size1 === size2 && size1 > 0) {
return full;
}
}
}
await new Promise((r) => setTimeout(r, 200));
}
throw new Error(
`waitForDownloadedZip: no session_report_*.zip appeared in ${downloadsDir} within ${timeoutMs}ms`,
);
}
/**
* Run a production build of dist/ unless SKIP_PROD_REBUILD=1.
* Assertion 0 reads dist/, so this guarantees the gate runs against
* a fresh artifact.
*/
function ensureProductionBuild(): void {
if (process.env.SKIP_PROD_REBUILD === '1') {
process.stdout.write(' (SKIP_PROD_REBUILD=1 — using existing dist/)\n');
return;
}
process.stdout.write(' Running `npm run build` (assertion 0 pre-flight)...\n');
execFileSync('npm', ['run', 'build'], {
stdio: 'inherit',
cwd: REPO_ROOT,
});
}
/**
* Stub placeholder for assertions Task 4+ wires. Each stub throws so
* the harness exits non-zero today; the diagnostic clearly identifies
* the assertion as un-implemented vs failing-in-production.
*
* @param taskNumber - The plan task number that will wire this assertion.
* @returns A function that always throws.
*/
function notYetImplemented(taskNumber: number): () => Promise<void> {
return async () => {
throw new Error(
`NOT YET IMPLEMENTED — Plan 01-11 Task ${taskNumber} wires this assertion`,
);
};
}
/**
* Main harness entry point. Runs all 14 assertions sequentially and
* records every assertion's outcome. In this scaffold only assertion 0
* (the hook-leak gate) aborts the run on failure; the A1-A13 stubs all
* execute and are recorded as failures.
*/
async function main(): Promise<number> {
const results: AssertionRecord[] = [];
const buffers: ConsoleBuffers = { swLines: [], offscreenLines: [] };
let handles: HarnessHandles | null = null;
process.stdout.write('\nMokosh UAT harness — Plan 01-11 Puppeteer-driven 14-assertion suite\n');
process.stdout.write('='.repeat(72) + '\n\n');
try {
// ─── Assertion 0: Pre-flight grep gate ──────────────────────────
process.stdout.write('Assertion 0 (pre-flight, filesystem-only):\n');
ensureProductionBuild();
const a0 = await runAssertion(
0,
'production bundle has no test-hook leaks (T-1-11-01)',
buffers,
async () => {
for (const needle of FORBIDDEN_HOOK_STRINGS) {
const matches = grepRecursive(DIST_DIR, needle);
assertEqual(
matches.length,
0,
`production dist/ contains '${needle}' in: ${JSON.stringify(matches)}`,
);
}
},
);
results.push(a0);
if (!a0.passed) {
// Hook leak is security-critical (T-1-11-01) — abort immediately.
process.stderr.write(
'\n*** ABORT: assertion 0 (hook leak gate) FAILED — refusing to ' +
'continue with potentially-leaky production bundle. ***\n',
);
return 1;
}
// ─── Setup: launch browser, attach to SW + open popup bridge ───
process.stdout.write('\nLaunching Chrome + opening popup bridge...\n');
handles = await launchHarnessBrowser();
const { browser, sw, page, popup, extensionId, downloadsDir } = handles;
process.stdout.write(` extensionId: ${extensionId}\n`);
process.stdout.write(` downloadsDir: ${downloadsDir}\n`);
process.stdout.write(` popup: chrome-extension://${extensionId}/src/popup/index.html\n\n`);
// Wire console buffers. The popup carries the chrome.* queries;
// the SW handle is kept for diagnostic console capture (when the
// SW is alive). Both feed buffers for failure dumps.
const popupPage: Page = popup;
popupPage.on('console', (msg) => {
buffers.swLines.push(`[Popup:${msg.type()}] ${msg.text()}`);
});
sw.on('console', (msg) => {
buffers.swLines.push(`[SW:${msg.type()}] ${msg.text()}`);
});
// Read the manifest version once for assertion 13.
const manifest = await getManifest(popupPage);
const expectedVersion = manifest.version;
// ─── Wave 3 stubbed assertions (Tasks 4-7 will wire these) ──────
const stubs: Array<{
index: number;
name: string;
taskNumber: number;
}> = [
{ index: 1, name: 'SW bootstrap → setIdleMode', taskNumber: 4 },
{ index: 2, name: 'toolbar onClicked-idle → badge REC + popup', taskNumber: 4 },
{ index: 3, name: 'offscreen displaySurface === monitor', taskNumber: 4 },
{ index: 4, name: 'toolbar onClicked-recording → popup, no new offscreen', taskNumber: 4 },
{ index: 5, name: 'SAVE_ARCHIVE → download fires + zip appears', taskNumber: 5 },
{ index: 6, name: 'BUG B canonical: simulateUserStop → badge OFF + no recovery notif', taskNumber: 5 },
{ index: 7, name: 'RECORDING_ERROR codec-unsupported → badge ERR + recovery notif', taskNumber: 5 },
{ index: 8, name: 'BUG A canonical: onStartup → notification creates cleanly', taskNumber: 6 },
{ index: 9, name: 'icon file sizes meet floors', taskNumber: 6 },
{ index: 10, name: 'manifest has notifications + 3 icons', taskNumber: 6 },
{ index: 11, name: '35s recording → segments.length >= 3', taskNumber: 7 },
{ index: 12, name: 'ffprobe on extracted webm exits 0', taskNumber: 7 },
{ index: 13, name: 'archive shape — video + meta.json version match', taskNumber: 7 },
];
for (const s of stubs) {
const rec = await runAssertion(
s.index,
s.name,
buffers,
notYetImplemented(s.taskNumber),
);
results.push(rec);
}
// Suppress unused-import/unused-variable errors: Tasks 4-7 will use
// these imports + handles directly. Referencing them here keeps the
// scaffold type-clean until then.
void browser;
void page;
void popupPage;
void expectedVersion;
void waitForOffscreenTarget;
void attachToOffscreen;
void countOffscreenTargets;
void waitFor;
void getBadgeText;
void getPopup;
void getIsRecording;
void getIconSize;
void fireOnStartup;
void sendSyntheticRecordingError;
void getNotificationSnapshot;
void keepalivePing;
void getDisplaySurface;
void simulateUserStop;
void getSegmentCount;
void assertArchiveShape;
void extractEntryToFile;
void assertMatch;
void assertTrue;
void assertGte;
void waitForDownloadedZip;
void mkdtempSync;
void existsSync;
void execSync;
void tmpdir;
void FFPROBE_BIN;
void ICON_SIZE_FLOORS;
return finalize(results);
} catch (setupErr) {
process.stderr.write(`\n*** Harness setup error: ${String(setupErr)}\n`);
return finalize(results);
} finally {
if (handles !== null) {
try {
await handles.browser.close();
} catch (closeErr) {
process.stderr.write(`(non-fatal: browser close threw: ${String(closeErr)})\n`);
}
}
}
}
/**
* Print the final summary line + return the exit code.
*
* @param results - All assertion records collected during the run.
* @returns 0 if all 14 passed, 1 otherwise.
*/
function finalize(results: ReadonlyArray<AssertionRecord>): number {
const passCount = results.filter((r) => r.passed).length;
const failCount = results.length - passCount;
process.stdout.write('\n' + '='.repeat(72) + '\n');
if (passCount === TOTAL_ASSERTIONS) {
process.stdout.write(`UAT harness: ${passCount}/${TOTAL_ASSERTIONS} assertions passed\n`);
return 0;
}
const firstFail = results.find((r) => !r.passed);
process.stdout.write(
`UAT harness: ${passCount}/${TOTAL_ASSERTIONS} assertions passed, ${failCount} failed`,
);
if (firstFail !== undefined) {
process.stdout.write(` (first failure: A${firstFail.index} ${firstFail.name})`);
}
process.stdout.write('\n');
return 1;
}
// Run + exit. Top-level await + explicit exit code so tsx returns
// the right status without leaving unhandled-promise spew on stderr.
const exitCode = await main();
process.exit(exitCode);

tests/uat/lib/assertions.ts (new file, 199 lines)

@@ -0,0 +1,199 @@
// tests/uat/lib/assertions.ts — Plan 01-11 harness assertion runner.
//
// Centralizes:
// - `assertEqual` / `assertMatch` / `assertTrue` / `assertGte`: thin
// wrappers over `node:assert/strict` with explicit Plan 01-11
// diagnostic framing (cite the bug-class on Bug A / Bug B assertions).
// - `runAssertion(index, name, buffers, fn)`: wraps each assertion in
// a try/catch so the harness can collect a per-assertion pass/fail
// record AND dump SW/offscreen console buffers on every failure.
// - `waitFor(probe, predicate, timeoutMs, description)`: polling
// helper used by assertions that need to wait for async state
// transitions (badge changes, downloads, etc.).
//
// References:
// - node:assert/strict: https://nodejs.org/api/assert.html#strict-assertion-mode
import { strict as assert } from 'node:assert';
/**
* Per-assertion outcome record. Accumulated by runAssertion + flushed
* to the harness's final summary line.
*/
export interface AssertionRecord {
readonly index: number;
readonly name: string;
readonly passed: boolean;
readonly errorMessage: string;
readonly durationMs: number;
}
/**
* Console buffers captured from SW + offscreen contexts. The harness
* wires `sw.on('console', ...)` + `offPage.on('console', ...)` at
* launch + before each assertion-relevant phase; on failure these
* buffers are dumped to stderr for triage.
*/
export interface ConsoleBuffers {
swLines: string[];
offscreenLines: string[];
}
/**
* Run a single assertion, capturing its outcome + duration. On error,
* dump the per-context console buffers to stderr and return a failed
* record; the error is NOT rethrown, so the caller decides whether to bail.
*
* @param index - 0-13 (0 = grep gate, 1-13 = functional).
* @param name - Human-readable assertion title.
* @param buffers - Console buffers to dump on failure (may be empty).
* @param fn - Async assertion body.
* @returns Outcome record.
*/
export async function runAssertion(
index: number,
name: string,
buffers: ConsoleBuffers,
fn: () => Promise<void>,
): Promise<AssertionRecord> {
const start = Date.now();
try {
await fn();
const durationMs = Date.now() - start;
process.stdout.write(` [PASS] A${index}: ${name} (${durationMs}ms)\n`);
return {
index,
name,
passed: true,
errorMessage: '',
durationMs,
};
} catch (err) {
const durationMs = Date.now() - start;
const errorMessage =
err instanceof Error ? `${err.name}: ${err.message}` : String(err);
process.stderr.write(` [FAIL] A${index}: ${name} (${durationMs}ms)\n`);
process.stderr.write(` ${errorMessage}\n`);
dumpBuffers(buffers, index);
return {
index,
name,
passed: false,
errorMessage,
durationMs,
};
}
}
/**
* Dump SW + offscreen console buffers to stderr with structured framing.
* Cap at the last 30 lines per context to keep failure output readable.
*
* @param buffers - The accumulating buffers.
* @param assertionIndex - For framing the dump preamble.
*/
function dumpBuffers(buffers: ConsoleBuffers, assertionIndex: number): void {
const TAIL = 30;
const swTail = buffers.swLines.slice(-TAIL);
const offTail = buffers.offscreenLines.slice(-TAIL);
if (swTail.length > 0) {
process.stderr.write(
` --- SW console (last ${swTail.length} lines, assertion A${assertionIndex}) ---\n`,
);
for (const line of swTail) {
process.stderr.write(` ${line}\n`);
}
}
if (offTail.length > 0) {
process.stderr.write(
` --- Offscreen console (last ${offTail.length} lines, assertion A${assertionIndex}) ---\n`,
);
for (const line of offTail) {
process.stderr.write(` ${line}\n`);
}
}
}
/**
* Strict equality with a context-bearing message. Wraps
* `assert.strictEqual` so the failure surface is uniform across
* assertions.
*
* @param actual - Observed value.
* @param expected - Expected value.
* @param msg - Context for the failure diagnostic.
*/
export function assertEqual<T>(actual: T, expected: T, msg: string): void {
assert.strictEqual(actual, expected, msg);
}
/**
* Assert that `actual` matches `regex`. Wraps `assert.match`.
*
* @param actual - String to test.
* @param regex - Pattern.
* @param msg - Context for the failure diagnostic.
*/
export function assertMatch(actual: string, regex: RegExp, msg: string): void {
assert.match(actual, regex, msg);
}
/**
* Assert that `cond` is truthy. Wraps `assert.ok`.
*
* @param cond - Boolean expression.
* @param msg - Context for the failure diagnostic.
*/
export function assertTrue(cond: boolean, msg: string): void {
assert.ok(cond, msg);
}
/**
* Assert that the actual value is greater than or equal to expected.
* Used by assertion 9 (icon size floors) + assertion 11 (segment count).
*
* @param actual - Observed value.
* @param expected - Minimum acceptable value.
* @param msg - Context for the failure diagnostic.
*/
export function assertGte(actual: number, expected: number, msg: string): void {
assert.ok(
actual >= expected,
`${msg} — expected >= ${expected}, got ${actual}`,
);
}
/**
* Poll `probe` until `predicate(await probe())` returns true OR timeoutMs
* elapses. Throws on timeout with a structured diagnostic.
*
* @param probe - Async function producing a value to test.
* @param predicate - Returns true when the value satisfies the wait.
* @param timeoutMs - Maximum wait time.
* @param description - Human-readable description for the diagnostic.
* @param pollIntervalMs - Interval between probe calls (default 100ms).
* @returns The last probed value that satisfied the predicate.
* @throws If timeoutMs elapses without predicate satisfaction.
*/
export async function waitFor<T>(
probe: () => Promise<T>,
predicate: (v: T) => boolean,
timeoutMs: number,
description: string,
pollIntervalMs: number = 100,
): Promise<T> {
const start = Date.now();
let lastValue: T | undefined;
while (Date.now() - start < timeoutMs) {
lastValue = await probe();
if (predicate(lastValue)) {
return lastValue;
}
await new Promise((r) => setTimeout(r, pollIntervalMs));
}
throw new Error(
`waitFor timeout ${timeoutMs}ms — ${description}; ` +
`last probed value: ${JSON.stringify(lastValue)}`,
);
}
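
A minimal usage sketch of the polling contract described for `waitFor`, assuming a fake probe in place of a real harness query. The helper is repeated in condensed form only so the sketch runs on its own; `makeFakeSegmentProbe` and `demoWaitFor` are hypothetical names that are not part of this commit.

```typescript
// Condensed copy of the waitFor helper, repeated so the sketch is standalone.
async function waitFor<T>(
  probe: () => Promise<T>,
  predicate: (v: T) => boolean,
  timeoutMs: number,
  description: string,
  pollIntervalMs: number = 100,
): Promise<T> {
  const start = Date.now();
  let lastValue: T | undefined;
  while (Date.now() - start < timeoutMs) {
    lastValue = await probe();
    if (predicate(lastValue)) {
      return lastValue;
    }
    await new Promise((r) => setTimeout(r, pollIntervalMs));
  }
  throw new Error(
    `waitFor timeout ${timeoutMs}ms: ${description}; ` +
      `last probed value: ${JSON.stringify(lastValue)}`,
  );
}

// Hypothetical probe: a counter standing in for a call such as
// getSegmentCount(offPage). Each call reports one more segment.
function makeFakeSegmentProbe(): () => Promise<number> {
  let calls = 0;
  return async () => {
    calls += 1;
    return calls;
  };
}

// Resolves with the first probed value that satisfies the predicate.
async function demoWaitFor(): Promise<number> {
  return waitFor(makeFakeSegmentProbe(), (n) => n >= 3, 2_000, 'segment count >= 3', 10);
}

// Rejects once the budget elapses, because the predicate never holds.
async function demoTimeout(): Promise<string> {
  try {
    await waitFor(async () => 0, (n) => n > 0, 50, 'value > 0', 10);
    return 'resolved';
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}
```

The timeout message carries the last probed value, which is usually the quickest clue as to why a wait never settled.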


@@ -0,0 +1,93 @@
// tests/uat/lib/extension.ts — Plan 01-11 harness extension/offscreen helpers.
//
// The offscreen-document attach uses a CDP-level target type that
// Puppeteer 25 surfaces as `'background_page'` — NOT `'page'`. Per
// Plan 01-11 RESEARCH §4 / Pitfall 1, finding the offscreen via
// `t.type() === 'page'` returns no matches; `'background_page'` is
// the right discriminator. After getting the target, `.asPage()`
// resolves to a Page-like handle (NOT `.page()`, which resolves to null).
//
// References:
// - Puppeteer Target types:
// https://pptr.dev/api/puppeteer.targettype
// - Chrome offscreen document:
// https://developer.chrome.com/docs/extensions/reference/api/offscreen
import type { Browser, Page, Target } from 'puppeteer';
/** How long to wait for the offscreen document target to appear. */
const OFFSCREEN_TARGET_TIMEOUT_MS = 5_000;
/**
* Poll the browser's target list for the offscreen document. The
* offscreen is created lazily — only when the SW issues
* `chrome.offscreen.createDocument(...)`. Caller MUST invoke a flow
* that triggers offscreen creation (e.g. start a recording) BEFORE
* calling this helper.
*
* @param browser - Puppeteer Browser handle.
* @param extensionId - The extension's runtime id (for URL filtering).
* @returns Resolved Target whose URL contains 'offscreen'.
* @throws If no offscreen target appears within OFFSCREEN_TARGET_TIMEOUT_MS.
*/
export async function waitForOffscreenTarget(
browser: Browser,
extensionId: string,
): Promise<Target> {
const predicate = (t: Target): boolean => {
const url = t.url();
// Offscreen documents are loaded as chrome-extension://<id>/...
// with a path containing 'offscreen' (matches both 'src/offscreen/'
// and the bundled equivalents). Target type 'background_page' per
// RESEARCH §4 Pitfall 1.
return (
t.type() === 'background_page' &&
url.startsWith(`chrome-extension://${extensionId}`) &&
url.includes('offscreen')
);
};
return await browser.waitForTarget(predicate, {
timeout: OFFSCREEN_TARGET_TIMEOUT_MS,
});
}
/**
* Attach to the offscreen document as a Page-like handle. Uses
* `.asPage()` (NOT `.page()` — Puppeteer 25 returns null for
* `.page()` on background_page-type targets).
*
* @param target - The offscreen Target from waitForOffscreenTarget.
* @returns Page handle for evaluate/expose/etc.
*/
export async function attachToOffscreen(target: Target): Promise<Page> {
const page = await target.asPage();
return page;
}
/**
* Count the offscreen targets currently in the browser. Used by
* assertion 4 to verify that a toolbar click while recording does
* NOT spawn a second offscreen document.
*
* @param browser - Puppeteer Browser handle.
* @param extensionId - The extension's runtime id.
* @returns Integer count of offscreen targets.
*/
export function countOffscreenTargets(
browser: Browser,
extensionId: string,
): number {
const targets = browser.targets();
let count = 0;
for (const t of targets) {
if (
t.type() === 'background_page' &&
t.url().startsWith(`chrome-extension://${extensionId}`) &&
t.url().includes('offscreen')
) {
count += 1;
}
}
return count;
}
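
`waitForOffscreenTarget` and `countOffscreenTargets` apply the same three-part test (target type, extension URL prefix, `offscreen` in the path). One way that test could be factored into a single predicate is sketched below. `TargetLike`, `isOffscreenTarget`, `countOffscreen` and the sample URLs are hypothetical, and the structural interface is an assumption that lets the sketch run without launching a browser.

```typescript
// Structural stand-in for the two Target methods the predicate reads.
interface TargetLike {
  type(): string;
  url(): string;
}

// Hypothetical shared predicate (not part of the commit).
function isOffscreenTarget(t: TargetLike, extensionId: string): boolean {
  const url = t.url();
  return (
    t.type() === 'background_page' &&
    url.startsWith(`chrome-extension://${extensionId}`) &&
    url.includes('offscreen')
  );
}

// Counting becomes a filter over the same predicate.
function countOffscreen(targets: readonly TargetLike[], extensionId: string): number {
  return targets.filter((t) => isOffscreenTarget(t, extensionId)).length;
}

// Fake targets with made-up ids and paths, for illustration only.
const fakeTarget = (type: string, url: string): TargetLike => ({
  type: () => type,
  url: () => url,
});

const sampleTargets: TargetLike[] = [
  fakeTarget('background_page', 'chrome-extension://abc/src/offscreen/index.html'),
  fakeTarget('page', 'chrome-extension://abc/src/popup/index.html'),
  fakeTarget('service_worker', 'chrome-extension://abc/service-worker-loader.js'),
  fakeTarget('background_page', 'chrome-extension://other/src/offscreen/index.html'),
];

const offscreenCount = countOffscreen(sampleTargets, 'abc');
```

Only the first sample target passes all three checks. The popup fails on type and the last entry fails on extension id.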

tests/uat/lib/launch.ts Normal file

@@ -0,0 +1,314 @@
// tests/uat/lib/launch.ts — Plan 01-11 harness launch helper.
//
// Wraps puppeteer.launch with the project's invariants:
// - enableExtensions points at the absolute path to dist-test/ (the
// test bundle that carries the gated test hooks per Plan 01-11
// Task 2). NOT dist/ — that would defeat the harness entirely.
// - headless defaults to true (CI-friendly); HEADLESS=0 env opens a
// real Chrome window for local debugging.
// - --auto-select-desktop-capture-source="Entire screen" auto-accepts
// the screen-share picker so getDisplayMedia resolves without
// operator interaction (RESEARCH §9). The literal string is
// en_US-locale-sensitive; document the fallback in tests/uat/README.md.
// - Downloads land in a fresh per-run temp dir so assertion 5
// (SAVE_ARCHIVE) can poll for session_report_*.zip without
// colliding with operator downloads.
//
// References:
// - puppeteer.launch options: https://pptr.dev/api/puppeteer.launchoptions
// - puppeteer extension API: https://pptr.dev/guides/extensions
// - Chrome --auto-select-desktop-capture-source:
// https://source.chromium.org/chromium/chromium/src/+/main:media/capture/video/chromeos/camera_app_device_provider.cc
// (search for the flag in chrome://flags or the Chromium source tree)
// A triple-slash directive only takes effect when it precedes every
// statement in the file, so it sits above the imports.
/// <reference path="./test-hook-contract.d.ts" />
import { execSync } from 'node:child_process';
import { existsSync, mkdtempSync, statSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { dirname, join, resolve as resolvePath } from 'node:path';
import { fileURLToPath } from 'node:url';
import puppeteer, {
type Browser,
type CDPSession,
type Extension,
type Page,
type WebWorker,
} from 'puppeteer';
const HARNESS_FILE_DIR = dirname(fileURLToPath(import.meta.url));
const REPO_ROOT = resolvePath(HARNESS_FILE_DIR, '..', '..', '..');
const DIST_TEST_DIR = resolvePath(REPO_ROOT, 'dist-test');
/**
* Handles returned from `launchHarnessBrowser`. All references are
* live for the lifetime of the browser; the caller MUST close the
* browser to release them.
*/
export interface HarnessHandles {
readonly browser: Browser;
readonly extension: Extension;
readonly extensionId: string;
/**
* Service worker handle (for completeness / future use). NOTE: per
* the architecture refinement documented in tests/uat/lib/sw.ts,
* the harness's chrome.* state queries go through the `popup` page
* (which has full extension chrome.* access AND a stable Puppeteer
* lifetime). Direct sw.evaluate is unreliable in Chrome 148 +
* headless + Puppeteer 25 (the SW suspends + worker() returns
* "Protocol error: No target with given id found"). The SW handle
* is kept here for harness wave-3 assertion 11 / 12 (where we may
* need a worker reference for diagnostics).
*/
readonly sw: WebWorker;
readonly downloadsDir: string;
/**
* A pre-opened blank page the harness can use to invoke
* `triggerExtensionAction` (Puppeteer requires a page in the active
* tab for the toolbar-click simulation).
*/
readonly page: Page;
/**
* The extension popup page, opened at
* chrome-extension://<extensionId>/src/popup/index.html. This page
* is the harness's primary chrome.* query surface (see
* tests/uat/lib/sw.ts file header for rationale).
*/
readonly popup: Page;
}
/**
* Optional launch overrides. Defaults are CI-friendly; HEADLESS=0
* environment variable flips to headful for local debugging.
*/
export interface LaunchOptions {
/** Override the dist-test directory (test isolation). */
readonly distTestDir?: string;
/** Override the downloads directory (default: fresh tempdir per call). */
readonly downloadsDir?: string;
/** Force headless / headful regardless of HEADLESS env. */
readonly headless?: boolean;
}
/**
* Create a per-run downloads directory under the OS tmpdir. Caller is
* responsible for cleanup (typically deferred to OS tmpdir GC).
*
* @returns Absolute path to the freshly-created downloads directory.
*/
function makeDownloadsDir(): string {
return mkdtempSync(join(tmpdir(), 'mokosh-uat-downloads-'));
}
/**
* Verify the dist-test directory exists and is a directory. Fails
* loudly with an actionable message — the caller likely forgot to
* run `npm run build:test` before invoking the harness.
*
* @param distTestDir - Absolute path to dist-test.
* @throws If the directory does not exist or is not a directory.
*/
function assertDistTestPresent(distTestDir: string): void {
if (!existsSync(distTestDir)) {
throw new Error(
`dist-test/ missing at ${distTestDir}. ` +
`Run \`npm run build:test\` before launching the harness ` +
`(or invoke via \`npm run test:uat\` which does it for you).`,
);
}
const stat = statSync(distTestDir);
if (!stat.isDirectory()) {
throw new Error(
`dist-test/ exists at ${distTestDir} but is not a directory.`,
);
}
}
/**
* Resolve whether to run headless. HEADLESS=0 forces headful;
* anything else (including undefined) is headless. Explicit
* `options.headless` overrides the env entirely.
*
* @param options - Optional launch overrides.
* @returns true for headless, false for headful.
*/
function resolveHeadless(options: LaunchOptions): boolean {
if (options.headless !== undefined) {
return options.headless;
}
return process.env.HEADLESS !== '0';
}
/**
* Locate the SW target via the extension ID. Polls puppeteer's target
* list because the SW is registered asynchronously after the extension
* loads. Times out at 10s — if the SW is missing after that, either
* dist-test/ is corrupted or the SW bundle threw at module init (which
* would be caught by sw-bundle-import.test.ts BEFORE the harness ever
* runs; but defensively, we surface a clear diagnostic here).
*
* @param browser - Puppeteer Browser handle.
* @param extensionId - The extension's runtime id.
* @returns The SW WebWorker handle.
* @throws If no SW target appears within 10s.
*/
async function waitForSwTarget(
browser: Browser,
extensionId: string,
): Promise<WebWorker> {
const target = await browser.waitForTarget(
(t) =>
t.type() === 'service_worker' &&
t.url().startsWith(`chrome-extension://${extensionId}`),
{ timeout: 10_000 },
);
const sw = await target.worker();
if (sw === null) {
throw new Error(
`Service worker target found for extension ${extensionId} but ` +
`its worker() returned null — the SW likely crashed at init.`,
);
}
return sw;
}
/**
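* Redirect downloads into our temp downloadsDir via CDP. The command is
* `Browser.setDownloadBehavior`, sent over a CDP session attached to
* `page`; with no browserContextId it applies to the default browser
* context rather than to that page alone. Puppeteer 25's high-level
* downloads API is still in flux; the raw CDP call is stable across
* versions.
*
* @param page - Page whose CDP session carries the command.
* @param downloadsDir - Absolute path to capture downloads.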
*/
async function setDownloadBehavior(
page: Page,
downloadsDir: string,
): Promise<void> {
const cdpClient: CDPSession = await page.target().createCDPSession();
await cdpClient.send('Browser.setDownloadBehavior', {
behavior: 'allow',
downloadPath: downloadsDir,
eventsEnabled: true,
});
}
/**
* Launch a Chrome instance with the test bundle loaded as an unpacked
* MV3 extension; wire downloads to a per-run temp dir; return all
* handles the harness needs. Caller MUST `await handles.browser.close()`.
*
* @param options - Optional overrides (mostly for isolation in tests).
* @returns Resolved handles: browser, extension, extensionId, sw, downloadsDir, page, popup.
* @throws If dist-test/ is missing, no extension registers within 5s, or the SW target never appears.
*/
export async function launchHarnessBrowser(
options: LaunchOptions = {},
): Promise<HarnessHandles> {
const distTestDir = options.distTestDir ?? DIST_TEST_DIR;
assertDistTestPresent(distTestDir);
const downloadsDir = options.downloadsDir ?? makeDownloadsDir();
const headless = resolveHeadless(options);
// Pre-flight (best-effort): confirm the locally installed Puppeteer
// CLI entry point runs at all. This does NOT verify that Chrome
// supports the auto-select picker flag, nor that the locale matches
// the en_US "Entire screen" string. Output is suppressed and failures
// are ignored; if the browser binary is missing, the launch below
// surfaces the diagnostic.
try {
execSync('node ./node_modules/puppeteer/lib/cjs/puppeteer/node/cli.js --help', {
stdio: 'ignore',
timeout: 5_000,
});
} catch {
// Best-effort. The actual launch will fail loudly if the binary is
// truly missing.
}
const browser = await puppeteer.launch({
enableExtensions: [distTestDir],
headless,
pipe: true,
args: [
'--no-sandbox',
// RESEARCH §9: auto-accept the screen-share picker so
// getDisplayMedia resolves without operator interaction. The
// literal string is en_US-locale-sensitive; tests/uat/README.md
// documents the fallback for other locales.
'--auto-select-desktop-capture-source=Entire screen',
// DO NOT add --use-fake-ui-for-media-stream (RESEARCH §9 Pitfall:
// conflicts with auto-select).
],
});
// Resolve the extension ID. Puppeteer 25's browser.extensions() returns
// a Map<id, Extension> with all enabled extensions — BUT the map is
// populated asynchronously after the extension's manifest loads.
// Empirically: extension appears within ~100ms on local hardware but
// the very first call right after launch returns Map(0). Poll until
// extension registers OR 5s elapses; surface a clear diagnostic on
// timeout (probably means dist-test/ is malformed).
let extensionsMap = await browser.extensions();
const POLL_TIMEOUT_MS = 5_000;
const POLL_INTERVAL_MS = 100;
const pollStart = Date.now();
while (extensionsMap.size === 0 && Date.now() - pollStart < POLL_TIMEOUT_MS) {
await new Promise((r) => setTimeout(r, POLL_INTERVAL_MS));
extensionsMap = await browser.extensions();
}
const entries = [...extensionsMap];
if (entries.length === 0) {
await browser.close();
throw new Error(
`Puppeteer launched Chrome but no extensions loaded after ${POLL_TIMEOUT_MS}ms — ` +
`verify enableExtensions path points at a valid unpacked extension: ${distTestDir}. ` +
`Common causes: dist-test/ missing the manifest.json, manifest version mismatch ` +
`(Chrome requires MV3 — verify "manifest_version": 3), or chrome binary ` +
`incompatible with the unpacked extension shape.`,
);
}
const [extensionId, extension] = entries[0];
// Wait for the SW target to appear + capture its worker handle.
const sw = await waitForSwTarget(browser, extensionId);
// Give the SW's module init a tick to complete. Empirically the
// service-worker-loader.js → assets/index-*.js dynamic import
// resolves quickly, but `chrome.action.onClicked.addListener` (and
// the gated test-hook addListener monkey-patches) all run inside
// the module body, so a brief settle ensures the hook surface is
// installed BEFORE the harness's first `__mokoshTestQuery` bridge
// message from the popup page.
await new Promise((r) => setTimeout(r, 500));
// Pre-open a blank page; configure downloads. The blank page is
// also the page the harness uses for triggerExtensionAction.
const page = await browser.newPage();
await page.goto('about:blank');
await setDownloadBehavior(page, downloadsDir);
// Open the extension popup as a separate Page. This is the harness's
// primary chrome.* query surface — see tests/uat/lib/sw.ts file
// header for the architecture rationale. The popup page has full
// extension chrome.* access AND a stable Puppeteer lifetime. Loading
// the URL also wakes the SW (chrome-extension:// page load IS a SW
// wake-up event in MV3).
const popup = await browser.newPage();
await popup.goto(
`chrome-extension://${extensionId}/src/popup/index.html`,
{ waitUntil: 'domcontentloaded', timeout: 10_000 },
);
return {
browser,
extension,
extensionId,
sw,
downloadsDir,
page,
popup,
};
}
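
The headless precedence that `resolveHeadless` documents (an explicit option wins, otherwise only `HEADLESS=0` flips to headful) can be written out as a small truth table. The sketch below assumes the environment is passed in as a parameter so it can be exercised without mutating `process.env`. `resolveHeadlessFrom` and `HeadlessOptions` are hypothetical names, whereas the commit's function reads `process.env` directly.

```typescript
// Hypothetical subset of LaunchOptions: only the field this rule reads.
interface HeadlessOptions {
  readonly headless?: boolean;
}

// Same precedence as resolveHeadless, with the environment injected
// (an assumption made for testability).
function resolveHeadlessFrom(
  options: HeadlessOptions,
  env: Record<string, string | undefined>,
): boolean {
  if (options.headless !== undefined) {
    return options.headless; // explicit option overrides the env entirely
  }
  return env.HEADLESS !== '0'; // only the literal string '0' means headful
}

// Each row: [options, env, expected headless].
const headlessCases: Array<[HeadlessOptions, Record<string, string | undefined>, boolean]> = [
  [{}, {}, true],
  [{}, { HEADLESS: '0' }, false],
  [{}, { HEADLESS: 'false' }, true],
  [{ headless: true }, { HEADLESS: '0' }, true],
  [{ headless: false }, {}, false],
];

const headlessMismatches = headlessCases.filter(
  ([options, env, expected]) => resolveHeadlessFrom(options, env) !== expected,
);
```

The third row is the one most likely to surprise an operator: `HEADLESS=false` still runs headless, because the check compares against the string `'0'` only.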

tests/uat/lib/offscreen.ts Normal file

@@ -0,0 +1,107 @@
// tests/uat/lib/offscreen.ts — Plan 01-11 harness offscreen-context helpers.
//
// Each helper is a thin wrapper over `offPage.evaluate(() => ...)`.
// The Bug B BLOCKER (RESEARCH §7) lives in simulateUserStop —
// DO NOT REFACTOR to track.stop().
//
// References:
// - MediaStreamTrack 'ended' event:
// https://developer.mozilla.org/docs/Web/API/MediaStreamTrack/ended_event
// - MediaStreamTrack.stop spec note (stop does NOT fire 'ended' on the same track):
// https://www.w3.org/TR/mediacapture-streams/#dom-mediastreamtrack-stop
/// <reference path="./test-hook-contract.d.ts" />
import type { Page } from 'puppeteer';
/**
* Read the displaySurface from the active MediaStream's video track.
* Used by assertion 3 to verify monitor-only enforcement (the
* post-grant validation in src/offscreen/recorder.ts).
*
* Returns null when there is no active recording (the harness MUST
* start a recording before calling this).
*
* @param offPage - Offscreen Page handle.
* @returns 'monitor' on success, other strings on regression, null when no stream.
*/
export async function getDisplaySurface(offPage: Page): Promise<string | null> {
return await offPage.evaluate(() => {
const hook = globalThis.__mokoshTest;
if (hook === undefined || hook.getCurrentStream === undefined) {
return null;
}
const stream = hook.getCurrentStream();
if (stream === null) {
return null;
}
const track = stream.getVideoTracks()[0];
if (track === undefined) {
return null;
}
const ds = track.getSettings().displaySurface;
return typeof ds === 'string' ? ds : null;
});
}
/**
* Simulate the operator clicking Chrome's "Stop sharing" overlay.
*
* **BLOCKER (RESEARCH §7) — DO NOT REFACTOR to `track.stop()`.**
*
* `track.stop()` releases the capture but does NOT fire the 'ended'
* event on the same track per the W3C Media Capture and Streams spec. The
* production `onUserStoppedSharing` handler (src/offscreen/recorder.ts:
* 451) is wired to 'ended' — using `track.stop()` would silently bypass
* the entire Bug B fix path that this assertion exists to verify.
*
* `track.dispatchEvent(new Event('ended'))` IS the only path that
* triggers our handler. After dispatch, the production handler calls
* `stream.getTracks().forEach(t => t.stop())` which DOES release the
* capture (just doesn't refire 'ended' on the same track — spec-correct).
*
* @param offPage - Offscreen Page handle.
* @throws If no active MediaStream OR no video track in the stream.
*/
export async function simulateUserStop(offPage: Page): Promise<void> {
await offPage.evaluate(() => {
const hook = globalThis.__mokoshTest;
if (hook === undefined || hook.getCurrentStream === undefined) {
throw new Error('simulateUserStop: __mokoshTest.getCurrentStream missing');
}
const stream = hook.getCurrentStream();
if (stream === null) {
throw new Error(
'simulateUserStop: no current MediaStream — recording must be active',
);
}
const track = stream.getVideoTracks()[0];
if (track === undefined) {
throw new Error('simulateUserStop: no video track in stream');
}
// CRITICAL: dispatchEvent, NOT track.stop(). See preamble for the
// BLOCKER analysis (RESEARCH §7).
track.dispatchEvent(new Event('ended'));
});
}
/**
* Read the current segment count from the offscreen recorder's ring
* buffer. Used by assertion 11 to verify the 30s window per D-13
* (3 × 10s segments expected after 35s of recording).
*
* Returns -1 when the hook is not installed (defensive — should
* never happen against a dist-test/ bundle).
*
* @param offPage - Offscreen Page handle.
* @returns Current segment count.
*/
export async function getSegmentCount(offPage: Page): Promise<number> {
return await offPage.evaluate(() => {
const hook = globalThis.__mokoshTest;
if (hook === undefined || hook.getSegmentCount === undefined) {
return -1;
}
return hook.getSegmentCount();
});
}
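
The reason `simulateUserStop` dispatches `'ended'` instead of calling `track.stop()` can be modelled without a browser. In the sketch below `FakeTrack` is a stand-in built on `EventTarget`, not the real `MediaStreamTrack`; it assumes only the one behaviour the comments above describe, namely that `stop()` ends the track without dispatching an event. `countEndedCalls` is a hypothetical helper.

```typescript
// Stand-in for MediaStreamTrack (NOT the real API). It models a single
// behaviour: stop() changes state but dispatches no 'ended' event.
class FakeTrack extends EventTarget {
  readyState: 'live' | 'ended' = 'live';

  stop(): void {
    this.readyState = 'ended'; // silent: no event is dispatched here
  }
}

// Attach an 'ended' listener, run the action, report how often it fired.
function countEndedCalls(action: (track: FakeTrack) => void): number {
  const track = new FakeTrack();
  let calls = 0;
  track.addEventListener('ended', () => {
    calls += 1;
  });
  action(track);
  return calls;
}

// Path the harness must NOT use: the listener never runs.
const endedViaStop = countEndedCalls((t) => t.stop());

// Path the harness uses: the listener runs once.
const endedViaDispatch = countEndedCalls((t) => {
  t.dispatchEvent(new Event('ended'));
});
```

Under this model a handler wired to `'ended'` is only reachable through the dispatch path, which is why a refactor to `stop()` would leave the assertion passing over code that never ran.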

tests/uat/lib/sw.ts Normal file

@@ -0,0 +1,262 @@
// tests/uat/lib/sw.ts — Plan 01-11 harness SW-state helpers.
//
// IMPLEMENTATION ARCHITECTURE (refined during Task 3 execution):
//
// The original Plan 01-11 RESEARCH §1 sketch assumed `sw.evaluate(() =>
// chrome.action.getBadgeText({}))` would work directly against the
// service worker via Puppeteer's WebWorker.evaluate. Empirical probes
// during Task 3 execution against Puppeteer 25.0.2 + Chrome 148 +
// --headless=true revealed two blockers:
// 1. Puppeteer's `WebWorker.evaluate` runs in an ISOLATED WORLD that
// carries SW globals (clients, registration, ...) but NOT the
// extension's full `chrome.*` API surface. `Object.keys(chrome)`
// returns `["loadTimes", "csi"]` — the public webpage chrome,
// not the extension chrome.
// 2. Chrome 148's headless mode aggressively suspends MV3 service
// workers; subsequent `swTarget.worker()` calls fail with
// `Protocol error: No target with given id found`.
//
// The popup page (chrome-extension://<id>/src/popup/index.html) has:
// - Full `chrome.*` API access (it's an extension context — same
// privileges as the SW for chrome.action, chrome.runtime,
// chrome.notifications, chrome.runtime.getManifest, etc.)
// - Stable lifetime (it's a regular Page; Puppeteer keeps it alive)
// - Natural SW wake-up via message passing (chrome.runtime
// .sendMessage from popup wakes the SW for 30s)
//
// So this module's helpers use a Puppeteer Page handle pointing at
// the popup URL — NOT a WebWorker handle. The harness opens the popup
// page during setup (tests/uat/lib/launch.ts) and passes it here.
//
// For SW-isolate-specific state (`globalThis.__mokoshTest` lives in
// the SW's globalThis, not the popup's), the SW hook exposes a
// `chrome.runtime.onMessage` bridge: the popup sends
// `{ type: '__mokoshTestQuery', op: '...' }` messages; the hook
// responds with the queried state. Bridge implementation is in
// src/test-hooks/sw-hooks.ts; this file invokes it via popup.evaluate
// wrapping `chrome.runtime.sendMessage`.
//
// References:
// - Chrome extension pages share chrome.* API:
// https://developer.chrome.com/docs/extensions/develop/concepts/popup
// - Puppeteer Page.evaluate: https://pptr.dev/api/puppeteer.page.evaluate
// - Service worker wake-up on chrome.runtime message:
// https://developer.chrome.com/docs/extensions/develop/concepts/service-workers/lifecycle
/// <reference path="./test-hook-contract.d.ts" />
import type { Page } from 'puppeteer';
/**
* Structured snapshot of the SW's notification observability state
* (Plan 01-11 Task 2 sw-hooks.ts surfaces). Used by assertions 7 + 8
* to verify count-deltas + last-options-shape + id-prefix membership.
*/
export interface NotificationSnapshot {
readonly count: number;
readonly lastOptions: chrome.notifications.NotificationOptions<true> | null;
readonly ids: ReadonlyArray<string>;
}
/**
* The SW hook's bridge message type. The popup sends one of these
* shapes via chrome.runtime.sendMessage; the SW's onMessage handler
* (extended by sw-hooks.ts) responds with the queried state. See
* src/test-hooks/sw-hooks.ts for the SW-side dispatch.
*/
interface BridgeQuery {
type: '__mokoshTestQuery';
op:
| 'snapshot'
| 'fire-on-startup'
| 'handler-types';
}
/**
* Get the toolbar badge text. Empty string means OFF or initial state;
* 'REC' means recording; 'ERR' means error per Plan 01-09 badge state
* machine.
*
* @param popup - The extension popup page handle (open against
* chrome-extension://<id>/src/popup/index.html).
* @returns Current badge text.
*/
export async function getBadgeText(popup: Page): Promise<string> {
return await popup.evaluate(async () => await chrome.action.getBadgeText({}));
}
/**
* Get the current popup URL. Empty string means popup is not set
* (toolbar click fires onClicked instead). The chrome-extension://
* URL means recording (popup hosts SAVE button).
*
* @param popup - The extension popup page handle.
* @returns Current popup URL (full chrome-extension:// form OR '').
*/
export async function getPopup(popup: Page): Promise<string> {
return await popup.evaluate(async () => await chrome.action.getPopup({}));
}
/**
* Read the runtime manifest. Used by assertion 10 to verify
* permissions + icons shape, and by assertion 13 to obtain the
* version string for archive shape matching.
*
* @param popup - The extension popup page handle.
* @returns The chrome.runtime.getManifest() result.
*/
export async function getManifest(popup: Page): Promise<chrome.runtime.Manifest> {
return await popup.evaluate(() => chrome.runtime.getManifest());
}
/**
* Fetch an extension-relative file via popup context and return its
* size in bytes. Used by assertion 9 to verify icon files meet the
* size floors that Chrome's imageUtil requires for notifications.create
* (Bug A regression class — too-small icon → create rejects).
*
* @param popup - The extension popup page handle.
* @param relativePath - Path under the extension root (e.g. 'icons/icon128.png').
* @returns Byte size on success, -1 on fetch failure.
*/
export async function getIconSize(
popup: Page,
relativePath: string,
): Promise<number> {
return await popup.evaluate(async (path: string) => {
const url = chrome.runtime.getURL(path);
const r = await fetch(url);
if (!r.ok) {
return -1;
}
const cl = r.headers.get('content-length');
if (cl !== null) {
const n = Number(cl);
if (Number.isFinite(n) && n > 0) {
return n;
}
}
const buf = await r.arrayBuffer();
return buf.byteLength;
}, relativePath);
}
/**
* Read whether the SW thinks a recording is active. Side-channeled
* through the badge text — 'REC' ↔ recording; '' ↔ idle; 'ERR' ↔
* error state — to avoid needing a dedicated hook field.
*
* @param popup - The extension popup page handle.
* @returns true when badge === 'REC'.
*/
export async function getIsRecording(popup: Page): Promise<boolean> {
const badge = await getBadgeText(popup);
return badge === 'REC';
}
/**
* Fire the captured chrome.runtime.onStartup handler via the test
* hook's chrome.runtime.sendMessage bridge. Used by assertion 8 to
* verify the Bug A path (icon-promoted notification fires cleanly).
*
* Bridge protocol: popup sends `{ type: '__mokoshTestQuery', op: 'fire-on-startup' }`;
* SW responds with `{ ok: true }` after invoking the handler, OR
* `{ ok: false, error: 'no-handler' }` if the production listener
* was never registered (means the SW module init failed — a
* different bug class).
*
* @param popup - The extension popup page handle.
* @throws If the bridge response indicates the handler is missing.
*/
export async function fireOnStartup(popup: Page): Promise<void> {
const response = await popup.evaluate(async () => {
const msg = {
type: '__mokoshTestQuery',
op: 'fire-on-startup',
};
return new Promise<{ ok: boolean; error?: string }>((resolve) => {
chrome.runtime.sendMessage(msg, (r) => {
resolve(r as { ok: boolean; error?: string });
});
});
});
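// The callback receives undefined when the SW never answers (for
// example, no bridge listener is installed), so guard before reading .ok.
if (response === undefined || response === null || !response.ok) {
throw new Error(
`fireOnStartup bridge failed: ${response?.error ?? '(no response or error message from the SW bridge)'}`,
);
}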
}
/**
* Inject a synthetic RECORDING_ERROR message into the SW's
* chrome.runtime.onMessage handler. Used by assertion 7 to verify
* the error path is preserved (badge 'ERR' + recovery notification).
* Goes through the popup's chrome.runtime.sendMessage — a real
* production code path (sw onMessage handler).
*
* @param popup - The extension popup page handle.
* @param errorCode - The error code to inject (e.g. 'codec-unsupported').
*/
export async function sendSyntheticRecordingError(
popup: Page,
errorCode: string,
): Promise<void> {
await popup.evaluate(async (code: string) => {
await chrome.runtime.sendMessage({
type: 'RECORDING_ERROR',
error: code,
});
}, errorCode);
}
/**
* Snapshot the current notification observability state from the SW
* hook via the bridge.
*
* @param popup - The extension popup page handle.
* @returns Snapshot — count, last options, ids array.
*/
export async function getNotificationSnapshot(
popup: Page,
): Promise<NotificationSnapshot> {
const response = await popup.evaluate(async () => {
const msg = { type: '__mokoshTestQuery', op: 'snapshot' };
return new Promise<{
count: number;
lastOptions: chrome.notifications.NotificationOptions<true> | null;
ids: string[];
}>((resolve) => {
chrome.runtime.sendMessage(msg, (r) => {
resolve(r as {
count: number;
lastOptions: chrome.notifications.NotificationOptions<true> | null;
ids: string[];
});
});
});
});
return {
count: response.count,
lastOptions: response.lastOptions,
ids: response.ids,
};
}
/**
* Send a no-op keepalive ping to the SW so Chrome's ~30s idle timer
* does not evict the worker during long waits (assertion 11's 35s
* recording window). Uses chrome.runtime.sendMessage as the cheapest
* wake-up signal; the SW's onMessage handler treats unknown messages
* as a warning-log no-op.
*
* @param popup - The extension popup page handle.
*/
export async function keepalivePing(popup: Page): Promise<void> {
await popup.evaluate(async () => {
await chrome.runtime.sendMessage({ type: '__mokoshKeepalive' });
});
}
// Re-export the BridgeQuery type for sw-hooks.ts side reference
// (the SW hook implements the message dispatch using the same shape).
export type { BridgeQuery };
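
The SW side of the bridge lives in src/test-hooks/sw-hooks.ts and is not shown in this file. As a rough sketch of the request/response protocol described in the header, the dispatch could look like the pure function below. `HookState`, `isBridgeQuery` and `handleBridgeQuery` are hypothetical, the response shapes for `snapshot` and `fire-on-startup` follow the doc comments above, and the `handler-types` response is left unhandled because its shape is not described here.

```typescript
type BridgeOp = 'snapshot' | 'fire-on-startup' | 'handler-types';

interface BridgeQuery {
  type: '__mokoshTestQuery';
  op: BridgeOp;
}

// Hypothetical SW-side state; the real hook may store this differently.
interface HookState {
  count: number;
  lastOptions: unknown;
  ids: string[];
  onStartupHandler: (() => void) | null;
}

// Narrow an arbitrary runtime message to a bridge query.
function isBridgeQuery(msg: unknown): msg is BridgeQuery {
  if (typeof msg !== 'object' || msg === null) {
    return false;
  }
  const m = msg as { type?: unknown; op?: unknown };
  return (
    m.type === '__mokoshTestQuery' &&
    (m.op === 'snapshot' || m.op === 'fire-on-startup' || m.op === 'handler-types')
  );
}

// Returns the response for a bridge query, or undefined when the message
// is not a bridge query and should fall through to production handlers.
function handleBridgeQuery(msg: unknown, state: HookState): unknown {
  if (!isBridgeQuery(msg)) {
    return undefined;
  }
  switch (msg.op) {
    case 'snapshot':
      return { count: state.count, lastOptions: state.lastOptions, ids: [...state.ids] };
    case 'fire-on-startup':
      if (state.onStartupHandler === null) {
        return { ok: false, error: 'no-handler' };
      }
      state.onStartupHandler();
      return { ok: true };
    default:
      return undefined; // 'handler-types': shape not described in this file
  }
}
```

Keeping the dispatch free of `chrome.*` calls would let it be unit-tested under vitest, with the `chrome.runtime.onMessage` listener reduced to a thin adapter around it.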

tests/uat/lib/zip.ts Normal file

@@ -0,0 +1,121 @@
// tests/uat/lib/zip.ts — Plan 01-11 harness archive-shape helper.
//
// Assertion 13 verifies the session_report_*.zip produced by the SW's
// saveArchive contains:
// - `video/last_30sec.webm` (non-zero size)
// - `meta.json` whose parsed JSON has `version === <manifest.version>`
//
// References:
// - JSZip: https://stuk.github.io/jszip/documentation/api_jszip.html
// - Plan 01-07 archive shape (session_report contract):
// .planning/phases/01-stabilize-video-pipeline/01-07-PLAN.md
import { readFileSync } from 'node:fs';
import JSZip from 'jszip';
/**
* Outcome of an archive shape inspection. `errors` lists every
* missing-file / wrong-size / version-mismatch finding.
*/
export interface ArchiveShapeResult {
readonly hasVideoEntry: boolean;
readonly videoSizeBytes: number;
readonly hasMetaEntry: boolean;
readonly metaJson: { version?: unknown } | null;
readonly errors: ReadonlyArray<string>;
}
/**
* Open a downloaded session_report_*.zip and verify its shape.
*
* @param zipPath - Absolute path to the downloaded .zip file.
* @param expectedVersion - The version string from chrome.runtime.getManifest().version.
* @returns Structured shape result. `errors` non-empty == assertion failure.
*/
export async function assertArchiveShape(
zipPath: string,
expectedVersion: string,
): Promise<ArchiveShapeResult> {
const zipBuf = readFileSync(zipPath);
const zip = await JSZip.loadAsync(zipBuf);
const errors: string[] = [];
// video/last_30sec.webm presence + size
const videoEntry = zip.file('video/last_30sec.webm');
let hasVideoEntry = false;
let videoSizeBytes = 0;
if (videoEntry === null) {
errors.push('video/last_30sec.webm entry missing from archive');
} else {
hasVideoEntry = true;
const videoBuf = await videoEntry.async('uint8array');
videoSizeBytes = videoBuf.byteLength;
if (videoSizeBytes === 0) {
errors.push('video/last_30sec.webm entry is zero bytes (no captured video)');
}
}
// meta.json presence + version match
const metaEntry = zip.file('meta.json');
let hasMetaEntry = false;
let metaJson: { version?: unknown } | null = null;
if (metaEntry === null) {
errors.push('meta.json entry missing from archive');
} else {
hasMetaEntry = true;
const metaText = await metaEntry.async('string');
try {
metaJson = JSON.parse(metaText) as { version?: unknown };
} catch (parseErr) {
const msg = parseErr instanceof Error ? parseErr.message : String(parseErr);
errors.push(`meta.json failed to parse as JSON: ${msg}`);
}
if (metaJson !== null) {
if (typeof metaJson.version !== 'string') {
errors.push(
`meta.json.version expected string, got ${typeof metaJson.version} (${JSON.stringify(metaJson.version)})`,
);
} else if (metaJson.version !== expectedVersion) {
errors.push(
`meta.json.version mismatch — expected "${expectedVersion}", got "${metaJson.version}"`,
);
}
}
}
return {
hasVideoEntry,
videoSizeBytes,
hasMetaEntry,
metaJson,
errors,
};
}
/**
* Extract a single named entry from a .zip to an absolute filesystem
* path. Used by assertion 12 (ffprobe gate on video/last_30sec.webm).
*
* @param zipPath - Absolute path to the .zip.
* @param entryName - Name of the entry inside the zip (e.g. 'video/last_30sec.webm').
* @param outPath - Absolute filesystem path to write the entry to.
* @returns The number of bytes written.
* @throws If the entry is missing from the zip.
*/
export async function extractEntryToFile(
zipPath: string,
entryName: string,
outPath: string,
): Promise<number> {
const { writeFileSync } = await import('node:fs');
const zipBuf = readFileSync(zipPath);
const zip = await JSZip.loadAsync(zipBuf);
const entry = zip.file(entryName);
if (entry === null) {
throw new Error(`extractEntryToFile: entry '${entryName}' missing in ${zipPath}`);
}
const buf = await entry.async('nodebuffer');
writeFileSync(outPath, buf);
return buf.byteLength;
}
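
The shape rules that `assertArchiveShape` enforces (a non-empty video entry, a parseable `meta.json`, a matching string version) can be exercised without a real zip. The sketch below assumes an in-memory `Map` of entry name to bytes in place of JSZip; `collectShapeErrors` and `encodeText` are hypothetical names, and the error wording only approximates the commit's messages.

```typescript
// Encode a string as UTF-8 bytes (TextEncoder is a Node and browser global).
const encodeText = (s: string): Uint8Array => new TextEncoder().encode(s);

// In-memory stand-in for the archive: entry name -> bytes. The commit reads
// entries through JSZip; this sketch models only the checks.
function collectShapeErrors(
  entries: Map<string, Uint8Array>,
  expectedVersion: string,
): string[] {
  const errors: string[] = [];

  const video = entries.get('video/last_30sec.webm');
  if (video === undefined) {
    errors.push('video/last_30sec.webm entry missing from archive');
  } else if (video.byteLength === 0) {
    errors.push('video/last_30sec.webm entry is zero bytes');
  }

  const meta = entries.get('meta.json');
  if (meta === undefined) {
    errors.push('meta.json entry missing from archive');
    return errors;
  }

  let parsed: unknown;
  try {
    parsed = JSON.parse(new TextDecoder().decode(meta));
  } catch {
    errors.push('meta.json failed to parse as JSON');
    return errors;
  }

  // Tolerate non-object JSON (for example `null`) by treating version as absent.
  const version =
    typeof parsed === 'object' && parsed !== null
      ? (parsed as { version?: unknown }).version
      : undefined;
  if (typeof version !== 'string') {
    errors.push(`meta.json.version expected string, got ${typeof version}`);
  } else if (version !== expectedVersion) {
    errors.push(`meta.json.version mismatch: expected "${expectedVersion}", got "${version}"`);
  }
  return errors;
}
```

Collecting every finding before returning, instead of throwing on the first one, mirrors the `errors` array in `ArchiveShapeResult`, so a single failed run reports all shape problems at once.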

tests/uat/tsconfig.json Normal file

@@ -0,0 +1,20 @@
{
"_comment": "Plan 01-11 — type-check config for the Puppeteer UAT harness. Mirrors the root tsconfig.json's compiler options but `include`s the harness tree explicitly so `npx tsc --noEmit -p tests/uat` validates the harness in isolation. Used by Task 3 to verify the scaffolding type-checks before tsx runs it. The root tsconfig.json's `include: ['src']` does NOT pick up tests/, so this file is necessary for the type-check verification step.",
"compilerOptions": {
"target": "ES2022",
"module": "ESNext",
"lib": ["ES2022", "DOM", "DOM.Iterable"],
"skipLibCheck": true,
"moduleResolution": "bundler",
"allowImportingTsExtensions": true,
"resolveJsonModule": true,
"isolatedModules": true,
"noEmit": true,
"strict": true,
"noUnusedLocals": true,
"noUnusedParameters": true,
"noFallthroughCasesInSwitch": true,
"types": ["chrome", "node"]
},
"include": ["**/*.ts", "**/*.d.ts"]
}


@@ -17,6 +17,14 @@ export default defineConfig({
test: {
environment: 'node',
include: ['tests/**/*.test.ts'],
// Plan 01-11: exclude the Puppeteer harness from vitest's discovery.
// tests/uat/harness.test.ts is a tsx-runnable Node script invoked
// via `npm run test:uat`; running it under vitest would try to
// launch a real Chrome inside the vitest worker (interactive UAT
// does not belong in the unit-test pass). The .test.ts suffix is
// retained for editor + naming-convention consistency with the
// rest of the tests/ tree.
exclude: ['node_modules/**', 'tests/uat/**'],
reporters: 'dot',
typecheck: {
enabled: false,