Task 3 of Plan 01-11 (Puppeteer UAT harness).
Harness file tree (tests/uat/):
- harness.test.ts: tsx-runnable top-to-bottom harness entry point.
Runs A0 inline (filesystem grep gate, abort-on-fail T-1-11-01),
then launches Chrome + opens popup bridge + queries manifest, then
iterates A1-A13 stubs. Each stub throws "NOT YET IMPLEMENTED —
Plan 01-11 Task N wires this assertion". Exit code = 0 on full
pass, 1 otherwise. Final line: "UAT harness: N/14 assertions passed".
- lib/launch.ts: launchHarnessBrowser() — wraps puppeteer.launch with
enableExtensions:[dist-test/], headless default (HEADLESS=0
override), --no-sandbox + --auto-select-desktop-capture-source flags.
Polls browser.extensions() until the extension registers (empirically
~100ms but the first call right after launch returns Map(0)).
Opens both a blank page (for triggerExtensionAction) AND the popup
page (the bridge surface). Returns { browser, extension, extensionId,
sw, downloadsDir, page, popup }.
- lib/extension.ts: waitForOffscreenTarget + attachToOffscreen +
countOffscreenTargets. Offscreen attach uses target.type() ===
'background_page' + .asPage() (NOT .page() — RESEARCH §4 Pitfall 1).
- lib/sw.ts: chrome.* state queries via the POPUP page handle (NOT
the WebWorker handle — see architecture note below). getBadgeText,
getPopup, getManifest, getIconSize, getIsRecording (side-channeled
through badge text), fireOnStartup (via __mokoshTestQuery bridge),
sendSyntheticRecordingError, getNotificationSnapshot (via bridge),
keepalivePing (no-op message to wake SW for ~30s).
- lib/offscreen.ts: getDisplaySurface, simulateUserStop (the
dispatchEvent('ended') path per RESEARCH §7 BLOCKER — DO NOT REFACTOR
to track.stop()), getSegmentCount.
- lib/assertions.ts: runAssertion(idx, name, buffers, fn) wrapper —
records pass/fail/duration; on failure dumps last 30 lines of SW
+ offscreen console buffers to stderr before rethrowing. assertEqual
/ assertMatch / assertTrue / assertGte / waitFor polling helper.
- lib/zip.ts: jszip-based assertArchiveShape + extractEntryToFile for
assertions 12 + 13.
- README.md: runtime + local-debug + CI semantics + locale gotcha
+ dev-dep size note + assertion catalog table.
- tsconfig.json: per-tree type-check config (mirrors root tsconfig.json
compiler options but includes the harness tree explicitly).
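The polling mentioned above (waiting for `browser.extensions()` to report the extension, and the `waitFor` helper in lib/assertions.ts) can be pictured with a minimal sketch like the following. The name `pollUntil`, its options, and its defaults are assumptions made for illustration; the real helper in the harness may have a different signature.

```typescript
// Hypothetical polling helper. Names and defaults are illustrative only and
// are not taken from the harness's lib/assertions.ts.
interface PollOptions {
  timeoutMs?: number;
  intervalMs?: number;
}

async function pollUntil<T>(
  probe: () => T | undefined | Promise<T | undefined>,
  label: string,
  { timeoutMs = 5_000, intervalMs = 50 }: PollOptions = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    // A probe returns undefined while the condition is not yet met.
    const value = await probe();
    if (value !== undefined) return value;
    if (Date.now() >= deadline) {
      throw new Error(`pollUntil(${label}): timed out after ${timeoutMs}ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A caller would pass a probe that returns `undefined` while the extension map is still empty and the extension entry once it appears, so the first empty result right after launch is retried rather than treated as a failure.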
Architecture refinement (DEVIATION from RESEARCH §1 — Rule 1+3 inline fix):
- RESEARCH §1 sketched `sw.evaluate(() => chrome.action.getBadgeText({}))`
as the chrome.* query path. Empirical probes during Task 3 execution
against Puppeteer 25.0.2 + Chrome 148 + --headless=true revealed two
blockers:
1. Puppeteer's WebWorker.evaluate runs in an ISOLATED WORLD that
carries SW globals (clients, registration, ...) but NOT the
extension's full chrome.* API surface. Object.keys(chrome) inside
sw.evaluate returns ["loadTimes","csi"] — the public webpage
chrome, not the extension chrome.
2. Chrome 148's headless mode aggressively suspends MV3 service
workers; subsequent swTarget.worker() calls fail with
"Protocol error: No target with given id found".
- WORKAROUND: open the popup page (chrome-extension://<id>/src/popup/
index.html) as a separate Puppeteer Page. The popup has full
chrome.* access (it's an extension context with same privileges as
the SW) AND stable Puppeteer lifetime. For SW-globalThis state
(__mokoshTest in the SW isolate, NOT in the popup), bridge via
chrome.runtime.sendMessage. The popup sends
{ type: '__mokoshTestQuery', op: 'snapshot' | 'fire-on-startup' |
'handler-types' }; the SW hook's onMessage handler responds.
- Bridge implementation added to src/test-hooks/sw-hooks.ts — registers
AFTER the production listeners so it never intercepts production
messages (__mokoshTest* type is unambiguously test-only). Tier-1
grep gate (no-test-hooks-in-prod-bundle.test.ts) continues to enforce
ZERO __mokoshTest occurrences in dist/ — the bridge handler is
tree-shaken alongside the rest of the hook module via the
__MOKOSH_UAT__ gate.
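As a rough illustration of why the bridge cannot intercept production messages, the message shape described above could be typed and guarded as in the sketch below. Only the `type` value and the three `op` values come from this commit; the names `TestQueryMessage` and `isTestQueryMessage` are hypothetical and the actual handler in src/test-hooks/sw-hooks.ts may be structured differently.

```typescript
// Hypothetical typing of the popup-to-SW bridge message. Only the literal
// `type` and `op` values are taken from the commit description.
const TEST_QUERY_OPS = ['snapshot', 'fire-on-startup', 'handler-types'] as const;
type TestQueryOp = (typeof TEST_QUERY_OPS)[number];

interface TestQueryMessage {
  type: '__mokoshTestQuery';
  op: TestQueryOp;
}

// Type guard: true only for well-formed test queries, so a listener using it
// can return early and leave every production message untouched.
function isTestQueryMessage(msg: unknown): msg is TestQueryMessage {
  if (typeof msg !== 'object' || msg === null) return false;
  const candidate = msg as { type?: unknown; op?: unknown };
  return (
    candidate.type === '__mokoshTestQuery' &&
    TEST_QUERY_OPS.some((op) => op === candidate.op)
  );
}
```

A listener built on such a guard would ignore anything that is not a test query, which is consistent with registering it after the production listeners.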
Other configuration changes:
- vitest.config.ts: exclude tests/uat/** from vitest discovery. The
Puppeteer harness is invoked via `npm run test:uat` (not vitest);
running it under vitest would try to launch real Chrome inside a
vitest worker. The .test.ts suffix is retained for editor +
naming-convention consistency with the rest of the tree.
Verification:
- npx tsc --noEmit (src/): exit 0
- npx tsc --noEmit -p tests/uat: exit 0
- npm run build: exit 0
- grep -rlE '__mokoshTest|simulateUserStop|getSegmentCount|setCurrentStream|setSegmentCountGetter|__mokoshTestQuery|__mokoshKeepalive' dist/: ZERO matches
- npm run build:test: exit 0; dist-test/ populated with the new bridge code
- SKIP_BUILD=1 npx vitest run: 89/89 GREEN
- SKIP_PROD_REBUILD=1 npx tsx tests/uat/harness.test.ts:
→ A0 [PASS]: production bundle has no test-hook leaks (19ms)
→ Browser launches; popup opens; manifest read succeeds
→ A1-A13 [FAIL]: NOT YET IMPLEMENTED — Plan 01-11 Task N wires this
→ "UAT harness: 1/14 assertions passed, 13 failed (first failure: A1)"
→ Exit code: 1 (expected — 13 RED stubs intentional)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
// tests/uat/harness.test.ts: Plan 01-11 Puppeteer UAT harness entry point.
//
// Runs end-to-end via `npm run test:uat` (build:test + tsx tests/uat/harness.test.ts).
// Top-to-bottom narrative: launch Chrome with dist-test loaded as
// MV3 extension, attach to SW + offscreen, run 14 assertions
// sequentially with bail-on-first-fail semantics + structured
// diagnostic dump on failure (RESEARCH §5 + open-question resolution 4).
//
// Exit code:
//   0: all 14 assertions passed
//   1: at least one assertion failed
//
// Local-debug mode: `HEADLESS=0 npm run test:uat` (opens real Chrome)
// Skip prod rebuild: `SKIP_PROD_REBUILD=1` (assertion 0 still verifies
//   the EXISTING dist/ rather than spawning npm run build).
//
// Assertion catalog (14 total):
//    0: Production bundle grep gate (filesystem-only; pre-flight).
//    1: SW bootstrap → setIdleMode (badge '', popup '', isRecording=false).
//    2: Toolbar onClicked-idle → badge 'REC' + popup popup.html + isRecording=true.
//    3: Offscreen displaySurface === 'monitor' (post-grant validation).
//    4: Toolbar onClicked while recording → popup, NO new offscreen.
//    5: SAVE_ARCHIVE → download fires + session_report_*.zip appears.
//    6: BUG B (canonical): simulateUserStop → badge '' + popup '' + NO recovery notif.
//    7: RECORDING_ERROR codec-unsupported → badge 'ERR' + recovery notif.
//    8: BUG A (canonical): onStartup → mokosh-startup- notification creates cleanly.
//    9: Icon file sizes meet floors (16→200, 48→500, 128→1024).
//   10: Manifest has notifications permission + all three icons declared.
//   11: 35s recording yields >= 3 segments per D-13.
//   12: ffprobe -v error -f matroska on extracted webm exits 0.
//   13: Archive shape (video/last_30sec.webm + meta.json with version match).

import { execFileSync, execSync } from 'node:child_process';
import { existsSync, readdirSync, readFileSync, statSync, mkdtempSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { dirname, join, resolve as resolvePath } from 'node:path';
import { fileURLToPath } from 'node:url';

import type { Page } from 'puppeteer';

import {
  type AssertionRecord,
  type ConsoleBuffers,
  assertEqual,
  assertGte,
  assertMatch,
  assertTrue,
  runAssertion,
  waitFor,
} from './lib/assertions';
import {
  attachToOffscreen,
  countOffscreenTargets,
  waitForOffscreenTarget,
} from './lib/extension';
import {
  getDisplaySurface,
  getSegmentCount,
  simulateUserStop,
} from './lib/offscreen';
import {
  fireOnStartup,
  getBadgeText,
  getIconSize,
  getIsRecording,
  getManifest,
  getNotificationSnapshot,
  getPopup,
  keepalivePing,
  sendSyntheticRecordingError,
} from './lib/sw';
import { assertArchiveShape, extractEntryToFile } from './lib/zip';
import { launchHarnessBrowser, type HarnessHandles } from './lib/launch';

const HARNESS_FILE_DIR = dirname(fileURLToPath(import.meta.url));
const REPO_ROOT = resolvePath(HARNESS_FILE_DIR, '..', '..');
const DIST_DIR = resolvePath(REPO_ROOT, 'dist');
const FFPROBE_BIN = '/usr/bin/ffprobe';
const TOTAL_ASSERTIONS = 14;

/**
 * Forbidden hook surface strings: assertion 0 verifies absence
 * in production dist/. Mirrors the Tier-1 unit gate's surface list
 * (tests/background/no-test-hooks-in-prod-bundle.test.ts) but runs
 * against the SAME dist/ as the live harness for E2E parity.
 */
const FORBIDDEN_HOOK_STRINGS: ReadonlyArray<string> = [
  '__mokoshTest',
  'simulateUserStop',
  'getSegmentCount',
  'setCurrentStream',
  'setSegmentCountGetter',
];

/** Icon-size floors per assertion 9 (per orchestrator brief). */
const ICON_SIZE_FLOORS: ReadonlyArray<readonly [string, number]> = [
  ['icons/icon16.png', 200],
  ['icons/icon48.png', 500],
  ['icons/icon128.png', 1024],
];

/**
 * Recursively list all files under a root directory (sync). Used by
 * assertion 0 to walk dist/. Symlinks are skipped defensively.
 *
 * @param root - Absolute directory path.
 * @returns Sorted list of absolute file paths.
 */
function listAllFilesRecursive(root: string): ReadonlyArray<string> {
  const acc: string[] = [];
  const stack: string[] = [root];
  while (stack.length > 0) {
    const dir = stack.pop()!;
    const entries = readdirSync(dir, { withFileTypes: true });
    for (const entry of entries) {
      const fullPath = resolvePath(dir, entry.name);
      if (entry.isSymbolicLink()) continue;
      if (entry.isDirectory()) {
        stack.push(fullPath);
      } else if (entry.isFile()) {
        acc.push(fullPath);
      }
    }
  }
  return acc.sort();
}

/**
 * Grep `needle` across every text-like file under `root`. Returns
 * file paths that contain at least one occurrence.
 *
 * @param root - Absolute directory path.
 * @param needle - Literal substring to find.
 * @returns Paths containing `needle`.
 */
function grepRecursive(root: string, needle: string): ReadonlyArray<string> {
  const binaryExt = new Set(['.png', '.jpg', '.jpeg', '.gif', '.ico', '.webp', '.woff', '.woff2', '.ttf']);
  const out: string[] = [];
  for (const filePath of listAllFilesRecursive(root)) {
    const dotIdx = filePath.lastIndexOf('.');
    const ext = dotIdx >= 0 ? filePath.substring(dotIdx).toLowerCase() : '';
    if (binaryExt.has(ext)) continue;
    if (statSync(filePath).size === 0) continue;
    const text = readFileSync(filePath, 'utf8');
    if (text.includes(needle)) {
      out.push(filePath);
    }
  }
  return out;
}

/**
 * Poll `downloadsDir` for any *session_report*.zip file. Returns the
 * absolute path of the first match. Used by assertion 5.
 *
 * @param downloadsDir - Absolute downloads directory path.
 * @param timeoutMs - Maximum wait time.
 * @returns Absolute path to the matched .zip.
 * @throws On timeout.
 */
async function waitForDownloadedZip(
  downloadsDir: string,
  timeoutMs: number,
): Promise<string> {
  const start = Date.now();
  while (Date.now() - start < timeoutMs) {
    const entries = readdirSync(downloadsDir);
    for (const name of entries) {
      if (name.includes('session_report') && name.endsWith('.zip')) {
        const full = join(downloadsDir, name);
        // Make sure write completed (size stabilized).
        const size1 = statSync(full).size;
        await new Promise((r) => setTimeout(r, 200));
        const size2 = statSync(full).size;
        if (size1 === size2 && size1 > 0) {
          return full;
        }
      }
    }
    await new Promise((r) => setTimeout(r, 200));
  }
  throw new Error(
    `waitForDownloadedZip: no session_report_*.zip appeared in ${downloadsDir} within ${timeoutMs}ms`,
  );
}

/**
 * Run a production build of dist/ unless SKIP_PROD_REBUILD=1.
 * Assertion 0 reads dist/, so this guarantees the gate runs against
 * a fresh artifact.
 */
function ensureProductionBuild(): void {
  if (process.env.SKIP_PROD_REBUILD === '1') {
    process.stdout.write(' (SKIP_PROD_REBUILD=1 — using existing dist/)\n');
    return;
  }
  process.stdout.write(' Running `npm run build` (assertion 0 pre-flight)...\n');
  execFileSync('npm', ['run', 'build'], {
    stdio: 'inherit',
    cwd: REPO_ROOT,
  });
}

/**
 * Stub placeholder for the assertions that Task 4 and later will wire.
 * Each stub throws so the harness exits non-zero today; the diagnostic
 * clearly identifies the assertion as un-implemented rather than
 * failing in production.
 *
 * @param taskNumber - The plan task number that will wire this assertion.
 * @returns A function that always throws.
 */
function notYetImplemented(taskNumber: number): () => Promise<void> {
  return async () => {
    throw new Error(
      `NOT YET IMPLEMENTED — Plan 01-11 Task ${taskNumber} wires this assertion`,
    );
  };
}

/**
 * Main harness entry point. Runs all 14 assertions sequentially with
 * bail-on-first-fail semantics for the SETUP-dependent assertions
 * (we still record every assertion's outcome; bail only stops
 * subsequent FUNCTIONAL assertions from running).
 */
async function main(): Promise<number> {
  const results: AssertionRecord[] = [];
  const buffers: ConsoleBuffers = { swLines: [], offscreenLines: [] };
  let handles: HarnessHandles | null = null;

  process.stdout.write('\nMokosh UAT harness — Plan 01-11 Puppeteer-driven 14-assertion suite\n');
  process.stdout.write('='.repeat(72) + '\n\n');

  try {
    // ─── Assertion 0: Pre-flight grep gate ──────────────────────────
    process.stdout.write('Assertion 0 (pre-flight, filesystem-only):\n');
    ensureProductionBuild();
    const a0 = await runAssertion(
      0,
      'production bundle has no test-hook leaks (T-1-11-01)',
      buffers,
      async () => {
        for (const needle of FORBIDDEN_HOOK_STRINGS) {
          const matches = grepRecursive(DIST_DIR, needle);
          assertEqual(
            matches.length,
            0,
            `production dist/ contains '${needle}' in: ${JSON.stringify(matches)}`,
          );
        }
      },
    );
    results.push(a0);
    if (!a0.passed) {
      // Hook leak is security-critical (T-1-11-01): abort immediately.
      process.stderr.write(
        '\n*** ABORT: assertion 0 (hook leak gate) FAILED — refusing to ' +
          'continue with potentially-leaky production bundle. ***\n',
      );
      return 1;
    }

    // ─── Setup: launch browser, attach to SW + open popup bridge ───
    process.stdout.write('\nLaunching Chrome + opening popup bridge...\n');
    handles = await launchHarnessBrowser();
    const { browser, sw, page, popup, extensionId, downloadsDir } = handles;
    process.stdout.write(` extensionId: ${extensionId}\n`);
    process.stdout.write(` downloadsDir: ${downloadsDir}\n`);
    process.stdout.write(` popup: chrome-extension://${extensionId}/src/popup/index.html\n\n`);

    // Wire console buffers. The popup carries the chrome.* queries;
    // the SW handle is kept for diagnostic console capture (when the
    // SW is alive). Both feed buffers for failure dumps.
    const popupPage: Page = popup;
    popupPage.on('console', (msg) => {
      buffers.swLines.push(`[Popup:${msg.type()}] ${msg.text()}`);
    });
    sw.on('console', (msg) => {
      buffers.swLines.push(`[SW:${msg.type()}] ${msg.text()}`);
    });

    // Read the manifest version once for assertion 13.
    const manifest = await getManifest(popupPage);
    const expectedVersion = manifest.version;

    // ─── Wave 3 stubbed assertions (Tasks 4-7 will wire these) ──────
    const stubs: Array<{
      index: number;
      name: string;
      taskNumber: number;
    }> = [
      { index: 1, name: 'SW bootstrap → setIdleMode', taskNumber: 4 },
      { index: 2, name: 'toolbar onClicked-idle → badge REC + popup', taskNumber: 4 },
      { index: 3, name: 'offscreen displaySurface === monitor', taskNumber: 4 },
      { index: 4, name: 'toolbar onClicked-recording → popup, no new offscreen', taskNumber: 4 },
      { index: 5, name: 'SAVE_ARCHIVE → download fires + zip appears', taskNumber: 5 },
      { index: 6, name: 'BUG B canonical: simulateUserStop → badge OFF + no recovery notif', taskNumber: 5 },
      { index: 7, name: 'RECORDING_ERROR codec-unsupported → badge ERR + recovery notif', taskNumber: 5 },
      { index: 8, name: 'BUG A canonical: onStartup → notification creates cleanly', taskNumber: 6 },
      { index: 9, name: 'icon file sizes meet floors', taskNumber: 6 },
      { index: 10, name: 'manifest has notifications + 3 icons', taskNumber: 6 },
      { index: 11, name: '35s recording → segments.length >= 3', taskNumber: 7 },
      { index: 12, name: 'ffprobe on extracted webm exits 0', taskNumber: 7 },
      { index: 13, name: 'archive shape — video + meta.json version match', taskNumber: 7 },
    ];

    for (const s of stubs) {
      const rec = await runAssertion(
        s.index,
        s.name,
        buffers,
        notYetImplemented(s.taskNumber),
      );
      results.push(rec);
    }

    // Placeholders that suppress unused-symbol warnings. Tasks 4-7 will
    // use these imports + handles directly; referencing them here keeps
    // the type-check clean until then.
    void browser;
    void page;
    void popupPage;
    void expectedVersion;
    void waitForOffscreenTarget;
    void attachToOffscreen;
    void countOffscreenTargets;
    void waitFor;
    void getBadgeText;
    void getPopup;
    void getIsRecording;
    void getIconSize;
    void fireOnStartup;
    void sendSyntheticRecordingError;
    void getNotificationSnapshot;
    void keepalivePing;
    void getDisplaySurface;
    void simulateUserStop;
    void getSegmentCount;
    void assertArchiveShape;
    void extractEntryToFile;
    void assertMatch;
    void assertTrue;
    void assertGte;
    void waitForDownloadedZip;
    void mkdtempSync;
    void existsSync;
    void execSync;
    void tmpdir;
    void FFPROBE_BIN;
    void ICON_SIZE_FLOORS;

    return finalize(results);
  } catch (setupErr) {
    process.stderr.write(`\n*** Harness setup error: ${String(setupErr)}\n`);
    return finalize(results);
  } finally {
    if (handles !== null) {
      try {
        await handles.browser.close();
      } catch (closeErr) {
        process.stderr.write(`(non-fatal: browser close threw: ${String(closeErr)})\n`);
      }
    }
  }
}

/**
 * Print the final summary line + return the exit code.
 *
 * @param results - All assertion records collected during the run.
 * @returns 0 if all 14 passed, 1 otherwise.
 */
function finalize(results: ReadonlyArray<AssertionRecord>): number {
  const passCount = results.filter((r) => r.passed).length;
  const failCount = results.length - passCount;
  process.stdout.write('\n' + '='.repeat(72) + '\n');
  if (passCount === TOTAL_ASSERTIONS) {
    process.stdout.write(`UAT harness: ${passCount}/${TOTAL_ASSERTIONS} assertions passed\n`);
    return 0;
  }
  const firstFail = results.find((r) => !r.passed);
  process.stdout.write(
    `UAT harness: ${passCount}/${TOTAL_ASSERTIONS} assertions passed, ${failCount} failed`,
  );
  if (firstFail !== undefined) {
    process.stdout.write(` (first failure: A${firstFail.index} ${firstFail.name})`);
  }
  process.stdout.write('\n');
  return 1;
}

// Run + exit. Top-level await + explicit exit code so tsx returns
// the right status without leaving unhandled-promise spew on stderr.
const exitCode = await main();
process.exit(exitCode);