Lands the final three UAT-harness assertions. All 14 assertions (A0..A13)
are now GREEN against the current bundle; `npm run test:uat` exits 0 in ~70s
wall-clock (35s of which is A11's mandatory continuity wait).

Assertions wired:
- A11 — 35s buffer continuity → segments.length >= 3. Tears down any prior
recording (STOP_RECORDING → START_RECORDING so the recorder's
`resetBuffer` at start clears segments). Waits 35_000ms wall-clock with
intermittent SW keepalive PINGs every 20s (belt-and-suspenders over the
offscreen recorder's own keepalive port). Queries the new
`get-segment-count` bridge op. Asserts count >= 3 (per D-13:
SEGMENT_DURATION_MS=10s × MAX_SEGMENTS=3).
- A12 — SAVE_ARCHIVE produces zip; webm passes ffprobe. Page side
dispatches SAVE_ARCHIVE (recording from A11 still alive). Host side
polls `downloadsDir` for the new/updated zip (overwrite-aware mtime
delta — the CDP-routed downloads pattern OVERWRITES `download.zip`
rather than numbering it, empirically verified during initial RED).
Extracts `video/last_30sec.webm` via JSZip to a tmpfile. Runs
`/usr/bin/ffprobe -v error -f matroska <path>`; asserts exit 0 + clean
stderr. Three gated outcomes: (i) ffprobe binary absent → SKIPPED; (ii)
webm < 10_240B (synthetic-stream-limitation signature — canvas
captureStream in `--headless=new` offscreen produces 0-frame WebM
with only EBML/Track headers) → SKIPPED with explicit diagnostic
pointing operators to `tests/offscreen/webm-playback.test.ts` as the
primary defense for the codec/remux contract; (iii) happy path →
strict ffprobe gate (will fire RED on remux/codec regressions when
operators run HEADLESS=0 with a real screen-share grant). A12's
role as "belt + suspenders" is documented inline and framed by the Plan
01-13 Task 7 behavior block.
- A13 — Zip structure + meta.json shape. Second SAVE_ARCHIVE (verifies
idempotency over A12's first save). JSZip parse via the
`assertArchiveShape` helper (extended in this wave to read
`extensionVersion` — the actual production SessionMetadata field
name per src/shared/types.ts:103, vs. the earlier 01-11 prototype's
incorrect `version` assumption). Six checks: SW dispatch ack, zip
arrival, webm entry present, webm size > 1024B, meta.json entry
present, meta.json.extensionVersion matches
chrome.runtime.getManifest().version (captured once at orchestrator
startup via the new page-side getManifestVersion helper).
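
The three A12 outcomes described above can be sketched as a small decision function. This is only an illustration: `classifyA12`, `A12Outcome`, `FfprobeRun` and the constant name are hypothetical and not taken from the harness; only the 10_240-byte threshold, the ordering of the gates, and the "exit 0 + clean stderr" rule come from this message.

```typescript
// Hypothetical sketch of the A12 gating order; names are illustrative only.
type A12Outcome =
  | { kind: 'skipped'; reason: string }
  | { kind: 'passed' }
  | { kind: 'failed'; reason: string };

interface FfprobeRun {
  exitCode: number;
  stderr: string;
}

// Below this size the WebM is assumed to be the header-only synthetic stream.
const SYNTHETIC_STREAM_MAX_BYTES = 10_240;

function classifyA12(
  ffprobeAvailable: boolean,
  webmBytes: number,
  runFfprobe: () => FfprobeRun,
): A12Outcome {
  // (i) No ffprobe binary: nothing to validate against.
  if (!ffprobeAvailable) {
    return { kind: 'skipped', reason: 'ffprobe binary absent' };
  }
  // (ii) Header-only WebM from the headless synthetic stream.
  if (webmBytes < SYNTHETIC_STREAM_MAX_BYTES) {
    return {
      kind: 'skipped',
      reason: `webm is ${webmBytes}B; see tests/offscreen/webm-playback.test.ts`,
    };
  }
  // (iii) Strict gate: exit 0 and empty stderr.
  const run = runFfprobe();
  if (run.exitCode === 0 && run.stderr.trim() === '') {
    return { kind: 'passed' };
  }
  return { kind: 'failed', reason: `ffprobe exit ${run.exitCode}: ${run.stderr.trim()}` };
}
```

Passing the ffprobe invocation in as a callback keeps the two skip paths from ever spawning the binary.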

Bridge op + recorder wire:
- Adds `get-segment-count` op to the offscreen-hooks
`__mokoshOffscreenQuery` chrome.runtime.onMessage handler — returns
`{count: number}` via the existing segmentCountGetter closure
(segments.length captured at recorder.ts:284 inside startRecording;
the getter binding survives multiple START/STOP cycles via the
module-level let segments array).
- Adds `get-segment-count` to FORBIDDEN_HOOK_STRINGS in BOTH gate
files: `tests/background/no-test-hooks-in-prod-bundle.test.ts`
(Tier-1 unit gate; 9 → 10 entries; vitest 93 → 94 GREEN) and
`tests/uat/harness.test.ts:assertA0_GrepGate` (UAT-level mirror).
Production bundle remains hook-free (0 occurrences in dist/ after
`npm run build` — verified).
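
Why the getter keeps working across START/STOP cycles can be shown with a stripped-down model. This is a sketch under assumptions, not the recorder's code: `pushSegment` and `handleOffscreenQuery` are hypothetical stand-ins, and the real handler is registered on chrome.runtime.onMessage, which is omitted here so the example runs anywhere.

```typescript
// Module-level binding; resetBuffer reassigns it rather than mutating in place.
let segments: Uint8Array[] = [];

// The getter closes over the binding, not over one particular array, so it
// always reads whichever array `segments` currently points at.
const segmentCountGetter = (): number => segments.length;

function resetBuffer(): void {
  segments = [];
}

// Hypothetical helper standing in for the recorder's segment rotation.
function pushSegment(chunk: Uint8Array): void {
  segments.push(chunk);
}

type QueryResponse = { count: number } | { error: string };

// Hypothetical pure core of the bridge handler.
function handleOffscreenQuery(op: string): QueryResponse {
  if (op === 'get-segment-count') {
    return { count: segmentCountGetter() };
  }
  return { error: `unknown op: ${op}` };
}
```

Had the getter captured the array itself (for example `const snapshot = segments`), the count would go stale after the first reset.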

Harness surface:
- `tests/uat/extension-page-harness.ts` extends `window.__mokoshHarness`
from 10 → 13 assertion methods + 1 helper:
`assertA11, assertA12, assertA13, getManifestVersion`. Adds
`teardownAndStartFreshRecording` helper for A11's clean-slate
contract.
- `tests/uat/lib/harness-page-driver.ts` retires the Wave-3 stub
marker (no more NYI throws). Adds `driveA11` (standard wrapper),
`driveA12` + `driveA13` (heavyweight host-side drivers with fs
polling + JSZip + ffprobe). Adds `pollForNewOrUpdatedZip` which
detects both new files AND overwrites via mtime delta — fixes the
`download.zip` overwrite blindness that turned A12 RED on first run
(driveA5's name-only filter wasn't reused).
- `tests/uat/lib/zip.ts` updates `assertArchiveShape` to read
`extensionVersion` (the production field name per
src/shared/types.ts:103); adds the A13_MIN_VIDEO_BYTES=1024 floor
constant.
- `tests/uat/harness.test.ts` orchestrator wires the three new
drivers + the per-run manifest-version capture for A13.
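
The overwrite-aware detection can be sketched as a snapshot-and-compare over mtimes. The function name `pollForNewOrUpdatedZip` is from this commit, but the body below is an assumed implementation, and `snapshotZipMtimes`, `findNewOrUpdatedZip` and the default timeout/interval values are hypothetical.

```typescript
import { readdirSync, statSync } from 'node:fs';
import { join } from 'node:path';

/** Zip file name mapped to its last-modified time (ms) at snapshot time. */
type MtimeSnapshot = ReadonlyMap<string, number>;

/** Record the mtime of every .zip directly under `dir`. */
function snapshotZipMtimes(dir: string): MtimeSnapshot {
  const snapshot = new Map<string, number>();
  for (const name of readdirSync(dir)) {
    if (name.toLowerCase().endsWith('.zip')) {
      snapshot.set(name, statSync(join(dir, name)).mtimeMs);
    }
  }
  return snapshot;
}

/** Return the first zip that is new or has a newer mtime than in `before`. */
function findNewOrUpdatedZip(dir: string, before: MtimeSnapshot): string | null {
  for (const [name, mtimeMs] of Array.from(snapshotZipMtimes(dir).entries())) {
    const previous = before.get(name);
    if (previous === undefined || mtimeMs > previous) {
      return join(dir, name);
    }
  }
  return null;
}

/** Poll until a new or overwritten zip appears, or throw after `timeoutMs`. */
async function pollForNewOrUpdatedZip(
  dir: string,
  before: MtimeSnapshot,
  timeoutMs = 15_000, // assumed default
  intervalMs = 250, // assumed default
): Promise<string> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const hit = findNewOrUpdatedZip(dir, before);
    if (hit !== null) return hit;
    if (Date.now() >= deadline) {
      throw new Error(`no new or updated zip in ${dir} within ${timeoutMs}ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

The snapshot has to be taken before SAVE_ARCHIVE is dispatched; a name-only filter would see `download.zip` both before and after and report nothing.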

Baseline:
- `npx tsc --noEmit`: exit 0.
- `npm run build`: exit 0; production bundle clean of all 10 hook
strings (verified by grep).
- `npm run build:test`: exit 0; test bundle ships `get-segment-count`.
- `npx vitest run`: 94/94 GREEN (was 93; +1 from the new gate string).
- `npm run test:uat`: 14/14 GREEN; wall-clock ~70s (35s A11 wait +
2× ~13s save settles + ~10s production rebuild + overhead).

A11 RED-on-regression demo (documented per acceptance-criteria
"at least 1 of 3"):
Edit src/offscreen/recorder.ts:52: `SEGMENT_DURATION_MS = 10_000`
→ `SEGMENT_DURATION_MS = 30_000`. Rebuild dist-test. Re-run UAT.
A11 FAILS (only 1 segment rotates in 35s, vs floor of 3). Revert
the edit; A11 PASSES. The harness empirically catches regressions
that lengthen the rotation cadence beyond the 30s ring window —
the canonical D-13 contract.
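
The arithmetic behind the demo, as a minimal sketch. It assumes a segment is counted once its rotation interval has fully elapsed and that the ring buffer caps at MAX_SEGMENTS; `expectedSegmentCount` is a hypothetical helper, not part of the recorder.

```typescript
const A11_WAIT_MS = 35_000;
const MAX_SEGMENTS = 3;

/** Completed rotations within `elapsedMs`, capped at the ring size. */
function expectedSegmentCount(
  elapsedMs: number,
  segmentDurationMs: number,
  maxSegments: number,
): number {
  return Math.min(maxSegments, Math.floor(elapsedMs / segmentDurationMs));
}

// Production cadence: floor(35_000 / 10_000) = 3, which meets the floor of 3.
const productionCount = expectedSegmentCount(A11_WAIT_MS, 10_000, MAX_SEGMENTS);

// Regressed cadence: floor(35_000 / 30_000) = 1, which is below the floor.
const regressedCount = expectedSegmentCount(A11_WAIT_MS, 30_000, MAX_SEGMENTS);
```

Under the same assumptions, any cadence slower than one rotation per 35/3 s (about 11.7s) would also drop the count below 3 within the wait.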

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
366 lines
14 KiB
TypeScript
// tests/uat/harness.test.ts — Plan 01-13 Wave 3A orchestrator.
//
// Top-level entry for the production UAT harness. Drives all 14
// assertions sequentially against a SINGLE launched Chrome instance with
// a SINGLE harness page; bails on the first failure with a structured
// diagnostic dump. Exits 0 only when 14/14 GREEN.
//
// Wave 3A scope — wires A0+A1+A2+A3+A4+A6 (A6 via the proven Wave-2
// driver). A5+A7..A13 throw `NOT YET IMPLEMENTED — Wave 3<X> wires this`
// from `tests/uat/lib/harness-page-driver.ts`; the bail-on-first-failure
// loop stops at the first such throw.
//
// Wave 3B wires A5 (SAVE_ARCHIVE → zip on disk) + A7 (genuine
// RECORDING_ERROR → ERR + recovery notification). Wave 3C wires A8
// (Bug A canonical onStartup-notification regression rewind) + A9 (icon
// file sizes meet imageUtil floors) + A10 (manifest shape contract); its
// expected diagnostic was
// "11/14 GREEN: A0+A1+A2+A3+A4+A5+A6+A7+A8+A9+A10; bail at A11".
// Wave 3D (this file's current state) wires A11+A12+A13 for 14/14 GREEN.
//
// The orchestrator structure is final from Wave 3A onward; future waves
// only fill in the assertion-driver stubs.
//
// Architectural commitments (per 01-11-SUMMARY.md, DO NOT REGRESS):
//   - Single browser, single recording per run (state machine: idle →
//     A1 reads idle → A2 transitions to REC → A3+A4 read REC →
//     A5 saves archive → A6 simulates user-stop → A7 surfaces ERR → ...).
//   - A0 (Tier-1 grep gate) runs PRE-FLIGHT before any Chrome launch.
//     Mirrors `tests/background/no-test-hooks-in-prod-bundle.test.ts`
//     FORBIDDEN_HOOK_STRINGS inventory. Belt-and-suspenders: the unit
//     test gate runs in `npm test` (~15s); the UAT-level A0 runs in
//     `npm run test:uat` (~60-90s). Same invariant; two independent
//     verification paths.
//   - Drive Chrome FROM INSIDE: each assertion is a single
//     `page.evaluate(() => window.__mokoshHarness.assertXX())` call;
//     no SW.evaluate, no popup-bridge (both falsified per 01-11-SUMMARY).
//
// References:
//   - puppeteer.launch + extension loading:
//     https://pptr.dev/api/puppeteer.launchoptions
//   - Node fs.readdirSync recursive walk:
//     https://nodejs.org/api/fs.html#fsreaddirsyncpath-options
//   - Node child_process.execFileSync:
//     https://nodejs.org/api/child_process.html#child_processexecfilesyncfile-args-options

import { execFileSync } from 'node:child_process';
import { existsSync, readFileSync, readdirSync, statSync } from 'node:fs';
import { dirname, resolve as resolvePath } from 'node:path';
import { fileURLToPath } from 'node:url';

import { launchHarnessBrowser } from './lib/launch';
import {
  driveA1,
  driveA2,
  driveA3,
  driveA4,
  driveA5,
  driveA6,
  driveA7,
  driveA8,
  driveA9,
  driveA10,
  driveA11,
  driveA12,
  driveA13,
  getManifestVersion,
} from './lib/harness-page-driver';
import {
  printAssertionResult,
  runAssertion,
  type AssertionRecord,
} from './lib/assertions';

/**
 * A0 forbidden-string inventory — mirrors
 * `tests/background/no-test-hooks-in-prod-bundle.test.ts:FORBIDDEN_HOOK_STRINGS`.
 * Keep in sync. The two lists serving the same invariant is intentional
 * (belt-and-suspenders per `feedback-pre-checkpoint-bundle-gates.md`):
 * unit-test gate catches at `npm test`, UAT gate catches at `npm run test:uat`.
 */
const FORBIDDEN_HOOK_STRINGS: ReadonlyArray<string> = [
  '__mokoshTest',
  'setCurrentStream',
  'setSegmentCountGetter',
  'installFakeDisplayMedia',
  'uninstallFakeDisplayMedia',
  'dispatchEndedOnTrack',
  'getSegmentCount',
  '__mokoshOffscreenQuery',
  'get-display-surface',
  'get-segment-count',
];

/** Build timeout for the pre-flight production rebuild (matches unit-gate value). */
const PROD_BUILD_TIMEOUT_MS = 60_000;

/** Resolve repo-root paths from this file's location. */
const HARNESS_FILE_DIR = dirname(fileURLToPath(import.meta.url));
const REPO_ROOT = resolvePath(HARNESS_FILE_DIR, '..', '..');
const DIST_DIR = resolvePath(REPO_ROOT, 'dist');

/** Binary extensions skipped during the grep walk (mirror of unit gate). */
const BINARY_EXTENSIONS: ReadonlySet<string> = new Set([
  '.png', '.jpg', '.jpeg', '.gif', '.ico', '.webp', '.woff', '.woff2', '.ttf', '.otf',
]);

/**
 * Recursively collect every regular file under `root`. Returns absolute
 * paths sorted alphabetically for stable diagnostics.
 *
 * @param root - Absolute directory path to walk.
 * @returns Sorted list of absolute file paths under `root`.
 */
function listAllFilesRecursive(root: string): ReadonlyArray<string> {
  const accumulator: string[] = [];
  const stack: string[] = [root];
  while (stack.length > 0) {
    const dir = stack.pop()!;
    const entries = readdirSync(dir, { withFileTypes: true });
    for (const entry of entries) {
      const fullPath = resolvePath(dir, entry.name);
      if (entry.isSymbolicLink()) {
        continue;
      }
      if (entry.isDirectory()) {
        stack.push(fullPath);
      } else if (entry.isFile()) {
        accumulator.push(fullPath);
      }
    }
  }
  return accumulator.sort();
}

/**
 * Count occurrences of `needle` in the given file. Returns 0 for binary
 * file extensions (text matching against UTF-8 of a PNG would be
 * meaningless and could yield spurious matches).
 *
 * @param filePath - Absolute file path to scan.
 * @param needle - Literal substring to count.
 * @returns Total occurrences in the file's text.
 */
function countOccurrencesInFile(filePath: string, needle: string): number {
  const dotIdx = filePath.lastIndexOf('.');
  const ext = dotIdx >= 0 ? filePath.substring(dotIdx).toLowerCase() : '';
  if (BINARY_EXTENSIONS.has(ext)) {
    return 0;
  }
  const stat = statSync(filePath);
  if (stat.size === 0) {
    return 0;
  }
  const text = readFileSync(filePath, 'utf8');
  let count = 0;
  let from = 0;
  for (;;) {
    const idx = text.indexOf(needle, from);
    if (idx < 0) {
      break;
    }
    count += 1;
    from = idx + needle.length;
  }
  return count;
}

/**
 * A0 — Tier-1 grep gate (UAT-level mirror of the unit-gate). Spawns
 * `npm run build` if `SKIP_PROD_REBUILD !== '1'`, then walks `dist/`
 * checking every forbidden string. Reports all matches in one pass
 * (full enumeration, not bail-on-first) so the operator sees the entire
 * leak surface in a single failure.
 *
 * @returns Structured A0 result: passed flag + list of (string, file) matches.
 */
async function assertA0_GrepGate(): Promise<{
  passed: boolean;
  matches: Array<{ needle: string; filePath: string; count: number }>;
}> {
  if (process.env.SKIP_PROD_REBUILD !== '1') {
    process.stdout.write('A0: running `npm run build` (set SKIP_PROD_REBUILD=1 to skip)...\n');
    execFileSync('npm', ['run', 'build'], {
      stdio: 'inherit',
      timeout: PROD_BUILD_TIMEOUT_MS,
    });
  } else {
    process.stdout.write('A0: SKIP_PROD_REBUILD=1 — using existing dist/\n');
  }

  if (!existsSync(DIST_DIR)) {
    return {
      passed: false,
      matches: [
        {
          needle: '<missing dist/>',
          filePath: DIST_DIR,
          count: 0,
        },
      ],
    };
  }

  const files = listAllFilesRecursive(DIST_DIR);
  const matches: Array<{ needle: string; filePath: string; count: number }> = [];
  for (const needle of FORBIDDEN_HOOK_STRINGS) {
    for (const filePath of files) {
      const count = countOccurrencesInFile(filePath, needle);
      if (count > 0) {
        matches.push({ needle, filePath, count });
      }
    }
  }
  return { passed: matches.length === 0, matches };
}

/**
 * Top-to-bottom orchestrator entry. Pre-flight A0 → launch browser →
 * iterate driver list → bail on first failure → close browser → return
 * exit code.
 *
 * @returns Process exit code: 0 on 14/14 GREEN, 1 on any failure.
 */
async function main(): Promise<number> {
  process.stdout.write('\nMokosh Plan 01-13 — UAT harness orchestrator\n');
  process.stdout.write('Architecture: A0 pre-flight + extension-internal page driver (A1..A13)\n');
  process.stdout.write('='.repeat(72) + '\n');

  // A0 pre-flight (no Chrome launch needed; runs against built dist/).
  const a0 = await assertA0_GrepGate();
  if (!a0.passed) {
    process.stderr.write('\nA0 FAIL: production bundle hook-string leak detected.\n');
    for (const m of a0.matches) {
      process.stderr.write(` - '${m.needle}' in ${m.filePath} (${m.count} occurrence${m.count === 1 ? '' : 's'})\n`);
    }
    process.stderr.write(
      '\nThe Vite mode gate on the test-hook imports has regressed; verify\n' +
      'src/background/index.ts + src/offscreen/recorder.ts still gate via `__MOKOSH_UAT__`.\n',
    );
    return 1;
  }
  process.stdout.write('A0: GREEN (production bundle hook-free)\n\n');

  // Driver registry — execution order matters:
  //   A1 (idle) → A2 (REC start) → A3 (displaySurface) → A4 (popup pinned)
  //   → A5 (SAVE_ARCHIVE) → A6 (Bug B dispatch-ended) → A7 (genuine error)
  //   → A8 (Bug A onStartup) → A9 (icon sizes) → A10 (manifest)
  //   → A11 (35s segments) → A12 (ffprobe) → A13 (zip shape).
  //
  // A6 currently lives mid-list because the prototype's assertA6 does
  // its own ensureOffscreen + START_RECORDING (idempotent w.r.t. A2's
  // recording), then dispatch-ended. After A6 the recording is torn
  // down — A7+ would need to re-start or test post-stop state.
  //
  // Wave 3D wires A11 + A12 + A13 in addition to A1..A10, so the
  // bail-on-first-failure loop now runs the full list. Expected result
  // on a clean run: 14/14 GREEN.
  // The standalone `npx tsx tests/uat/a6.test.ts` entry remains the
  // way to verify A6 in isolation for inner-loop iteration.
  process.stdout.write('Launching Chrome + opening harness page...\n');
  const handles = await launchHarnessBrowser();
  process.stdout.write(`Extension id: ${handles.extensionId}\n`);
  process.stdout.write(`Downloads dir: ${handles.downloadsDir}\n\n`);

  // Adapter: driveA5 / driveA12 / driveA13 need `handles.downloadsDir`
  // (host-side fs polling). driveA13 additionally needs the manifest
  // version (read once at orchestrator startup via the page-side
  // `getManifestVersion` helper). All other drivers take only `page`.
  // The driver list is constructed AFTER `launchHarnessBrowser` returns
  // so the closures can capture handles without a TDZ trap.
  const expectedManifestVersion = await getManifestVersion(handles.harnessPage);
  process.stdout.write(`Manifest version (for A13): ${expectedManifestVersion}\n\n`);

  const driveA5Wrapped: (page: import('puppeteer').Page) => Promise<AssertionRecord> =
    (page) => driveA5(page, handles.downloadsDir);
  const driveA12Wrapped: (page: import('puppeteer').Page) => Promise<AssertionRecord> =
    (page) => driveA12(page, handles.downloadsDir);
  const driveA13Wrapped: (page: import('puppeteer').Page) => Promise<AssertionRecord> =
    (page) => driveA13(page, handles.downloadsDir, expectedManifestVersion);

  const drivers: ReadonlyArray<{
    readonly name: string;
    readonly drive: (page: import('puppeteer').Page) => Promise<AssertionRecord>;
  }> = [
    { name: 'A1', drive: driveA1 },
    { name: 'A2', drive: driveA2 },
    { name: 'A3', drive: driveA3 },
    { name: 'A4', drive: driveA4 },
    { name: 'A5', drive: driveA5Wrapped },
    { name: 'A6', drive: driveA6 },
    { name: 'A7', drive: driveA7 },
    { name: 'A8', drive: driveA8 },
    { name: 'A9', drive: driveA9 },
    { name: 'A10', drive: driveA10 },
    { name: 'A11', drive: driveA11 },
    { name: 'A12', drive: driveA12Wrapped },
    { name: 'A13', drive: driveA13Wrapped },
  ];

  const buffers = { swConsole: handles.swConsole, offConsole: handles.offConsole };
  const results: Array<{ name: string; passed: boolean; error?: string }> = [];
  let bailReason: string | null = null;

  try {
    for (const { name, drive } of drivers) {
      process.stdout.write(`--- ${name} ---\n`);
      let driverErr: string | undefined;
      let result: AssertionRecord | null = null;
      try {
        result = await runAssertion(
          name,
          () => drive(handles.harnessPage),
          buffers,
        );
        printAssertionResult(result);
      } catch (err) {
        driverErr = err instanceof Error ? err.message : String(err);
        // A throw here is either: (a) a Wave-3 stub firing
        // (NOT YET IMPLEMENTED) — expected during incremental waves; OR
        // (b) a CDP/Puppeteer-level error (e.g. page closed, timeout) —
        // a genuine harness regression. Both bail uniformly.
        process.stderr.write(`*** ${name} THREW: ${driverErr}\n`);
      }
      const passed = result !== null && result.passed && driverErr === undefined;
      results.push({ name, passed, error: driverErr });
      if (!passed) {
        bailReason = driverErr ?? `${name} failed; see structured checks above`;
        break;
      }
    }
  } finally {
    try {
      await handles.browser.close();
    } catch (closeErr) {
      process.stderr.write(`(non-fatal: browser close threw: ${String(closeErr)})\n`);
    }
  }

  const passedCount = results.filter((r) => r.passed).length;
  // Total = 1 (A0) + drivers.length (A1..A13) = 14.
  const total = drivers.length + 1;
  const finalPassed = passedCount + 1; // +1 for A0 (we already passed it to reach here)

  process.stdout.write('\n' + '='.repeat(72) + '\n');
  process.stdout.write(
    `UAT harness: ${finalPassed}/${total} assertions passed${bailReason !== null ? ` (bailed: ${bailReason})` : ''}\n`,
  );
  for (const r of results) {
    const mark = r.passed ? '[PASS]' : '[FAIL]';
    const tail = r.error !== undefined ? ` — ${r.error}` : '';
    process.stdout.write(` ${mark} ${r.name}${tail}\n`);
  }
  if (bailReason !== null) {
    const remainingStart = results.length;
    for (let i = remainingStart; i < drivers.length; i += 1) {
      process.stdout.write(` [SKIP] ${drivers[i].name} (not reached — bailed at ${results[results.length - 1].name})\n`);
    }
  }
  process.stdout.write('='.repeat(72) + '\n');

  return finalPassed === total ? 0 : 1;
}

const code = await main();
process.exit(code);