feat(01-13): wave-2 — launchHarnessBrowser + assertions + harness-page-driver scaffolding

Build out the Approach-B harness driver utilities atop the Wave 1
production paths. Three new files form the shared scaffold that
Wave 3's 12 assertion drivers (A1-A5, A7-A13) and the eventual
orchestrator (`tests/uat/harness.test.ts`) will all consume. The
standalone A6 driver (`tests/uat/a6.test.ts`) is rewritten to use
the new lib — behavior-preserving: A6 still PASSES 5/5 in ~7s.

New files:

  - tests/uat/lib/launch.ts (~320 LoC)
      `launchHarnessBrowser({ headless?, downloadsDir? }) → HarnessHandles`
      Extracts the Chrome-launch + victim-page + harness-page + console-
      attach pattern from a6.test.ts into a single reusable helper.
      NEW vs prototype: CDP `Browser.setDownloadBehavior` wires
      Chrome's download path to a per-run `mkdtempSync` tmp dir so A5
      (SAVE_ARCHIVE) can poll a known location without colliding with
      the operator's real downloads. Architectural commitments
      enforced (per 01-11-SUMMARY): no `--auto-select-desktop-capture-
      source` flag; victim about:blank brought to front for the
      production `chrome.tabs.query({active:true})` workaround; SW
      console attach best-effort with bounded poll; offscreen console
      attach opportunistic via `targetcreated` listener (offscreen
      target appears later, when the harness page calls
      chrome.offscreen.createDocument).
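
The download-path wiring described above could look roughly like the
sketch below. `Browser.setDownloadBehavior` and its `behavior`,
`downloadPath` and `eventsEnabled` parameters are part of the Chrome
DevTools Protocol; the `CdpSender` interface, the `wireDownloadsDir`
name and the `uat-downloads-` prefix are hypothetical stand-ins, not
the contents of launch.ts. In a real run the sender would wrap a
Puppeteer CDP session.

```typescript
import { mkdtempSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical minimal view of a CDP session: anything with `send`.
interface CdpSender {
  send(method: string, params: Record<string, unknown>): Promise<unknown>;
}

// Create a fresh per-run directory (unless the caller supplies one) and
// point Chrome's downloads at it. Returns the directory so the host side
// can poll it later without touching the operator's real downloads.
async function wireDownloadsDir(
  client: CdpSender,
  downloadsDir?: string,
): Promise<string> {
  const dir = downloadsDir ?? mkdtempSync(join(tmpdir(), 'uat-downloads-'));
  await client.send('Browser.setDownloadBehavior', {
    behavior: 'allow',
    downloadPath: dir,
    eventsEnabled: true,
  });
  return dir;
}
```

Returning the directory keeps the tmp-dir decision in one place, so a
caller that polls for the archive never needs to know how the path was
chosen.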

  - tests/uat/lib/assertions.ts (~210 LoC)
      Host-side assertion primitives:
        * `AssertionRecord`, `CheckRecord`, `ConsoleBuffers` types —
          mirror the page-side shape returned by `assertA*` methods.
        * `runAssertion(name, fn, buffers)` — try/catch wrapper that
          dumps the SW + offscreen console tails (last 100 lines each)
          to stderr on failure, then returns `{passed: false, error}`
          if `fn` throws.
        * `printAssertionResult(result)` — single source of truth for
          the formatted result print. Extracted from the inline
          `printResult` previously in the prototype's a6.test.ts so
          Wave 3's orchestrator can reuse it across all 14 assertions.
        * `assertEqual / assertGte / assertMatch / assertTrue` —
          structured failure messages atop node:assert/strict.
        * `waitFor(probe, predicate, timeoutMs, description)` — host-
          side polling primitive; mirrors the page-side waitFor
          semantics verbatim (they can't share a module: page-side is
          bundled into the harness HTML, host-side runs in Node).
      NO chrome.* helpers here — all chrome.* work happens inside the
      extension-internal harness page. This module is host-side ONLY
      by construction (no chrome global in Node anyway).
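
A comparison helper in the style of `assertGte` attaches the observed
and reference values to the thrown error, so a host-side wrapper can
report them without parsing the message. A minimal sketch, assuming
nothing beyond `node:assert/strict`; `describeFailure` and `tryGte` are
hypothetical helpers for illustration, not part of assertions.ts.

```typescript
import * as assert from 'node:assert/strict';

// Same shape as the `assertGte` primitive described above.
function assertGte(actual: number, expected: number, message: string): void {
  if (actual < expected) {
    throw new assert.AssertionError({
      message: `${message}: expected ${actual} >= ${expected}`,
      actual,
      expected,
      operator: '>=',
    });
  }
}

// Hypothetical helper: turn any thrown value into a one-line summary,
// using the structured fields when the error is an AssertionError.
function describeFailure(err: unknown): string {
  if (err instanceof assert.AssertionError) {
    return `${err.operator} failed (actual=${String(err.actual)}, expected=${String(err.expected)}): ${err.message}`;
  }
  return err instanceof Error ? err.message : String(err);
}

// Hypothetical usage: returns 'ok' or the failure summary.
function tryGte(actual: number, expected: number): string {
  try {
    assertGte(actual, expected, 'chunk count');
    return 'ok';
  } catch (err) {
    return describeFailure(err);
  }
}
```

Here `tryGte(5, 3)` returns `'ok'`, while `tryGte(3, 5)` returns a
summary that carries the operator and both values.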

  - tests/uat/lib/harness-page-driver.ts (~170 LoC)
      One driver wrapper per assertion (A1..A13). Each wraps a single
      `page.evaluate(() => window.__mokoshHarness.assertXX())`.
      Centralizing this means adding/renaming an assertion = two-file
      edit (extension-page-harness.ts impl + this file) instead of
      touching every test-file caller.
      Wave 2 wires `driveA6` (proven from c647f61). The 12 Wave-3
      drivers (driveA1..A5, A7..A13) are stubbed as
      `throw new Error('NOT YET IMPLEMENTED — Wave 3<X> wires driveXX')`
      so the future orchestrator's `for (const drive of drivers)` loop
      fails cleanly on the first unimplemented one (bail-on-first-
      failure semantics). The `AssertionWithBytes` type is declared
      for A5/A12/A13 which return `bytesBase64` payloads (zip / webm
      bytes that the host side processes after the page-side
      assertion completes).
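
The driver pattern described above can be sketched as follows. The
`PageLike` interface is a hypothetical one-method view of a Puppeteer
page, the record types are trimmed copies, and `decodeBytes` is an
illustrative helper; none of this is the actual harness-page-driver.ts.
The harness surface is reached through `globalThis` here so the snippet
also type-checks outside a DOM environment.

```typescript
// Trimmed stand-in for the record returned by page-side assertions.
interface AssertionRecord {
  readonly passed: boolean;
  readonly name: string;
  readonly diagnostics: ReadonlyArray<string>;
  readonly error?: string;
}

// Assumed shape: the record plus a base64 payload (zip / webm bytes).
interface AssertionWithBytes extends AssertionRecord {
  readonly bytesBase64: string;
}

// Hypothetical minimal view of a Puppeteer page.
interface PageLike {
  evaluate<T>(fn: () => T | Promise<T>): Promise<T>;
}

type HarnessGlobal = {
  __mokoshHarness: { assertA6(): Promise<AssertionRecord> };
};

// Wired driver: a single evaluate call into the harness surface.
function driveA6(page: PageLike): Promise<AssertionRecord> {
  return page.evaluate<AssertionRecord>(() =>
    (globalThis as unknown as HarnessGlobal).__mokoshHarness.assertA6(),
  );
}

// Stubbed driver: rejects until its page-side method exists.
async function driveA1(_page: PageLike): Promise<AssertionRecord> {
  throw new Error('NOT YET IMPLEMENTED: driveA1');
}

// Host-side post-processing of a byte-carrying result.
function decodeBytes(result: AssertionWithBytes): Buffer {
  return Buffer.from(result.bytesBase64, 'base64');
}
```

Because every caller goes through a `driveXX` function, the string
`assertA6` appears in exactly one host-side file.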

Rewrite — `tests/uat/a6.test.ts`:
  - Drops ~80 LoC of Chrome-launch + console-attach + result-print
    plumbing now living in lib/launch.ts + lib/assertions.ts.
  - Now ~70 LoC total — pure orchestration of
    launchHarnessBrowser → runAssertion(driveA6) → printAssertionResult
    → browser.close() → exit code.
  - Behavior-preserving: A6 still 5/5 GREEN with the same diagnostic
    output (SETUP, A6.1-A6.4) and the same ~7s end-to-end runtime.
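
The launch, run, print, close, exit-code pipeline above could be
sketched like this. The `StandaloneDeps` interface and `runStandalone`
function are hypothetical; the real a6.test.ts calls the lib functions
directly, and whether it uses try/finally for cleanup is an assumption
of this sketch.

```typescript
// Trimmed stand-in for the page-side result.
interface ResultLike {
  readonly passed: boolean;
  readonly name: string;
}

// Hypothetical seams standing in for launchHarnessBrowser,
// runAssertion(driveA6) and printAssertionResult.
interface StandaloneDeps {
  launch(): Promise<{ close(): Promise<void> }>;
  run(): Promise<ResultLike>;
  print(result: ResultLike): void;
}

// Returns the process exit code: 0 on PASS, 1 on FAIL.
// The browser is closed on every path, including a thrown error.
async function runStandalone(deps: StandaloneDeps): Promise<number> {
  const browser = await deps.launch();
  try {
    const result = await deps.run();
    deps.print(result);
    return result.passed ? 0 : 1;
  } finally {
    await browser.close();
  }
}
```

An entry script would then assign the returned value to
`process.exitCode` rather than calling `process.exit` mid-flight, so
pending stdout writes are flushed.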

Verification (all GREEN):
  - `npx tsc --noEmit` — exit 0 (root + tests/uat/tsconfig.json).
  - `npx tsx tests/uat/a6.test.ts` — exits 0 with "PASS"; 5 checks
    GREEN (SETUP, A6.1, A6.2, A6.3, A6.4). End-to-end runtime ~7s
    headless on this workstation.
  - `npm run build` — exit 0; Tier-1 grep gate GREEN (production
    bundle contains zero hook strings AND zero lib symbol names —
    the new lib files are test-only and not bundled into dist/).
  - `npm run build:test` — exit 0; dist-test/ still emits the
    extension-page-harness.html harness (lib files are host-side,
    not rollup inputs).
  - `npx vitest run` — 92/92 GREEN.

Wave 3 ready: harness-page-driver.ts has driveA1..A5/A7..A13 stubs
in place; extending requires only:
  1. Add `assertAXX` method to window.__mokoshHarness in
     tests/uat/extension-page-harness.ts.
  2. Replace the corresponding stub body in
     tests/uat/lib/harness-page-driver.ts with the page.evaluate
     wrapper.
  3. (Wave 3A) Create tests/uat/harness.test.ts orchestrator that
     iterates over [A0 grep gate, driveA1..A13] with bail-on-fail.
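
Step 3's bail-on-fail loop might look like the sketch below. The
`runAll` function and the `[name, driver]` tuple layout are assumptions
for illustration; the orchestrator does not exist yet in this commit.

```typescript
// Trimmed stand-in for the page-side result.
interface AssertionRecord {
  readonly passed: boolean;
  readonly name: string;
  readonly error?: string;
}

type Driver = () => Promise<AssertionRecord>;

// Run drivers in order and stop at the first failure. A thrown error
// (for example from a not-yet-implemented stub) counts as a failure.
async function runAll(
  drivers: ReadonlyArray<readonly [string, Driver]>,
): Promise<AssertionRecord[]> {
  const results: AssertionRecord[] = [];
  for (const [name, drive] of drivers) {
    let result: AssertionRecord;
    try {
      result = await drive();
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      result = { passed: false, name, error };
    }
    results.push(result);
    if (!result.passed) {
      break; // bail-on-first-failure: later drivers never run
    }
  }
  return results;
}
```

With this shape, the length of the returned array tells the caller how
far the run got before bailing.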

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:21:11 +02:00
parent eb2258a880
commit eb64521321
4 changed files with 871 additions and 283 deletions

tests/uat/lib/assertions.ts

@@ -0,0 +1,263 @@
// tests/uat/lib/assertions.ts — Plan 01-13 Wave 2.
//
// Host-side assertion primitives. Thin wrappers around node:assert/strict
// with structured failure messages, plus diagnostic-dump wrappers that
// print the SW + offscreen console buffers on failure.
//
// IMPORTANT — NO chrome.* helpers here. All chrome.* work happens
// inside the extension-internal harness page (see
// tests/uat/extension-page-harness.ts and its `window.__mokoshHarness`
// surface). This module is host-side ONLY — it runs in the Node
// process that drives Puppeteer. Calling chrome.* from here would
// fail (no chrome global in Node) — by-construction, not by convention.
//
// References:
// - node:assert/strict (deep strict equality):
// https://nodejs.org/api/assert.html#strict-assertion-mode
import * as assert from 'node:assert/strict';
/**
* One assertion-internal check record — populated by the harness page's
* `assertA*` methods. Each AssertionRecord carries 1..N CheckRecords
* which collectively determine whether the AssertionRecord PASSES.
*/
export interface CheckRecord {
  readonly name: string;
  readonly expected: unknown;
  readonly actual: unknown;
  readonly passed: boolean;
}
/**
* Structured result returned by every page-side `assertA*` method.
* Mirrors the shape used by the proven prototype (c647f61) so the
* host-side `runAssertion` + `printAssertionResult` can consume any
* assertion uniformly.
*/
export interface AssertionRecord {
  readonly passed: boolean;
  readonly name: string;
  readonly checks: ReadonlyArray<CheckRecord>;
  readonly diagnostics: ReadonlyArray<string>;
  readonly error?: string;
}
/**
* Accumulating console buffers from `launchHarnessBrowser`. Passed
* into `runAssertion` so a failing assertion can dump the SW + offscreen
* logs to stderr alongside the structured CheckRecords. The buffers
* are MUTABLE arrays owned by `launch.ts`; readers MUST NOT mutate.
*/
export interface ConsoleBuffers {
  readonly swConsole: ReadonlyArray<string>;
  readonly offConsole: ReadonlyArray<string>;
}
/**
* How many trailing lines of each console buffer to dump on a failure.
* Bounded so a long-running test with thousands of lines does not
* overwhelm stderr; the cap is generous enough to capture the relevant
* preamble + the actual failure trigger.
*/
const CONSOLE_DUMP_TAIL_LINES = 100;
/**
* Wrap a single assertion attempt with try/catch + diagnostic dump on
* failure. The `fn` is the page-side call (typically a `driveA*`
* wrapper from `harness-page-driver.ts`); a thrown error becomes an
* AssertionRecord with `passed: false` + the error message in `.error`.
*
* On failure, dumps the last `CONSOLE_DUMP_TAIL_LINES` of each console
* buffer to stderr — sized to fit the typical assertion timeline
* (several seconds of SW + offscreen logs) without spamming.
*
* @param name - Assertion name (used only for the failure preamble).
* @param fn - Async function returning the page-side AssertionRecord.
* @param buffers - Console buffers from `launchHarnessBrowser`.
* @returns The page-side AssertionRecord (with passed=false on throw).
*/
export async function runAssertion(
  name: string,
  fn: () => Promise<AssertionRecord>,
  buffers: ConsoleBuffers,
): Promise<AssertionRecord> {
  try {
    const result = await fn();
    if (!result.passed) {
      dumpConsoleTail(name, buffers);
    }
    return result;
  } catch (err) {
    const errMsg = err instanceof Error ? err.message : String(err);
    dumpConsoleTail(name, buffers);
    return {
      passed: false,
      name,
      checks: [],
      diagnostics: [`runAssertion caught: ${errMsg}`],
      error: errMsg,
    };
  }
}
/**
* Dump the tail of each console buffer to stderr — used by
* `runAssertion` on any failure path. Each line is already pre-tagged
* (`[sw:log] ...` / `[off:log] ...`) by the listeners in `launch.ts`.
*
* @param assertionName - Name to prefix the dump header.
* @param buffers - Console buffers to dump.
*/
function dumpConsoleTail(
  assertionName: string,
  buffers: ConsoleBuffers,
): void {
  process.stderr.write(
    `\n--- console dump for assertion '${assertionName}' (tail ${CONSOLE_DUMP_TAIL_LINES} lines per buffer) ---\n`,
  );
  const swTail = buffers.swConsole.slice(-CONSOLE_DUMP_TAIL_LINES);
  const offTail = buffers.offConsole.slice(-CONSOLE_DUMP_TAIL_LINES);
  for (const line of swTail) {
    process.stderr.write(line + '\n');
  }
  for (const line of offTail) {
    process.stderr.write(line + '\n');
  }
  process.stderr.write(
    `--- end console dump for '${assertionName}' ---\n\n`,
  );
}
/**
* Pretty-print an AssertionRecord to stdout. Used by both the
* orchestrator (`harness.test.ts` in Wave 3A) and the standalone A6
* entry (`a6.test.ts`). Single source of formatting truth.
*
* @param result - The structured result from a page-side assertion.
*/
export function printAssertionResult(result: AssertionRecord): void {
  process.stdout.write('\n');
  process.stdout.write('='.repeat(72) + '\n');
  process.stdout.write(`${result.name}: ${result.passed ? 'PASS' : 'FAIL'}\n`);
  if (result.error !== undefined) {
    process.stdout.write(`Top-level error: ${result.error}\n`);
  }
  process.stdout.write('\nChecks:\n');
  for (const check of result.checks) {
    const mark = check.passed ? '[PASS]' : '[FAIL]';
    process.stdout.write(` ${mark} ${check.name}\n`);
    process.stdout.write(` expected: ${JSON.stringify(check.expected)}\n`);
    process.stdout.write(` actual: ${JSON.stringify(check.actual)}\n`);
  }
  process.stdout.write('\nDiagnostics:\n');
  for (const diag of result.diagnostics) {
    process.stdout.write(` - ${diag}\n`);
  }
  process.stdout.write('='.repeat(72) + '\n');
}
/**
* Thin wrapper around `assert.deepStrictEqual`. The `message` is passed
* straight through and becomes the AssertionError message in place of
* Node's generated diff. Throws AssertionError on mismatch (caller
* catches in `runAssertion`).
*
* @param actual - Observed value.
* @param expected - Reference value.
* @param message - Human-readable context (e.g. assertion name).
*/
export function assertEqual(actual: unknown, expected: unknown, message: string): void {
  assert.deepStrictEqual(actual, expected, message);
}
/**
* Assert `actual >= expected`. Throws AssertionError on failure with
* a structured message including both values.
*
* @param actual - Observed numeric value.
* @param expected - Lower bound (inclusive).
* @param message - Human-readable context.
*/
export function assertGte(actual: number, expected: number, message: string): void {
  if (actual < expected) {
    throw new assert.AssertionError({
      message: `${message} — expected ${actual} >= ${expected}`,
      actual,
      expected,
      operator: '>=',
    });
  }
}
/**
* Assert `actual` matches the regex. Throws AssertionError on failure.
*
* @param actual - Observed string.
* @param regex - Pattern to test.
* @param message - Human-readable context.
*/
export function assertMatch(actual: string, regex: RegExp, message: string): void {
  if (!regex.test(actual)) {
    throw new assert.AssertionError({
      message: `${message} — expected ${JSON.stringify(actual)} to match ${regex}`,
      actual,
      expected: regex,
      operator: 'match',
    });
  }
}
/**
* Assert `cond` is exactly `true`. Throws AssertionError otherwise.
*
* @param cond - Boolean to assert true.
* @param message - Human-readable context.
*/
export function assertTrue(cond: boolean, message: string): void {
  assert.strictEqual(cond, true, message);
}
/**
* Default polling interval for `waitFor` — matches the prototype's
* 100ms cadence (good tradeoff between CPU and detection latency).
*/
const WAIT_FOR_POLL_INTERVAL_MS = 100;
/**
* Poll an async probe until it satisfies the predicate or the timeout
* elapses. Mirrors the prototype's host-side polling primitive
* (verbatim semantics, host-side scope).
*
* IMPORTANT: this is the HOST-SIDE waitFor. The HARNESS-PAGE-SIDE
* waitFor (inside `tests/uat/extension-page-harness.ts`) is a separate
* implementation with identical semantics — the page-side runs in the
* browser isolate; the host-side runs in Node. They cannot share a
* module because one is bundled into the HTML harness and the other
* runs natively.
*
* @param probe - Async function returning the current value.
* @param predicate - Returns true when the value matches the expectation.
* @param timeoutMs - Maximum wait time before throwing.
* @param description - Used in the timeout error message.
* @returns The value that satisfied the predicate.
* @throws If the timeout elapses; the error includes the last observed value.
*/
export async function waitFor<T>(
  probe: () => Promise<T> | T,
  predicate: (value: T) => boolean,
  timeoutMs: number,
  description: string,
): Promise<T> {
  const start = Date.now();
  let lastValue: T = await probe();
  while (Date.now() - start < timeoutMs) {
    if (predicate(lastValue)) {
      return lastValue;
    }
    await new Promise((resolve) => setTimeout(resolve, WAIT_FOR_POLL_INTERVAL_MS));
    lastValue = await probe();
  }
  // The final probe can land just after the deadline; test it before
  // giving up so a value that arrived on the last poll is not discarded.
  if (predicate(lastValue)) {
    return lastValue;
  }
  throw new Error(
    `waitFor timeout (${timeoutMs}ms) — ${description}; lastValue=${JSON.stringify(lastValue)}`,
  );
}
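
As a usage illustration for the polling primitive, the sketch below
waits for an archive to appear in a per-run downloads directory, the
scenario the commit message describes for A5. The local `waitFor` is a
condensed stand-in with the same four leading parameters (plus an
optional interval), repeated only so the snippet runs on its own;
`waitForZips`, `demo`, the `.zip` suffix and the `uat-downloads-`
prefix are assumptions, not code from this commit.

```typescript
import { mkdtempSync, readdirSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Condensed stand-in for the host-side waitFor.
async function waitFor<T>(
  probe: () => Promise<T> | T,
  predicate: (value: T) => boolean,
  timeoutMs: number,
  description: string,
  intervalMs = 20,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await probe();
    if (predicate(value)) {
      return value;
    }
    if (Date.now() >= deadline) {
      throw new Error(`waitFor timeout (${timeoutMs}ms): ${description}`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Hypothetical helper: resolve with the .zip file names once at least
// one is present in `dir`. Filtering by suffix skips partial downloads.
function waitForZips(dir: string, timeoutMs: number): Promise<string[]> {
  return waitFor(
    () => readdirSync(dir).filter((name) => name.endsWith('.zip')),
    (found) => found.length > 0,
    timeoutMs,
    `a .zip file in ${dir}`,
  );
}

// Usage: simulate a download landing 50 ms after polling starts.
async function demo(): Promise<string[]> {
  const dir = mkdtempSync(join(tmpdir(), 'uat-downloads-'));
  setTimeout(() => writeFileSync(join(dir, 'archive.zip'), 'PK'), 50);
  return waitForZips(dir, 2_000);
}
```

Keeping the probe and the predicate separate means the timeout error
can report what was being waited for without the caller building the
message by hand.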