Files
mokosh/tests/uat
Mark eb64521321 feat(01-13): wave-2 — launchHarnessBrowser + assertions + harness-page-driver scaffolding
Build out the Approach-B harness driver utilities atop the Wave 1
production paths. Three new files form the shared scaffold that
Wave 3's 13 assertion drivers (A1-A5, A7-A13) and the eventual
orchestrator (`tests/uat/harness.test.ts`) will all consume. The
standalone A6 driver (`tests/uat/a6.test.ts`) is rewritten to use
the new lib — behavior-preserving: A6 still PASSES 5/5 in ~7s.

New files:

  - tests/uat/lib/launch.ts (~320 LoC)
      `launchHarnessBrowser({ headless?, downloadsDir? }) → HarnessHandles`
      Extracts the Chrome-launch + victim-page + harness-page + console-
      attach pattern from a6.test.ts into a single reusable helper.
      NEW vs prototype: CDP `Browser.setDownloadBehavior` wires
      Chrome's download path to a per-run `mkdtempSync` tmp dir so A5
      (SAVE_ARCHIVE) can poll a known location without colliding with
      the operator's real downloads. Architectural commitments
      enforced (per 01-11-SUMMARY): no `--auto-select-desktop-capture-
      source` flag; victim about:blank brought to front for the
      production `chrome.tabs.query({active:true})` workaround; SW
      console attach best-effort with bounded poll; offscreen console
      attach opportunistic via `targetcreated` listener (offscreen
      target appears later, when the harness page calls
      chrome.offscreen.createDocument).

  - tests/uat/lib/assertions.ts (~210 LoC)
      Host-side assertion primitives:
        * `AssertionRecord`, `CheckRecord`, `ConsoleBuffers` types —
          mirror the page-side shape returned by `assertA*` methods.
        * `runAssertion(name, fn, buffers)` — try/catch wrapper that
          dumps the SW + offscreen console tails (last 100 lines each)
          to stderr on failure, then returns `{passed: false, error}`
          if `fn` throws.
        * `printAssertionResult(result)` — single source of truth for
          the formatted result print. Extracted from the inline
          `printResult` previously in the prototype's a6.test.ts so
          Wave 3's orchestrator can reuse it across all 14 assertions.
        * `assertEqual / assertGte / assertMatch / assertTrue` —
          structured failure messages atop node:assert/strict.
        * `waitFor(probe, predicate, timeoutMs, description)` — host-
          side polling primitive; mirrors the page-side waitFor
          semantics verbatim (they can't share a module: page-side is
          bundled into the harness HTML, host-side runs in Node).
      NO chrome.* helpers here — all chrome.* work happens inside the
      extension-internal harness page. This module is host-side ONLY
      by construction (no chrome global in Node anyway).

  - tests/uat/lib/harness-page-driver.ts (~170 LoC)
      One driver wrapper per assertion (A1..A13). Each wraps a single
      `page.evaluate(() => window.__mokoshHarness.assertXX())`.
      Centralizing this means adding/renaming an assertion = two-file
      edit (extension-page-harness.ts impl + this file) instead of
      touching every test-file caller.
      Wave 2 wires `driveA6` (proven from c647f61). The 12 Wave-3
      drivers (driveA1..A5, A7..A13) are stubbed as
      `throw new Error('NOT YET IMPLEMENTED — Wave 3<X> wires driveXX')`
      so the future orchestrator's `for (const drive of drivers)` loop
      fails cleanly on the first unimplemented one (bail-on-first-
      failure semantics). The `AssertionWithBytes` type is declared
      for A5/A12/A13 which return `bytesBase64` payloads (zip / webm
      bytes that the host side processes after the page-side
      assertion completes).

Rewrite — `tests/uat/a6.test.ts`:
  - Drops ~80 LoC of Chrome-launch + console-attach + result-print
    plumbing now living in lib/launch.ts + lib/assertions.ts.
  - Now ~70 LoC total — pure orchestration of
    launchHarnessBrowser → runAssertion(driveA6) → printAssertionResult
    → browser.close() → exit code.
  - Behavior-preserving: A6 still 5/5 GREEN with the same diagnostic
    output (SETUP, A6.1-A6.4) and the same ~7s end-to-end runtime.

Verification (all GREEN):
  - `npx tsc --noEmit` — exit 0 (root + tests/uat/tsconfig.json).
  - `npx tsx tests/uat/a6.test.ts` — exits 0 with "PASS"; 5 checks
    GREEN (SETUP, A6.1, A6.2, A6.3, A6.4). End-to-end runtime ~7s
    headless on this workstation.
  - `npm run build` — exit 0; Tier-1 grep gate GREEN (production
    bundle contains zero hook strings AND zero lib symbol names —
    the new lib files are test-only and not bundled into dist/).
  - `npm run build:test` — exit 0; dist-test/ still emits the
    extension-page-harness.html harness (lib files are host-side,
    not rollup inputs).
  - `npx vitest run` — 92/92 GREEN.

Wave 3 ready: harness-page-driver.ts has driveA1..A5/A7..A13 stubs
in place; extending requires only:
  1. Add `assertAXX` method to window.__mokoshHarness in
     tests/uat/extension-page-harness.ts.
  2. Replace the corresponding stub body in this file with the
     page.evaluate wrapper.
  3. (Wave 3A) Create tests/uat/harness.test.ts orchestrator that
     iterates over [A0 grep gate, driveA1..A13] with bail-on-fail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:21:11 +02:00
..

Mokosh UAT harness (Plan 01-11)

Puppeteer-driven Node script that runs 14 assertions end-to-end against a real Chrome instance loaded with the Mokosh extension. Replaces Plan 01-09 Task 5's operator-empirical functional verification (the operator retains only step 1 — build — and step 14 — brand/design acceptance).

Quick start

npm run test:uat

This builds dist-test/ (the hook-enabled bundle) and runs the harness. Exit 0 means all 14 assertions passed. Final line: UAT harness: 14/14 assertions passed.

Local-debug mode

HEADLESS=0 npm run test:uat

Opens a real Chrome window so you can watch the picker auto-accept, the badge transitions, the popup appear, etc.

Developer iteration tricks

# Skip the production build inside assertion 0 (uses existing dist/):
SKIP_PROD_REBUILD=1 npm run test:uat

# Run the harness against an existing dist-test/ (skip npm run build:test):
npx tsx tests/uat/harness.test.ts

Assertion catalog

# Title Bug class Hook used
0 Production bundle has no test-hook leaks T-1-11-01 filesystem grep
1 SW bootstrap → setIdleMode sw.evaluate
2 Toolbar onClicked-idle → REC + popup triggerExtensionAction
3 Offscreen displaySurface === monitor D-15 __mokoshTest.getCurrentStream
4 Toolbar onClicked-recording → popup, no new offscreen targets count
5 SAVE_ARCHIVE → download fires downloads polling
6 BUG B: simulateUserStop → badge OFF + no recovery notif b9eeeeb dispatchEvent('ended')
7 RECORDING_ERROR codec-unsupported → ERR + recovery notif sendMessage
8 BUG A: onStartup → mokosh-startup- notification creates a881bf0 __mokoshTest.handlers.onStartup
9 Icon file sizes meet floors Bug A precondition sw.evaluate(fetch)
10 Manifest has notifications + 3 icons Bug A precondition chrome.runtime.getManifest
11 35s recording → segments.length >= 3 D-13 __mokoshTest.getSegmentCount
12 ffprobe on extracted webm exits 0 Plan 01-08 jszip + execFile
13 Archive shape — video + meta.json version match Plan 01-07 jszip

Failure isolation

Single browser, serial assertions, bail on first failure for setup- dependent assertions (assertion 0 abort means refusing to launch a potentially-leaky bundle). Per-assertion bail keeps the diagnostic output unambiguous — see RESEARCH §5 + Plan 01-11 open-question resolution 4.

On failure, the harness dumps the last 30 lines of SW console + last 30 lines of offscreen console (captured live during the run) to stderr BEFORE rethrowing — gives you contextual triage without needing to re- run with debug logging.

Known gotchas

Locale-specific picker auto-accept

The --auto-select-desktop-capture-source=Entire screen Chrome flag auto-accepts the screen-share picker. The string "Entire screen" is en_US-specific. If your Chrome is set to a non-English locale, the picker option label will differ and the auto-accept will silently fail (picker stays open; assertion 2 times out).

Fallback: switch your Chrome user-data-dir's locale to en_US for harness runs, OR adjust the launch arg in tests/uat/lib/launch.ts to match your locale's equivalent string.

dev-dep Chromium binary size

puppeteer pulls a ~150 MB Chromium binary at npm install time. CI must accept this. Production npm install --omit=dev skips it cleanly.

Xvfb is NOT required

Per Plan 01-11 RESEARCH §3 empirical probes against Chrome 148, the --headless=new mode handles screen capture without Xvfb on Linux CI runners. If a future Chrome regresses this, Xvfb :99 & DISPLAY=:99 npm run test:uat is the fallback.

CI runner screen-capture concern

The 35s recording assertion (A11) captures whatever is on screen during that window. CI MUST run the harness in an isolated container with no concurrent workload — see T-1-11-02 in Plan 01-11's threat model.

Real Chrome download (assertion 5 → A12)

The harness configures per-page download behavior via CDP to a fresh os.tmpdir()/mokosh-uat-downloads-* directory; downloads are NOT written to your real ~/Downloads. The temp directory is deleted by OS tmpdir GC.