Wave 3A landed. `npm run test:uat` now exercises 5/14 assertions
end-to-end (A0 + A1 + A2 + A3 + A4); bails at A5 NOT YET IMPLEMENTED
(Wave 3B scope). A6 still PASSES 5/5 through the standalone
`npx tsx tests/uat/a6.test.ts` entry — the orchestrator-level A6 won't
reach in Wave 3A because the sequential loop bails at A5; once Wave 3B
wires driveA5 the loop will fall through to A6 (which uses the proven
Wave-2 driveA6 driver — no rework needed there).
Files changed:
- `tests/uat/extension-page-harness.ts` — extends `window.__mokoshHarness`
from `{ assertA6 }` to `{ assertA1, assertA2, assertA3, assertA4,
assertA6 }`. Per-assertion contracts:
• A1 — chrome.action.getBadgeText({}) === '' + getPopup({}) === ''
+ isRecording=false (badge !== 'REC' proxy per state-machine atomic
pairing). 3 CheckRecords.
• A2 — ensureOffscreen + START_RECORDING direct-to-offscreen
(workaround for the `tabs` manifest permission gap per
01-11-SUMMARY + plan resolved-questions row 2) + manual
setBadgeText('REC') + setPopup(POPUP_HTML_PATH) + waitFor
badge==='REC'. The bypassed chrome.action.onClicked →
startVideoCapture path is unit-tested in
tests/background/badge-state-machine.test.ts; A2 verifies the
contract that matters (recording reaches the REC state-machine
row). 2 CheckRecords.
• A3 — offscreen bridge query 'get-display-surface' (new in this
plan via the prior commit's offscreen-hooks extension) → asserts
=== 'monitor'. 1 CheckRecord.
• A4 — getPopup remains 'src/popup/index.html' + hasDocument()===true
(no duplicate offscreen). Essentially a no-op verification —
regression protection against future refactors that might unpin
the popup during recording or spawn extra offscreens on stray
events. 2 CheckRecords.
• IMPORTANT: chrome.action.getPopup() returns the FULL absolute
chrome-extension://<id>/... URL (not the manifest-relative path).
A2.2 + A4.1 assert via .endsWith('src/popup/index.html') to stay
extension-id independent. Empirical finding from first orchestrator
run; documented inline.
- `tests/uat/lib/harness-page-driver.ts` — wires `driveA1/A2/A3/A4`
(replaces the 4 NOT YET IMPLEMENTED Wave-3A stubs from
eb64521). Each wraps a single page.evaluate(() =>
window.__mokoshHarness.assertXX()) call per the contract laid down
by driveA6. A5+A7..A13 remain stubbed for Waves 3B+3C+3D.
- `tests/uat/harness.test.ts` (NEW) — top-level UAT orchestrator
driving all 14 assertions sequentially against a single Chrome +
single harness page. A0 (Tier-1 grep gate) runs pre-flight before
any Chrome launch — mirrors
tests/background/no-test-hooks-in-prod-bundle.test.ts forbidden-
string inventory (9 entries; belt-and-suspenders per
feedback-pre-checkpoint-bundle-gates.md memory). Bail-on-first-
failure with [SKIP] markers for unreached assertions + structured
diagnostic dump (full SW + offscreen console tail) on each failure.
SKIP_PROD_REBUILD=1 escape hatch skips the A0-side `npm run build`
for developer iteration.
Verification (all GREEN):
- npx tsc --noEmit: clean (root)
- npx tsc --noEmit -p tests/uat: clean (UAT subtree)
- npm run build: clean; production bundle hook-free
(9-string grep gate in vitest unit gate)
- npm run build:test: clean; dist-test/assets/extension_page_harness-*.js
grew from 3.87kB → 7.67kB (A1+A2+A3+A4 added)
- SKIP_BUILD=1 npx vitest run: 93/93 GREEN
(Wave 0+1+2 baseline 92 + 1 from the 9th grep-gate string from
the prior commit; this commit adds zero new vitest tests — the
A1-A4 contracts are verified at UAT-harness time only)
- npx tsx tests/uat/a6.test.ts (standalone): 5/5 GREEN; exit 0
(Wave-2 A6 baseline preserved through orchestrator-adjacent
harness page surface extension)
- npm run test:uat (full operator entry): 5/14 GREEN
(A0 + A1 + A2 + A3 + A4); bails at A5 NOT YET IMPLEMENTED
(Wave 3B scope, expected). Total wall clock ~25s (~5s build +
~5s prod-rebuild for A0 + ~15s assertion sequence).
Operator empirical-verification deferred to orchestrator (per
feedback-pre-checkpoint-bundle-gates.md — the orchestrator runs SW
CSP-safety + Node-globals + DOM-globals grep on the built bundle
before surfacing any checkpoint).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mokosh UAT harness (Plan 01-11)
Puppeteer-driven Node script that runs 14 assertions end-to-end against a real Chrome instance loaded with the Mokosh extension. Replaces Plan 01-09 Task 5's operator-empirical functional verification (the operator retains only step 1 — build — and step 14 — brand/design acceptance).
Quick start
npm run test:uat
This builds dist-test/ (the hook-enabled bundle) and runs the harness.
Exit 0 means all 14 assertions passed. Final line: UAT harness: 14/14 assertions passed.
Local-debug mode
HEADLESS=0 npm run test:uat
Opens a real Chrome window so you can watch the picker auto-accept, the badge transitions, the popup appear, etc.
Developer iteration tricks
# Skip the production build inside assertion 0 (uses existing dist/):
SKIP_PROD_REBUILD=1 npm run test:uat
# Run the harness against an existing dist-test/ (skip npm run build:test):
npx tsx tests/uat/harness.test.ts
Assertion catalog
| # | Title | Bug class | Hook used |
|---|---|---|---|
| 0 | Production bundle has no test-hook leaks | T-1-11-01 | filesystem grep |
| 1 | SW bootstrap → setIdleMode | — | sw.evaluate |
| 2 | Toolbar onClicked-idle → REC + popup | — | triggerExtensionAction |
| 3 | Offscreen displaySurface === monitor | D-15 | __mokoshTest.getCurrentStream |
| 4 | Toolbar onClicked-recording → popup, no new offscreen | — | targets count |
| 5 | SAVE_ARCHIVE → download fires | — | downloads polling |
| 6 | BUG B: simulateUserStop → badge OFF + no recovery notif | b9eeeeb |
dispatchEvent('ended') |
| 7 | RECORDING_ERROR codec-unsupported → ERR + recovery notif | — | sendMessage |
| 8 | BUG A: onStartup → mokosh-startup- notification creates | a881bf0 |
__mokoshTest.handlers.onStartup |
| 9 | Icon file sizes meet floors | Bug A precondition | sw.evaluate(fetch) |
| 10 | Manifest has notifications + 3 icons | Bug A precondition | chrome.runtime.getManifest |
| 11 | 35s recording → segments.length >= 3 | D-13 | __mokoshTest.getSegmentCount |
| 12 | ffprobe on extracted webm exits 0 | Plan 01-08 | jszip + execFile |
| 13 | Archive shape — video + meta.json version match | Plan 01-07 | jszip |
Failure isolation
Single browser, serial assertions, bail on first failure for setup- dependent assertions (assertion 0 abort means refusing to launch a potentially-leaky bundle). Per-assertion bail keeps the diagnostic output unambiguous — see RESEARCH §5 + Plan 01-11 open-question resolution 4.
On failure, the harness dumps the last 30 lines of SW console + last 30 lines of offscreen console (captured live during the run) to stderr BEFORE rethrowing — gives you contextual triage without needing to re- run with debug logging.
Known gotchas
Locale-specific picker auto-accept
The --auto-select-desktop-capture-source=Entire screen Chrome flag
auto-accepts the screen-share picker. The string "Entire screen" is
en_US-specific. If your Chrome is set to a non-English locale, the
picker option label will differ and the auto-accept will silently fail
(picker stays open; assertion 2 times out).
Fallback: switch your Chrome user-data-dir's locale to en_US for
harness runs, OR adjust the launch arg in tests/uat/lib/launch.ts to
match your locale's equivalent string.
dev-dep Chromium binary size
puppeteer pulls a ~150 MB Chromium binary at npm install time. CI
must accept this. Production npm install --omit=dev skips it cleanly.
Xvfb is NOT required
Per Plan 01-11 RESEARCH §3 empirical probes against Chrome 148, the
--headless=new mode handles screen capture without Xvfb on Linux CI
runners. If a future Chrome regresses this, Xvfb :99 & DISPLAY=:99 npm run test:uat is the fallback.
CI runner screen-capture concern
The 35s recording assertion (A11) captures whatever is on screen during that window. CI MUST run the harness in an isolated container with no concurrent workload — see T-1-11-02 in Plan 01-11's threat model.
Real Chrome download (assertion 5 → A12)
The harness configures per-page download behavior via CDP to a fresh
os.tmpdir()/mokosh-uat-downloads-* directory; downloads are NOT
written to your real ~/Downloads. The temp directory is deleted by OS
tmpdir GC.