mokosh/tests/uat
Mark 3726eee39f feat(04-04): Wave 0 spike — stopServiceWorker helper + 5-min SW idle empirical result
SPIKE OUTCOME: FAILED (offscreen DIED across 5-min SW idle + worker.close())

Per Plan 04-04 spike-first contract, Wave 0 empirically investigated whether
the offscreen document's RAM-only `segments: Blob[] = []` at
src/offscreen/recorder.ts:91 survives a 5-min SW idle followed by Puppeteer
CDP-driven `worker.close()`. RESEARCH Q2 hypothesis (MEDIUM confidence): yes,
the offscreen has its own lifecycle anchored by active MediaRecorder. Spike
result REFUTES that hypothesis.

Empirical measurement (HEADLESS=1; one full run; reproducible via the
committed spike script):

  - assertA2 priming: PASSED (badge=REC; offscreen + MediaRecorder live)
  - 5-min idle:        elapsed cleanly (308.7s total wall-clock)
  - stopServiceWorker: succeeded (worker.close() returned)
  - SAVE_ARCHIVE ack:  {success: true} (SW respawned + processed message)
  - video/last_30sec.webm size: 8505 bytes (well below 100 KB floor)
  - meta.urls: only chrome-extension://* origins; real-page URLs LOST
  - rrweb/session.json: []
  - logs/events.json: []
  - ffprobe on extracted webm: 'End of file' + 'Duplicate element' errors
    (corrupt/truncated; not a valid 30s segment cluster sequence)

Interpretation: offscreen-document lifecycle is NOT independent of the SW
under Puppeteer CDP-driven worker.close() conditions. The 8505 bytes are
likely stale/partial header bytes from a re-initialized empty offscreen
context after SW respawn, not a surviving 30s buffer. The plan's Task 2
GATING CONDITION (videoSize > 100_000) is NOT satisfied; Task 2 is BLOCKED.

Per saved memory `feedback-gsd-ceremony-for-fixes.md`: architectural changes
(moving segments from offscreen RAM to IndexedDB per RESEARCH Q2 sub-question
b Option C) MUST route through proper plan-fix ceremony, NOT improvised
inline inside Plan 04-04. Plan 04-04 SUMMARY flags the failure mode + cites
exact remediation path. ROADMAP SC #1 remains OPEN pending the persistence-
layer plan-fix.

Task 1 persisting artifacts (this commit):
  - tests/uat/lib/harness-page-driver.ts:
    + Browser type import (puppeteer)
    + stopServiceWorker(browser, extensionId) helper (verbatim from Chrome
      devrel canonical pattern — Puppeteer >=22.1.0; project pin ^25 OK)
    + findLatestZip exported (was module-internal) so the spike script can
      reuse the canonical mtime-sort selection logic without duplication
  - tests/uat/spike-a33-sw-persistence.ts (NEW):
    + One-shot empirical investigation script; reusable for future SW-
      lifecycle regression testing (e.g., verifying the eventual IndexedDB
      persistence layer actually closes ROADMAP SC #1)
    + Step 1 reuses __mokoshHarness.assertA2 (canonical fresh-recording
      prime; not the non-existent dispatchSaveArchive that REVISION iter-2
      explicitly forbids)
    + Step 5 dispatches SAVE_ARCHIVE via chrome.runtime.sendMessage inline
      from harness-page realm (Option B per plan-checker BLOCKER 2;
      matches A5/A11/A12/A13/A26/A28/A29/A30/A31 pattern)
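
The stopServiceWorker helper listed above follows the pattern in the Chrome documentation linked under References. A minimal sketch of that pattern is shown here; the `WorkerLike`, `TargetLike`, and `BrowserLike` interfaces are illustrative stand-ins for the Puppeteer types (the committed helper imports `Browser` from puppeteer), and the null check is an added assumption, so this should not be read as the exact code in harness-page-driver.ts.

```typescript
// Structural stand-ins for the Puppeteer types involved (assumption: the
// real helper uses puppeteer's Browser / Target / WebWorker instead).
interface WorkerLike {
  close(): Promise<void>;
}
interface TargetLike {
  type(): string;
  url(): string;
  worker(): Promise<WorkerLike | null>;
}
interface BrowserLike {
  waitForTarget(predicate: (t: TargetLike) => boolean): Promise<TargetLike>;
}

// Find the extension's service worker target and terminate it over CDP.
async function stopServiceWorker(
  browser: BrowserLike,
  extensionId: string,
): Promise<void> {
  const host = `chrome-extension://${extensionId}`;
  const target = await browser.waitForTarget(
    (t) => t.type() === "service_worker" && t.url().startsWith(host),
  );
  const worker = await target.worker();
  if (!worker) {
    throw new Error(`no service worker attached for ${host}`);
  }
  await worker.close();
}
```

Chrome respawns the service worker on the next extension event, which is consistent with the SAVE_ARCHIVE ack observed in the spike.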

Verification (Task 1 acceptance criteria):
  - npx tsc --noEmit: exits 0
  - HEADLESS=1 tsx tests/uat/spike-a33-sw-persistence.ts: ran to completion
    (no Puppeteer throw); SPIKE RESULT line emitted with explicit
    videoSize=8505 bytes; SAVE_ARCHIVE ack received
  - grep -c 'dispatchSaveArchive' tests/uat/spike-a33-sw-persistence.ts: 0
  - grep -c "type: 'SAVE_ARCHIVE'" tests/uat/spike-a33-sw-persistence.ts: 1
  - Total spike wall-clock: 308.7s (~5min idle + ~8s orchestration)

References:
  - Plan 04-04 PLAN.md spike contract (lines 64-72)
  - 04-RESEARCH.md Q2 sub-question (b) — Chrome MV3 offscreen lifecycle
  - https://developer.chrome.com/docs/extensions/how-to/test/test-serviceworker-termination-with-puppeteer
  - Saved memory: feedback-gsd-ceremony-for-fixes.md (no inline architectural
    fixes; route through plan-fix ceremony)
2026-05-21 18:44:45 +02:00

Mokosh UAT harness (Plan 01-11)

Puppeteer-driven Node script that runs 14 assertions end-to-end against a real Chrome instance loaded with the Mokosh extension. Replaces Plan 01-09 Task 5's operator-empirical functional verification (the operator retains only step 1 — build — and step 14 — brand/design acceptance).

Quick start

npm run test:uat

This builds dist-test/ (the hook-enabled bundle) and runs the harness. Exit 0 means all 14 assertions passed. Final line: UAT harness: 14/14 assertions passed.

Local-debug mode

HEADLESS=0 npm run test:uat

Opens a real Chrome window so you can watch the picker auto-accept, the badge transitions, the popup appear, etc.

Developer iteration tricks

# Skip the production build inside assertion 0 (uses existing dist/):
SKIP_PROD_REBUILD=1 npm run test:uat

# Run the harness against an existing dist-test/ (skip npm run build:test):
npx tsx tests/uat/harness.test.ts

Assertion catalog

| #  | Title | Bug class | Hook used |
|----|-------|-----------|-----------|
| 0  | Production bundle has no test-hook leaks | T-1-11-01 | filesystem grep |
| 1  | SW bootstrap → setIdleMode | | sw.evaluate |
| 2  | Toolbar onClicked-idle → REC + popup | | triggerExtensionAction |
| 3  | Offscreen displaySurface === monitor | D-15 | __mokoshTest.getCurrentStream |
| 4  | Toolbar onClicked-recording → popup, no new offscreen | | targets count |
| 5  | SAVE_ARCHIVE → download fires | | downloads polling |
| 6  | BUG B: simulateUserStop → badge OFF + no recovery notif | b9eeeeb | dispatchEvent('ended') |
| 7  | RECORDING_ERROR codec-unsupported → ERR + recovery notif | | sendMessage |
| 8  | BUG A: onStartup → mokosh-startup- notification creates | a881bf0 | __mokoshTest.handlers.onStartup |
| 9  | Icon file sizes meet floors | Bug A precondition | sw.evaluate(fetch) |
| 10 | Manifest has notifications + 3 icons | Bug A precondition | chrome.runtime.getManifest |
| 11 | 35s recording → segments.length >= 3 | D-13 | __mokoshTest.getSegmentCount |
| 12 | ffprobe on extracted webm exits 0 | Plan 01-08 | jszip + execFile |
| 13 | Archive shape: video + meta.json version match | Plan 01-07 | jszip |
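
Assertion 0 is described only as a "filesystem grep" for test-hook leaks. One way such a check could look is sketched below. The function name `findHookLeaks` is hypothetical, and the default marker `__mokoshTest` is an assumption taken from the hook names in the catalog; the real assertion may search for different or additional strings.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Recursively collect files under `root` whose contents include `marker`.
// An empty result means the bundle contains no reference to the test hook.
function findHookLeaks(root: string, marker = "__mokoshTest"): string[] {
  const leaks: string[] = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    const full = path.join(root, entry.name);
    if (entry.isDirectory()) {
      leaks.push(...findHookLeaks(full, marker));
    } else if (
      entry.isFile() &&
      fs.readFileSync(full, "utf8").includes(marker)
    ) {
      leaks.push(full);
    }
  }
  return leaks.sort();
}
```

Under these assumptions, calling `findHookLeaks("dist")` after a production build and aborting when the result is non-empty would match the "refuse to launch a potentially leaky bundle" behavior described under Failure isolation.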

Failure isolation

The harness uses a single browser and runs assertions serially, bailing on the first failure for setup-dependent assertions (an assertion 0 abort means refusing to launch a potentially leaky bundle). Per-assertion bail keeps the diagnostic output unambiguous; see RESEARCH §5 and Plan 01-11 open-question resolution 4.

On failure, the harness dumps the last 30 lines of the SW console and the last 30 lines of the offscreen console (captured live during the run) to stderr BEFORE rethrowing. This gives you contextual triage without needing to re-run with debug logging.
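
The bail-and-dump behavior can be sketched as a small serial runner. Everything named here (`ConsoleTail`, `Assertion`, `runSerial`) is hypothetical and not taken from the harness source. As a simplification, this sketch bails on any failure, whereas the harness bails for setup-dependent assertions.

```typescript
type Assertion = { id: number; title: string; run: () => Promise<void> };

// Keeps only the most recent `limit` console lines for failure triage.
class ConsoleTail {
  private lines: string[] = [];
  private limit: number;
  constructor(limit = 30) {
    this.limit = limit;
  }
  push(line: string): void {
    this.lines.push(line);
    if (this.lines.length > this.limit) this.lines.shift();
  }
  dump(): string[] {
    return [...this.lines];
  }
}

// Runs assertions in order; on the first failure, writes each captured
// console tail through `log` (stderr by default) and then rethrows.
async function runSerial(
  assertions: Assertion[],
  tails: Record<string, ConsoleTail>,
  log: (line: string) => void = (l) => console.error(l),
): Promise<number> {
  let passed = 0;
  for (const a of assertions) {
    try {
      await a.run();
      passed++;
    } catch (err) {
      for (const [name, tail] of Object.entries(tails)) {
        const lines = tail.dump();
        log(`--- last ${lines.length} lines of ${name} console ---`);
        lines.forEach(log);
      }
      throw err; // bail: later assertions depend on earlier state
    }
  }
  return passed;
}
```

Dumping before the rethrow means the console context appears in the output ahead of the assertion's own error.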

Known gotchas

Locale-specific picker auto-accept

The --auto-select-desktop-capture-source=Entire screen Chrome flag auto-accepts the screen-share picker. The string "Entire screen" is en_US-specific. If your Chrome is set to a non-English locale, the picker option label will differ and the auto-accept will silently fail (picker stays open; assertion 2 times out).

Fallback: switch your Chrome user-data-dir's locale to en_US for harness runs, OR adjust the launch arg in tests/uat/lib/launch.ts to match your locale's equivalent string.
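
Adjusting the launch arg could be done with a small override like the one below. This is only a sketch: `buildCaptureArgs` and the `MOKOSH_PICKER_LABEL` environment variable are hypothetical names introduced for illustration, and the actual contents of tests/uat/lib/launch.ts are not shown in this README.

```typescript
// Default picker label quoted in this README; only valid for en_US Chrome.
const DEFAULT_SOURCE_LABEL = "Entire screen";

// MOKOSH_PICKER_LABEL is a hypothetical env var assumed for illustration.
// It must hold the exact label your locale's picker shows for the
// whole-screen option.
function buildCaptureArgs(
  env: Record<string, string | undefined> = process.env,
): string[] {
  const label = env.MOKOSH_PICKER_LABEL?.trim() || DEFAULT_SOURCE_LABEL;
  return [`--auto-select-desktop-capture-source=${label}`];
}
```

The returned array would be spread into the `args` option passed to `puppeteer.launch`, next to whatever other flags the harness already sets.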

dev-dep Chromium binary size

puppeteer pulls a ~150 MB Chromium binary at npm install time. CI must accept this. Production npm install --omit=dev skips it cleanly.

Xvfb is NOT required

Per Plan 01-11 RESEARCH §3 empirical probes against Chrome 148, the --headless=new mode handles screen capture without Xvfb on Linux CI runners. If a future Chrome regresses this, Xvfb :99 & DISPLAY=:99 npm run test:uat is the fallback.

CI runner screen-capture concern

The 35s recording assertion (A11) captures whatever is on screen during that window. CI MUST run the harness in an isolated container with no concurrent workload — see T-1-11-02 in Plan 01-11's threat model.

Real Chrome download (assertion 5 → A12)

The harness configures per-page download behavior via CDP to point at a fresh os.tmpdir()/mokosh-uat-downloads-* directory, so downloads are NOT written to your real ~/Downloads. The harness does not remove the temp directory itself; cleanup is left to the OS's tmpdir garbage collection.
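
A rough sketch of this download routing, together with an mtime-sorted archive lookup, follows. `CdpLike` is a structural stand-in for a Puppeteer CDP session and `routeDownloadsToTemp` is a hypothetical name. `findLatestZip` borrows its name from the harness helper, but the body is an assumption about what "mtime-sort selection" means, not the committed implementation. `Page.setDownloadBehavior` is the CDP method assumed here because the text says "per-page".

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Structural stand-in for a Puppeteer CDPSession (assumption).
interface CdpLike {
  send(method: string, params?: Record<string, unknown>): Promise<unknown>;
}

// Create a fresh temp directory and tell Chrome to save downloads there.
async function routeDownloadsToTemp(client: CdpLike): Promise<string> {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "mokosh-uat-downloads-"));
  await client.send("Page.setDownloadBehavior", {
    behavior: "allow",
    downloadPath: dir,
  });
  return dir;
}

// Newest .zip in `dir` by modification time, or null if there is none.
function findLatestZip(dir: string): string | null {
  const zips = fs
    .readdirSync(dir)
    .filter((name) => name.endsWith(".zip"))
    .map((name) => {
      const full = path.join(dir, name);
      return { full, mtime: fs.statSync(full).mtimeMs };
    })
    .sort((a, b) => b.mtime - a.mtime);
  return zips[0]?.full ?? null;
}
```

Because `mkdtempSync` appends a random suffix, each run gets its own directory and a stale archive from an earlier run cannot be mistaken for the current download.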