5 Commits

Author SHA1 Message Date
9ac580869d fix(debug): race-tolerant offscreen target attach in UAT launch
Plan-04-04 debug session-2 root cause: the offscreen-console capture
in tests/uat/lib/launch.ts:registerOffscreenConsoleAttach matched zero
offscreen targets across 4 spike runs, creating a critical observability
gap that prevented disambiguation of Plan 04-04 Wave 0 spike failure
mode.

Empirical investigation (tests/uat/spike-diagnose-offscreen-target.ts,
NEW): when chrome.offscreen.createDocument fires, Puppeteer's
`targetcreated` event fires with `type='other'` and `url=''` BEFORE the
CDP target metadata stabilizes. The previous filter (whether
`background_page` or `page`) never matched at event time. By the time
the metadata stabilizes (visible via `browser.targets()`), the
target's type is `'background_page'` (not `'page'` — MV2's
background_page type IS still used by Chrome's CDP for invisible
extension documents, despite MV3 abolishing classic background pages).

Fix:
  - Match the offscreen target by URL pattern (load-bearing criterion;
    type field is intentionally unchecked because it's unreliable at
    targetcreated time).
  - Bind to BOTH `targetcreated` AND `targetchanged` events (the latter
    fires when the URL stabilizes after navigation).
  - Add a `browser.targets()` enumeration race-free safety net for
    cases where the offscreen target exists at registration time.

Verification: tests/uat/spike-diagnose-offscreen-target.ts now emits
`(launch: offscreen console attached — url=chrome-extension://.../src/offscreen/index.html)`
followed by `[off:log] [OS:Recorder] Recording started ...` (zero such
lines in any prior spike run).

Test-infra correctness fix; ZERO production source changes. FORBIDDEN_HOOK_STRINGS
inventory unchanged at 12 entries. No new test-only `__MOKOSH_UAT__` symbols.

References:
  - .planning/debug/sw-offscreen-persistence-investigation-session-2.md
    (session-2 debug note documenting empirical root cause)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 07:16:27 +02:00
6a77967b6c feat(01-13): wave-3B — A5+A6+A7 GREEN + Bug B canonical regression rewind
Wave 3B lands the A5 (SAVE_ARCHIVE → zip on disk) and A7 (genuine
RECORDING_ERROR → ERR + recovery notification) assertions, completing
8/14 of the orchestrator's GREEN floor (A0+A1+A2+A3+A4+A5+A6+A7).
Bails at A8 (Wave 3C scope).

Changes per file:

  tests/uat/extension-page-harness.ts
    - assertA5: 11s settle (>= SEGMENT_DURATION_MS so first rotation
      lands a segment) + send SAVE_ARCHIVE + assert resp.success=true.
      Page-side only checks SW handler ack; host-side driver verifies
      disk-side outcome (zip presence + size floor).
    - assertA7: setupFreshRecording helper (A6 tears down; A7 needs
      REC state) → snapshot notif count → send RECORDING_ERROR with
      a non-Bug-B error code ('codec-unsupported') → 200ms settle →
      assert badge='ERR' + popup endsWith popup.html + notif delta=1
      + set-membership for 'mokosh-recovery-*' prefix.
    - setupFreshRecording: shared helper for A7 + future assertions
      that need a fresh REC state after a teardown.

  tests/uat/lib/harness-page-driver.ts
    - driveA5: page.evaluate(assertA5) THEN host-side fs polling for
      *.zip in handles.downloadsDir. The CDP Browser.setDownloadBehavior
      override renames the file to download.zip (data: URL filename
      gap), so we accept any *.zip suffix. Merges page-side check +
      host-side checks into a single AssertionRecord. Signature now
      takes downloadsDir as a second arg.
    - driveA7: standard page.evaluate wrapper (no host-side work).

  tests/uat/harness.test.ts
    - Wraps driveA5 in a closure that captures handles.downloadsDir.
    - Reordered: launchHarnessBrowser MUST run before driver list so
      the closure can read handles without a TDZ trap.

  tests/uat/lib/launch.ts
    - Victim page switched from about:blank to a file:// URL backed by
      a tmp HTML file in downloadsDir. About:blank breaks A5 because
      chrome.tabs.captureVisibleTab needs <all_urls> permission which
      matches http/https/file/ftp but NOT about: or data: URLs. The
      stub HTML satisfies <all_urls> + provides a real .url for the
      production saveArchive's chrome.tabs.query.

  src/test-hooks/offscreen-hooks.ts (test-only — tree-shaken from prod)
    - installFakeDisplayMedia: mintStream() helper called per
      fakeGetDisplayMedia invocation; each call mints a FRESH
      MediaStream from the persistent canvas. Real getDisplayMedia
      returns a new stream per call — fake now matches. Required for
      A7's setupFreshRecording where the previous recording's stream
      tracks were stopped by A6's onUserStoppedSharing teardown.
    - Added 33ms setInterval-driven drawFrame() alongside the
      existing requestAnimationFrame loop. RAF can throttle in
      headless Chrome on offscreen documents (page-visibility
      heuristics produce 0 fps), which yields zero-byte
      MediaRecorder segments that crash ts-ebml's VINT decode in
      webm-remux.extractFramesFromSegment with "Unrepresentable
      length: Infinity". The setInterval is redundant when RAF fires
      at full rate; it's a safety net for the headless-MV3 corner.

Bug B regression-catch demo (success_criteria #3 — MANDATORY per plan):

Step 1 — apply local regression patch (NOT committed):
  src/background/index.ts:792  setIdleMode() → setErrorMode()

Step 2 — npm run build:test && npm run test:uat RED snippet:

  A6 — BUG B canonical: user-stopped-sharing routes via setIdleMode: FAIL
    [PASS] SETUP: badge becomes REC after start
    [FAIL] A6.1: badge text is '' (NOT 'ERR') after user-stop
           expected: ""
           actual:   "ERR"
    [FAIL] A6.2: popup is '' (NOT manifest default) after user-stop
           expected: ""
           actual:   "chrome-extension://<id>/src/popup/index.html"
    [PASS] A6.3: NO recovery notification fired (count delta === 0)
    [PASS] A6.4: isRecording=false (via badge proxy)

  UAT harness: 6/14 assertions passed (bailed: A6 failed; see above)

Step 3 — revert local patch (git checkout -- src/background/index.ts).

Step 4 — npm run build:test && npm run test:uat GREEN snippet:

  A6 — BUG B canonical: user-stopped-sharing routes via setIdleMode: PASS
    [PASS] SETUP: badge becomes REC after start
    [PASS] A6.1: badge text is '' (NOT 'ERR') after user-stop
    [PASS] A6.2: popup is '' (NOT manifest default) after user-stop
    [PASS] A6.3: NO recovery notification fired (count delta === 0)
    [PASS] A6.4: isRecording=false (via badge proxy)

  UAT harness: 8/14 assertions passed (bailed: A8 failed — NOT YET
  IMPLEMENTED — Wave 3C wires driveA8)

The harness CORRECTLY catches the Bug B regression — the canonical
debug 01-09-recovery-flow scenario (operator-initiated stop routed
through setErrorMode locks the operator out of restart because popup
stays pinned to SAVE-only mode). Bug B is now CI-callable end-to-end.

vitest 93/93 GREEN throughout (unit-test layer unaffected). Tier-1
grep gate GREEN (9 forbidden hook strings: 0 occurrences in dist/).
npm run build exit 0; npx tsc --noEmit exit 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 17:01:06 +02:00
eb64521321 feat(01-13): wave-2 — launchHarnessBrowser + assertions + harness-page-driver scaffolding
Build out the Approach-B harness driver utilities atop the Wave 1
production paths. Three new files form the shared scaffold that
Wave 3's 13 assertion drivers (A1-A5, A7-A13) and the eventual
orchestrator (`tests/uat/harness.test.ts`) will all consume. The
standalone A6 driver (`tests/uat/a6.test.ts`) is rewritten to use
the new lib — behavior-preserving: A6 still PASSES 5/5 in ~7s.

New files:

  - tests/uat/lib/launch.ts (~320 LoC)
      `launchHarnessBrowser({ headless?, downloadsDir? }) → HarnessHandles`
      Extracts the Chrome-launch + victim-page + harness-page + console-
      attach pattern from a6.test.ts into a single reusable helper.
      NEW vs prototype: CDP `Browser.setDownloadBehavior` wires
      Chrome's download path to a per-run `mkdtempSync` tmp dir so A5
      (SAVE_ARCHIVE) can poll a known location without colliding with
      the operator's real downloads. Architectural commitments
      enforced (per 01-11-SUMMARY): no `--auto-select-desktop-capture-
      source` flag; victim about:blank brought to front for the
      production `chrome.tabs.query({active:true})` workaround; SW
      console attach best-effort with bounded poll; offscreen console
      attach opportunistic via `targetcreated` listener (offscreen
      target appears later, when the harness page calls
      chrome.offscreen.createDocument).

  - tests/uat/lib/assertions.ts (~210 LoC)
      Host-side assertion primitives:
        * `AssertionRecord`, `CheckRecord`, `ConsoleBuffers` types —
          mirror the page-side shape returned by `assertA*` methods.
        * `runAssertion(name, fn, buffers)` — try/catch wrapper that
          dumps the SW + offscreen console tails (last 100 lines each)
          to stderr on failure, then returns `{passed: false, error}`
          if `fn` throws.
        * `printAssertionResult(result)` — single source of truth for
          the formatted result print. Extracted from the inline
          `printResult` previously in the prototype's a6.test.ts so
          Wave 3's orchestrator can reuse it across all 14 assertions.
        * `assertEqual / assertGte / assertMatch / assertTrue` —
          structured failure messages atop node:assert/strict.
        * `waitFor(probe, predicate, timeoutMs, description)` — host-
          side polling primitive; mirrors the page-side waitFor
          semantics verbatim (they can't share a module: page-side is
          bundled into the harness HTML, host-side runs in Node).
      NO chrome.* helpers here — all chrome.* work happens inside the
      extension-internal harness page. This module is host-side ONLY
      by construction (no chrome global in Node anyway).

  - tests/uat/lib/harness-page-driver.ts (~170 LoC)
      One driver wrapper per assertion (A1..A13). Each wraps a single
      `page.evaluate(() => window.__mokoshHarness.assertXX())`.
      Centralizing this means adding/renaming an assertion = two-file
      edit (extension-page-harness.ts impl + this file) instead of
      touching every test-file caller.
      Wave 2 wires `driveA6` (proven from c647f61). The 12 Wave-3
      drivers (driveA1..A5, A7..A13) are stubbed as
      `throw new Error('NOT YET IMPLEMENTED — Wave 3<X> wires driveXX')`
      so the future orchestrator's `for (const drive of drivers)` loop
      fails cleanly on the first unimplemented one (bail-on-first-
      failure semantics). The `AssertionWithBytes` type is declared
      for A5/A12/A13 which return `bytesBase64` payloads (zip / webm
      bytes that the host side processes after the page-side
      assertion completes).

Rewrite — `tests/uat/a6.test.ts`:
  - Drops ~80 LoC of Chrome-launch + console-attach + result-print
    plumbing now living in lib/launch.ts + lib/assertions.ts.
  - Now ~70 LoC total — pure orchestration of
    launchHarnessBrowser → runAssertion(driveA6) → printAssertionResult
    → browser.close() → exit code.
  - Behavior-preserving: A6 still 5/5 GREEN with the same diagnostic
    output (SETUP, A6.1-A6.4) and the same ~7s end-to-end runtime.

Verification (all GREEN):
  - `npx tsc --noEmit` — exit 0 (root + tests/uat/tsconfig.json).
  - `npx tsx tests/uat/a6.test.ts` — exits 0 with "PASS"; 5 checks
    GREEN (SETUP, A6.1, A6.2, A6.3, A6.4). End-to-end runtime ~7s
    headless on this workstation.
  - `npm run build` — exit 0; Tier-1 grep gate GREEN (production
    bundle contains zero hook strings AND zero lib symbol names —
    the new lib files are test-only and not bundled into dist/).
  - `npm run build:test` — exit 0; dist-test/ still emits the
    extension-page-harness.html harness (lib files are host-side,
    not rollup inputs).
  - `npx vitest run` — 92/92 GREEN.

Wave 3 ready: harness-page-driver.ts has driveA1..A5/A7..A13 stubs
in place; extending requires only:
  1. Add `assertAXX` method to window.__mokoshHarness in
     tests/uat/extension-page-harness.ts.
  2. Replace the corresponding stub body in this file with the
     page.evaluate wrapper.
  3. (Wave 3A) Create tests/uat/harness.test.ts orchestrator that
     iterates over [A0 grep gate, driveA1..A13] with bail-on-fail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:21:11 +02:00
a63066a289 chore(01-13): wave-0 — clean broken Approach-A artifacts per 01-11-SUMMARY
Restore a clean baseline before promoting the c647f61 prototype to
production paths (Wave 1) and building out Approach-B driver
scaffolding (Wave 2). All deletions trace back to falsifications
documented in 01-11-SUMMARY.md.

Deleted — broken Approach-A files:
  - src/test-hooks/sw-hooks.ts
      MV3 SW blocks dynamic import (Chromium es_modules.md;
      w3c/webextensions#212). The gated `await import('../test-hooks/
      sw-hooks')` from 01-11 Wave 1 never resolved → SW silently died →
      production listeners never registered. File was dead-on-arrival;
      no fix possible while MV3 SWs disallow dynamic import. Approach-B
      replaces SW-side instrumentation with the extension-internal
      harness page's chrome.action.* + chrome.notifications.* surface
      (full privilege; no monkey-patching needed).

  - tests/uat/lib/{launch,extension,sw,offscreen,assertions}.ts
      Popup-bridge architecture (01-11 dbd977c) — falsification 2 +
      falsification 3 in 01-11-SUMMARY: `sw.evaluate` exposes only
      chrome.{loadTimes,csi}, NOT chrome.action.* / chrome.notifications.*
      / chrome.runtime.sendMessage; setPopup-juggling for extension-id
      resolution turned out to be unnecessary (browser.extensions()
      works directly per the prototype). These files will be reborn in
      Wave 2 around the extension-page architecture.

      Kept: tests/uat/lib/zip.ts (host-side JSZip work — architecture-
      agnostic; A12+A13 still use it) and tests/uat/lib/test-hook-
      contract.d.ts (type mirror — extended in Wave 3 but kept as-is here).

  - tests/uat/prototype/probe_{offscreen,sw,tabs,tabs2}.mjs
      Feasibility-research probes (01-11 spike) that empirically falsified
      the Approach-A hypotheses. The findings are encoded in 01-11-
      SUMMARY.md; the probes themselves are dead code.

  - tests/uat/harness.test.ts
      01-11 Wave 2 popup-bridge orchestrator (dbd977c). Imports the
      now-deleted tests/uat/lib/{assertions,extension,sw,offscreen,launch}
      modules — would not typecheck after this commit. Reborn in
      Wave 3A as the Approach-B orchestrator (extension-internal page
      driver + A0 grep gate + 13 assertion drivers).

Reverted — SW-side dynamic-import gate comment block:
  - src/background/index.ts lines 13-29
      The existing comment block (post-spike) described the SW-side
      gated dynamic import that never landed. Rewritten to cite 01-13
      Approach-B explicitly, link to 01-11-SUMMARY.md falsification,
      and clarify that the Tier-1 grep gate's enduring value is
      catching regressions in the offscreen chunk's __MOKOSH_UAT__
      gate (the SW chunk is hook-free by construction).

Updated — Tier-1 grep gate FORBIDDEN_HOOK_STRINGS inventory:
  - tests/background/no-test-hooks-in-prod-bundle.test.ts
      Removed: `simulateUserStop` (Approach-A naming; replaced by
      Approach-B `dispatchEndedOnTrack` which matches the W3C
      dispatchEvent semantics per RESEARCH §7 BLOCKER — track.stop()
      does NOT fire 'ended' per spec, so the simulation MUST use
      dispatchEvent).
      Added: `installFakeDisplayMedia`, `uninstallFakeDisplayMedia`,
      `dispatchEndedOnTrack`, `__mokoshOffscreenQuery`.
      Total inventory: 8 surface strings (was 5). Each MUST be absent
      from every file under dist/ post-build.

Verification (all GREEN):
  - `npm run build` — exit 0; dist/ populated.
  - `grep -rln <forbidden> dist/` — 0 matches.
  - `npm run build:test` — exit 0; dist-test/ populated; offscreen-hooks
    chunk contains `installFakeDisplayMedia` (gate runs correctly
    against the test build's distinct artifact).
  - `npx tsc --noEmit` — exit 0 (root + tests/uat/tsconfig.json).
  - `npx vitest run` — 92/92 tests passing (was 89; the +3 new tests
    come from the FORBIDDEN_HOOK_STRINGS list expanding 5 → 8 — each
    forbidden string is one parametric `it(...)` block).

Both prior-failing tests now GREEN:
  - tests/background/sw-bundle-import.test.ts (was missing dist/ → 92/92
    requires the test run to have a current dist/; vitest gate test
    rebuilds via execFile when SKIP_BUILD≠1, otherwise relies on prior
    `npm run build`).
  - tests/background/no-test-hooks-in-prod-bundle.test.ts (was failing
    on stale dist; now GREEN against the freshly-rebuilt clean bundle).

Wave 1 (next): promote tests/uat/prototype/{extension-page-harness.html,
extension-page-harness.ts,a6.test.ts} to tests/uat/ via `git mv`;
update vite.test.config.ts rollup input.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 14:54:41 +02:00
dbd977c815 feat(01-11): wave-2 — Puppeteer harness scaffolding + A0 GREEN, popup-bridge architecture
Task 3 of Plan 01-11 (Puppeteer UAT harness).

Harness file tree (tests/uat/):
- harness.test.ts: tsx-runnable top-to-bottom harness entry point.
  Runs A0 inline (filesystem grep gate, abort-on-fail T-1-11-01),
  then launches Chrome + opens popup bridge + queries manifest, then
  iterates A1-A13 stubs. Each stub throws "NOT YET IMPLEMENTED —
  Plan 01-11 Task N wires this assertion". Exit code = 0 on full
  pass, 1 otherwise. Final line: "UAT harness: N/14 assertions passed".
- lib/launch.ts: launchHarnessBrowser() — wraps puppeteer.launch with
  enableExtensions:[dist-test/], headless default (HEADLESS=0
  override), --no-sandbox + --auto-select-desktop-capture-source flags.
  Polls browser.extensions() until the extension registers (empirically
  ~100ms but the first call right after launch returns Map(0)).
  Opens both a blank page (for triggerExtensionAction) AND the popup
  page (the bridge surface). Returns { browser, extension, extensionId,
  sw, downloadsDir, page, popup }.
- lib/extension.ts: waitForOffscreenTarget + attachToOffscreen +
  countOffscreenTargets. Offscreen attach uses target.type() ===
  'background_page' + .asPage() (NOT .page() — RESEARCH §4 Pitfall 1).
- lib/sw.ts: chrome.* state queries via the POPUP page handle (NOT
  the WebWorker handle — see architecture note below). getBadgeText,
  getPopup, getManifest, getIconSize, getIsRecording (side-channeled
  through badge text), fireOnStartup (via __mokoshTestQuery bridge),
  sendSyntheticRecordingError, getNotificationSnapshot (via bridge),
  keepalivePing (no-op message to wake SW for ~30s).
- lib/offscreen.ts: getDisplaySurface, simulateUserStop (the
  dispatchEvent('ended') path per RESEARCH §7 BLOCKER — DO NOT REFACTOR
  to track.stop()), getSegmentCount.
- lib/assertions.ts: runAssertion(idx, name, buffers, fn) wrapper —
  records pass/fail/duration; on failure dumps last 30 lines of SW
  + offscreen console buffers to stderr before rethrowing. assertEqual
  / assertMatch / assertTrue / assertGte / waitFor polling helper.
- lib/zip.ts: jszip-based assertArchiveShape + extractEntryToFile for
  assertions 12 + 13.
- README.md: runtime + local-debug + CI semantics + locale gotcha
  + dev-dep size note + assertion catalog table.
- tsconfig.json: per-tree type-check config (mirrors root tsconfig.json
  compiler options but includes the harness tree explicitly).

Architecture refinement (DEVIATION from RESEARCH §1 — Rule 1+3 inline fix):
- RESEARCH §1 sketched `sw.evaluate(() => chrome.action.getBadgeText({}))`
  as the chrome.* query path. Empirical probes during Task 3 execution
  against Puppeteer 25.0.2 + Chrome 148 + --headless=true revealed two
  blockers:
    1. Puppeteer's WebWorker.evaluate runs in an ISOLATED WORLD that
       carries SW globals (clients, registration, ...) but NOT the
       extension's full chrome.* API surface. Object.keys(chrome) inside
       sw.evaluate returns ["loadTimes","csi"] — the public webpage
       chrome, not the extension chrome.
    2. Chrome 148's headless mode aggressively suspends MV3 service
       workers; subsequent swTarget.worker() calls return
       "Protocol error: No target with given id found".
- WORKAROUND: open the popup page (chrome-extension://<id>/src/popup/
  index.html) as a separate Puppeteer Page. The popup has full
  chrome.* access (it's an extension context with same privileges as
  the SW) AND stable Puppeteer lifetime. For SW-globalThis state
  (__mokoshTest in the SW isolate, NOT in the popup), bridge via
  chrome.runtime.sendMessage. The popup sends
  { type: '__mokoshTestQuery', op: 'snapshot' | 'fire-on-startup' |
  'handler-types' }; the SW hook's onMessage handler responds.
- Bridge implementation added to src/test-hooks/sw-hooks.ts — registers
  AFTER the production listeners so it never intercepts production
  messages (__mokoshTest* type is unambiguously test-only). Tier-1
  grep gate (no-test-hooks-in-prod-bundle.test.ts) continues to enforce
  ZERO __mokoshTest occurrences in dist/ — the bridge handler is
  tree-shaken alongside the rest of the hook module via the
  __MOKOSH_UAT__ gate.

Other configuration changes:
- vitest.config.ts: exclude tests/uat/** from vitest discovery. The
  Puppeteer harness is invoked via `npm run test:uat` (not vitest);
  running it under vitest would try to launch real Chrome inside a
  vitest worker. The .test.ts suffix is retained for editor +
  naming-convention consistency with the rest of the tree.

Verification:
- npx tsc --noEmit (src/): exit 0
- npx tsc --noEmit -p tests/uat: exit 0
- npm run build: exit 0
- grep -rln '__mokoshTest|simulateUserStop|getSegmentCount|setCurrentStream|setSegmentCountGetter|__mokoshTestQuery|__mokoshKeepalive' dist/: ZERO matches
- npm run build:test: exit 0; dist-test/ populated with the new bridge code
- SKIP_BUILD=1 npx vitest run: 89/89 GREEN
- SKIP_PROD_REBUILD=1 npx tsx tests/uat/harness.test.ts:
  → A0 [PASS]: production bundle has no test-hook leaks (19ms)
  → Browser launches; popup opens; manifest read succeeds
  → A1-A13 [FAIL]: NOT YET IMPLEMENTED — Plan 01-11 Task N wires this
  → "UAT harness: 1/14 assertions passed, 13 failed (first failure: A1)"
  → Exit code: 1 (expected — 13 RED stubs intentional)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 09:14:58 +02:00