Commit Graph

4 Commits

Author SHA1 Message Date
1b67b1c1d3 feat(01-13): wave-3A — A1+A2+A3+A4 GREEN + harness.test.ts orchestrator (5/14 assertions GREEN)
Wave 3A landed. `npm run test:uat` now exercises 5/14 assertions
end-to-end (A0 + A1 + A2 + A3 + A4); bails at A5 NOT YET IMPLEMENTED
(Wave 3B scope). A6 still PASSES 5/5 through the standalone
`npx tsx tests/uat/a6.test.ts` entry — the orchestrator-level A6 won't
reach in Wave 3A because the sequential loop bails at A5; once Wave 3B
wires driveA5 the loop will fall through to A6 (which uses the proven
Wave-2 driveA6 driver — no rework needed there).

Files changed:

- `tests/uat/extension-page-harness.ts` — extends `window.__mokoshHarness`
  from `{ assertA6 }` to `{ assertA1, assertA2, assertA3, assertA4,
  assertA6 }`. Per-assertion contracts:
  • A1 — chrome.action.getBadgeText({}) === '' + getPopup({}) === ''
    + isRecording=false (badge !== 'REC' proxy per state-machine atomic
    pairing). 3 CheckRecords.
  • A2 — ensureOffscreen + START_RECORDING direct-to-offscreen
    (workaround for the `tabs` manifest permission gap per
    01-11-SUMMARY + plan resolved-questions row 2) + manual
    setBadgeText('REC') + setPopup(POPUP_HTML_PATH) + waitFor
    badge==='REC'. The bypassed chrome.action.onClicked →
    startVideoCapture path is unit-tested in
    tests/background/badge-state-machine.test.ts; A2 verifies the
    contract that matters (recording reaches the REC state-machine
    row). 2 CheckRecords.
  • A3 — offscreen bridge query 'get-display-surface' (new in this
    plan via the prior commit's offscreen-hooks extension) → asserts
    === 'monitor'. 1 CheckRecord.
  • A4 — getPopup remains 'src/popup/index.html' + hasDocument()===true
    (no duplicate offscreen). Essentially a no-op verification —
    regression protection against future refactors that might unpin
    the popup during recording or spawn extra offscreens on stray
    events. 2 CheckRecords.
  • IMPORTANT: chrome.action.getPopup() returns the FULL absolute
    chrome-extension://<id>/... URL (not the manifest-relative path).
    A2.2 + A4.1 assert via .endsWith('src/popup/index.html') to stay
    extension-id independent. Empirical finding from first orchestrator
    run; documented inline.

- `tests/uat/lib/harness-page-driver.ts` — wires `driveA1/A2/A3/A4`
  (replaces the 4 NOT YET IMPLEMENTED Wave-3A stubs from
  eb64521). Each wraps a single page.evaluate(() =>
  window.__mokoshHarness.assertXX()) call per the contract laid down
  by driveA6. A5+A7..A13 remain stubbed for Waves 3B+3C+3D.

- `tests/uat/harness.test.ts` (NEW) — top-level UAT orchestrator
  driving all 14 assertions sequentially against a single Chrome +
  single harness page. A0 (Tier-1 grep gate) runs pre-flight before
  any Chrome launch — mirrors
  tests/background/no-test-hooks-in-prod-bundle.test.ts forbidden-
  string inventory (9 entries; belt-and-suspenders per
  feedback-pre-checkpoint-bundle-gates.md memory). Bail-on-first-
  failure with [SKIP] markers for unreached assertions + structured
  diagnostic dump (full SW + offscreen console tail) on each failure.
  SKIP_PROD_REBUILD=1 escape hatch skips the A0-side `npm run build`
  for developer iteration.

Verification (all GREEN):
  - npx tsc --noEmit: clean (root)
  - npx tsc --noEmit -p tests/uat: clean (UAT subtree)
  - npm run build: clean; production bundle hook-free
    (9-string grep gate in vitest unit gate)
  - npm run build:test: clean; dist-test/assets/extension_page_harness-*.js
    grew from 3.87kB → 7.67kB (A1+A2+A3+A4 added)
  - SKIP_BUILD=1 npx vitest run: 93/93 GREEN
    (Wave 0+1+2 baseline 92 + 1 from the 9th grep-gate string from
    the prior commit; this commit adds zero new vitest tests — the
    A1-A4 contracts are verified at UAT-harness time only)
  - npx tsx tests/uat/a6.test.ts (standalone): 5/5 GREEN; exit 0
    (Wave-2 A6 baseline preserved through orchestrator-adjacent
    harness page surface extension)
  - npm run test:uat (full operator entry): 5/14 GREEN
    (A0 + A1 + A2 + A3 + A4); bails at A5 NOT YET IMPLEMENTED
    (Wave 3B scope, expected). Total wall clock ~25s (~5s build +
    ~5s prod-rebuild for A0 + ~15s assertion sequence).

Operator empirical-verification deferred to orchestrator (per
feedback-pre-checkpoint-bundle-gates.md — the orchestrator runs SW
CSP-safety + Node-globals + DOM-globals grep on the built bundle
before surfacing any checkpoint).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:45:25 +02:00
a63066a289 chore(01-13): wave-0 — clean broken Approach-A artifacts per 01-11-SUMMARY
Restore a clean baseline before promoting the c647f61 prototype to
production paths (Wave 1) and building out Approach-B driver
scaffolding (Wave 2). All deletions trace back to falsifications
documented in 01-11-SUMMARY.md.

Deleted — broken Approach-A files:
  - src/test-hooks/sw-hooks.ts
      MV3 SW blocks dynamic import (Chromium es_modules.md;
      w3c/webextensions#212). The gated `await import('../test-hooks/
      sw-hooks')` from 01-11 Wave 1 never resolved → SW silently died →
      production listeners never registered. File was dead-on-arrival;
      no fix possible while MV3 SWs disallow dynamic import. Approach-B
      replaces SW-side instrumentation with the extension-internal
      harness page's chrome.action.* + chrome.notifications.* surface
      (full privilege; no monkey-patching needed).

  - tests/uat/lib/{launch,extension,sw,offscreen,assertions}.ts
      Popup-bridge architecture (01-11 dbd977c) — falsification 2 +
      falsification 3 in 01-11-SUMMARY: `sw.evaluate` exposes only
      chrome.{loadTimes,csi}, NOT chrome.action.* / chrome.notifications.*
      / chrome.runtime.sendMessage; setPopup-juggling for extension-id
      resolution turned out to be unnecessary (browser.extensions()
      works directly per the prototype). These files will be reborn in
      Wave 2 around the extension-page architecture.

      Kept: tests/uat/lib/zip.ts (host-side JSZip work — architecture-
      agnostic; A12+A13 still use it) and tests/uat/lib/test-hook-
      contract.d.ts (type mirror — extended in Wave 3 but kept as-is here).

  - tests/uat/prototype/probe_{offscreen,sw,tabs,tabs2}.mjs
      Feasibility-research probes (01-11 spike) that empirically falsified
      the Approach-A hypotheses. The findings are encoded in 01-11-
      SUMMARY.md; the probes themselves are dead code.

  - tests/uat/harness.test.ts
      01-11 Wave 2 popup-bridge orchestrator (dbd977c). Imports the
      now-deleted tests/uat/lib/{assertions,extension,sw,offscreen,launch}
      modules — would not typecheck after this commit. Reborn in
      Wave 3A as the Approach-B orchestrator (extension-internal page
      driver + A0 grep gate + 13 assertion drivers).

Reverted — SW-side dynamic-import gate comment block:
  - src/background/index.ts lines 13-29
      The existing comment block (post-spike) described the SW-side
      gated dynamic import that never landed. Rewritten to cite 01-13
      Approach-B explicitly, link to 01-11-SUMMARY.md falsification,
      and clarify that the Tier-1 grep gate's enduring value is
      catching regressions in the offscreen chunk's __MOKOSH_UAT__
      gate (the SW chunk is hook-free by construction).

Updated — Tier-1 grep gate FORBIDDEN_HOOK_STRINGS inventory:
  - tests/background/no-test-hooks-in-prod-bundle.test.ts
      Removed: `simulateUserStop` (Approach-A naming; replaced by
      Approach-B `dispatchEndedOnTrack` which matches the W3C
      dispatchEvent semantics per RESEARCH §7 BLOCKER — track.stop()
      does NOT fire 'ended' per spec, so the simulation MUST use
      dispatchEvent).
      Added: `installFakeDisplayMedia`, `uninstallFakeDisplayMedia`,
      `dispatchEndedOnTrack`, `__mokoshOffscreenQuery`.
      Total inventory: 8 surface strings (was 5). Each MUST be absent
      from every file under dist/ post-build.

Verification (all GREEN):
  - `npm run build` — exit 0; dist/ populated.
  - `grep -rln <forbidden> dist/` — 0 matches.
  - `npm run build:test` — exit 0; dist-test/ populated; offscreen-hooks
    chunk contains `installFakeDisplayMedia` (gate runs correctly
    against the test build's distinct artifact).
  - `npx tsc --noEmit` — exit 0 (root + tests/uat/tsconfig.json).
  - `npx vitest run` — 92/92 tests passing (was 89; the +3 new tests
    come from the FORBIDDEN_HOOK_STRINGS list expanding 5 → 8 — each
    forbidden string is one parametric `it(...)` block).

Both prior-failing tests now GREEN:
  - tests/background/sw-bundle-import.test.ts (was missing dist/ → 92/92
    requires the test run to have a current dist/; vitest gate test
    rebuilds via execFile when SKIP_BUILD≠1, otherwise relies on prior
    `npm run build`).
  - tests/background/no-test-hooks-in-prod-bundle.test.ts (was failing
    on stale dist; now GREEN against the freshly-rebuilt clean bundle).

Wave 1 (next): promote tests/uat/prototype/{extension-page-harness.html,
extension-page-harness.ts,a6.test.ts} to tests/uat/ via `git mv`;
update vite.test.config.ts rollup input.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 14:54:41 +02:00
f44ca3afba wip(01-11): wave-3 partial — A1+A4 attempted, popup-bridge SW state query unreliable
Task 4 of Plan 01-11 attempted A1-A4 wiring. Empirical run reveals an
architectural blocker that needs orchestrator-level decision.

Current state after this commit (SKIP_PROD_REBUILD=1 npx tsx tests/uat/harness.test.ts):
- A0 [PASS]: production bundle hook-leak grep gate (17ms)
- A1 [FAIL]: SW bootstrap → setIdleMode — popup state never transitions
  to '' despite keepalive ping + 3s waitFor. chrome.action.getPopup({})
  from the popup page consistently returns the manifest default
  (chrome-extension://<id>/src/popup/index.html), not the '' that
  setIdleMode's chrome.action.setPopup({popup:''}) should produce.
- A2 [FAIL]: toolbar onClicked — badge never transitions to "REC" after
  page.triggerExtensionAction(extension); 8s timeout. Either the
  toolbar action isn't reaching the SW listener, OR getDisplayMedia's
  picker isn't resolving in headless mode (despite the auto-select flag).
- A3 [FAIL]: offscreen target never appears (correlates with A2 — no
  recording started, no offscreen document spawned).
- A4 [PASS]: trivially passes (offscreen count is 0 → 0, both before
  + after the click). Not a true assertion of behavior; would also pass
  if the whole extension were broken.
- A5-A13: stubbed RED per plan.

Architectural blocker (Rule 4 — needs orchestrator decision):
- Puppeteer 25.0.2 + Chrome 148 + headless cannot reliably keep the MV3
  SW alive long enough OR expose its real chrome.* state to a popup
  page query. The popup-bridge architecture (Task 3 commit dbd977c)
  works for synchronous bridge queries (snapshot, fire-on-startup)
  but does NOT reliably reflect chrome.action.setPopup / setBadgeText
  state changes initiated by the SW.

Three plausible paths forward (need orchestrator pick):

  Option A — Content-script bridge: inject a content script that
    bridges chrome.* queries to a webpage's window.* RPC surface;
    harness uses page.evaluate against the content script instead of
    popup.evaluate. Pros: content scripts have stable lifetime tied to
    the page they're injected into. Cons: content scripts have
    DIFFERENT chrome.* surface (no chrome.action API surface — they
    can't read getBadgeText / getPopup at all). Likely DOESN'T solve
    the underlying problem.

  Option B — Headful with Xvfb on CI: relax the headless requirement;
    accept Xvfb dependency. Per Plan 01-11 RESEARCH §3, RESEARCH
    claimed headless works on Chrome 148 — empirical refutation here.
    Pros: SW lifetime is more stable in headful mode; setPopup
    propagation is reliable. Cons: introduces Xvfb dep that RESEARCH
    explicitly said wasn't needed; CI complication.

  Option C — Shrink harness scope to bridge-able assertions: A0 (grep
    gate), A8 (Bug A onStartup via bridge), A9 (icon sizes via popup
    fetch), A10 (manifest via popup), A13 (zip shape — operator runs
    SAVE_ARCHIVE manually + drops zip to a known path; harness reads
    it). Skip A1-A7, A11, A12 (the ones that require live SW state
    observation through chrome.action API). Pros: ships the
    bug-A-coverage portion of the harness today; keeps Plan 01-09's
    Task 5 operator-checkpoint partly automated. Cons: doesn't retire
    operator entirely; Plan 01-09 stays open on operator-empirical
    A1-A7.

  Option D — Switch to WebDriver BiDi (the Puppeteer 25 alternative
    backend): Puppeteer 25 supports BiDi via {protocol: 'webDriverBiDi'}.
    BiDi may handle extension SW evaluation differently (different
    isolation model). Speculative — no empirical evidence either way.

What landed cleanly:
- Tier-1 hook-leak grep gate (T-1-11-01) GREEN: dist/ has zero
  __mokoshTest / simulateUserStop / getSegmentCount / setCurrentStream
  / setSegmentCountGetter / __mokoshTestQuery / __mokoshKeepalive
  occurrences after npm run build.
- Two-bundle infrastructure (dist/ vs dist-test/) operational.
- Bridge handler in sw-hooks.ts works for snapshot + fire-on-startup
  + handler-types ops (verified by no-hang on keepalivePing call).
- Existing 89-test vitest baseline preserved (no regression from any
  Wave 0/1/2/3 work).

Verification:
- npx tsc --noEmit (src/): exit 0
- npx tsc --noEmit -p tests/uat: exit 0
- npm run build: exit 0; dist/ hook-free
- SKIP_BUILD=1 npx vitest run: 89/89 GREEN
- SKIP_PROD_REBUILD=1 npx tsx tests/uat/harness.test.ts:
  2/14 passed (A0 + A4-trivially), 12 FAIL — non-zero exit as expected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 09:24:06 +02:00
dbd977c815 feat(01-11): wave-2 — Puppeteer harness scaffolding + A0 GREEN, popup-bridge architecture
Task 3 of Plan 01-11 (Puppeteer UAT harness).

Harness file tree (tests/uat/):
- harness.test.ts: tsx-runnable top-to-bottom harness entry point.
  Runs A0 inline (filesystem grep gate, abort-on-fail T-1-11-01),
  then launches Chrome + opens popup bridge + queries manifest, then
  iterates A1-A13 stubs. Each stub throws "NOT YET IMPLEMENTED —
  Plan 01-11 Task N wires this assertion". Exit code = 0 on full
  pass, 1 otherwise. Final line: "UAT harness: N/14 assertions passed".
- lib/launch.ts: launchHarnessBrowser() — wraps puppeteer.launch with
  enableExtensions:[dist-test/], headless default (HEADLESS=0
  override), --no-sandbox + --auto-select-desktop-capture-source flags.
  Polls browser.extensions() until the extension registers (empirically
  ~100ms but the first call right after launch returns Map(0)).
  Opens both a blank page (for triggerExtensionAction) AND the popup
  page (the bridge surface). Returns { browser, extension, extensionId,
  sw, downloadsDir, page, popup }.
- lib/extension.ts: waitForOffscreenTarget + attachToOffscreen +
  countOffscreenTargets. Offscreen attach uses target.type() ===
  'background_page' + .asPage() (NOT .page() — RESEARCH §4 Pitfall 1).
- lib/sw.ts: chrome.* state queries via the POPUP page handle (NOT
  the WebWorker handle — see architecture note below). getBadgeText,
  getPopup, getManifest, getIconSize, getIsRecording (side-channeled
  through badge text), fireOnStartup (via __mokoshTestQuery bridge),
  sendSyntheticRecordingError, getNotificationSnapshot (via bridge),
  keepalivePing (no-op message to wake SW for ~30s).
- lib/offscreen.ts: getDisplaySurface, simulateUserStop (the
  dispatchEvent('ended') path per RESEARCH §7 BLOCKER — DO NOT REFACTOR
  to track.stop()), getSegmentCount.
- lib/assertions.ts: runAssertion(idx, name, buffers, fn) wrapper —
  records pass/fail/duration; on failure dumps last 30 lines of SW
  + offscreen console buffers to stderr before rethrowing. assertEqual
  / assertMatch / assertTrue / assertGte / waitFor polling helper.
- lib/zip.ts: jszip-based assertArchiveShape + extractEntryToFile for
  assertions 12 + 13.
- README.md: runtime + local-debug + CI semantics + locale gotcha
  + dev-dep size note + assertion catalog table.
- tsconfig.json: per-tree type-check config (mirrors root tsconfig.json
  compiler options but includes the harness tree explicitly).

Architecture refinement (DEVIATION from RESEARCH §1 — Rule 1+3 inline fix):
- RESEARCH §1 sketched `sw.evaluate(() => chrome.action.getBadgeText({}))`
  as the chrome.* query path. Empirical probes during Task 3 execution
  against Puppeteer 25.0.2 + Chrome 148 + --headless=true revealed two
  blockers:
    1. Puppeteer's WebWorker.evaluate runs in an ISOLATED WORLD that
       carries SW globals (clients, registration, ...) but NOT the
       extension's full chrome.* API surface. Object.keys(chrome) inside
       sw.evaluate returns ["loadTimes","csi"] — the public webpage
       chrome, not the extension chrome.
    2. Chrome 148's headless mode aggressively suspends MV3 service
       workers; subsequent swTarget.worker() calls return
       "Protocol error: No target with given id found".
- WORKAROUND: open the popup page (chrome-extension://<id>/src/popup/
  index.html) as a separate Puppeteer Page. The popup has full
  chrome.* access (it's an extension context with same privileges as
  the SW) AND stable Puppeteer lifetime. For SW-globalThis state
  (__mokoshTest in the SW isolate, NOT in the popup), bridge via
  chrome.runtime.sendMessage. The popup sends
  { type: '__mokoshTestQuery', op: 'snapshot' | 'fire-on-startup' |
  'handler-types' }; the SW hook's onMessage handler responds.
- Bridge implementation added to src/test-hooks/sw-hooks.ts — registers
  AFTER the production listeners so it never intercepts production
  messages (__mokoshTest* type is unambiguously test-only). Tier-1
  grep gate (no-test-hooks-in-prod-bundle.test.ts) continues to enforce
  ZERO __mokoshTest occurrences in dist/ — the bridge handler is
  tree-shaken alongside the rest of the hook module via the
  __MOKOSH_UAT__ gate.

Other configuration changes:
- vitest.config.ts: exclude tests/uat/** from vitest discovery. The
  Puppeteer harness is invoked via `npm run test:uat` (not vitest);
  running it under vitest would try to launch real Chrome inside a
  vitest worker. The .test.ts suffix is retained for editor +
  naming-convention consistency with the rest of the tree.

Verification:
- npx tsc --noEmit (src/): exit 0
- npx tsc --noEmit -p tests/uat: exit 0
- npm run build: exit 0
- grep -rln '__mokoshTest|simulateUserStop|getSegmentCount|setCurrentStream|setSegmentCountGetter|__mokoshTestQuery|__mokoshKeepalive' dist/: ZERO matches
- npm run build:test: exit 0; dist-test/ populated with the new bridge code
- SKIP_BUILD=1 npx vitest run: 89/89 GREEN
- SKIP_PROD_REBUILD=1 npx tsx tests/uat/harness.test.ts:
  → A0 [PASS]: production bundle has no test-hook leaks (19ms)
  → Browser launches; popup opens; manifest read succeeds
  → A1-A13 [FAIL]: NOT YET IMPLEMENTED — Plan 01-11 Task N wires this
  → "UAT harness: 1/14 assertions passed, 13 failed (first failure: A1)"
  → Exit code: 1 (expected — 13 RED stubs intentional)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 09:14:58 +02:00