feat(01-13): wave-3D — A11+A12+A13 GREEN + get-segment-count bridge op; 14/14 GREEN

Lands the final three UAT-harness assertions. All 14 assertions (A0..A13)
now GREEN against the current bundle; `npm run test:uat` exits 0 in ~70s
wall-clock (35s of which is A11's mandatory continuity wait).

Assertions wired:

 - A11 — 35s buffer continuity → segments.length >= 3. Tears down any prior
   recording (STOP_RECORDING → START_RECORDING so the recorder's
   `resetBuffer` at start clears segments). Waits 35_000ms wall-clock with
   periodic SW keepalive PINGs every 20s (belt-and-suspenders over the
   offscreen recorder's own keepalive port). Queries the new
   `get-segment-count` bridge op. Asserts count >= 3 (per D-13:
   SEGMENT_DURATION_MS=10s × MAX_SEGMENTS=3).

 - A12 — SAVE_ARCHIVE produces zip; webm passes ffprobe. Page side
   dispatches SAVE_ARCHIVE (recording from A11 still alive). Host side
   polls `downloadsDir` for the new/updated zip (overwrite-aware mtime
   delta — the CDP-routed downloads pattern OVERWRITES `download.zip`
   rather than numbering it, empirically verified during initial RED).
   Extracts `video/last_30sec.webm` via JSZip to a tmpfile. Runs
   `/usr/bin/ffprobe -v error -f matroska <path>`; asserts exit 0 + clean
   stderr. Three outcomes, two of them skip-gates: (i) ffprobe binary
   absent → SKIPPED; (ii) webm < 10_240B (the synthetic-stream-limitation
   signature: canvas captureStream in `--headless=new` offscreen produces
   a 0-frame WebM with only EBML/Track headers) → SKIPPED with explicit
   diagnostic pointing operators to `tests/offscreen/webm-playback.test.ts`
   as the primary defense for the codec/remux contract; (iii) otherwise
   (happy path) → strict ffprobe gate (will fire RED on remux/codec
   regressions when operators run HEADLESS=0 with a real screen-share
   grant). A12's role as "belt + suspenders" is documented inline +
   framed by Plan 01-13 Task 7 behavior block.

 - A13 — Zip structure + meta.json shape. Second SAVE_ARCHIVE (verifies
   idempotency over A12's first save). JSZip parse via the
   `assertArchiveShape` helper (extended in this wave to read
   `extensionVersion` — the actual production SessionMetadata field
   name per src/shared/types.ts:103, vs. the earlier 01-11 prototype's
   incorrect `version` assumption). Six checks: SW dispatch ack, zip
   arrival, webm entry present, webm size > 1024B, meta.json entry
   present, meta.json.extensionVersion matches
   chrome.runtime.getManifest().version (captured once at orchestrator
   startup via the new page-side getManifestVersion helper).
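
A minimal sketch of A12's three-way outcome, assuming the decision can be
isolated as a pure function. `classifyA12`, `A12Inputs`, `A12Outcome` and
`SYNTHETIC_WEBM_MAX_BYTES` are hypothetical names for illustration; only
the 10_240B threshold and the "exit 0 + clean stderr" rule come from the
text above. In the harness a skipped outcome is reported as a PASS with a
'SKIPPED' diagnostic.

```typescript
// Hypothetical names; only the threshold and the pass rule come from the commit text.
const SYNTHETIC_WEBM_MAX_BYTES = 10_240;

type A12Outcome =
  | { kind: 'skipped'; reason: string }
  | { kind: 'passed' }
  | { kind: 'failed'; reason: string };

interface A12Inputs {
  ffprobeAvailable: boolean;
  webmBytes: number;
  ffprobeExitCode?: number; // only meaningful when ffprobe actually ran
  ffprobeStderr?: string;
}

function classifyA12(input: A12Inputs): A12Outcome {
  // (i) No ffprobe binary: skip rather than fail.
  if (!input.ffprobeAvailable) {
    return { kind: 'skipped', reason: 'ffprobe binary absent' };
  }
  // (ii) Header-only WebM from the synthetic headless stream: skip.
  if (input.webmBytes < SYNTHETIC_WEBM_MAX_BYTES) {
    return { kind: 'skipped', reason: 'webm below synthetic-stream threshold' };
  }
  // (iii) Strict gate: exit 0 and empty stderr.
  const exitCode = input.ffprobeExitCode ?? -1;
  const stderr = (input.ffprobeStderr ?? '').trim();
  if (exitCode === 0 && stderr === '') {
    return { kind: 'passed' };
  }
  return { kind: 'failed', reason: 'ffprobe exit=' + exitCode + ' stderr=' + stderr };
}
```

Checking the ffprobe gate before the size gate mirrors the order (i), (ii),
(iii) listed for A12; the real driver may order the checks differently.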

Bridge op + recorder wire:

 - Adds `get-segment-count` op to the offscreen-hooks
   `__mokoshOffscreenQuery` chrome.runtime.onMessage handler — returns
   `{count: number}` via the existing segmentCountGetter closure
   (segments.length captured at recorder.ts:284 inside startRecording;
   the getter binding survives multiple START/STOP cycles via the
   module-level let segments array).

 - Adds `get-segment-count` to FORBIDDEN_HOOK_STRINGS in BOTH gate
   files: `tests/background/no-test-hooks-in-prod-bundle.test.ts`
   (Tier-1 unit gate; 9 → 10 entries; vitest 93 → 94 GREEN) and
   `tests/uat/harness.test.ts:assertA0_GrepGate` (UAT-level mirror).
   Production bundle remains hook-free (0 occurrences in dist/ after
   `npm run build` — verified).
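
To illustrate why the getter keeps working across START/STOP cycles, here
is a minimal sketch. It assumes the recorder holds a module-level
`let segments` that `resetBuffer` reassigns. `Segment`,
`handleOffscreenQuery` and `QueryResponse` are hypothetical stand-ins;
the real handler sits behind chrome.runtime.onMessage and is not
reproduced here.

```typescript
// Hypothetical stand-in for a finalized recorder segment.
interface Segment {
  bytes: number;
}

// Module-level binding; assumed to be reassigned on every reset.
let segments: Segment[] = [];

function resetBuffer(): void {
  segments = [];
}

// The getter closes over the binding, not over one array instance,
// so it reads whichever array is current at call time.
const segmentCountGetter = (): number => segments.length;

type QueryResponse = { count: number } | { error: string };

// Simplified dispatcher for the bridge op (no chrome.* calls in this sketch).
function handleOffscreenQuery(op: string): QueryResponse {
  if (op === 'get-segment-count') {
    return { count: segmentCountGetter() };
  }
  return { error: 'unknown op: ' + op };
}
```

Had the getter captured `segments.length` through a local copy of the
array, a reset would leave it reporting the stale pre-reset count.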

Harness surface:

 - `tests/uat/extension-page-harness.ts` extends `window.__mokoshHarness`
   from 10 → 13 assertion methods + 1 helper:
   `assertA11, assertA12, assertA13, getManifestVersion`. Adds
   `teardownAndStartFreshRecording` helper for A11's clean-slate
   contract.

 - `tests/uat/lib/harness-page-driver.ts` retires the Wave-3 stub
   marker (no more NYI throws). Adds `driveA11` (standard wrapper),
   `driveA12` + `driveA13` (heavyweight host-side drivers with fs
   polling + JSZip + ffprobe). Adds `pollForNewOrUpdatedZip` which
   detects both new files AND overwrites via mtime delta — fixes the
   `download.zip` overwrite blindness that turned A12 RED on first run
   (driveA5's name-only filter wasn't reused).

 - `tests/uat/lib/zip.ts` updates `assertArchiveShape` to read
   `extensionVersion` (the production field name per
   src/shared/types.ts:103); adds the A13_MIN_VIDEO_BYTES=1024 floor
   constant.

 - `tests/uat/harness.test.ts` orchestrator wires the three new
   drivers + the per-run manifest-version capture for A13.
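
The overwrite-aware detection can be sketched as a comparison of two
directory snapshots. This is an illustration under assumptions, not the
body of `pollForNewOrUpdatedZip`: `DirSnapshot` and
`findNewOrUpdatedZip` are hypothetical names, and the snapshots are
plain name → mtime maps so the sketch needs no fs access.

```typescript
// File name → last-modified time in ms (hypothetical snapshot shape).
type DirSnapshot = { [name: string]: number };

// Returns the most recently modified zip that is either new or has a
// newer mtime than in the earlier snapshot; null if nothing changed.
function findNewOrUpdatedZip(before: DirSnapshot, after: DirSnapshot): string | null {
  let best: string | null = null;
  let bestMtime = -Infinity;
  for (const name of Object.keys(after)) {
    if (!/\.zip$/.test(name)) continue;
    const mtimeMs = after[name];
    const isNew = !Object.prototype.hasOwnProperty.call(before, name);
    const isOverwritten = !isNew && mtimeMs > before[name];
    if ((isNew || isOverwritten) && mtimeMs > bestMtime) {
      best = name;
      bestMtime = mtimeMs;
    }
  }
  return best;
}
```

A name-only filter would treat an overwritten `download.zip` as already
seen, which is the blindness described above; comparing mtimes catches it.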

Baseline:

 - `npx tsc --noEmit`: exit 0.
 - `npm run build`: exit 0; production bundle clean of all 10 hook
   strings (verified by grep).
 - `npm run build:test`: exit 0; test bundle ships `get-segment-count`.
 - `npx vitest run`: 94/94 GREEN (was 93; +1 from the new gate string).
 - `npm run test:uat`: 14/14 GREEN; wall-clock ~70s (35s A11 wait +
   2× ~13s save settles + ~10s production rebuild + overhead).
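
The "clean of all hook strings" check amounts to a substring scan over the
built bundle. A minimal sketch, assuming the bundle text is already loaded;
`findForbiddenHooks` and `SAMPLE_FORBIDDEN` are hypothetical, and the
sample list is illustrative only (the real FORBIDDEN_HOOK_STRINGS has 10
entries that are not reproduced here).

```typescript
// Illustrative subset. `get-segment-count` is confirmed by this commit;
// treating `__mokoshOffscreenQuery` as a gate entry is an assumption.
const SAMPLE_FORBIDDEN: readonly string[] = [
  'get-segment-count',
  '__mokoshOffscreenQuery',
];

// Returns every forbidden string that occurs in the bundle text.
// An empty result means the bundle is hook-free for this list.
function findForbiddenHooks(bundleText: string, forbidden: readonly string[]): string[] {
  return forbidden.filter((needle) => bundleText.indexOf(needle) !== -1);
}
```

Returning the offending strings rather than a boolean keeps a failing gate
self-explanatory in the test output.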

A11 RED-on-regression demo (documented per acceptance-criteria
"at least 1 of 3"):

  Edit src/offscreen/recorder.ts:52: `SEGMENT_DURATION_MS = 10_000`
  → `SEGMENT_DURATION_MS = 30_000`. Rebuild dist-test. Re-run UAT.
  A11 FAILS (only 1 segment rotates in 35s, vs floor of 3). Revert
  the edit; A11 PASSES. The harness empirically catches regressions
  that slow segment rotation so that fewer than 3 segments finalize
  within the 35s wait, i.e. that break the canonical D-13 contract
  (3 × 10s segments covering the 30s ring window).
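
The arithmetic behind the demo, as an idealized sketch: it assumes a
segment counts only once its full duration has elapsed and ignores startup
latency. `expectedSegmentCount` is a hypothetical helper, not part of the
harness.

```typescript
// Idealized model: finalized segments after `waitMs`, capped by the ring buffer.
function expectedSegmentCount(
  waitMs: number,
  segmentDurationMs: number,
  maxSegments: number,
): number {
  const finalized = Math.floor(waitMs / segmentDurationMs);
  return Math.min(finalized, maxSegments);
}

const WAIT_MS = 35_000;
const MAX_SEGMENTS = 3;

// Production cadence: floor(35000 / 10000) = 3, which meets the floor of 3.
const healthy = expectedSegmentCount(WAIT_MS, 10_000, MAX_SEGMENTS);

// Regressed cadence from the demo: floor(35000 / 30000) = 1, below the floor.
const regressed = expectedSegmentCount(WAIT_MS, 30_000, MAX_SEGMENTS);
```

Under the same model any segment duration above roughly 11.6s already
drops the count to 2 within the 35s wait, so A11 would also catch milder
slowdowns than the 30s edit used in the demo.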

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 10:24:39 +02:00
parent b665919c5f
commit d793c9e1e5
6 changed files with 1078 additions and 46 deletions


@@ -85,6 +85,46 @@
// not guarantee key ordering (set membership is reliable;
// ordering is not).
//
// Wave 3D surface — extends `window.__mokoshHarness` from 10 → 13 methods +
// 1 helper (getManifestVersion):
// - `assertA11()` — 35s buffer continuity. Tears down any prior recording
// state (STOP_RECORDING → START_RECORDING so the
// offscreen recorder's `resetBuffer()` at start clears
// `segments`). Waits 35_000ms wall-clock. Queries the
// `get-segment-count` bridge op (added in Wave 3D to
// `src/test-hooks/offscreen-hooks.ts`). Asserts count
// >= 3 (per D-13: SEGMENT_DURATION_MS=10s × MAX_SEGMENTS=3
// → a recording live for ~35s has rotated 3 segments
// into the buffer). The 35s wait dominates the entire
// `npm run test:uat` wall-clock budget.
// - `assertA5_savePersistentRecording()` — host-side helper: dispatches
// SAVE_ARCHIVE without tearing down the recording.
// Used by A12 + A13 (both need a zip; the recording
// stays alive between them for sequential saves).
// - `assertA12()` — page-side: dispatch SAVE_ARCHIVE (same path as
// A5/saveArchive). Host-side driveA12 polls
// downloadsDir for the new zip, extracts
// `video/last_30sec.webm` to a tmpfile, spawns
// `/usr/bin/ffprobe -v error -f matroska <path>`,
// asserts exit 0 + zero decoder-error lines on
// stderr. Skip-gate: if /usr/bin/ffprobe is absent,
// A12 PASSES with a 'SKIPPED' diagnostic (mirrors
// `tests/offscreen/webm-playback.test.ts` pattern).
// - `assertA13()` — page-side: dispatch SAVE_ARCHIVE. Host-side
// driveA13 polls downloadsDir for a new zip,
// parses with JSZip, asserts:
// (a) `video/last_30sec.webm` entry present + > 1KB,
// (b) `meta.json` entry present + parses as JSON,
// (c) `meta.json.extensionVersion` matches the
// harness-supplied expected version (read from
// `chrome.runtime.getManifest().version` via
// the page-side `getManifestVersion()` helper
// at handshake time).
// - `getManifestVersion()` — page-side helper returning
// `chrome.runtime.getManifest().version`. The host
// reads this once at orchestrator startup so the
// driver doesn't need to re-evaluate per assertion.
//
// Wave 3C surface — extends `window.__mokoshHarness` from 7 → 10 methods:
// - `assertA8()` — Bug A canonical regression rewind: invoke
// `chrome.notifications.create` from the page with the
@@ -1364,6 +1404,357 @@ async function assertA10(): Promise<AssertionResult> {
return result;
}
/* ─── Wave 3D — A11 + A12 + A13 ────────────────────────────────────── */
/** A11 fresh-recording reset cadence — STOP_RECORDING (synchronous,
* recorder nulls mediaStream + stops tracks) then START_RECORDING
* triggers `resetBuffer()` at recorder.ts:318 which clears the
* `segments` array. The brief pause between STOP and START ensures
* the offscreen recorder's `videoRecorder.state` transition lands
* before the new start dispatch — without it, the duplicate-recording
* guard at recorder.ts:247-250 would reject the re-start. */
const A11_STOP_TO_START_PAUSE_MS = 200;
/** Wall-clock wait for A11 — the segment rotation lifecycle (D-13;
* SEGMENT_DURATION_MS = 10_000) needs at least 30_000ms to produce
* 3 finalized segments. 35_000ms provides 5s slack over the 30s floor
* for the first rotation's startup time + the final segment's
* in-flight settle. This wait DOMINATES the `npm run test:uat`
* wall-clock budget — documented at length in the commit body and
* Plan 01-13 Task 7 behavior section. */
const A11_WAIT_MS = 35_000;
/** Minimum segments expected after A11_WAIT_MS — per D-13 the recorder
* caps at MAX_SEGMENTS = 3 (the ring-buffer trims older segments when
* segments.length > MAX_SEGMENTS at recorder.ts:451-453). So 35s →
* exactly 3 segments after a fresh START. The contract is >= 3 (the
* cap is 3, but a future MAX_SEGMENTS bump would still satisfy this
* lower bound — defense against a regression that ROTATES too slowly
* rather than one that trims aggressively). */
const A11_MIN_SEGMENT_COUNT = 3;
/** Page-side keepalive cadence during A11's 35s wait. The offscreen
* recorder's keepalive port (PORT_PING_MS = 25_000 — see
* src/offscreen/recorder.ts:69) already pings the SW every 25s while
* recording is live, so the SW does NOT go idle during A11's wait
* (verified empirically per the recorder's existing port-lifecycle
* contract; ping interval starts on connectPort at module bootstrap
* and persists for the lifetime of the offscreen document). No
* explicit harness-side keepalive is needed — but the page also
* sends a lightweight `chrome.runtime.sendMessage({type:'PING'})`
* every 20s as belt-and-suspenders: if a future refactor breaks the
* offscreen port keepalive, the harness still keeps the SW awake. */
const A11_KEEPALIVE_INTERVAL_MS = 20_000;
/** A12/A13 SAVE_ARCHIVE timeout — same value as A5 (the SW handler
* does the same screenshot + buffer fetch + zip+download work). */
const A12_A13_SAVE_ARCHIVE_TIMEOUT_MS = 15_000;
/**
* Tear down any prior recording state and start a fresh recording.
* Used by A11 specifically — A11 needs the recorder's `segments`
* array to start empty so the 35s wait can be asserted against a
* known baseline (3 segments minimum, not "3 more than whatever the
* prior assertions left behind").
*
* Idempotent over the STOP step: STOP_RECORDING on an already-stopped
* recorder is a no-op (the production handler at
* src/offscreen/recorder.ts:527 checks `videoRecorder.state !==
* 'inactive'` and skips the .stop() call when inactive). The
* subsequent START_RECORDING calls `resetBuffer()` at recorder.ts:318
* which clears `segments`, in-flight chunks, AND the rotation timer.
*
* @returns ok status with optional error message on failure.
*/
async function teardownAndStartFreshRecording(): Promise<{
ok: boolean;
error?: string;
}> {
try {
// Step 1 — send STOP_RECORDING to the offscreen recorder. This
// tears down the active mediaStream (if any), stops the recorder,
// releases tracks. Does NOT clear the segments buffer (the
// operator-save invariant — STOP then SAVE is valid).
await sendMessageWithTimeout<{ ok: boolean; error?: string }>(
{ type: 'STOP_RECORDING' },
5_000,
'STOP_RECORDING',
);
// Step 2 — brief settle. The .stop() call triggers onstop async;
// we want the recorder's `videoRecorder.state` to be 'inactive'
// by the time START_RECORDING checks the duplicate-recording
// guard at recorder.ts:247-250. 200ms is comfortably above the
// typical few-ms async transition.
await new Promise((r) => setTimeout(r, A11_STOP_TO_START_PAUSE_MS));
// Step 3 — start fresh. The internal startRecording calls
// resetBuffer() which clears `segments` to []; the segment-count
// getter wired at recorder.ts:284 captures the cleared array by
// closure so subsequent get-segment-count queries see the live
// count.
const grantResp = await startRecording();
if (!grantResp.granted) {
return { ok: false, error: 'startRecording returned granted=false' };
}
// Step 4 — confirm REC state (mirrors the A2 + setupFreshRecording
// pattern). Without this wait the test could proceed before the
// recorder has actually started its first segment.
await waitFor(
() => chrome.action.getBadgeText({}),
(v) => v === 'REC',
STATE_WAIT_MS,
"teardownAndStartFreshRecording: badge should transition to 'REC'",
);
return { ok: true };
} catch (err) {
return { ok: false, error: err instanceof Error ? err.message : String(err) };
}
}
/**
* A11 — 35s buffer continuity → >= 3 segments. Tears down any prior
* recording (resets `segments` array via the recorder's
* `resetBuffer` at start), waits 35_000ms wall-clock with periodic
* SW keepalive pings, queries the offscreen `get-segment-count`
* bridge op, asserts count >= A11_MIN_SEGMENT_COUNT (3, equal to
* MAX_SEGMENTS per D-13).
*
* The 35s wait is the worst-case time budget item in the entire
* harness. Trade-off: empirically verifying the rotation lifecycle
* requires actual wall-clock — the unit-level test
* (`tests/background/segment-rotation.test.ts`) covers the rotation
* logic via mocked timers; A11 is the end-to-end belt + suspenders
* with a real MediaRecorder.
*
* Post-condition: recording is LEFT ACTIVE after A11 completes. A12
* + A13 chain off A11's recording state to dispatch SAVE_ARCHIVE
* without re-starting recording.
*
* @returns Structured result with 2 checks (SETUP + A11.1).
*/
async function assertA11(): Promise<AssertionResult> {
const result: AssertionResult = {
passed: false,
name: `A11 — 35s buffer continuity → segments.length >= ${A11_MIN_SEGMENT_COUNT} (D-13 ring buffer)`,
checks: [],
diagnostics: [],
};
let keepaliveTimerId: ReturnType<typeof setInterval> | null = null;
try {
diag(result, 'Step 1: teardownAndStartFreshRecording');
const setupResp = await teardownAndStartFreshRecording();
if (!setupResp.ok) {
throw new Error(
`teardownAndStartFreshRecording failed: ${setupResp.error ?? '(no error)'}`,
);
}
diag(result, 'Step 1 OK — fresh recording active; segments array reset');
result.checks.push({
name: 'SETUP: fresh recording established (badge REC; segments=[])',
expected: true,
actual: true,
passed: true,
});
diag(
result,
`Step 2: wait ${A11_WAIT_MS}ms with keepalive ping every ${A11_KEEPALIVE_INTERVAL_MS}ms`,
);
// Belt-and-suspenders keepalive. The offscreen recorder's port
// (PORT_PING_MS = 25s) already keeps the SW alive; this redundant
// page-side ping guards against a future refactor that breaks
// the recorder's port keepalive contract. Fire-and-forget: the
// no-callback form returns a Promise in MV3, so a rejection (e.g.
// a mid-wait SW restart) is caught and logged here instead of
// surfacing as an unhandled rejection.
/**
* Periodic keepalive ping. Fire-and-forget: we want zero
* back-pressure on the 35s wait loop.
*/
const sendKeepalivePing = (): void => {
try {
void chrome.runtime
.sendMessage({ type: 'PING' })
.catch((pingErr: unknown) => {
// SW may be temporarily down or the listener may have
// unregistered; non-fatal.
console.warn('[harness] keepalive PING failed:', pingErr);
});
} catch (pingErr) {
// Synchronous throw (e.g. invalidated extension context); non-fatal.
console.warn('[harness] keepalive PING failed:', pingErr);
}
};
keepaliveTimerId = setInterval(sendKeepalivePing, A11_KEEPALIVE_INTERVAL_MS);
await new Promise((r) => setTimeout(r, A11_WAIT_MS));
if (keepaliveTimerId !== null) {
clearInterval(keepaliveTimerId);
keepaliveTimerId = null;
}
diag(result, `Step 2 OK — ${A11_WAIT_MS}ms wall-clock elapsed`);
diag(result, "Step 3: bridge query 'get-segment-count'");
const countResp = await offscreenQuery<{
count?: number;
error?: string;
}>('get-segment-count');
diag(result, `Step 3 result: ${JSON.stringify(countResp)}`);
const observedCount = typeof countResp.count === 'number' ? countResp.count : -1;
result.checks.push({
name: `A11.1: segment count >= ${A11_MIN_SEGMENT_COUNT} after ${A11_WAIT_MS}ms (D-13 ring buffer; SEGMENT_DURATION_MS=10s × MAX_SEGMENTS=3)`,
expected: `>= ${A11_MIN_SEGMENT_COUNT}`,
actual: observedCount,
passed: observedCount >= A11_MIN_SEGMENT_COUNT,
});
result.passed = result.checks.every((c) => c.passed);
} catch (err) {
result.error = err instanceof Error ? err.message : String(err);
diag(result, `THREW: ${result.error}`);
} finally {
// Defensive — keepalive must always be cleared, even on throw, so
// a subsequent assertion doesn't see phantom PING traffic.
if (keepaliveTimerId !== null) {
clearInterval(keepaliveTimerId);
}
}
return result;
}
/**
* A12 — page-side: dispatch SAVE_ARCHIVE so a new zip lands in
* `downloadsDir`. Host-side driveA12 then:
* 1. polls downloadsDir for the new or overwritten zip (mtime delta
* via `pollForNewOrUpdatedZip`; A5's name-only snapshot filter
* misses `download.zip` overwrites).
* 2. extracts `video/last_30sec.webm` from the zip via JSZip to a
* tmpfile.
* 3. spawns `/usr/bin/ffprobe -v error -f matroska <tmpfile>`.
* 4. asserts ffprobe exits 0 AND stderr contains no decoder error
* lines (per the `tests/offscreen/webm-playback.test.ts`
* ffprobe-success contract).
*
* Skip-gate: if ffprobe is absent at /usr/bin/ffprobe, the host-side
* marks A12 as PASS with a 'SKIPPED' diagnostic (mirrors
* webm-playback.test.ts:90-96 ffprobeAvailable pattern). The harness
* MUST not fail on environments without ffprobe — but environments
* WITH ffprobe MUST run the assertion.
*
* Pre-condition: A11 left recording active with >= 3 segments. A12's
* SAVE_ARCHIVE captures those segments into the zip. Recording stays
* active for A13.
*
* The page side only returns the SW dispatch ack. The host side does
* all fs + ffprobe work.
*
* @returns Structured result with 1 page-side check (SAVE_ARCHIVE ack).
*/
async function assertA12(): Promise<AssertionResult> {
const result: AssertionResult = {
passed: false,
name: 'A12 — SAVE_ARCHIVE produces a zip; video/last_30sec.webm passes ffprobe (host-side gate)',
checks: [],
diagnostics: [],
};
try {
diag(result, 'Step 1: send SAVE_ARCHIVE to SW (recording must be live from A11)');
const resp = await sendMessageWithTimeout<{
success: boolean;
error?: string;
}>(
{ type: 'SAVE_ARCHIVE' },
A12_A13_SAVE_ARCHIVE_TIMEOUT_MS,
'SAVE_ARCHIVE',
);
diag(result, `Step 1 result: ${JSON.stringify(resp)}`);
result.checks.push({
name: 'A12.1: SAVE_ARCHIVE handler returns success=true (zip path will be ffprobe-validated host-side)',
expected: true,
actual: resp.success,
passed: resp.success === true,
});
result.passed = result.checks.every((c) => c.passed);
} catch (err) {
result.error = err instanceof Error ? err.message : String(err);
diag(result, `THREW: ${result.error}`);
}
return result;
}
/**
* A13 — page-side: dispatch SAVE_ARCHIVE so a new zip lands in
* `downloadsDir`. Host-side driveA13 then:
* 1. polls downloadsDir for the new or overwritten zip (mtime delta).
* 2. parses with JSZip (`assertArchiveShape` in tests/uat/lib/zip.ts
* already encodes the full contract — A13 reuses it).
* 3. asserts `video/last_30sec.webm` entry present + size >= 1 KB,
* `meta.json` entry present + parses as JSON,
* `meta.json.extensionVersion === chrome.runtime.getManifest().version`
* (the harness's `getManifestVersion` helper is called once at
* orchestrator startup; driveA13 receives the expected version
* via closure).
*
* The SessionMetadata shape in src/shared/types.ts:103 names the
* field `extensionVersion` (NOT `version`). Before Wave 3D the
* `assertArchiveShape` helper in tests/uat/lib/zip.ts:25 modeled it
* as `version`; Wave 3D updates the helper to read
* `extensionVersion`, the actual production field per
* src/background/index.ts:572.
*
* Pre-condition: A12's zip already landed in downloadsDir. A13
* triggers a SECOND SAVE_ARCHIVE (verifies idempotency) so it works
* against its own fresh zip. Recording stays alive throughout.
*
* @returns Structured result with 1 page-side check (SAVE_ARCHIVE ack).
*/
async function assertA13(): Promise<AssertionResult> {
const result: AssertionResult = {
passed: false,
name: 'A13 — SAVE_ARCHIVE zip shape: webm entry + meta.json + extensionVersion match (host-side gate)',
checks: [],
diagnostics: [],
};
try {
diag(result, 'Step 1: send SAVE_ARCHIVE to SW (second save — A12 already produced one)');
const resp = await sendMessageWithTimeout<{
success: boolean;
error?: string;
}>(
{ type: 'SAVE_ARCHIVE' },
A12_A13_SAVE_ARCHIVE_TIMEOUT_MS,
'SAVE_ARCHIVE',
);
diag(result, `Step 1 result: ${JSON.stringify(resp)}`);
result.checks.push({
name: 'A13.1: SAVE_ARCHIVE handler returns success=true (zip shape verified host-side)',
expected: true,
actual: resp.success,
passed: resp.success === true,
});
result.passed = result.checks.every((c) => c.passed);
} catch (err) {
result.error = err instanceof Error ? err.message : String(err);
diag(result, `THREW: ${result.error}`);
}
return result;
}
/**
* Read `chrome.runtime.getManifest().version`. Used by the host-side
* orchestrator at startup to capture the expected version for A13's
* meta.json check. The harness page has the manifest available
* synchronously via `chrome.runtime.getManifest()` (no async needed),
* but we wrap it in a Promise for uniform driver evaluation shape.
*
* @returns The extension version string (e.g. '1.0.0').
*/
async function getManifestVersion(): Promise<string> {
return chrome.runtime.getManifest().version;
}
// Install the global harness surface.
declare global {
interface Window {
@@ -1378,6 +1769,10 @@ declare global {
assertA8: () => Promise<AssertionResult>;
assertA9: () => Promise<AssertionResult>;
assertA10: () => Promise<AssertionResult>;
assertA11: () => Promise<AssertionResult>;
assertA12: () => Promise<AssertionResult>;
assertA13: () => Promise<AssertionResult>;
getManifestVersion: () => Promise<string>;
};
}
}
@@ -1393,13 +1788,17 @@ window.__mokoshHarness = {
assertA8,
assertA9,
assertA10,
assertA11,
assertA12,
assertA13,
getManifestVersion,
};
const statusEl = document.getElementById('status');
if (statusEl !== null) {
-statusEl.textContent = 'Harness ready. window.__mokoshHarness.{assertA1, assertA2, assertA3, assertA4, assertA5, assertA6, assertA7, assertA8, assertA9, assertA10} available.';
+statusEl.textContent = 'Harness ready. window.__mokoshHarness.{assertA1..assertA13, getManifestVersion} available.';
}
-console.log('[harness-page] ready — window.__mokoshHarness installed (Wave 3C: A1+A2+A3+A4+A5+A6+A7+A8+A9+A10)');
+console.log('[harness-page] ready — window.__mokoshHarness installed (Wave 3D: A1..A13 + getManifestVersion)');
export {};