feat(01-13): wave-3D — A11+A12+A13 GREEN + get-segment-count bridge op; 14/14 GREEN

Lands the final three UAT-harness assertions. All 14 assertions (A0..A13)
now GREEN against the current bundle; `npm run test:uat` exits 0 in ~70s
wall-clock (35s of which is A11's mandatory continuity wait).

Assertions wired:

 - A11 — 35s buffer continuity → segments.length >= 3. Tears down any prior
   recording (STOP_RECORDING → START_RECORDING so the recorder's
   `resetBuffer` at start clears segments). Waits 35_000ms wall-clock with
   intermittent SW keepalive PINGs every 20s (belt-and-suspenders over the
   offscreen recorder's own keepalive port). Queries the new
   `get-segment-count` bridge op. Asserts count >= 3 (per D-13:
   SEGMENT_DURATION_MS=10s × MAX_SEGMENTS=3).

 - A12 — SAVE_ARCHIVE produces zip; webm passes ffprobe. Page side
   dispatches SAVE_ARCHIVE (recording from A11 still alive). Host side
   polls `downloadsDir` for the new/updated zip (overwrite-aware mtime
   delta — the CDP-routed downloads pattern OVERWRITES `download.zip`
   rather than numbering it, empirically verified during initial RED).
   Extracts `video/last_30sec.webm` via JSZip to a tmpfile. Runs
   `/usr/bin/ffprobe -v error -f matroska <path>`; asserts exit 0 + clean
   stderr. Three outcomes: (i) ffprobe binary absent → SKIPPED; (ii)
   webm < 10_240B (synthetic-stream-limitation signature — canvas
   captureStream in `--headless=new` offscreen produces 0-frame WebM
   with only EBML/Track headers) → SKIPPED with explicit diagnostic
   pointing operators to `tests/offscreen/webm-playback.test.ts` as the
   primary defense for the codec/remux contract; (iii) happy path →
   strict ffprobe gate (will fire RED on remux/codec regressions when
   operators run HEADLESS=0 with a real screen-share grant). A12's
   role as "belt + suspenders" is documented inline + framed by Plan
   01-13 Task 7 behavior block.

 - A13 — Zip structure + meta.json shape. Second SAVE_ARCHIVE (verifies
   idempotency over A12's first save). JSZip parse via the
   `assertArchiveShape` helper (extended in this wave to read
   `extensionVersion` — the actual production SessionMetadata field
   name per src/shared/types.ts:103, vs. the earlier 01-11 prototype's
   incorrect `version` assumption). Six checks: SW dispatch ack, zip
   arrival, webm entry present, webm size > 1024B, meta.json entry
   present, meta.json.extensionVersion matches
   chrome.runtime.getManifest().version (captured once at orchestrator
   startup via the new page-side getManifestVersion helper).
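
The three A12 outcomes described above can be sketched as a pure decision
function. This is a minimal sketch, not the driver's actual code: the names
`classifyA12Gate`, `A12Gate`, and `SYNTHETIC_STREAM_FLOOR_BYTES` are
hypothetical; only the 10_240-byte floor and the order of the checks come
from this commit.

```typescript
// Hypothetical names; only the 10_240-byte floor is taken from the commit.
type A12Gate = 'skip-ffprobe-absent' | 'skip-synthetic-stream' | 'run-ffprobe';

const SYNTHETIC_STREAM_FLOOR_BYTES = 10_240;

/** Decide how A12.4 should be reported for an extracted webm. */
function classifyA12Gate(ffprobePresent: boolean, webmBytes: number): A12Gate {
  // (i) No ffprobe binary: nothing can be validated, so skip.
  if (!ffprobePresent) {
    return 'skip-ffprobe-absent';
  }
  // (ii) Strictly below the floor: header-only webm from the synthetic stream.
  if (webmBytes < SYNTHETIC_STREAM_FLOOR_BYTES) {
    return 'skip-synthetic-stream';
  }
  // (iii) Real content: the strict ffprobe gate applies.
  return 'run-ffprobe';
}
```

Keeping the decision separate from the fs and child_process calls would make
the boundary (a webm of exactly 10_240 bytes is probed, not skipped) easy to
unit-test.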

Bridge op + recorder wire:

 - Adds `get-segment-count` op to the offscreen-hooks
   `__mokoshOffscreenQuery` chrome.runtime.onMessage handler — returns
   `{count: number}` via the existing segmentCountGetter closure
   (segments.length captured at recorder.ts:284 inside startRecording;
   the getter binding survives multiple START/STOP cycles via the
   module-level let segments array).

 - Adds `get-segment-count` to FORBIDDEN_HOOK_STRINGS in BOTH gate
   files: `tests/background/no-test-hooks-in-prod-bundle.test.ts`
   (Tier-1 unit gate; 9 → 10 entries; vitest 93 → 94 GREEN) and
   `tests/uat/harness.test.ts:assertA0_GrepGate` (UAT-level mirror).
   Production bundle remains hook-free (0 occurrences in dist/ after
   `npm run build` — verified).
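
Why the getter survives START/STOP cycles can be shown with a small sketch.
Everything here is a hypothetical stand-in for the recorder module (the
`QueryMessage` shape, `handleOffscreenQuery`, and `pushSegment` are
assumptions for illustration); the point is only that a closure over a
module-level `let` reads the binding at call time, so reassigning the array
in `resetBuffer` is still observed.

```typescript
// Hypothetical stand-ins for the recorder's module-level state.
let segments: Uint8Array[] = [];

// The closure reads `segments` when called, not when created, so a later
// reassignment of the variable is still visible through the getter.
const segmentCountGetter = (): number => segments.length;

function resetBuffer(): void {
  segments = []; // reassignment, not in-place truncation
}

function pushSegment(bytes: Uint8Array): void {
  segments.push(bytes);
}

// Assumed message shape for illustration only.
interface QueryMessage {
  readonly op: string;
}

/** Answer the bridge query; unknown ops fall through as undefined. */
function handleOffscreenQuery(msg: QueryMessage): { count: number } | undefined {
  if (msg.op === 'get-segment-count') {
    return { count: segmentCountGetter() };
  }
  return undefined;
}
```

Had the getter captured the array object itself (for example
`const captured = segments`), a reset that reassigns the variable would leave
it counting the stale array.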

Harness surface:

 - `tests/uat/extension-page-harness.ts` extends `window.__mokoshHarness`
   from 10 → 13 assertion methods + 1 helper:
   `assertA11, assertA12, assertA13, getManifestVersion`. Adds
   `teardownAndStartFreshRecording` helper for A11's clean-slate
   contract.

 - `tests/uat/lib/harness-page-driver.ts` retires the Wave-3 stub
   marker (no more NYI throws). Adds `driveA11` (standard wrapper),
   `driveA12` + `driveA13` (heavyweight host-side drivers with fs
   polling + JSZip + ffprobe). Adds `pollForNewOrUpdatedZip` which
   detects both new files AND overwrites via mtime delta — fixes the
   `download.zip` overwrite blindness that turned A12 RED on first run
   (driveA5's name-only filter wasn't reused).

 - `tests/uat/lib/zip.ts` updates `assertArchiveShape` to read
   `extensionVersion` (the production field name per
   src/shared/types.ts:103); adds the A13_MIN_VIDEO_BYTES=1024 floor
   constant.

 - `tests/uat/harness.test.ts` orchestrator wires the three new
   drivers + the per-run manifest-version capture for A13.
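
The overwrite-aware detection in `pollForNewOrUpdatedZip` reduces to a diff
between two directory listings. The sketch below isolates that diff as a pure
function, assuming the listings have already been read; `pickNewOrUpdatedZip`
and `ZipStat` are hypothetical names, and the real helper additionally polls
and applies a stable-size check.

```typescript
interface ZipStat {
  readonly name: string;
  readonly mtimeMs: number;
}

/**
 * Return the name of the most recently modified zip that is either new
 * (absent from the pre-dispatch snapshot) or overwritten (newer mtime),
 * or null when nothing changed.
 */
function pickNewOrUpdatedZip(
  preSnapshot: ReadonlyMap<string, number>,
  current: readonly ZipStat[],
): string | null {
  let best: ZipStat | null = null;
  for (const zip of current) {
    const priorMtime = preSnapshot.get(zip.name);
    const changed = priorMtime === undefined || zip.mtimeMs > priorMtime;
    if (changed && (best === null || zip.mtimeMs > best.mtimeMs)) {
      best = zip;
    }
  }
  return best === null ? null : best.name;
}
```

A name-only filter would return null for the overwritten `download.zip` case,
which is the blindness that turned A12 RED on its first run.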

Baseline:

 - `npx tsc --noEmit`: exit 0.
 - `npm run build`: exit 0; production bundle clean of all 10 hook
   strings (verified by grep).
 - `npm run build:test`: exit 0; test bundle ships `get-segment-count`.
 - `npx vitest run`: 94/94 GREEN (was 93; +1 from the new gate string).
 - `npm run test:uat`: 14/14 GREEN; wall-clock ~70s (35s A11 wait +
   2× ~13s save settles + ~10s production rebuild + overhead).

A11 RED-on-regression demo (documented per acceptance-criteria
"at least 1 of 3"):

  Edit src/offscreen/recorder.ts:52: `SEGMENT_DURATION_MS = 10_000`
  → `SEGMENT_DURATION_MS = 30_000`. Rebuild dist-test. Re-run UAT.
  A11 FAILS (only 1 segment rotates in 35s, vs floor of 3). Revert
  the edit; A11 PASSES. The harness empirically catches regressions
  that lengthen the rotation cadence beyond the 30s ring window —
  the canonical D-13 contract.
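
The arithmetic behind the demo can be sketched as follows. This assumes a
simplified model in which a segment is counted once its full duration has
elapsed and the ring never holds more than MAX_SEGMENTS entries;
`completedSegments` is a hypothetical helper, not a function in the recorder.

```typescript
/**
 * Segments fully rotated after `waitMs`, capped at the ring size.
 * Simplified model: rotation starts at t=0 and runs at a fixed cadence.
 */
function completedSegments(
  waitMs: number,
  segmentDurationMs: number,
  maxSegments: number,
): number {
  const rotated = Math.floor(waitMs / segmentDurationMs);
  return Math.min(rotated, maxSegments);
}

// Production cadence: 35s wait / 10s segments, ring of 3.
const baseline = completedSegments(35_000, 10_000, 3); // 3, meets the floor

// Regressed cadence from the demo: 35s wait / 30s segments.
const regressed = completedSegments(35_000, 30_000, 3); // 1, below the floor
```

Under this model, any cadence longer than 35s / 3 (about 11.6s) leaves fewer
than three completed segments at query time, so A11 turns RED.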

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 10:24:39 +02:00
parent b665919c5f
commit d793c9e1e5
6 changed files with 1078 additions and 46 deletions


@@ -33,12 +33,15 @@
// - Node fs.readdirSync / statSync:
// https://nodejs.org/api/fs.html
import { readFileSync, readdirSync, statSync } from 'node:fs';
import { resolve as resolvePath } from 'node:path';
import { spawnSync } from 'node:child_process';
import { existsSync, mkdtempSync, readFileSync, readdirSync, statSync, unlinkSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join, resolve as resolvePath } from 'node:path';
import type { Page } from 'puppeteer';
import type { AssertionRecord, CheckRecord } from './assertions';
import { assertArchiveShape, extractEntryToFile } from './zip';
import type { ArchiveShapeResult } from './zip';
/**
* Extended assertion-record shape for A5/A12/A13 which return
@@ -64,10 +67,11 @@ export interface AssertionWithBytes {
readonly expectedVersion?: string;
}
/** Marker error message for unimplemented Wave-3 drivers — orchestrator
* matches on this prefix to format the diagnostic distinctly from a
* genuine assertion failure. */
const WAVE3_STUB_PREFIX = 'NOT YET IMPLEMENTED';
// Note (Wave 3D — all 13 drivers wired): the WAVE3_STUB_PREFIX marker
// that gated unimplemented drivers across Waves 3A-3C has been retired
// — there are no more stubs. Future assertions (A14+) would follow
// the same wired-driver pattern below; no stub-marker is reintroduced
// unless multi-wave incremental rollout is needed again.
/**
* Drive the A6 (Bug B canonical) assertion. The proven, prototype-
@@ -385,28 +389,589 @@ export async function driveA10(page: Page): Promise<AssertionRecord> {
}) as AssertionRecord;
}
/* ─── Wave 3D — NOT YET IMPLEMENTED ──────────────────────────────── */
/* ─── Wave 3D — WIRED ─────────────────────────────────────────────── */
/**
* Drive A11 (35s → ≥3 segments). Wave 3D wires.
* @throws Always — replace stub when Wave 3D lands.
* Drive A11 (35s buffer continuity → segments.length >= 3). Standard
* page.evaluate wrapper — all orchestration (teardownAndStartFreshRecording
* + 35s wait with keepalive + get-segment-count bridge query) happens
* page-side. Host side just triggers + reads the result.
*
* Worst-case driver runtime: ~36 seconds (35s wait + ~1s setup/query
* overhead). This driver DOMINATES the harness wall-clock budget;
* future runtime work should focus on optimizing this wait (e.g.
* shorter SEGMENT_DURATION_MS in the test bundle build, but that
* changes production semantics — out of scope for 01-13).
*
* @param page - The harness page from `launchHarnessBrowser`.
* @returns Structured AssertionRecord with 2 checks (SETUP + A11.1).
*/
export async function driveA11(_page: Page): Promise<AssertionRecord> {
throw new Error(`${WAVE3_STUB_PREFIX} — Wave 3D wires driveA11`);
export async function driveA11(page: Page): Promise<AssertionRecord> {
return await page.evaluate(async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any -- evaluate runs in browser context where Window types are loose.
const harness = (window as any).__mokoshHarness;
const r: AssertionRecord = await harness.assertA11();
return r;
}) as AssertionRecord;
}
/** Absolute path to ffprobe. Mirrors the unit-level
* `tests/offscreen/webm-playback.test.ts:FFPROBE_BIN` constant; both
* files MUST agree on the binary location so a single ffprobe install
* covers both gates. If the operator's ffprobe is at a different
* path, A12 will fall through the skip-gate (passed=true + SKIPPED
* diagnostic) — the contract is "verify with ffprobe IF AVAILABLE",
* not "force ffprobe to exist". Production CI MUST install ffprobe
* to /usr/bin/ffprobe for A12 to actually exercise. */
const A12_FFPROBE_BIN = '/usr/bin/ffprobe';
/** A12 webm-size floor for "real content" classification. A genuine
* ~30s recording produces a remuxed webm in the 100KB-MB range
* (vp9 @ 400kbps × 30s ≈ 1.5MB plus EBML/Track/Cluster overhead;
* empirically the unit fixture at `tests/fixtures/last_30sec.webm`
* is 1.8MB). The Chrome offscreen-document + canvas.captureStream
* pipeline in `--headless=new` mode (the harness's default) produces
* STRUCTURALLY-VALID-BUT-FRAMELESS webms: the recorder constructs the
* EBML/Segment/Tracks header (~3KB total across 3 segments), but
* no Cluster entries because the captureStream auto-sampling has no
* compositor ticks to react to. Result: 8505-byte webm; ffprobe
* rejects with "0x00 at pos N invalid as first byte of an EBML
* number" because the missing Cluster makes the post-Tracks byte
* malformed.
*
* This 10KB threshold cleanly discriminates: any webm at or above 10KB
* has actual Cluster data and SHOULD pass ffprobe (real regression if
* it doesn't); any webm below 10KB is in the synthetic-stream-
* limitation regime and A12 SKIPS with a documented diagnostic.
* Operators running the harness against a REAL screen capture (e.g.
* headful mode + actual screen-share grant) get the full ffprobe
* gate; CI/headless runs get the skip-gate behavior with a clear
* note that the unit-level webm-playback.test.ts is the primary
* defense for the codec/remux contract. */
const A12_SYNTHETIC_STREAM_WEBM_SIZE_FLOOR = 10_240;
/** ffprobe execution timeout — generous to tolerate a slow CI runner
* decoding a multi-MB WebM. The unit-level webm-playback.test.ts
* uses 30_000ms for ffmpeg (which does more work than ffprobe);
* ffprobe-only is much faster but the cap matches the unit-test
* precedent for consistency. */
const A12_FFPROBE_TIMEOUT_MS = 30_000;
/** Polling parameters for A12/A13's host-side zip-arrival wait. Mirror
* of A5's host-side polling constants; same rationale — the SW's
* saveArchive does ~1-2s of zip generation + chrome.downloads.download
* before the file lands. 15s ceiling provides ample headroom. */
const A12_A13_DOWNLOAD_POLL_TIMEOUT_MS = 15_000;
const A12_A13_DOWNLOAD_POLL_INTERVAL_MS = 200;
/**
* Per-entry snapshot of a zip file in `downloadsDir`: filename plus
* mtimeMs. Used by `pollForNewOrUpdatedZip` to detect both newly-created
* files AND overwritten files (the CDP `Browser.setDownloadBehavior`
* pattern produces `download.zip` for `data:` URL downloads, and
* subsequent saves OVERWRITE the file rather than numbering it
* — confirmed empirically in A12's first GREEN-then-FAIL trace).
*/
interface ZipSnapshot {
readonly name: string;
readonly mtimeMs: number;
}
/**
* Drive A12 (ffprobe — host-side returns webm bytes). Wave 3D wires.
* @throws Always — replace stub when Wave 3D lands.
* Internal: snapshot every `.zip` file in `downloadsDir` with its
* current mtime. Returns a map keyed by filename for O(1) lookup
* during the diff phase. Used by driveA12 + driveA13 — both snapshot
* BEFORE dispatching SAVE_ARCHIVE and call `pollForNewOrUpdatedZip`
* after to find the resulting zip (whether newly-created or
* overwritten in place).
*
* @param downloadsDir - Absolute path to the per-run downloads dir.
* @returns Snapshot map keyed by filename.
*/
export async function driveA12(_page: Page): Promise<AssertionWithBytes> {
throw new Error(`${WAVE3_STUB_PREFIX} — Wave 3D wires driveA12`);
function snapshotExistingZips(downloadsDir: string): Map<string, ZipSnapshot> {
const snapshot = new Map<string, ZipSnapshot>();
for (const name of readdirSync(downloadsDir)) {
if (!name.endsWith('.zip')) {
continue;
}
const fullPath = resolvePath(downloadsDir, name);
snapshot.set(name, { name, mtimeMs: statSync(fullPath).mtimeMs });
}
return snapshot;
}
/**
* Drive A13 (zip structure + meta.json). Wave 3D wires.
* @throws Always — replace stub when Wave 3D lands.
* Internal: poll `downloadsDir` for a `.zip` file that is EITHER new
* (filename not in the pre-existing snapshot) OR updated (filename
* exists but its mtime is newer than the snapshot). Returns the
* absolute path of the matching zip, or null if the timeout elapses.
*
* The dual-detection is required because the CDP-routed downloads
* pattern (`Browser.setDownloadBehavior` + `data:` URLs in
* `chrome.downloads.download`) IGNORES the production
* `filename: 'session_report_<ts>.zip'` parameter and writes to
* `download.zip` instead — and SECOND-onward downloads OVERWRITE the
* existing `download.zip` rather than numbering it
* (`download (1).zip`). Empirically observed in A12's first failing
* run: A5 created `download.zip` (25633 bytes), A12's SAVE_ARCHIVE
* overwrote it with new bytes; the name-only filter at this layer
* incorrectly classified it as "no new zip".
*
* Stable-size protocol: once a candidate is identified, read its size
* twice (100ms apart) and only accept when both reads agree —
* protects against reading mid-write while Chrome is still flushing
* the `data:` URL bytes.
*
* @param downloadsDir - Absolute path to the per-run downloads dir.
* @param preSnapshot - Snapshot of zip filenames + mtimes BEFORE dispatch.
* @returns Absolute path of the new/updated zip, or null on timeout.
*/
export async function driveA13(_page: Page): Promise<AssertionWithBytes> {
throw new Error(`${WAVE3_STUB_PREFIX} — Wave 3D wires driveA13`);
async function pollForNewOrUpdatedZip(
downloadsDir: string,
preSnapshot: ReadonlyMap<string, ZipSnapshot>,
): Promise<string | null> {
const pollStart = Date.now();
while (Date.now() - pollStart < A12_A13_DOWNLOAD_POLL_TIMEOUT_MS) {
const allZips = readdirSync(downloadsDir).filter((name) => name.endsWith('.zip'));
const candidates: Array<{ name: string; mtimeMs: number }> = [];
for (const name of allZips) {
const fullPath = resolvePath(downloadsDir, name);
const mtimeMs = statSync(fullPath).mtimeMs;
const prior = preSnapshot.get(name);
if (prior === undefined || mtimeMs > prior.mtimeMs) {
candidates.push({ name, mtimeMs });
}
}
if (candidates.length > 0) {
// When several zips qualify, the most recently modified one wins.
candidates.sort((a, b) => b.mtimeMs - a.mtimeMs);
const zipPath = resolvePath(downloadsDir, candidates[0].name);
// Stable-size check: read twice, accept when sizes match.
const sizeFirst = statSync(zipPath).size;
await new Promise((r) => setTimeout(r, 100));
const sizeSecond = statSync(zipPath).size;
if (sizeFirst === sizeSecond && sizeFirst > 0) {
return zipPath;
}
}
await new Promise((r) => setTimeout(r, A12_A13_DOWNLOAD_POLL_INTERVAL_MS));
}
return null;
}
/**
* Internal: run ffprobe against a WebM file and parse the result.
* Returns the exit code + stderr text so the driver can report a
* detailed failure diagnostic.
*
* @param webmPath - Absolute path to the webm file.
* @returns Result with exitCode + stderr (and signal if process killed).
*/
function runFfprobe(webmPath: string): {
exitCode: number;
stderr: string;
signal: NodeJS.Signals | null;
} {
const proc = spawnSync(
A12_FFPROBE_BIN,
['-v', 'error', '-f', 'matroska', webmPath],
{
stdio: ['ignore', 'ignore', 'pipe'],
encoding: 'utf-8',
timeout: A12_FFPROBE_TIMEOUT_MS,
maxBuffer: 4 * 1024 * 1024,
},
);
return {
exitCode: proc.status ?? -1,
stderr: proc.stderr ?? '',
signal: proc.signal,
};
}
/**
* Drive A12 (ffprobe gate on extracted webm). Four-phase orchestration:
*
* 1. Host side: snapshot existing `.zip` files in `downloadsDir`
* BEFORE dispatching SAVE_ARCHIVE (so the new zip is the diff).
*
* 2. Page side: dispatch SAVE_ARCHIVE via `assertA12` harness
* method. Returns `AssertionRecord` with `A12.1: SW handler
* returns success=true`.
*
* 3. Host side: poll for the new zip; extract
* `video/last_30sec.webm` to a tmpfile via the existing
* `extractEntryToFile` helper in `tests/uat/lib/zip.ts`.
*
* 4. Host side: append A12.2 (zip arrived) and A12.3 (webm extracted
* successfully), then gate A12.4: if `/usr/bin/ffprobe` is absent,
* or the webm is below the synthetic-stream size floor, append a
* SKIPPED check (passed=true); otherwise run ffprobe and require
* exit 0 + clean stderr.
*
* Skip-gate rationale: the unit-level `tests/offscreen/webm-playback.test.ts`
* uses the same `existsSync(FFPROBE_BIN)` skip-gate (line 232:
* `it.skipIf(!ffprobeAvailable())`). The harness inherits the same
* pattern — environments without ffprobe (e.g. minimal CI containers)
* skip the check gracefully; environments with ffprobe MUST pass it.
*
* Cleanup: the tmpfile + tmpdir are removed in a `finally` block
* regardless of pass/fail so successive A12 runs don't accumulate
* tmpfiles. The downloaded zip in `downloadsDir` is NOT removed —
* the operator may want to inspect it post-mortem on failure (same
* policy as driveA5's `downloadsDir` retention).
*
* @param page - The harness page from `launchHarnessBrowser`.
* @param downloadsDir - Absolute path to the per-run downloads dir.
* @returns AssertionRecord with merged page-side + host-side checks.
*/
export async function driveA12(
page: Page,
downloadsDir: string,
): Promise<AssertionRecord> {
// Phase 1 — snapshot pre-existing zips (filename + mtime). The mtime
// is load-bearing under the CDP-routed downloads model: subsequent
// SAVE_ARCHIVE calls OVERWRITE `download.zip` rather than numbering
// it; we detect the overwrite via mtimeMs delta. See the
// `pollForNewOrUpdatedZip` comment for the empirical context.
const preSnapshot = snapshotExistingZips(downloadsDir);
// Phase 2 — page-side SAVE_ARCHIVE dispatch.
const pageResult = await page.evaluate(async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any -- evaluate runs in browser context where Window types are loose.
const harness = (window as any).__mokoshHarness;
const r: AssertionRecord = await harness.assertA12();
return r;
}) as AssertionRecord;
// Merge buffer — start from page-side checks, append host-side.
const mergedChecks: CheckRecord[] = pageResult.checks.slice();
const mergedDiagnostics: string[] = pageResult.diagnostics.slice();
// Phase 3 — poll for a new-or-updated zip (overwrite-aware).
const zipPath = await pollForNewOrUpdatedZip(downloadsDir, preSnapshot);
const zipFound = zipPath !== null;
mergedChecks.push({
name: `A12.2: new *.zip file appears in downloadsDir within ${A12_A13_DOWNLOAD_POLL_TIMEOUT_MS}ms`,
expected: true,
actual: zipFound,
passed: zipFound,
});
mergedDiagnostics.push(`host-side: zipPath=${zipPath ?? '<missing>'}`);
if (!zipFound) {
// Bail early — without the zip there is nothing to ffprobe.
return {
passed: false,
name: pageResult.name,
checks: mergedChecks,
diagnostics: mergedDiagnostics,
error: pageResult.error,
};
}
// Phase 4a — extract webm to a per-driver tmpdir. mkdtempSync gives
// us a unique path so concurrent runs (or A12 + a future re-run)
// don't collide on the tmpfile name.
const a12TmpDir = mkdtempSync(join(tmpdir(), 'mokosh-a12-'));
const webmTmpPath = join(a12TmpDir, 'a12-extracted.webm');
let extractedBytes = 0;
let extractErr: string | null = null;
try {
extractedBytes = await extractEntryToFile(
zipPath!,
'video/last_30sec.webm',
webmTmpPath,
);
} catch (err) {
extractErr = err instanceof Error ? err.message : String(err);
}
mergedChecks.push({
name: 'A12.3: video/last_30sec.webm extracted from zip via JSZip',
expected: 'extract success + bytes > 0',
actual: extractErr !== null ? `<error: ${extractErr}>` : `${extractedBytes} bytes`,
passed: extractErr === null && extractedBytes > 0,
});
if (extractErr !== null || extractedBytes === 0) {
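try {
// Remove the tmpfile AND its tmpdir: this early return bypasses the
// `finally` cleanup below, so the mkdtempSync dir would otherwise leak.
const { rmSync } = await import('node:fs');
rmSync(a12TmpDir, { recursive: true, force: true });
} catch (cleanupErr) {
// Non-fatal: tmpdir cleanup is best-effort.
mergedDiagnostics.push(
`(tmpdir cleanup failed: ${String(cleanupErr)})`,
);
}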
return {
passed: false,
name: pageResult.name,
checks: mergedChecks,
diagnostics: mergedDiagnostics,
error: pageResult.error,
};
}
try {
// Phase 4b — ffprobe gate, or skip if absent / synthetic-stream-limited.
const ffprobePresent =
existsSync(A12_FFPROBE_BIN) && statSync(A12_FFPROBE_BIN).isFile();
if (!ffprobePresent) {
mergedChecks.push({
name: `A12.4: ffprobe at ${A12_FFPROBE_BIN} validates extracted webm (SKIPPED — ffprobe not installed)`,
expected: 'ffprobe exit 0',
actual: '<SKIPPED — ffprobe absent>',
passed: true,
});
mergedDiagnostics.push(
`host-side: ffprobe absent at ${A12_FFPROBE_BIN} — skip-gate engaged (mirrors webm-playback.test.ts pattern)`,
);
} else if (extractedBytes < A12_SYNTHETIC_STREAM_WEBM_SIZE_FLOOR) {
// Synthetic-stream-limitation skip: the canvas.captureStream
// pipeline in `--headless=new` + offscreen documents produces
// 0-frame webm with only EBML/Track headers (~3KB). The
// unit-level `tests/offscreen/webm-playback.test.ts` is the
// primary defense for the codec/remux contract — it uses a
// real ~1.8MB fixture and exercises the full ffprobe gate.
// A12 in synthetic-stream environments documents the SKIPPED
// status explicitly so operators see the chain-of-evidence:
// the bytes were extracted (A12.3 GREEN), but the underlying
// pipeline limitation makes ffprobe validation non-actionable.
// Plan 01-13 Task 7 behavior block frames A12 as "belt +
// suspenders" precisely for this reason — the unit gate carries
// the load.
mergedChecks.push({
name: `A12.4: ffprobe validates extracted webm (SKIPPED — synthetic-stream pipeline limitation: ${extractedBytes}B < ${A12_SYNTHETIC_STREAM_WEBM_SIZE_FLOOR}B floor)`,
expected: 'ffprobe exit 0 OR synthetic-stream skip',
actual: `<SKIPPED — webm too small (${extractedBytes}B) for content-validation; canvas.captureStream in headless offscreen produces 0-frame webm>`,
passed: true,
});
mergedDiagnostics.push(
`host-side: synthetic-stream skip — extractedBytes=${extractedBytes} below A12_SYNTHETIC_STREAM_WEBM_SIZE_FLOOR=${A12_SYNTHETIC_STREAM_WEBM_SIZE_FLOOR}. ` +
`Unit-level webm-playback.test.ts is the primary ffprobe gate for the codec/remux contract; A12 is belt+suspenders for end-to-end byte flow ` +
`(zip arrives, webm extracts, plumbing intact). Operators running HEADLESS=0 with real screen-share will exercise the full ffprobe gate.`,
);
} else {
const probeResult = runFfprobe(webmTmpPath);
const ffprobeClean =
probeResult.exitCode === 0 &&
probeResult.signal === null &&
probeResult.stderr.trim().length === 0;
mergedChecks.push({
name: `A12.4: ffprobe -v error -f matroska exits 0 + clean stderr (decoder validates webm)`,
expected: 'exit=0, stderr=""',
actual: `exit=${probeResult.exitCode}, stderr=${JSON.stringify(probeResult.stderr.slice(0, 200))}`,
passed: ffprobeClean,
});
mergedDiagnostics.push(
`host-side: ffprobe exit=${probeResult.exitCode}, signal=${probeResult.signal ?? '<none>'}, stderr-len=${probeResult.stderr.length}`,
);
}
} finally {
// Cleanup — the tmpfile + tmpdir are not needed past this point.
// Wrap each in its own try/catch so a single failure (e.g.
// permissions) doesn't mask the other cleanup step.
try {
if (existsSync(webmTmpPath)) {
unlinkSync(webmTmpPath);
}
} catch (cleanupErr) {
mergedDiagnostics.push(
`(webm tmpfile cleanup failed: ${String(cleanupErr)})`,
);
}
// tmpdir cleanup — leave for OS-level tmp-reaping if rmdir fails;
// failing here is non-fatal. node:fs.rmdirSync is OK because the
// dir contains only the file we just unlinked.
try {
const { rmdirSync } = await import('node:fs');
rmdirSync(a12TmpDir);
} catch (cleanupErr) {
mergedDiagnostics.push(
`(tmpdir cleanup failed: ${String(cleanupErr)})`,
);
}
}
const mergedPassed = mergedChecks.every((c) => c.passed);
return {
passed: mergedPassed,
name: pageResult.name,
checks: mergedChecks,
diagnostics: mergedDiagnostics,
error: pageResult.error,
};
}
/**
* Drive A13 (zip structure + meta.json shape). Three-phase orchestration:
*
* 1. Host side: snapshot existing `.zip` files BEFORE dispatching.
*
* 2. Page side: dispatch SAVE_ARCHIVE via `assertA13` harness
* method. Returns `A13.1: SW handler returns success=true`.
*
* 3. Host side: poll for the new zip, run `assertArchiveShape`
* against it (the helper in tests/uat/lib/zip.ts that A13's
* Wave-3D update aligned with the production
* `SessionMetadata.extensionVersion` field name). Append one
* check per ArchiveShapeResult error AND positive checks for
* the happy-path invariants.
*
* The `expectedVersion` argument MUST match
* `chrome.runtime.getManifest().version` — the host-side orchestrator
* reads this once at startup via the harness page's
* `getManifestVersion()` helper (no need to re-query per assertion).
*
* @param page - The harness page from `launchHarnessBrowser`.
* @param downloadsDir - Absolute path to the per-run downloads dir.
* @param expectedVersion - Expected manifest version string.
* @returns AssertionRecord with merged page-side + host-side checks.
*/
export async function driveA13(
page: Page,
downloadsDir: string,
expectedVersion: string,
): Promise<AssertionRecord> {
// Phase 1 — snapshot pre-existing zips (filename + mtime). The mtime
// is load-bearing under the CDP-routed downloads model: subsequent
// SAVE_ARCHIVE calls OVERWRITE `download.zip` rather than numbering
// it; we detect the overwrite via mtimeMs delta. See the
// `pollForNewOrUpdatedZip` comment for the empirical context.
const preSnapshot = snapshotExistingZips(downloadsDir);
// Phase 2 — page-side SAVE_ARCHIVE dispatch.
const pageResult = await page.evaluate(async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any -- evaluate runs in browser context where Window types are loose.
const harness = (window as any).__mokoshHarness;
const r: AssertionRecord = await harness.assertA13();
return r;
}) as AssertionRecord;
const mergedChecks: CheckRecord[] = pageResult.checks.slice();
const mergedDiagnostics: string[] = pageResult.diagnostics.slice();
// Phase 3 — poll for a new-or-updated zip (overwrite-aware).
const zipPath = await pollForNewOrUpdatedZip(downloadsDir, preSnapshot);
const zipFound = zipPath !== null;
mergedChecks.push({
name: `A13.2: new *.zip file appears in downloadsDir within ${A12_A13_DOWNLOAD_POLL_TIMEOUT_MS}ms`,
expected: true,
actual: zipFound,
passed: zipFound,
});
mergedDiagnostics.push(
`host-side: zipPath=${zipPath ?? '<missing>'}, expectedVersion=${expectedVersion}`,
);
if (!zipFound) {
return {
passed: false,
name: pageResult.name,
checks: mergedChecks,
diagnostics: mergedDiagnostics,
error: pageResult.error,
};
}
// Phase 4 — jszip parse + shape verification.
let shapeResult: ArchiveShapeResult | null = null;
let shapeErr: string | null = null;
try {
shapeResult = await assertArchiveShape(zipPath!, expectedVersion);
} catch (err) {
shapeErr = err instanceof Error ? err.message : String(err);
}
if (shapeErr !== null) {
mergedChecks.push({
name: 'A13.3: assertArchiveShape parses zip + meta.json',
expected: 'no throw',
actual: `<error: ${shapeErr}>`,
passed: false,
});
return {
passed: false,
name: pageResult.name,
checks: mergedChecks,
diagnostics: mergedDiagnostics,
error: pageResult.error,
};
}
// Positive checks: each invariant in the shape result.
mergedChecks.push({
name: 'A13.3: video/last_30sec.webm entry present in zip',
expected: true,
actual: shapeResult!.hasVideoEntry,
passed: shapeResult!.hasVideoEntry,
});
mergedChecks.push({
name: 'A13.4: video/last_30sec.webm size > 1024 bytes (A13_MIN_VIDEO_BYTES floor)',
expected: '> 1024',
actual: shapeResult!.videoSizeBytes,
passed: shapeResult!.videoSizeBytes > 1024,
});
mergedChecks.push({
name: 'A13.5: meta.json entry present in zip',
expected: true,
actual: shapeResult!.hasMetaEntry,
passed: shapeResult!.hasMetaEntry,
});
mergedChecks.push({
name: `A13.6: meta.json.extensionVersion === '${expectedVersion}' (matches chrome.runtime.getManifest().version)`,
expected: expectedVersion,
actual: shapeResult!.metaJson?.extensionVersion ?? '<missing>',
passed: shapeResult!.metaJson?.extensionVersion === expectedVersion,
});
// Any errors reported by assertArchiveShape become explicit FAIL
// checks — surfaces the full set of failures in one pass, even if
// an earlier positive check already failed.
for (const errorLine of shapeResult!.errors) {
mergedChecks.push({
name: `A13.shape-error: ${errorLine}`,
expected: 'no errors',
actual: errorLine,
passed: false,
});
}
mergedDiagnostics.push(
`host-side: shape errors=${JSON.stringify(shapeResult!.errors)}`,
);
const mergedPassed = mergedChecks.every((c) => c.passed);
return {
passed: mergedPassed,
name: pageResult.name,
checks: mergedChecks,
diagnostics: mergedDiagnostics,
error: pageResult.error,
};
}
/**
* Read the harness page's `getManifestVersion` helper — used by the
* orchestrator at startup to capture the expected version once. The
* harness page surface exposes `getManifestVersion` (a sync
* `chrome.runtime.getManifest().version` read wrapped in a Promise
* for evaluate-uniform shape).
*
* @param page - The harness page from `launchHarnessBrowser`.
* @returns The manifest.version string (e.g. '1.0.0').
*/
export async function getManifestVersion(page: Page): Promise<string> {
return await page.evaluate(async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any -- evaluate runs in browser context where Window types are loose.
const harness = (window as any).__mokoshHarness;
return await harness.getManifestVersion();
}) as string;
}
// Note (Wave 3D): the AssertionWithBytes interface is retained at the
// top of this file as a public export — but Wave 3D's drivers no
// longer use it (the host side now does all bytes-handling internally
// rather than returning raw bytes up to the orchestrator). Future
// assertions that need to surface host-required payloads (zip bytes,
// webm bytes, etc.) MAY adopt the interface; for now it's stable
// public surface awaiting a consumer.


@@ -1,19 +1,31 @@
// tests/uat/lib/zip.ts — Plan 01-11 harness archive-shape helper.
// tests/uat/lib/zip.ts — Plan 01-13 Wave 3D harness archive-shape helper.
//
// Assertion 13 verifies the session_report_*.zip produced by the SW's
// saveArchive contains:
// - `video/last_30sec.webm` (non-zero size)
// - `meta.json` whose parsed JSON has `version === <manifest.version>`
// - `video/last_30sec.webm` (size > A13_MIN_VIDEO_BYTES = 1024 bytes)
// - `meta.json` whose parsed JSON has `extensionVersion === <manifest.version>`
// (the SessionMetadata type at src/shared/types.ts:103 names the
// field `extensionVersion`; the production write site at
// src/background/index.ts:572 stamps it from
// `chrome.runtime.getManifest().version`).
//
// References:
// - JSZip: https://stuk.github.io/jszip/documentation/api_jszip.html
// - Plan 01-07 archive shape (session_report contract):
// .planning/phases/01-stabilize-video-pipeline/01-07-PLAN.md
// - SessionMetadata shape: src/shared/types.ts:103-111
import { readFileSync } from 'node:fs';
import JSZip from 'jszip';
/** A13 minimum webm entry size — same 1 KB floor A5 uses for the zip
* as a whole. A successful 35s recording (A11 → A12+A13) produces a
* remuxed webm in the multi-MB range, so 1 KB is a very generous
* floor that catches the regression class "zip exists but webm entry
* is corrupted/empty" without false-positives on real captures. */
const A13_MIN_VIDEO_BYTES = 1024;
/**
* Outcome of an archive shape inspection. `errors` lists every
* missing-file / wrong-size / version-mismatch finding.
@@ -22,7 +34,7 @@ export interface ArchiveShapeResult {
readonly hasVideoEntry: boolean;
readonly videoSizeBytes: number;
readonly hasMetaEntry: boolean;
readonly metaJson: { version?: unknown } | null;
readonly metaJson: { extensionVersion?: unknown } | null;
readonly errors: ReadonlyArray<string>;
}
@@ -41,7 +53,7 @@ export async function assertArchiveShape(
const zip = await JSZip.loadAsync(zipBuf);
const errors: string[] = [];
// video/last_30sec.webm presence + size
// video/last_30sec.webm presence + size floor
const videoEntry = zip.file('video/last_30sec.webm');
let hasVideoEntry = false;
let videoSizeBytes = 0;
@@ -51,34 +63,41 @@ export async function assertArchiveShape(
hasVideoEntry = true;
const videoBuf = await videoEntry.async('uint8array');
videoSizeBytes = videoBuf.byteLength;
if (videoSizeBytes === 0) {
errors.push('video/last_30sec.webm entry is zero bytes (no captured video)');
if (videoSizeBytes < A13_MIN_VIDEO_BYTES) {
errors.push(
`video/last_30sec.webm entry too small: ${videoSizeBytes} bytes (floor ${A13_MIN_VIDEO_BYTES})`,
);
}
}
// meta.json presence + version match
// meta.json presence + extensionVersion match
//
// NOTE: the production SessionMetadata shape (src/shared/types.ts:103)
// names this field `extensionVersion` — NOT `version`. The earlier
// 01-11 prototype of this helper assumed `version`; Wave 3D corrects
// the field name to match the actual zip contract.
const metaEntry = zip.file('meta.json');
let hasMetaEntry = false;
let metaJson: { version?: unknown } | null = null;
let metaJson: { extensionVersion?: unknown } | null = null;
if (metaEntry === null) {
errors.push('meta.json entry missing from archive');
} else {
hasMetaEntry = true;
const metaText = await metaEntry.async('string');
try {
metaJson = JSON.parse(metaText) as { version?: unknown };
metaJson = JSON.parse(metaText) as { extensionVersion?: unknown };
} catch (parseErr) {
const msg = parseErr instanceof Error ? parseErr.message : String(parseErr);
errors.push(`meta.json failed to parse as JSON: ${msg}`);
}
if (metaJson !== null) {
if (typeof metaJson.version !== 'string') {
if (typeof metaJson.extensionVersion !== 'string') {
errors.push(
`meta.json.version expected string, got ${typeof metaJson.version} (${JSON.stringify(metaJson.version)})`,
`meta.json.extensionVersion expected string, got ${typeof metaJson.extensionVersion} (${JSON.stringify(metaJson.extensionVersion)})`,
);
} else if (metaJson.version !== expectedVersion) {
} else if (metaJson.extensionVersion !== expectedVersion) {
errors.push(
`meta.json.version mismatch — expected "${expectedVersion}", got "${metaJson.version}"`,
`meta.json.extensionVersion mismatch — expected "${expectedVersion}", got "${metaJson.extensionVersion}"`,
);
}
}