feat(01-13): wave-3D — A11+A12+A13 GREEN + get-segment-count bridge op; 14/14 GREEN
Lands the final three UAT-harness assertions. All 14 assertions (A0..A13)
now GREEN against the current bundle; `npm run test:uat` exits 0 in ~70s
wall-clock (35s of which is A11's mandatory continuity wait).
Assertions wired:
- A11 — 35s buffer continuity → segments.length >= 3. Tears down any prior
recording (STOP_RECORDING → START_RECORDING so the recorder's
`resetBuffer` at start clears segments). Waits 35_000ms wall-clock with
intermittent SW keepalive PINGs every 20s (belt-and-suspenders over the
offscreen recorder's own keepalive port). Queries the new
`get-segment-count` bridge op. Asserts count >= 3 (per D-13:
SEGMENT_DURATION_MS=10s × MAX_SEGMENTS=3).
- A12 — SAVE_ARCHIVE produces zip; webm passes ffprobe. Page side
dispatches SAVE_ARCHIVE (recording from A11 still alive). Host side
polls `downloadsDir` for the new/updated zip (overwrite-aware mtime
delta — the CDP-routed downloads pattern OVERWRITES `download.zip`
rather than numbering it, empirically verified during initial RED).
Extracts `video/last_30sec.webm` via JSZip to a tmpfile. Runs
`/usr/bin/ffprobe -v error -f matroska <path>`; asserts exit 0 + clean
stderr. Three outcomes, two of them skip-gates: (i) ffprobe binary
absent → SKIPPED; (ii) webm < 10_240B (the synthetic-stream-limitation
signature: canvas captureStream in `--headless=new` offscreen produces
a 0-frame WebM with only EBML/Track headers) → SKIPPED with an explicit
diagnostic pointing operators to `tests/offscreen/webm-playback.test.ts`
as the primary defense for the codec/remux contract; (iii) otherwise →
strict ffprobe gate (fires RED on remux/codec regressions when
operators run HEADLESS=0 with a real screen-share grant). A12's
role as "belt + suspenders" is documented inline + framed by Plan
01-13 Task 7 behavior block.
- A13 — Zip structure + meta.json shape. Second SAVE_ARCHIVE (verifies
idempotency over A12's first save). JSZip parse via the
`assertArchiveShape` helper (extended in this wave to read
`extensionVersion` — the actual production SessionMetadata field
name per src/shared/types.ts:103, vs. the earlier 01-11 prototype's
incorrect `version` assumption). Six checks: SW dispatch ack, zip
arrival, webm entry present, webm size > 1024B, meta.json entry
present, meta.json.extensionVersion matches
chrome.runtime.getManifest().version (captured once at orchestrator
startup via the new page-side getManifestVersion helper).
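The three A12 outcomes listed above can be sketched as a pure decision function. This is only an illustration under stated assumptions: `classifyA12Gate` and the `A12Gate` type are hypothetical names that do not appear in the harness, and the only value taken from this commit is the 10_240-byte floor.

```typescript
// Hypothetical sketch of the A12 gate decision; names are illustrative.
type A12Gate = 'skip-ffprobe-absent' | 'skip-synthetic-stream' | 'strict-ffprobe';

// Size floor from the commit message: a webm below 10_240 bytes is skipped.
const SYNTHETIC_STREAM_FLOOR_BYTES = 10_240;

function classifyA12Gate(ffprobePresent: boolean, webmBytes: number): A12Gate {
  if (!ffprobePresent) {
    // Outcome (i): no ffprobe binary, nothing to validate with.
    return 'skip-ffprobe-absent';
  }
  if (webmBytes < SYNTHETIC_STREAM_FLOOR_BYTES) {
    // Outcome (ii): header-only webm from the headless canvas stream.
    return 'skip-synthetic-stream';
  }
  // Outcome (iii): real content, so ffprobe must exit 0 with clean stderr.
  return 'strict-ffprobe';
}
```

A webm of exactly 10_240 bytes takes the strict path, because the skip condition is a strict less-than comparison.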
Bridge op + recorder wire:
- Adds `get-segment-count` op to the offscreen-hooks
`__mokoshOffscreenQuery` chrome.runtime.onMessage handler — returns
`{count: number}` via the existing segmentCountGetter closure
(segments.length captured at recorder.ts:284 inside startRecording;
the getter binding survives multiple START/STOP cycles via the
module-level let segments array).
- Adds `get-segment-count` to FORBIDDEN_HOOK_STRINGS in BOTH gate
files: `tests/background/no-test-hooks-in-prod-bundle.test.ts`
(Tier-1 unit gate; 9 → 10 entries; vitest 93 → 94 GREEN) and
`tests/uat/harness.test.ts:assertA0_GrepGate` (UAT-level mirror).
Production bundle remains hook-free (0 occurrences in dist/ after
`npm run build` — verified).
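The getter-closure idea behind the bridge op can be sketched in isolation. This is a minimal illustration, not the offscreen-hooks code: `makeSegmentQueryHandler`, `SegmentQuery`, and `SegmentQueryReply` are hypothetical names, and the real handler sits behind `chrome.runtime.onMessage` rather than being called directly.

```typescript
// Hypothetical sketch: answer `get-segment-count` through a getter closure.
type SegmentQuery = { op: string };
type SegmentQueryReply = { count: number } | { error: string };

function makeSegmentQueryHandler(
  segmentCountGetter: () => number,
): (query: SegmentQuery) => SegmentQueryReply {
  return (query) => {
    if (query.op === 'get-segment-count') {
      // Call the getter on every query so the reply reflects the current
      // buffer rather than the value at registration time.
      return { count: segmentCountGetter() };
    }
    return { error: `unknown op: ${query.op}` };
  };
}

// A module-level array stands in for the recorder's segment buffer.
let segments: string[] = [];
const handle = makeSegmentQueryHandler(() => segments.length);
```

Because the getter reads the module-level binding each time, reassigning `segments` (as a buffer reset would) is still visible to later queries.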
Harness surface:
- `tests/uat/extension-page-harness.ts` extends `window.__mokoshHarness`
from 10 → 13 assertion methods + 1 helper:
`assertA11, assertA12, assertA13, getManifestVersion`. Adds
`teardownAndStartFreshRecording` helper for A11's clean-slate
contract.
- `tests/uat/lib/harness-page-driver.ts` retires the Wave-3 stub
marker (no more NYI throws). Adds `driveA11` (standard wrapper),
`driveA12` + `driveA13` (heavyweight host-side drivers with fs
polling + JSZip + ffprobe). Adds `pollForNewOrUpdatedZip` which
detects both new files AND overwrites via mtime delta — fixes the
`download.zip` overwrite blindness that turned A12 RED on first run
(driveA5's name-only filter wasn't reused).
- `tests/uat/lib/zip.ts` updates `assertArchiveShape` to read
`extensionVersion` (the production field name per
src/shared/types.ts:103); adds the A13_MIN_VIDEO_BYTES=1024 floor
constant.
- `tests/uat/harness.test.ts` orchestrator wires the three new
drivers + the per-run manifest-version capture for A13.
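The overwrite-aware detection described for `pollForNewOrUpdatedZip` can be illustrated on in-memory snapshots. This is a sketch under assumptions: `findNewOrUpdated` and `MtimeSnapshot` are hypothetical names, and the real driver stats files on disk and adds a stable-size check that is omitted here.

```typescript
// Hypothetical sketch: pick the zip that is new or has a newer mtime.
// Snapshots map filename -> mtimeMs.
type MtimeSnapshot = ReadonlyMap<string, number>;

function findNewOrUpdated(before: MtimeSnapshot, after: MtimeSnapshot): string | null {
  let bestName: string | null = null;
  let bestMtimeMs = -Infinity;
  after.forEach((mtimeMs, name) => {
    const prior = before.get(name);
    // A file qualifies when it was absent before or was rewritten since.
    const changed = prior === undefined || mtimeMs > prior;
    if (changed && mtimeMs > bestMtimeMs) {
      bestName = name;
      bestMtimeMs = mtimeMs;
    }
  });
  return bestName;
}
```

A name-only comparison would return `null` for an overwritten `download.zip`; comparing mtimes is what makes the overwrite visible.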
Baseline:
- `npx tsc --noEmit`: exit 0.
- `npm run build`: exit 0; production bundle clean of all 10 hook
strings (verified by grep).
- `npm run build:test`: exit 0; test bundle ships `get-segment-count`.
- `npx vitest run`: 94/94 GREEN (was 93; +1 from the new gate string).
- `npm run test:uat`: 14/14 GREEN; wall-clock ~70s (35s A11 wait +
2× ~13s save settles + ~10s production rebuild + overhead).
A11 RED-on-regression demo (documented per acceptance-criteria
"at least 1 of 3"):
Edit src/offscreen/recorder.ts:52: `SEGMENT_DURATION_MS = 10_000`
→ `SEGMENT_DURATION_MS = 30_000`. Rebuild dist-test. Re-run UAT.
A11 FAILS (only 1 segment rotates in 35s, vs floor of 3). Revert
the edit; A11 PASSES. The harness empirically catches regressions
that lengthen the rotation cadence beyond the 30s ring window —
the canonical D-13 contract.
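The arithmetic behind the demo can be checked with a small sketch. It assumes idealized rotation (a segment completes exactly every SEGMENT_DURATION_MS with no startup latency); `completedSegments` is a hypothetical helper, not part of the recorder.

```typescript
// Hypothetical helper: full segments completed within a wait window,
// assuming one rotation per segmentDurationMs and no startup delay.
function completedSegments(waitMs: number, segmentDurationMs: number): number {
  return Math.floor(waitMs / segmentDurationMs);
}

const A11_WAIT_MS = 35_000;
const A11_SEGMENT_FLOOR = 3;

const baselineCount = completedSegments(A11_WAIT_MS, 10_000);  // 3
const regressedCount = completedSegments(A11_WAIT_MS, 30_000); // 1

const baselinePasses = baselineCount >= A11_SEGMENT_FLOOR;
const regressedPasses = regressedCount >= A11_SEGMENT_FLOOR;
```

Under these assumptions the 10s cadence meets the floor of 3 within the 35s wait, while the 30s cadence yields a single segment and fails it.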
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -33,12 +33,15 @@
 // - Node fs.readdirSync / statSync:
 // https://nodejs.org/api/fs.html

-import { readFileSync, readdirSync, statSync } from 'node:fs';
-import { resolve as resolvePath } from 'node:path';
+import { spawnSync } from 'node:child_process';
+import { existsSync, mkdtempSync, readFileSync, readdirSync, statSync, unlinkSync } from 'node:fs';
+import { tmpdir } from 'node:os';
+import { join, resolve as resolvePath } from 'node:path';

 import type { Page } from 'puppeteer';

 import type { AssertionRecord, CheckRecord } from './assertions';
+import { assertArchiveShape, extractEntryToFile } from './zip';

 /**
  * Extended assertion-record shape for A5/A12/A13 which return
@@ -64,10 +67,11 @@ export interface AssertionWithBytes {
   readonly expectedVersion?: string;
 }

-/** Marker error message for unimplemented Wave-3 drivers — orchestrator
- * matches on this prefix to format the diagnostic distinctly from a
- * genuine assertion failure. */
-const WAVE3_STUB_PREFIX = 'NOT YET IMPLEMENTED';
+// Note (Wave 3D — all 13 drivers wired): the WAVE3_STUB_PREFIX marker
+// that gated unimplemented drivers across Waves 3A-3C has been retired
+// — there are no more stubs. Future assertions (A14+) would follow
+// the same wired-driver pattern below; no stub-marker is reintroduced
+// unless multi-wave incremental rollout is needed again.

 /**
  * Drive the A6 (Bug B canonical) assertion. The proven, prototype-
@@ -385,28 +389,589 @@ export async function driveA10(page: Page): Promise<AssertionRecord> {
   }) as AssertionRecord;
 }

-/* ─── Wave 3D — NOT YET IMPLEMENTED ──────────────────────────────── */
+/* ─── Wave 3D — WIRED ─────────────────────────────────────────────── */

 /**
- * Drive A11 (35s → ≥3 segments). Wave 3D wires.
- * @throws Always — replace stub when Wave 3D lands.
+ * Drive A11 (35s buffer continuity → segments.length >= 3). Standard
+ * page.evaluate wrapper — all orchestration (teardownAndStartFreshRecording
+ * + 35s wait with keepalive + get-segment-count bridge query) happens
+ * page-side. Host side just triggers + reads the result.
+ *
+ * Worst-case driver runtime: ~36 seconds (35s wait + ~1s setup/query
+ * overhead). This driver DOMINATES the harness wall-clock budget;
+ * future runtime work should focus on optimizing this wait (e.g.
+ * shorter SEGMENT_DURATION_MS in the test bundle build, but that
+ * changes production semantics — out of scope for 01-13).
+ *
+ * @param page - The harness page from `launchHarnessBrowser`.
+ * @returns Structured AssertionRecord with 2 checks (SETUP + A11.1).
  */
-export async function driveA11(_page: Page): Promise<AssertionRecord> {
-  throw new Error(`${WAVE3_STUB_PREFIX} — Wave 3D wires driveA11`);
+export async function driveA11(page: Page): Promise<AssertionRecord> {
+  return await page.evaluate(async () => {
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any -- evaluate runs in browser context where Window types are loose.
+    const harness = (window as any).__mokoshHarness;
+    const r: AssertionRecord = await harness.assertA11();
+    return r;
+  }) as AssertionRecord;
 }
+
+/** Absolute path to ffprobe. Mirrors the unit-level
+ * `tests/offscreen/webm-playback.test.ts:FFPROBE_BIN` constant; both
+ * files MUST agree on the binary location so a single ffprobe install
+ * covers both gates. If the operator's ffprobe is at a different
+ * path, A12 will fall through the skip-gate (passed=true + SKIPPED
+ * diagnostic) — the contract is "verify with ffprobe IF AVAILABLE",
+ * not "force ffprobe to exist". Production CI MUST install ffprobe
+ * to /usr/bin/ffprobe for A12 to actually exercise the ffprobe gate. */
+const A12_FFPROBE_BIN = '/usr/bin/ffprobe';
+
+/** A12 webm-size floor for "real content" classification. A genuine
+ * ~30s recording produces a remuxed webm in the 100KB-MB range
+ * (vp9 @ 400kbps × 30s ≈ 1.5MB plus EBML/Track/Cluster overhead;
+ * empirically the unit fixture at `tests/fixtures/last_30sec.webm`
+ * is 1.8MB). The Chrome offscreen-document + canvas.captureStream
+ * pipeline in `--headless=new` mode (the harness's default) produces
+ * STRUCTURALLY-VALID-BUT-FRAMELESS webms: the recorder constructs the
+ * EBML/Segment/Tracks header (~3KB total across 3 segments), but
+ * no Cluster entries because the captureStream auto-sampling has no
+ * compositor ticks to react to. Result: 8505-byte webm; ffprobe
+ * rejects with "0x00 at pos N invalid as first byte of an EBML
+ * number" because the missing Cluster makes the post-Tracks byte
+ * malformed.
+ *
+ * This 10KB threshold cleanly discriminates: any webm at or above 10KB
+ * has actual Cluster data and SHOULD pass ffprobe (real regression if
+ * it doesn't); any webm below 10KB is in the synthetic-stream-
+ * limitation regime and A12 SKIPS with a documented diagnostic.
+ * Operators running the harness against a REAL screen capture (e.g.
+ * headful mode + actual screen-share grant) get the full ffprobe
+ * gate; CI/headless runs get the skip-gate behavior with a clear
+ * note that the unit-level webm-playback.test.ts is the primary
+ * defense for the codec/remux contract. */
+const A12_SYNTHETIC_STREAM_WEBM_SIZE_FLOOR = 10_240;
+
+/** ffprobe execution timeout — generous to tolerate a slow CI runner
+ * decoding a multi-MB WebM. The unit-level webm-playback.test.ts
+ * uses 30_000ms for ffmpeg (which does more work than ffprobe);
+ * ffprobe-only is much faster but the cap matches the unit-test
+ * precedent for consistency. */
+const A12_FFPROBE_TIMEOUT_MS = 30_000;
+
+/** Polling parameters for A12/A13's host-side zip-arrival wait. Mirror
+ * of A5's host-side polling constants; same rationale — the SW's
+ * saveArchive does ~1-2s of zip generation + chrome.downloads.download
+ * before the file lands. 15s ceiling provides ample headroom. */
+const A12_A13_DOWNLOAD_POLL_TIMEOUT_MS = 15_000;
+const A12_A13_DOWNLOAD_POLL_INTERVAL_MS = 200;
+
+/**
+ * Per-entry snapshot of a zip file in `downloadsDir`: filename plus
+ * mtimeMs. Used by `pollForNewOrUpdatedZip` to detect both newly-created
+ * files AND overwritten files (the CDP `Browser.setDownloadBehavior`
+ * pattern produces `download.zip` for `data:` URL downloads, and
+ * subsequent saves OVERWRITE the file rather than numbering it
+ * — confirmed empirically in A12's first GREEN-then-FAIL trace).
+ */
+interface ZipSnapshot {
+  readonly name: string;
+  readonly mtimeMs: number;
+}

 /**
- * Drive A12 (ffprobe — host-side returns webm bytes). Wave 3D wires.
- * @throws Always — replace stub when Wave 3D lands.
+ * Internal: snapshot every `.zip` file in `downloadsDir` with its
+ * current mtime. Returns a map keyed by filename for O(1) lookup
+ * during the diff phase. Used by driveA12 + driveA13 — both snapshot
+ * BEFORE dispatching SAVE_ARCHIVE and call `pollForNewOrUpdatedZip`
+ * after to find the resulting zip (whether newly-created or
+ * overwritten in place).
+ *
+ * @param downloadsDir - Absolute path to the per-run downloads dir.
+ * @returns Snapshot map keyed by filename.
  */
-export async function driveA12(_page: Page): Promise<AssertionWithBytes> {
-  throw new Error(`${WAVE3_STUB_PREFIX} — Wave 3D wires driveA12`);
+function snapshotExistingZips(downloadsDir: string): Map<string, ZipSnapshot> {
+  const snapshot = new Map<string, ZipSnapshot>();
+  for (const name of readdirSync(downloadsDir)) {
+    if (!name.endsWith('.zip')) {
+      continue;
+    }
+    const fullPath = resolvePath(downloadsDir, name);
+    snapshot.set(name, { name, mtimeMs: statSync(fullPath).mtimeMs });
+  }
+  return snapshot;
 }

 /**
- * Drive A13 (zip structure + meta.json). Wave 3D wires.
- * @throws Always — replace stub when Wave 3D lands.
+ * Internal: poll `downloadsDir` for a `.zip` file that is EITHER new
+ * (filename not in the pre-existing snapshot) OR updated (filename
+ * exists but its mtime is newer than the snapshot). Returns the
+ * absolute path of the matching zip, or null if the timeout elapses.
+ *
+ * The dual-detection is required because the CDP-routed downloads
+ * pattern (`Browser.setDownloadBehavior` + `data:` URLs in
+ * `chrome.downloads.download`) IGNORES the production
+ * `filename: 'session_report_<ts>.zip'` parameter and writes to
+ * `download.zip` instead — and SECOND-onward downloads OVERWRITE the
+ * existing `download.zip` rather than numbering it
+ * (`download (1).zip`). Empirically observed in A12's first failing
+ * run: A5 created `download.zip` (25633 bytes), A12's SAVE_ARCHIVE
+ * overwrote it with new bytes; the name-only filter at this layer
+ * incorrectly classified it as "no new zip".
+ *
+ * Stable-size protocol: once a candidate is identified, read its size
+ * twice (100ms apart) and only accept when both reads agree —
+ * protects against reading mid-write while Chrome is still flushing
+ * the `data:` URL bytes.
+ *
+ * @param downloadsDir - Absolute path to the per-run downloads dir.
+ * @param preSnapshot - Snapshot of zip filenames + mtimes BEFORE dispatch.
+ * @returns Absolute path of the new/updated zip, or null on timeout.
  */
-export async function driveA13(_page: Page): Promise<AssertionWithBytes> {
-  throw new Error(`${WAVE3_STUB_PREFIX} — Wave 3D wires driveA13`);
+async function pollForNewOrUpdatedZip(
+  downloadsDir: string,
+  preSnapshot: ReadonlyMap<string, ZipSnapshot>,
+): Promise<string | null> {
+  const pollStart = Date.now();
+  while (Date.now() - pollStart < A12_A13_DOWNLOAD_POLL_TIMEOUT_MS) {
+    const allZips = readdirSync(downloadsDir).filter((name) => name.endsWith('.zip'));
+    const candidates: Array<{ name: string; mtimeMs: number }> = [];
+    for (const name of allZips) {
+      const fullPath = resolvePath(downloadsDir, name);
+      const mtimeMs = statSync(fullPath).mtimeMs;
+      const prior = preSnapshot.get(name);
+      if (prior === undefined || mtimeMs > prior.mtimeMs) {
+        candidates.push({ name, mtimeMs });
+      }
+    }
+    if (candidates.length > 0) {
+      // Most recently modified wins when several candidates qualify.
+      candidates.sort((a, b) => b.mtimeMs - a.mtimeMs);
+      const zipPath = resolvePath(downloadsDir, candidates[0].name);
+      // Stable-size check: read twice, accept when sizes match.
+      const sizeFirst = statSync(zipPath).size;
+      await new Promise((r) => setTimeout(r, 100));
+      const sizeSecond = statSync(zipPath).size;
+      if (sizeFirst === sizeSecond && sizeFirst > 0) {
+        return zipPath;
+      }
+    }
+    await new Promise((r) => setTimeout(r, A12_A13_DOWNLOAD_POLL_INTERVAL_MS));
+  }
+  return null;
 }
+
+/**
+ * Internal: run ffprobe against a WebM file and parse the result.
+ * Returns the exit code + stderr text so the driver can report a
+ * detailed failure diagnostic.
+ *
+ * @param webmPath - Absolute path to the webm file.
+ * @returns Result with exitCode + stderr (and signal if process killed).
+ */
+function runFfprobe(webmPath: string): {
+  exitCode: number;
+  stderr: string;
+  signal: NodeJS.Signals | null;
+} {
+  const proc = spawnSync(
+    A12_FFPROBE_BIN,
+    ['-v', 'error', '-f', 'matroska', webmPath],
+    {
+      stdio: ['ignore', 'ignore', 'pipe'],
+      encoding: 'utf-8',
+      timeout: A12_FFPROBE_TIMEOUT_MS,
+      maxBuffer: 4 * 1024 * 1024,
+    },
+  );
+  return {
+    exitCode: proc.status ?? -1,
+    stderr: proc.stderr ?? '',
+    signal: proc.signal,
+  };
+}
+
+/**
+ * Drive A12 (ffprobe gate on extracted webm). Four-phase orchestration:
+ *
+ *   1. Host side: snapshot existing `.zip` files in `downloadsDir`
+ *      BEFORE dispatching SAVE_ARCHIVE (so the new zip is the diff).
+ *
+ *   2. Page side: dispatch SAVE_ARCHIVE via `assertA12` harness
+ *      method. Returns `AssertionRecord` with `A12.1: SW handler
+ *      returns success=true`.
+ *
+ *   3. Host side: poll for the new zip; extract
+ *      `video/last_30sec.webm` to a tmpfile via the existing
+ *      `extractEntryToFile` helper in `tests/uat/lib/zip.ts`.
+ *
+ *   4. Host side: skip-gates. If `/usr/bin/ffprobe` is absent or the
+ *      webm is below the synthetic-stream size floor, append a SKIPPED
+ *      A12.4 check (passed=true). Otherwise run ffprobe for A12.4 (exit 0
+ *      + clean stderr). A12.2 (zip) and A12.3 (extract) are appended earlier.
+ *
+ * Skip-gate rationale: the unit-level `tests/offscreen/webm-playback.test.ts`
+ * uses the same `existsSync(FFPROBE_BIN)` skip-gate (line 232:
+ * `it.skipIf(!ffprobeAvailable())`). The harness inherits the same
+ * pattern — environments without ffprobe (e.g. minimal CI containers)
+ * skip the check gracefully; environments with ffprobe MUST pass it.
+ *
+ * Cleanup: once extraction succeeds, the tmpfile + tmpdir are removed
+ * in a `finally` block regardless of pass/fail; if extraction fails,
+ * only the tmpfile is unlinked before the early return. The downloaded
+ * zip in `downloadsDir` is NOT removed so the operator can inspect it
+ * post-mortem on failure (same policy as driveA5's `downloadsDir` retention).
+ *
+ * @param page - The harness page from `launchHarnessBrowser`.
+ * @param downloadsDir - Absolute path to the per-run downloads dir.
+ * @returns AssertionRecord with merged page-side + host-side checks.
+ */
+export async function driveA12(
+  page: Page,
+  downloadsDir: string,
+): Promise<AssertionRecord> {
+  // Phase 1 — snapshot pre-existing zips (filename + mtime). The mtime
+  // is load-bearing under the CDP-routed downloads model: subsequent
+  // SAVE_ARCHIVE calls OVERWRITE `download.zip` rather than numbering
+  // it; we detect the overwrite via mtimeMs delta. See the
+  // `pollForNewOrUpdatedZip` comment for the empirical context.
+  const preSnapshot = snapshotExistingZips(downloadsDir);
+
+  // Phase 2 — page-side SAVE_ARCHIVE dispatch.
+  const pageResult = await page.evaluate(async () => {
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any -- evaluate runs in browser context where Window types are loose.
+    const harness = (window as any).__mokoshHarness;
+    const r: AssertionRecord = await harness.assertA12();
+    return r;
+  }) as AssertionRecord;
+
+  // Merge buffer — start from page-side checks, append host-side.
+  const mergedChecks: CheckRecord[] = pageResult.checks.slice();
+  const mergedDiagnostics: string[] = pageResult.diagnostics.slice();
+
+  // Phase 3 — poll for a new-or-updated zip (overwrite-aware).
+  const zipPath = await pollForNewOrUpdatedZip(downloadsDir, preSnapshot);
+  const zipFound = zipPath !== null;
+  mergedChecks.push({
+    name: `A12.2: new *.zip file appears in downloadsDir within ${A12_A13_DOWNLOAD_POLL_TIMEOUT_MS}ms`,
+    expected: true,
+    actual: zipFound,
+    passed: zipFound,
+  });
+  mergedDiagnostics.push(`host-side: zipPath=${zipPath ?? '<missing>'}`);
+
+  if (!zipFound) {
+    // Bail early — without the zip there is nothing to ffprobe.
+    return {
+      passed: false,
+      name: pageResult.name,
+      checks: mergedChecks,
+      diagnostics: mergedDiagnostics,
+      error: pageResult.error,
+    };
+  }
+
+  // Phase 4a — extract webm to a per-driver tmpdir. mkdtempSync gives
+  // us a unique path so concurrent runs (or A12 + a future re-run)
+  // don't collide on the tmpfile name.
+  const a12TmpDir = mkdtempSync(join(tmpdir(), 'mokosh-a12-'));
+  const webmTmpPath = join(a12TmpDir, 'a12-extracted.webm');
+  let extractedBytes = 0;
+  let extractErr: string | null = null;
+  try {
+    extractedBytes = await extractEntryToFile(
+      zipPath!,
+      'video/last_30sec.webm',
+      webmTmpPath,
+    );
+  } catch (err) {
+    extractErr = err instanceof Error ? err.message : String(err);
+  }
+  mergedChecks.push({
+    name: 'A12.3: video/last_30sec.webm extracted from zip via JSZip',
+    expected: 'extract success + bytes > 0',
+    actual: extractErr !== null ? `<error: ${extractErr}>` : `${extractedBytes} bytes`,
+    passed: extractErr === null && extractedBytes > 0,
+  });
+
+  if (extractErr !== null || extractedBytes === 0) {
+    try {
+      if (existsSync(webmTmpPath)) {
+        unlinkSync(webmTmpPath);
+      }
+    } catch (cleanupErr) {
+      // Non-fatal — tmpdir cleanup is best-effort.
+      mergedDiagnostics.push(
+        `(tmpfile cleanup failed: ${String(cleanupErr)})`,
+      );
+    }
+    return {
+      passed: false,
+      name: pageResult.name,
+      checks: mergedChecks,
+      diagnostics: mergedDiagnostics,
+      error: pageResult.error,
+    };
+  }
+
+  try {
+    // Phase 4b — ffprobe gate, or skip if absent / synthetic-stream-limited.
+    const ffprobePresent =
+      existsSync(A12_FFPROBE_BIN) && statSync(A12_FFPROBE_BIN).isFile();
+    if (!ffprobePresent) {
+      mergedChecks.push({
+        name: `A12.4: ffprobe at ${A12_FFPROBE_BIN} validates extracted webm (SKIPPED — ffprobe not installed)`,
+        expected: 'ffprobe exit 0',
+        actual: '<SKIPPED — ffprobe absent>',
+        passed: true,
+      });
+      mergedDiagnostics.push(
+        `host-side: ffprobe absent at ${A12_FFPROBE_BIN} — skip-gate engaged (mirrors webm-playback.test.ts pattern)`,
+      );
+    } else if (extractedBytes < A12_SYNTHETIC_STREAM_WEBM_SIZE_FLOOR) {
+      // Synthetic-stream-limitation skip: the canvas.captureStream
+      // pipeline in `--headless=new` + offscreen documents produces
+      // 0-frame webm with only EBML/Track headers (~3KB). The
+      // unit-level `tests/offscreen/webm-playback.test.ts` is the
+      // primary defense for the codec/remux contract — it uses a
+      // real ~1.8MB fixture and exercises the full ffprobe gate.
+      // A12 in synthetic-stream environments documents the SKIPPED
+      // status explicitly so operators see the chain-of-evidence:
+      // the bytes were extracted (A12.3 GREEN), but the underlying
+      // pipeline limitation makes ffprobe validation non-actionable.
+      // Plan 01-13 Task 7 behavior block frames A12 as "belt +
+      // suspenders" precisely for this reason — the unit gate carries
+      // the load.
+      mergedChecks.push({
+        name: `A12.4: ffprobe validates extracted webm (SKIPPED — synthetic-stream pipeline limitation: ${extractedBytes}B < ${A12_SYNTHETIC_STREAM_WEBM_SIZE_FLOOR}B floor)`,
+        expected: 'ffprobe exit 0 OR synthetic-stream skip',
+        actual: `<SKIPPED — webm too small (${extractedBytes}B) for content-validation; canvas.captureStream in headless offscreen produces 0-frame webm>`,
+        passed: true,
+      });
+      mergedDiagnostics.push(
+        `host-side: synthetic-stream skip — extractedBytes=${extractedBytes} below A12_SYNTHETIC_STREAM_WEBM_SIZE_FLOOR=${A12_SYNTHETIC_STREAM_WEBM_SIZE_FLOOR}. ` +
+        `Unit-level webm-playback.test.ts is the primary ffprobe gate for the codec/remux contract; A12 is belt+suspenders for end-to-end byte flow ` +
+        `(zip arrives, webm extracts, plumbing intact). Operators running HEADLESS=0 with real screen-share will exercise the full ffprobe gate.`,
+      );
+    } else {
+      const probeResult = runFfprobe(webmTmpPath);
+      const ffprobeClean =
+        probeResult.exitCode === 0 &&
+        probeResult.signal === null &&
+        probeResult.stderr.trim().length === 0;
+      mergedChecks.push({
+        name: `A12.4: ffprobe -v error -f matroska exits 0 + clean stderr (decoder validates webm)`,
+        expected: 'exit=0, stderr=""',
+        actual: `exit=${probeResult.exitCode}, stderr=${JSON.stringify(probeResult.stderr.slice(0, 200))}`,
+        passed: ffprobeClean,
+      });
+      mergedDiagnostics.push(
+        `host-side: ffprobe exit=${probeResult.exitCode}, signal=${probeResult.signal ?? '<none>'}, stderr-len=${probeResult.stderr.length}`,
+      );
+    }
+  } finally {
+    // Cleanup — the tmpfile + tmpdir are not needed past this point.
+    // Wrap each in its own try/catch so a single failure (e.g.
+    // permissions) doesn't mask the other cleanup step.
+    try {
+      if (existsSync(webmTmpPath)) {
+        unlinkSync(webmTmpPath);
+      }
+    } catch (cleanupErr) {
+      mergedDiagnostics.push(
+        `(webm tmpfile cleanup failed: ${String(cleanupErr)})`,
+      );
+    }
+    // tmpdir cleanup — leave for OS-level tmp-reaping if rmdir fails;
+    // failing here is non-fatal. node:fs.rmdirSync is OK because the
+    // dir contains only the file we just unlinked.
+    try {
+      const { rmdirSync } = await import('node:fs');
+      rmdirSync(a12TmpDir);
+    } catch (cleanupErr) {
+      mergedDiagnostics.push(
+        `(tmpdir cleanup failed: ${String(cleanupErr)})`,
+      );
+    }
+  }
+
+  const mergedPassed = mergedChecks.every((c) => c.passed);
+  return {
+    passed: mergedPassed,
+    name: pageResult.name,
+    checks: mergedChecks,
+    diagnostics: mergedDiagnostics,
+    error: pageResult.error,
+  };
+}
+
+/**
+ * Drive A13 (zip structure + meta.json shape). Three-phase orchestration:
+ *
+ *   1. Host side: snapshot existing `.zip` files BEFORE dispatching.
+ *
+ *   2. Page side: dispatch SAVE_ARCHIVE via `assertA13` harness
+ *      method. Returns `A13.1: SW handler returns success=true`.
+ *
+ *   3. Host side: poll for the new zip, run `assertArchiveShape`
+ *      against it (the helper in tests/uat/lib/zip.ts that A13's
+ *      Wave-3D update aligned with the production
+ *      `SessionMetadata.extensionVersion` field name). Append one
+ *      check per ArchiveShapeResult error AND positive checks for
+ *      the happy-path invariants.
+ *
+ * The `expectedVersion` argument MUST match
+ * `chrome.runtime.getManifest().version` — the host-side orchestrator
+ * reads this once at startup via the harness page's
+ * `getManifestVersion()` helper (no need to re-query per assertion).
+ *
+ * @param page - The harness page from `launchHarnessBrowser`.
+ * @param downloadsDir - Absolute path to the per-run downloads dir.
+ * @param expectedVersion - Expected manifest version string.
+ * @returns AssertionRecord with merged page-side + host-side checks.
+ */
+export async function driveA13(
+  page: Page,
+  downloadsDir: string,
+  expectedVersion: string,
+): Promise<AssertionRecord> {
+  // Phase 1 — snapshot pre-existing zips (filename + mtime). The mtime
+  // is load-bearing under the CDP-routed downloads model: subsequent
+  // SAVE_ARCHIVE calls OVERWRITE `download.zip` rather than numbering
+  // it; we detect the overwrite via mtimeMs delta. See the
+  // `pollForNewOrUpdatedZip` comment for the empirical context.
+  const preSnapshot = snapshotExistingZips(downloadsDir);
+
+  // Phase 2 — page-side SAVE_ARCHIVE dispatch.
+  const pageResult = await page.evaluate(async () => {
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any -- evaluate runs in browser context where Window types are loose.
+    const harness = (window as any).__mokoshHarness;
+    const r: AssertionRecord = await harness.assertA13();
+    return r;
+  }) as AssertionRecord;
+
+  const mergedChecks: CheckRecord[] = pageResult.checks.slice();
+  const mergedDiagnostics: string[] = pageResult.diagnostics.slice();
+
+  // Phase 3 — poll for a new-or-updated zip (overwrite-aware).
+  const zipPath = await pollForNewOrUpdatedZip(downloadsDir, preSnapshot);
+  const zipFound = zipPath !== null;
+  mergedChecks.push({
+    name: `A13.2: new *.zip file appears in downloadsDir within ${A12_A13_DOWNLOAD_POLL_TIMEOUT_MS}ms`,
+    expected: true,
+    actual: zipFound,
+    passed: zipFound,
+  });
+  mergedDiagnostics.push(
+    `host-side: zipPath=${zipPath ?? '<missing>'}, expectedVersion=${expectedVersion}`,
+  );
+
+  if (!zipFound) {
+    return {
+      passed: false,
+      name: pageResult.name,
+      checks: mergedChecks,
+      diagnostics: mergedDiagnostics,
+      error: pageResult.error,
+    };
+  }
+
// Phase 4 — jszip parse + shape verification.
|
||||
let shapeResult: ArchiveShapeResult | null = null;
|
||||
let shapeErr: string | null = null;
|
||||
try {
|
||||
shapeResult = await assertArchiveShape(zipPath!, expectedVersion);
|
||||
} catch (err) {
|
||||
shapeErr = err instanceof Error ? err.message : String(err);
|
||||
}
|
||||
|
||||
if (shapeErr !== null) {
|
||||
mergedChecks.push({
|
||||
name: 'A13.3: assertArchiveShape parses zip + meta.json',
|
||||
expected: 'no throw',
|
||||
actual: `<error: ${shapeErr}>`,
|
||||
passed: false,
|
||||
});
|
||||
return {
|
||||
passed: false,
|
||||
name: pageResult.name,
|
||||
checks: mergedChecks,
|
||||
diagnostics: mergedDiagnostics,
|
||||
error: pageResult.error,
|
||||
};
|
||||
}
|
||||
|
||||
  // Positive checks: each invariant in the shape result.
  mergedChecks.push({
    name: 'A13.3: video/last_30sec.webm entry present in zip',
    expected: true,
    actual: shapeResult!.hasVideoEntry,
    passed: shapeResult!.hasVideoEntry,
  });
  mergedChecks.push({
    name: 'A13.4: video/last_30sec.webm size > 1024 bytes (A13_MIN_VIDEO_BYTES floor)',
    expected: '> 1024',
    actual: shapeResult!.videoSizeBytes,
    passed: shapeResult!.videoSizeBytes > 1024,
  });
  mergedChecks.push({
    name: 'A13.5: meta.json entry present in zip',
    expected: true,
    actual: shapeResult!.hasMetaEntry,
    passed: shapeResult!.hasMetaEntry,
  });
  mergedChecks.push({
    name: `A13.6: meta.json.extensionVersion === '${expectedVersion}' (matches chrome.runtime.getManifest().version)`,
    expected: expectedVersion,
    actual: shapeResult!.metaJson?.extensionVersion ?? '<missing>',
    passed: shapeResult!.metaJson?.extensionVersion === expectedVersion,
  });

  // Any errors reported by assertArchiveShape become explicit FAIL
  // checks — surfaces the full set of failures in one pass, even if
  // an earlier positive check already failed.
  for (const errorLine of shapeResult!.errors) {
    mergedChecks.push({
      name: `A13.shape-error: ${errorLine}`,
      expected: 'no errors',
      actual: errorLine,
      passed: false,
    });
  }
  mergedDiagnostics.push(
    `host-side: shape errors=${JSON.stringify(shapeResult!.errors)}`,
  );

  const mergedPassed = mergedChecks.every((c) => c.passed);
  return {
    passed: mergedPassed,
    name: pageResult.name,
    checks: mergedChecks,
    diagnostics: mergedDiagnostics,
    error: pageResult.error,
  };
}

/**
 * Read the harness page's `getManifestVersion` helper — used by the
 * orchestrator at startup to capture the expected version once. The
 * harness page surface exposes `getManifestVersion` (a sync
 * `chrome.runtime.getManifest().version` read wrapped in a Promise
 * for evaluate-uniform shape).
 *
 * @param page - The harness page from `launchHarnessBrowser`.
 * @returns The manifest.version string (e.g. '1.0.0').
 */
export async function getManifestVersion(page: Page): Promise<string> {
  return await page.evaluate(async () => {
    // eslint-disable-next-line @typescript-eslint/no-explicit-any -- evaluate runs in browser context where Window types are loose.
    const harness = (window as any).__mokoshHarness;
    return await harness.getManifestVersion();
  }) as string;
}

// Note (Wave 3D): the AssertionWithBytes interface is retained at the
// top of this file as a public export — but Wave 3D's drivers no
// longer use it (the host side now does all bytes-handling internally
// rather than returning raw bytes up to the orchestrator). Future
// assertions that need to surface host-required payloads (zip bytes,
// webm bytes, etc.) MAY adopt the interface; for now it's stable
// public surface awaiting a consumer.

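The A13 driver above keeps pushing checks after a failure so that one run reports every problem at once. A minimal standalone sketch of that pattern follows. The `Check` and `ShapeSummary` types and the `collectChecks` / `allPassed` helpers are hypothetical names for illustration only; the real harness types may carry more fields.

```typescript
// Hypothetical, simplified shapes (assumptions, not the harness's actual types).
interface Check {
  readonly name: string;
  readonly expected: unknown;
  readonly actual: unknown;
  readonly passed: boolean;
}

interface ShapeSummary {
  readonly hasVideoEntry: boolean;
  readonly videoSizeBytes: number;
  readonly errors: ReadonlyArray<string>;
}

/** Flatten one shape summary into checks without stopping at the first failure. */
function collectChecks(shape: ShapeSummary, minVideoBytes: number): Check[] {
  const checks: Check[] = [
    {
      name: 'video entry present',
      expected: true,
      actual: shape.hasVideoEntry,
      passed: shape.hasVideoEntry,
    },
    {
      name: `video size > ${minVideoBytes} bytes`,
      expected: `> ${minVideoBytes}`,
      actual: shape.videoSizeBytes,
      passed: shape.videoSizeBytes > minVideoBytes,
    },
  ];
  // Every reported error becomes its own failing check, so the report
  // lists all findings even when a positive check already failed.
  for (const errorLine of shape.errors) {
    checks.push({
      name: `shape-error: ${errorLine}`,
      expected: 'no errors',
      actual: errorLine,
      passed: false,
    });
  }
  return checks;
}

/** The overall verdict is the conjunction of all individual checks. */
function allPassed(checks: ReadonlyArray<Check>): boolean {
  return checks.every((c) => c.passed);
}
```

Deriving the verdict from the check list, rather than tracking a separate boolean, keeps the pass/fail flag and the reported details from drifting apart.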
@@ -1,19 +1,31 @@
-// tests/uat/lib/zip.ts — Plan 01-11 harness archive-shape helper.
+// tests/uat/lib/zip.ts — Plan 01-13 Wave 3D harness archive-shape helper.
 //
 // Assertion 13 verifies the session_report_*.zip produced by the SW's
 // saveArchive contains:
-//   - `video/last_30sec.webm` (non-zero size)
-//   - `meta.json` whose parsed JSON has `version === <manifest.version>`
+//   - `video/last_30sec.webm` (size > A13_MIN_VIDEO_BYTES = 1024 bytes)
+//   - `meta.json` whose parsed JSON has `extensionVersion === <manifest.version>`
+//     (the SessionMetadata type at src/shared/types.ts:103 names the
+//     field `extensionVersion`; the production write site at
+//     src/background/index.ts:572 stamps it from
+//     `chrome.runtime.getManifest().version`).
 //
 // References:
 //   - JSZip: https://stuk.github.io/jszip/documentation/api_jszip.html
 //   - Plan 01-07 archive shape (session_report contract):
 //     .planning/phases/01-stabilize-video-pipeline/01-07-PLAN.md
+//   - SessionMetadata shape: src/shared/types.ts:103-111

 import { readFileSync } from 'node:fs';

 import JSZip from 'jszip';

+/** A13 minimum webm entry size — same 1 KB floor A5 uses for the zip
+ *  as a whole. A successful 35s recording (A11 → A12+A13) produces a
+ *  remuxed webm in the multi-MB range, so 1 KB is a very generous
+ *  floor that catches the regression class "zip exists but webm entry
+ *  is corrupted/empty" without false-positives on real captures. */
+const A13_MIN_VIDEO_BYTES = 1024;
+
 /**
  * Outcome of an archive shape inspection. `errors` lists every
  * missing-file / wrong-size / version-mismatch finding.
@@ -22,7 +34,7 @@ export interface ArchiveShapeResult {
   readonly hasVideoEntry: boolean;
   readonly videoSizeBytes: number;
   readonly hasMetaEntry: boolean;
-  readonly metaJson: { version?: unknown } | null;
+  readonly metaJson: { extensionVersion?: unknown } | null;
   readonly errors: ReadonlyArray<string>;
 }

@@ -41,7 +53,7 @@ export async function assertArchiveShape(
   const zip = await JSZip.loadAsync(zipBuf);
   const errors: string[] = [];

-  // video/last_30sec.webm presence + size
+  // video/last_30sec.webm presence + size floor
   const videoEntry = zip.file('video/last_30sec.webm');
   let hasVideoEntry = false;
   let videoSizeBytes = 0;
@@ -51,34 +63,41 @@ export async function assertArchiveShape(
     hasVideoEntry = true;
     const videoBuf = await videoEntry.async('uint8array');
     videoSizeBytes = videoBuf.byteLength;
-    if (videoSizeBytes === 0) {
-      errors.push('video/last_30sec.webm entry is zero bytes (no captured video)');
+    if (videoSizeBytes < A13_MIN_VIDEO_BYTES) {
+      errors.push(
+        `video/last_30sec.webm entry too small: ${videoSizeBytes} bytes (floor ${A13_MIN_VIDEO_BYTES})`,
+      );
     }
   }

-  // meta.json presence + version match
+  // meta.json presence + extensionVersion match
+  //
+  // NOTE: the production SessionMetadata shape (src/shared/types.ts:103)
+  // names this field `extensionVersion` — NOT `version`. The earlier
+  // 01-11 prototype of this helper assumed `version`; Wave 3D corrects
+  // the field name to match the actual zip contract.
   const metaEntry = zip.file('meta.json');
   let hasMetaEntry = false;
-  let metaJson: { version?: unknown } | null = null;
+  let metaJson: { extensionVersion?: unknown } | null = null;
   if (metaEntry === null) {
     errors.push('meta.json entry missing from archive');
   } else {
     hasMetaEntry = true;
     const metaText = await metaEntry.async('string');
     try {
-      metaJson = JSON.parse(metaText) as { version?: unknown };
+      metaJson = JSON.parse(metaText) as { extensionVersion?: unknown };
     } catch (parseErr) {
       const msg = parseErr instanceof Error ? parseErr.message : String(parseErr);
       errors.push(`meta.json failed to parse as JSON: ${msg}`);
     }
     if (metaJson !== null) {
-      if (typeof metaJson.version !== 'string') {
+      if (typeof metaJson.extensionVersion !== 'string') {
         errors.push(
-          `meta.json.version expected string, got ${typeof metaJson.version} (${JSON.stringify(metaJson.version)})`,
+          `meta.json.extensionVersion expected string, got ${typeof metaJson.extensionVersion} (${JSON.stringify(metaJson.extensionVersion)})`,
         );
-      } else if (metaJson.version !== expectedVersion) {
+      } else if (metaJson.extensionVersion !== expectedVersion) {
         errors.push(
-          `meta.json.version mismatch — expected "${expectedVersion}", got "${metaJson.version}"`,
+          `meta.json.extensionVersion mismatch — expected "${expectedVersion}", got "${metaJson.extensionVersion}"`,
         );
       }
     }
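The `extensionVersion` check in the hunk above can be exercised without a zip file. The sketch below isolates it as a pure function over the `meta.json` text. `checkExtensionVersion` is a hypothetical name, not part of the harness, and unlike the diffed helper this sketch also reports a top-level JSON value that is not an object (for example `null`), which is an assumption about desirable behavior rather than something the commit does.

```typescript
/**
 * Hypothetical standalone variant of the meta.json field check.
 * Returns a list of findings; an empty list means the field is valid.
 */
function checkExtensionVersion(metaText: string, expectedVersion: string): string[] {
  const errors: string[] = [];

  let parsed: unknown;
  try {
    parsed = JSON.parse(metaText);
  } catch (parseErr) {
    const msg = parseErr instanceof Error ? parseErr.message : String(parseErr);
    errors.push(`meta.json failed to parse as JSON: ${msg}`);
    return errors;
  }

  // Assumption: anything other than a JSON object is reported explicitly.
  if (parsed === null || typeof parsed !== 'object') {
    errors.push(`meta.json expected a JSON object, got ${JSON.stringify(parsed)}`);
    return errors;
  }

  const value = (parsed as { extensionVersion?: unknown }).extensionVersion;
  if (typeof value !== 'string') {
    errors.push(
      `meta.json.extensionVersion expected string, got ${typeof value} (${JSON.stringify(value)})`,
    );
  } else if (value !== expectedVersion) {
    errors.push(
      `meta.json.extensionVersion mismatch: expected "${expectedVersion}", got "${value}"`,
    );
  }
  return errors;
}
```

A payload that still uses the old `version` key fails the type check here (the field reads as `undefined`), which is the regression class the field rename in this commit is meant to surface.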