fix(debug): A33.1 SAVE-ack race — gate on race-free fresh-archive signal
Root cause: driveA33's A33.1 hard-gated on the chrome.runtime.sendMessage SAVE_ARCHIVE callback ack. After the Puppeteer CDP worker.close() SW kill, the SAVE_ARCHIVE message wakes a fresh SW instance; that instance runs the multi-step saveArchive() pipeline (offscreen video-keepalive port re-establishment + REQUEST_BUFFER round-trip + rrweb collection + zip build). The harness's original sendMessage response port has its own MV3 lifetime; on a 5-min-aged SW the pipeline INTERMITTENTLY outruns it, surfacing chrome.runtime.lastError "message port closed before a response was received". The archive is still written correctly every time, which is why A33.2/A33.3 always passed (Plan 04-05 full-mode UAT: A33.1 FAIL while A33.2/A33.3 PASS at 1.56 MB). A33.1 was gating a CI assertion on a best-effort transport ack with inherent MV3 non-determinism.

Fix (harness-side only, Option A: race-free reframe): A33.1 now gates on the durable race-free signal, a fresh archive on disk, via the canonical snapshotExistingZips + pollForNewOrUpdatedZip helpers (also used by driveA12/A13/A27). The sendMessage ack is demoted to a soft non-gating diagnostic. This is exactly the signal the proven-reliable spike already uses. A33.2/A33.3 substantive checks are intact and now read the verified fresh zip. No new symbol; FORBIDDEN_HOOK_STRINGS unchanged at 12. The SW SAVE_ARCHIVE handler is a correct MV3 async pattern, so there is no production change.

Verified: full-mode A33 (genuine 5-min idle) 3/3 GREEN; skip-mode UAT 35/35 GREEN; tsc + build:test exit 0; vitest 184/184.

Debug session: .planning/debug/a33-save-ack-race.md

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
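For readers without the harness source at hand, the snapshot-then-poll signal described in the fix can be sketched roughly as follows. This is a minimal illustration only, not the actual snapshotExistingZips / pollForNewOrUpdatedZip implementation: the names snapshotZipsSketch and pollForFreshZipSketch, the ZipSnapshot shape, and the default timeout and interval values are all assumptions made for the example.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical snapshot shape: absolute zip path -> mtime and size at snapshot time.
type ZipSnapshot = Map<string, { mtimeMs: number; size: number }>;

// Record every *.zip currently in `dir` (assumed stand-in for snapshotExistingZips).
function snapshotZipsSketch(dir: string): ZipSnapshot {
  const snap: ZipSnapshot = new Map();
  if (!fs.existsSync(dir)) return snap;
  for (const name of fs.readdirSync(dir)) {
    if (!name.endsWith(".zip")) continue;
    const full = path.join(dir, name);
    const st = fs.statSync(full);
    snap.set(full, { mtimeMs: st.mtimeMs, size: st.size });
  }
  return snap;
}

// Poll until a zip is new or changed relative to `before` AND its size is
// identical on two consecutive polls (a simple stable-size check).
// Returns the path, or null on timeout. Timeout/interval defaults are assumptions.
async function pollForFreshZipSketch(
  dir: string,
  before: ZipSnapshot,
  timeoutMs = 30_000,
  intervalMs = 250,
): Promise<string | null> {
  const deadline = Date.now() + timeoutMs;
  let candidate: string | null = null;
  let lastSize = -1;
  while (Date.now() < deadline) {
    let found: { file: string; size: number } | null = null;
    for (const [file, meta] of snapshotZipsSketch(dir)) {
      const prev = before.get(file);
      // Fresh = not in the snapshot, or overwritten in place (mtime/size changed).
      const fresh =
        prev === undefined || meta.mtimeMs > prev.mtimeMs || meta.size !== prev.size;
      if (fresh && meta.size > 0) {
        found = { file, size: meta.size };
        break;
      }
    }
    if (found !== null) {
      if (found.file === candidate && found.size === lastSize) return found.file;
      candidate = found.file;
      lastSize = found.size;
    } else {
      candidate = null;
      lastSize = -1;
    }
    await new Promise((res) => setTimeout(res, intervalMs));
  }
  return null;
}
```

The point of comparing against a pre-SAVE snapshot is that an overwritten file with the same name still counts as fresh, and the caller learns whether the SAVE produced an archive without depending on the response port.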
@@ -2527,8 +2527,12 @@ export async function driveA32(page: Page): Promise<AssertionRecord> {
 // `worker.close()` because Puppeteer's persistent CDP attach keeps
 // SWs alive indefinitely; natural 30s idle eviction does NOT fire
 // under test conditions per Chrome devrel.
-// - `findLatestZip(downloadsDir)` — exported helper from Plan 04-04;
-// mtime-sort archive selection.
+// - `snapshotExistingZips` + `pollForNewOrUpdatedZip` — canonical
+// race-free post-SAVE archive detection (also used by driveA12/A13/
+// A27). A33.1 gates on a fresh zip appearing here. The debug session
+// .planning/debug/a33-save-ack-race.md replaced an earlier
+// `findLatestZip` + sendMessage-ack-gated A33.1 with this race-free
+// signal (the ack is now a soft diagnostic only).
 // - `__mokoshHarness.assertA2` — canonical "go to REC state" entrypoint
 // per Plan 04-04 REVISION iter-2 Option B (read_first verified:
 // __mokoshHarness has assertA1..A31 + getManifestVersion; A2 does
@@ -2567,16 +2571,35 @@ const A33_VIDEO_SIZE_FLOOR_BYTES = 100_000;
  * 2. Waiting 5 min wall-clock for the SW idle window to elapse.
  * 3. Force-terminating the SW via stopServiceWorker (Puppeteer CDP).
  * 4. Settling for SW teardown.
- * 5. Dispatching SAVE_ARCHIVE inline via chrome.runtime.sendMessage
- *    (wakes SW event-driven per the canonical MV3 wakeup path).
+ * 5. Snapshotting the pre-SAVE zip state, then dispatching SAVE_ARCHIVE
+ *    inline via chrome.runtime.sendMessage (wakes SW event-driven per
+ *    the canonical MV3 wakeup path).
  * 6. Settling for chrome.downloads to finish writing.
- * 7. Locating the produced zip + measuring video/last_30sec.webm size.
+ * 7. Polling downloadsDir for a FRESH archive (race-free), then
+ *    measuring video/last_30sec.webm size.
  *
  * Checks (3 total):
- * - A33.1: SAVE_ARCHIVE ack success after 5-min idle + SW kill
+ * - A33.1: a fresh archive appeared in downloadsDir within the poll
+ *   timeout after SAVE_ARCHIVE dispatch (race-free durable
+ *   signal — the SAVE actually produced an archive).
  * - A33.2: video/last_30sec.webm size > 0 (buffer survived SW eviction)
  * - A33.3: video size > 100 KB (sanity floor; real archives 1-3 MB)
  *
+ * A33.1 design (debug session .planning/debug/a33-save-ack-race.md):
+ * The chrome.runtime.sendMessage callback ack is NOT a gating check. After
+ * worker.close() force-kills the SW, the SAVE_ARCHIVE message wakes a
+ * FRESH SW instance; that instance runs the multi-step saveArchive()
+ * pipeline (offscreen video-keepalive port re-establishment + REQUEST_BUFFER
+ * round-trip + rrweb collection + zip build). The harness's original
+ * sendMessage response port has its own MV3 lifetime — on a 5-min-aged SW
+ * the pipeline INTERMITTENTLY outruns it, surfacing chrome.runtime.lastError
+ * ("message port closed before a response was received"). The archive is
+ * still written correctly every time (saveArchive() + chrome.downloads
+ * complete regardless of whether the ack reaches the harness). So A33.1
+ * gates on the durable race-free signal — a fresh zip on disk — exactly
+ * as the spike (tests/uat/spike-a33-sw-persistence.ts) does; the ack is
+ * captured as a soft diagnostic only.
+ *
  * Env-gating: when this driver runs, the orchestrator does NOT skip the
  * 5-min wait — caller should wrap with SKIP_LONG_UAT env-gate at the
  * harness.test.ts level. See harness.test.ts for the gate.
@@ -2586,6 +2609,7 @@ const A33_VIDEO_SIZE_FLOOR_BYTES = 100_000;
  * References:
  * - Plan 04-04 PLAN.md Pattern 4 (revived verbatim under valid methodology)
  * - Plan 04-08 PLAN.md Task 2
+ * - .planning/debug/a33-save-ack-race.md (A33.1 race-free reframe)
  * - .planning/debug/sw-offscreen-persistence-investigation-session-2.md
  * - https://developer.chrome.com/docs/extensions/how-to/test/test-serviceworker-termination-with-puppeteer
  *
@@ -2633,10 +2657,22 @@ export async function driveA33(
   // Step 4 — brief settle for SW teardown.
   await new Promise((res) => setTimeout(res, A33_NEW_SW_BOOT_MS));

-  // Step 5 — SAVE_ARCHIVE inline dispatch from harness-page realm
-  // (Plan 04-04 REVISION iter-2 Option B; wakes SW event-driven).
-  // No dedicated dispatch-save-archive helper symbol is intentionally
-  // introduced — see Plan 04-08 Task 2 Step 3 contract.
+  // Step 5 — snapshot the pre-SAVE zip state, then dispatch SAVE_ARCHIVE
+  // inline from the harness-page realm (Plan 04-04 REVISION iter-2
+  // Option B; wakes SW event-driven). No dedicated dispatch-save-archive
+  // helper symbol is intentionally introduced — see Plan 04-08 Task 2
+  // Step 3 contract.
+  //
+  // The sendMessage callback ack is captured as a SOFT DIAGNOSTIC only,
+  // NOT a gating check — see the function doc + debug session
+  // .planning/debug/a33-save-ack-race.md. The freshly-woken SW completes
+  // saveArchive() + writes the archive regardless of whether the original
+  // response port survives long enough for the ack to land; gating on it
+  // is a flaky-by-design test (the ack intermittently surfaces
+  // chrome.runtime.lastError "message port closed before a response was
+  // received" on the worker.close() -> respawn boundary). A33.1 instead
+  // gates on the durable race-free signal — a fresh zip on disk.
+  const preSnapshot = snapshotExistingZips(downloadsDir);
   const saveResult = await page.evaluate(
     (timeoutMs: number) =>
       new Promise<{ success: boolean; error?: string }>((resolve) => {
@@ -2654,25 +2690,29 @@ export async function driveA33(
       }),
     A33_SAVE_ARCHIVE_TIMEOUT_MS,
   );
-  checks.push({
-    name: 'A33.1: SAVE_ARCHIVE ack success after 5-min idle + SW kill',
-    expected: true,
-    actual: saveResult.success,
-    passed: saveResult.success === true,
-  });
+  diagnostics.push(
+    `A33 Step 5: SAVE_ARCHIVE sendMessage ack (soft diagnostic, non-gating) -> ` +
+      `success=${saveResult.success}` +
+      (saveResult.error !== undefined ? ` error="${saveResult.error}"` : ''),
+  );

   // Step 6 — settle for chrome.downloads to finish writing.
   await new Promise((res) => setTimeout(res, A33_DOWNLOAD_SETTLE_MS));

-  // Step 7 — locate the produced zip + measure the video entry.
-  const zipPath = findLatestZip(downloadsDir);
+  // Step 7 — poll downloadsDir for a FRESH archive (race-free). This is
+  // the canonical post-SAVE detection used by driveA12/A13/A27 — it
+  // tolerates the CDP `download.zip` overwrite pattern (mtime diff vs the
+  // pre-SAVE snapshot) and uses the stable-size protocol. A33.1 gates on
+  // this: the SAVE provably produced an archive after the 5-min idle +
+  // SW kill, independent of the best-effort sendMessage ack.
+  const zipPath = await pollForNewOrUpdatedZip(downloadsDir, preSnapshot);
+  checks.push({
+    name: 'A33.1: fresh archive written to downloadsDir after 5-min idle + SW kill (race-free; sendMessage ack is a soft diagnostic per .planning/debug/a33-save-ack-race.md)',
+    expected: 'fresh zip within poll timeout',
+    actual: zipPath !== null ? `fresh zip: ${zipPath}` : 'no fresh zip within poll timeout',
+    passed: zipPath !== null,
+  });
   if (zipPath === null) {
-    checks.push({
-      name: 'A33.0: at least one zip present in downloadsDir',
-      expected: '>=1 zip',
-      actual: 'no zip in downloadsDir',
-      passed: false,
-    });
     return {
       passed: false,
       name: 'A33 — SW state persistence (5-min idle + SW kill; ROADMAP SC #1)',
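The body of the page.evaluate callback in Step 5 is elided by the hunk boundary, so the following is only a hedged sketch of how a callback-style send can be turned into a soft, non-throwing ack result with a timeout. The names sendWithSoftAck, Sender, and formatAckDiagnostic are hypothetical, and the Sender type is a stand-in for chrome.runtime.sendMessage so the example runs outside a browser; the lastError accessor is likewise an assumption modelling chrome.runtime.lastError.

```typescript
type AckResult = { success: boolean; error?: string };

// Hypothetical stand-in for a callback-style message sender.
type Sender = (message: unknown, callback: (response?: AckResult) => void) => void;

// Resolve (never reject) with the ack, a transport error, or a timeout marker,
// so the caller can log the outcome without gating a check on it.
function sendWithSoftAck(
  send: Sender,
  message: unknown,
  timeoutMs: number,
  lastError: () => string | undefined = () => undefined,
): Promise<AckResult> {
  return new Promise((resolve) => {
    let settled = false;
    const finish = (result: AckResult): void => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      resolve(result);
    };
    const timer = setTimeout(
      () => finish({ success: false, error: `no ack within ${timeoutMs} ms` }),
      timeoutMs,
    );
    send(message, (response) => {
      const err = lastError();
      if (err !== undefined) {
        finish({ success: false, error: err });
        return;
      }
      finish(response ?? { success: false, error: "empty response" });
    });
  });
}

// Format the result as a diagnostic line in the same shape as the Step 5 message.
function formatAckDiagnostic(result: AckResult): string {
  return (
    `SAVE_ARCHIVE sendMessage ack (soft diagnostic, non-gating) -> ` +
    `success=${result.success}` +
    (result.error !== undefined ? ` error="${result.error}"` : "")
  );
}
```

Because the promise always resolves, a closed response port degrades into a logged diagnostic rather than a failed assertion, which matches the intent of demoting the ack while the fresh-archive poll carries the gate.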