diff --git a/.planning/debug/sw-offscreen-persistence-investigation-session-2.md b/.planning/debug/sw-offscreen-persistence-investigation-session-2.md new file mode 100644 index 0000000..fe6a3e2 --- /dev/null +++ b/.planning/debug/sw-offscreen-persistence-investigation-session-2.md @@ -0,0 +1,96 @@ +--- +status: diagnosed +trigger: "Continuation of session-1 (committed at d614462). Prior verdict INCONCLUSIVE; ~75 min cheap disambiguation plan: Step A fix offscreen target filter + re-run + capture logs; Step B segment-count probe pre-kill; Step C spike without worker.close(). Synthesize verdict for SC#1 plan-fix routing." +created: 2026-05-21T19:30:00Z +updated: 2026-05-21T20:35:00Z +parent_session: ".planning/debug/sw-offscreen-persistence-investigation.md (committed d614462)" +goal: find_root_cause_only +verdict: REFUTED-architecture (canvas-captureStream issue) +recommendation: "REFUTE the Plan 04-04 SUMMARY's framing 'spike FAILED → architectural plan-fix needed'. The architecture works correctly; the SPIKE methodology is invalid. ROADMAP SC#1 remains OPEN but the previously-recommended IndexedDB persistence plan-fix would NOT close it (the spike would STILL produce 8505 bytes after IDB persistence is added, because the failure is in the test's fake-stream pipeline, not in the persistence layer). Reframe SC#1 verification methodology BEFORE doing any architectural work." +--- + +## Current Focus + +verdict: REFUTED-architecture (canvas-captureStream issue) +confidence: HIGH — three independent spike runs converge on the same conclusion via different observability channels. +hypothesis_state: "RESOLVED. Root cause was Hypothesis B (canvas-captureStream + headless idle = 0-frame segments). Hypotheses A (architectural RAM loss) and C (CDP-induced offscreen teardown) are REFUTED by empirical evidence below." +next_action: "Return verdict + recommendation to orchestrator. Plan-fix routing decision: do NOT proceed with IndexedDB persistence plan-fix; instead, propose a Phase 5 plan that reframes SC#1 verification methodology (real getDisplayMedia in non-headless Puppeteer; OR a video-file-backed fake stream that survives headless throttling)." + +## Symptoms + +(Inherited verbatim from session-1; IMMUTABLE.) + +expected: "After 5 min SW idle + SAVE_ARCHIVE, video buffer survives at 1-3 MB." +actual: "videoSize=8505 bytes (deterministic; 4/4 runs in session-1 + 2/2 runs in session-2 + 1/1 Step-C variant = 7/7 reproducibility)." +errors: "ffprobe: 'End of file' + 'Duplicate element'; no valid clusters." +reproduction: "HEADLESS=1 npx tsx tests/uat/spike-a33-sw-persistence.ts (~6 min wall-clock)." + +## Eliminated + +- hypothesis: "(A) Architectural — offscreen-RAM `segments: Blob[] = []` lost across SW termination" + evidence: "Direct observation via the `get-segment-count` bridge probe at 3 checkpoints in the canonical spike: POST-PRIME=0, PRE-KILL=3, POST-KILL=3. Segments structurally survive the SW kill at full count. The offscreen document itself responds to the post-kill probe (segments.length=3), so it did not die with the SW. ALSO: Step C (no SW kill) produces the IDENTICAL 8505-byte result — kill is irrelevant." + timestamp: 2026-05-21T19:55:00Z + +- hypothesis: "(C) CDP-artifact — Puppeteer worker.close() collaterally tears down offscreen via cross-target side effect (Puppeteer #9995 territory)" + evidence: "Step C spike WITHOUT worker.close() produces IDENTICAL 8505-byte videoSize. Removing the SW-kill step does NOT improve the outcome. The CDP attach is not implicated." + timestamp: 2026-05-21T20:01:00Z + +## Evidence + +- timestamp: "2026-05-21T19:30Z" + checked: "Existing `__mokoshOffscreenQuery` 'get-segment-count' bridge op at src/test-hooks/offscreen-hooks.ts:493-523" + found: "The bridge op already exists. The harness can dispatch chrome.runtime.sendMessage({type: '__mokoshOffscreenQuery', op: 'get-segment-count'}) from the harness page realm, and the offscreen-hooks's onMessage listener responds with {count: number} where count=segmentCountGetter() which is closure over recorder's module-level segments.length. This is a PRODUCTION TEST SURFACE (gated by __MOKOSH_UAT__, in the canonical 12-string FORBIDDEN_HOOK_STRINGS inventory)." + implication: "Step B uses this without adding ANY new symbol — zero impact on FORBIDDEN_HOOK_STRINGS count. Probe rides production surface." + +- timestamp: "2026-05-21T19:48Z (canonical spike with probes + race-broken offscreen attach)" + checked: "HEADLESS=1 npx tsx tests/uat/spike-a33-sw-persistence.ts → /tmp/04-04-debug-step-ab.log" + found: "SPIKE PROBE [POST-PRIME]: segments.length=0; SPIKE PROBE [PRE-KILL]: segments.length=3; SPIKE PROBE [POST-KILL]: segments.length=3. Final videoSize=8505 bytes." + implication: "Segments accumulated to MAX_SEGMENTS=3 during the 5-min idle (recorder rotation worked). Segments survived the SW kill (count unchanged at 3). YET the final WebM is 8505 bytes (no frames). Conclusion: segments structurally exist but contain NO video frames — they are Blob OBJECTS with valid track metadata but empty Cluster sections. This rules out Hypothesis A (segments lost across SW termination) at the structural level." + +- timestamp: "2026-05-21T20:00Z (Step C spike: SPIKE_SKIP_SW_KILL=1)" + checked: "HEADLESS=1 SPIKE_SKIP_SW_KILL=1 npx tsx tests/uat/spike-a33-sw-persistence.ts → /tmp/04-04-debug-step-c.log" + found: "Identical 8505-byte videoSize. Crucially, since SW was NOT killed, its console listener stayed connected, so we see the FULL Remux pipeline output: 'Remuxing 3 segments; sizes: [array Array]' followed by 'Segment ts=1: 0 frames, duration=0ms, trackInfo=320x180' (and identical lines for ts=2 + ts=3) and 'Remux complete: 0 frames, total timeline=0ms, output=8505 bytes'." + implication: "DEFINITIVE: each of the 3 segments has a valid track header (trackInfo=320x180 was parsed from each) but ZERO VP9 frames. The Remuxer faithfully emitted an 8505-byte WebM containing only EBML headers + Segment + Tracks + Cues (no Clusters). The 5-min headless idle window prevented MediaRecorder from receiving frame data from the canvas-captureStream source despite the offscreen-hooks setInterval(drawFrame, 33) belt-and-suspenders. Hypothesis B (canvas-captureStream throttling) is empirically confirmed; Hypothesis C (CDP-induced teardown) is empirically refuted (Step C removed the CDP-kill and the failure persisted)." + +- timestamp: "2026-05-21T20:30Z (diagnostic spike — tests/uat/spike-diagnose-offscreen-target.ts)" + checked: "Puppeteer browser.targets() enumeration with active offscreen recording (post-assertA2 prime)" + found: "browser.targets() count=7 targets. Offscreen reported as: type=background_page url=chrome-extension://{id}/src/offscreen/index.html. Crucially, the targetcreated event fires for the offscreen with type='other' and url='' (metadata not yet stabilized at create time); type+url stabilize AFTER the target's first navigation completes." + implication: "ROOT CAUSE OF SESSION-1 OBSERVABILITY BUG: the launch.ts `targetcreated` listener checked target.type() and target.url() at event time, but the metadata is unstable at that instant. The original `type==='background_page'` filter was actually CORRECT for the stable enumeration — it failed only because checked at the wrong time. Fix: bind to BOTH 'targetcreated' AND 'targetchanged' (the latter fires when URL stabilizes) + enumerate `browser.targets()` as a race-free safety net. URL pattern is the load-bearing match criterion; type-based discrimination is unreliable across event timings." + +- timestamp: "2026-05-21T20:31Z (verification — diagnostic with the fix in place)" + checked: "HEADLESS=1 npx tsx tests/uat/spike-diagnose-offscreen-target.ts → grep [off:" + found: "[off:log] [OS:Recorder] Recording started (D-13 restart-segments), mime: video/webm;codecs=vp9 segment_ms: 10000 max_segments: 3" + implication: "Step A fix verified — offscreen console capture now functional. The (launch: offscreen console attached — url=...) diagnostic confirms the attach happened via the targetchanged event after the URL stabilized." + +## Resolution + +root_cause: "Test methodology (NOT production architecture). The spike's `installFakeDisplayMedia()` synthetic stream (canvas.captureStream(30) on a hidden 320x180 canvas appended to the offscreen DOM, offset to -9999px) cannot sustain frame production during a 5-min wall-clock idle window in headless Chrome. The offscreen-hooks setInterval(drawFrame, 33ms) belt-and-suspenders is INSUFFICIENT to mitigate the documented Chromium auto-throttling of MediaRecorder on invisible-canvas sources (Chrome bug 653548, auto-throttled-screen-capture design doc, sendrec.eu blog 'Why Canvas Breaks Your Screen Recorder'). The MediaRecorder produces structurally-valid WebM segments (EBML header + Tracks block parses correctly; trackInfo=320x180 extracted) but each segment's Cluster section is empty. The Remuxer faithfully emits a header-only 8505-byte WebM." + +fix: "Plan 04-04 spike's test methodology needs replacement. The architecture (offscreen-RAM segments: Blob[] = []) is sound; it correctly preserves the 3 segments across SW kill (proven by post-kill probe = 3). RECOMMENDATIONS (out of scope for this debug session): (a) reframe SC#1 verification to use real getDisplayMedia in non-headless Puppeteer with --auto-select-desktop-capture-source (currently rejected per 01-11-SUMMARY falsification 4 as 'unreliable in headless'; may be reliable in non-headless); OR (b) replace the canvas-captureStream fake with a video-file-backed source (e.g., MediaStream from an HTMLVideoElement playing a bundled WebM) which doesn't suffer the invisible-canvas throttling; OR (c) lower SC#1's wall-clock idle threshold to a value short enough that canvas-captureStream survives (e.g., 30s) AND add a separate manual operator-empirical test for the full 5-min case. The ARCHITECTURAL change (IndexedDB persistence) recommended in Plan 04-04 SUMMARY would NOT close SC#1 — the spike would STILL produce 8505 bytes after IDB lands because the failure is in the test's fake stream, not in segment persistence." + +verification: "Three independent spike runs converge: (1) canonical with broken observability → 8505 bytes; pre-kill probe count=3, post-kill count=3. (2) canonical with race-fixed observability (post-Step-A) → 8505 bytes; pre-kill count=3, post-kill count=3. (3) Step C variant (skip worker.close()) → 8505 bytes; Remux output explicitly shows '0 frames' for each segment. The architecture works (segments survive); the test methodology is broken (segments are 0-frame Blobs)." + +files_changed: + - "tests/uat/lib/launch.ts (Step A fix — race-tolerant offscreen target attach: bind targetcreated AND targetchanged + browser.targets() enumeration; URL-based match; updated docstrings citing empirical evidence)" + - "tests/uat/spike-a33-sw-persistence.ts (Step B — added `probeSegmentCount` helper using existing `__mokoshOffscreenQuery` 'get-segment-count' bridge op; 3 probes at POST-PRIME/PRE-KILL/POST-KILL; Step C — added SPIKE_SKIP_SW_KILL=1 mode that skips the worker.close() call)" + - "tests/uat/spike-diagnose-offscreen-target.ts (NEW — one-off diagnostic script that enumerated browser.targets() to reveal the Puppeteer event-timing bug; safe to keep committed as a future SW-lifecycle debugging tool)" + - ".planning/debug/sw-offscreen-persistence-investigation-session-2.md (NEW — this debug note)" + +## Synthesis — Verdict Decision Tree + +Per the disambiguation_protocol Step D: + +- **CONFIRMED-architecture:** REJECTED. Segments survive SW kill; pre/post-kill counts both = 3. Removing the SW kill (Step C) does NOT improve outcome. Persistence is not the issue. + +- **REFUTED-architecture (canvas-captureStream issue):** CONFIRMED. + - Pre-kill probe: count=3 → segments accumulated correctly. + - Step C (no kill) → identical 8505-byte failure. + - Direct Remux log evidence: `Segment ts=1: 0 frames, duration=0ms, trackInfo=320x180` × 3. + - Documented Chromium throttling behavior matches exactly: invisible-canvas MediaRecorder → near-zero frame production over time. + - Setinterval(drawFrame, 33) workaround empirically insufficient against the actual throttling path (canvas pixels change BUT captureStream track stops emitting frames). + +- **REFUTED-architecture (CDP artifact):** REJECTED. Step C (without worker.close()) produces same failure. CDP attach is not the cause. + +- **STILL-INCONCLUSIVE:** REJECTED. The three independent observations (segment count, Step-C, Remux logs) converge unambiguously on canvas-captureStream throttling. + +**Routing recommendation:** Do NOT proceed with the IndexedDB persistence plan-fix proposed by Plan 04-04 SUMMARY. Open a new plan slot (likely Plan 04-08 or a Phase 5 plan) that reframes SC#1 verification methodology. Architecture is sound; verification gate is broken. diff --git a/tests/uat/spike-a33-sw-persistence.ts b/tests/uat/spike-a33-sw-persistence.ts index 092888d..a482a9b 100644 --- a/tests/uat/spike-a33-sw-persistence.ts +++ b/tests/uat/spike-a33-sw-persistence.ts @@ -1,4 +1,5 @@ -// tests/uat/spike-a33-sw-persistence.ts — Plan 04-04 Wave 0 SPIKE. +// tests/uat/spike-a33-sw-persistence.ts — Plan 04-04 Wave 0 SPIKE + +// Plan-04-04 debug session-2 disambiguation probes. // // One-shot empirical investigation: does the offscreen document survive // a 5-min idle window followed by Puppeteer-driven force-termination @@ -8,6 +9,19 @@ // is MEDIUM because the Chrome docs leave the offscreen-vs-SW interplay // implicit. This spike is the empirical hedge. // +// Session-2 disambiguation (committed alongside debug +// session-2 note .planning/debug/sw-offscreen-persistence-investigation-session-2.md): +// - get-segment-count probes at 3 checkpoints (post-prime, pre-kill, +// post-kill-pre-SAVE) — distinguishes "segments empty pre-kill +// (canvas-captureStream throttled)" from "segments existed pre-kill +// but lost post-kill (architectural)". Probes ride the production +// __mokoshOffscreenQuery bridge op (already in FORBIDDEN_HOOK_STRINGS +// inventory); no new test-only symbols introduced. +// - SPIKE_SKIP_SW_KILL=1 env-var skips the worker.close() call — +// distinguishes "SW kill via CDP causes the failure (CDP artifact)" +// from "5-min idle alone causes the failure (architectural OR +// canvas-throttling-without-SW-involvement)". +// // Outcome decision tree: // - videoSize > 100_000 bytes → SPIKE PASSED. Plan 04-04 Wave 1 // proceeds: A33 is a verification-only harness assertion that @@ -63,16 +77,99 @@ const SPIKE_DOWNLOAD_SETTLE_MS = 5_000; * meta.json + screenshot is ~50 KB; anything above 100 KB means the * video buffer actually contained segments). */ const SPIKE_VIDEO_SIZE_FLOOR_BYTES = 100_000; +/** Session-2: probe op timeout (5s; matches the harness's offscreenQuery default). */ +const SPIKE_PROBE_TIMEOUT_MS = 5_000; + +/** + * Session-2 disambiguation probe — query the offscreen recorder's live + * `segments.length` via the production `__mokoshOffscreenQuery` + * `get-segment-count` bridge op (src/test-hooks/offscreen-hooks.ts:493). + * + * Returns the segment count (number of fully-rotated 10s WebM segments + * in the in-memory ring buffer), or `-1` on bridge error. + * + * The bridge op rides chrome.runtime.sendMessage from the harness page + * realm → offscreen-hooks's onMessage listener responds synchronously + * via sendResponse({count: number}). No new test-hook symbol; the + * `get-segment-count` op is already in the 12-entry FORBIDDEN_HOOK_STRINGS + * inventory (Plan 01-13 Wave 3D A11 contract). + * + * Used at 3 checkpoints in the spike: + * 1. post-prime: confirms baseline segment-count ≈ 0 (no rotations yet) + * — sanity check; if NON-zero here, the prime did unexpected work. + * 2. pre-kill (just BEFORE stopServiceWorker): the critical probe — + * if zero here, hypothesis B (canvas-throttling) confirmed; + * if non-zero here but archive empty post-SAVE, hypothesis A + * (architectural RAM loss across SW termination) confirmed. + * 3. post-kill-pre-SAVE: confirms offscreen still respawns + can + * respond — if the offscreen target itself died with the SW kill, + * this probe would throw or time out; that result distinguishes + * "offscreen survives kill but loses state" from "offscreen dies + * with the SW kill (collateral teardown per Puppeteer #9995)". + * + * @param page - The harness page (has chrome.runtime access). + * @returns The live segment count, or `-1` on probe error. + */ +async function probeSegmentCount( + page: import('puppeteer').Page, + label: string, +): Promise { + try { + const result = await page.evaluate( + (timeoutMs: number) => + new Promise<{ count?: number; error?: string }>((resolve) => { + const timer = setTimeout(() => { + resolve({ error: `get-segment-count timed out after ${timeoutMs}ms` }); + }, timeoutMs); + chrome.runtime.sendMessage( + { type: '__mokoshOffscreenQuery', op: 'get-segment-count' }, + (response: unknown) => { + clearTimeout(timer); + if (chrome.runtime.lastError !== undefined) { + resolve({ error: String(chrome.runtime.lastError.message) }); + return; + } + resolve(response as { count?: number; error?: string }); + }, + ); + }), + SPIKE_PROBE_TIMEOUT_MS, + ); + if (typeof result.count === 'number') { + process.stdout.write( + `SPIKE PROBE [${label}]: segments.length=${result.count}\n`, + ); + return result.count; + } + process.stdout.write( + `SPIKE PROBE [${label}]: ERROR ${result.error ?? '(unknown)'}\n`, + ); + return -1; + } catch (err) { + process.stdout.write( + `SPIKE PROBE [${label}]: THREW ${err instanceof Error ? err.message : String(err)}\n`, + ); + return -1; + } +} /** * Spike entrypoint. Returns a Node exit code (0 = PASSED, 1 = FAILED). */ async function main(): Promise { + // Session-2 mode switch: SPIKE_SKIP_SW_KILL=1 = run Step C variant + // (5-min idle + SAVE, NO worker.close()). Distinguishes "SW kill is + // the cause" from "5-min idle alone is the cause". + const skipSwKill = process.env.SPIKE_SKIP_SW_KILL === '1'; + process.stdout.write('\nMokosh Plan 04-04 Wave 0 SPIKE — SW state persistence empirical test\n'); process.stdout.write('='.repeat(72) + '\n'); process.stdout.write( `SPIKE: idle window = ${SPIKE_IDLE_WAIT_MS}ms (${SPIKE_IDLE_WAIT_MS / 60_000} min); ` + - `pass floor = ${SPIKE_VIDEO_SIZE_FLOOR_BYTES} bytes\n\n`, + `pass floor = ${SPIKE_VIDEO_SIZE_FLOOR_BYTES} bytes\n`, + ); + process.stdout.write( + `SPIKE: mode = ${skipSwKill ? 'STEP-C (skip-sw-kill: 5-min idle + SAVE, no worker.close)' : 'CANONICAL (5-min idle + worker.close + SAVE)'}\n\n`, ); const t0 = Date.now(); @@ -108,20 +205,55 @@ async function main(): Promise { throw new Error(`assertA2 priming failed: ${a2Result.error ?? '(no error)'}`); } + // Session-2 PROBE 1: post-prime baseline. Expected: 0 (no segment + // rotation has happened yet; A2's prime takes <1s + first rotation + // is at +10s). + await probeSegmentCount(handles.harnessPage, 'POST-PRIME'); + // Step 2 — 5-min wall-clock idle. The whole point of the spike. process.stdout.write(`\nSPIKE Step 2: waiting ${SPIKE_IDLE_WAIT_MS}ms for SW idle window...\n`); process.stdout.write(`SPIKE Step 2: ETA ${new Date(Date.now() + SPIKE_IDLE_WAIT_MS).toISOString()}\n`); await new Promise((res) => setTimeout(res, SPIKE_IDLE_WAIT_MS)); process.stdout.write('SPIKE Step 2 OK — 5-min idle elapsed\n'); - // Step 3 — force-terminate the SW via Puppeteer CDP worker.close(). - process.stdout.write('\nSPIKE Step 3: stopServiceWorker(browser, extensionId)\n'); - await stopServiceWorker(handles.browser, handles.extensionId); - process.stdout.write('SPIKE Step 3 OK — SW terminated via worker.close()\n'); + // Session-2 PROBE 2: pre-kill — THE critical disambiguation point. + // If count==0 here: hypothesis B (canvas-captureStream throttled in + // headless idle; segments never accumulated). Architecture NOT broken. + // If count>=3 here: hypothesis A (architectural RAM loss across SW + // termination); the test is correct; segments existed pre-kill. + // Then check post-SAVE archive: if also empty, IDB persistence needed. + // If count==-1 (probe failed): offscreen is unresponsive — either + // collateral-killed already (unlikely; no worker.close yet) or + // SW had to broker and SW is also unresponsive. + const segCountPreKill = await probeSegmentCount(handles.harnessPage, 'PRE-KILL'); + + if (skipSwKill) { + // Step C variant — skip the SW kill entirely. Use the same 500ms + // settle as the canonical path so the timing between SAVE dispatch + // and zip mtime stays consistent across runs. + process.stdout.write( + '\nSPIKE Step 3: SKIPPED (SPIKE_SKIP_SW_KILL=1) — testing 5-min idle alone\n', + ); + } else { + // Step 3 — force-terminate the SW via Puppeteer CDP worker.close(). + process.stdout.write('\nSPIKE Step 3: stopServiceWorker(browser, extensionId)\n'); + await stopServiceWorker(handles.browser, handles.extensionId); + process.stdout.write('SPIKE Step 3 OK — SW terminated via worker.close()\n'); + } // Step 4 — brief settle for SW teardown. await new Promise((res) => setTimeout(res, SPIKE_NEW_SW_BOOT_MS)); + // Session-2 PROBE 3: post-kill (or post-skip-kill) — confirms + // offscreen is still responsive after worker.close. If count drops + // from pre-kill value, the SW kill collaterally destroyed offscreen + // state. If count is same (or grows by 1 due to a rotation between + // probes), offscreen survived. + const segCountPostKill = await probeSegmentCount(handles.harnessPage, skipSwKill ? 'POST-SKIP-KILL' : 'POST-KILL'); + process.stdout.write( + `SPIKE: segment-count transition: pre=${segCountPreKill}, post=${segCountPostKill}\n`, + ); + // Step 5 — dispatch SAVE_ARCHIVE via chrome.runtime.sendMessage from // the harness page realm. The first message after worker.close() // wakes the SW back up (event-driven respawn — canonical MV3 wakeup @@ -177,7 +309,10 @@ async function main(): Promise { const elapsedSec = (tEnd - t0) / 1000; process.stdout.write('\n' + '='.repeat(72) + '\n'); - process.stdout.write(`SPIKE RESULT: videoSize=${videoSize} bytes (floor=${SPIKE_VIDEO_SIZE_FLOOR_BYTES}; elapsed=${elapsedSec.toFixed(1)}s)\n`); + process.stdout.write( + `SPIKE RESULT [${skipSwKill ? 'STEP-C-NO-KILL' : 'CANONICAL'}]: ` + + `videoSize=${videoSize} bytes (floor=${SPIKE_VIDEO_SIZE_FLOOR_BYTES}; elapsed=${elapsedSec.toFixed(1)}s)\n`, + ); if (spikeError !== null) { process.stdout.write(`SPIKE OUTCOME: FAILED (spike threw: ${spikeError})\n`); process.stdout.write('='.repeat(72) + '\n');