Session-2 (/gsd-debug continuation) empirically refuted the SUMMARY's original 'architecture broken → IndexedDB plan-fix needed' interpretation:
- Pre-kill probe: segments.length=3 (segments accumulated correctly during 5-min idle)
- Post-kill probe: segments.length=3 (offscreen-RAM survives SW kill structurally)
- Step C (no worker.close, just 5-min idle): identical 8505 bytes (CDP not the cause)
- Remux logs: each segment trackInfo=320x180 but 0 frames per segment
- 7/7 spike runs deterministic at 8505 bytes (canvas-captureStream throttling)
Root cause: installFakeDisplayMedia() at src/test-hooks/offscreen-hooks.ts:139-264 mints canvas.captureStream(30) on a hidden -9999px-offset canvas; headless-Chromium throttles MediaRecorder on an invisible canvas (Chrome bug 653548). Segments exist but contain zero VP9 frames over the 5-min idle.
Routing: Plan 04-08 inserted (user-authorized ceremony 2026-05-22): video-file MediaStream methodology reframe (Option 2 from session-2). IndexedDB plan-fix recommendation REJECTED: it would not close SC#1 because frames are the problem, not segments. The stopServiceWorker helper, the spike script, and the launch.ts:225 race-tolerant fix all remain valid persisting artifacts for Plan 04-08.
| phase | plan | subsystem | tags | requires | provides | affects | tech-stack | key-files | key-decisions | patterns-established | requirements-completed | duration | completed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 04-harden-clean-up-optional | 04 | testing | | | | | | | | | | ~25 min | 2026-05-21 |
Phase 04 Plan 04: SW state persistence spike — empirical NO, plan-fix ceremony required
Wave 0 SPIKE empirically refutes RESEARCH Q2 MEDIUM-confidence hypothesis A3 (offscreen-document independent lifecycle anchored by active MediaRecorder): the current src/offscreen/recorder.ts:91 let segments: Blob[] = [] RAM-only architecture does NOT survive 5 minutes of SW idle + Puppeteer CDP worker.close(). Measured video/last_30sec.webm post-SAVE = 8505 bytes (broken WebM per ffprobe; no valid clusters; rrweb + events.json + meta.urls all empty/lost). Spike-first contract triggers — Task 2 (A33 verification-only harness assertion) BLOCKED; ROADMAP SC #1 remains OPEN; architectural change (IndexedDB persistence per RESEARCH Q2 sub-question b Option C) routes through plan-fix ceremony per saved-memory contract. Persisting positive artifacts committed: stopServiceWorker(browser, extensionId) helper (verbatim Chrome-devrel canonical pattern) at tests/uat/lib/harness-page-driver.ts + tests/uat/spike-a33-sw-persistence.ts forensic-evidence one-shot script. UAT harness stays at 33/33 GREEN (A33 NOT added); vitest baseline 183 preserved (3 pre-existing parallel-vitest flakes pass in isolation per 04-CONTEXT items 9-10).
Performance
- Duration: ~25 min (Phase 4 Wave 3; fourth plan in execution order)
- Started: 2026-05-21T16:32:00Z (executor re-spawn after prior agent confusion; took on-disk Wave 0 work as-is per the re-spawn handoff)
- Completed: 2026-05-21T18:55:00Z (this SUMMARY committed)
- Tasks: 1 of 2 plan tasks complete (Task 1: Wave 0 SPIKE; Task 2: BLOCKED by spike outcome per the gating condition)
- Files modified: 2 (tests/uat/lib/harness-page-driver.ts +43 / -6; tests/uat/spike-a33-sw-persistence.ts NEW +202)
- Production source changes: 0 (Plan 04-04 made ZERO source-code edits to src/; only adds tests/uat/ artifacts)
Accomplishments
- Wave 0 SPIKE executed end-to-end (Task 1): 308.7s wall-clock (~5min idle + ~8s orchestration). Step 1 assertA2 prime → REC state achieved; Step 2 5-min idle elapsed cleanly; Step 3 stopServiceWorker via Puppeteer CDP worker.close() succeeded; Step 4 500ms settle; Step 5 SAVE_ARCHIVE dispatch inline from harness-page realm via `chrome.runtime.sendMessage({type: 'SAVE_ARCHIVE'}, cb)` returned `{success: true}` (SW respawned event-driven on the message); Step 6 5s download settle; Step 7 findLatestZip + JSZip.loadAsync + `video/last_30sec.webm` extraction. Empirical numbers logged.
- Empirical refutation of RESEARCH MEDIUM-confidence hypothesis A3: `videoSize = 8505 bytes` (sanity floor was 100 KB; typical healthy archive 1-3 MB). The 8505 bytes are corrupt WebM per ffprobe (`End of file` + `Duplicate element` errors; no valid clusters). Companion zip entries also empty/lost: `rrweb/session.json` = `[]`, `logs/events.json` = `[]`, `meta.urls` = `[chrome-extension://*]` (real-page URLs LOST; this confirms the SW tab tracker was reset across the SW death, and the active probe tab's navigated state vanished too). Conclusive empirical NO.
- stopServiceWorker helper landed (Task 1 persisting artifact): canonical Chrome-devrel pattern at tests/uat/lib/harness-page-driver.ts:68-80. ``await browser.waitForTarget(t => t.type() === 'service_worker' && t.url().startsWith(`chrome-extension://${extensionId}`))`` → `target.worker()?.close()`. Docstring cites 3 authoritative references including the Chrome blog post on eyeOS's MV3 SW suspension testing journey.
- Spike script committed (Task 1 forensic evidence): tests/uat/spike-a33-sw-persistence.ts is 202 lines incl. extensive docstring documenting: spike outcome decision tree, architectural reuse rationale (assertA2 prime + chrome.runtime.sendMessage SAVE; both REVISION iter-2 Option B verified), references to PLAN.md + RESEARCH.md + Chrome docs. Future plan-fix re-runs this script as its regression-verification gate.
- Task 2 gating condition documented as NOT MET: per the first sentence of the plan's Task 2 `<action>`, "**GATING CONDITION:** Task 1 spike produced videoSize > 100_000. (If FAILED, this task is BLOCKED and the plan must be re-planned to add IndexedDB persistence work.)", the measured videoSize=8505 < 100_000, so Task 2 is BLOCKED. No code added for Task 2; UAT count stays at 33; FORBIDDEN_HOOK_STRINGS inventory unchanged at 12; A33 not introduced.
- ROADMAP SC #1 status communicated as OPEN: leaving the ROADMAP success-criteria row unflipped (cannot mark CLOSED on a FAILED spike). The next plan-fix's SUMMARY will close it when the persistence layer lands + the spike script is re-run + PASSES.
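The stopServiceWorker pattern named above can be sketched as follows. This is a minimal illustration, not the committed helper: `WorkerLike`, `TargetLike`, and `BrowserLike` are hypothetical structural stand-ins for the small slice of Puppeteer's `Browser` / `Target` surface the pattern touches, so the sketch type-checks without importing puppeteer. It awaits `target.worker()` before closing, which is an assumption about how the real helper sequences the call.

```typescript
// Hypothetical structural stand-ins (assumption: the real helper uses
// Puppeteer's own Browser / Target / WebWorker types instead).
interface WorkerLike {
  close(): Promise<void>;
}
interface TargetLike {
  type(): string;
  url(): string;
  worker(): Promise<WorkerLike | null>;
}
interface BrowserLike {
  waitForTarget(predicate: (t: TargetLike) => boolean): Promise<TargetLike>;
}

// Find the extension's service-worker target and close its worker,
// which forces the SW to terminate so that the next event respawns it.
async function stopServiceWorker(
  browser: BrowserLike,
  extensionId: string,
): Promise<void> {
  const prefix = `chrome-extension://${extensionId}`;
  const target = await browser.waitForTarget(
    (t) => t.type() === "service_worker" && t.url().startsWith(prefix),
  );
  const worker = await target.worker();
  // worker() may resolve to null if the target is already gone.
  await worker?.close();
}
```

Typing the helper against a narrow interface also makes it easy to exercise with a fake browser object in a unit test, without launching Chromium.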
Task Commits
Each plan task was committed atomically with normal git commits + pre-commit hooks (sequential foreground mode, in-line with Plans 04-01 + 04-02 + 04-03's protocol):
- Task 1: Wave 0 SPIKE (stopServiceWorker helper + 5-min SW idle empirical result): `3726eee` (feat). Adds Browser type to puppeteer import; adds `stopServiceWorker(browser, extensionId)` helper (verbatim Chrome-devrel canonical) at top of tests/uat/lib/harness-page-driver.ts; exports `findLatestZip` (was module-internal). Creates tests/uat/spike-a33-sw-persistence.ts one-shot reproducible spike script. Spike RAN to completion with explicit `videoSize=8505 bytes (floor=100000; elapsed=308.7s)` line + `SPIKE OUTCOME: FAILED (offscreen DIED — videoSize below floor)`. Acceptance criteria all met for the FAIL branch (script completed, no Puppeteer throw, explicit videoSize line, SAVE_ARCHIVE dispatch verified to use `chrome.runtime.sendMessage` not `dispatchSaveArchive`).
- Task 2: A33 SW state persistence harness assertion: BLOCKED, NOT COMMITTED. Per the plan's explicit gating condition (`If FAILED, this task is BLOCKED and the plan must be re-planned to add IndexedDB persistence work.`), no code was added; no UAT count flip; no FORBIDDEN_HOOK_STRINGS lockstep update; no orchestrator wiring. The re-planning event is delegated to /gsd-plan-phase rewrite OR /gsd-debug ceremony per saved-memory `feedback-gsd-ceremony-for-fixes.md`.
Plan metadata commit (will follow): docs(04-04): complete harden-clean-up-optional plan 04-04 — SW persistence spike FAILED, plan-fix ceremony required — includes this SUMMARY.md + STATE.md + ROADMAP.md updates.
Files Created/Modified
- `tests/uat/lib/harness-page-driver.ts`: MODIFIED. +43 / -6 lines. Added Browser type to puppeteer import at line 43. Added `stopServiceWorker(browser, extensionId)` helper as exported async function near top of file (after existing imports + assertion-record interface): verbatim Chrome-devrel canonical pattern with full docstring + 3 authoritative reference URLs. Exported `findLatestZip` (was module-internal); docstring updated to cite Plan 04-04 reuse rationale. Other driveA* / driveA1..driveA32 functions UNCHANGED.
- `tests/uat/spike-a33-sw-persistence.ts`: CREATED. 202 lines. One-shot reproducible empirical investigation script. Imports `launchHarnessBrowser` (from `./lib/launch.ts`) + `stopServiceWorker` + `findLatestZip` (from `./lib/harness-page-driver.ts`) + JSZip + readFileSync. Step 1 prime via `__mokoshHarness.assertA2`; Step 2 5-min wall-clock idle; Step 3 stopServiceWorker; Step 4 settle; Step 5 inline `chrome.runtime.sendMessage({type: 'SAVE_ARCHIVE'}, cb)` from harness-page realm; Step 6 download settle; Step 7 findLatestZip + JSZip + extract `video/last_30sec.webm`. PASS/FAIL gate at 100_000 bytes; exit code 0 = PASSED, 1 = FAILED. Run with `HEADLESS=1 tsx tests/uat/spike-a33-sw-persistence.ts`.
- `.planning/phases/04-harden-clean-up-optional/04-04-SUMMARY.md`: CREATED (this file).
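The PASS/FAIL gate described for the spike script (strictly greater than 100_000 bytes passes; exit code 0 = PASSED, 1 = FAILED) can be sketched as below. The names `VIDEO_SIZE_FLOOR_BYTES`, `classifySpike`, and `exitCodeFor` are hypothetical and chosen for illustration; the committed script may structure this differently.

```typescript
// Floor taken from the plan's gating condition: videoSize > 100_000.
const VIDEO_SIZE_FLOOR_BYTES = 100_000;

type SpikeOutcome = "PASSED" | "FAILED";

// Strictly-greater-than comparison: a video of exactly 100_000 bytes FAILS.
function classifySpike(
  videoSize: number,
  floor: number = VIDEO_SIZE_FLOOR_BYTES,
): SpikeOutcome {
  return videoSize > floor ? "PASSED" : "FAILED";
}

// Map the outcome to the process exit code the script reports.
function exitCodeFor(outcome: SpikeOutcome): number {
  return outcome === "PASSED" ? 0 : 1;
}

// The measured value from this run lands on the FAILED branch.
const measuredOutcome = classifySpike(8505);
console.log(`videoSize=8505 bytes (floor=${VIDEO_SIZE_FLOOR_BYTES}) -> ${measuredOutcome}`);
```

Keeping the comparison in one pure function means the same gate can be reused unchanged when the spike methodology is promoted to a harness assertion.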
Decisions Made
See key-decisions in frontmatter for the canonical list. Highlights:
- Honor spike-first contract — STOP at Task 1; do NOT improvise inline.
- Commit (not delete) spike script — forensic evidence + future regression test.
- Keep stopServiceWorker helper — non-empty positive artifact independent of Task 2 status.
- ROADMAP SC #1 stays OPEN — cannot mark CLOSED on a FAILED spike.
- Saved memory `feedback-gsd-ceremony-for-fixes.md` applied: architectural fix routes through plan-fix ceremony.
- Saved memory `feedback-no-unilateral-scope-reduction.md` honored: the full 5-min spike was run to completion; the STOP decision is the spike-first contract executing as designed, not a unilateral scope reduction.
- Pre-existing parallel-vitest flakes are NOT in Plan 04-04 scope: documented in CONTEXT items 9-10; pass in isolation; Plan 04-04 made zero source-code changes that could possibly affect them.
Deviations from Plan
None at the code level: the plan executed exactly as written through the spike-first decision point. The decision tree at lines 64-70 of the plan (`<objective>` section: "Wave 0 (spike): A30-min empirical investigation. ... Wave 1 (impl): Based on spike outcome ... if spike FAILS ... A33 implementation expands per RESEARCH Q2 sub-question (b) recommendation (Option C: IndexedDB persistence in offscreen) ... This is a wider plan rewrite; the plan-checker should flag for re-planning if it materializes.") + the explicit Task 2 GATING CONDITION at line 345 (**GATING CONDITION:** Task 1 spike produced videoSize > 100_000. (If FAILED, this task is BLOCKED and the plan must be re-planned to add IndexedDB persistence work.)) both unambiguously specify the STOP-at-Task-1 outcome for spike failure. This SUMMARY documents that outcome verbatim.
One process micro-deviation: Plan was re-spawned with a fresh executor mid-flight (prior executor stalled after launching the spike; user authorized "preserve work, fresh executor continues" via GSD ceremony). Re-spawn adopted the on-disk Wave 0 work as-is (verified per-plan-spec via diff inspection before adopting). No code-level deviation; just orchestrator continuity.
Total deviations: 0 auto-fixes; 1 process-level executor re-spawn (handled per user's GSD ceremony invocation). Plan logic + contract honored verbatim.
Issues Encountered
- Spike result was a FAILURE, but this is the spike contract working as designed. The whole point of Wave 0 was to empirically test the RESEARCH MEDIUM-confidence assumption BEFORE expanding scope into Wave 1 work that would have been wasted if the assumption broke. This is less an "issue" than the spike doing its job: surface the empirical NO and route to plan-fix ceremony.
- Prior executor stalled / vanished without committing. The re-spawn handoff document caught this; this fresh executor verified that the on-disk work matched the plan spec, adopted it, ran the spike, committed Task 1, and wrote this SUMMARY. Total prior agent loss: ~64 minutes of wall-clock + no commits + no work-on-disk loss (Wave 0 work was already structured per-plan-spec and was the right thing to keep).
- vitest `npm test` (full sequential suite) showed 180/183 (3 failures) during pre-SUMMARY verification. All 3 failures (`tests/background/blob-url-download.test.ts`, `tests/background/webm-remux.test.ts`, `tests/offscreen/webm-playback.test.ts`) PASS in isolation. Per 04-CONTEXT.md §"In scope" items 9-10 these are documented pre-existing flakes: "Pre-existing parallel-vitest Tier-1-build-step race (~1/5 full-suite runs)" + "2 pre-existing ffprobe/ffmpeg vitest flakes (pre-date Phase 3)". Plan 04-04 made ZERO source-code changes that could possibly affect those three test files; they exercise only pre-Phase-4 production code. The flakes are out of Plan 04-04 scope; a future Phase 4 plan owns flake stabilization.
Verification — Pre-Checkpoint Bundle Gates
Per saved memory feedback-pre-checkpoint-bundle-gates.md — these run on the production build output BEFORE any operator/empirical checkpoint or plan closure.
=== dist/assets/index-CgqXENQe.js (SW chunk) ===
new Function: 0 (Plan 04-02 polarity preserved — was 1 pre-04-02; now 0 since 04-02)
eval: 0 (Plan 04-02 baseline preserved)
Buffer.: 1 (JSZip bundled `buffer` polyfill — pre-existing per Plan 04-02 SUMMARY + deferred-items.md)
window.: 0 (DOM-globals in SW chunk gate — preserved)
document.: 0 (DOM-globals in SW chunk gate — preserved)
=== Tier-1 FORBIDDEN_HOOK_STRINGS inventory ===
tests/uat/harness.test.ts: 12 entries (10 core + 2 Plan 01-14 A23)
tests/background/no-test-hooks-in-prod-bundle.test.ts: 12 entries (lockstep with the above)
=== dist/ grep against Tier-1 list (all 12 strings) ===
__mokoshTest files-with-match: 0
setCurrentStream files-with-match: 0
setSegmentCountGetter files-with-match: 0
installFakeDisplayMedia files-with-match: 0
uninstallFakeDisplayMedia files-with-match: 0
dispatchEndedOnTrack files-with-match: 0
getSegmentCount files-with-match: 0
__mokoshOffscreenQuery files-with-match: 0
get-display-surface files-with-match: 0
get-segment-count files-with-match: 0
lastGetDisplayMediaConstraints files-with-match: 0
get-last-getDisplayMedia-constraints files-with-match: 0
All 6/6 gates GREEN unchanged from Plan 04-03 baseline. Plan 04-04 made zero production-source changes (only tests/uat/* + a one-shot script) so the gates trivially hold.
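The dist/ grep gate above amounts to a substring scan of every built file against the Tier-1 list. A minimal sketch, assuming the bundle has already been read into a filename → contents map (`filesWithMatch` and `findLeaks` are hypothetical names; the real gate lives in the test files cited above and may be implemented differently):

```typescript
// The 12 Tier-1 strings exactly as listed in the grep output above.
const FORBIDDEN_HOOK_STRINGS: readonly string[] = [
  "__mokoshTest",
  "setCurrentStream",
  "setSegmentCountGetter",
  "installFakeDisplayMedia",
  "uninstallFakeDisplayMedia",
  "dispatchEndedOnTrack",
  "getSegmentCount",
  "__mokoshOffscreenQuery",
  "get-display-surface",
  "get-segment-count",
  "lastGetDisplayMediaConstraints",
  "get-last-getDisplayMedia-constraints",
];

// Names of files whose contents contain the needle (plain substring match,
// like a fixed-string grep).
function filesWithMatch(bundle: Record<string, string>, needle: string): string[] {
  return Object.entries(bundle)
    .filter(([, content]) => content.includes(needle))
    .map(([name]) => name);
}

// Forbidden string -> offending files; an empty map means the gate is GREEN.
function findLeaks(bundle: Record<string, string>): Map<string, string[]> {
  const leaks = new Map<string, string[]>();
  for (const needle of FORBIDDEN_HOOK_STRINGS) {
    const hits = filesWithMatch(bundle, needle);
    if (hits.length > 0) leaks.set(needle, hits);
  }
  return leaks;
}
```

Because the match is a plain substring test, a leaked `uninstallFakeDisplayMedia` also trips the `installFakeDisplayMedia` entry, which mirrors how a fixed-string grep would behave.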
SKIP_LONG_UAT Env-Gate Decision
The plan called for a SKIP_LONG_UAT env-gate to be wired into tests/uat/harness.test.ts as part of Task 2 so that per-commit dev iteration could skip the 5-min A33 test. This wiring was NOT added because Task 2 is BLOCKED: with no A33 there is no need for the env-gate or for the orchestrator import/wrap/push lockstep. The env-gate becomes a Task-1 artifact of the eventual plan-fix that adds A33 against an IndexedDB-persistent buffer.
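For reference, an env-gate of this kind could look roughly like the sketch below. Nothing here exists in the repository: `shouldSkipLongUat` and `selectAssertions` are hypothetical names, and treating `"1"` or `"true"` as the enabling values is an assumption, since the plan's exact semantics for SKIP_LONG_UAT are not quoted in this SUMMARY.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: "1" or "true" (case-insensitive) enables the skip.
function shouldSkipLongUat(env: Env): boolean {
  const raw = (env["SKIP_LONG_UAT"] ?? "").trim().toLowerCase();
  return raw === "1" || raw === "true";
}

// Drop long-running assertion ids when the gate is set; otherwise run all.
function selectAssertions(
  all: readonly string[],
  longRunning: ReadonlySet<string>,
  env: Env,
): string[] {
  return shouldSkipLongUat(env)
    ? all.filter((id) => !longRunning.has(id))
    : [...all];
}
```

A caller would pass `process.env` as `env`; taking it as a parameter keeps the gate testable without mutating the real environment.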
Recommended Next Step (out of Plan 04-04 scope; routed to plan-fix ceremony)
Per the plan's <objective> section + saved memory feedback-gsd-ceremony-for-fixes.md:
Route: /gsd-plan-phase rewrite OR /gsd-debug ceremony — operator's choice. The new plan should:
- Architecture: Implement RESEARCH Q2 sub-question (b) recommendation Option C: move `segments: Blob[]` from offscreen module-scope RAM into an IndexedDB store inside the offscreen document. Blobs serialize cleanly via structured-clone (no base64 encoding tax; native IDB shape). Per-segment write ~3 MB; ~3 writes per 30s window. RESEARCH notes IDB has no extension-context lifetime gotchas at this scale. Chrome terminates an idle SW after roughly 30s, but the relevant constraint is the offscreen document's own lifecycle; the spike refuted the hypothesis that this lifecycle is independent of the SW, so IDB persistence is the canonical fix.
- Verification harness: A33 against the new persistence layer. The spike script at `tests/uat/spike-a33-sw-persistence.ts` is the canonical regression-verification gate: re-run it after the fix and it MUST exit 0 with `videoSize > 100_000`. Promote the spike methodology to a permanent harness assertion (assertA33 / driveA33 / orchestrator wiring + SKIP_LONG_UAT env-gate per the original Plan 04-04 Wave 1 spec).
- Files likely touched: src/offscreen/recorder.ts (new IDB write path in the segment-rotation lifecycle); possibly a new src/offscreen/idb-segments.ts module; tests/offscreen/* unit tests; tests/uat/* harness assertion for A33; manifest.json may need adjusting (Chrome storage quota, though IDB doesn't require explicit permission).
- Risk: the new I/O path adds failure modes (IDB quota exceeded; transaction abort; cross-context tab close during write). Plan-fix's THREAT MODEL needs to cover them.
- Cost: likely 3-5 plan tasks across 2 waves. Phase 4 plan count grows from current 7 to ~8-9.
- Status communication: ROADMAP SC #1 stays OPEN until the plan-fix's SUMMARY proves the spike script passes against the new architecture.
The plan-checker / planner owns whether to:
- (a) rewrite Plan 04-04 in-place (likely as Plan 04-04 v2 with `type: tdd` IDB-persistence work),
- (b) insert a new plan slot (e.g., Plan 04-08) for the persistence work + leave Plan 04-04's SUMMARY as the spike-findings record,
- (c) close Plan 04-04 as "spike concluded — outcome FAILED — see SUMMARY" + open a fresh Phase 4 follow-up plan slot for the IDB work.
Recommendation (this executor's read, non-binding): Option (b) or (c) — keep Plan 04-04 as the spike-findings record + open a new plan slot. The spike is a complete unit of work; mixing it with persistence implementation in a single SUMMARY would muddle the canonical decision-record. The user's preference / plan-checker discretion wins.
Self-Check
Verifying claims before declaring plan complete (per executor protocol §self_check).
Files created:
- `tests/uat/spike-a33-sw-persistence.ts`: FOUND (verified via Read tool at session start; confirmed committed at `3726eee`)
- `.planning/phases/04-harden-clean-up-optional/04-04-SUMMARY.md`: FOUND (this file, just written)
Files modified:
- `tests/uat/lib/harness-page-driver.ts`: FOUND (git diff verified pre-commit; helper landed at lines 49-80; findLatestZip exported at line 1434; committed at `3726eee`)
Commits:
- `3726eee` (feat(04-04): Wave 0 spike — stopServiceWorker helper + 5-min SW idle empirical result): FOUND in `git log --oneline -3`.
Verification gates:
- npx tsc --noEmit: exits 0 (verified pre-spike)
- HEADLESS=1 tsx tests/uat/spike-a33-sw-persistence.ts: ran to completion with explicit SPIKE RESULT + SPIKE OUTCOME lines + exit code 1 (FAILED branch — captured in /tmp/04-04-spike.log)
- npx tsc --noEmit (post-spike): exits 0 (helper + spike script both type-check cleanly; verified via the spike's tsc-clean exit before launch)
- Pre-checkpoint bundle gates: 6/6 GREEN unchanged from Plan 04-03 baseline (verified above)
- vitest baseline: 183 tests total; 3 pre-existing parallel-vitest flakes observed (out of scope per 04-CONTEXT items 9-10; pass in isolation; no regression caused by Plan 04-04 which made zero source-code changes)
- Spike acceptance criteria (Task 1):
  - `stopServiceWorker(browser, extensionId)` exists at tests/uat/lib/harness-page-driver.ts with canonical signature: MET
  - Spike script ran to completion (no Puppeteer throw): MET
  - Spike result logged with explicit `videoSize=<N> bytes` line: MET (videoSize=8505 bytes)
  - SAVE_ARCHIVE dispatch uses `chrome.runtime.sendMessage` not `dispatchSaveArchive`: MET (grep verified: 0 hits on `dispatchSaveArchive`; 1 hit on `type: 'SAVE_ARCHIVE'`)
  - Spike outcome decision recorded (>100_000 → PASSED; ≤100_000 → FAILED): MET (FAILED branch; SUMMARY documents failure mode + flag for re-planning per Task 1 acceptance criteria sentence)
- Task 2 acceptance criteria: NOT APPLICABLE — Task 2 BLOCKED by gating condition (videoSize > 100_000 NOT met).
Self-Check: PASSED
All claims verified. Plan 04-04 closes at Task 1 (Wave 0 SPIKE FAILED) per the spike-first contract; Task 2 BLOCKED; ROADMAP SC #1 remains OPEN; plan-fix ceremony route documented.
Post-Debug Amendment (2026-05-22)
The above SPIKE FAILED interpretation ("architecture broken → IndexedDB plan-fix needed") is empirically REFUTED by the follow-on /gsd-debug ceremony at .planning/debug/sw-offscreen-persistence-investigation-session-2.md (commit 4ea1bbb). Per user-authorized ceremony route, the SC#1 routing was held until disambiguation completed.
Session-2 verdict: REFUTED-architecture (canvas-captureStream issue). The current let segments: Blob[] = [] offscreen-RAM architecture (recorder.ts:91) is NOT broken. The spike's test methodology is invalid:
- Pre-kill probe: `segments.length=3` → segments accumulated correctly during the 5-min idle.
- Post-kill probe: `segments.length=3` → segments survive SW kill structurally (offscreen-RAM persistence works).
- Step C (no SW kill, just 5-min idle + SAVE_ARCHIVE): identical 8505-byte failure → Puppeteer `worker.close()` is not the cause; 5-min idle alone is what breaks the recording.
- Direct Remux logs (visible in Step C because SW respawn did not happen): `Segment ts=1..3: 0 frames, duration=0ms, trackInfo=320x180`; `Remux complete: 0 frames, total timeline=0ms, output=8505 bytes`.
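The reasoning behind the verdict can be restated as a small decision function: lost segments would point at the RAM architecture, while surviving segments with zero frames point at the capture source. This is only an illustrative sketch of that logic; `ProbeEvidence`, `Verdict`, and `diagnose` are hypothetical names and do not appear in the debug session's tooling.

```typescript
interface ProbeEvidence {
  preKillSegments: number;    // segments.length before the SW kill
  postKillSegments: number;   // segments.length after the SW kill
  framesPerSegment: number[]; // frame counts reported by the Remux logs
}

type Verdict = "segments-lost" | "frames-missing" | "healthy";

function diagnose(e: ProbeEvidence): Verdict {
  // Segments disappearing across the kill would implicate offscreen RAM.
  if (e.postKillSegments < e.preKillSegments) return "segments-lost";
  // Segments survived; any segment carrying frames means a usable recording.
  if (e.framesPerSegment.some((n) => n > 0)) return "healthy";
  // Segments survived but none carries frames: the capture source is at fault.
  return "frames-missing";
}

// Session-2 observations: 3 segments before and after, 0 frames in each.
const session2Verdict = diagnose({
  preKillSegments: 3,
  postKillSegments: 3,
  framesPerSegment: [0, 0, 0],
});
console.log(session2Verdict);
```

With the session-2 numbers the function returns `"frames-missing"`, which is the test-methodology branch rather than the architecture branch.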
Root cause: installFakeDisplayMedia() at src/test-hooks/offscreen-hooks.ts:139-264 mints a canvas.captureStream(30) from a hidden -9999px-offset 320x180 canvas. Despite the setInterval(drawFrame, 33ms) belt-and-suspenders mitigation against RAF throttling, headless-Chromium aggressively throttles MediaRecorder on invisible-canvas sources (Chrome bug 653548; chromium auto-throttled-screen-capture design doc; sendrec.eu "Why Canvas Breaks Your Screen Recorder"). The MediaRecorder emits structurally-valid WebM with valid V_VP9 track metadata (320x180) but zero VP9 frames per segment over the 5-min idle window. The Remuxer correctly emits an 8505-byte header-only WebM from 3 × 0-frame segments.
Reproducibility: 7/7 spike runs across both debug sessions converge on identical 8505 bytes (deterministic methodology failure).
Status correction (supersedes the above):
- ROADMAP SC #1 remains OPEN but for a TEST METHODOLOGY reason — NOT for an architectural reason.
- The IndexedDB persistence plan-fix recommendation is REJECTED. It would not have closed SC#1 because the spike would still produce 8505 bytes after IDB lands; segments are not the problem, frames are.
- The correct fix: replace `canvas.captureStream(30)` in `installFakeDisplayMedia()` with an `HTMLVideoElement` playing a bundled WebM (Option 2 from session-2 recommendations). Bypasses canvas throttling entirely.
Routing decision (user-authorized 2026-05-22): Insert new Plan 04-08 — video-file-backed MediaStream methodology reframe (replaces canvas.captureStream + revives the A33 harness assertion under a valid methodology). Plan 04-08 lands between Plans 04-06 and 04-07 (Wave 5.5).
Persisting artifacts from this plan remain valid:
- `stopServiceWorker(browser, extensionId)` at tests/uat/lib/harness-page-driver.ts: still required for the A33-equivalent verification gate that Plan 04-08 lands.
- `tests/uat/spike-a33-sw-persistence.ts`: kept as forensic evidence + future regression test (Plan 04-08 may reuse or supersede).
- Session-2 commit `9ac5808`: race-tolerant offscreen target attach fix at tests/uat/lib/launch.ts:225 (background_page → page, with full settle-and-retry). Permanent test-infra improvement; lives on past this plan.
Phase: 04-harden-clean-up-optional
Plan: 04 (of 7, plus inserted 04-08)
Completed: 2026-05-21
Amended: 2026-05-22 (post-debug session-2 verdict REFUTED-architecture; SC#1 reframed to test-methodology issue; Plan 04-08 inserted)
Outcome: SPIKE FAILED, but the root cause is test methodology (canvas throttling), not architecture; Plan 04-08 lands video-file MediaStream + A33 revival