diff --git a/.planning/debug/d13-multi-ebml-concat-unplayable.md b/.planning/debug/d13-multi-ebml-concat-unplayable.md new file mode 100644 index 0000000..8cc2e98 --- /dev/null +++ b/.planning/debug/d13-multi-ebml-concat-unplayable.md @@ -0,0 +1,275 @@ +--- +slug: d13-multi-ebml-concat-unplayable +status: investigating +trigger: | + Phase 1 UAT Test 3 re-attempt post-Option-C produced a structurally-correct + 3-segment WebM (SW logs confirm: "Merging 3 segments / Adding segment 0 + size: 672159 / 1 size: 507559 / 2 size: 496181 / Final video blob size: + 1675899 bytes, total segments merged: 3") but the resulting file plays + ONLY ~9 s in Chrome AND in mpv. Cross-checking the canonical fixture + committed at Phase 1 closure on 2026-05-15 (`tests/fixtures/last_30sec.webm`, + 1633459 bytes, 3 segments per architecture) reveals it ALSO plays only ~9 s + in mpv. Operator confirmed both via mpv playback test. + + This means D-13's "concat of self-contained WebM segments → playable 30 s + WebM" architecture is fundamentally broken. The 2026-05-15 Phase 1 + closure was certified on an insufficient "operator-confirmed clean + Chrome playback" check that did not actually verify 30 s duration — + both the closure fixture and today's UAT-produced fixture exhibit + the same first-segment-only-plays behavior. + + Phase 1's primary deliverable (REQ-video-ring-buffer) does not actually + produce a playable 30 s WebM. SPEC §10 #7 (`last_30sec.webm plays back + in a browser`) is NOT satisfied by the current architecture even + though it was marked Complete in REQUIREMENTS.md/ROADMAP.md/STATE.md + on 2026-05-15. 
+created: 2026-05-16T16:56:41Z +updated: 2026-05-16T16:56:41Z +phase: 01-stabilize-video-pipeline +related_uat: .planning/phases/01-stabilize-video-pipeline/01-UAT.md +related_review_fix: .planning/phases/01-stabilize-video-pipeline/01-REVIEW-FIX.md +prior_resolved_sessions: + - .planning/debug/resolved/d12-blob-port-transfer-fails.md + - .planning/debug/resolved/webm-playback-freeze.md + - .planning/debug/resolved/empty-archive-port-race.md +architectural_impact: | + This is NOT a code-level bug; it's a wrong-architecture finding. + D-09..D-11 (single-continuous + age-trim + first-chunk-pin) was retired + in favor of D-13 (restart-segments + concat) on 2026-05-15 because + D-09..D-11 caused orphan-P-frame freezes (debug session + webm-playback-freeze). D-13 was supposed to fix that by making each + segment self-contained with its own EBML header + seed keyframe. But + D-13 only solved the freeze symptom — it did NOT solve the underlying + problem of producing a single playable 30 s WebM. Players see the + first EBML header, read its duration metadata (~9.94 s), and stop + there. Most Matroska/WebM players (ffmpeg/mpv/probably Chrome) do not + implement the multi-segment Matroska feature; the spec permits it but + doesn't mandate it. + + The fix requires real WebM REMUX: extract the VP9 frames + cluster + timestamps from each of the 3 segments and rewrite them into a single + EBML-headered WebM with adjusted timestamps. This is significantly + more work than D-13 (~500-1000 LOC for a JS remuxer) but architecturally + necessary. +--- + +# Debug: D-13 multi-EBML-concat produces unplayable WebM (Phase 1 architecture failure) + +## Symptoms + +**Expected behavior:** +When the operator clicks save, the produced `video/last_30sec.webm` plays +for ~30 s in a browser (SPEC §10 #7) covering the most recent 30 s of +captured screen. 
+ +**Actual behavior:** +- WebM file is structurally valid (3 segments concatenated per D-13 design) +- All 3 segments arrive at SW per logs: + [SW:Main] Video buffer: 3 segments + [SW:Main] Merging 3 segments + [SW:Main] Adding segment 0, size: 672159 bytes + [SW:Main] Adding segment 1, size: 507559 bytes + [SW:Main] Adding segment 2, size: 496181 bytes + [SW:Main] Final video blob size: 1675899 bytes, total segments merged: 3 +- Resulting file (1675899 bytes) plays only ~9 s in Chrome +- Same file plays only ~9 s in mpv +- **The canonical Phase 1 closure fixture from 2026-05-15 + (`tests/fixtures/last_30sec.webm`, 1633459 bytes) ALSO plays only + ~9 s in mpv** — operator verified by drag-drop test + +**Error messages:** +None at the runtime layer. Recording is healthy, SW merge is healthy, +download is healthy. The bug is in the PRODUCED FILE'S COMPATIBILITY +with downstream players. + +ffprobe reports `duration=9.94 s` on both files — the first EBML +header's reported duration. ffmpeg dry-run produces 299 muxer warnings +(non-monotonic DTS at segment join boundaries) for both files — that's +the segment boundary noise from concatenation, not playback failure. + +**Timeline:** +- Bug introduced: commit `6a1a034` (Plan 01-07-debug-a3, 2026-05-15 + "feat(fix-a3): activate D-13 restart-segments in src/offscreen/recorder.ts" + + commit `5530292` "feat(fix-a3): retire ring-buffer first-chunk pin + tests, add segment-rotation contract") +- Operator-validated incorrectly: commit `cd61cbc` (2026-05-15 + "test(01-07): commit regenerated last_30sec.webm fixture against D-13 + recorder") + commit `7df72aa` (2026-05-15 "feat(01-07): close Phase 1 — + REQ-video-ring-buffer complete, SPEC §10 #7 satisfied"). The "operator + confirmed clean Chrome playback" assessment was insufficient — it + checked that the file played but did not measure the total playback + duration. 
+- Discovered: 2026-05-16 UAT Test 3 re-attempt after Option C debug + session (`.planning/debug/resolved/empty-archive-port-race.md`) + fixed the silent-empty-video archive bug. With the empty-video + symptom retired, the underlying broken-playback issue surfaced + cleanly. + +**Reproduction:** +1. `npm run build` +2. `KEEP_PROFILE=0 ./smoke.sh` +3. Load extension, click icon, wait 5+ minutes, click save +4. Extract `video/last_30sec.webm` from the produced zip +5. Open in mpv or Chrome — playback stops at ~9 s instead of ~30 s +6. Verify the file structurally contains 3 segments via: + `ffmpeg -v warning -i FILE -f null -` (produces ~299 muxer warnings + = 3 segment join boundaries) +7. OR verify against committed fixture: same behavior + (`/tmp/mokosh-test-committed-3seg.webm` and + `/tmp/mokosh-test-uat-3seg.webm` both play 9 s in mpv per operator) + +## Current Focus + +hypothesis: | + **H4 confirmed by operator empirical test**: D-13's "concat of self- + contained WebM segments → produce playable 30 s WebM" architecture + does not work in practice because most Matroska/WebM players do not + implement the multi-segment Matroska feature. The Matroska spec + permits multiple segments in one file but most decoders read only + the first segment's EBML header and stop there. ffmpeg's behavior + (which mpv inherits) is to honor the first EBML's duration metadata. + Chrome's MSE implementation appears to do the same (per UAT operator + observation). + + **H3 confirmed by operator empirical test**: The 2026-05-15 Phase 1 + closure's "operator-confirmed clean Chrome playback" check was + insufficient. The check did not measure total playback duration. + Both the canonical committed fixture and today's UAT-produced fixture + exhibit the same first-segment-only-plays behavior; the bug has + existed since D-13 was activated on 2026-05-15. + + **Fix direction**: replace the file-concat merge with a real WebM + REMUX. 
Parse each segment's EBML structure, extract VP9 frames + + cluster boundaries + keyframe positions, write a SINGLE-EBML-header + WebM whose clusters carry adjusted (monotonic) timestamps. This + produces a file that any player can read end-to-end as one continuous + ~30 s stream. + + **Candidate implementations**: + - `webm-muxer` npm package (Vanilla. ~10 KB. Browser + Node support. + Single-segment output. Active maintenance.) + - `ts-ebml` (EBML parser + writer. Allows manual control over + structure. ~50 KB.) + - Custom EBML parser (full control, ~500-1000 LOC, no dep weight) + - **Alternative path: MediaRecorder timeslice with cluster-aware trim**: + revisit retired D-09..D-11 architecture but trim ONLY on keyframe + boundaries (preserving every cluster from the most recent keyframe + onwards). This avoids the A3 orphan-P-frame freeze by guaranteeing + every kept cluster's references are present. ~200-400 LOC. The + risk: requires understanding EBML/Matroska cluster structure to + trim correctly. + - **Alternative path: WebCodecs API** (VideoEncoder + Muxer.js or + similar): full control over container framing. Significant rewrite + (~1000-2000 LOC). Most flexible but heaviest. + + The remux approach (webm-muxer or equivalent) is likely the right + trade-off: small, well-tested library, preserves D-13's segment + lifecycle benefits (no orphan-P-frame freeze, ~10s rotation gap + acceptable), but produces a single-EBML output that all players + read correctly. + +test: | + RED test: introduce a playable-duration assertion to + tests/offscreen/webm-playback.test.ts. Use ffprobe -count_frames + -show_streams to count VIDEO FRAMES (not just metadata duration), + then divide by reported frame rate to compute actual playable + content duration. Assert actual_duration > 25_000 ms for the + generated/committed fixture. This test should FAIL against the + current D-13 architecture and PASS after the remux fix lands. 
+ + Alternative RED test: ffprobe -read_intervals '0%+#90000' + -show_packets -i FILE (start at 0, read up to 90000 packets, i.e. + effectively all of them). Count packets read. Should be ~600 packets for 30s @ ~20fps, not ~200 for 9s. +expecting: | + RED test fails on current code (both fixture and freshly-recorded + output should fail the duration assertion). Debugger then implements + the chosen fix path (webm-muxer remux most likely) and re-asserts + GREEN. +next_action: gather initial evidence from EBML parsing of both fixtures + research candidate JS remux libraries +reasoning_checkpoint: "" +tdd_checkpoint: "" + +## Constraints + +- TDD mode is ON (workflow.tdd_mode: true). RED test MUST land before + GREEN fix. +- Auto-loaded memories: `feedback-gsd-ceremony-for-fixes.md` (no + hot-edits; route through proper GSD ceremony) and + `feedback-no-unilateral-scope-reduction.md` (no scope narrowing). +- This fix may RETIRE the D-13 decision entirely OR keep D-13's + rotation lifecycle but replace the concat-merge with real remux. + CONTEXT.md will need amendment regardless. +- This fix may invalidate the existing committed fixture + `tests/fixtures/last_30sec.webm`; once the architecture changes, + a fresh fixture will be needed. +- The Phase 1 closure markers (REQUIREMENTS.md, ROADMAP.md, STATE.md) + marked REQ-video-ring-buffer complete on 2026-05-15; with this + finding they need to be REVERTED to in-progress until the fix + lands. That's a DOCUMENTATION change the orchestrator handles, NOT + a debugger action. +- Phase 1 architecture amendment is large enough that this debug + session may need to escalate to a fresh Plan 01-08 (e.g. "WebM + remux for playable ring-buffer") rather than landing as a + hotfix in the debug session itself. The debugger should + CHECKPOINT to the orchestrator after root-cause confirmation + + fix-strategy options, before executing.
+ +## Files of Interest (preliminary) + +- src/offscreen/recorder.ts: + - 80-110: getSegments + segment array management + - 250-360: D-13 restart-segments rotation lifecycle + - 522-650: encodeAndSendBuffer (sends segments to SW) +- src/background/index.ts: + - 129-150: decodeBufferSegments (base64 -> Blob) + - 395-420: mergeVideoSegments (the concat point; likely replaced by remux) + - 444-460: createArchive (calls mergeVideoSegments) +- tests/offscreen/webm-playback.test.ts (existing; uses ffmpeg dry-run + to check decoder errors but does NOT check total playable duration) +- tests/fixtures/last_30sec.webm (canonical fixture; needs regen post-fix) +- .planning/phases/01-stabilize-video-pipeline/01-CONTEXT.md + (D-13 decision; needs amendment or retirement) +- .planning/REQUIREMENTS.md + (REQ-video-ring-buffer; needs status flip from [x] back to [ ]) + +## Evidence + +(populated by debugger; initial evidence below) + +### Operator empirical observations (2026-05-16) +- `/tmp/mokosh-test-uat-3seg.webm` (today's UAT output, 1.68 MB, 3 segments): + played ~9 s in mpv +- `/tmp/mokosh-test-committed-3seg.webm` (2026-05-15 closure fixture, 1.63 MB, + 3 segments): played ~9 s in mpv +- Earlier today operator confirmed Chrome playback of the UAT output was + also ~9 s, not ~30 s + +### SW log evidence (today's UAT run, 16:48:52) +- 3 segments arrived at SW +- Merged correctly: 672159 + 507559 + 496181 = 1675899 bytes (matches + archive WebM size) +- No errors anywhere in delivery path + +### ffmpeg dry-run signature +- Both files produce ~299 warning lines (segment join boundary noise) +- Both files report `duration=9.94 s` via ffprobe -show_entries format=duration +- Decoder errors: zero (segments are individually valid) + +## Eliminated + +(populated by debugger as hypotheses are ruled out) + +- H1 (Chrome version regression): unlikely given mpv exhibits same behavior + and mpv uses ffmpeg internally, not Chrome +- H2 (today's encoding differs subtly from
2026-05-15): ruled out — committed + fixture also plays ~9 s in mpv, so it's been broken since D-13 activation +- (H5: defective committed fixture in storage): ruled out — file size + matches expected (1.63 MB matches what was committed on 2026-05-15; + not bit-rot) + +## Resolution + +root_cause: "" +fix: "" +verification: "" +files_changed: [] diff --git a/.planning/phases/01-stabilize-video-pipeline/01-UAT.md b/.planning/phases/01-stabilize-video-pipeline/01-UAT.md index 645fb56..162677e 100644 --- a/.planning/phases/01-stabilize-video-pipeline/01-UAT.md +++ b/.planning/phases/01-stabilize-video-pipeline/01-UAT.md @@ -1,5 +1,5 @@ --- -status: partial +status: testing phase: 01-stabilize-video-pipeline source: - 01-01-SUMMARY.md @@ -10,16 +10,34 @@ source: - 01-06-SUMMARY.md - 01-07-SUMMARY.md verifier_residue: 01-VERIFICATION.md (status: human_needed, OPR-1/2/3) +debug_session_landed: .planning/debug/resolved/empty-archive-port-race.md (Option C — 8 commits 674c415..f0871c0) started: 2026-05-16T11:14:00Z -updated: 2026-05-16T11:58:00Z -paused_for: /gsd-debug investigation of Test 3 port-reconnect uncaught-error finding +updated: 2026-05-16T16:50:00Z +resumed_for: empirical re-verification of Option C fix (silent empty-video archive + port-reconnect race) before closing Phase 1 --- ## Current Test -[testing paused — 1 issue + 1 blocked outstanding; user routed to /gsd-debug - immediately after Test 3 surfaced 3× Uncaught Error post-reconnect. Resume - Test 3 save+play AND Test 4 SW Force-Stop after debug session lands a fix.] +number: 3 +name: "OPR-2: Continuous Recording Across Tab Switches (re-attempt post-Option-C)" +expected: | + Now that Option C landed (port lifecycle refactor + request-id'd BUFFER + routing + EmptyVideoBufferError surfaced to popup), Test 3 should now + produce a zip with a valid `video/last_30sec.webm` AND show NO + Uncaught Errors after the 290s mark (that timer is retired in favor + of the port-health-probe). 
Re-run smoke.sh with the freshly built + dist/ and confirm: + (a) Recording runs uninterrupted across tab switches (open 2-3 new + tabs, switch between them for ~30 s). + (b) NO `Uncaught Error: Attempting to use a disconnected port object` + appears in offscreen console — even past the 290 s mark. + (c) The save flow completes in reasonable time (< 5 s, NOT 600 s as + before). + (d) The produced zip contains `video/last_30sec.webm` of expected + size (~1.5 MB for 30 s VP9 1024×768-ish). + (e) The WebM plays continuously in Chrome (no freeze, no missing + seconds across tab-switch moments). +awaiting: user response (smoke.sh output + zip contents) ## Tests @@ -192,10 +210,11 @@ reason: | total: 4 passed: 2 -issues: 1 +issues: 2 (Test 3 confirmed BLOCKER × 2: empty-archive fixed by Option C → new finding: D-13 multi-EBML-concat plays only ~9 s in mpv AND Chrome) pending: 0 skipped: 0 -blocked: 1 +blocked: 2 (Test 3 retest after D-13 architectural fix; Test 4 SW Force-Stop deferred behind the same fix) +paused_for: /gsd-debug d13-multi-ebml-concat-unplayable — Phase 1 architectural finding ## Gaps