Milestone v1 (v2.0.0): Mokosh — Session Capture #1

Merged
strategy155 merged 297 commits from gsd/phase-04-harden-clean-up-optional into main 2026-05-31 15:34:17 +00:00
2 changed files with 302 additions and 8 deletions
Showing only changes of commit f1026954fc

@@ -0,0 +1,275 @@
---
slug: d13-multi-ebml-concat-unplayable
status: investigating
trigger: |
Phase 1 UAT Test 3 re-attempt post-Option-C produced a structurally-correct
3-segment WebM (SW logs confirm: "Merging 3 segments / Adding segment 0
size: 672159 / 1 size: 507559 / 2 size: 496181 / Final video blob size:
1675899 bytes, total segments merged: 3") but the resulting file plays
ONLY ~9 s in Chrome AND in mpv. Cross-checking the canonical fixture
committed at Phase 1 closure on 2026-05-15 (`tests/fixtures/last_30sec.webm`,
1633459 bytes, 3 segments per architecture) reveals it ALSO plays only ~9 s
in mpv. Operator confirmed both via mpv playback test.
This means D-13's "concat of self-contained WebM segments → playable 30 s
WebM" architecture is fundamentally broken. The 2026-05-15 Phase 1
closure was certified on an insufficient "operator-confirmed clean
Chrome playback" check that did not actually verify 30 s duration —
both the closure fixture and today's UAT-produced fixture exhibit
the same first-segment-only-plays behavior.
Phase 1's primary deliverable (REQ-video-ring-buffer) does not actually
produce a playable 30 s WebM. SPEC §10 #7 (`last_30sec.webm plays back
in a browser`) is NOT satisfied by the current architecture even
though it was marked Complete in REQUIREMENTS.md/ROADMAP.md/STATE.md
on 2026-05-15.
created: 2026-05-16T16:56:41Z
updated: 2026-05-16T16:56:41Z
phase: 01-stabilize-video-pipeline
related_uat: .planning/phases/01-stabilize-video-pipeline/01-UAT.md
related_review_fix: .planning/phases/01-stabilize-video-pipeline/01-REVIEW-FIX.md
prior_resolved_sessions:
- .planning/debug/resolved/d12-blob-port-transfer-fails.md
- .planning/debug/resolved/webm-playback-freeze.md
- .planning/debug/resolved/empty-archive-port-race.md
architectural_impact: |
This is NOT a code-level bug; it's a wrong-architecture finding.
D-09..D-11 (single-continuous + age-trim + first-chunk-pin) was retired
in favor of D-13 (restart-segments + concat) on 2026-05-15 because
D-09..D-11 caused orphan-P-frame freezes (debug session
webm-playback-freeze). D-13 was supposed to fix that by making each
segment self-contained with its own EBML header + seed keyframe. But
D-13 only solved the freeze symptom — it did NOT solve the underlying
problem of producing a single playable 30 s WebM. Players see the
first EBML header, read its duration metadata (~9.94 s), and stop
there. Most Matroska/WebM players (ffmpeg/mpv/probably Chrome) do not
implement the multi-segment Matroska feature; the spec permits it but
doesn't mandate it.
The fix requires real WebM REMUX: extract the VP9 frames + cluster
timestamps from each of the 3 segments and rewrite them into a single
EBML-headered WebM with adjusted timestamps. This is significantly
more work than D-13 (~500-1000 LOC for a JS remuxer) but architecturally
necessary.
---
# Debug: D-13 multi-EBML-concat produces unplayable WebM (Phase 1 architecture failure)
## Symptoms
**Expected behavior:**
When the operator clicks save, the produced `video/last_30sec.webm` plays
for ~30 s in a browser (SPEC §10 #7) covering the most recent 30 s of
captured screen.
**Actual behavior:**
- WebM file is structurally valid (3 segments concatenated per D-13 design)
- All 3 segments arrive at SW per logs:
[SW:Main] Video buffer: 3 segments
[SW:Main] Merging 3 segments
[SW:Main] Adding segment 0, size: 672159 bytes
[SW:Main] Adding segment 1, size: 507559 bytes
[SW:Main] Adding segment 2, size: 496181 bytes
[SW:Main] Final video blob size: 1675899 bytes, total segments merged: 3
- Resulting file (1675899 bytes) plays only ~9 s in Chrome
- Same file plays only ~9 s in mpv
- **The canonical Phase 1 closure fixture from 2026-05-15
(`tests/fixtures/last_30sec.webm`, 1633459 bytes) ALSO plays only
~9 s in mpv** — operator verified by drag-drop test
**Error messages:**
None at the runtime layer. Recording is healthy, SW merge is healthy,
download is healthy. The bug is in the PRODUCED FILE'S COMPATIBILITY
with downstream players.
ffprobe reports `duration=9.94 s` on both files — the first EBML
header's reported duration. ffmpeg dry-run produces 299 muxer warnings
(non-monotonic DTS at segment join boundaries) for both files — that's
the segment boundary noise from concatenation, not playback failure.
**Timeline:**
- Bug introduced: commit `6a1a034` (Plan 01-07-debug-a3, 2026-05-15
"feat(fix-a3): activate D-13 restart-segments in src/offscreen/recorder.ts"
+ commit `5530292` "feat(fix-a3): retire ring-buffer first-chunk pin
tests, add segment-rotation contract")
- Operator-validated incorrectly: commit `cd61cbc` (2026-05-15
"test(01-07): commit regenerated last_30sec.webm fixture against D-13
recorder") + commit `7df72aa` (2026-05-15 "feat(01-07): close Phase 1 —
REQ-video-ring-buffer complete, SPEC §10 #7 satisfied"). The "operator
confirmed clean Chrome playback" assessment was insufficient — it
checked that the file played but did not measure the total playback
duration.
- Discovered: 2026-05-16 UAT Test 3 re-attempt after Option C debug
session (`.planning/debug/resolved/empty-archive-port-race.md`)
fixed the silent-empty-video archive bug. With the empty-video
symptom retired, the underlying broken-playback issue surfaced
cleanly.
**Reproduction:**
1. `npm run build`
2. `KEEP_PROFILE=0 ./smoke.sh`
3. Load extension, click icon, wait 5+ minutes, click save
4. Extract `video/last_30sec.webm` from the produced zip
5. Open in mpv or Chrome — playback stops at ~9 s instead of ~30 s
6. Verify the file structurally contains 3 segments via:
`ffmpeg -v warning -i FILE -f null -` (produces ~299 muxer warnings:
non-monotonic DTS noise at the joins between the 3 segments)
7. OR verify against committed fixture: same behavior
(`/tmp/mokosh-test-committed-3seg.webm` and
`/tmp/mokosh-test-uat-3seg.webm` both play 9 s in mpv per operator)
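
Step 6 can also be cross-checked without ffmpeg. The following is a minimal sketch, assuming each D-13 segment begins with the standard EBML header ID (bytes `1A 45 DF A3`). `findEbmlHeaderOffsets` is a hypothetical helper, not part of the codebase, and the scan is only a heuristic because the same four bytes can occur by chance inside frame data.

```typescript
// EBML header element ID that opens every WebM/Matroska document.
const EBML_MAGIC: number[] = [0x1a, 0x45, 0xdf, 0xa3];

/**
 * Returns every byte offset at which the EBML magic appears.
 * Heuristic only: treat the result as a hint, not as a real EBML parse.
 * (Hypothetical helper for illustration.)
 */
function findEbmlHeaderOffsets(bytes: Uint8Array): number[] {
  const offsets: number[] = [];
  for (let i = 0; i + EBML_MAGIC.length <= bytes.length; i++) {
    let match = true;
    for (let j = 0; j < EBML_MAGIC.length; j++) {
      if (bytes[i + j] !== EBML_MAGIC[j]) {
        match = false;
        break;
      }
    }
    if (match) offsets.push(i);
  }
  return offsets;
}
```

For a file produced by the concat merge, one offset per concatenated segment would be expected, with the first at byte 0.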
## Current Focus
hypothesis: |
**H4 confirmed by operator empirical test**: D-13's "concat of self-
contained WebM segments → produce playable 30 s WebM" architecture
does not work in practice because most Matroska/WebM players do not
implement the multi-segment Matroska feature. The Matroska spec
permits multiple segments in one file but most decoders read only
the first segment's EBML header and stop there. ffmpeg's behavior
(which mpv inherits) is to honor the first EBML's duration metadata.
Chrome's MSE implementation appears to do the same (per UAT operator
observation).
**H3 confirmed by operator empirical test**: The 2026-05-15 Phase 1
closure's "operator-confirmed clean Chrome playback" check was
insufficient. The check did not measure total playback duration.
Both the canonical committed fixture and today's UAT-produced fixture
exhibit the same first-segment-only-plays behavior; the bug has
existed since D-13 was activated on 2026-05-15.
**Fix direction**: replace the file-concat merge with a real WebM
REMUX. Parse each segment's EBML structure, extract VP9 frames +
cluster boundaries + keyframe positions, write a SINGLE-EBML-header
WebM whose clusters carry adjusted (monotonic) timestamps. This
produces a file that any player can read end-to-end as one continuous
~30 s stream.
**Candidate implementations**:
- `webm-muxer` npm package (Vanilla. ~10 KB. Browser + Node support.
Single-segment output. Active maintenance.)
- `ts-ebml` (EBML parser + writer. Allows manual control over
structure. ~50 KB.)
- Custom EBML parser (full control, ~500-1000 LOC, no dep weight)
- **Alternative path: MediaRecorder timeslice with cluster-aware trim**:
revisit retired D-09..D-11 architecture but trim ONLY on keyframe
boundaries (preserving every cluster from the most recent keyframe
onwards). This avoids the A3 orphan-P-frame freeze by guaranteeing
every kept cluster's references are present. ~200-400 LOC. The
risk: requires understanding EBML/Matroska cluster structure to
trim correctly.
- **Alternative path: WebCodecs API** (VideoEncoder + Muxer.js or
similar): full control over container framing. Significant rewrite
(~1000-2000 LOC). Most flexible but heaviest.
The remux approach (webm-muxer or equivalent) is likely the right
trade-off: small, well-tested library, preserves D-13's segment
lifecycle benefits (no orphan-P-frame freeze, ~10s rotation gap
acceptable), but produces a single-EBML output that all players
read correctly.
test: |
RED test: introduce a playable-duration assertion to
tests/offscreen/webm-playback.test.ts. Use ffprobe -count_frames
-show_streams to count VIDEO FRAMES (not just metadata duration),
then divide by reported frame rate to compute actual playable
content duration. Assert actual_duration > 25_000 ms for the
generated/committed fixture. This test should FAIL against the
current D-13 architecture and PASS after the remux fix lands.
Alternative RED test: ffprobe -read_intervals '0%+#90000' -show_packets FILE
(start at 0 and read up to 90000 packets, i.e. effectively all of them).
Count packets read. Should be ~600 packets for 30s @ ~20fps, not ~200 for 9s.
expecting: |
RED test fails on current code (both fixture and freshly-recorded
output should fail the duration assertion). Debugger then implements
the chosen fix path (webm-muxer remux most likely) and re-asserts
GREEN.
next_action: gather initial evidence from EBML parsing of both fixtures + research candidate JS remux libraries
reasoning_checkpoint: ""
tdd_checkpoint: ""
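
The duration check described under `test:` could be computed along these lines. This is a minimal sketch, assuming the JSON comes from `ffprobe -v error -count_frames -show_streams -of json FILE`, where `nb_read_frames` and `avg_frame_rate` are per-stream fields in ffprobe's output. `playableDurationMs` and `MIN_PLAYABLE_MS` are hypothetical names. Whether ffprobe's frame count matches what players actually render for a multi-EBML file is itself an assumption that the RED run would need to confirm.

```typescript
// Subset of one ffprobe stream entry; all values arrive as strings.
interface ProbeStream {
  codec_type?: string;
  nb_read_frames?: string; // present only when -count_frames is passed
  avg_frame_rate?: string; // rational such as "20/1"
}

// Threshold taken from the test plan: actual_duration > 25_000 ms.
const MIN_PLAYABLE_MS = 25_000;

// Parses an ffprobe rational ("num/den" or "num") into a number.
function parseRational(value: string): number {
  const [num, den = "1"] = value.split("/");
  const n = Number(num);
  const d = Number(den);
  if (!Number.isFinite(n) || !Number.isFinite(d) || d === 0) {
    throw new Error(`invalid rational: ${value}`);
  }
  return n / d;
}

// Frame count divided by frame rate, in milliseconds (hypothetical helper).
function playableDurationMs(probeJson: string): number {
  const parsed = JSON.parse(probeJson) as { streams?: ProbeStream[] };
  const video = (parsed.streams ?? []).find((s) => s.codec_type === "video");
  if (!video || video.nb_read_frames === undefined || video.avg_frame_rate === undefined) {
    throw new Error("no video stream with nb_read_frames and avg_frame_rate");
  }
  const frames = Number(video.nb_read_frames);
  const fps = parseRational(video.avg_frame_rate);
  if (!Number.isFinite(frames) || fps <= 0) {
    throw new Error("unusable frame count or frame rate");
  }
  return (frames / fps) * 1000;
}
```

The test would then assert `playableDurationMs(output) > MIN_PLAYABLE_MS` for the fixture.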
## Constraints
- TDD mode is ON (workflow.tdd_mode: true). RED test MUST land before
GREEN fix.
- Auto-loaded memories: `feedback-gsd-ceremony-for-fixes.md` (no
hot-edits; route through proper GSD ceremony) and
`feedback-no-unilateral-scope-reduction.md` (no scope narrowing).
- This fix may RETIRE the D-13 decision entirely OR keep D-13's
rotation lifecycle but replace the concat-merge with real remux.
CONTEXT.md will need amendment regardless.
- This fix may invalidate the existing committed fixture
`tests/fixtures/last_30sec.webm` — once the architecture changes,
a fresh fixture will be needed.
- The Phase 1 closure markers (REQUIREMENTS.md, ROADMAP.md, STATE.md)
marked REQ-video-ring-buffer complete on 2026-05-15; with this
finding they need to be REVERTED to in-progress until the fix
lands. That's a DOCUMENTATION change the orchestrator handles, NOT
a debugger action.
- Phase 1 architecture amendment is large enough that this debug
session may need to escalate to a fresh Plan 01-08 (e.g. "WebM
remux for playable ring-buffer") rather than landing as a
hotfix in the debug session itself. The debugger should
CHECKPOINT to the orchestrator after root-cause confirmation +
fix-strategy options, before executing.
## Files of Interest (preliminary)
- src/offscreen/recorder.ts:
  - 80-110: getSegments + segment array management
  - 250-360: D-13 restart-segments rotation lifecycle
  - 522-650: encodeAndSendBuffer (sends segments to SW)
- src/background/index.ts:
  - 129-150: decodeBufferSegments (base64 -> Blob)
  - 395-420: mergeVideoSegments (the concat point — likely replaced by remux)
  - 444-460: createArchive (calls mergeVideoSegments)
- tests/offscreen/webm-playback.test.ts (existing — uses ffmpeg dry-run
to check decoder errors but does NOT check total playable duration)
- tests/fixtures/last_30sec.webm (canonical fixture; needs regen post-fix)
- .planning/phases/01-stabilize-video-pipeline/01-CONTEXT.md
(D-13 decision; needs amendment or retirement)
- .planning/REQUIREMENTS.md
(REQ-video-ring-buffer; needs status flip from [x] back to [ ])
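
If the concat point is replaced by a remux, the core bookkeeping is shifting each segment's frame timestamps so the merged stream is monotonic. The following is a sketch of that one step only, assuming frame timestamps have already been extracted per segment (the EBML parsing and writing are not shown). `SegmentFrames`, `rebaseTimestamps` and `isStrictlyIncreasing` are hypothetical names, not existing code.

```typescript
// Timestamps extracted from one self-contained segment (hypothetical shape).
interface SegmentFrames {
  /** Frame timestamps in ms, relative to the start of their own segment. */
  timestampsMs: number[];
  /** Duration of the segment in ms. */
  durationMs: number;
}

/**
 * Shifts every segment's timestamps by the summed duration of the segments
 * before it, so the output reads as one continuous timeline.
 */
function rebaseTimestamps(segments: SegmentFrames[]): number[] {
  const rebased: number[] = [];
  let offsetMs = 0;
  for (const segment of segments) {
    for (const t of segment.timestampsMs) {
      rebased.push(offsetMs + t);
    }
    offsetMs += segment.durationMs;
  }
  return rebased;
}

/** True when each value is greater than the one before it. */
function isStrictlyIncreasing(values: number[]): boolean {
  let previous = -Infinity;
  for (const v of values) {
    if (v <= previous) return false;
    previous = v;
  }
  return true;
}
```

Without the offset, each segment's timestamps restart near zero, which is consistent with the non-monotonic DTS warnings seen at the join boundaries.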
## Evidence
(populated by debugger; initial evidence below)
### Operator empirical observations (2026-05-16)
- `/tmp/mokosh-test-uat-3seg.webm` (today's UAT output, 1.68 MB, 3 segments):
played ~9 s in mpv
- `/tmp/mokosh-test-committed-3seg.webm` (2026-05-15 closure fixture, 1.63 MB,
3 segments): played ~9 s in mpv
- Earlier today operator confirmed Chrome playback of the UAT output was
also ~9 s, not ~30 s
### SW log evidence (today's UAT run, 16:48:52)
- 3 segments arrived at SW
- Merged correctly: 672159 + 507559 + 496181 = 1675899 bytes (matches
archive WebM size)
- No errors anywhere in delivery path
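
The size arithmetic above is what a plain byte concatenation yields. The following is a minimal sketch of such a merge, assuming segments are available as `Uint8Array`s; `concatSegments` is a hypothetical stand-in, not the actual `mergeVideoSegments` implementation.

```typescript
/**
 * Concatenates segments back to back without touching their contents.
 * Each segment keeps its own EBML header and its own duration metadata.
 * (Hypothetical helper for illustration.)
 */
function concatSegments(segments: Uint8Array[]): Uint8Array {
  let totalBytes = 0;
  for (const segment of segments) {
    totalBytes += segment.length;
  }
  const merged = new Uint8Array(totalBytes);
  let offset = 0;
  for (const segment of segments) {
    merged.set(segment, offset);
    offset += segment.length;
  }
  return merged;
}
```

A matching total size therefore confirms delivery only; it says nothing about how a player interprets the merged bytes.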
### ffmpeg dry-run signature
- Both files produce ~299 warning lines (segment join boundary noise)
- Both files report `duration=9.94 s` via ffprobe -show_entries format=duration
- Decoder errors: zero (segments are individually valid)
## Eliminated
(populated by debugger as hypotheses are ruled out)
- H1 (Chrome version regression): unlikely given mpv exhibits same behavior
and mpv uses ffmpeg internally — not Chrome
- H2 (today's encoding differs subtly from 2026-05-15): ruled out — committed
fixture also plays ~9 s in mpv, so it's been broken since D-13 activation
- H5 (defective committed fixture in storage): ruled out; the file size
matches the expected one (1.63 MB, as committed on 2026-05-15), so this is
not bit-rot
## Resolution
root_cause: ""
fix: ""
verification: ""
files_changed: []

@@ -1,5 +1,5 @@
---
status: partial
status: testing
phase: 01-stabilize-video-pipeline
source:
- 01-01-SUMMARY.md
@@ -10,16 +10,34 @@ source:
- 01-06-SUMMARY.md
- 01-07-SUMMARY.md
verifier_residue: 01-VERIFICATION.md (status: human_needed, OPR-1/2/3)
debug_session_landed: .planning/debug/resolved/empty-archive-port-race.md (Option C — 8 commits 674c415..f0871c0)
started: 2026-05-16T11:14:00Z
updated: 2026-05-16T11:58:00Z
paused_for: /gsd-debug investigation of Test 3 port-reconnect uncaught-error finding
updated: 2026-05-16T16:50:00Z
resumed_for: empirical re-verification of Option C fix (silent empty-video archive + port-reconnect race) before closing Phase 1
---
## Current Test
[testing paused — 1 issue + 1 blocked outstanding; user routed to /gsd-debug
immediately after Test 3 surfaced 3× Uncaught Error post-reconnect. Resume
Test 3 save+play AND Test 4 SW Force-Stop after debug session lands a fix.]
number: 3
name: "OPR-2: Continuous Recording Across Tab Switches (re-attempt post-Option-C)"
expected: |
Now that Option C landed (port lifecycle refactor + request-id'd BUFFER
routing + EmptyVideoBufferError surfaced to popup), Test 3 should now
produce a zip with a valid `video/last_30sec.webm` AND show NO
Uncaught Errors after the 290s mark (that timer is retired in favor
of the port-health-probe). Re-run smoke.sh with the freshly built
dist/ and confirm:
(a) Recording runs uninterrupted across tab switches (open 2-3 new
tabs, switch between them for ~30 s).
(b) NO `Uncaught Error: Attempting to use a disconnected port object`
appears in offscreen console — even past the 290 s mark.
(c) The save flow completes in reasonable time (< 5 s, NOT 600 s as
before).
(d) The produced zip contains `video/last_30sec.webm` of expected
size (~1.5 MB for 30 s VP9 1024×768-ish).
(e) The WebM plays continuously in Chrome (no freeze, no missing
seconds across tab-switch moments).
awaiting: user response (smoke.sh output + zip contents)
## Tests
@@ -192,10 +210,11 @@ reason: |
total: 4
passed: 2
issues: 1
issues: 2 (Test 3 confirmed BLOCKER × 2: empty-archive fixed by Option C → new finding: D-13 multi-EBML-concat plays only ~9 s in mpv AND Chrome)
pending: 0
skipped: 0
blocked: 1
blocked: 2 (Test 3 retest after D-13 architectural fix; Test 4 SW Force-Stop deferred behind the same fix)
paused_for: /gsd-debug d13-multi-ebml-concat-unplayable — Phase 1 architectural finding
## Gaps