REQ-video-ring-buffer flipped from [x] back to [ ]. ROADMAP.md Phase 1 row reverted from [x] Closed 2026-05-15 to [ ] reopened 2026-05-16. STATE.md status flipped phase_complete → phase_reopened with full historical narrative preserved.

Root cause (confirmed at byte level by gsd-debugger 2026-05-16): D-13's concat-of-self-contained-WebM-segments architecture produces a 3-EBML-header WebM that standards-compliant Matroska parsers (mpv, ffmpeg, Chrome HTMLMediaElement) play only as the first segment (~9.94 s) and silently drop the remaining 2 segments. Confirmed via operator mpv drag-drop test of BOTH the canonical 2026-05-15 closure fixture and the 2026-05-16 UAT-produced fixture; both exhibit the same broken playback.

The 2026-05-15 "operator-confirmed clean Chrome playback" assessment was insufficient: it verified the file plays without freezing but did not measure total duration. Phase 1's primary deliverable (REQ-video-ring-buffer / SPEC §10 #7) is therefore NOT satisfied.

Fix path chosen by user: ts-ebml (parse) + webm-muxer (write) to replace mergeVideoSegments file-concat with real single-EBML remux. Will land as Plan 01-08 via fresh /gsd-plan-phase ceremony.

RED test landed in tests/offscreen/webm-playback.test.ts (2 new assertions on container-format-duration + ffmpeg-full-decode-duration). 2 failures, 53 baseline tests still GREEN.

Option C port-lifecycle refactor (debug session empty-archive-port-race, commits 674c415..f0871c0) DID land cleanly and is retained; that fix was orthogonal and correctly resolved the silent-empty-archive symptom that previously masked this deeper bug.

Debug session: .planning/debug/d13-multi-ebml-concat-unplayable.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---
slug: d13-multi-ebml-concat-unplayable
status: investigating
trigger: |
  Phase 1 UAT Test 3 re-attempt post-Option-C produced a structurally-correct
  3-segment WebM (SW logs confirm: "Merging 3 segments / Adding segment 0
  size: 672159 / 1 size: 507559 / 2 size: 496181 / Final video blob size:
  1675899 bytes, total segments merged: 3") but the resulting file plays
  ONLY ~9 s in Chrome AND in mpv. Cross-checking the canonical fixture
  committed at Phase 1 closure on 2026-05-15 (`tests/fixtures/last_30sec.webm`,
  1633459 bytes, 3 segments per architecture) reveals it ALSO plays only
  ~9 s in mpv. Operator confirmed both via mpv playback test.

  This means D-13's "concat of self-contained WebM segments → playable 30 s
  WebM" architecture is fundamentally broken. The 2026-05-15 Phase 1
  closure was certified on an insufficient "operator-confirmed clean
  Chrome playback" check that did not actually verify 30 s duration;
  both the closure fixture and today's UAT-produced fixture exhibit
  the same first-segment-only-plays behavior.

  Phase 1's primary deliverable (REQ-video-ring-buffer) does not actually
  produce a playable 30 s WebM. SPEC §10 #7 (`last_30sec.webm plays back
  in a browser`) is NOT satisfied by the current architecture even
  though it was marked Complete in REQUIREMENTS.md/ROADMAP.md/STATE.md
  on 2026-05-15.
created: 2026-05-16T16:56:41Z
updated: 2026-05-16T17:25:00Z
phase: 01-stabilize-video-pipeline
related_uat: .planning/phases/01-stabilize-video-pipeline/01-UAT.md
related_review_fix: .planning/phases/01-stabilize-video-pipeline/01-REVIEW-FIX.md
prior_resolved_sessions:
  - .planning/debug/resolved/d12-blob-port-transfer-fails.md
  - .planning/debug/resolved/webm-playback-freeze.md
  - .planning/debug/resolved/empty-archive-port-race.md
architectural_impact: |
  This is NOT a code-level bug; it's a wrong-architecture finding.
  D-09..D-11 (single-continuous + age-trim + first-chunk-pin) was retired
  in favor of D-13 (restart-segments + concat) on 2026-05-15 because
  D-09..D-11 caused orphan-P-frame freezes (debug session
  webm-playback-freeze). D-13 was supposed to fix that by making each
  segment self-contained with its own EBML header + seed keyframe. But
  D-13 only solved the freeze symptom; it did NOT solve the underlying
  problem of producing a single playable 30 s WebM. Players see the
  first EBML header, read the first Segment's duration metadata
  (~9.94 s), and stop there. Most Matroska/WebM players
  (ffmpeg/mpv/probably Chrome) do not implement the multi-segment
  Matroska feature; the spec permits it but doesn't mandate it.

  The fix requires real WebM REMUX: extract the VP9 frames + cluster
  timestamps from each of the 3 segments and rewrite them into a single
  EBML-headered WebM with adjusted timestamps. This is significantly
  more work than D-13 (~500-1000 LOC for a JS remuxer) but architecturally
  necessary.
---

# Debug: D-13 multi-EBML-concat produces unplayable WebM (Phase 1 architecture failure)

## Symptoms

**Expected behavior:**
When the operator clicks save, the produced `video/last_30sec.webm` plays
for ~30 s in a browser (SPEC §10 #7) covering the most recent 30 s of
captured screen.

**Actual behavior:**
- WebM file is structurally valid (3 segments concatenated per D-13 design)
- All 3 segments arrive at SW per logs:
  [SW:Main] Video buffer: 3 segments
  [SW:Main] Merging 3 segments
  [SW:Main] Adding segment 0, size: 672159 bytes
  [SW:Main] Adding segment 1, size: 507559 bytes
  [SW:Main] Adding segment 2, size: 496181 bytes
  [SW:Main] Final video blob size: 1675899 bytes, total segments merged: 3
- Resulting file (1675899 bytes) plays only ~9 s in Chrome
- Same file plays only ~9 s in mpv
- **The canonical Phase 1 closure fixture from 2026-05-15
  (`tests/fixtures/last_30sec.webm`, 1633459 bytes) ALSO plays only
  ~9 s in mpv** (operator verified by drag-drop test)

**Error messages:**
None at the runtime layer. Recording is healthy, SW merge is healthy,
download is healthy. The bug is in the PRODUCED FILE'S COMPATIBILITY
with downstream players.

ffprobe reports `duration=9.94 s` on both files, which is the first EBML
header's reported duration. ffmpeg dry-run produces 299 muxer warnings
(non-monotonic DTS at segment join boundaries) for both files; that's
the segment boundary noise from concatenation, not playback failure.

**Timeline:**
- Bug introduced: commit `6a1a034` (Plan 01-07-debug-a3, 2026-05-15,
  "feat(fix-a3): activate D-13 restart-segments in src/offscreen/recorder.ts")
  and commit `5530292` ("feat(fix-a3): retire ring-buffer first-chunk pin
  tests, add segment-rotation contract")
- Operator-validated incorrectly: commit `cd61cbc` (2026-05-15,
  "test(01-07): commit regenerated last_30sec.webm fixture against D-13
  recorder") and commit `7df72aa` (2026-05-15, "feat(01-07): close Phase 1 —
  REQ-video-ring-buffer complete, SPEC §10 #7 satisfied"). The "operator
  confirmed clean Chrome playback" assessment was insufficient: it
  checked that the file played but did not measure the total playback
  duration.
- Discovered: 2026-05-16 UAT Test 3 re-attempt after Option C debug
  session (`.planning/debug/resolved/empty-archive-port-race.md`)
  fixed the silent-empty-video archive bug. With the empty-video
  symptom retired, the underlying broken-playback issue surfaced
  cleanly.

**Reproduction:**
1. `npm run build`
2. `KEEP_PROFILE=0 ./smoke.sh`
3. Load extension, click icon, wait 5+ minutes, click save
4. Extract `video/last_30sec.webm` from the produced zip
5. Open in mpv or Chrome: playback stops at ~9 s instead of ~30 s
6. Verify the file structurally contains multiple concatenated segments via
   `ffmpeg -v warning -i FILE -f null -` (produces ~299 non-monotonic-DTS
   muxer warnings, the signature of the segment joins)
7. OR verify against committed fixture: same behavior
   (`/tmp/mokosh-test-committed-3seg.webm` and
   `/tmp/mokosh-test-uat-3seg.webm` both play 9 s in mpv per operator)

## Current Focus

hypothesis: |
  **H4 confirmed by byte-level EBML probe (2026-05-16T17:10Z, see
  Evidence/H4 below)**: D-13's "concat of self-contained WebM
  segments → produce playable 30 s WebM" architecture does not work
  because standards-compliant Matroska parsers (mpv, mkvtoolnix,
  Chrome's HTMLMediaElement, ffprobe's `format=duration` path) honor
  the FIRST Segment element's Info.Duration element (~9_934 ms for the
  fixture) and stop there. Even ffmpeg's matroska DEMUXER (which is
  unusually liberal and reads through the second segment's EBML
  header) collapses segments 2..N onto seg1's local timestamp axis
  (verified empirically: 601 packets decoded from segs 1+2, ZERO
  packets from seg3, output `time=00:00:09.96`). Multi-segment
  Matroska is technically permitted by the spec but in practice
  consumer-grade players do not implement it.

  **H3 confirmed by operator empirical test**: The 2026-05-15 Phase 1
  closure's "operator-confirmed clean Chrome playback" check was
  insufficient. The check did not measure total playback duration.
  Both the canonical committed fixture and today's UAT-produced fixture
  exhibit the same first-segment-only-plays behavior; the bug has
  existed since D-13 was activated on 2026-05-15.

  **Fix direction**: replace the file-concat merge with a real WebM
  REMUX. Parse each segment's EBML structure, extract VP9 frames +
  cluster boundaries + keyframe positions, write a SINGLE-EBML-header
  WebM whose clusters carry adjusted (monotonic) timestamps. This
  produces a file that any player can read end-to-end as one continuous
  ~30 s stream.

  **Candidate implementations** (researched 2026-05-16, see
  Evidence/library-survey below for full table):
  - `webm-muxer` 5.1.4 (Vanilagy, MIT, last release 2025-07-02,
    gzipped ~12 KB, pure ESM/CJS no DOM globals).
    `addVideoChunkRaw(data, type:'key'|'delta', timestamp, meta?)`
    accepts already-encoded VP9 frames, which is exactly the shape
    produced by a stream of existing WebM segments. SW-compatible.
    PRIMARY CANDIDATE for the write half.
  - `ts-ebml` 3.0.2 (legokichi, MIT, last release 2025-09-28, gzipped
    ~87 KB, UMD has a single `typeof window` check with self-fallback
    so SW-compatible). Decoder+Encoder API. Needed for the parse half
    (extract VP9 SimpleBlock payloads + cluster timecodes + keyframe
    flags from each segment).
  - `ebml` 3.0.0 (node-ebml, MIT, last release **2018-09-06**, dead
    upstream). Smaller but unmaintained.
  - `mp4-muxer` 5.2.2 (sibling of webm-muxer; not applicable because we
    need WebM container output).
  - Custom EBML parser (full control, ~500-1000 LOC, no dep weight)
  - **Alternative path: MediaRecorder timeslice with cluster-aware trim**:
    revisit retired D-09..D-11 architecture but trim ONLY on keyframe
    boundaries (preserving every cluster from the most recent keyframe
    onwards). See Evidence/cluster-aware-trim below: the DETERMINISTIC
    floor on retained-content duration is much less than 30 s
    (worst-case: keyframe just emitted → retain only the post-keyframe
    sliver) because VP9 keyframe cadence under Chrome's MediaRecorder is
    irregular (3-5 s typical, 26 s observed in the prior debug
    session). This path produces a NON-DETERMINISTIC content window;
    rejected as architecturally weaker than remux.
  - **Alternative path: WebCodecs API** (VideoEncoder + Muxer.js or
    similar): full control over container framing. Significant rewrite
    (~1000-2000 LOC). Most flexible but heaviest. WebCodecs is
    available in MV3 service workers per Chrome 94+, so it is viable but
    over-engineered for the current need (we already have VP9 frames,
    we just need to RE-CONTAIN them).

  The recommendation (TIEBREAKER only; the user makes the call):
  `ts-ebml` (parse) + `webm-muxer` (write) is the smallest fix that
  matches the actual problem shape. Combined ~100 KB gzipped, both
  MIT, both actively maintained, both verified SW-compatible. Net
  source-edit LOC ~150-300 in `src/background/index.ts`
  mergeVideoSegments(): we don't decode/re-encode VP9 frames, we
  just parse them out of segments and re-emit with monotonic
  timestamps. Preserves D-13's recorder-side lifecycle (which DID
  fix the orphan-P-frame freeze) and adds a single new SW-side
  remux pass on the save path.

test: |
  RED test LANDED at tests/offscreen/webm-playback.test.ts. Two new
  assertions in the new
  `describe('webm playable duration (RED — confirms d13-multi-ebml-concat-unplayable bug)')`
  block:

  1. `container-level format=duration on last_30sec.webm exceeds 25 s`:
     uses ffprobe to read `format=duration`. Asserts
     `>= MIN_PLAYABLE_DURATION_MS = 25_000`. RED today
     (actual: 9_934 ms).

  2. `ffmpeg full decode of last_30sec.webm reaches at least 25 s of timeline`:
     parses the last `time=HH:MM:SS.MS` from `ffmpeg -stats -f null -`
     output. Asserts `>= 25_000 ms`. RED today
     (actual: 9_960 ms).

  Both gate behind `it.skipIf(!ffprobeAvailable())` /
  `it.skipIf(!ffmpegAvailable())` so CI environments without those
  binaries auto-skip rather than hard-fail (matches the existing
  webm-playback.test.ts skip discipline). The existing two
  structural-validity tests in the same file (`...zero decoder packet errors` and
  `...does not end prematurely`) remain GREEN and untouched.
expecting: |
  RED test fails on current code (both fixture and freshly-recorded
  output should fail the duration assertion). Debugger then implements
  the chosen fix path (webm-muxer + ts-ebml remux most likely) and
  re-asserts GREEN. RED confirmed 2026-05-16T17:20Z: 11 test files,
  53 passed + 2 failed (the two new assertions). All pre-existing
  tests still GREEN; tsc clean (exit 0).
next_action: CHECKPOINT to orchestrator. Root cause confirmed, RED test landed, fix-strategy options surfaced; awaiting user's chosen path via orchestrator routing.
reasoning_checkpoint: |
  Why CHECKPOINT here rather than execute: the choice between
  `ts-ebml + webm-muxer` vs `custom EBML parser` vs `cluster-aware
  trim revisit of D-09..D-11` vs `WebCodecs rewrite` is architecturally
  significant (it determines whether Phase 1's deliverable stays in
  the debug-session hotfix lane OR escalates to a fresh Plan 01-08,
  and whether the project gains two new runtime deps). Per the
  feedback memory `feedback-no-unilateral-scope-reduction.md` the
  debugger does not narrow this for the user: surface options and
  let the user pick.
tdd_checkpoint: |
  RED gate honored. Two new failing assertions in
  tests/offscreen/webm-playback.test.ts pin the playable-duration
  contract that the 2026-05-15 closure check missed. Existing
  structural-validity tests remain GREEN. tsc clean. Full vitest run
  reports `Test Files 1 failed | 10 passed (11) / Tests 2 failed | 53
  passed (55)`, which is exactly the expected RED-on-new shape, no
  collateral regression.

## Constraints

- TDD mode is ON (workflow.tdd_mode: true). RED test MUST land before
  GREEN fix.
- Auto-loaded memories: `feedback-gsd-ceremony-for-fixes.md` (no
  hot-edits; route through proper GSD ceremony) and
  `feedback-no-unilateral-scope-reduction.md` (no scope narrowing).
- This fix may RETIRE the D-13 decision entirely OR keep D-13's
  rotation lifecycle but replace the concat-merge with real remux.
  CONTEXT.md will need amendment regardless.
- This fix may invalidate the existing committed fixture
  `tests/fixtures/last_30sec.webm`; once the architecture changes,
  a fresh fixture will be needed.
- The Phase 1 closure markers (REQUIREMENTS.md, ROADMAP.md, STATE.md)
  marked REQ-video-ring-buffer complete on 2026-05-15; with this
  finding they need to be REVERTED to in-progress until the fix
  lands. That's a DOCUMENTATION change the orchestrator handles, NOT
  a debugger action.
- Phase 1 architecture amendment is large enough that this debug
  session may need to escalate to a fresh Plan 01-08 (e.g. "WebM
  remux for playable ring-buffer") rather than landing as a
  hotfix in the debug session itself. The debugger should
  CHECKPOINT to the orchestrator after root-cause confirmation +
  fix-strategy options, before executing.

## Files of Interest (preliminary)

- src/offscreen/recorder.ts:
  - 80-110: getSegments + segment array management
  - 250-360: D-13 restart-segments rotation lifecycle
  - 522-650: encodeAndSendBuffer (sends segments to SW)
- src/background/index.ts:
  - 129-150: decodeBufferSegments (base64 -> Blob)
  - 395-420: mergeVideoSegments (the concat point; likely replaced by remux)
  - 444-460: createArchive (calls mergeVideoSegments)
- tests/offscreen/webm-playback.test.ts (existing; uses ffmpeg dry-run
  to check decoder errors but does NOT check total playable duration)
- tests/fixtures/last_30sec.webm (canonical fixture; needs regen post-fix)
- .planning/phases/01-stabilize-video-pipeline/01-CONTEXT.md
  (D-13 decision; needs amendment or retirement)
- .planning/REQUIREMENTS.md
  (REQ-video-ring-buffer; needs status flip from [x] back to [ ])

## Evidence

(populated by debugger; initial evidence below)

### Operator empirical observations (2026-05-16)
- `/tmp/mokosh-test-uat-3seg.webm` (today's UAT output, 1.68 MB, 3 segments):
  played ~9 s in mpv
- `/tmp/mokosh-test-committed-3seg.webm` (2026-05-15 closure fixture, 1.63 MB,
  3 segments): played ~9 s in mpv
- Earlier today operator confirmed Chrome playback of the UAT output was
  also ~9 s, not ~30 s

### SW log evidence (today's UAT run, 16:48:52)
- 3 segments arrived at SW
- Merged correctly: 672159 + 507559 + 496181 = 1675899 bytes (matches
  archive WebM size)
- No errors anywhere in delivery path

### ffmpeg dry-run signature
- Both files produce ~299 warning lines (segment join boundary noise)
- Both files report `duration=9.94 s` via ffprobe -show_entries format=duration
- Decoder errors: zero (segments are individually valid)

### H4 byte-level EBML probe (2026-05-16T17:10Z): confirms multi-EBML-concat is the root cause

Probe target: `tests/fixtures/last_30sec.webm` (1_633_459 bytes, committed
fixture from Phase 1 closure 2026-05-15).

**EBML structural scan** (raw byte search for element IDs per
[Matroska spec](https://www.matroska.org/technical/elements.html)):

| EBML element | ID (hex) | Occurrences in file | Byte offsets |
|---|---|---|---|
| EBML header | `1A 45 DF A3` | **3** | `[0, 509038, 970967]` |
| Segment | `18 53 80 67` | **3** | `[36, 509074, 971003]` |
| Cluster | `1F 43 B6 75` | 13 | spread across all 3 segments |

The file is THREE concatenated WebM files, each with its own EBML header +
Segment element. mkvinfo (without `--all-elements`) reports only the
FIRST segment + its EBML header (two top-level elements visible),
confirming standards-compliant parsers stop at the first segment.
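
A raw byte scan of the kind described above can be sketched in a few lines of TypeScript. This is only an illustration, not the probe script that was actually run: `findAllOffsets` and `sliceAtEbmlHeaders` are hypothetical helper names, and a naive search like this could in principle also match the same four bytes inside a compressed frame payload, so hits should be cross-checked against a real parser.

```typescript
// EBML header magic, as listed in the table above.
const EBML_HEADER_ID: number[] = [0x1a, 0x45, 0xdf, 0xa3];

/** Returns every byte offset at which `pattern` occurs in `bytes` (naive scan). */
function findAllOffsets(bytes: Uint8Array, pattern: number[]): number[] {
  const offsets: number[] = [];
  for (let i = 0; i + pattern.length <= bytes.length; i++) {
    let match = true;
    for (let j = 0; j < pattern.length; j++) {
      if (bytes[i + j] !== pattern[j]) {
        match = false;
        break;
      }
    }
    if (match) offsets.push(i);
  }
  return offsets;
}

/** Splits a concatenated file into one slice per EBML header found. */
function sliceAtEbmlHeaders(bytes: Uint8Array): Uint8Array[] {
  const starts = findAllOffsets(bytes, EBML_HEADER_ID);
  return starts.map((start, k) => {
    // Each slice runs up to the next header, or to end of file for the last one.
    const end = k + 1 < starts.length ? starts[k + 1] : bytes.length;
    return bytes.subarray(start, end);
  });
}
```

A file produced by a true single-EBML remux would be expected to yield exactly one offset from `findAllOffsets(bytes, EBML_HEADER_ID)`, which makes this a cheap structural check alongside the duration assertions.
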

**Per-segment isolated probes** (sliced via Python at the EBML offsets
above into `/tmp/d13-seg{1,2,3}.webm`):

| Segment | Bytes | format=duration | -count_frames |
|---|---|---|---|
| seg1 | 509_038 | 9.934 s | 301 frames |
| seg2 | 461_929 | 9.963 s | 300 frames |
| seg3 | 662_492 | 9.958 s | 311 frames |
| **TOTAL** | **1_633_459** | (29.86 s of real content) | **912 frames** |

Each segment is individually a valid, complete ~10 s WebM. The
underlying VP9 stream is intact across all three. The bug is purely the
multi-segment topology of the concatenated container.

**Concatenated file probe** (the actual fixture):

| Probe command | Reported value |
|---|---|
| `ffprobe -show_entries format=duration` | **9.934024 s** (first segment's Info.Duration metadata only) |
| `ffprobe -count_frames` | **601 frames** (= 301 + 300 = segs 1+2 only) |
| `ffmpeg -f null -` decoder | **frame=601 time=00:00:09.96** + 299 non-monotonic-DTS warnings |
| Packets read from byte range `pos<509038` (seg1) | 301 |
| Packets read from byte range `509038 ≤ pos < 970967` (seg2) | 300 |
| Packets read from byte range `pos ≥ 970967` (seg3) | **0** |
| mkvinfo top-level elements visible | 2 (EBML head + Segment); seg2 + seg3 invisible |

Two observations, both fatal to D-13:

1. ffmpeg's matroska demuxer is the most-permissive parser in common
   use and even IT silently drops segment 3 (zero packets from
   `pos ≥ 970967`).
2. Even when ffmpeg DOES read segments 1+2 it does not offset seg2's
   local timestamps onto the global timeline. The 299 non-monotonic-DTS
   warnings are seg2's local timestamps (`tt < 9934 ms`) colliding with
   seg1's end timestamp (`9934 ms`). Output stops at `time=00:00:09.96`
   because the muxer cannot grow the timeline past the maximum
   monotonic DTS it has accepted.

Conclusion: H4 confirmed at the byte level. The file is structurally
valid as a "concatenated archive of three WebMs" but is NOT a single
30-second playable WebM. To produce a 30-second playable WebM the
segments must be REMUXED (parse VP9 frames + keyframe flags + cluster
timestamps from each segment, then re-emit them inside a single
EBML-headered container with monotonically-adjusted timestamps).
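
The "monotonically-adjusted timestamps" part of that conclusion can be illustrated with a small, container-agnostic sketch. It assumes each segment has already been parsed into frames carrying segment-local millisecond timestamps; the `SegmentFrame`, `ParsedSegment`, and `rebaseSegments` names are hypothetical and do not exist in the codebase. Using each segment's reported duration as the base for the next one is also an assumption (a real remux might prefer last-frame timestamp plus frame duration).

```typescript
interface SegmentFrame {
  localTsMs: number; // timestamp relative to the start of its own segment
  isKey: boolean;
}

interface ParsedSegment {
  durationMs: number; // total content duration of this segment
  frames: SegmentFrame[];
}

interface RebasedFrame extends SegmentFrame {
  globalTsMs: number; // timestamp on the single merged timeline
}

/** Shifts every segment's local timestamps onto one shared, increasing timeline. */
function rebaseSegments(segments: ParsedSegment[]): RebasedFrame[] {
  const out: RebasedFrame[] = [];
  let segmentBaseMs = 0;
  for (const seg of segments) {
    for (const frame of seg.frames) {
      out.push({ ...frame, globalTsMs: segmentBaseMs + frame.localTsMs });
    }
    // The next segment starts where this one ended.
    segmentBaseMs += seg.durationMs;
  }
  return out;
}
```

With the fixture's three segment durations (9_934, 9_963 and 9_958 ms), the first frame of seg3 would land at 19_897 ms instead of colliding with seg1's timeline at 0 ms, which is the collision ffmpeg reports as non-monotonic DTS.
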

### Library survey (2026-05-16T17:15Z): candidate JS WebM remux libraries

All sizes are the bundled dist (no source-map, no tests, no docs).
Gzipped values measured locally via `gzip -c`. SW-compat verdict is
based on grep of dist for `window`/`document`/`navigator`/`XMLHttpRequest`
followed by manual inspection of any hits.

| Lib | Version | License | Last release | Dist size | Gzipped | SW-compat | API shape | Verdict |
|---|---|---|---|---|---|---|---|---|
| `webm-muxer` | 5.1.4 | MIT | 2025-07-02 | 69 KB | **~12 KB** | YES (zero DOM refs) | `addVideoChunkRaw(data, type:'key'\|'delta', ts, meta?)` accepts encoded VP9 frames | PRIMARY: write half |
| `ts-ebml` | 3.0.2 | MIT | 2025-09-28 | 356 KB | ~87 KB | YES (`typeof window` with `self` fallback in UMD wrapper) | `Decoder.decode(ArrayBuffer) → EBMLElementDetail[]` ; `Encoder.encode(elms) → ArrayBuffer` | PRIMARY: parse half |
| `ebml` | 3.0.0 | MIT | **2018-09-06** | 7.7 MB unpacked | n/a | uncertain | older streaming parser API | DEAD UPSTREAM: avoid |
| `mp4-muxer` | 5.2.2 | MIT | (active) | 70 KB | ~13 KB | YES | analogous to webm-muxer but MP4 | n/a: wrong container |
| Custom EBML parser | n/a | n/a | n/a | 0 KB | 0 KB | YES | hand-rolled per Matroska spec | ~500-1000 LOC, full ownership |

Important note on the `webm-muxer` API: `addVideoChunkRaw()` takes
already-encoded VP9 frame bytes + a keyframe flag + a timestamp. We
do NOT need to decode/re-encode the VP9 stream, because the existing
segments already contain valid VP9 frame payloads inside their
Cluster/SimpleBlock elements. The remux path is:

1. For each segment blob, parse via `ts-ebml.Decoder` → walk the EBML
   tree → for each Cluster's SimpleBlock children, extract the VP9
   frame bytes, the keyframe flag (the `0x80` bit of the SimpleBlock
   flags byte = "keyframe" per
   [spec](https://www.matroska.org/technical/elements.html#SimpleBlock)),
   the cluster Timestamp, and the local block offset.
2. Compute monotonic adjusted timestamp:
   `globalTs = segmentBaseMs + clusterTsMs + blockOffsetMs`
   where `segmentBaseMs` accumulates the prior segments' total content
   duration.
3. Stream all adjusted frames into a single `webm-muxer.Muxer` with
   `addVideoChunkRaw(frameData, isKey ? 'key' : 'delta', globalUs)`
   (`globalUs` being `globalTs` converted to microseconds).
4. `muxer.finalize()` → `ArrayBufferTarget.buffer` → single-EBML
   WebM Blob.
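
Steps 1 and 2 above hinge on reading the small header at the front of each SimpleBlock payload. The following is a minimal sketch of that read, written against the Matroska SimpleBlock layout (track number as an EBML variable-length integer, a signed 16-bit big-endian timestamp relative to the Cluster, then one flags byte). `parseSimpleBlock` and `toGlobalUs` are hypothetical helper names, the sketch deliberately rejects laced blocks, and it assumes the segments use the Matroska default TimestampScale of 1 ms per tick, which should be verified against the Info element of real recorder output.

```typescript
interface SimpleBlockInfo {
  trackNumber: number;
  relativeTs: number; // signed offset from the enclosing Cluster's Timestamp, in ticks
  isKey: boolean;
  frame: Uint8Array; // encoded frame bytes (unlaced blocks only)
}

/** Parses the header of a SimpleBlock payload (the bytes after the element ID and size). */
function parseSimpleBlock(payload: Uint8Array): SimpleBlockInfo {
  if (payload.length === 0 || payload[0] === 0) {
    throw new Error("invalid track-number varint");
  }
  // EBML varint: the position of the first set bit gives the encoded length.
  let len = 1;
  while ((payload[0] & (0x80 >> (len - 1))) === 0) len++;
  if (payload.length < len + 3) throw new Error("SimpleBlock too short");

  let trackNumber = payload[0] & (0xff >> len);
  for (let i = 1; i < len; i++) trackNumber = trackNumber * 256 + payload[i];

  const view = new DataView(payload.buffer, payload.byteOffset, payload.byteLength);
  const relativeTs = view.getInt16(len, false); // big-endian, signed
  const flags = payload[len + 2];
  if ((flags & 0x06) !== 0) throw new Error("laced SimpleBlock not handled in this sketch");

  return {
    trackNumber,
    relativeTs,
    isKey: (flags & 0x80) !== 0, // keyframe bit
    frame: payload.subarray(len + 3),
  };
}

/** Step 2 formula, converted to microseconds (assumes 1 tick = 1 ms). */
function toGlobalUs(segmentBaseMs: number, clusterTsMs: number, blockOffsetMs: number): number {
  return (segmentBaseMs + clusterTsMs + blockOffsetMs) * 1000;
}
```

In the proposed path `ts-ebml` would supply the SimpleBlock payload bytes and the Cluster timestamp; this sketch only covers what happens to those bytes afterwards and does not exercise either library's API.
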

Combined dep weight: ~100 KB gzipped (`webm-muxer` ~12 KB + `ts-ebml`
~87 KB). Combined source edit estimate at `mergeVideoSegments()`:
~150-300 LOC including type defs.

### Cluster-aware-trim alternative path (D-09..D-11 revisit, 2026-05-16T17:18Z)

Path summary: keep MediaRecorder running continuously (the retired
D-09 lifecycle) but, on each periodic trim pass, scan the chunk buffer
for the OLDEST keyframe whose position would keep total duration ≤ 30
s, then drop everything strictly before that keyframe. Preserves header
chunk + a contiguous run of keyframe-anchored clusters.

Why this is architecturally weaker than remux:

1. **Non-deterministic content window.** MediaRecorder/VP9 keyframe
   cadence under Chrome's default `kf_max_dist=100` is irregular:
   the prior `webm-playback-freeze` debug session observed a 26-second
   keyframe gap empirically. If the latest keyframe was emitted 2 s
   ago, cluster-aware trim retains only 2 s of content. The user's
   `last_30sec.webm` would be anywhere in `[~few seconds .. ~30 s]`
   depending on when SAVE landed in the keyframe cycle. That breaks
   SPEC §10 #7's implicit "≥ 30 s of recent context" requirement.

2. **Still need EBML parsing.** To find keyframe boundaries inside
   the chunk buffer we still need to parse the WebM container for
   SimpleBlock keyframe flags. So the dep weight is similar (`ts-ebml`
   at minimum) but the output is worse.

3. **Re-introduces the freeze-risk surface area.** The prior debug
   session retired D-09..D-11 precisely because age-trim repeatedly
   produced orphan-P-frame freezes. A "keyframe-aware" variant still
   has to delete content; one bug in the keyframe-detection path and
   the freeze returns. The risk surface is wider than the remux path,
   which never deletes: it only re-containers what already exists.

LOC estimate: ~200-400 LOC for keyframe parsing + buffer mutation +
tests. Net: similar dep weight, worse playable-duration guarantee,
re-opens the freeze regression surface. **REJECTED as inferior to
remux.** Documenting here only because the orchestrator brief
explicitly requested the comparison.
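
The non-determinism argument can be made concrete with a toy model of the trim rule from the path summary (keep everything from the oldest keyframe that still fits inside the window). This is a simplified sketch under stated assumptions: `retainedAfterKeyframeTrimMs` is a hypothetical function, keyframe times are taken as already known, and header-chunk handling is ignored.

```typescript
/**
 * Returns how many milliseconds of content survive a keyframe-aligned trim,
 * given the keyframe timestamps and the moment SAVE is clicked.
 */
function retainedAfterKeyframeTrimMs(
  keyframeTimesMs: number[],
  saveTimeMs: number,
  windowMs: number = 30_000,
): number {
  // Only keyframes no older than the window may become the new start point.
  const eligible = keyframeTimesMs.filter(
    (t) => t <= saveTimeMs && saveTimeMs - t <= windowMs,
  );
  if (eligible.length === 0) return 0;
  // Everything strictly before the oldest eligible keyframe is dropped.
  return saveTimeMs - Math.min(...eligible);
}
```

With keyframes at 0 ms and 26_000 ms (a 26 s gap like the one observed earlier), saving at 29_000 ms retains 29 s in this model, while saving two seconds later at 31_000 ms retains only 5 s because the first keyframe has just aged out of the window.
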

### WebCodecs API path (2026-05-16T17:19Z)

WebCodecs (`VideoEncoder` + `VideoDecoder`) is available in MV3 service
workers from Chrome 94+. The path would be: feed each segment's
clusters → `VideoDecoder` → emit `VideoFrame` objects → feed back into
`VideoEncoder` (re-encode VP9) → wrap output via `webm-muxer`.

This works but adds a re-encode pass that:
- doubles CPU cost during the save flow
- introduces an additional quality loss (re-encoding lossy VP9)
- adds 500-1000 LOC of encoder/decoder lifecycle management
- requires Chrome 94+ exclusively (we already require modern Chrome,
  so OK, but it tightens the version floor)

There is no benefit over the `ts-ebml + webm-muxer` path for this
specific shape of problem: we already have encoded VP9 frames and
just need to put them in a different container. Re-encoding is
unnecessary work. **REJECTED as over-engineered.**

### RED test landing evidence (2026-05-16T17:20Z)

File edited: `tests/offscreen/webm-playback.test.ts` (preserved
existing 2 GREEN tests; appended new `describe` block with 2 new
assertions + supporting helpers).

Test run scoped to file:
```
$ npx vitest run tests/offscreen/webm-playback.test.ts
Test Files 1 failed (1)
Tests 2 failed | 2 passed (4)
```

Failures:
- `container-level format=duration on last_30sec.webm exceeds 25 s`:
  `expected 9934 to be greater than or equal to 25000`
- `ffmpeg full decode of last_30sec.webm reaches at least 25 s of timeline`:
  `expected 9960 to be greater than or equal to 25000`

Full suite (proves zero collateral regression):
```
$ npx vitest run
Test Files 1 failed | 10 passed (11)
Tests 2 failed | 53 passed (55)
```

All 53 pre-existing tests still GREEN. tsc:
```
$ npx tsc --noEmit; echo exit=$?
exit=0
```
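
For readers unfamiliar with the second assertion, the step of "parsing the last `time=HH:MM:SS.MS`" from ffmpeg's progress output could look roughly like the sketch below. It is an illustration of the technique only, not the helper that landed in `tests/offscreen/webm-playback.test.ts`; `lastFfmpegTimeMs` is a hypothetical name and the sample progress line in the usage note is constructed for the example.

```typescript
/**
 * Extracts the final `time=HH:MM:SS.frac` stamp from ffmpeg progress output
 * and returns it in milliseconds, or null when no stamp is present.
 */
function lastFfmpegTimeMs(stderr: string): number | null {
  const re = /time=(\d+):(\d{2}):(\d{2})\.(\d+)/g;
  let last: RegExpExecArray | null = null;
  let m: RegExpExecArray | null;
  // ffmpeg rewrites the progress line repeatedly; keep only the last match.
  while ((m = re.exec(stderr)) !== null) last = m;
  if (last === null) return null;

  const [, hh, mm, ss, frac] = last;
  const wholeMs = (Number(hh) * 3600 + Number(mm) * 60 + Number(ss)) * 1000;
  // Round to avoid floating-point residue from the fractional part.
  const fracMs = Math.round(Number(`0.${frac}`) * 1000);
  return wholeMs + fracMs;
}
```

Fed a progress line ending in `time=00:00:09.96`, this returns 9960, the same value the RED assertion reports against the 25_000 ms threshold.
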

## Eliminated

(populated by debugger as hypotheses are ruled out)

- H1 (Chrome version regression): unlikely given mpv exhibits same behavior
  and mpv uses ffmpeg internally, not Chrome's media stack
- H2 (today's encoding differs subtly from 2026-05-15): ruled out because the
  committed fixture also plays ~9 s in mpv, so it's been broken since D-13
  activation
- H5 (defective committed fixture in storage): ruled out because file size
  matches expected (1.63 MB matches what was committed on 2026-05-15;
  not bit-rot)
- H6 (cluster-aware-trim revisit of D-09..D-11): rejected on architectural
  weakness: non-deterministic content window (depends on keyframe
  cadence), still needs EBML parsing, re-opens freeze-regression
  surface area. See Evidence/cluster-aware-trim section.
- H7 (WebCodecs re-encode path): rejected as over-engineered because it
  re-encodes VP9 frames we already have. ~500-1000 LOC for zero
  quality/playability benefit. See Evidence/WebCodecs section.

## Resolution

root_cause: ""
fix: ""
verification: ""
files_changed: []