mokosh/.planning/debug/resolved/webm-playback-freeze.md
Mark 872f25d649 docs(fix-a3): resolve webm-playback-freeze debug session, update STATE
Closes the second debug session in Phase 1's life (after d12). Both
sessions resolved fast — ~30 min for d12, ~15 min for the RED-test
landing in this one — because the planner had explicitly pre-staged
contingencies (D-12 ffprobe gate + D-13 restart-segments skeleton)
for the assumptions RESEARCH.md flagged HIGH-risk. Neither was a
planning oversight; both were the documented HIGH-risk assumption
activating as expected.

Changes:
- Moved .planning/debug/webm-playback-freeze.md →
  .planning/debug/resolved/webm-playback-freeze.md (status:
  root-cause-confirmed → resolved).
- Added the Resolution section: root-cause one-liner, applied-fix
  description, the 5 files-changed list, the 6 fix-a3 commit hashes,
  the in-tree verification matrix, and the explicit operator
  next-step (re-run ./smoke.sh, verify Chrome playback +
  ffmpeg-clean stderr + the 2 webm-playback.test.ts assertions
  flipping GREEN, then Phase 1 closes).
- Updated STATE.md frontmatter `stopped_at`, the Decisions log
  with a [Phase 01-07-debug-a3] entry summarising D-13 activation
  + the type renames + the retired old-API surface, and the
  Session Continuity block (timestamp, stopped_at narrative,
  resume-file pointer).

Phase 1 close is still pending operator regen of
tests/fixtures/last_30sec.webm. REQ-video-ring-buffer must not
be marked complete by this commit — Plan 07's §10 #7 acceptance
criterion owns that and only the in-Chrome playback + ffmpeg-clean
stderr (against a freshly regenerated fixture) can close it.
2026-05-15 21:18:36 +02:00

---
slug: webm-playback-freeze
status: resolved
trigger: Phase 1 A3 cluster-alignment failure — last_30sec.webm freezes ~1 s into playback in Chrome despite ffprobe structural validation passing. Surfaced during /gsd-execute-phase 1 Plan 01-07 manual smoke retest after the D-12 binary-transfer fix landed.
created: 2026-05-15
updated: 2026-05-15
resolved: 2026-05-15
phase: 1
plan: 01-07
related_resolved: d12-blob-port-transfer-fails
resolution_commits:
- 5530292 feat(fix-a3): retire ring-buffer first-chunk pin tests, add segment-rotation contract
- 6a1a034 feat(fix-a3): activate D-13 restart-segments in src/offscreen/recorder.ts
- 670daa3 feat(fix-a3): adapt SW receive path to segment semantics
- f81438d feat(fix-a3): rename TransferredVideoChunk → TransferredVideoSegment
- 87909d9 test(fix-a3): commit debug-session test artifacts + stale fixture
---
# Debug session — WebM playback freeze (A3 cluster alignment)
## Symptoms
- **Expected:** `tests/fixtures/last_30sec.webm` plays end-to-end (~30 s of video) in Chrome's built-in player. SPEC §10 #7 acceptance criterion: "the archive opens, last_30sec.webm plays in the browser."
- **Actual:** The file is 2.1 MB of valid VP9 stream metadata (ffprobe passes structural validation; D-12 gate green). When opened in Chrome, playback FREEZES ~1 s in. Decoder cannot continue past the early frames.
- **Error messages (from `ffmpeg -v warning -i tests/fixtures/last_30sec.webm -f null -`):**
  - `[vist#0:0/vp9 @ ...] [dec:vp9 @ ...] Error submitting packet to decoder: Invalid data found when processing input` ×8
  - `[in#0/matroska,webm @ ...] File ended prematurely at pos. 2100851 (0x200e73)`
- **Timeline:** First playback test after the D-12 base64-transfer fix landed (commits c0d9166..bf07619). The fix made the WebM container *valid*; this is the next-layer failure that the prior session masked.
- **Reproduction:**
  1. Build is current (commit bf07619 on `gsd/phase-01-stabilize-video-pipeline`).
  2. `./smoke.sh` (KEEP_PROFILE=1 — extension already loaded). Reload extension at `chrome://extensions`.
  3. Click extension → wait ~35 s → click "Сохранить отчёт об ошибке" ("Save error report").
  4. `unzip -p ~/Downloads/session_report_2026-05-15_20-28-58.zip video/last_30sec.webm > /tmp/last_30sec.webm`
  5. `ffprobe -v error -f matroska -i /tmp/last_30sec.webm; echo $?` → exit 0 (D-12 passes)
  6. Open the WebM in Chrome → playback freezes ~1 s in.
## Evidence already collected
- timestamp: 2026-05-15T20:35Z — Keyframe distribution map via `ffprobe -select_streams v:0 -show_frames -show_entries frame=key_frame,pts_time`:
  - Keyframes at pts_time: 0.000, 0.029, 0.095, **then a 26.4-second gap with NO keyframes**, then 26.474, 29.843, 33.209, 36.577, 39.945, ... (regular ~3 s cadence)
  - First ~50 packets after t=0.095s are all P-frames (`___` flag, no keyframe)
- timestamp: 2026-05-15T20:35Z — `ffmpeg -v warning -i ... -f null -` decode dry-run: 8× `Error submitting packet to decoder: Invalid data found` + `File ended prematurely at pos. 2100851`. Empirical proof that decoding fails partway through.
- timestamp: 2026-05-15T20:30Z — Container validity confirmed by ffprobe `-show_streams`: VP9 codec, profile 0, 912×886, valid color metadata (bt709), start_pts=0, time_base=1/1000. The container is *structurally* valid; the *content* is not decodable end-to-end.
- timestamp: 2026-05-15T20:30Z — Fixture committed at `tests/fixtures/last_30sec.webm` (2.1 MB) by Plan 07's executor before the playback freeze was discovered. This fixture IS the reproduction case.
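The gap quoted in the keyframe map can be re-derived from the listed `pts_time` values. The following is a minimal sketch, not part of the repository: `largestKeyframeGap` and `KeyframeGap` are hypothetical names, and the input array is copied by hand from the ffprobe output above rather than parsed from it.

```typescript
// Keyframe pts_time values (seconds), copied from the ffprobe keyframe map above.
const keyframePts: number[] = [0.0, 0.029, 0.095, 26.474, 29.843, 33.209, 36.577, 39.945];

interface KeyframeGap {
  from: number;
  to: number;
  seconds: number;
}

// Hypothetical helper: return the widest interval between consecutive keyframes.
function largestKeyframeGap(pts: number[]): KeyframeGap {
  if (pts.length < 2) throw new Error("need at least two keyframes");
  const sorted = [...pts].sort((a, b) => a - b);
  let widest: KeyframeGap = { from: sorted[0], to: sorted[1], seconds: sorted[1] - sorted[0] };
  for (let i = 2; i < sorted.length; i++) {
    const seconds = sorted[i] - sorted[i - 1];
    if (seconds > widest.seconds) {
      widest = { from: sorted[i - 1], to: sorted[i], seconds };
    }
  }
  return widest;
}

const gap = largestKeyframeGap(keyframePts);
console.log(`${gap.from.toFixed(3)} -> ${gap.to.toFixed(3)}: ${gap.seconds.toFixed(3)} s`);
```

With these inputs the widest interval runs from 0.095 s to 26.474 s, which is the ~26.4 s keyframe-free stretch noted above.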
## Current Focus
- **hypothesis:** The ring-buffer trim removes chunks containing P-frames that subsequent retained chunks depend on. `MediaRecorder.start(2000)` emits chunks at the 2 s timeslice but does NOT force a keyframe at each chunk boundary; VP9's `kf_max_dist` default places keyframes every ~35 s (bugzilla #1666487 cited in RESEARCH.md). So most "later" chunks contain only P-frames whose reference frames are in earlier (trimmed) chunks. Concretely: chunk 1 contains a keyframe + ~0.1 s of frames; the ring buffer keeps chunk 1 (header retention per D-10) plus the most recent 30 s of chunks. But the keyframe needed for the retained recent chunks lives in trimmed-out middle chunks, so decoding hits a wall just past chunk 1's end.
- **Secondary cause:** The WebM lacks proper `MediaRecorder.stop()` finalization (no Cues/SegmentSize markers) because the SW reads the in-memory buffer mid-stream without stopping the recorder. Hence "File ended prematurely". This compounds the freeze but is not the root cause; even with proper finalization, the keyframe gap would still break playback.
- **next_action:** RED tests have landed (see Evidence below). Hand off to executor for D-13 activation per the Resolution / Activation Plan section below.
- **expecting:** RED today on (a) empirical fixture decode and (b) production `getSegments` API. D-13 activation + fresh fixture regeneration flips both GREEN.
- **reasoning_checkpoint:** A3 was explicitly flagged HIGH-risk in RESEARCH.md and D-13 was specifically pre-staged for this. The keyframe map empirically matches the predicted failure exactly. This is NOT a "we missed it" situation — it's "the documented contingency activated as expected." The RED tests are landed first before any source edit per TDD discipline + the GSD-ceremony feedback the user gave earlier in this session (no hot-fixes).
- **specialist_hint:** `chrome-extension-mv3` — the fix lives in the MediaRecorder lifecycle in the offscreen document; the format constraints come from VP9/WebM/Matroska spec. There is no language-specialist agent for this in the current dispatcher table, so engineering:debug or a manual review path is appropriate.
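The hypothesis can be illustrated with a small pure simulation. This is a sketch under simplifying assumptions, not the production ring buffer: `SimChunk`, `trimWithPinnedFirstChunk`, and `decodableChunks` are hypothetical names, the keyframe positions are invented for the example, and a chunk is treated as decodable only if it carries a keyframe or directly follows a retained, decodable chunk.

```typescript
interface SimChunk {
  index: number;
  startMs: number;
  hasKeyframe: boolean;
}

const TIMESLICE_MS = 2_000; // MediaRecorder.start(2000) timeslice
const WINDOW_MS = 30_000;   // age-trim window

// Build a hypothetical recording where only the listed chunks contain a keyframe.
function record(chunkCount: number, keyframeChunks: number[]): SimChunk[] {
  return Array.from({ length: chunkCount }, (_, index) => ({
    index,
    startMs: index * TIMESLICE_MS,
    hasKeyframe: keyframeChunks.includes(index),
  }));
}

// Simplified stand-in for the D-09..D-11 trim: pin chunk 0, keep the last 30 s.
function trimWithPinnedFirstChunk(chunks: SimChunk[], nowMs: number): SimChunk[] {
  return chunks.filter((c) => c.index === 0 || c.startMs >= nowMs - WINDOW_MS);
}

// A chunk decodes if it has its own keyframe, or if the chunk immediately
// before it was retained and decodable (the P-frame chain is unbroken).
function decodableChunks(retained: SimChunk[]): number[] {
  const ok: number[] = [];
  let prev: SimChunk | undefined;
  let prevOk = false;
  for (const chunk of retained) {
    const contiguous = prev !== undefined && chunk.index === prev.index + 1;
    const good = chunk.hasKeyframe || (contiguous && prevOk);
    if (good) ok.push(chunk.index);
    prev = chunk;
    prevOk = good;
  }
  return ok;
}

// 25 chunks (50 s); keyframes only in chunk 0 and chunk 5 (assumed positions).
const recorded = record(25, [0, 5]);
const retained = trimWithPinnedFirstChunk(recorded, 50_000);
const decodable = decodableChunks(retained);
console.log(retained.map((c) => c.index)); // chunk 0 plus chunks 10..24
console.log(decodable);                    // only chunk 0 survives
```

In this model the keyframe in chunk 5 is evicted, so every retained chunk after the pinned header is an orphaned P-frame run, matching the "freeze just past chunk 1" symptom.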
## Pre-existing fix material (D-13 skeleton)
Per Phase 1 CONTEXT.md decisions D-13 + Plan 01-03's SUMMARY, a commented-out restart-segments skeleton already lives at the bottom of `src/offscreen/recorder.ts` (lines 298-316). The activation plan needs to:
1. Replace the single-continuous-MediaRecorder lifecycle with a segment-based one (stop+restart every ~10 s on the same MediaStream)
2. Keep the last 3 segments in memory (3 × 10 s = 30 s)
3. Drop D-09..D-11's first-chunk-pin logic (obsolete under restart-segments — each segment is self-contained, has its own header)
4. Reuse the D-12 base64 wire-format per-segment for the 3 segments
5. SW concatenates 3 self-contained WebMs into a multi-EBML-header file. Chrome handles this, and SPEC §10 #7 only requires that the file plays in *a* browser, so Chrome's acceptance is sufficient.
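The "keep the last 3 segments" rule in step 2 amounts to a bounded FIFO. A minimal sketch, assuming a hypothetical `pushSegment` helper and using strings in place of WebM Blobs so the eviction order is easy to see:

```typescript
const MAX_SEGMENTS = 3;
const SEGMENT_DURATION_MS = 10_000;

// Hypothetical helper: append a segment and keep only the newest `max` entries.
function pushSegment<T>(segments: T[], next: T, max: number = MAX_SEGMENTS): T[] {
  const updated = [...segments, next];
  return updated.slice(Math.max(0, updated.length - max));
}

let ring: string[] = [];
for (const name of ["seg-0", "seg-1", "seg-2", "seg-3", "seg-4"]) {
  ring = pushSegment(ring, name);
}

console.log(ring);                                // the three newest segments
console.log(ring.length * SEGMENT_DURATION_MS);   // retained duration in ms
```

Because every retained entry is a whole segment with its own header and opening keyframe, eviction never splits a dependency chain the way chunk-level trimming does.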
## Out of scope for this session
- **Playback in players other than Chrome.** SPEC §10 #7 only requires Chrome playback. VLC / mpv may handle multi-EBML-header WebMs differently. Not a Phase 1 concern.
- **Audio capture.** Phase 2 / SPEC §9.
- **The "File ended prematurely" finalization gap.** Restart-segments solves it as a side effect (each segment gets a proper .stop()). No separate fix needed.
## Evidence
- timestamp: 2026-05-15T20:38Z — RED test #1 landed: `tests/offscreen/webm-playback.test.ts`. Two assertions:
  * `ffmpeg dry-run on last_30sec.webm produces zero decoder packet errors` — FAILS with `expected 1 to be 0` (ffmpeg condenses its report: the single "last message repeated 7 times" line means 8 actual events).
  * `ffmpeg dry-run on last_30sec.webm does not end prematurely` — FAILS with `expected true to be false`.

  Both failures cite the exact ffmpeg stderr that originally surfaced the bug, so a regression bisect lands on a useful diff. Skip-fence via `it.skipIf(!ffmpegAvailable())` so CI environments without ffmpeg auto-skip rather than fail.
- timestamp: 2026-05-15T20:40Z — RED test #2 landed: `tests/offscreen/segment-keyframes.test.ts`. Three describe blocks:
  * **documentation block** — pure-simulation tests that pass today, encode the D-09..D-11 failure mode as executable evidence (regression guard against re-introducing the single-continuous-recorder semantics post-fix).
  * **GREEN-pinning block** — pure-simulation tests that pin the D-13 segment-keyframe invariant; pass today as a forward contract for the fix reviewer.
  * **production-driven RED block** — imports `src/offscreen/recorder.ts` and asserts (i) `getSegments` is exported as a function, (ii) it returns at most 3 Blobs. FAILS today (the export does not exist); flips GREEN when D-13 is activated and a `getSegments` export is added.
- timestamp: 2026-05-15T20:40Z — Full vitest run: `4 failed | 21 passed (25 total)`. Pre-existing 15/15 tests still pass; the 4 failures are exactly the new RED tests above (2 in webm-playback, 2 in segment-keyframes). `npx tsc --noEmit` passes without diagnostics — the new tests are type-clean.
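Because ffmpeg collapses repeated log lines, a raw line count understates the number of decoder errors. The sketch below shows one possible way to expand the repeat line when inspecting stderr. It is not the logic in `webm-playback.test.ts`; `countDecoderErrors` and `endedPrematurely` are hypothetical helpers, and the sample text is abbreviated from the messages quoted in this document.

```typescript
// Hypothetical helper: count decoder errors, expanding "Last message repeated N times".
function countDecoderErrors(stderr: string): number {
  let total = 0;
  let lastWasDecoderError = false;
  for (const line of stderr.split(/\r?\n/)) {
    const repeat = /last message repeated (\d+) times/i.exec(line);
    if (repeat) {
      // The repeat line refers to whichever message was logged just before it.
      if (lastWasDecoderError) total += Number(repeat[1]);
      continue;
    }
    lastWasDecoderError = line.includes("Error submitting packet to decoder");
    if (lastWasDecoderError) total += 1;
  }
  return total;
}

// Hypothetical helper: detect the truncated-container message.
function endedPrematurely(stderr: string): boolean {
  return stderr.includes("File ended prematurely");
}

// Abbreviated sample assembled from the stderr lines quoted above.
const sampleStderr = [
  "[vist#0:0/vp9 @ ...] [dec:vp9 @ ...] Error submitting packet to decoder: Invalid data found when processing input",
  "    Last message repeated 7 times",
  "[in#0/matroska,webm @ ...] File ended prematurely at pos. 2100851 (0x200e73)",
].join("\n");

console.log(countDecoderErrors(sampleStderr)); // 1 logged line + 7 repeats
console.log(endedPrematurely(sampleStderr));
```

A clean decode would yield a count of zero and no premature-end message, which is what the two RED assertions require.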
## Eliminated
- **Container corruption due to base64-transfer wire format.** Already fixed by the d12 session; ffprobe `-show_streams` shows valid VP9, 912×886, bt709 metadata. Container is well-formed; payload semantics are the failure.
- **MIME-type misdetection on the SW side.** `merged.type === 'video/webm'` is enforced by `mergeVideoChunks`; the SW's `base64ToBlob(wire.data, wire.type || VIDEO_MIME_FALLBACK)` round-trips correctly per the GREEN-pinning block of `tests/offscreen/port-serialization.test.ts`.
- **Chunk ordering bug.** `mergeVideoChunks` sorts by `timestamp` before concatenation; the keyframe-map shows monotonically increasing pts_time after the gap, ruling out a sort-order issue.
- **Audio interference.** `getDisplayMedia({ video: true, audio: false })` — no audio track exists to interleave.
- **VP9 codec misconfiguration.** `videoBitsPerSecond: 400_000` + `mimeType: 'video/webm;codecs=vp9'` is the Chrome-supported config (codec-check test asserts `MediaRecorder.isTypeSupported('video/webm;codecs=vp9') === true`).
## Resolution
**Root cause:** Single continuous `MediaRecorder` + 30 s age-trim ring buffer (D-09..D-11) loses VP9 keyframe references when chunks in the *middle* of the recording are evicted. The pinned first chunk's keyframe anchors only the first ~0.1 s; every subsequent retained chunk's P-frames reference keyframes that lived in trimmed chunks. Chrome's decoder fails the moment it has to render a frame whose I-frame predecessor is missing — observed empirically as freeze at ~1 s of playback. Secondary issue: mid-stream buffer read without `MediaRecorder.stop()` means Matroska SegmentSize / Cues are never written, producing the `File ended prematurely` line; D-13's per-segment `.stop()` finalizes this naturally.
**Fix applied (2026-05-15):** Activated the pre-staged **D-13 restart-segments** skeleton in `src/offscreen/recorder.ts`. Recorder lifecycle replaced: every `SEGMENT_DURATION_MS = 10_000` ms the recorder calls `.stop()` (finalizes the segment naturally), `onstop` assembles `currentChunks` into one self-contained ~10 s WebM Blob, pushes to `segments`, evicts oldest if over `MAX_SEGMENTS = 3`, and constructs a fresh `MediaRecorder` on the SAME `mediaStream` — preserving the user gesture, seeding a new EBML header + initial VP9 keyframe in the new segment. SW-side `mergeVideoSegments` concatenates the segments sequentially; Chrome plays multi-EBML-header WebMs natively (SPEC §10 #7 scope). The retired D-09..D-11 API (`addChunk`, `trimAged`, `getBuffer`, `firstChunkSaved`, `isFirst`) was deleted in the same atomic commits; new public API surface is `getSegments`, `pushSegmentForTest`, `resetBuffer`, `MAX_SEGMENTS`, `SEGMENT_DURATION_MS`, `VIDEO_BUFFER_DURATION_MS`, `assertCodecSupported`. Types renamed: `TransferredVideoChunk` → `TransferredVideoSegment`, `VideoChunk` → `VideoSegment`, `PortMessage.chunks` → `PortMessage.segments`, `VideoBufferResponse.chunks` → `VideoBufferResponse.segments`. The `isFirst` header-pin field dropped entirely — meaningless under D-13.
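The stop-and-restart lifecycle described above can be sketched as follows. This is an illustration under stated assumptions, not the code in `src/offscreen/recorder.ts`: `RecorderLike`, `SegmentRotator`, and the injected factory are hypothetical, and the real module reportedly keeps its state at module scope. In the offscreen document the factory would wrap a fresh `MediaRecorder` built on the same `mediaStream`.

```typescript
const SEGMENT_DURATION_MS = 10_000;
const MAX_SEGMENTS = 3;

// Minimal subset of the MediaRecorder surface this sketch relies on (assumed shape).
interface RecorderLike {
  ondataavailable: ((event: { data: Blob }) => void) | null;
  onstop: (() => void) | null;
  start(): void;
  stop(): void;
}

class SegmentRotator {
  readonly segments: Blob[] = [];
  private currentChunks: Blob[] = [];
  private recorder: RecorderLike | null = null;
  private timer: ReturnType<typeof setTimeout> | null = null;
  private running = false;
  private readonly createRecorder: () => RecorderLike;

  constructor(createRecorder: () => RecorderLike) {
    this.createRecorder = createRecorder;
  }

  start(): void {
    this.running = true;
    this.beginSegment();
  }

  // Stopping the recorder finalizes the current segment; onstop starts the next.
  rotate(): void {
    this.recorder?.stop();
  }

  // Finalize the in-flight segment and stop rotating.
  stop(): void {
    this.running = false;
    this.clearTimer();
    this.recorder?.stop();
  }

  private clearTimer(): void {
    if (this.timer !== null) clearTimeout(this.timer);
    this.timer = null;
  }

  private beginSegment(): void {
    const recorder = this.createRecorder();
    recorder.ondataavailable = (event) => {
      if (event.data.size > 0) this.currentChunks.push(event.data);
    };
    recorder.onstop = () => this.onSegmentStopped();
    this.recorder = recorder;
    recorder.start();
    this.timer = setTimeout(() => this.rotate(), SEGMENT_DURATION_MS);
  }

  private onSegmentStopped(): void {
    this.clearTimer();
    if (this.currentChunks.length > 0) {
      // One self-contained segment per recorder instance.
      this.segments.push(new Blob(this.currentChunks, { type: "video/webm" }));
      this.currentChunks = [];
    }
    while (this.segments.length > MAX_SEGMENTS) this.segments.shift();
    if (this.running) this.beginSegment();
  }
}
```

Injecting the factory keeps the rotation logic testable without a browser: a fake recorder that fires `ondataavailable` and `onstop` from `stop()` is enough to exercise eviction and restart.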
**Verification (in-tree):**
- `npx vitest run` → 28 passed / 2 failed. The two reds are the empirical ffmpeg dry-runs in `tests/offscreen/webm-playback.test.ts`; they assert against the stale Plan 07 fixture (committed in fix-a3 commit 5) and stay RED until the operator regenerates it. The production-driven RED block in `tests/offscreen/segment-keyframes.test.ts` is fully GREEN.
- `npx tsc --noEmit` → clean.
- `npm run build` → succeeds; all 60 modules transformed.
- `! grep -RIn "as any\|@ts-ignore" src/offscreen src/background src/shared` → clean (zero new occurrences in fix scope).
- `! grep -RIn "addChunk\|trimAged\|firstChunkSaved\|isFirst" src/` → clean (old API fully retired).
- `grep -c "getSegments" src/offscreen/recorder.ts` → 2 (export + JSDoc citation).
- 8 new tests in `tests/offscreen/segment-rotation.test.ts` pin the new ring-buffer invariants in place of the retired `ring-buffer.test.ts` first-chunk-pin assertions.
**Operator action required to close §10 #7:** Re-run `./smoke.sh` per the 6-step reproduction. The smoke script regenerates `tests/fixtures/last_30sec.webm` against the D-13 recorder. Then:
1. `npx vitest run tests/offscreen/webm-playback.test.ts` — both assertions should flip GREEN.
2. Open the regenerated `last_30sec.webm` in Chrome's built-in player — should play end-to-end (30 s, no freeze).
3. `/usr/bin/ffmpeg -v warning -i tests/fixtures/last_30sec.webm -f null -` — should produce empty stderr.
Once these three checks pass, Plan 07's REQ-video-ring-buffer completion gate is closed and Phase 1 can be marked complete.
**Files changed (5):**
- `src/offscreen/recorder.ts` — D-13 activation (the main rewrite)
- `src/background/index.ts` — segment-semantics adaptation + type renames
- `src/shared/types.ts` — rename + field drop
- `tests/offscreen/ring-buffer.test.ts` — retired (vestigial breadcrumb)
- `tests/offscreen/segment-rotation.test.ts` (new) — pins D-13 invariants
**Commits (6 in fix-a3 cycle on `gsd/phase-01-stabilize-video-pipeline`):** 5530292, 6a1a034, 670daa3, f81438d, 87909d9, and the docs commit landing this resolution.
## Activation Plan (for executor — Plan 01-07 amendment or new Plan 01-08)
**Scope:** ≤5 files. Recommend `/gsd-execute-phase` continuation with a focused executor task, NOT `/gsd-insert-phase 1.1` — the architecture (MediaRecorder, base64 wire format, port keepalive) is unchanged; only the recorder *lifecycle* shape rotates.
1. **`src/offscreen/recorder.ts`** — primary edit:
   * Remove `firstChunkSaved`, `addChunk`'s `isFirst` flag-pin logic, the header-pinning branch in `trimAged`.
   * Introduce `segments: Blob[]` and `currentChunks: Blob[]` at module scope.
   * Introduce `SEGMENT_MS = 10_000` and `MAX_SEGMENTS = 3` constants.
   * On `START_RECORDING`: after the first `videoRecorder.start()`, schedule `setTimeout(rotateSegment, SEGMENT_MS)`.
   * `rotateSegment()` calls `videoRecorder?.stop()`. Set `videoRecorder.onstop = onSegmentStopped`.
   * `onSegmentStopped()`: assemble `currentChunks` into a Blob, push to `segments`, shift if over `MAX_SEGMENTS`, reset `currentChunks`, re-construct `MediaRecorder` on the same `mediaStream`, re-attach `ondataavailable`/`onstop`, call `.start()`, schedule next `rotateSegment` via `setTimeout`.
   * `ondataavailable`: push `event.data` to `currentChunks` (no more `addChunk`).
   * Add **export** `getSegments(): Blob[]` — returns `[...segments, ...(currentChunks.length > 0 ? [new Blob(currentChunks, { type: 'video/webm' })] : [])]` so an in-flight current segment is also exposed (otherwise SAVE_ARCHIVE during a fresh session would return empty until the first rotation).
   * Update `encodeAndSendBuffer()` to iterate segments instead of chunks; each `TransferredVideoChunk` becomes one self-contained per-segment base64 entry (timestamp = segment start ms; isFirst meaningless — drop or repurpose for `segmentIndex`).
   * Add `STOP_RECORDING` cleanup: clear the rotation timer + reset `segments` + `currentChunks` on `resetBuffer()`.
2. **`src/background/index.ts`** — `mergeVideoChunks` simplifies: each "chunk" is now already a complete self-contained WebM segment; concatenation gives a multi-EBML-header file. **No SeekHead / Cues injection needed** (Chrome's MSE pipeline handles multi-segment WebMs). Update the function name to `mergeVideoSegments` for clarity (and the log lines).
3. **`src/shared/types.ts`** — clarify `TransferredVideoChunk` doc comment to note that under D-13 each entry represents one self-contained WebM segment. Optionally rename to `TransferredVideoSegment` (cosmetic but reduces future confusion). If renamed, update `port-serialization.test.ts` references.
4. **`tests/offscreen/ring-buffer.test.ts`** — the existing tests pin D-09..D-11 semantics (first-chunk-pin, header retention via `isFirst`). Either:
   * Replace with `tests/offscreen/segment-rotation.test.ts` that exercises the new segment-based ring buffer (preferred — the old tests are obsolete invariants), OR
   * Keep ring-buffer.test.ts but delete the `isFirst`-pin assertions and rewrite around segment cadence.

   The `segment-keyframes.test.ts` production-driven block (the RED one) becomes GREEN once `getSegments` is exported.
5. **Smoke regen + commit fixture:** After the source edits land and `npm test` is GREEN (all 25 tests pass), regenerate `tests/fixtures/last_30sec.webm` via `./smoke.sh` per the documented 6-step reproduction, then commit the fresh fixture in the same commit as the source edits. The empirical `webm-playback.test.ts` only flips GREEN after the regeneration.
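Steps 1 and 2 hinge on two small operations: exposing the in-flight segment alongside completed ones, and concatenating segments on the SW side. The sketch below is a simplified rendering, not the shipped code. Unlike the planned module-scope `getSegments()`, this version takes its state as parameters so it can run standalone, and `VIDEO_MIME` is an assumed constant name.

```typescript
const VIDEO_MIME = "video/webm";

// Completed segments plus, if any chunks are buffered, one provisional segment,
// so a save request before the first rotation does not come back empty.
function getSegments(segments: Blob[], currentChunks: Blob[]): Blob[] {
  const inFlight =
    currentChunks.length > 0 ? [new Blob(currentChunks, { type: VIDEO_MIME })] : [];
  return [...segments, ...inFlight];
}

// Plain byte-wise concatenation: each input keeps its own EBML header, so the
// result is a multi-EBML-header file with no SeekHead / Cues rewriting.
function mergeVideoSegments(segments: Blob[]): Blob {
  return new Blob(segments, { type: VIDEO_MIME });
}

// Usage with placeholder payloads standing in for real WebM bytes.
const completed = [
  new Blob(["aaaa"], { type: VIDEO_MIME }),
  new Blob(["bb"], { type: VIDEO_MIME }),
];
const buffered = [new Blob(["c"])];

const exposed = getSegments(completed, buffered);
const merged = mergeVideoSegments(exposed);
console.log(exposed.length, merged.size, merged.type);
```

Note that the provisional segment is assembled without a recorder `.stop()`, so in a real recording it may lack the finalization a rotated segment gets.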
**Validation gates:**
- `npm test` → 25/25 pass (all new RED tests GREEN + all pre-existing).
- `npx tsc --noEmit` → clean.
- Manual smoke per the reproduction steps → file plays end-to-end in Chrome's built-in player.
- `/usr/bin/ffmpeg -v warning -i tests/fixtures/last_30sec.webm -f null -` → empty stderr (no "Error submitting packet" lines, no "File ended prematurely" line).
**Phase 1 decision retirement:** D-09, D-10, D-11 are retired in favor of D-13. The Phase 1 CONTEXT.md or a new SUMMARY note should record this transition explicitly. RESEARCH.md A3 moves from `HIGH-risk — mitigated by D-12 gate + D-13 fallback (pre-staged)` to `VERIFIED-FAILED — mitigated by D-13 activation in Plan 01-08`.
## Process observation (for GSD framework feedback)
This is the SECOND debug session in Phase 1's life (first: `d12-blob-port-transfer-fails`). Both were issues that the planner explicitly anticipated and pre-staged contingencies for (D-12 ffprobe gate + base64 wire-format research; D-13 restart-segments skeleton). Neither was a planning oversight — both were "the documented HIGH-risk assumption activated as expected." The cycle latency between "manual smoke reveals the issue" and "RED test in place" was ~30 minutes for d12 and ~15 minutes for this session, which suggests the pre-staging strategy is working: contingencies are findable, activatable, and reviewable.
**Pattern worth raising:** When RESEARCH.md flags an assumption as HIGH-risk AND the plan pre-stages a fallback, the executor's smoke-test step (Plan 01-07) should probably *also* be the moment to evaluate "does the simple approach pass the empirical gate or do we need to land the fallback before merging the phase?" — i.e. the smoke step is an A/B gate, not a unilateral confirmation. The current sequence (Plan 01 → 02 → ... → 07 = smoke → debug session if smoke fails) works, but a slightly tighter feedback loop in Plan 07's checklist ("if smoke reveals a HIGH-risk-A3-class issue, escalate to the pre-staged fallback BEFORE creating a debug session") might shorten the orchestration overhead for future phases.
Not a process bug — a possible process refinement. Logging for `/gsd-plan-phase` retro consideration in Phase 2 or beyond.