mokosh/.planning/debug/resolved/webm-playback-freeze.md
Mark 872f25d649 docs(fix-a3): resolve webm-playback-freeze debug session, update STATE
Closes the second debug session in Phase 1's life (after d12). Both
sessions resolved fast — ~30 min for d12, ~15 min for the RED-test
landing in this one — because the planner had explicitly pre-staged
contingencies (D-12 ffprobe gate + D-13 restart-segments skeleton)
for the assumptions RESEARCH.md flagged HIGH-risk. Neither was a
planning oversight; both were the documented HIGH-risk assumption
activating as expected.

Changes:
- Moved .planning/debug/webm-playback-freeze.md →
  .planning/debug/resolved/webm-playback-freeze.md (status:
  root-cause-confirmed → resolved).
- Added the Resolution section: root-cause one-liner, applied-fix
  description, the 5 files-changed list, the 6 fix-a3 commit hashes,
  the in-tree verification matrix, and the explicit operator
  next-step (re-run ./smoke.sh, verify Chrome playback +
  ffmpeg-clean stderr + the 2 webm-playback.test.ts assertions
  flipping GREEN, then Phase 1 closes).
- Updated STATE.md frontmatter `stopped_at`, the Decisions log
  with a [Phase 01-07-debug-a3] entry summarising D-13 activation
  + the type renames + the retired old-API surface, and the
  Session Continuity block (timestamp, stopped_at narrative,
  resume-file pointer).

Phase 1 close is still pending operator regen of
tests/fixtures/last_30sec.webm. REQ-video-ring-buffer must not
be marked complete by this commit — Plan 07's §10 #7 acceptance
criterion owns that and only the in-Chrome playback + ffmpeg-clean
stderr (against a freshly regenerated fixture) can close it.
2026-05-15 21:18:36 +02:00


slug: webm-playback-freeze
status: resolved
trigger: Phase 1 A3 cluster-alignment failure — last_30sec.webm freezes ~1 s into playback in Chrome despite ffprobe structural validation passing. Surfaced during /gsd-execute-phase 1 Plan 01-07 manual smoke retest after the D-12 binary-transfer fix landed.
created: 2026-05-15
updated: 2026-05-15
resolved: 2026-05-15
phase: 1
plan: 01-07
related_resolved: d12-blob-port-transfer-fails
resolution_commits:
  • 5530292 feat(fix-a3): retire ring-buffer first-chunk pin tests, add segment-rotation contract
  • 6a1a034 feat(fix-a3): activate D-13 restart-segments in src/offscreen/recorder.ts
  • 670daa3 feat(fix-a3): adapt SW receive path to segment semantics
  • f81438d feat(fix-a3): rename TransferredVideoChunk → TransferredVideoSegment
  • 87909d9 test(fix-a3): commit debug-session test artifacts + stale fixture

Debug session — WebM playback freeze (A3 cluster alignment)

Symptoms

  • Expected: tests/fixtures/last_30sec.webm plays end-to-end (~30 s of video) in Chrome's built-in player. SPEC §10 #7 acceptance criterion: "the archive opens, last_30sec.webm plays back in the browser".
  • Actual: The file is 2.1 MB of valid VP9 stream metadata (ffprobe passes structural validation; D-12 gate green). When opened in Chrome, playback FREEZES ~1 s in. Decoder cannot continue past the early frames.
  • Error messages (from ffmpeg -v warning -i tests/fixtures/last_30sec.webm -f null -):
    • [vist#0:0/vp9 @ ...] [dec:vp9 @ ...] Error submitting packet to decoder: Invalid data found when processing input ×8
    • [in#0/matroska,webm @ ...] File ended prematurely at pos. 2100851 (0x200e73)
  • Timeline: First playback test after the D-12 base64-transfer fix landed (commits c0d9166..bf07619). The fix made the WebM container valid; this is the next-layer failure that the prior session masked.
  • Reproduction:
    1. Build is current (commit bf07619 on gsd/phase-01-stabilize-video-pipeline).
    2. ./smoke.sh (KEEP_PROFILE=1 — extension already loaded). Reload extension at chrome://extensions.
    3. Click extension → wait ~35 s → click "Сохранить отчёт об ошибке" ("Save error report").
    4. unzip -p ~/Downloads/session_report_2026-05-15_20-28-58.zip video/last_30sec.webm > /tmp/last_30sec.webm
    5. ffprobe -v error -f matroska -i /tmp/last_30sec.webm; echo $? → exit 0 (D-12 passes)
    6. Open the WebM in Chrome → playback freezes ~1 s in.

Evidence already collected

  • timestamp: 2026-05-15T20:35Z — Keyframe distribution map via ffprobe -select_streams v:0 -show_frames -show_entries frame=key_frame,pts_time:
    • Keyframes at pts_time: 0.000, 0.029, 0.095, then a 26.4-second gap with NO keyframes, then 26.474, 29.843, 33.209, 36.577, 39.945, ... (regular ~3 s cadence)
    • First ~50 packets after t=0.095s are all P-frames (___ flag, no keyframe)
  • timestamp: 2026-05-15T20:35Z — ffmpeg -v warning -i ... -f null - decode dry-run: 8× Error submitting packet to decoder: Invalid data found + File ended prematurely at pos. 2100851. Empirical proof that decoding fails partway through.
  • timestamp: 2026-05-15T20:30Z — Container validity confirmed by ffprobe -show_streams: VP9 codec, profile 0, 912×886, valid color metadata (bt709), start_pts=0, time_base=1/1000. The container is structurally valid; the content is not decodable end-to-end.
  • timestamp: 2026-05-15T20:30Z — Fixture committed at tests/fixtures/last_30sec.webm (2.1 MB) by Plan 07's executor before the playback freeze was discovered. This fixture IS the reproduction case.
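The keyframe gap in the map above can be computed mechanically. The following is a minimal sketch, assuming the keyframe pts_time values have already been extracted from the ffprobe output into an array; the helper name largestKeyframeGap and the KeyframeGap type are hypothetical and not part of the repository.

```typescript
// Keyframe timestamps (seconds) copied from the ffprobe keyframe map above.
const keyframePts: number[] = [0.0, 0.029, 0.095, 26.474, 29.843, 33.209, 36.577, 39.945];

// Hypothetical result shape: the two keyframes bounding the gap and its width.
interface KeyframeGap {
  from: number;
  to: number;
  seconds: number;
}

// Returns the widest interval between two consecutive keyframes,
// or null when fewer than two keyframes are given.
function largestKeyframeGap(pts: number[]): KeyframeGap | null {
  const sorted = [...pts].sort((a, b) => a - b);
  let widest: KeyframeGap | null = null;
  for (let i = 1; i < sorted.length; i++) {
    const from = sorted[i - 1];
    const to = sorted[i];
    const seconds = to - from;
    if (widest === null || seconds > widest.seconds) {
      widest = { from, to, seconds };
    }
  }
  return widest;
}

const widestGap = largestKeyframeGap(keyframePts);
console.log(widestGap);
```

For the values listed above, the widest gap runs from 0.095 s to 26.474 s, which is the roughly 26.4-second stretch with no keyframes.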

Current Focus

  • hypothesis: The ring-buffer trim removes chunks containing P-frames that subsequent retained chunks depend on. MediaRecorder.start(2000) emits chunks at the 2 s timeslice but does NOT force a keyframe at each chunk boundary; VP9's kf_max_dist default places keyframes every ~35 s (bugzilla #1666487 cited in RESEARCH.md). So most "later" chunks contain only P-frames whose reference frames are in earlier (trimmed) chunks. Concretely: chunk 1 contains a keyframe + ~0.1 s of frames; the ring buffer keeps chunk 1 (header retention per D-10) plus the most recent 30 s of chunks. But the keyframe needed for the retained recent chunks lives in trimmed-out middle chunks, so decoding hits a wall just past chunk 1's end.
  • Secondary cause: The WebM lacks proper MediaRecorder.stop() finalization (no Cues/SegmentSize markers) because the SW reads the in-memory buffer mid-stream without stopping the recorder. Hence "File ended prematurely". This compounds the freeze but is not the root cause; even with proper finalization, the keyframe gap would still break playback.
  • next_action: RED tests have landed (see Evidence below). Hand off to executor for D-13 activation per the Resolution / Activation Plan section below.
  • expecting: RED today on (a) empirical fixture decode and (b) production getSegments API. D-13 activation + fresh fixture regeneration flips both GREEN.
  • reasoning_checkpoint: A3 was explicitly flagged HIGH-risk in RESEARCH.md and D-13 was specifically pre-staged for this. The keyframe map empirically matches the predicted failure exactly. This is NOT a "we missed it" situation — it's "the documented contingency activated as expected." The RED tests are landed first before any source edit per TDD discipline + the GSD-ceremony feedback the user gave earlier in this session (no hot-fixes).
  • specialist_hint: chrome-extension-mv3 — the fix lives in the MediaRecorder lifecycle in the offscreen document; the format constraints come from VP9/WebM/Matroska spec. There is no language-specialist agent for this in the current dispatcher table, so engineering:debug or a manual review path is appropriate.
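The hypothesis can be illustrated with a pure simulation, in the spirit of the documentation block described under Evidence. This is a sketch under simplifying assumptions, not the repository's test code: keyframes are tracked per 2 s chunk rather than per frame, the keyframe times (0 s and 35 s) are illustrative values based on the ~35 s kf_max_dist figure, and all names (SimChunk, recordChunks, trimWithPinnedHeader, undecodableChunks) are hypothetical.

```typescript
// One simulated MediaRecorder chunk; keyframe presence is tracked per chunk.
interface SimChunk {
  index: number;
  startMs: number;
  hasKeyframe: boolean;
}

const TIMESLICE_MS = 2_000; // matches MediaRecorder.start(2000)
const BUFFER_MS = 30_000;   // age-trim window of the ring buffer

// Emits one chunk per timeslice and marks the chunks that contain a keyframe.
function recordChunks(durationMs: number, keyframeTimesMs: number[]): SimChunk[] {
  const chunks: SimChunk[] = [];
  for (let startMs = 0, index = 0; startMs < durationMs; startMs += TIMESLICE_MS, index++) {
    const hasKeyframe = keyframeTimesMs.some(
      (t) => t >= startMs && t < startMs + TIMESLICE_MS,
    );
    chunks.push({ index, startMs, hasKeyframe });
  }
  return chunks;
}

// D-09..D-11 style trim: pin the first chunk, keep chunks younger than BUFFER_MS.
function trimWithPinnedHeader(chunks: SimChunk[], nowMs: number): SimChunk[] {
  return chunks.filter((c) => c.index === 0 || c.startMs >= nowMs - BUFFER_MS);
}

// A retained chunk decodes only if it carries a keyframe, or if it directly
// follows a retained chunk that itself decodes.
function undecodableChunks(retained: SimChunk[]): number[] {
  const broken: number[] = [];
  let previousIndex = -2;
  let previousDecodable = false;
  for (const chunk of retained) {
    const contiguous = chunk.index === previousIndex + 1;
    const decodable = chunk.hasKeyframe || (contiguous && previousDecodable);
    if (!decodable) broken.push(chunk.index);
    previousIndex = chunk.index;
    previousDecodable = decodable;
  }
  return broken;
}

const recorded = recordChunks(60_000, [0, 35_000]);
const retained = trimWithPinnedHeader(recorded, 60_000);
console.log(undecodableChunks(retained));
```

In this run the pinned chunk 0 decodes, but the first retained recent chunks (15 and 16) carry only P-frames whose references were trimmed, so decoding stalls right after the pinned chunk until the next keyframe arrives.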

Pre-existing fix material (D-13 skeleton)

Per Phase 1 CONTEXT.md decisions D-13 + Plan 01-03's SUMMARY, a commented-out restart-segments skeleton already lives at the bottom of src/offscreen/recorder.ts (lines 298-316). The activation plan needs to:

  1. Replace the single-continuous-MediaRecorder lifecycle with a segment-based one (stop+restart every ~10 s on the same MediaStream)
  2. Keep the last 3 segments in memory (3 × 10 s = 30 s)
  3. Drop D-09..D-11's first-chunk-pin logic (obsolete under restart-segments — each segment is self-contained, has its own header)
  4. Reuse the D-12 base64 wire-format per-segment for the 3 segments
  5. SW concatenates 3 self-contained WebMs (multi-EBML-header file; Chrome handles this; spec §10 #7 only requires it plays in a browser, so Chrome's acceptance is sufficient)
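Steps 1 and 2 amount to a small bounded queue of finished segments. The sketch below shows only that eviction rule, assuming segments are represented as plain byte arrays instead of Blobs so it runs outside the browser; the SegmentRing class and Segment type are hypothetical, while MAX_SEGMENTS and SEGMENT_DURATION_MS reuse the values named in this document.

```typescript
const MAX_SEGMENTS = 3;
const SEGMENT_DURATION_MS = 10_000;

// Hypothetical stand-in for one self-contained ~10 s WebM segment.
interface Segment {
  startMs: number;
  bytes: Uint8Array;
}

// Keeps only the newest MAX_SEGMENTS segments; older ones are evicted whole,
// so every retained segment still starts with its own header and keyframe.
class SegmentRing {
  private segments: Segment[] = [];

  push(segment: Segment): void {
    this.segments.push(segment);
    while (this.segments.length > MAX_SEGMENTS) {
      this.segments.shift();
    }
  }

  // Returns a copy so callers cannot mutate the ring.
  snapshot(): Segment[] {
    return [...this.segments];
  }

  // Upper bound on the video duration covered by the finished segments.
  coveredMs(): number {
    return this.segments.length * SEGMENT_DURATION_MS;
  }
}

const ring = new SegmentRing();
for (let i = 0; i < 5; i++) {
  ring.push({ startMs: i * SEGMENT_DURATION_MS, bytes: new Uint8Array([i]) });
}
console.log(ring.snapshot().map((s) => s.startMs), ring.coveredMs());
```

Because eviction happens at segment granularity, there is no partial trim that could separate P-frames from the keyframe they depend on.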

Out of scope for this session

  • Playback in players other than Chrome. SPEC §10 #7 only requires Chrome playback. VLC / mpv may handle multi-EBML-header WebMs differently. Not a Phase 1 concern.
  • Audio capture. Phase 2 / SPEC §9.
  • The "File ended prematurely" finalization gap. Restart-segments solves it as a side effect (each segment gets a proper .stop()). No separate fix needed.

Evidence

  • timestamp: 2026-05-15T20:38Z — RED test #1 landed: tests/offscreen/webm-playback.test.ts. Two assertions:
    • ffmpeg dry-run on last_30sec.webm produces zero decoder packet errors: FAILS with expected 1 to be 0. The count is 1 because ffmpeg condenses repeats: the single error line plus its "last message repeated 7 times" line stand for 8 actual events.
    • ffmpeg dry-run on last_30sec.webm does not end prematurely: FAILS with expected true to be false. Both failures cite the exact ffmpeg stderr that originally surfaced the bug, so a regression bisect lands on a useful diff. Skip-fence via it.skipIf(!ffmpegAvailable()) so CI environments without ffmpeg auto-skip rather than fail.
  • timestamp: 2026-05-15T20:40Z — RED test #2 landed: tests/offscreen/segment-keyframes.test.ts. Three describe blocks:
    • documentation block — pure-simulation tests that pass today, encode the D-09..D-11 failure mode as executable evidence (regression guard against re-introducing the single-continuous-recorder semantics post-fix).
    • GREEN-pinning block — pure-simulation tests that pin the D-13 segment-keyframe invariant; pass today as a forward contract for the fix reviewer.
    • production-driven RED block — imports src/offscreen/recorder.ts and asserts (i) getSegments is exported as a function, (ii) it returns at most 3 Blobs. FAILS today (the export does not exist); flips GREEN when D-13 is activated and a getSegments export is added.
  • timestamp: 2026-05-15T20:40Z — Full vitest run: 4 failed | 21 passed (25 total). Pre-existing 15/15 tests still pass; the 4 failures are exactly the new RED tests above (2 in webm-playback, 2 in segment-keyframes). npx tsc --noEmit passes without diagnostics — the new tests are type-clean.
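The relationship between condensed ffmpeg stderr lines and actual decoder events can be sketched as a small parser. This is an illustrative helper, not the code in webm-playback.test.ts: the function names countDecoderErrors and endedPrematurely are hypothetical, and the sample stderr is reconstructed from the messages quoted in this document, with the elided "@ ..." addresses left as-is.

```typescript
// Counts actual "Error submitting packet to decoder" events, expanding
// ffmpeg's "Last message repeated N times" summary line that follows them.
function countDecoderErrors(stderr: string): number {
  let total = 0;
  let lastWasDecoderError = false;
  for (const line of stderr.split(/\r?\n/)) {
    const repeat = /Last message repeated (\d+) times/i.exec(line);
    if (repeat) {
      // The summary refers to the previous message; count it only if that
      // message was a decoder error.
      if (lastWasDecoderError) total += Number(repeat[1]);
      continue;
    }
    lastWasDecoderError = line.includes("Error submitting packet to decoder");
    if (lastWasDecoderError) total += 1;
  }
  return total;
}

// True when the demuxer reported a truncated file.
function endedPrematurely(stderr: string): boolean {
  return stderr.includes("File ended prematurely");
}

// Sample reconstructed from the messages quoted in this document.
const sampleStderr = [
  "[vist#0:0/vp9 @ ...] [dec:vp9 @ ...] Error submitting packet to decoder: Invalid data found when processing input",
  "    Last message repeated 7 times",
  "[in#0/matroska,webm @ ...] File ended prematurely at pos. 2100851 (0x200e73)",
].join("\n");

console.log(countDecoderErrors(sampleStderr), endedPrematurely(sampleStderr));
```

A gate that requires an empty stderr needs neither helper, but the expanded count makes a failure report match the "×8" figure from the Symptoms section.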

Eliminated

  • Container corruption due to base64-transfer wire format. Already fixed by the d12 session; ffprobe -show_streams shows valid VP9, 912×886, bt709 metadata. Container is well-formed; payload semantics are the failure.
  • MIME-type misdetection on the SW side. merged.type === 'video/webm' is enforced by mergeVideoChunks; the SW's base64ToBlob(wire.data, wire.type || VIDEO_MIME_FALLBACK) round-trips correctly per the GREEN-pinning block of tests/offscreen/port-serialization.test.ts.
  • Chunk ordering bug. mergeVideoChunks sorts by timestamp before concatenation; the keyframe-map shows monotonically increasing pts_time after the gap, ruling out a sort-order issue.
  • Audio interference. getDisplayMedia({ video: true, audio: false }) — no audio track exists to interleave.
  • VP9 codec misconfiguration. videoBitsPerSecond: 400_000 + mimeType: 'video/webm;codecs=vp9' is the Chrome-supported config (codec-check test asserts MediaRecorder.isTypeSupported('video/webm;codecs=vp9') === true).

Resolution

Root cause: Single continuous MediaRecorder + 30 s age-trim ring buffer (D-09..D-11) loses VP9 keyframe references when chunks in the middle of the recording are evicted. The pinned first chunk's keyframe anchors only the first ~0.1 s; every subsequent retained chunk's P-frames reference keyframes that lived in trimmed chunks. Chrome's decoder fails the moment it has to render a frame whose I-frame predecessor is missing — observed empirically as freeze at ~1 s of playback. Secondary issue: mid-stream buffer read without MediaRecorder.stop() means Matroska SegmentSize / Cues are never written, producing the File ended prematurely line; D-13's per-segment .stop() finalizes this naturally.

Fix applied (2026-05-15): Activated the pre-staged D-13 restart-segments skeleton in src/offscreen/recorder.ts. The recorder lifecycle was replaced: every SEGMENT_DURATION_MS = 10_000 ms the recorder calls .stop() (which finalizes the segment naturally); onstop assembles currentChunks into one self-contained ~10 s WebM Blob, pushes it to segments, evicts the oldest if over MAX_SEGMENTS = 3, and constructs a fresh MediaRecorder on the SAME mediaStream, which preserves the user gesture and seeds a new EBML header + initial VP9 keyframe in the new segment. SW-side mergeVideoSegments concatenates the segments sequentially; Chrome plays multi-EBML-header WebMs natively (SPEC §10 #7 scope). The retired D-09..D-11 API (addChunk, trimAged, getBuffer, firstChunkSaved, isFirst) was deleted in the same atomic commits; the new public API surface is getSegments, pushSegmentForTest, resetBuffer, MAX_SEGMENTS, SEGMENT_DURATION_MS, VIDEO_BUFFER_DURATION_MS, assertCodecSupported. Types renamed: TransferredVideoChunk → TransferredVideoSegment, VideoChunk → VideoSegment, PortMessage.chunks → PortMessage.segments, VideoBufferResponse.chunks → VideoBufferResponse.segments. The isFirst header-pin field was dropped entirely, since it is meaningless under D-13.
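The SW-side concatenation can be sketched on raw bytes. This is not the repository's mergeVideoSegments: the WireSegment type, mergeSegmentBytes, and countEbmlHeaders are hypothetical, sorting by timestamp before concatenating is an assumption carried over from the old mergeVideoChunks behaviour, and the fake segments contain only the EBML magic number (0x1A45DFA3) plus one marker byte.

```typescript
// Hypothetical decoded form of one transferred segment.
interface WireSegment {
  timestamp: number; // segment start, ms
  bytes: Uint8Array; // one self-contained WebM file
}

// Concatenates segments in timestamp order into one multi-EBML-header buffer.
function mergeSegmentBytes(segments: WireSegment[]): Uint8Array {
  const ordered = [...segments].sort((a, b) => a.timestamp - b.timestamp);
  const total = ordered.reduce((sum, s) => sum + s.bytes.length, 0);
  const merged = new Uint8Array(total);
  let offset = 0;
  for (const segment of ordered) {
    merged.set(segment.bytes, offset);
    offset += segment.bytes.length;
  }
  return merged;
}

// EBML header ID that opens every WebM/Matroska file.
const EBML_MAGIC = [0x1a, 0x45, 0xdf, 0xa3];

// Naive scan for the magic number. On real video data the same four bytes
// can also occur inside a payload, so treat the result as a rough check only.
function countEbmlHeaders(bytes: Uint8Array): number {
  let count = 0;
  for (let i = 0; i + EBML_MAGIC.length <= bytes.length; i++) {
    if (EBML_MAGIC.every((b, k) => bytes[i + k] === b)) count++;
  }
  return count;
}

// Fake segments: magic number plus one marker byte, given out of order.
const fakeSegments: WireSegment[] = [
  { timestamp: 20_000, bytes: new Uint8Array([...EBML_MAGIC, 0x03]) },
  { timestamp: 0, bytes: new Uint8Array([...EBML_MAGIC, 0x01]) },
  { timestamp: 10_000, bytes: new Uint8Array([...EBML_MAGIC, 0x02]) },
];

const mergedBytes = mergeSegmentBytes(fakeSegments);
console.log(mergedBytes.length, countEbmlHeaders(mergedBytes));
```

The merged buffer carries one EBML header per segment, which is the multi-EBML-header layout described above.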

Verification (in-tree):

  • npx vitest run → 28 passed / 2 failed. The two reds are the empirical ffmpeg dry-runs in tests/offscreen/webm-playback.test.ts; they assert against the stale Plan 07 fixture (committed in fix-a3 commit 5) and stay RED until the operator regenerates it. The production-driven RED block in tests/offscreen/segment-keyframes.test.ts is fully GREEN.
  • npx tsc --noEmit → clean.
  • npm run build → succeeds; all 60 modules transformed.
  • ! grep -RIn "as any\|@ts-ignore" src/offscreen src/background src/shared → clean (zero new occurrences in fix scope).
  • ! grep -RIn "addChunk\|trimAged\|firstChunkSaved\|isFirst" src/ → clean (old API fully retired).
  • grep -c "getSegments" src/offscreen/recorder.ts → 2 (export + JSDoc citation).
  • 8 new tests in tests/offscreen/segment-rotation.test.ts pin the new ring-buffer invariants in place of the retired ring-buffer.test.ts first-chunk-pin assertions.

Operator action required to close §10 #7: Re-run ./smoke.sh per the 6-step reproduction. The smoke script regenerates tests/fixtures/last_30sec.webm against the D-13 recorder. Then:

  1. npx vitest run tests/offscreen/webm-playback.test.ts — both assertions should flip GREEN.
  2. Open the regenerated last_30sec.webm in Chrome's built-in player — should play end-to-end (30 s, no freeze).
  3. /usr/bin/ffmpeg -v warning -i tests/fixtures/last_30sec.webm -f null - — should produce empty stderr.

Once these three checks pass, Plan 07's REQ-video-ring-buffer completion gate is closed and Phase 1 can be marked complete.

Files changed (5):

  • src/offscreen/recorder.ts — D-13 activation (the main rewrite)
  • src/background/index.ts — segment-semantics adaptation + type renames
  • src/shared/types.ts — rename + field drop
  • tests/offscreen/ring-buffer.test.ts — retired (vestigial breadcrumb)
  • tests/offscreen/segment-rotation.test.ts (new) — pins D-13 invariants

Commits (6 in fix-a3 cycle on gsd/phase-01-stabilize-video-pipeline): 5530292, 6a1a034, 670daa3, f81438d, 87909d9, and the docs commit landing this resolution.

Activation Plan (for executor — Plan 01-07 amendment or new Plan 01-08)

Scope: ≤5 files. Recommend /gsd-execute-phase continuation with a focused executor task, NOT /gsd-insert-phase 1.1 — the architecture (MediaRecorder, base64 wire format, port keepalive) is unchanged; only the recorder lifecycle shape rotates.

  1. src/offscreen/recorder.ts — primary edit:

    • Remove firstChunkSaved, addChunk's isFirst flag-pin logic, the header-pinning branch in trimAged.
    • Introduce segments: Blob[] and currentChunks: Blob[] at module scope.
    • Introduce SEGMENT_MS = 10_000 and MAX_SEGMENTS = 3 constants.
    • On START_RECORDING: after the first videoRecorder.start(), schedule setTimeout(rotateSegment, SEGMENT_MS).
    • rotateSegment() calls videoRecorder?.stop(). Set videoRecorder.onstop = onSegmentStopped.
    • onSegmentStopped(): assemble currentChunks into a Blob, push to segments, shift if over MAX_SEGMENTS, reset currentChunks, re-construct MediaRecorder on the same mediaStream, re-attach ondataavailable/onstop, call .start(), schedule next rotateSegment via setTimeout.
    • ondataavailable: push event.data to currentChunks (no more addChunk).
    • Add export getSegments(): Blob[] — returns [...segments, ...(currentChunks.length > 0 ? [new Blob(currentChunks, { type: 'video/webm' })] : [])] so an in-flight current segment is also exposed (otherwise SAVE_ARCHIVE during a fresh session would return empty until the first rotation).
    • Update encodeAndSendBuffer() to iterate segments instead of chunks; each TransferredVideoChunk becomes one self-contained per-segment base64 entry (timestamp = segment start ms; isFirst meaningless — drop or repurpose for segmentIndex).
    • Add STOP_RECORDING cleanup: clear the rotation timer + reset segments + currentChunks on resetBuffer().
  2. src/background/index.ts: mergeVideoChunks simplifies, since each "chunk" is now already a complete self-contained WebM segment; concatenation gives a multi-EBML-header file. No SeekHead / Cues injection needed (Chrome's MSE pipeline handles multi-segment WebMs). Rename the function to mergeVideoSegments for clarity (and update the log lines).

  3. src/shared/types.ts — clarify TransferredVideoChunk doc comment to note that under D-13 each entry represents one self-contained WebM segment. Optionally rename to TransferredVideoSegment (cosmetic but reduces future confusion). If renamed, update port-serialization.test.ts references.

  4. tests/offscreen/ring-buffer.test.ts — the existing tests pin D-09..D-11 semantics (first-chunk-pin, header retention via isFirst). Either:

    • Replace with tests/offscreen/segment-rotation.test.ts that exercises the new segment-based ring buffer (preferred — the old tests are obsolete invariants), OR
    • Keep ring-buffer.test.ts but delete the isFirst-pin assertions and rewrite around segment cadence. The segment-keyframes.test.ts production-driven block (the RED one) becomes GREEN once getSegments is exported.
  5. Smoke regen + commit fixture: After the source edits land and npm test is GREEN apart from the two empirical webm-playback.test.ts assertions, regenerate tests/fixtures/last_30sec.webm via ./smoke.sh per the documented 6-step reproduction, then commit the fresh fixture in the same commit as the source edits. The empirical webm-playback.test.ts only flips GREEN after the regeneration, at which point all 25 tests pass.
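The recorder lifecycle from step 1 can be sketched with an injected recorder so it runs without a browser. Everything here is an illustration under stated assumptions, not the contents of src/offscreen/recorder.ts: RecorderLike, SegmentRotator, and FakeRecorder are hypothetical, the fake fires onstop synchronously (a real MediaRecorder delivers its final data and stop event asynchronously), segments are kept as chunk arrays rather than Blobs, and rotation is driven by explicit rotateSegment() calls where the plan schedules it via setTimeout.

```typescript
type Chunk = Uint8Array;

// Minimal subset of the MediaRecorder surface this sketch relies on.
interface RecorderLike {
  ondataavailable: ((data: Chunk) => void) | null;
  onstop: (() => void) | null;
  start(): void;
  stop(): void;
}

const MAX_SEGMENTS = 3;

class SegmentRotator {
  private segments: Chunk[][] = [];
  private currentChunks: Chunk[] = [];
  private recorder: RecorderLike | null = null;
  private createRecorder: () => RecorderLike;

  constructor(createRecorder: () => RecorderLike) {
    this.createRecorder = createRecorder;
  }

  // Builds a fresh recorder (on the same stream, in the real code) and starts it.
  start(): void {
    const recorder = this.createRecorder();
    recorder.ondataavailable = (data) => {
      this.currentChunks.push(data);
    };
    recorder.onstop = () => this.onSegmentStopped();
    this.recorder = recorder;
    recorder.start();
  }

  // Stopping finalizes the current segment; onstop performs the restart.
  rotateSegment(): void {
    if (this.recorder) this.recorder.stop();
  }

  private onSegmentStopped(): void {
    if (this.currentChunks.length > 0) this.segments.push(this.currentChunks);
    while (this.segments.length > MAX_SEGMENTS) this.segments.shift();
    this.currentChunks = [];
    this.start();
  }

  // Finished segments plus the in-flight one, if it has any data yet.
  getSegments(): Chunk[][] {
    return this.currentChunks.length > 0
      ? [...this.segments, [...this.currentChunks]]
      : [...this.segments];
  }
}

// Synchronous fake used in place of MediaRecorder.
class FakeRecorder implements RecorderLike {
  ondataavailable: ((data: Chunk) => void) | null = null;
  onstop: (() => void) | null = null;
  start(): void {}
  stop(): void {
    if (this.onstop) this.onstop();
  }
  emit(data: Chunk): void {
    if (this.ondataavailable) this.ondataavailable(data);
  }
}

const fakes: FakeRecorder[] = [];
const rotator = new SegmentRotator(() => {
  const fake = new FakeRecorder();
  fakes.push(fake);
  return fake;
});
rotator.start();
for (let n = 1; n <= 4; n++) {
  fakes[fakes.length - 1].emit(new Uint8Array([n]));
  rotator.rotateSegment();
}
fakes[fakes.length - 1].emit(new Uint8Array([5]));
console.log(rotator.getSegments().map((segment) => segment[0][0]));
```

After four rotations the oldest segment has been evicted, and the fifth, still-recording segment is exposed alongside the three finished ones, which is the in-flight behaviour the getSegments bullet asks for.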

Validation gates:

  • npm test → 25/25 pass (all new RED tests GREEN + all pre-existing).
  • npx tsc --noEmit → clean.
  • Manual smoke per the reproduction steps → file plays end-to-end in Chrome's built-in player.
  • /usr/bin/ffmpeg -v warning -i tests/fixtures/last_30sec.webm -f null - → empty stderr (no "Error submitting packet" lines, no "File ended prematurely" line).

Phase 1 decision retirement: D-09, D-10, D-11 are retired in favor of D-13. The Phase 1 CONTEXT.md or a new SUMMARY note should record this transition explicitly. RESEARCH.md A3 moves from "HIGH-risk — mitigated by D-12 gate + D-13 fallback (pre-staged)" to "VERIFIED-FAILED — mitigated by D-13 activation in Plan 01-08".

Process observation (for GSD framework feedback)

This is the SECOND debug session in Phase 1's life (first: d12-blob-port-transfer-fails). Both were issues that the planner explicitly anticipated and pre-staged contingencies for (D-12 ffprobe gate + base64 wire-format research; D-13 restart-segments skeleton). Neither was a planning oversight — both were "the documented HIGH-risk assumption activated as expected." The cycle latency between "manual smoke reveals the issue" and "RED test in place" was ~30 minutes for d12 and ~15 minutes for this session, which suggests the pre-staging strategy is working: contingencies are findable, activatable, and reviewable.

Pattern worth raising: When RESEARCH.md flags an assumption as HIGH-risk AND the plan pre-stages a fallback, the executor's smoke-test step (Plan 01-07) should probably also be the moment to evaluate "does the simple approach pass the empirical gate or do we need to land the fallback before merging the phase?" — i.e. the smoke step is an A/B gate, not a unilateral confirmation. The current sequence (Plan 01 → 02 → ... → 07 = smoke → debug session if smoke fails) works, but a slightly tighter feedback loop in Plan 07's checklist ("if smoke reveals a HIGH-risk-A3-class issue, escalate to the pre-staged fallback BEFORE creating a debug session") might shorten the orchestration overhead for future phases.

Not a process bug — a possible process refinement. Logging for /gsd-plan-phase retro consideration in Phase 2 or beyond.