mokosh/.planning/STATE.md
Mark bc310d98cf revert(01): reopen Phase 1 — D-13 multi-EBML-concat is unplayable
REQ-video-ring-buffer flipped from [x] back to [ ]. ROADMAP.md Phase 1
row reverted from [x] Closed 2026-05-15 to [ ] reopened 2026-05-16.
STATE.md status flipped phase_complete → phase_reopened with full
historical narrative preserved.

Root cause (confirmed at byte level by gsd-debugger 2026-05-16):
D-13's concat-of-self-contained-WebM-segments architecture produces a
3-EBML-header WebM that standards-compliant Matroska parsers
(mpv, ffmpeg, Chrome HTMLMediaElement) play only as the first segment
(~9.94 s) and silently drop the remaining 2 segments. Confirmed via
operator mpv drag-drop test of BOTH the canonical 2026-05-15 closure
fixture and the 2026-05-16 UAT-produced fixture — both exhibit the
same broken playback.
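
A byte-level probe of the kind described above can be approximated by scanning for the EBML header element ID (`0x1A 0x45 0xDF 0xA3`). The sketch below is a heuristic only, not the gsd-debugger procedure: the function and helper names are hypothetical, the input is synthetic, and the same four bytes can occur by chance inside encoded payload, so a real probe would walk element sizes with an EBML parser.

```typescript
// EBML header element ID as defined by the EBML / Matroska container format.
const EBML_MAGIC = [0x1a, 0x45, 0xdf, 0xa3] as const;

// Hypothetical helper: returns every byte offset where the EBML magic occurs.
function findEbmlHeaderOffsets(bytes: Uint8Array): number[] {
  const offsets: number[] = [];
  for (let i = 0; i + EBML_MAGIC.length <= bytes.length; i++) {
    if (
      bytes[i] === EBML_MAGIC[0] &&
      bytes[i + 1] === EBML_MAGIC[1] &&
      bytes[i + 2] === EBML_MAGIC[2] &&
      bytes[i + 3] === EBML_MAGIC[3]
    ) {
      offsets.push(i);
    }
  }
  return offsets;
}

// Synthetic stand-in for a self-contained segment: magic + zero-filled payload.
function fakeSegment(payloadLength: number): Uint8Array {
  const segment = new Uint8Array(EBML_MAGIC.length + payloadLength);
  segment.set(EBML_MAGIC, 0);
  return segment;
}

// Plain byte concatenation, i.e. the file-concat shape that D-13 produced.
function concatBytes(parts: Uint8Array[]): Uint8Array {
  const total = parts.reduce((sum, part) => sum + part.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.length;
  }
  return out;
}

const concatenated = concatBytes([fakeSegment(16), fakeSegment(16), fakeSegment(16)]);
const headerOffsets = findEbmlHeaderOffsets(concatenated);
console.log(`EBML headers found: ${headerOffsets.length}`);
```

More than one offset signals the multi-header shape; a single-EBML remux should report exactly one header, at offset 0.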

The 2026-05-15 "operator-confirmed clean Chrome playback" assessment
was insufficient: it verified the file plays without freezing but did
not measure total duration. Phase 1's primary deliverable
(REQ-video-ring-buffer / SPEC §10 #7) is therefore NOT satisfied.

Fix path chosen by user: ts-ebml (parse) + webm-muxer (write) to
replace mergeVideoSegments file-concat with real single-EBML remux.
Will land as Plan 01-08 via fresh /gsd-plan-phase ceremony.

RED test landed in tests/offscreen/webm-playback.test.ts (2 new
assertions on container-format-duration + ffmpeg-full-decode-duration).
2 failures, 53 baseline tests still GREEN.

Option C port-lifecycle refactor (debug session
empty-archive-port-race, commits 674c415..f0871c0) DID land cleanly
and is retained — that fix was orthogonal and correctly resolved the
silent-empty-archive symptom that previously masked this deeper bug.

Debug session: .planning/debug/d13-multi-ebml-concat-unplayable.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 19:47:47 +02:00

gsd_state_version: 1.0
milestone: v2.0.0
milestone_name: milestone
status: phase_reopened
stopped_at: Phase 1 REOPENED 2026-05-16: D-13 multi-EBML-concat architecture confirmed broken via UAT Test 3 retest + byte-level EBML probe. The produced WebM plays only ~9 s instead of ~30 s in mpv AND Chrome (the 2026-05-15 closure's operator playback check was insufficient). Phase 1's primary deliverable (REQ-video-ring-buffer, SPEC §10 #7) is NOT satisfied. Plan 01-08 (WebM remux via ts-ebml + webm-muxer) will replace mergeVideoSegments file-concat with real single-EBML remux. RED test landed in tests/offscreen/webm-playback.test.ts (2 failures, 53 baseline GREEN). Option C port-lifecycle refactor (debug session empty-archive-port-race) DID land cleanly (commits 674c415..f0871c0). Phase 1 also absorbs whole-desktop + auto-start UX (Plans 01-09/01-10) per 2026-05-16 amended charter.
last_updated: 2026-05-16T17:35:00Z
last_activity: 2026-05-16, Phase 1 reopened: D-13 multi-EBML architecture confirmed broken by mpv/Chrome playback test; Plan 01-08 (ts-ebml + webm-muxer remux) pending; markers reverted
progress:
  total_phases: 5
  completed_phases: 0
  total_plans: 10
  completed_plans: 7
  percent: 70

Project State

Project Reference

See: .planning/PROJECT.md (updated 2026-05-15)

Core value: When an operator hits a bug, one click MUST produce a self-contained archive that lets support reproduce what happened — in under 5 s, no server, no password leaks. Current focus: Phase 1 — Stabilize Video Pipeline

Current Position

Phase: 1 of 5 (Stabilize Video Pipeline), REOPENED 2026-05-16 (closed 2026-05-15, closure reverted)
Next phase: 2 of 5 (Stabilize DOM + event-capture privacy)
Plan: 7 of 10 complete (Plan 01-08 WebM remux pending; Plans 01-09/01-10 absorbed per the 2026-05-16 amended charter)
Status: Phase 1 reopened; REQ-video-ring-buffer not satisfied
Last activity: 2026-05-16

Progress: [███████░░░] 70% (7/10 plans), 0/5 phases complete

Performance Metrics

Velocity:

  • Total plans completed: 7
  • Average duration: ~7 min per plan
  • Total execution time: ~50 min (+ 2 debug sessions, ~45 min)

By Phase:

| Phase | Plans | Total | Avg/Plan |
|-------|-------|-------|----------|
| 1. Stabilize video pipeline | 7 | ~50 min (+ 2 debug sessions ~45 min) | 7 min |
| 2. Stabilize DOM + event capture privacy | 0 | | |
| 3. Stabilize export pipeline | 0 | | |
| 4. SPEC §10 smoke verification | 0 | | |
| 5. Harden + clean up | 0 | | |

Recent Trend:

  • Last 5 plans: 4min, 4min, 8min, 3min, ~10min (Plan 07 closure incl. debug-session arbitration)
  • Trend: stable execution time; complexity surfaced in debug sessions (pre-staged fallbacks activated cleanly)

Updated after each plan completion

| Plan | Duration | Tasks | Files |
|------|----------|-------|-------|
| Phase 01 P01 | 4min | 6 tasks | 6 files |
| Phase 01 P02 | 4min | 5 tasks | 8 files |
| Phase 1 P03 | 8min | 3 tasks | 5 files |
| Phase 01 P04 | 4min | 3 tasks | 1 file |
| Phase 01 P05 | 8min | 2 tasks | 1 file |
| Phase 1 P06 | 3min | 2 tasks | 2 files |
| Phase 1 P07 | ~10min closure + 2 debug sessions (D-12 + A3) | 2 tasks (checkpoint + auto) | 6 files (fixture + REQUIREMENTS + ROADMAP + STATE + SUMMARY + plan-final-commit) |

Accumulated Context

Decisions

Decisions are logged in PROJECT.md Key Decisions table (DEC-001 through DEC-012, all SPEC-Accepted and locked for Phase 1). Recent decisions affecting current work:

  • Phase 1 framing: the roadmap treats the existing codebase as a partially broken first attempt to be remediated against the SPEC, not as greenfield. The 7 P0 defects from the audit are split across phases 1-3 along commit boundaries; phase 4 is end-to-end SPEC §10 smoke verification.

  • All 12 SPEC decisions (DEC-001..DEC-012) are treated as LOCKED for Phase 1, and changing any of them requires a formal ADR. None carries a formal LOCKED tag in the ingest classification, so a future ADR can still revise them.

  • [Phase ?]: Doc cascade: amendments append (do not replace) original DEC/CON blocks to preserve SPEC provenance — Established convention for future SPEC-amending phases; downstream readers see both old + new with citation

  • [Phase ?]: Manifest: drop alarms permission entirely rather than retain for re-use — Plan 05 deletes the alarms code path; declaring unused permissions expands attack surface (T-1-02)

  • [Phase ?]: Pinned vitest at ^4 (4.1.6 latest stable; 5.x still beta on 2026-05-15)

  • [Phase ?]: Phase 1 Wave-0 test infra: 4 RED tests committed against not-yet-existent src/offscreen/recorder.ts — pins contracts for Plans 03+04

  • [Phase ?]: Reverted premature REQ-video-ring-buffer Complete marking left by Plan 01-01; satisfied by Plans 03+04+07, not by Wave-0 RED tests

  • [Phase 01-03]: Bundled OffscreenLogger into Task 2 commit (Rule 3 blocking dependency — recorder.ts cannot typecheck without the import)

  • [Phase 01-03]: Defensive bootstrap guard (typeof chrome check) lets pure ring-buffer test import recorder module without chrome stub

  • [Phase 01-03]: Removed SW-side VIDEO_CHUNK/VIDEO_CHUNK_SAVED branches + IndexedDB helpers inline (tsc-clean requires; Plan 05 owns remaining SW shrink)

  • [Phase 01-04]: Kept Plan 03's defensive bootstrap guard (typeof chrome / per-API existence checks) instead of Plan 04's verbatim unguarded block — Plan 04's verbatim block regressed ring-buffer and codec-check tests (they don't stub full chrome surface); restored guard preserves Plan 02 RED contract while satisfying Plan 04's new GREEN contract. Rule 1 deviation.

  • [Phase 01-04]: T-1-04 SW-side sender check documented redundantly (4 places in recorder.ts) for Plan 05 executor visibility — Offscreen is trusting party; SW is validating party. Documenting in module header, port-name constant, threat-mitigation comment near bootstrap, and inline at connectPort makes the contract impossible to miss when grepping for T-1-04 during Plan 05.

  • [Phase 01-04]: REFACTOR pass NOT skipped: stale 'Plan 04 wires this' comments replaced with actual D-17/Pattern 5 citations — Forward-pointing TODO-style comments became misleading after the work landed; minimal correctness-preserving comment update with all 9 tests still GREEN.

  • [Phase 01-05]: Deleted broken checkPermissions / requestPermissions flow (Rule 1)

  • [Phase 01-05]: REQUEST_PERMISSIONS collapsed: under getDisplayMedia (D-01) no runtime perm check is meaningful; the broken 'tabCapture' permission check was sending recording-start into the never-granted branch

  • [Phase 01-05]: Added chrome.offscreen.hasDocument() in initialize(): Rule 2 robustness, audit P1 #8 mitigation across SW respawns

  • [Phase 01-05]: SW is now a pure coordinator: onConnect host bound to 'video-keepalive' port with T-1-04 sender check; getVideoBufferFromOffscreen replaces synchronous SW-local buffer fetch; OFFSCREEN_READY handshake closes the audit P1 #12 race

  • [Phase 01-05]: indexedDB.deleteDatabase('VideoRecorderDB') in onInstalled: T-1-NEW-05-02 / RESEARCH.md Runtime State Inventory cleanup of orphaned IDB from pre-Phase-01 builds

  • [Phase 01-06]: Collapsed vite.config.ts from 226 -> 21 lines (RESEARCH.md Example B verbatim); deleted 174-line inline copy-offscreen plugin (audit P0 #1 root cause) and the orphan offscreen/ top-level directory (D-08)

  • [Phase 01-06]: crxjs Outcome A confirmed: dist/src/offscreen/index.html (preserves src/ prefix from rollupOptions.input key). SW URL adjusted to chrome.runtime.getURL('src/offscreen/index.html'); RESEARCH.md Pitfall 5 binding empirically verified

  • [Phase 01-07-debug-d12]: D-12 port-blob serialization fixed via base64 wire-format encode/decode (debug session d12-blob-port-transfer-fails resolved 2026-05-15). chrome.runtime.Port JSON-serializes payloads across extension contexts so Blob payloads were silently corrupted (JSON.stringify(blob) === "{}" → SW saw [{}, {}, ...] → new Blob([...]) coerced each to "[object Object]" → 75-byte text instead of WebM). Added src/shared/binary.ts (blobToBase64 / base64ToBlob), TransferredVideoChunk wire-format type, offscreen encode side, SW decode side. All 15 tests green incl. 6-test port-serialization spec. Re-run smoke.sh + ffprobe still required for end-to-end verification.

  • [Phase 01-07-debug-a3]: D-13 restart-segments activated (debug session webm-playback-freeze resolved 2026-05-15). Plan 07 smoke retest after D-12 landed revealed the next-layer A3 failure: the ffprobe-valid WebM froze ~1 s into playback in Chrome because the single-continuous-recorder + 30 s age-trim lifecycle (D-09..D-11) evicted middle chunks containing VP9 keyframe references for retained tail chunks (orphan P-frames). Activated the pre-staged D-13 skeleton in src/offscreen/recorder.ts: stop+restart MediaRecorder every SEGMENT_DURATION_MS=10_000 ms on the same MediaStream, keep last MAX_SEGMENTS=3 self-contained WebM segments (3×10s=30s window preserved). Each segment fresh-encoded → own EBML header + seed keyframe → independently decodable. Side-effect: .stop() per segment fixes the "File ended prematurely" Matroska finalization gap. Type renames propagated: TransferredVideoChunk → TransferredVideoSegment, VideoChunk → VideoSegment, PortMessage.chunks → PortMessage.segments, VideoBufferResponse.chunks → VideoBufferResponse.segments; the header-pin flag from D-09..D-11 is dropped entirely. D-09..D-11 retired in favor of D-13. 28/30 tests pass; the 2 remaining reds are the empirical ffmpeg dry-runs against the still-stale committed fixture (operator regen required). REQ-video-ring-buffer NOT marked complete — Plan 07 still owns that, gated on the operator running ./smoke.sh then verifying Chrome playback + ffmpeg-clean stderr.

  • [Phase 01-07-closure]: Phase 1 closed 2026-05-15: D-12 + A3 acceptance gates both passed. Operator-confirmed Chrome playback clean (no ~1 s freeze); ffmpeg -v warning -i tests/fixtures/last_30sec.webm -f null - exit 0 with zero decoder errors (only expected muxer DTS-monotonicity warnings at segment join boundaries — non-blocking, documented D-13 trade-off for multi-EBML-header concat); ffprobe + empirical playback both green; 30/30 vitest green (the 2 webm-playback empirical dry-runs flipped GREEN after the fresh fixture committed in cd61cbc); REQ-video-ring-buffer marked Complete; SPEC §10 #2, #3, #7 functionally satisfied (end-to-end Phase 4 smoke still owns the full §10 sweep). Three atomic closure commits land the fixture + REQ/STATE/ROADMAP flip + SUMMARY. Process note: Plan 01-07 surfaced TWO unanticipated-cascade failures (D-12 then A3); both had pre-staged fallbacks (base64 wire-format and D-13 restart-segments) that activated cleanly. Candidate retro: should /gsd-plan-phase auto-inject empirical-acceptance gates (ffmpeg dry-run + Chrome playback) before merging a phase when RESEARCH.md flags HIGH-risk assumptions?

  • [Phase 01-07-deferred-to-5]: getDisplayMedia cursor visibility constraint (video: { cursor: 'always' }) surfaced as a user observation during Phase 1 smoke 2026-05-15. Captured frames lack the screen cursor despite it being the highest-signal cue for reproducing pointer-driven bugs. Constraint is opt-in per the getDisplayMedia spec; Chrome implements CursorCaptureConstraint (always/motion/never). Logged to Phase 5 P1/P2 hardening list — not blocking Phase 1 closure.
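
The D-12 entry above describes a base64 wire format for moving binary payloads over a JSON-serializing channel. The following is a minimal sketch of that idea, assuming byte arrays rather than Blobs; it is not the contents of src/shared/binary.ts. `WireSegment`, `bytesToBase64`, and `base64ToBytes` are hypothetical names, and the `JSON.parse(JSON.stringify(...))` step merely stands in for the serialization that chrome.runtime.Port performs.

```typescript
// Hypothetical wire shape; the real TransferredVideoSegment fields are not shown in this file.
interface WireSegment {
  mimeType: string;
  base64: string;
}

// Encode raw bytes as base64 via a binary string (one char per byte).
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  bytes.forEach((byte) => {
    binary += String.fromCharCode(byte);
  });
  return btoa(binary);
}

// Decode base64 back into the original bytes.
function base64ToBytes(base64: string): Uint8Array {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Sender side: wrap the payload in a JSON-safe message.
const original = new Uint8Array([0x1a, 0x45, 0xdf, 0xa3, 0x00, 0xff]);
const outgoing: WireSegment = {
  mimeType: "video/webm",
  base64: bytesToBase64(original),
};

// Simulated channel: only JSON-representable data survives the hop.
const incoming = JSON.parse(JSON.stringify(outgoing)) as WireSegment;

// Receiver side: recover the exact bytes.
const decoded = base64ToBytes(incoming.base64);
console.log(`round-tripped ${decoded.length} bytes`);
```

A Blob-based variant would presumably read the bytes with `await blob.arrayBuffer()` before encoding and rebuild a Blob from the decoded bytes on the receiving side. Base64 inflates the payload by roughly a third, which is the cost of using a string-only channel.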

Pending Todos

None yet.

Blockers/Concerns

  • (informational) chrome.tabCapture requires a user gesture on first activation. Phase 3 (P0-4) restores this by moving the call into the popup click handler; until Phase 3 lands, recording cannot start cleanly even if Phase 1's pipeline is correct. Phases 1-3 should not be re-ordered.

Deferred Items

Items acknowledged and carried forward from previous milestone close:

| Category | Item | Status | Deferred At |
|----------|------|--------|-------------|
| (none) | | | |

Session Continuity

Last session: 2026-05-16T07:29:17.065Z (pause); resumed 2026-05-16
Stopped at: Phase 1 review-fix paused at 5/18 (commit 2e3f524 = CR-01+CR-02+CR-03+WR-03+WR-09). 13 findings (7 Warning + 6 Info) + 8 sweep targets documented in 01-REVIEW-FIX.md "Remaining Work". User confirmed routing to /gsd-code-review-fix 1 on resume: full scope, no narrowing (per blocking constraints in .continue-here.md). Subsequent: /gsd-verify-work 1, then /gsd-plan-phase 2.
Resume file: .planning/phases/01-stabilize-video-pipeline/.continue-here.md

Phase 1 Closure Notes

  • ffprobe exit code: 0 (ffprobe -v error -f matroska -i tests/fixtures/last_30sec.webm)
  • ffmpeg dry-run exit code: 0 (ffmpeg -v warning -i tests/fixtures/last_30sec.webm -f null -) — stderr contains only the expected muxer DTS-monotonicity warnings at segment join boundaries; no decoder errors. Documented D-13 trade-off for multi-EBML-header WebM concatenation; Chrome's MSE pipeline handles this natively (SPEC §10 #7 scope: "plays back in a browser" — Chrome confirmed).
  • Fixture: tests/fixtures/last_30sec.webm = 1 633 459 bytes (1.6 MB), VP9 codec, Profile 0, 1142×1038, color space bt709, time_base 1/1000, start_pts 0. Captured against the D-13 restart-segments recorder (3 × ~10 s self-contained segments).
  • Test suite: 30/30 green across 8 files (tests/offscreen/); both empirical ffmpeg dry-runs in webm-playback.test.ts flipped GREEN after the fresh fixture committed in cd61cbc.
  • Phase 1 outcome: SPEC §10 acceptance criteria #2 (continuous capture), #3 (≤ 30 s window), and #7 (last_30sec.webm plays in a browser) are functionally green at the Phase 1 level. End-to-end §10 smoke verification remains owned by Phase 4 (all 9 criteria sweep).
  • Phase 2 onwards: Phase 2 owns the DOM/event-capture privacy slice (REQ-rrweb-dom-buffer, REQ-user-event-log, REQ-password-confidentiality). Phase 3 owns the popup state machine + base64-URL replacement. Phase 4 runs the full SPEC §10 smoke pass. Phase 5 absorbs P1/P2 hardening (now includes the getDisplayMedia cursor visibility refinement surfaced 2026-05-15).
  • Process retro candidate: Plan 07 surfaced two cascade failures (D-12 binary transfer + A3 cluster alignment). Both had pre-staged fallbacks (base64 wire-format and D-13 restart-segments) which activated cleanly. The smoke-test step ended up doing the empirical-acceptance-gate work that RESEARCH.md flagged as HIGH-risk. Worth raising in a GSD-framework retro: should /gsd-plan-phase auto-inject empirical-acceptance gates (ffmpeg dry-run + Chrome playback) BEFORE merging a phase when RESEARCH.md flags HIGH-risk assumptions, rather than discovering it via Plan 07's smoke step?
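
The fixture note above describes the D-13 recorder as retaining 3 × ~10 s self-contained segments. The retention half of that design can be sketched as a small ring buffer. This is an illustration under stated assumptions, not the code in src/offscreen/recorder.ts: `Segment` and `SegmentRing` are hypothetical names, the MediaRecorder stop/restart timer is omitted, and only the two constants come from this document. It covers eviction only; joining the retained segments into one playable file is the part Plan 01-08 is meant to replace.

```typescript
// Constants as named in the D-13 decision entry.
const SEGMENT_DURATION_MS = 10_000;
const MAX_SEGMENTS = 3;

// Hypothetical segment record; the real VideoSegment type is not shown in this file.
interface Segment {
  startedAtMs: number;
  bytes: Uint8Array;
}

// Keeps only the newest MAX_SEGMENTS segments, evicting the oldest first.
class SegmentRing {
  private segments: Segment[] = [];

  push(segment: Segment): void {
    this.segments.push(segment);
    while (this.segments.length > MAX_SEGMENTS) {
      this.segments.shift();
    }
  }

  // Copy of the retained segments, oldest first.
  snapshot(): Segment[] {
    return [...this.segments];
  }

  // Nominal time window covered by the retained segments.
  windowMs(): number {
    return this.segments.length * SEGMENT_DURATION_MS;
  }
}

// Simulate 50 s of recording: five finished segments, one every 10 s.
const ring = new SegmentRing();
for (let i = 0; i < 5; i++) {
  ring.push({ startedAtMs: i * SEGMENT_DURATION_MS, bytes: new Uint8Array(4) });
}

const retainedStarts = ring.snapshot().map((segment) => segment.startedAtMs);
console.log(retainedStarts, ring.windowMs());
```

Evicting whole segments rather than individual chunks is what avoids the orphan P-frame problem recorded for D-09..D-11, since each retained segment starts with its own keyframe.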