docs(01): capture phase 1 context — desktopCapture pivot + offscreen consolidation .planning/phases/01-stabilize-video-pipeline/01-CONTEXT.md .planning/phases/01-stabilize-video-pipeline/01-DISCUSSION-LOG.md

2026-05-15 15:40:44 +02:00
parent 4ba318876b
commit 05f7d1bf64
2 changed files with 385 additions and 0 deletions


@@ -0,0 +1,303 @@
# Phase 1: Stabilize Video Pipeline — Context
**Gathered:** 2026-05-15
**Status:** Ready for planning
<domain>
## Phase Boundary
Stabilize the video-capture pipeline so the extension continuously records the
operator's session into a ring buffer and produces a **playable** WebM on
demand. The phase ends when:
1. The video buffer is alive end-to-end with **one** authoritative source of
truth for the recorder code (no parallel offscreen implementations).
2. The MediaRecorder lifecycle is correct (no shadowed bindings, predictable
start/stop).
3. The buffer holds at most 30 s of footage at any moment AND concatenating
the retained chunks produces a file that `ffprobe -v error` accepts as
playable.
4. State (the buffer itself) survives the MV3 service-worker idle unload
cycle without losing video.
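
Criterion 3 can be checked mechanically. The following is a minimal sketch, assuming a plan-checker script that runs ffprobe through some process runner and hands the result to a small predicate. The function and type names are hypothetical; only the ffprobe flags come from this document. The sketch treats any output at the error log level as a failure.

```typescript
/** Arguments for the acceptance command: ffprobe -v error -f matroska -i <file>. */
function ffprobeArgs(file: string): string[] {
  return ["-v", "error", "-f", "matroska", "-i", file];
}

/** Hypothetical result shape, matching what a process runner typically reports. */
interface ProbeResult {
  exitCode: number | null; // null when the process was killed by a signal
  stderr: string;          // with -v error, only error-level messages land here
}

/** The gate passes only on exit 0 with nothing printed to stderr. */
function passesPlayabilityGate(result: ProbeResult): boolean {
  return result.exitCode === 0 && result.stderr.trim().length === 0;
}
```

In a Node-based checker the arguments could be passed to `child_process.spawnSync("ffprobe", ffprobeArgs(path))`, with its `status` and `stderr` mapped into `ProbeResult`.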
Scope anchor from ROADMAP.md — Phase 2 (DOM/event privacy), Phase 3 (export
UX), Phase 4 (smoke verification), and Phase 5 (P1/P2 hardening) are
explicitly out of scope here.
</domain>
<decisions>
## Implementation Decisions
### Capture API — AMENDS DEC-003
This phase REPLACES the SPEC-locked `chrome.tabCapture` choice with
`getDisplayMedia()` capture. Done eyes-open: the operator gains broader
capture coverage at the cost of the SPEC §1 "silent operation" property.
The doc cascade is enumerated in the **Doc Amendments (precede code)**
subsection below.
- **D-01:** Capture mechanism is `navigator.mediaDevices.getDisplayMedia()`
invoked **inside the offscreen document**. No more
`chrome.tabCapture.getMediaStreamId`, no more SW-side gesture juggling.
- **D-02:** Offscreen document is created with
`chrome.offscreen.Reason.DISPLAY_MEDIA` (replaces `USER_MEDIA`).
- **D-03:** One-time source picker on session start; the operator picks
"screen" or "window" once. If they later click the Chrome "Stop sharing"
banner or the captured source disappears, the offscreen surfaces an error
to the SW and the popup re-prompts on next interaction. (Exact error-UX
copy is deferred to Phase 3 — see Deferred Ideas.)
- **D-04:** Operator UX is **NOT** silent. Chrome's permanent "Sharing your
screen" indicator is shown while recording. We accept this as the cost
of the API choice.
- **D-05:** `manifest.json` permissions follow the new API: `desktopCapture`
replaces `tabCapture`; `activeTab` becomes unnecessary for the video
pipeline but stays for `chrome.tabs.captureVisibleTab` (screenshot path,
Phase 3 concern — kept).
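
D-01 and D-03 can be sketched as follows: a minimal sketch, assuming the capture call is injected so the logic can be exercised outside the browser. `startCapture`, `TrackLike`, `StreamLike`, and the `onEnded` hook are hypothetical names. In the offscreen document the injected function would wrap `navigator.mediaDevices.getDisplayMedia`.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface TrackLike {
  addEventListener(type: "ended", listener: () => void): void;
}
interface StreamLike {
  getVideoTracks(): TrackLike[];
}
type GetDisplayMedia = (constraints: { video: boolean; audio: boolean }) => Promise<StreamLike>;

/**
 * Opens the source picker once and reports when the capture ends, for
 * example when the operator clicks "Stop sharing". `onEnded` is a
 * hypothetical hook; the real module would notify the SW from it.
 */
async function startCapture(
  getDisplayMedia: GetDisplayMedia,
  onEnded: () => void,
): Promise<StreamLike> {
  // Video only: audio is out of scope for Phase 1.
  const stream = await getDisplayMedia({ video: true, audio: false });
  const tracks = stream.getVideoTracks();
  if (tracks.length === 0) {
    throw new Error("getDisplayMedia returned no video track");
  }
  // A display capture normally carries one video track; listen on all to be safe.
  tracks.forEach((track) => track.addEventListener("ended", onEnded));
  return stream;
}
```

Injecting the capture function is a design choice for testability, not something this document requires.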
### Offscreen source-of-truth location
- **D-06:** Recorder code lives at **`src/offscreen/recorder.ts`** as a real
TypeScript module with strict type-check, source maps, and IDE support.
- **D-07:** `offscreen/index.html` is rewritten to load the bundled module
via crxjs. The runtime path remains `offscreen/index.html` (referenced
from SW via `chrome.runtime.getURL('offscreen/index.html')`).
- **D-08:** **DELETE** `offscreen/index.ts` (orphaned dead code) and the
entire `copy-offscreen` plugin block in `vite.config.ts:11-184`. crxjs
picks up the new TS entry through the HTML reference.
### Ring-buffer mechanism
- **D-09:** **Single continuous MediaRecorder** for the whole session.
`mediaRecorder.start(2000)` so chunks land on cluster boundaries per the
spec timeslice (DEC-003, SPEC §4.1). No restart strategy at this point.
- **D-10:** Retain the **first emitted chunk** (the chunk produced by the
first `dataavailable` event after `start()`) **indefinitely** — it carries
the EBML header plus the initial cluster. CON-webm-header-retention.
- **D-11:** Drop later chunks once they are older than 30 s, by chunk
arrival timestamp. Keep header + every chunk newer than `now - 30000 ms`.
- **D-12:** Acceptance gate for Phase 1: `ffprobe -v error -f matroska -i
<last_30sec.webm>` must return exit 0 with no decoder warnings on a
fresh-export sample. Plan-checker enforces this as a phase success
criterion.
- **D-13:** **Fallback if D-12 fails:** revise the plan mid-phase to use
*restart-segments* (stop + restart the MediaRecorder every 10 s, keep
the 3 most-recent self-contained segments, concat on save). Documented
as a known fallback so the planner can pre-stage the alternative
structure in PLAN.md.
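
The retention rule in D-10 and D-11 can be sketched as a pure function: a minimal sketch, assuming each chunk is stored together with its arrival timestamp. `TimedChunk` and `trimBuffer` are hypothetical names. A chunk aged exactly 30 s is kept here, which is one reading of "older than 30 s".

```typescript
interface TimedChunk<T> {
  data: T;           // in the extension this would be the Blob from `dataavailable`
  arrivedAt: number; // ms timestamp taken when the chunk arrived
}

const WINDOW_MS = 30_000;

/**
 * Keeps the first chunk (EBML header plus initial cluster) unconditionally and
 * drops every later chunk whose arrival time is older than `now - windowMs`.
 */
function trimBuffer<T>(
  chunks: TimedChunk<T>[],
  now: number,
  windowMs: number = WINDOW_MS,
): TimedChunk<T>[] {
  const cutoff = now - windowMs;
  return chunks.filter((chunk, index) => index === 0 || chunk.arrivedAt >= cutoff);
}
```

Calling this from the `dataavailable` handler after each push would keep the buffer bounded. Whether the concatenated result is actually playable is still what the D-12 gate has to establish.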
### Tab-switch behavior
- **D-14:** **Not applicable** under the new capture API. `getDisplayMedia()`
captures a screen or window, not a tab — there is nothing to re-attach
on `chrome.tabs.onActivated`. Phase 1 explicitly **removes** any
tab-switch handling from `src/background/index.ts`.
- **D-15:** Operator switching tabs no longer interrupts the recording —
the buffer keeps filling regardless of active tab.
### State survival across SW unload
- **D-16:** Video buffer **ownership moves to the offscreen document**. The
offscreen survives SW unloads because it holds the
`DISPLAY_MEDIA`-reason capture; chunks accumulate there.
- **D-17:** A long-lived `chrome.runtime.connect` port from offscreen → SW
serves as the keepalive (this is the only mechanism that actually
resets the SW idle timer — `chrome.alarms` callbacks do not, contrary
to DEC-010).
- **D-18:** **DELETE** the `chrome.alarms` keepalive
(`src/background/index.ts:171-178`). DEC-010 and CON-service-worker-keepalive
are amended in the doc cascade below.
- **D-19:** On export, SW requests the buffer from offscreen over the port
(or one-shot `chrome.runtime.sendMessage`). SW does **NOT** cache
chunks. CON-buffer-storage is honored: the buffer is a plain JS variable in
offscreen memory, no `chrome.storage.session`, no IndexedDB. The
existing IndexedDB code path in `vite.config.ts:43-104` is **DELETED**
along with the inline plugin.
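
The keepalive in D-17 can be sketched as follows: a minimal sketch, assuming the port factory is injected (in the offscreen document it would wrap `chrome.runtime.connect`). `PortLike`, `startKeepalive`, the `KEEPALIVE_PING` message shape, and the 20 s interval are all assumptions. The sketch also sends a periodic ping, on the assumption that message traffic rather than the open port alone counts as SW activity. Confirm this against the MV3 lifecycle docs listed under Canonical References.

```typescript
// Structural subset of a runtime port, so the sketch runs without Chrome typings.
interface PortLike {
  postMessage(message: unknown): void;
  disconnect(): void;
  onDisconnect: { addListener(callback: () => void): void };
}

/**
 * Opens a port, pings over it on an interval, and reconnects if it drops.
 * Returns a function that stops the keepalive.
 */
function startKeepalive(connect: () => PortLike, pingMs: number = 20_000): () => void {
  let stopped = false;

  const attach = (p: PortLike): PortLike => {
    p.onDisconnect.addListener(() => {
      // The SW was unloaded or the port dropped: open a fresh one.
      if (!stopped) port = attach(connect());
    });
    p.postMessage({ type: "KEEPALIVE_PING" });
    return p;
  };

  let port: PortLike = attach(connect());
  const timer = setInterval(() => port.postMessage({ type: "KEEPALIVE_PING" }), pingMs);

  return () => {
    stopped = true;
    clearInterval(timer);
    port.disconnect();
  };
}
```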
### Doc Amendments (precede code)
These document edits **MUST** ship before any code-touching task in this
phase, so downstream phases see a consistent baseline:
- **D-A1:** Amend `.planning/intel/decisions.md` DEC-003 to record the
`getDisplayMedia` replacement, with rationale and the explicit silent-
operation trade-off. Amend DEC-010 to record port keepalive replacing
alarms keepalive.
- **D-A2:** Amend `.planning/intel/constraints.md` to **RETIRE**
CON-tab-capture-binding and CON-service-worker-keepalive. Add new
CON-display-capture-binding (one-time picker, "Sharing" indicator).
- **D-A3:** Amend `.planning/PROJECT.md` Key Decisions table (DEC-003,
DEC-010) and Constraints section accordingly.
- **D-A4:** Amend `.planning/REQUIREMENTS.md` REQ-video-ring-buffer to
remove "active-tab" wording and update API binding.
- **D-A5:** Amend `.planning/ROADMAP.md` Phase 1 description and Success
Criterion #2 (drop the "tab re-attach" clause).
- **D-A6:** Amend `manifest.json`: swap `tabCapture` → `desktopCapture`
in `permissions`. Keep `activeTab` for the screenshot path.
### Claude's Discretion
- Exact protocol choice for offscreen↔SW messaging (port for keepalive +
sendMessage for one-shot vs port-only).
- Codec strictness: enforce `video/webm; codecs=vp9` via
`MediaRecorder.isTypeSupported`; fail loud if unsupported (no fallback
chain — current code's vp9→vp8→h264→default fallback is removed).
- Internal naming for the new buffer-owning module (offscreen-recorder vs
display-recorder etc.).
- Code-style choices around TS strictness within `src/offscreen/`
(already on `"strict": true` per tsconfig).
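
The codec-strictness item can be sketched like this: a minimal sketch, assuming the support check is injected so it runs outside the browser. `requireVp9` and `REQUIRED_MIME` are hypothetical names. In the offscreen document the argument would wrap `MediaRecorder.isTypeSupported`.

```typescript
const REQUIRED_MIME = "video/webm; codecs=vp9";

/**
 * Returns the one accepted MIME type or throws. There is deliberately no
 * vp8/h264/default fallback chain.
 */
function requireVp9(isTypeSupported: (mimeType: string) => boolean): string {
  if (!isTypeSupported(REQUIRED_MIME)) {
    throw new Error(`MediaRecorder does not support ${REQUIRED_MIME}`);
  }
  return REQUIRED_MIME;
}
```

The returned string would be passed as the `mimeType` option when constructing the recorder, before `start(2000)` is called.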
</decisions>
<specifics>
## Specific Ideas
- "Fail loud" on missing `getDisplayMedia` support — log a clear error to
the SW console with the failing user-agent string. No silent fallback.
- The ffprobe verification gate is a real CLI invocation in the plan's
acceptance, not a hand-wave — Phase 1 success requires producing a
sample WebM and running ffprobe against it.
- The "stop sharing" recovery path: when the offscreen detects
`MediaStreamTrack.onended`, it should clear the buffer, send a
RECORDING_STOPPED message to SW, and stop attempting to record. The
popup will re-prompt next time the user clicks save (Phase 3
territory).
- The `OFFSCREEN_READY` Message type is already declared in
`src/shared/types.ts:18` but never used. Phase 1 wires up an actual
handshake so SW doesn't fire `START_RECORDING` before the offscreen
listener is registered (this fixes audit item P1 #12 in passing).
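
The `OFFSCREEN_READY` handshake can be sketched as a small gate on the SW side: a minimal sketch, assuming the SW message listener calls `markReady()` when `OFFSCREEN_READY` arrives. The class and method names are hypothetical.

```typescript
/** Holds back work until the offscreen document has announced it is ready. */
class ReadyGate {
  private ready = false;
  private pending: Array<() => void> = [];

  /** Call from the SW message listener when OFFSCREEN_READY arrives. */
  markReady(): void {
    this.ready = true;
    const queued = this.pending;
    this.pending = [];
    queued.forEach((run) => run());
  }

  /** Call when the offscreen document is closed, so the next start waits again. */
  reset(): void {
    this.ready = false;
  }

  /** Runs `action` immediately if ready, otherwise once markReady() is called. */
  whenReady(action: () => void): void {
    if (this.ready) {
      action();
    } else {
      this.pending.push(action);
    }
  }
}
```

With this shape, the SW would wrap the `START_RECORDING` send in `gate.whenReady(...)`, so the message cannot be sent before the offscreen listener is registered.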
</specifics>
<canonical_refs>
## Canonical References
**Downstream agents MUST read these before planning or implementing.**
### Project baseline (current state, will be amended by D-A1..D-A6)
- `.planning/PROJECT.md` — overall project context. Note that DEC-003,
DEC-010, and the Constraints section will be amended by this phase.
- `.planning/REQUIREMENTS.md` §"Video" — REQ-video-ring-buffer wording
changes in this phase.
- `.planning/ROADMAP.md` §"Phase 1" — phase definition and success
criteria (Success Criterion #2 changes).
- `.planning/intel/SYNTHESIS.md` — entry point for ingested intel.
- `.planning/intel/decisions.md` §DEC-003, §DEC-009, §DEC-010.
- `.planning/intel/constraints.md` §CON-video-window, §CON-video-codec,
§CON-webm-header-retention, §CON-tab-capture-binding (RETIRED),
§CON-service-worker-keepalive (RETIRED), §CON-buffer-storage.
### Authoritative SPEC
- `Тз расширение фаза1.md` §2 (stack), §4.1 (video buffer parameters),
§7 (manifest permissions), §8 row 3 (tab-capture binding — being
AMENDED), §10 acceptance criteria #2, #3, #7.
### Audit
- `/home/parf/.claude/plans/dear-claude-there-is-snazzy-fox.md` — full
P0 analysis. Key spots: P0-1 (offscreen duality + mediaRecorder
shadow, `vite.config.ts:113`/`vite.config.ts:208`), P0-2 (WebM ring
buffer + 200 ms timeslice issue), P0-3 (always-on capture), P1 #8
(alarms keepalive ineffective), P1 #12 (OFFSCREEN_READY handshake
missing).
### Chrome APIs (read before researching)
- MDN: `MediaDevices.getDisplayMedia()` — capture semantics, picker UX,
MediaStreamTrack.onended.
- Chrome docs: `chrome.offscreen` API, especially `Reason.DISPLAY_MEDIA`
(Chrome 116+).
- Chrome docs: Service worker lifecycle in MV3 — confirm `chrome.alarms`
vs `chrome.runtime.connect` port behavior with respect to idle timer.
- WebM/Matroska EBML format — cluster boundary semantics; needed only if
the D-13 fallback (restart-segments) activates.
</canonical_refs>
<code_context>
## Existing Code Insights
### Reusable assets
- `src/background/index.ts:16-73` — `videoBuffer: VideoChunk[]` array +
`cleanupVideoBuffer()`. Structural pattern (header retention + age
trim) is **kept**; the implementation is **moved** to
`src/offscreen/recorder.ts`.
- `src/shared/types.ts` — `Message`, `VideoChunk`, `VideoBufferResponse`,
`OFFSCREEN_READY`, `RECORDING_ERROR`. Reuse as-is; wire up
`OFFSCREEN_READY` and `RECORDING_ERROR` in Phase 1.
- `src/shared/logger.ts` — `Logger` (SW context) and `ContentLogger`
(content-script context). Add an `OffscreenLogger` or reuse `Logger`
with prefix `[Offscreen:...]`.
### Established patterns
- Strict TS with `noUnusedLocals`/`noUnusedParameters` (`tsconfig.json`).
New code must comply — no `as any`, no `@ts-ignore` (audit item P1
#13 informs this).
- Russian comments inline are acceptable per project provenance
(see `Тз расширение фаза1.md`, the user-facing UI strings in REQ-popup-ui).
### Files to DELETE in this phase
- `offscreen/index.ts` — dead code (audit P2 #18).
- `offscreen/index.html` — REPLACED by a crxjs-managed equivalent
pointing at the new `src/offscreen/recorder.ts`. (Path stays the same
in dist; source moves.)
- `vite.config.ts:11-184` — the entire `copy-offscreen` inline plugin
block including the embedded JS string + IndexedDB plumbing. Replaced
with a small note + the standard crx() invocation.
- `src/background/index.ts:171-178` — `setupKeepalive()` and its
`chrome.alarms` registration.
- `src/background/index.ts:445-475` — `loadChunkFromIndexedDB()` and
`openIndexedDB()` (the SW-side IDB plumbing).
- `src/background/index.ts:432-442` — the `VIDEO_CHUNK` message handler
that expected Blob over `chrome.runtime.sendMessage` (the
fundamentally broken path from audit P0 #2; unreachable today, but
listed here so it is deleted).
### Files to CREATE
- `src/offscreen/recorder.ts` — new module owning getDisplayMedia +
MediaRecorder + ring buffer + keepalive port.
- `src/offscreen/index.html` (or amended `offscreen/index.html`) —
minimal HTML that loads the new module via crxjs.
### Files to MODIFY heavily
- `src/background/index.ts` — owns offscreen lifecycle, port handling,
export-time buffer fetch. Most of the file shrinks.
- `manifest.json` — permissions swap, `offscreen` already present.
- `vite.config.ts` — collapse to a clean crxjs invocation, no inline
plugin.
</code_context>
<deferred>
## Deferred Ideas
- **Error UX for "user stopped sharing" mid-session.** The popup needs a
state for this — Phase 3 territory (REQ-popup-ui state machine
extension).
- **Audio capture.** `getDisplayMedia()` makes audio capture trivial
(`audio: true`), but SPEC §9 explicitly excludes audio from Phase 1
(Phase 2 work — CAP-01). Capture this as an easier-now-than-before
follow-up.
- **Per-tab silent capture mode** as an opt-in via `config.json`. Could
re-introduce tabCapture for installations that prioritize silent
operation over broad coverage. Future phase if there's demand.
- **Cluster-aware EBML trim (ts-ebml).** Not needed for Phase 1 if
continuous + age-trim verifies via ffprobe. Keep on the shelf as a
third fallback under D-13.
- **`chrome.storage.session` cold-start recovery.** Buffer pointer
rehydration after offscreen crash. Phase 5 (Harden + clean up)
territory.
</deferred>
---
*Phase: 01-stabilize-video-pipeline*
*Context gathered: 2026-05-15*


@@ -0,0 +1,82 @@
# Phase 1: Stabilize Video Pipeline — Discussion Log
> **Audit trail only.** Do not use as input to planning, research, or execution agents.
> Decisions are captured in CONTEXT.md — this log preserves the alternatives considered.
**Date:** 2026-05-15
**Phase:** 01-stabilize-video-pipeline
**Areas discussed:** Offscreen source-of-truth location, Ring-buffer mechanism, Tab-switch behavior, State survival across SW unload
---
## Offscreen source-of-truth location
| Option | Description | Selected |
|--------|-------------|----------|
| `src/offscreen/recorder.ts` as crxjs offscreen entry | Real TS module bundled by crxjs; HTML references the bundle; type-check + source maps; deletes both `offscreen/index.ts` and the inline `copy-offscreen` plugin | ✓ |
| `src/offscreen/recorder.ts` via `web_accessible_resources` | Same source file, declared as WAR; reachable from `chrome.runtime.getURL()`; more boilerplate | |
| Keep vite.config.ts inline string | Status quo — hard NO per audit; no type-check, no source maps, easy to lose | |
**User's choice:** crxjs offscreen entry (Recommended)
**Notes:** Aligns with the audit's primary structural recommendation. The inline plugin and dead `offscreen/index.ts` are both deleted in the same phase.
---
## Ring-buffer mechanism
| Option | Description | Selected |
|--------|-------------|----------|
| Single continuous MediaRecorder + age-based trim (`start(2000)`) | Spec-aligned timeslice; retain header forever, drop later chunks by age; verify with ffprobe; pivot to restart-segments if verification fails | ✓ |
| Restart-segments every ~10 s | Each segment self-contained valid WebM; bulletproof playability; ~200-300 ms gap per restart (~2-3% loss); higher RAM (3 segments in flight) | (fallback) |
| Cluster-aware EBML trim via ts-ebml | Theoretically optimal; ~80 KB dep + more test surface | (third fallback, in deferred) |
| Chunk-arrival-time trim, 200 ms timeslice (status quo) | Current broken approach | |
**User's choice:** "It seems that 2 is the most stable? But if you think 1 is better it is up to you."
After follow-up: locked **Option 1** with **Option 2 as documented fallback**.
**Notes:** Decision deferred to Claude after weighing the trade-offs; rationale captured in CONTEXT.md D-09..D-13. Plan-checker enforces ffprobe verification as the Phase 1 success gate; if it fails, the plan revision pivots to restart-segments without re-asking the user.
---
## Tab-switch behavior
| Option | Description | Selected |
|--------|-------------|----------|
| Hard restart (stop old recorder, start fresh on new tab) | SPEC §8-implied; spec-aligned | (rendered N/A by capture-API change) |
| Cross-tab carry-over | Preserve prior tab's buffer until new fills; ~2× RAM during overlap | |
| Per-tab buffers (one per recently-visited tab) | High RAM; risks CON-ram-ceiling | |
| **(emergent option from user: capture whole browser / whole screen)** | Switch capture API to `getDisplayMedia()` / `chrome.desktopCapture`; tab-switch handling becomes N/A | ✓ |
**User's choice:** "May we just record the whole browser? or the whole screen?" → confirmed in the follow-up question as **switch to chrome.desktopCapture / getDisplayMedia**.
**Notes:** This is a project-level amendment to DEC-003 (capture API), not just a Phase 1 implementation detail. Trade-off was made explicit to the user before confirming: Chrome shows a permanent "Sharing your screen" banner (loses SPEC §1 "silent for operator" property), picker dialog required on session start. User accepted. Phase 1's plan therefore includes the doc-amendment cascade (D-A1..D-A6) BEFORE any code work, and removes all `chrome.tabs.onActivated`/`onUpdated` reattachment logic.
---
## State survival across SW unload
| Option | Description | Selected |
|--------|-------------|----------|
| Move buffer ownership to offscreen + long-lived port keepalive | Offscreen survives SW; port resets idle timer (only mechanism that does); buffer in plain offscreen memory; honors CON-buffer-storage | ✓ |
| Same as above + chrome.storage.session belt-and-suspenders | Cold-start recovery on offscreen crash; borderline CON-buffer-storage compliance | (deferred to Phase 5) |
| Keep SW-owned buffer, add proper keepalive port | Smallest diff; SW can still die under memory pressure | |
| Accept buffer loss on SW unload | Don't fight lifecycle; violates "continuous" intent | |
**User's choice:** "one seems no problem." — locked **Option 1**.
**Notes:** `chrome.alarms` keepalive deleted (DEC-010 / CON-service-worker-keepalive amended). The `chrome.storage.session` belt-and-suspenders variant moved to Phase 5 deferred ideas. The SW-side IndexedDB code path in `vite.config.ts:43-104` and `src/background/index.ts:445-475` is deleted alongside.
---
## Claude's Discretion
- Internal protocol choice between `chrome.runtime.connect` port and `chrome.runtime.sendMessage` for offscreen↔SW messaging (decision: port for keepalive, sendMessage for one-shot requests — captured in CONTEXT.md D-19).
- Codec strictness — enforce vp9 only via `MediaRecorder.isTypeSupported`, fail loud if unavailable. No fallback chain (current code's vp9→vp8→h264→default is removed).
- Internal module naming for the offscreen recorder.
- Exact error-UX copy for "user stopped sharing" — deferred to Phase 3 (popup state machine territory).
- Code-style choices within `src/offscreen/` (already constrained by `tsconfig` strict mode).
## Deferred Ideas
- **Audio capture.** `getDisplayMedia({ audio: true })` makes this trivial — but SPEC §9 excludes audio from Phase 1. Phase 2 work.
- **Per-tab silent capture mode** (re-introducing `tabCapture` as an opt-in via `config.json`) for installations that prioritize silent operation over broad coverage.
- **Cluster-aware EBML trim (ts-ebml)** — third-line fallback under D-13 if even restart-segments don't produce playable output. Currently parked.
- **`chrome.storage.session` cold-start buffer-pointer recovery** — Phase 5 (Harden + clean up).
- **Error UX for "user stops sharing"** — popup state extension, Phase 3.