# Phase 1: Stabilize Video Pipeline - Research

**Researched:** 2026-05-15

**Domain:** Chrome MV3 extension, offscreen documents, `getDisplayMedia`, `MediaRecorder` ring buffer, WebM container, SW lifecycle, Vite + crxjs.

**Confidence:** HIGH on Chrome API contracts; HIGH on canonical patterns (verified against an in-the-wild production extension); MEDIUM on `MediaRecorder` cluster-boundary alignment with `timeslice=2000ms` (the spec and the Chromium docs are both silent, and published evidence is indirect; the D-13 fallback is in place as a mitigation).

## User Constraints (from CONTEXT.md)

### Locked Decisions

**Capture API (AMENDS DEC-003)**

This phase REPLACES the SPEC-locked `chrome.tabCapture` choice with `getDisplayMedia()` capture. Done eyes-open: the operator gains broader capture coverage at the cost of the SPEC §1 "silent operation" property. The doc cascade is enumerated in the **Doc Amendments (precede code)** subsection below.

- **D-01:** Capture mechanism is `navigator.mediaDevices.getDisplayMedia()` invoked **inside the offscreen document**. No more `chrome.tabCapture.getMediaStreamId`, no more SW-side gesture juggling.
- **D-02:** Offscreen document is created with `chrome.offscreen.Reason.DISPLAY_MEDIA` (replaces `USER_MEDIA`).
- **D-03:** One-time source picker on session start; the operator picks "screen" or "window" once. If they later click the Chrome "Stop sharing" banner or the captured source disappears, the offscreen surfaces an error to the SW and the popup re-prompts on next interaction. (Exact error-UX copy is deferred to Phase 3; see Deferred Ideas.)
- **D-04:** Operator UX is **NOT** silent. Chrome's permanent "Sharing your screen" indicator is shown while recording. We accept this as the cost of the API choice.
- **D-05:** `manifest.json` permissions follow the new API: `desktopCapture` replaces `tabCapture`; `activeTab` becomes unnecessary for the video pipeline but stays for `chrome.tabs.captureVisibleTab` (screenshot path, a Phase 3 concern, so it is kept).

**Offscreen source-of-truth location**

- **D-06:** Recorder code lives at **`src/offscreen/recorder.ts`** as a real TypeScript module with strict type-check, source maps, and IDE support.
- **D-07:** `offscreen/index.html` is rewritten to load the bundled module via crxjs. The runtime path remains `offscreen/index.html` (referenced from SW via `chrome.runtime.getURL('offscreen/index.html')`).
- **D-08:** **DELETE** `offscreen/index.ts` (orphaned dead code) and the entire `copy-offscreen` plugin block in `vite.config.ts:11-184`. crxjs picks up the new TS entry through the HTML reference.

**Ring-buffer mechanism**

- **D-09:** **Single continuous MediaRecorder** for the whole session. `mediaRecorder.start(2000)` so chunks land on cluster boundaries per the spec timeslice (DEC-003, SPEC §4.1). No restart strategy at this point.
- **D-10:** Retain the **first emitted chunk** (the chunk produced by the first `dataavailable` event after `start()`) **indefinitely**: it carries the EBML header plus the initial cluster. CON-webm-header-retention.
- **D-11:** Drop later chunks once they are older than 30 s, by chunk arrival timestamp. Keep header + every chunk newer than `now - 30000 ms`.
- **D-12:** Acceptance gate for Phase 1: `ffprobe -v error -f matroska -i ` must return exit 0 with no decoder warnings on a fresh-export sample. Plan-checker enforces this as a phase success criterion.
- **D-13:** **Fallback if D-12 fails:** revise the plan mid-phase to use *restart-segments* (stop + restart the MediaRecorder every 10 s, keep the 3 most-recent self-contained segments, concat on save). Documented as a known fallback so the planner can pre-stage the alternative structure in PLAN.md.
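
The retention rules in D-10 and D-11 can be sketched as a pair of pure functions. This is a minimal sketch, not the project's implementation: the `BufferedChunk` shape and the `BUFFER_WINDOW_MS` constant are assumptions made for illustration, and `data` is typed structurally so the logic can run outside a browser (in the offscreen document it would be a `Blob`).

```typescript
// Sketch only: names and shapes here are illustrative assumptions.
// The rules themselves come from D-10 (pin the first chunk) and
// D-11 (drop chunks older than 30 s by arrival timestamp).
const BUFFER_WINDOW_MS = 30_000;

interface BufferedChunk {
  data: { size: number }; // a Blob in the real offscreen document
  timestamp: number;      // arrival time in milliseconds
  isHeader: boolean;      // true only for the first emitted chunk
}

// Drops every non-header chunk that is not newer than `now - 30000 ms`.
function trimAged(buffer: readonly BufferedChunk[], now: number): BufferedChunk[] {
  const cutoff = now - BUFFER_WINDOW_MS;
  return buffer.filter(
    (chunk: BufferedChunk) => chunk.isHeader || chunk.timestamp > cutoff,
  );
}

// Appends a chunk, marking it as the header when the buffer is still empty,
// then applies the age trim so the buffer never grows past the window.
function addChunk(
  buffer: readonly BufferedChunk[],
  data: { size: number },
  now: number,
): BufferedChunk[] {
  const chunk: BufferedChunk = {
    data,
    timestamp: now,
    isHeader: buffer.length === 0,
  };
  return trimAged([...buffer, chunk], now);
}
```

Because the header chunk is never evicted, `buffer.length === 0` is only true before the first `dataavailable` event, which is what makes it usable as the header test. Keeping both functions free of `chrome.*` and `MediaRecorder` references means they can be unit-tested in plain Node.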
**Tab-switch behavior**

- **D-14:** **Not applicable** under the new capture API. `getDisplayMedia()` captures a screen or window, not a tab, so there is nothing to re-attach on `chrome.tabs.onActivated`. Phase 1 explicitly **removes** any tab-switch handling from `src/background/index.ts`.
- **D-15:** Operator switching tabs no longer interrupts the recording; the buffer keeps filling regardless of active tab.

**State survival across SW unload**

- **D-16:** Video buffer **ownership moves to the offscreen document**. The offscreen survives SW unloads because it holds the `DISPLAY_MEDIA`-reason capture; chunks accumulate there.
- **D-17:** A long-lived `chrome.runtime.connect` port from offscreen → SW serves as the keepalive (this is the only mechanism that actually resets the SW idle timer; `chrome.alarms` callbacks do not, contrary to DEC-010).
- **D-18:** **DELETE** the `chrome.alarms` keepalive (`src/background/index.ts:171-178`). DEC-010 and CON-service-worker-keepalive are amended in the doc cascade below.
- **D-19:** On export, SW requests the buffer from offscreen over the port (or one-shot `chrome.runtime.sendMessage`). SW does **NOT** cache chunks. CON-buffer-storage is honored: the buffer is a plain JS variable in offscreen memory, no `chrome.storage.session`, no IndexedDB. The existing IndexedDB code path in `vite.config.ts:43-104` is **DELETED** along with the inline plugin.

**Doc Amendments (precede code)**

These document edits **MUST** ship before any code-touching task in this phase, so downstream phases see a consistent baseline:

- **D-A1:** Amend `.planning/intel/decisions.md` DEC-003 to record the `getDisplayMedia` replacement, with rationale and the explicit silent-operation trade-off. Amend DEC-010 to record port keepalive replacing alarms keepalive.
- **D-A2:** Amend `.planning/intel/constraints.md` to **RETIRE** CON-tab-capture-binding and CON-service-worker-keepalive. Add new CON-display-capture-binding (one-time picker, "Sharing" indicator).
- **D-A3:** Amend `.planning/PROJECT.md` Key Decisions table (DEC-003, DEC-010) and Constraints section accordingly.
- **D-A4:** Amend `.planning/REQUIREMENTS.md` REQ-video-ring-buffer to remove "active-tab" wording and update API binding.
- **D-A5:** Amend `.planning/ROADMAP.md` Phase 1 description and Success Criterion #2 (drop the "tab re-attach" clause).
- **D-A6:** Amend `manifest.json`: swap `tabCapture` → `desktopCapture` in `permissions`. Keep `activeTab` for the screenshot path.

### Claude's Discretion

- Exact protocol choice for offscreen↔SW messaging (port for keepalive + sendMessage for one-shot vs port-only).
- Codec strictness: enforce `video/webm; codecs=vp9` via `MediaRecorder.isTypeSupported`; fail loud if unsupported (no fallback chain; the current code's vp9→vp8→h264→default fallback is removed).
- Internal naming for the new buffer-owning module (offscreen-recorder vs display-recorder etc.).
- Code-style choices around TS strictness within `src/offscreen/` (already on `"strict": true` per tsconfig).

### Deferred Ideas (OUT OF SCOPE)

- **Error UX for "user stopped sharing" mid-session.** The popup needs a state for this; it is Phase 3 territory (REQ-popup-ui state machine extension).
- **Audio capture.** `getDisplayMedia()` makes audio capture trivial (`audio: true`), but SPEC §9 explicitly excludes audio from Phase 1 (Phase 2 work, CAP-01). Capture this as an easier-now-than-before follow-up.
- **Per-tab silent capture mode** as an opt-in via `config.json`. Could re-introduce tabCapture for installations that prioritize silent operation over broad coverage. Future phase if there's demand.
- **Cluster-aware EBML trim (ts-ebml).** Not needed for Phase 1 if continuous + age-trim verifies via ffprobe. Keep on the shelf as a third fallback under D-13.
- **`chrome.storage.session` cold-start recovery.** Buffer pointer rehydration after offscreen crash. Phase 5 (Harden + clean up) territory.
## Phase Requirements

| ID | Description | Research Support |
|----|-------------|------------------|
| REQ-video-ring-buffer | 30 s active-tab video ring buffer captured via `MediaRecorder` at `video/webm; codecs=vp9` @ 400 kbps with 2 s timeslice. AMENDED: capture API is `getDisplayMedia()` (D-01), not `chrome.tabCapture`. First chunk (WebM header) retained indefinitely (CON-webm-header-retention); subsequent chunks rotate out by 30 s TTL. **Capture is always-on**: starts on first popup invocation, runs continuously regardless of which tab the operator is on (no tab re-attach needed, because display capture is screen/window-bound, not tab-bound). | (1) Canonical pattern for SW + offscreen + getDisplayMedia confirmed by Google sample + working production extension (Proscreen-S3). (2) WebM header / cluster trim semantics documented under "Pitfall 1" + "Validation Architecture". (3) Port-keepalive replaces alarm-keepalive per Chrome 110+ docs. (4) `MediaRecorder.start(2000)` semantics documented under Pitfall 1 with D-13 fallback if cluster alignment fails ffprobe gate. |

## Project Constraints (from CLAUDE.md)

> No project-level CLAUDE.md exists at `/home/parf/projects/work/repremium/CLAUDE.md`.
> User's global `~/.claude/CLAUDE.md` applies; relevant excerpts:

- **Iterative development:** Small, reviewable changes. Break large work into phases. Plans should be concise (< 100 lines); detail goes into context/research files.
- **Extension over duplication:** Add functionality to existing code via options/parameters rather than parallel implementations. *(Applies to reusing `videoBuffer`/`cleanupVideoBuffer` patterns from the current SW: preserve structure, relocate to offscreen.)*
- **Defensive coding:** Validate external dependencies and environment early; fail fast with clear error messages.
  *(Codec fail-loud via `MediaRecorder.isTypeSupported`; track-ended detection.)*
- **Naming:** Full words, `isFoo`/`hasFoo`/`shouldFoo` for booleans, `SCREAMING_SNAKE` for true constants.
- **Tools first:** Use automated tools before manual edits. *(crxjs handles the offscreen build; do not hand-roll Vite plugins.)*
- **Verify claims before presenting.** Cite authoritative sources.
- **TypeScript:** Type arrow-function parameters explicitly.
- **Don't ignore lint/type errors without research.** *(Maps to audit P1 #13: no `as any`, no `@ts-ignore` in new code.)*
- **Naming convention violation already in repo:** `mediaRecorder` (camel) shadowing module-level `let mediaRecorder` is the exact P0 #2 defect we are fixing; rename the module-level variable to avoid recurrence.

> **Note on the codebase's Russian inline comments:** The user's global
> rule prefers Python/Google style guides, but this repo is a TypeScript
> extension built to a Russian-authored SPEC. Inline Russian comments are
> idiomatic and preserved per the SPEC's source-of-truth language (also
> reaffirmed in CONTEXT.md "Established patterns"). User-facing strings
> ("Сохранить отчёт об ошибке" etc.) are part of the contract.
## Summary

The audit's seven P0 defects boil down to two structural problems in this phase: **(a) the offscreen runtime lives as a string literal inside `vite.config.ts:11-184` and shadows the real `offscreen/index.ts`, with a shadow `let mediaRecorder` that makes `stopRecording` a no-op**; **(b) the ring-buffer math is right in `src/background/index.ts` but the lifecycle plumbing is wrong**: `mediaRecorder.start(200)` produces too-short chunks that mostly don't start on WebM cluster boundaries, capture only begins when the popup is opened, the SW's `chrome.alarms` keepalive does run but the SW still loses its `videoBuffer` array between idle unloads, and the SW's `VIDEO_CHUNK` message handler expects a Blob that `chrome.runtime.sendMessage` cannot transmit (forcing the buggy IndexedDB workaround in `vite.config.ts:43-104`).

CONTEXT.md amends DEC-003 to `getDisplayMedia()` instead of `chrome.tabCapture`: an eyes-open trade-off of broader capture coverage at the cost of the Chrome "Sharing your screen" banner. This is a canonical Chrome MV3 pattern: [CITED: developer.chrome.com/docs/extensions/how-to/web-platform/screen-capture] "To record in the background and across navigations, use an offscreen document with the DISPLAY_MEDIA reason." We have at least one in-the-wild production extension (Proscreen-S3) confirming the exact architecture works.

**Primary recommendation:** Build `src/offscreen/recorder.ts` as a real TS module that owns: (1) a single continuous `MediaRecorder` started with `timeslice=2000`, (2) the in-memory ring buffer with WebM-header pinning and 30 s arrival-timestamp trim, (3) a long-lived `chrome.runtime.connect` port to the SW that doubles as the SW keepalive, and (4) a single on-demand `GET_BUFFER` handler that returns the chunks for ZIP packaging. The SW shrinks to: offscreen lifecycle management + port handling + manifest-time recording bootstrap.
The verification gate is `ffprobe -v error` on a fresh export sample. If that fails because cluster boundaries don't align with the 2 s timeslice, fall back to D-13's restart-segments strategy (pre-staged in PLAN.md so we don't have to re-plan mid-phase).

## Architectural Responsibility Map

| Capability | Primary Tier | Secondary Tier | Rationale |
|------------|-------------|----------------|-----------|
| Display capture (`getDisplayMedia`) | Offscreen Document | - | SW has no DOM and cannot hold a `MediaStream`. Chrome 116+ requires `chrome.offscreen.Reason.DISPLAY_MEDIA`. [CITED: developer.chrome.com/docs/extensions/reference/api/offscreen] |
| MediaRecorder lifecycle | Offscreen Document | - | `MediaRecorder` instances are tied to a `MediaStream` which lives in the offscreen DOM context. |
| In-memory ring buffer | Offscreen Document | - | SW unloads after ~30 s idle (Chrome 110+ rules); offscreen survives because it owns the `DISPLAY_MEDIA` capture. |
| Codec capability check (`isTypeSupported`) | Offscreen Document | - | API is on `MediaRecorder`, which is offscreen-bound. SW reports the result for telemetry. |
| Offscreen lifecycle (create / close / hasDocument) | Service Worker | - | `chrome.offscreen.*` API is SW-bound. |
| Long-lived port keepalive | Offscreen Document → SW | - | Offscreen initiates `chrome.runtime.connect()` because it is the long-living party with a real reason to stay alive. SW receives the port. |
| Buffer export on user action | Service Worker | Offscreen Document | SW receives popup message, requests buffer from offscreen over the port, returns chunks to popup. |
| Manifest permission boundary | Manifest | - | `desktopCapture` for the API name (CONTEXT.md D-A6); `offscreen` to gate `chrome.offscreen.*`. Note: `getDisplayMedia()` itself is a web standard API and does NOT require `desktopCapture` (which gates only `chrome.desktopCapture.chooseDesktopMedia`). Including `desktopCapture` is harmless and matches CONTEXT.md D-05. [VERIFIED: chrome.desktopCapture API docs] |
| Stop-sharing recovery | Offscreen Document | Service Worker | `MediaStreamTrack.onended` fires inside offscreen; offscreen messages SW; SW updates state for popup (popup state machine is Phase 3 territory). |

## Standard Stack

### Core

| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| `@crxjs/vite-plugin` | `^2.4.0` (currently `^2.0.0-beta.25` in `package.json`) | Vite plugin that reads `manifest.json`, bundles each entry (SW, content scripts, popup, offscreen HTML), and produces a Chrome-loadable `dist/`. | Standard build for MV3 + TS + Vite per the project's existing setup (DEC-012). [VERIFIED: npm view @crxjs/vite-plugin version returned 2.4.0 on 2026-05-15] |
| `@types/chrome` | `^0.1.42` (currently `^0.0.268` in `package.json`) | Type definitions for the `chrome.*` namespace including `chrome.offscreen.Reason.DISPLAY_MEDIA`. | Audit P1 #13 calls out that the current `0.0.268` is stale; the project needs to bump to drop the `as any` on `reasons: ['USER_MEDIA']`. [VERIFIED: npm view @types/chrome version returned 0.1.42 on 2026-05-15] |
| `vite` | `^8.0.13` (currently `^5.4.2` in `package.json`) | Bundler. | Already a hard project decision (DEC-012). Phase 1 does NOT mandate a Vite bump; sticking with 5.4 is fine, and the bump is a Phase 5 housekeeping task. [VERIFIED: npm view vite version returned 8.0.13 on 2026-05-15] |
| `typescript` | `^6.0.3` (currently `^5.5.4` in `package.json`) | Type-check. Strict mode is already enabled in `tsconfig.json`. | Project decision. Phase 1 keeps 5.5; same Phase 5 housekeeping observation. [VERIFIED: npm view typescript version returned 6.0.3 on 2026-05-15] |

> **No new dependencies are needed for Phase 1.** `JSZip` and `rrweb`
> stay untouched (Phase 2 / 3 territory). All new code uses the standard
> Web Platform APIs (`MediaRecorder`, `navigator.mediaDevices`,
> `chrome.offscreen`, `chrome.runtime.connect`).
### Supporting (Phase 1 specifically uses)

| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| Web Platform: `MediaRecorder` | Built-in | Encode the captured `MediaStream` into a chunked WebM stream. | Inside the offscreen, after `getDisplayMedia()` returns a stream. |
| Web Platform: `navigator.mediaDevices.getDisplayMedia` | Built-in | Acquire the operator's choice of screen/window/tab as a `MediaStream`. | Inside the offscreen, once on session start, in the message handler for `START_RECORDING`. |
| Chrome API: `chrome.offscreen.{createDocument, closeDocument, hasDocument, Reason}` | Chrome 109+ for API; Chrome 116+ recommended baseline (matches the canonical Google sample's `minimum_chrome_version`). | Create + tear down the offscreen runtime. | SW only. |
| Chrome API: `chrome.runtime.{connect, sendMessage, onConnect, onMessage}` | Built-in | Cross-context messaging. | Both SW and offscreen. |

### Alternatives Considered (Honored CONTEXT.md, recorded for completeness)

| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| `getDisplayMedia()` in offscreen | `chrome.tabCapture.getMediaStreamId` in SW + `getUserMedia({chromeMediaSource: 'tab'})` in offscreen (canonical Google sample pattern) | Tab-scoped only; silent (no Chrome banner); requires user-gesture juggling on first activation; loses capture on tab switch. **Rejected per CONTEXT.md D-01.** |
| `getDisplayMedia()` in offscreen | `chrome.desktopCapture.chooseDesktopMedia` in SW + redeem ID in offscreen | Chrome-specific; doc explicitly says streamId not usable in offscreen MV3 [CITED: groups.google.com chromium-extensions/3RanHldyp9c]. **Not viable.** |
| Single continuous recorder + age-trim | Restart-segments (10 s self-contained segments, keep 3 most-recent) | Each segment is its own valid WebM, concat-on-save is trivial, but burns ~3× more keyframes (bigger files). **Held in reserve as D-13 fallback** if `ffprobe -v error` fails on the simpler approach. |
| Restart-segments | ts-ebml header injection on save | More plumbing, dependency, and runtime cost. **Held in reserve as third fallback per CONTEXT.md deferred.** |

**Installation:** No `npm install` needed for Phase 1 (zero new deps). Type-bump for `@types/chrome` (`^0.0.268` → `^0.1.42`) is a one-line `package.json` edit, optional within this phase but recommended.

**Version verification:** All package versions in the table above are verified via `npm view <package> version` on 2026-05-15.

## Architecture Patterns

### System Architecture Diagram

```
┌────────────────────────────────────────────────────────────────────────┐
│                         Operator interactions                          │
└────────────────────────────────────────────────────────────────────────┘
      │ click popup
      ▼
┌────────────┐   REQUEST_PERMISSIONS / GET_VIDEO_BUFFER    ┌──────────────┐
│ popup      │ ──────────────────────────────────────────► │ Service      │
│ (Russian   │ ◄────────────────────────────────────────── │ Worker       │
│ state-mc)  │                  responses                  │ (background) │
└────────────┘                                             └──────┬───────┘
                                                                  │ chrome.offscreen.createDocument
                                                                  │ ({reasons:['DISPLAY_MEDIA']})
                                                                  ▼
                                                     ┌───────────────────┐
     long-lived                                      │ Offscreen Doc     │
     port (keepalive +                               │ (DOM context)     │
     buffer fetch)                                   │                   │
     SW ◄──────────────────────────────────────────► │ recorder.ts       │
                                                     │ - getDisplayMedia │
                                                     │ - MediaRecorder   │
                                                     │ - ring buffer     │
                                                     │ - track.onended   │
                                                     └─────┬─────────────┘
                                                           │ navigator.mediaDevices
                                                           │   .getDisplayMedia()
                                                           ▼
                                                     [ Chrome native ]
                                                     [ source picker ]
                                                     [ + Sharing UI ]
                                                           │
                                                           ▼
                                                     ┌──────────────────┐
                                                     │ MediaStream      │
                                                     │ (screen/window)  │
                                                     └─────┬────────────┘
                                                           │ MediaRecorder.start(2000)
                                                           ▼
                                                     dataavailable chunks
                                                     (every ~2000 ms)
                                                           │
                                                           ▼
                                                     in-memory ring buffer
                                                     (offscreen JS array)

Data flow on export (Phase 3 territory but the SW↔offscreen contract is locked here):

  popup --SAVE_ARCHIVE--> SW --GET_BUFFER--> offscreen
  offscreen --VIDEO_CHUNKS--> SW --(merge)--> popup --(jszip + download)
```

| Component | File | Responsibilities |
|-----------|------|------------------|
| Operator-facing popup | `src/popup/index.{ts,html,css}` | UI state machine, click handlers, archive trigger. Phase 3 owns most edits; Phase 1 touches it only minimally to unwire the dead `REQUEST_PERMISSIONS` path. |
| Service Worker (background coordinator) | `src/background/index.ts` | Offscreen lifecycle (`createDocument` / `closeDocument` / `hasDocument`), port handling, buffer-fetch on export, message routing. **Shrinks substantially** in this phase. |
| Offscreen recorder (NEW) | `src/offscreen/recorder.ts` | `getDisplayMedia` call, `MediaRecorder` instance, ring buffer, codec capability check, port to SW (keepalive + on-demand buffer push), `MediaStreamTrack.onended` handler. |
| Offscreen page (NEW) | `src/offscreen/index.html` | Minimal HTML referencing `recorder.ts` via a `<script>` tag. crxjs picks it up. |
| Manifest | `manifest.json` | Swap `tabCapture` → `desktopCapture`. Add nothing else; `offscreen` is already declared. |
| Vite config | `vite.config.ts` | Collapse to a clean `crx({manifest, contentScripts: {injectCss: false}})` + `rollupOptions.input` entry for offscreen HTML. Delete the entire 174-line `copy-offscreen` plugin block. |

### Recommended Project Structure

```
repremium/
├── manifest.json               # swap tabCapture→desktopCapture
├── vite.config.ts              # collapse to ~30 lines
├── src/
│   ├── background/
│   │   └── index.ts            # shrinks: lifecycle + port + export
│   ├── content/
│   │   └── index.ts            # untouched in Phase 1
│   ├── popup/
│   │   ├── index.html          # untouched in Phase 1
│   │   ├── index.ts            # minor: drop dead REQUEST_PERMISSIONS path
│   │   └── style.css           # untouched
│   ├── offscreen/              # NEW directory (replaces top-level offscreen/)
│   │   ├── index.html          # NEW:
```

### Example B: Minimal `vite.config.ts` (REPLACES the 184-line current one)

```typescript
// Source: crxjs documentation + discussion #919
import { defineConfig } from 'vite';
import { crx } from '@crxjs/vite-plugin';
import manifest from './manifest.json';

export default defineConfig({
  plugins: [
    crx({ manifest, contentScripts: { injectCss: false } }),
  ],
  build: {
    rollupOptions: {
      input: {
        // Extra HTML entry so the offscreen page is bundled alongside the manifest entries
        offscreen: 'src/offscreen/index.html',
      },
    },
  },
});
```

### Example C: SW ensure-offscreen pattern (snippet for `src/background/index.ts`)

```typescript
// Source: github.com/GoogleChrome/chrome-extensions-samples/tree/main/functional-samples/sample.tabcapture-recorder/service-worker.js
// [VERIFIED: canonical Google sample, license Apache-2.0]
async function ensureOffscreenDocument(): Promise<void> {
  // Only one offscreen document may exist per extension, so check first
  const existing = await chrome.runtime.getContexts({
    contextTypes: [chrome.runtime.ContextType.OFFSCREEN_DOCUMENT],
  });
  if (existing.length > 0) return;

  await chrome.offscreen.createDocument({
    url: 'src/offscreen/index.html',
    reasons: [chrome.offscreen.Reason.DISPLAY_MEDIA],
    justification: 'Continuous screen recording for operator session diagnostics',
  });
}
```

### Example D: ffprobe verification (used in the acceptance gate D-12)

```bash
# Source: ffmpeg.org/ffprobe.html, exit code semantics:
#   0 = recognized media; >0 = could not open / not multimedia / decode error
# Force-format -f matroska because WebM is a Matroska subset and helps
# ffprobe choose the right demuxer when the file is "live" (no SeekHead).
ffprobe -v error -f matroska -i last_30sec.webm
echo "ffprobe exit: $?"

# Optional: dump cluster timeline for diagnosis if exit != 0
ffprobe -v error -show_packets -i last_30sec.webm 2>&1 | head -50
```

### Example E: Codec capability strict-mode (CONTEXT.md D-20)

```typescript
// Source: MDN MediaRecorder.isTypeSupported + CONTEXT.md D-20
const VIDEO_MIME = 'video/webm;codecs=vp9';
const VIDEO_BITRATE = 400_000; // CON-video-codec
const TIMESLICE_MS = 2000;     // CON-video-codec / SPEC §4.1

// Fail loud: no vp8/h264 fallback chain
if (!MediaRecorder.isTypeSupported(VIDEO_MIME)) {
  const ua = navigator.userAgent;
  chrome.runtime.sendMessage({
    type: 'RECORDING_ERROR',
    error: `vp9 unsupported. UA=${ua}`,
  });
  throw new Error(`MediaRecorder mime not supported: ${VIDEO_MIME}; UA=${ua}`);
}

const videoRecorder = new MediaRecorder(stream, {
  mimeType: VIDEO_MIME,
  videoBitsPerSecond: VIDEO_BITRATE,
});
videoRecorder.start(TIMESLICE_MS);
```

### Example F: `MediaStreamTrack.onended` for "Stop sharing"

```typescript
// Source: MDN MediaStreamTrack#ended_event
stream.getTracks().forEach((track) => {
  track.addEventListener('ended', () => {
    // Clear the buffer (the captured source is gone)
    ringBuffer.length = 0;
    // Disconnect the port so SW can clean up
    port?.disconnect();
    // Notify SW for state transition; popup state change is Phase 3 territory
    chrome.runtime.sendMessage({ type: 'RECORDING_ERROR', error: 'user-stopped-sharing' });
    // Stop the recorder explicitly
    if (videoRecorder.state !== 'inactive') videoRecorder.stop();
  }, { once: true });
});
```

## State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| Background page (persistent) in MV2 | MV3 service worker | Chrome 88 → MV3 default; MV2 sunset 2024 | All capture APIs must be reachable from SW or offscreen, NOT a persistent page. Drives the SW + offscreen split. |
| `chrome.desktopCapture.chooseDesktopMedia` returning a streamId redeemable in any context | streamId from `chrome.desktopCapture` not usable in offscreen MV3 | Chrome 109+ offscreen API rollout | Forces the choice between (a) tabCapture + USER_MEDIA pattern (canonical Google sample) or (b) getDisplayMedia + DISPLAY_MEDIA pattern (CONTEXT.md D-01..D-05). [CITED: groups.google.com chromium-extensions/3RanHldyp9c] |
| `chrome.alarms` as the universal SW keepalive | Long-lived port `postMessage` traffic | Chrome 110+ "all events reset idle timer" + Chrome 114 "Sending a message with long-lived messaging keeps the service worker alive" + Chrome 116 WebSockets | Alarms still work in Chrome 110+ but are no longer the recommended primary keepalive for offscreen-paired extensions. [CITED: developer.chrome.com/blog/longer-esw-lifetimes] |
| `rrweb.record({maskInputSelector: ...})` | `rrweb.record({maskInputFn: ...})` | rrweb 2.0.0-alpha | Not Phase 1 territory (Phase 2 owns it), but flagged because the audit lists it as a P0. The current code uses `maskTextSelector` which is yet a third thing and is wrong (audit P0 #6). |
| Tab capture as active-tab-bound, requiring re-attach on `chrome.tabs.onActivated` | Display capture as screen/window-bound, NO re-attach (CONTEXT.md D-14/D-15) | This phase (DEC-003 AMENDED) | Deletes `chrome.tabs.onActivated` and `chrome.tabs.onUpdated` listener requirements from REQ-video-ring-buffer. |

**Deprecated/outdated:**

- `chrome.tabCapture.capture()` (the legacy callback form), replaced by `chrome.tabCapture.getMediaStreamId` + offscreen `getUserMedia` redemption. We're abandoning this whole path per CONTEXT.md D-01.
- `mandatory: { chromeMediaSource: 'tab' }` constraint syntax, a Chrome-specific extension to `getUserMedia`. Phase 1 doesn't use it (we use the standard `getDisplayMedia`).
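
The port-keepalive row in the State of the Art table can be sketched from the offscreen side as follows. This is a hedged illustration rather than the planned implementation: `PortLike`, `startKeepalive`, the `KEEPALIVE` message type, and the 20 s ping interval are all assumptions introduced here, and the `connect` factory is injected so the logic can be exercised without the `chrome.*` globals.

```typescript
// Sketch only: PortLike mirrors the small part of chrome.runtime.Port used here.
interface PortLike {
  postMessage(message: unknown): void;
  disconnect(): void;
  onDisconnect: { addListener(callback: () => void): void };
}

// Assumption: any interval comfortably under the ~30 s SW idle timeout would do.
const KEEPALIVE_PING_MS = 20_000;

// Opens a port, pings over it on a timer, and reopens it whenever the other
// side drops it. Returns a function that stops the keepalive.
function startKeepalive(
  connect: () => PortLike,
  pingMs: number = KEEPALIVE_PING_MS,
): () => void {
  let port: PortLike | null = null;
  let isStopped = false;

  const open = (): void => {
    if (isStopped) return;
    const current = connect();
    port = current;
    current.onDisconnect.addListener((): void => {
      if (port === current) port = null;
      open(); // reconnect unless the keepalive was stopped
    });
  };

  open();
  const timer = setInterval((): void => {
    port?.postMessage({ type: 'KEEPALIVE' }); // hypothetical message type
  }, pingMs);

  return (): void => {
    isStopped = true;
    clearInterval(timer);
    port?.disconnect();
    port = null;
  };
}
```

In the offscreen document the factory would presumably be something like `() => chrome.runtime.connect({ name: 'offscreen-keepalive' })`, where the port name is again illustrative. Injecting the factory keeps the reconnect logic testable in plain Node with a stub port.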
## Assumptions Log

| # | Claim | Section | Risk if Wrong |
|---|-------|---------|---------------|
| A1 | Restart-segments fallback structural sketch (Pattern 3) | Architecture Patterns / Pattern 3 | Low: the pattern is an inferred application of standard MediaRecorder semantics; if it fails, we have the third-tier ts-ebml deferred fallback. The risk is implementation-time, not phase-blocking. |
| A2 | Chrome enforces ~5 minute lifetime on long-lived ports (Pattern 5 / Pitfall 4) | Pitfall 4 | MEDIUM: multiple community sources corroborate, but no canonical Chrome doc states the exact limit. If the limit is shorter, our reconnect should still recover. If longer, our 290s reconnect is just defensive overhead. |
| A3 | `MediaRecorder.start(2000)` produces chunks that align with cluster boundaries about half the time (consequence of Chrome's `kf_max_dist=100` and 30 fps default) | Pitfall 1 / Pattern 2 | HIGH: this is the load-bearing claim that makes Pattern 2 work *at all*. The ffprobe gate (D-12) is exactly the mitigation; if ffprobe rejects, we escalate to Pattern 3 by design. So the assumption is **already mitigated by the plan's fallback structure**. |
| A4 | Chrome propagates transient user activation through `chrome.runtime.sendMessage` for the SW → offscreen → `getDisplayMedia` chain | Pattern 1 + Pitfall 2 | LOW: verified against a real production extension (Proscreen-S3) doing exactly this. Mitigation: OFFSCREEN_READY handshake (Pattern 4) tightens the timing window so we never exceed the ~5 s activation budget. |
| A5 | The 30-second window's "30" is an upper bound, not an exact target (CON-video-window allows ±10 s slack for the restart-segments fallback) | Pattern 3 | LOW: REQUIREMENTS.md says "the most recent 30 seconds" and "no more than 30 seconds", which our restart-segments stays inside (3×10 s = 30 s exactly at one phase of rotation, dropping to 20 s right after rotation). User confirmation desirable but the contract permits it. |
| A6 | `getDisplayMedia()` does NOT need `desktopCapture` permission in the manifest (it's a web standard API; `desktopCapture` only gates `chrome.desktopCapture.chooseDesktopMedia`) | Architectural Responsibility Map (Manifest row) + Standard Stack | LOW: multiple sources confirm. CONTEXT.md D-05 chooses to declare `desktopCapture` anyway, which is harmless. If we DROPPED `desktopCapture` from the manifest, the only ill effect would be losing the option to call `chrome.desktopCapture.chooseDesktopMedia` (which we don't use). |
| A7 | The `chrome.runtime.getContexts` API is available in Chrome ≥ 116 and is the recommended way to test for an existing offscreen document (replaces `chrome.offscreen.hasDocument`) | Pattern 1 / Example C | MEDIUM: `chrome.offscreen.hasDocument` is the older, simpler check and still works. The canonical Google sample uses `getContexts`. Either works; planner can pick. |

**If this table contains items:** The planner should treat them as candidates for user verification during `/gsd-plan-phase` review.

## Open Questions

1. **Will `MediaRecorder.start(2000)` produce ffprobe-clean WebM on a typical screen-cap?**
   - What we know: Cluster boundaries align with keyframes; Chrome keyframes appear every ~3-5 s by default (vp9 `kf_max_dist=100` on a 30 fps stream); timeslice does NOT force keyframes.
   - What's unclear: How often *in practice* does a 2 s timeslice happen to land at a cluster boundary for a desktop screen-cap (which has lots of static frames and may have different keyframe cadence than a webcam)?
   - Recommendation: Build Pattern 2 first; run the D-12 ffprobe gate; keep Pattern 3 (restart-segments) pre-staged in PLAN.md per CONTEXT.md D-13 so we don't re-plan if Pattern 2 fails. Plan-checker can ratchet this in the success criteria.
2. **Does the 5-minute port lifetime kill the recording session?**
   - What we know: Multiple corroborating community sources cite a ~5 minute hard cap on long-lived ports.
   - What's unclear: Whether the cap applies to *port lifetime* (the port object dies and must be reconnected) OR to *SW lifetime extension* (after 5 minutes of port keepalive, the SW is killed anyway and the port goes with it).
   - Recommendation: Pessimistic: assume the worst, reconnect every ~290 s. Cheap defensive code. If we learn the cap is different, the reconnect is still harmless.
3. **What's the exact crxjs path-emit behavior for the offscreen entry?**
   - What we know: The discussion #919 working answer uses `input: { offscreen: 'src/offscreen/offscreen.html' }` and SW fetches `chrome.runtime.getURL('src/offscreen/offscreen.html')`.
   - What's unclear: Some crxjs versions strip the leading `src/`; the 2.0.0-beta vs 2.4.0 difference might matter.
   - Recommendation: After the first `npm run build`, inspect `dist/` to confirm the actual emitted path, then encode that path as a constant in SW. This is a verifiable runtime check, not a design decision.

## Environment Availability

| Dependency | Required By | Available | Version | Fallback |
|------------|------------|-----------|---------|----------|
| Node.js | Vite, TypeScript, npm | ✓ | v24.14.0 | - |
| npm | Dep install | ✓ | 11.9.0 | - |
| ffprobe (FFmpeg) | D-12 acceptance gate; ffprobe-based verification of every export sample | ✓ | 8.1.1 | None needed (ffprobe is the gate) |
| Chrome / Chromium | Manual smoke test (unpacked load → Сохранить отчёт → inspect dist) | ✗ | - | Plan must call out "manual test requires Chrome ≥ 116; install via `apt install google-chrome-stable` or note the gap to the operator." |
| Playwright / chromium-test-runner | Optional headed-Chrome integration tests (see Validation Architecture) | ✗ | - | Phase 1 acceptance does NOT require Playwright. Manual smoke is acceptable per ROADMAP Phase 4. If we want unit-test coverage for the trim logic, Vitest in node mode is enough. |
| node_modules/ | `vite build`, `tsc` | ✗ | - | Run `npm install` at start of phase; no fallback. |

**Missing dependencies with no fallback (blocking execution):**

- `node_modules/`: must run `npm install` once before any TS/Vite work. Add as Wave 0 task.

**Missing dependencies with fallback (acceptable):**

- Chrome browser: manual smoke is Phase 4's job; for Phase 1, type-check + ffprobe-on-test-fixture is the deepest automated gate. If the developer doesn't have Chrome installed, the plan still completes; the Phase 4 ROADMAP item is where Chrome becomes mandatory.
- Playwright: not needed; see Validation Architecture below for why.

## Validation Architecture

Nyquist validation is enabled (`workflow.tdd_mode: true` in `.planning/config.json`). The validation strategy is layered:

### Test Framework

| Property | Value |
|----------|-------|
| Framework | **Vitest** (Node mode for pure logic; Browser mode if needed for `MediaRecorder` mocks). Recommended, NOT currently installed. Vite is already a dev dep so Vitest is a zero-friction add. |
| Config file | NONE; Wave 0 creates `vitest.config.ts`. |
| Quick run command | `npx vitest run --reporter=dot` (after install) |
| Full suite command | `npx vitest run` + `npm run build` (typecheck via `tsc --noEmit`) + ffprobe gate (D-12) |

**Why not Jest:** `vite` is already the build tool; Vitest is the zero-config-mismatch choice. No transformer dance for TS.

**Why not Playwright:** `MediaRecorder` + `getDisplayMedia` ARE driveable in Chromium via Playwright with permissions auto-granted, but the acceptance gate (ffprobe on a real exported file) requires actually running the extension. Manual smoke + ffprobe is sufficient for Phase 1. Playwright-driven smoke tests are Phase 4/5 territory.

**What's testable in Node-only Vitest:**

- Ring buffer logic (`addChunk`, `trimAged`): pure function, takes `{data: {size: number}, timestamp: number, isHeader: boolean}[]` and returns the trimmed array. Mock `Blob` as `{size: N, type: 'video/webm'}`.
- Message handlers (mock `chrome.runtime` with `vitest-chrome` or a lightweight stub).
- Port lifecycle / reconnect logic.
- Codec strict-mode error path (mock `MediaRecorder.isTypeSupported` → false).

**What's NOT testable in Vitest, requires manual smoke / Phase 4:**
- The actual `getDisplayMedia` flow (browser picker).
- Real WebM playability (covered by ffprobe gate on a test-fixture file).
- SW idle-unload survival (covered by manual DevTools "Force stop" test in Phase 4 smoke checklist).

### Phase Requirements → Test Map

| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|--------------|
| REQ-video-ring-buffer | Ring buffer adds chunk; first chunk gets `isHeader: true` | unit | `npx vitest run tests/offscreen/ring-buffer.test.ts -t "first chunk is header"` | ❌ Wave 0 |
| REQ-video-ring-buffer | Ring buffer evicts chunks older than 30 s; keeps header | unit | `npx vitest run tests/offscreen/ring-buffer.test.ts -t "trim 30s"` | ❌ Wave 0 |
| REQ-video-ring-buffer | Codec strict-mode throws when vp9 unsupported (D-20) | unit | `npx vitest run tests/offscreen/codec-check.test.ts` | ❌ Wave 0 |
| REQ-video-ring-buffer | OFFSCREEN_READY message sent on listener registration | unit | `npx vitest run tests/offscreen/handshake.test.ts` | ❌ Wave 0 |
| REQ-video-ring-buffer | Port reconnect on disconnect within 1 s | unit | `npx vitest run tests/offscreen/port.test.ts -t "reconnects"` | ❌ Wave 0 |
| REQ-video-ring-buffer | SW deletes alarms keepalive (D-18) | type-check / grep | `! grep -RIn "chrome.alarms" src/background/` | NO CODE NEEDED (CI grep) |
| REQ-video-ring-buffer | SW deletes IndexedDB code path (D-19) | grep | `! grep -RIn "VideoRecorderDB\|openIndexedDB" src/` | NO CODE NEEDED (CI grep) |
| REQ-video-ring-buffer | `vite.config.ts:11-184` inline plugin deleted (D-08) | grep | `! grep -RIn "copy-offscreen\|chromeMediaSource" vite.config.ts` | NO CODE NEEDED |
| REQ-video-ring-buffer (acceptance gate D-12) | `last_30sec.webm` plays ffprobe-clean | integration (manual smoke + ffprobe) | `ffprobe -v error -f matroska -i sample/last_30sec.webm; echo $?` | ❌ Sample fixture produced manually for this gate, OR captured by Playwright in Phase 4. **For Phase 1, run on the file the manual smoke produces.** |
| REQ-video-ring-buffer | Type-check passes with zero `as any` and zero `@ts-ignore` regressions | static | `npx tsc --noEmit && ! grep -RIn "as any\|@ts-ignore" src/` | EXISTS (`tsc --noEmit` in `npm run build`) |
| REQ-video-ring-buffer | Manifest permission swap (D-A6 / D-05) | grep | `! grep "tabCapture" manifest.json && grep "desktopCapture" manifest.json` | NO CODE NEEDED |
| REQ-video-ring-buffer | Build produces a loadable extension | manual | `npm run build && ls dist/manifest.json dist/src/offscreen/index.html dist/assets/*.js` | NO TEST FILE; CI shell check. Adjust the offscreen path once the emitted location is confirmed (Open Question 3). |

### Sampling Rate

- **Per task commit:** `npx vitest run --reporter=dot && npx tsc --noEmit` (≤ 10 s).
- **Per wave merge:** Full Vitest + `npm run build` + grep guards (≤ 30 s).
- **Phase gate (D-12):** Manually load `dist/` into Chrome, capture a test session, and click save. Extract `video/last_30sec.webm` from the downloaded `~/Downloads/session_report_*.zip` via `unzip -p` (e.g. redirect the output to `/tmp/last_30sec.webm`), run `ffprobe -v error -f matroska -i` on the extracted file, and confirm exit 0 with zero stderr lines.

### Wave 0 Gaps

- [ ] Install Vitest: `npm install -D vitest@^3 @vitest/ui` (verify current major via `npm view vitest version` at the time of install).
- [ ] `vitest.config.ts`: pull in path aliases from `tsconfig.json`.
- [ ] `tests/offscreen/` directory with at minimum:
  - `ring-buffer.test.ts`: covers REQ-video-ring-buffer trim & header pinning.
  - `codec-check.test.ts`: covers D-20 strict-mode error path.
  - `handshake.test.ts`: covers Pattern 4 OFFSCREEN_READY.
  - `port.test.ts`: covers Pattern 5 reconnect.
- [ ] `tests/fixtures/`: keep a known-good WebM for ffprobe sanity (e.g. produced once on a developer machine and committed). Used by CI to verify the ffprobe gate runs at all.
- [ ] `npm test` script in `package.json`: `"test": "vitest run"`.
- [ ] CI? Out of scope per audit P2 #22 (Phase 5).

## Security Domain

> Default per `.planning/config.json`: `security_enforcement` is absent → treated as enabled (per researcher contract).

### Applicable ASVS Categories

| ASVS Category | Applies | Standard Control |
|---------------|---------|-----------------|
| V2 Authentication | No | No authentication surface in Phase 1 (local-only, no server). |
| V3 Session Management | No | No sessions. |
| V4 Access Control | Yes (limited) | Manifest permissions are the access-control boundary. Minimize: `desktopCapture` is unnecessary if we use only `getDisplayMedia` (web API), but harmless. `tabCapture` is being REMOVED. `host_permissions: ["<all_urls>"]` remains for content-script injection (Phase 2 territory). |
| V5 Input Validation | Yes (limited) | The streamId input is no longer applicable (the new path does not use streamIds), so the only "input" Phase 1 handles is inter-context messages. Each `chrome.runtime.onMessage` handler should validate `msg.type` against the typed `MessageType` enum (already exists in `src/shared/types.ts`). |
| V6 Cryptography | No | No crypto. |
| V14 Configuration | Yes | `manifest.json` enumerates the permission set verbatim. The Doc-Cascade tasks (D-A1..D-A6) keep `.planning/intel/constraints.md` in lockstep with `manifest.json`. |

### Known Threat Patterns for Chrome MV3 extensions

| Pattern | STRIDE | Standard Mitigation |
|---------|--------|---------------------|
| Untrusted message origin (cross-extension message injection) | Spoofing | Every `chrome.runtime.onMessage` listener should check `sender.id === chrome.runtime.id`. The current code doesn't; Phase 1 should add it where it adds new listeners (low effort). |
| `<all_urls>` host permission exposes the SW to messages from any content script on any site | Tampering | Already in design (REQ-manifest-permissions). The mitigation is that the SW only processes messages from its own content script (validated by `sender.id` check). |
| Stored video buffer contains sensitive operator session data | Information Disclosure | CON-buffer-storage: in-memory only, no persistence. CONTEXT.md D-19 reinforces (no IndexedDB, no `chrome.storage.session`). |
| Captured video may show passwords typed into other apps (since `getDisplayMedia` can grab the whole screen) | Information Disclosure | OUT OF SCOPE per Phase 1: this is exactly the trade-off accepted in CONTEXT.md D-04. The Chrome "Sharing" banner is the user-facing mitigation. Phase 2's password masking applies to rrweb / event-log, not to video pixels. |
| `eval` or string-injected code | Tampering | The `vite.config.ts:35-213` inline-string offscreen JS is effectively static (no user input), but it IS string-injected build output. CSP for MV3 extensions disallows `eval`, but a long template literal is allowed. Phase 1 DELETES this, which is also a security improvement. |

**Phase 1 has no novel security surface** beyond the manifest swap (D-A6) and the sender-id check best-practice.

## Sources

### Primary (HIGH confidence)

- developer.chrome.com: `chrome.offscreen` API reference, `Reason` enum values, including `DISPLAY_MEDIA`. Confirmed via direct fetch on 2026-05-15.
- developer.chrome.com: Audio recording and screen capture guide, including the canonical "use offscreen + DISPLAY_MEDIA" sentence. Fetched verbatim via gh API on 2026-05-15.
- developer.chrome.com: Service worker lifecycle. Fetched; confirms Chrome 110 "all events reset idle timer", Chrome 114 "message via long-lived messaging keeps SW alive".
- developer.chrome.com: Longer extension SW lifetimes blog.
- developer.chrome.com: `chrome.alarms` API reference. Confirms 30 s minimum period (Chrome 120+) for store-loaded; unpacked has no limit.
- GoogleChrome/chrome-extensions-samples: `functional-samples/sample.tabcapture-recorder/`. Fetched all files via gh API; confirms the offscreen + USER_MEDIA pattern (the close cousin of our DISPLAY_MEDIA pattern).
- MDN: `MediaRecorder.start()`. Confirms timeslice is purely time-based, NOT codec-aware.
- ffmpeg.org: ffprobe documentation. Exit code semantics for the D-12 gate.

### Secondary (MEDIUM confidence; verified with multiple sources)

- bugzilla.mozilla.org #1666487: quote from Andreas Pehrson. Chrome's default keyframe cadence (`kf_max_dist=100`) cross-confirmed by Chrome's MediaRecorder README.
- crxjs/chrome-extension-tools: Discussion #919 "Set up offscreen with TypeScript" and follow-up #1060. Working pattern for HTML + TS module entry.
- Mozilla Firefox bug #1666487: Pehrson's design rationale on timeslice-vs-keyframe.
- Graham King's blog: "Reading MediaRecorder's webm/opus output". Third-party EBML walkthrough; confirms that MediaRecorder doesn't split on SimpleBlock.
- chrome-extensions-samples issue #1111: "Sample for chrome.offscreen". Confirms there is NO official sample for DISPLAY_MEDIA + getDisplayMedia.
- ngocquy020196/Proscreen-S3: in-the-wild production extension, `src/offscreen/recorder.ts`. Confirms the exact CONTEXT.md D-01..D-05 architecture works in practice.
- schniti269/meeting_mate: second corroborating real extension.
- crxjs.dev: Vite plugin docs. Confirms manifest-driven entry but multi-entry HTML needs `rollupOptions.input`.
- GitHub gist sunnyguan/f94058f66fab89e59e75b1ac1bf1a06e: MV3 keepalive patterns including port reconnect at 290 s.
- developer.chrome.com issue #2688: clarifies that the original "native messaging port keeps SW alive" claim has caveats.
### Tertiary (LOW confidence; flagged for cross-validation)

- chromium-extensions group thread on getDisplayMedia in offscreen: one thread suggests user-gesture issues in offscreen; this appears contradicted by Proscreen-S3 working. Resolution: empirical testing during Wave 1 (manual smoke).
- recall.ai blog post on how to build a Chrome recording extension: uses tabCapture pattern (not our path), but confirms the high-level three-component split.
- Stack Overflow #62236838, concatenation of MediaRecorder WebM chunks: cited content via WebSearch results only (no direct fetch; site blocked); pattern matches what I confirmed via Graham King's blog and ts-ebml docs.

## Metadata

**Confidence breakdown:**
- Standard stack & versions: HIGH. All verified via `npm view`.
- Architecture (offscreen + DISPLAY_MEDIA + port keepalive): HIGH. Verified against (a) official Chrome docs, (b) Google sample (offscreen + USER_MEDIA, same architectural shape), (c) at least two in-the-wild production extensions doing the exact DISPLAY_MEDIA path.
- Ring-buffer pattern: MEDIUM-HIGH. The structural pattern is solid; the open question is cluster-boundary alignment of `start(2000)`, which is *the* assumption the ffprobe gate (D-12) and the D-13 fallback are designed to handle.
- Common pitfalls: HIGH. Every pitfall ties to a specific audit defect or a citable Chrome doc / Chromium bug.
- Validation strategy: MEDIUM. The unit-testable surface is real and documented; the integration test gap (browser/picker) is genuine but accepted (Phase 4 territory).
- Security: HIGH for what's in scope; nothing exotic.

**Research date:** 2026-05-15

**Valid until:** 2026-06-15 (30 days, stable-ecosystem assumption). Re-validate sooner if Chrome releases a 12X version that changes SW lifecycle rules or the offscreen API stability promise. The most volatile finding is A2 (5-minute port lifetime cap): the Chrome team has been actively tuning this.
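For reference, the ring-buffer rule that carries the MEDIUM-HIGH rating above (first chunk pinned as the header, later chunks evicted once older than 30 s) can be sketched as a pair of pure functions. This is a minimal sketch under assumptions, not the project's implementation: the names `addChunk` and `trimAged` and the chunk shape come from the Validation Architecture notes, but the exact signatures, the `ChunkData` stand-in for `Blob`, and the `WINDOW_MS` constant are hypothetical.

```typescript
// Hypothetical sketch: signatures are assumptions, only the names
// `addChunk` / `trimAged` and the chunk shape appear in the research notes.

/** Structural stand-in for Blob so the logic runs in plain Node. */
interface ChunkData {
  size: number;
}

interface BufferedChunk {
  data: ChunkData;
  timestamp: number; // arrival time in ms
  isHeader: boolean; // true only for the first chunk after start()
}

// Assumed constant: the 30 s retention window.
const WINDOW_MS = 30_000;

/** Appends a chunk; the very first chunk is flagged as the header. */
function addChunk(
  buffer: readonly BufferedChunk[],
  data: ChunkData,
  now: number,
): BufferedChunk[] {
  return [...buffer, { data, timestamp: now, isHeader: buffer.length === 0 }];
}

/** Keeps the header plus every chunk strictly newer than now - windowMs. */
function trimAged(
  buffer: readonly BufferedChunk[],
  now: number,
  windowMs: number = WINDOW_MS,
): BufferedChunk[] {
  const cutoff = now - windowMs;
  return buffer.filter((chunk) => chunk.isHeader || chunk.timestamp > cutoff);
}

// Usage: three chunks arrive at t = 0 s, 2 s and 40 s, then we trim at t = 40 s.
let demo: BufferedChunk[] = [];
demo = addChunk(demo, { size: 100 }, 0);
demo = addChunk(demo, { size: 200 }, 2_000);
demo = addChunk(demo, { size: 300 }, 40_000);
demo = trimAged(demo, 40_000);
// The header (t = 0) and the t = 40 s chunk remain; the t = 2 s chunk is evicted.
```

Keeping both functions pure (they return new arrays and take `now` as a parameter) is what makes them testable in Node-only Vitest without mocking `MediaRecorder` or the clock.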
---

*Phase: 01-stabilize-video-pipeline*
*Research completed: 2026-05-15*