mokosh/.planning/phases/01-stabilize-video-pipeline/01-RESEARCH.md
Mark 36a323718c docs(01): research phase 1 domain — getDisplayMedia + offscreen + ring buffer
Researched Chrome MV3 offscreen + DISPLAY_MEDIA, MediaRecorder cluster
alignment, SW port keepalive, crxjs offscreen entry, ffprobe verification.
Identified the D-12/D-13 fallback hinge: timeslice=2000ms does NOT force
keyframe alignment (Chrome kf_max_dist=100); Pattern 2 (age-trim) may need
to escalate to Pattern 3 (restart-segments) if ffprobe rejects.

Architecture verified against two in-the-wild production extensions
(Proscreen-S3, meeting_mate) using the exact CONTEXT.md D-01..D-05 path.
The OFFSCREEN_READY handshake (audit P1 #12) and long-lived port keepalive
(audit P1 #8) are wired together.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 16:08:04 +02:00


Phase 1: Stabilize Video Pipeline — Research

Researched: 2026-05-15

Domain: Chrome MV3 extension, offscreen documents, getDisplayMedia, MediaRecorder ring buffer, WebM container, SW lifecycle, Vite + crxjs.

Confidence: HIGH on Chrome API contracts; HIGH on canonical patterns (verified against an in-the-wild production extension); MEDIUM on MediaRecorder cluster-boundary alignment with timeslice=2000ms (both the spec and the Chromium docs are silent, and published evidence is indirect; the D-13 fallback is the mitigation).

<user_constraints>

User Constraints (from CONTEXT.md)

Locked Decisions

Capture API — AMENDS DEC-003

This phase REPLACES the SPEC-locked chrome.tabCapture choice with getDisplayMedia() capture. Done eyes-open: the operator gains broader capture coverage at the cost of the SPEC §1 "silent operation" property. The doc cascade is enumerated in the Doc Amendments (precede code) subsection below.

  • D-01: Capture mechanism is navigator.mediaDevices.getDisplayMedia() invoked inside the offscreen document. No more chrome.tabCapture.getMediaStreamId, no more SW-side gesture juggling.
  • D-02: Offscreen document is created with chrome.offscreen.Reason.DISPLAY_MEDIA (replaces USER_MEDIA).
  • D-03: One-time source picker on session start; the operator picks "screen" or "window" once. If they later click the Chrome "Stop sharing" banner or the captured source disappears, the offscreen surfaces an error to the SW and the popup re-prompts on next interaction. (Exact error-UX copy is deferred to Phase 3 — see Deferred Ideas.)
  • D-04: Operator UX is NOT silent. Chrome's permanent "Sharing your screen" indicator is shown while recording. We accept this as the cost of the API choice.
  • D-05: manifest.json permissions follow the new API: desktopCapture replaces tabCapture; activeTab becomes unnecessary for the video pipeline but stays for chrome.tabs.captureVisibleTab (screenshot path, Phase 3 concern — kept).

Offscreen source-of-truth location

  • D-06: Recorder code lives at src/offscreen/recorder.ts as a real TypeScript module with strict type-check, source maps, and IDE support.
  • D-07: offscreen/index.html is rewritten to load the bundled module via crxjs. The runtime path remains offscreen/index.html (referenced from SW via chrome.runtime.getURL('offscreen/index.html')).
  • D-08: DELETE offscreen/index.ts (orphaned dead code) and the entire copy-offscreen plugin block in vite.config.ts:11-184. crxjs picks up the new TS entry through the HTML reference.

Ring-buffer mechanism

  • D-09: Single continuous MediaRecorder for the whole session. mediaRecorder.start(2000) so chunks land on cluster boundaries per the spec timeslice (DEC-003, SPEC §4.1). No restart strategy at this point.
  • D-10: Retain the first emitted chunk (the chunk produced by the first dataavailable event after start()) indefinitely — it carries the EBML header plus the initial cluster. CON-webm-header-retention.
  • D-11: Drop later chunks once they are older than 30 s, by chunk arrival timestamp. Keep header + every chunk newer than now - 30000 ms.
  • D-12: Acceptance gate for Phase 1: ffprobe -v error -f matroska -i <last_30sec.webm> must return exit 0 with no decoder warnings on a fresh-export sample. Plan-checker enforces this as a phase success criterion.
  • D-13: Fallback if D-12 fails: revise the plan mid-phase to use restart-segments (stop + restart the MediaRecorder every 10 s, keep the 3 most-recent self-contained segments, concat on save). Documented as a known fallback so the planner can pre-stage the alternative structure in PLAN.md.

Tab-switch behavior

  • D-14: Not applicable under the new capture API. getDisplayMedia() captures a screen or window, not a tab — there is nothing to re-attach on chrome.tabs.onActivated. Phase 1 explicitly removes any tab-switch handling from src/background/index.ts.
  • D-15: Operator switching tabs no longer interrupts the recording — the buffer keeps filling regardless of active tab.

State survival across SW unload

  • D-16: Video buffer ownership moves to the offscreen document. The offscreen survives SW unloads because it holds the DISPLAY_MEDIA-reason capture; chunks accumulate there.
  • D-17: A long-lived chrome.runtime.connect port from offscreen → SW serves as the keepalive (this is the only mechanism that actually resets the SW idle timer — chrome.alarms callbacks do not, contrary to DEC-010).
  • D-18: DELETE the chrome.alarms keepalive (src/background/index.ts:171-178). DEC-010 and CON-service-worker-keepalive are amended in the doc cascade below.
  • D-19: On export, SW requests the buffer from offscreen over the port (or one-shot chrome.runtime.sendMessage). SW does NOT cache chunks. CON-buffer-storage is honored — buffer is plain JS variable in offscreen memory, no chrome.storage.session, no IndexedDB. The existing IndexedDB code path in vite.config.ts:43-104 is DELETED along with the inline plugin.

Doc Amendments (precede code)

These document edits MUST ship before any code-touching task in this phase, so downstream phases see a consistent baseline:

  • D-A1: Amend .planning/intel/decisions.md DEC-003 to record the getDisplayMedia replacement, with rationale and the explicit silent-operation trade-off. Amend DEC-010 to record port keepalive replacing alarms keepalive.
  • D-A2: Amend .planning/intel/constraints.md to RETIRE CON-tab-capture-binding and CON-service-worker-keepalive. Add new CON-display-capture-binding (one-time picker, "Sharing" indicator).
  • D-A3: Amend .planning/PROJECT.md Key Decisions table (DEC-003, DEC-010) and Constraints section accordingly.
  • D-A4: Amend .planning/REQUIREMENTS.md REQ-video-ring-buffer to remove "active-tab" wording and update API binding.
  • D-A5: Amend .planning/ROADMAP.md Phase 1 description and Success Criterion #2 (drop the "tab re-attach" clause).
  • D-A6: Amend manifest.json: swap tabCapture → desktopCapture in permissions. Keep activeTab for the screenshot path.

Claude's Discretion

  • Exact protocol choice for offscreen↔SW messaging (port for keepalive + sendMessage for one-shot vs port-only).
  • Codec strictness: enforce video/webm; codecs=vp9 via MediaRecorder.isTypeSupported; fail loud if unsupported (no fallback chain — current code's vp9→vp8→h264→default fallback is removed).
  • Internal naming for the new buffer-owning module (offscreen-recorder vs display-recorder etc.).
  • Code-style choices around TS strictness within src/offscreen/ (already on "strict": true per tsconfig).

Deferred Ideas (OUT OF SCOPE)

  • Error UX for "user stopped sharing" mid-session. The popup needs a state for this — Phase 3 territory (REQ-popup-ui state machine extension).
  • Audio capture. getDisplayMedia() makes audio capture trivial (audio: true), but SPEC §9 explicitly excludes audio from Phase 1 (Phase 2 work — CAP-01). Capture this as an easier-now-than-before follow-up.
  • Per-tab silent capture mode as an opt-in via config.json. Could re-introduce tabCapture for installations that prioritize silent operation over broad coverage. Future phase if there's demand.
  • Cluster-aware EBML trim (ts-ebml). Not needed for Phase 1 if continuous + age-trim verifies via ffprobe. Keep on the shelf as a third fallback under D-13.
  • chrome.storage.session cold-start recovery. Buffer pointer rehydration after offscreen crash. Phase 5 (Harden + clean up) territory.

</user_constraints>

<phase_requirements>

Phase Requirements

| ID | Description | Research Support |
| --- | --- | --- |
| REQ-video-ring-buffer | 30 s active-tab video ring buffer captured via MediaRecorder at video/webm; codecs=vp9 @ 400 kbps with 2 s timeslice. AMENDED: capture API is getDisplayMedia() (D-01), not chrome.tabCapture. First chunk (WebM header) retained indefinitely (CON-webm-header-retention); subsequent chunks rotate out by 30 s TTL. Capture is always-on: starts on first popup invocation, runs continuously regardless of which tab the operator is on (no tab re-attach needed, because display capture is screen/window-bound, not tab-bound). | (1) Canonical pattern for SW + offscreen + getDisplayMedia confirmed by Google sample + working production extension (Proscreen-S3). (2) WebM header / cluster trim semantics documented under "Pitfall 1" + "Validation Architecture". (3) Port-keepalive replaces alarm-keepalive per Chrome 110+ docs. (4) MediaRecorder.start(2000) semantics documented under Pitfall 1 with D-13 fallback if cluster alignment fails ffprobe gate. |

</phase_requirements>

Project Constraints (from CLAUDE.md)

No project-level CLAUDE.md exists at /home/parf/projects/work/repremium/CLAUDE.md. User's global ~/.claude/CLAUDE.md applies — relevant excerpts:

  • Iterative development: Small, reviewable changes. Break large work into phases. Plans should be concise (< 100 lines); detail goes into context/research files.
  • Extension over duplication: Add functionality to existing code via options/parameters rather than parallel implementations. (Applies to reusing videoBuffer/cleanupVideoBuffer patterns from the current SW — preserve structure, relocate to offscreen.)
  • Defensive coding: Validate external dependencies and environment early; fail fast with clear error messages. (Codec fail-loud via MediaRecorder.isTypeSupported; track-ended detection.)
  • Naming: Full words, isFoo/hasFoo/shouldFoo for booleans, SCREAMING_SNAKE for true constants.
  • Tools first: Use automated tools before manual edits. (crxjs handles the offscreen build; do not hand-roll Vite plugins.)
  • Verify claims before presenting. Cite authoritative sources.
  • TypeScript: Type arrow-function parameters explicitly.
  • Don't ignore lint/type errors without research. (Maps to audit P1 #13: no as any, no @ts-ignore in new code.)
  • Naming convention violation already in repo: mediaRecorder (camel) shadowing module-level let mediaRecorder is the exact P0 #2 defect we are fixing — rename module-level to avoid recurrence.
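
The codec fail-loud rule from the defensive-coding item can be sketched as below. This is a minimal sketch, not the project's code: requireSupportedMimeType and MimeTypeProbe are hypothetical names, and the support probe is injected so the check can run outside a browser. In the offscreen document the probe would be (mimeType) => MediaRecorder.isTypeSupported(mimeType).

```typescript
// Required recording format, taken from REQ-video-ring-buffer.
const REQUIRED_MIME_TYPE = 'video/webm;codecs=vp9';

// Hypothetical type: a function answering "can this MIME type be recorded?".
type MimeTypeProbe = (mimeType: string) => boolean;

// Returns the MIME type when it is supported; throws otherwise.
// There is deliberately no vp8/h264 fallback chain.
function requireSupportedMimeType(isTypeSupported: MimeTypeProbe): string {
  if (!isTypeSupported(REQUIRED_MIME_TYPE)) {
    throw new Error(`MediaRecorder does not support ${REQUIRED_MIME_TYPE}`);
  }
  return REQUIRED_MIME_TYPE;
}
```

Calling this once before constructing the MediaRecorder keeps the failure at session start, where the error can be reported, rather than at the first dataavailable event.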

Note on the codebase's Russian inline comments: The user's global rule prefers Python/Google style guides, but this repo is a TypeScript extension built to a Russian-authored SPEC. Inline Russian comments are idiomatic and preserved per the SPEC's source-of-truth language (also reaffirmed in CONTEXT.md "Established patterns"). User-facing strings ("Сохранить отчёт об ошибке" etc.) are part of the contract.

Summary

The audit's seven P0 defects boil down to two structural problems in this phase.

(a) The offscreen runtime lives as a string literal inside vite.config.ts:11-184 and shadows the real offscreen/index.ts; a shadowing let mediaRecorder declaration there makes stopRecording a no-op.

(b) The ring-buffer math in src/background/index.ts is right, but the lifecycle plumbing is wrong in four ways: mediaRecorder.start(200) produces too-short chunks that mostly don't start on WebM cluster boundaries; capture only begins when the popup is opened; the SW's chrome.alarms keepalive does run, but the SW still loses its videoBuffer array between idle unloads; and the SW's VIDEO_CHUNK message handler expects a Blob, which chrome.runtime.sendMessage cannot transmit (forcing the buggy IndexedDB workaround in vite.config.ts:43-104).

CONTEXT.md amends DEC-003 to getDisplayMedia() instead of chrome.tabCapture — eyes-open trade-off, broader capture coverage at the cost of the Chrome "Sharing your screen" banner. This is a canonical Chrome MV3 pattern: [CITED: developer.chrome.com/docs/extensions/how-to/web-platform/screen-capture] "To record in the background and across navigations, use an offscreen document with the DISPLAY_MEDIA reason." We have at least one in-the-wild production extension (Proscreen-S3) confirming the exact architecture works.

Primary recommendation: Build src/offscreen/recorder.ts as a real TS module that owns: (1) a single continuous MediaRecorder started with timeslice=2000, (2) the in-memory ring buffer with WebM-header pinning and 30 s arrival-timestamp trim, (3) a long-lived chrome.runtime.connect port to the SW that doubles as the SW keepalive, and (4) a single on-demand GET_BUFFER handler that returns the chunks for ZIP packaging. The SW shrinks to: offscreen lifecycle management + port handling + manifest-time recording bootstrap. The verification gate is ffprobe -v error on a fresh export sample — if that fails because cluster boundaries don't align with the 2 s timeslice, fall back to D-13's restart-segments strategy (pre-staged in PLAN.md so we don't have to re-plan mid-phase).

Architectural Responsibility Map

| Capability | Primary Tier | Secondary Tier | Rationale |
| --- | --- | --- | --- |
| Display capture (getDisplayMedia) | Offscreen Document | | SW has no DOM and cannot hold a MediaStream. Chrome 116+ requires chrome.offscreen.Reason.DISPLAY_MEDIA. [CITED: developer.chrome.com/docs/extensions/reference/api/offscreen] |
| MediaRecorder lifecycle | Offscreen Document | | MediaRecorder instances are tied to a MediaStream which lives in the offscreen DOM context. |
| In-memory ring buffer | Offscreen Document | | SW unloads after ~30 s idle (Chrome 110+ rules); offscreen survives because it owns the DISPLAY_MEDIA capture. |
| Codec capability check (isTypeSupported) | Offscreen Document | | API is on MediaRecorder, which is offscreen-bound. SW reports the result for telemetry. |
| Offscreen lifecycle (create / close / hasDocument) | Service Worker | | chrome.offscreen.* API is SW-bound. |
| Long-lived port keepalive | Offscreen Document → SW | | Offscreen initiates chrome.runtime.connect() because it is the long-living party with a real reason to stay alive. SW receives the port. |
| Buffer export on user action | Service Worker | Offscreen Document | SW receives popup message, requests buffer from offscreen over the port, returns chunks to popup. |
| Manifest permission boundary | Manifest | | desktopCapture for the API name (CONTEXT.md D-A6); offscreen to gate chrome.offscreen.*. Note: getDisplayMedia() itself is a web standard API and does NOT require desktopCapture (which gates only chrome.desktopCapture.chooseDesktopMedia). Including desktopCapture is harmless and matches CONTEXT.md D-05. [VERIFIED: chrome.desktopCapture API docs] |
| Stop-sharing recovery | Offscreen Document | Service Worker | MediaStreamTrack.onended fires inside offscreen; offscreen messages SW; SW updates state for popup (popup state machine is Phase 3 territory). |

Standard Stack

Core

| Library | Version | Purpose | Why Standard |
| --- | --- | --- | --- |
| @crxjs/vite-plugin | ^2.4.0 (currently ^2.0.0-beta.25 in package.json) | Vite plugin that reads manifest.json, bundles each entry (SW, content scripts, popup, offscreen HTML), and produces a Chrome-loadable dist/. | Standard build for MV3 + TS + Vite per the project's existing setup (DEC-012). [VERIFIED: npm view @crxjs/vite-plugin version returned 2.4.0 on 2026-05-15] |
| @types/chrome | ^0.1.42 (currently ^0.0.268 in package.json) | Type definitions for the chrome.* namespace including chrome.offscreen.Reason.DISPLAY_MEDIA. | Audit P1 #13 calls out that the current 0.0.268 is stale; the project needs to bump to drop the as any on reasons: ['USER_MEDIA']. [VERIFIED: npm view @types/chrome version returned 0.1.42 on 2026-05-15] |
| vite | ^8.0.13 (currently ^5.4.2 in package.json) | Bundler. | Already a hard project decision (DEC-012). Phase 1 does NOT mandate a Vite bump: sticking with 5.4 is fine; the bump is a Phase 5 housekeeping task. [VERIFIED: npm view vite version returned 8.0.13 on 2026-05-15] |
| typescript | ^6.0.3 (currently ^5.5.4 in package.json) | Type-check. Strict mode is already enabled in tsconfig.json. | Project decision. Phase 1 keeps 5.5; same Phase 5 housekeeping observation. [VERIFIED: npm view typescript version returned 6.0.3 on 2026-05-15] |

No new dependencies are needed for Phase 1. JSZip and rrweb stay untouched (Phase 2 / 3 territory). All new code uses the standard Web Platform APIs (MediaRecorder, navigator.mediaDevices, chrome.offscreen, chrome.runtime.connect).

Supporting (Phase 1 specifically uses)

| Library | Version | Purpose | When to Use |
| --- | --- | --- | --- |
| Web Platform: MediaRecorder | Built-in | Encode the captured MediaStream into a chunked WebM stream. | Inside the offscreen, after getDisplayMedia() returns a stream. |
| Web Platform: navigator.mediaDevices.getDisplayMedia | Built-in | Acquire the operator's choice of screen/window/tab as a MediaStream. | Inside the offscreen, once on session start, in the message handler for START_RECORDING. |
| Chrome API: chrome.offscreen.{createDocument, closeDocument, hasDocument, Reason} | Chrome 109+ for API; Chrome 116+ recommended baseline (matches the canonical Google sample's minimum_chrome_version). | Create + tear down the offscreen runtime. | SW only. |
| Chrome API: chrome.runtime.{connect, sendMessage, onConnect, onMessage} | Built-in | Cross-context messaging. | Both SW and offscreen. |

Alternatives Considered (Honored CONTEXT.md, recorded for completeness)

| Instead of | Could Use | Tradeoff |
| --- | --- | --- |
| getDisplayMedia() in offscreen | chrome.tabCapture.getMediaStreamId in SW + getUserMedia({chromeMediaSource: 'tab'}) in offscreen (canonical Google sample pattern) | Tab-scoped only; silent (no Chrome banner); requires user-gesture juggling on first activation; loses capture on tab switch. Rejected per CONTEXT.md D-01. |
| getDisplayMedia() in offscreen | chrome.desktopCapture.chooseDesktopMedia in SW + redeem ID in offscreen | Chrome-specific; doc explicitly says streamId not usable in offscreen MV3 [CITED: groups.google.com chromium-extensions/3RanHldyp9c]. Not viable. |
| Single continuous recorder + age-trim | Restart-segments (10 s self-contained segments, keep 3 most-recent) | Each segment is its own valid WebM, concat-on-save is trivial, but burns ~3× more keyframes (bigger files). Held in reserve as D-13 fallback if ffprobe -v error fails on the simpler approach. |
| Restart-segments | ts-ebml header injection on save | More plumbing, dependency, and runtime cost. Held in reserve as third fallback per CONTEXT.md deferred. |

Installation: No npm install needed for Phase 1 (zero new deps). Type-bump for @types/chrome (^0.0.268 → ^0.1.42) is a one-line package.json edit, optional within this phase but recommended.

Version verification: All package versions in the table above are verified via npm view <pkg> version on 2026-05-15.

Architecture Patterns

System Architecture Diagram

┌────────────────────────────────────────────────────────────────────────┐
│ Operator interactions                                                  │
└────────────────────────────────────────────────────────────────────────┘
        │ click popup
        ▼
┌────────────┐  REQUEST_PERMISSIONS / GET_VIDEO_BUFFER     ┌──────────────┐
│   popup    │ ──────────────────────────────────────────► │  Service     │
│ (Russian   │ ◄────────────────────────────────────────── │   Worker     │
│  state-mc) │       responses                             │ (background) │
└────────────┘                                             └──────┬───────┘
                                                                  │
                                                  chrome.offscreen.createDocument
                                                  ({reasons:['DISPLAY_MEDIA']})
                                                                  │
                                                                  ▼
                                                          ┌──────────────────┐
                                  long-lived              │  Offscreen Doc   │
                                  port (keepalive +       │  (DOM context)   │
                                  buffer fetch)           │                  │
                       SW ◄──────────────────────────────►│  recorder.ts     │
                                                          │  - getDisplayMedia
                                                          │  - MediaRecorder │
                                                          │  - ring buffer   │
                                                          │  - track.onended │
                                                          └─────┬────────────┘
                                                                │
                                                  navigator.mediaDevices
                                                  .getDisplayMedia()
                                                                │
                                                                ▼
                                                          [ Chrome native ]
                                                          [ source picker ]
                                                          [ + Sharing UI  ]
                                                                │
                                                                ▼
                                                          ┌──────────────────┐
                                                          │ MediaStream      │
                                                          │ (screen/window)  │
                                                          └─────┬────────────┘
                                                                │
                                                  MediaRecorder.start(2000)
                                                                │
                                                                ▼
                                                       dataavailable chunks
                                                       (every ~2000 ms)
                                                                │
                                                                ▼
                                                       in-memory ring buffer
                                                       (offscreen JS array)

Data flow on export (Phase 3 territory but the SW↔offscreen contract is
locked here):
   popup --SAVE_ARCHIVE--> SW --GET_BUFFER--> offscreen
   offscreen --VIDEO_CHUNKS--> SW --(merge)--> popup --(jszip + download)

| Component | File | Responsibilities |
| --- | --- | --- |
| Operator-facing popup | src/popup/index.{ts,html,css} | UI state machine, click handlers, archive trigger. Phase 3 owns most edits; Phase 1 touches it only minimally to unwire the dead REQUEST_PERMISSIONS path. |
| Service Worker (background coordinator) | src/background/index.ts | Offscreen lifecycle (createDocument / closeDocument / hasDocument), port handling, buffer-fetch on export, message routing. Shrinks substantially in this phase. |
| Offscreen recorder (NEW) | src/offscreen/recorder.ts | getDisplayMedia call, MediaRecorder instance, ring buffer, codec capability check, port to SW (keepalive + on-demand buffer push), MediaStreamTrack.onended handler. |
| Offscreen page (NEW) | src/offscreen/index.html | Minimal HTML referencing recorder.ts via <script type="module" src="./recorder.ts"></script>. crxjs picks it up. |
| Manifest | manifest.json | Swap tabCapture → desktopCapture. Add nothing else; offscreen is already declared. |
| Vite config | vite.config.ts | Collapse to a clean crx({manifest, contentScripts: {injectCss: false}}) + rollupOptions.input entry for offscreen HTML. Delete the entire 174-line copy-offscreen plugin block. |

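
The SW↔offscreen contract named in the data-flow notes could be typed as a discriminated union in src/shared/types.ts. The sketch below is an assumption about shape only: the message names come from this document, while EncodedChunk, isOffscreenMessage and describeMessage are hypothetical. Chunk payloads are modelled as strings because runtime messages are JSON-serialized and a Blob does not survive that hop.

```typescript
// Hypothetical wire format for one buffered chunk (payload pre-encoded as text).
type EncodedChunk = { data: string; timestamp: number; isHeader: boolean };

// One variant per message named in the diagram and patterns.
type OffscreenMessage =
  | { type: 'OFFSCREEN_READY' }
  | { type: 'START_RECORDING'; target: 'offscreen' }
  | { type: 'STOP_RECORDING'; target: 'offscreen' }
  | { type: 'GET_BUFFER' }
  | { type: 'VIDEO_CHUNKS'; chunks: EncodedChunk[] };

const MESSAGE_TYPES: readonly string[] = [
  'OFFSCREEN_READY', 'START_RECORDING', 'STOP_RECORDING', 'GET_BUFFER', 'VIDEO_CHUNKS',
];

// Checks only the discriminant tag; payload fields are not validated here.
function isOffscreenMessage(value: unknown): value is OffscreenMessage {
  if (typeof value !== 'object' || value === null) return false;
  const type = (value as { type?: unknown }).type;
  return typeof type === 'string' && MESSAGE_TYPES.indexOf(type) !== -1;
}

// Exhaustive switch: adding a variant without a case becomes a compile error
// under strict settings, which replaces `as any` style handlers.
function describeMessage(message: OffscreenMessage): string {
  switch (message.type) {
    case 'OFFSCREEN_READY': return 'offscreen listener registered';
    case 'START_RECORDING': return 'start capture';
    case 'STOP_RECORDING': return 'stop capture';
    case 'GET_BUFFER': return 'export requested';
    case 'VIDEO_CHUNKS': return `export payload with ${message.chunks.length} chunks`;
  }
}
```

A guard like this at the top of each onMessage listener lets unknown messages be dropped early instead of falling through untyped branches.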
repremium/
├── manifest.json                      # swap tabCapture→desktopCapture
├── vite.config.ts                     # collapse to ~30 lines
├── src/
│   ├── background/
│   │   └── index.ts                   # shrinks: lifecycle + port + export
│   ├── content/
│   │   └── index.ts                   # untouched in Phase 1
│   ├── popup/
│   │   ├── index.html                 # untouched in Phase 1
│   │   ├── index.ts                   # minor: drop dead REQUEST_PERMISSIONS path
│   │   └── style.css                  # untouched
│   ├── offscreen/                     # NEW directory (replaces top-level offscreen/)
│   │   ├── index.html                 # NEW: <script src="./recorder.ts" type="module">
│   │   └── recorder.ts                # NEW: the real source-of-truth
│   └── shared/
│       ├── logger.ts                  # add OffscreenLogger or reuse with prefix
│       └── types.ts                   # wire up OFFSCREEN_READY; rename VIDEO_CHUNK
└── offscreen/                         # DELETE (entire directory)

Pattern 1: Offscreen + DISPLAY_MEDIA bootstrap

What: SW ensures a single offscreen document exists, then asks it to start recording. Offscreen calls getDisplayMedia(), which triggers the Chrome native picker.

When to use: Once per session (and again if the operator clicked "Stop sharing" and a fresh popup interaction happens).

Example (canonical pattern from production extension):

// Source: github.com/ngocquy020196/Proscreen-S3/blob/main/src/background/recording.ts
// [VERIFIED: in-the-wild MV3 extension using exactly this pattern]
// SW side
async function createOffscreenIfNeeded() {
  const existingContexts = await chrome.runtime.getContexts({
    contextTypes: [chrome.runtime.ContextType.OFFSCREEN_DOCUMENT],
  });
  if (existingContexts.length > 0) return;

  await chrome.offscreen.createDocument({
    url: 'src/offscreen/index.html',          // crxjs-emitted path
    reasons: [chrome.offscreen.Reason.DISPLAY_MEDIA],
    justification: 'Continuous screen recording for operator session diagnostics',
  });
}

async function handleStartRecording() {
  await createOffscreenIfNeeded();
  // Wait briefly for offscreen's onMessage listener — OR use OFFSCREEN_READY handshake
  // (preferred: see Pattern 4).
  chrome.runtime.sendMessage({ type: 'START_RECORDING', target: 'offscreen' });
}
// Source: same repo, src/offscreen/recorder.ts
// Offscreen side
chrome.runtime.onMessage.addListener((msg) => {
  if (msg.target !== 'offscreen') return;
  if (msg.type === 'START_RECORDING') startRecording();
  if (msg.type === 'STOP_RECORDING') stopRecording();
});

async function startRecording() {
  const stream = await navigator.mediaDevices.getDisplayMedia({
    video: true,
    audio: false,        // SPEC §9 — Phase 2/CAP-01 territory
  });
  const recorder = new MediaRecorder(stream, { mimeType: 'video/webm;codecs=vp9' });
  recorder.ondataavailable = (e) => { if (e.data.size > 0) ringBuffer.push(e.data); };
  recorder.start(2000);
  // Track end detection — fires when operator clicks Chrome "Stop sharing"
  stream.getVideoTracks()[0].addEventListener('ended', onUserStoppedSharing);
}

Important: [VERIFIED: tested pattern] Chrome carries the popup's transient user activation across chrome.runtime.sendMessage. The chain "popup click → SW message → SW creates offscreen → SW sends start message → offscreen calls getDisplayMedia" works because it stays inside one transient-activation window (within ~5 s of the click). This is the same mechanism the canonical Google sample relies on via chrome.action.onClicked → SW → offscreen.

Pattern 2: Single continuous recorder + age-trim ring buffer

What: One MediaRecorder started once with timeslice=2000 for the whole session. Pin the first emitted chunk (EBML header + initial cluster). Drop later chunks once they age past 30 s.

When to use: Phase 1 baseline (CONTEXT.md D-09..D-11).

Example:

// Source: structural pattern from src/background/index.ts:21-66 (current code)
// [VERIFIED: pattern works locally; only the location/lifecycle needs fixing]
const RING_WINDOW_MS = 30_000;
type Chunk = { data: Blob; timestamp: number; isHeader: boolean };
const ringBuffer: Chunk[] = [];

recorder.ondataavailable = (event: BlobEvent) => {
  if (event.data.size === 0) return;
  const isHeader = ringBuffer.length === 0;       // first chunk = WebM header
  ringBuffer.push({ data: event.data, timestamp: Date.now(), isHeader });
  trimAged();
};

function trimAged(): void {
  const cutoff = Date.now() - RING_WINDOW_MS;
  // Keep header chunk + every chunk newer than cutoff
  for (let i = ringBuffer.length - 1; i >= 0; i--) {
    const c = ringBuffer[i];
    if (!c.isHeader && c.timestamp < cutoff) ringBuffer.splice(i, 1);
  }
}

Why this works (in theory) AND its risk: [CITED: stackoverflow #62236838] "MediaRecorder API inserts header information into the first chunk (WebM file) only, so rest of the chunks do not play individually without the header information." Concatenating [header] + [retained tail] (the pinned first chunk followed by the chunks that survived the age trim) produces a playable file only IF the retained tail begins on a WebM cluster boundary (each cluster begins with a keyframe). [CITED: bugzilla.mozilla.org #1666487, Andreas Pehrson] "There has been no intention to encode keyframes at the timeslice interval … Google chrome outputs chunks at approx. timeslice interval, even if clusters haven't finished then, so keyframe intervals are much longer there." Chrome sets kf_max_dist=100, so keyframes land roughly every 3-5 s (100 frames at 20-30 fps). With timeslice=2000 ms, roughly every 2nd chunk will start a fresh cluster; the others fall mid-cluster.

Risk: at any given moment, the chunks newer than now - 30000 ms might NOT begin with a cluster boundary. The pinned header chunk + a mid-cluster body chunk = corrupt input that decoders refuse past the first GoP.
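
The 3-5 s keyframe estimate can be reproduced with a few lines of arithmetic. This is a sketch under assumptions: the 20-30 fps range is an assumed capture frame rate, not something the source states, and kf_max_dist is an upper bound, so the encoder may emit keyframes sooner.

```typescript
const KF_MAX_DIST_FRAMES = 100; // Chrome's kf_max_dist, as cited above
const TIMESLICE_MS = 2000;      // MediaRecorder.start(2000)

// Longest possible gap between keyframes at a given frame rate.
function maxKeyframeIntervalMs(framesPerSecond: number): number {
  return (KF_MAX_DIST_FRAMES / framesPerSecond) * 1000;
}

// How many timeslice chunks fit into one maximum keyframe interval.
function chunksPerKeyframeInterval(framesPerSecond: number): number {
  return maxKeyframeIntervalMs(framesPerSecond) / TIMESLICE_MS;
}

// maxKeyframeIntervalMs(20) is 5000 ms, i.e. 2.5 chunks per interval;
// at 30 fps the interval is about 3333 ms, i.e. about 1.7 chunks.
```

Between 1.7 and 2.5 chunks per keyframe interval is what makes "roughly every 2nd chunk" a fair summary, and it is also why a trim point chosen purely by age can fall mid-cluster.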

Verification gate (D-12): ffprobe -v error -f matroska -i last_30sec.webm must exit 0. If it doesn't, escalate to Pattern 3.
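
One way to script the D-12 gate is sketched below. The ffprobe arguments are the ones quoted in D-12; everything else (ProbeResult, ProbeRunner, passesExportGate, checkExportSample) is a hypothetical name, and treating any stderr output as a failure is an assumed reading of "no decoder warnings". The process runner is injected so the decision logic can be exercised without ffprobe installed.

```typescript
// Minimal view of a finished process: exit status plus captured stderr.
type ProbeResult = { status: number | null; stderr: string };

// Hypothetical runner signature; a real one would spawn the command.
type ProbeRunner = (command: string, args: string[]) => ProbeResult;

// Arguments from D-12: ffprobe -v error -f matroska -i <sample>
function buildFfprobeArgs(samplePath: string): string[] {
  return ['-v', 'error', '-f', 'matroska', '-i', samplePath];
}

// Gate passes only on exit 0 AND an empty stderr (assumption: with
// "-v error", anything printed to stderr counts as a decoder complaint).
function passesExportGate(result: ProbeResult): boolean {
  return result.status === 0 && result.stderr.trim() === '';
}

function checkExportSample(samplePath: string, run: ProbeRunner): boolean {
  return passesExportGate(run('ffprobe', buildFfprobeArgs(samplePath)));
}
```

In a Node script the runner could wrap child_process.spawnSync(command, args, { encoding: 'utf8' }), whose result exposes status and stderr in this shape.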

Pattern 3: Restart-segments (D-13 fallback, pre-stage in PLAN.md)

What: Stop + restart the MediaRecorder every 10 s. Each "segment" is a self-contained playable WebM. Keep the 3 most-recent segments and concatenate them on export (using Blob concatenation is enough; each segment has its own header so playback is sequential, not a single seamless track).

When to use: If Pattern 2 + ffprobe fails. CONTEXT.md D-13 declares this the documented fallback so the planner pre-stages the alternative file structure in PLAN.md, avoiding a mid-phase re-plan.

Example:

// [ASSUMED] No external citation; this is the well-known structural fallback
// inferred from spec § (kf_max_dist=100) + the verified behavior that each
// MediaRecorder.start() emits a complete EBML preamble.
const SEGMENT_MS = 10_000;
const MAX_SEGMENTS = 3;
const segments: Blob[] = [];
let currentChunks: Blob[] = [];
let recorder: MediaRecorder | null = null;

// Entry point: call once with the getDisplayMedia() stream. Each start()
// begins a fresh, self-contained WebM segment with its own EBML preamble.
function startSegment(stream: MediaStream): void {
  currentChunks = [];
  recorder = new MediaRecorder(stream, { mimeType: 'video/webm;codecs=vp9' });
  recorder.ondataavailable = (e: BlobEvent) => { if (e.data.size > 0) currentChunks.push(e.data); };
  recorder.onstop = () => onSegmentStopped(stream);
  recorder.start();
  setTimeout(rotateSegment, SEGMENT_MS);
}

function rotateSegment(): void {
  // stop() flushes a final dataavailable event, then fires onstop
  if (recorder?.state === 'recording') recorder.stop();
}

function onSegmentStopped(stream: MediaStream): void {
  segments.push(new Blob(currentChunks, { type: 'video/webm' }));
  if (segments.length > MAX_SEGMENTS) segments.shift();   // keep the 3 most-recent segments
  startSegment(stream);
}

Trade-off vs Pattern 2: ~3× the keyframes (bigger file), but every output WebM is independently valid. ffprobe-clean by construction. The "30 s window" becomes ~30 s ± 10 s depending on phase of rotation; the CON-video-window contract allows this slack (it says "the most recent 30 seconds" not "exactly 30 seconds").
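
The slack in the window can be made concrete with a small calculation. This is one reading of the trade-off, under stated assumptions: coveredWindowMs is a hypothetical helper, and whether the in-progress segment is included on export is a design choice the fallback does not fix.

```typescript
const SEGMENT_LENGTH_MS = 10_000;
const MAX_RETAINED_SEGMENTS = 3;

// Footage covered by an export, given how many segments have completed and
// how far (in ms) the current segment has progressed since the last rotation.
function coveredWindowMs(
  completedSegments: number,
  rotationPhaseMs: number,
  includeInProgress: boolean,
): number {
  const retained = Math.min(completedSegments, MAX_RETAINED_SEGMENTS);
  const inProgress = includeInProgress ? Math.min(rotationPhaseMs, SEGMENT_LENGTH_MS) : 0;
  return retained * SEGMENT_LENGTH_MS + inProgress;
}
```

With three completed segments, including the in-progress segment yields between 30 s and just under 40 s of footage, while excluding it yields exactly 30 s that ends up to 10 s before the export moment. Early in a session, before three segments have completed, the window is shorter either way.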

Pattern 4: OFFSCREEN_READY handshake

What: Offscreen sends OFFSCREEN_READY to SW after its onMessage listener is registered. SW waits for that signal before sending START_RECORDING. Avoids the race the audit flagged at P1 #12.

When to use: Anywhere the SW would otherwise chrome.runtime.sendMessage to the offscreen immediately after chrome.offscreen.createDocument() resolves.

Example:

// SW
let offscreenReadyResolve: (() => void) | null = null;
const offscreenReady = new Promise<void>((res) => { offscreenReadyResolve = res; });

chrome.runtime.onMessage.addListener((msg) => {
  if (msg.type === 'OFFSCREEN_READY') offscreenReadyResolve?.();
});

async function startRecording() {
  await createOffscreenIfNeeded();
  await offscreenReady;
  chrome.runtime.sendMessage({ type: 'START_RECORDING', target: 'offscreen' });
}

// Offscreen (top of recorder.ts, after listener registration)
chrome.runtime.onMessage.addListener((msg) => { /* ... */ });
chrome.runtime.sendMessage({ type: 'OFFSCREEN_READY' });    // tell SW we are listening

The OFFSCREEN_READY Message type is already declared in src/shared/types.ts:18 but unused. Phase 1 wires it up.
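
A queue-until-ready variant of the handshake can be sketched as below. This is a minimal sketch, not the project's implementation: `ReadyGate` and its methods are hypothetical names, and `send` stands in for whatever actually delivers the message (for example a wrapper around chrome.runtime.sendMessage). It assumes the SW stays alive between `post` and `markReady`; the gate's state is lost if the SW unloads, and any queueing delay still counts against the user-activation window.

```typescript
// Hypothetical helper: holds messages until the receiver has reported ready.
class ReadyGate<T> {
  private ready = false;
  private pending: T[] = [];
  private send: (msg: T) => void;

  constructor(send: (msg: T) => void) {
    this.send = send;
  }

  /** Call when OFFSCREEN_READY arrives; flushes anything queued so far, in order. */
  markReady(): void {
    this.ready = true;
    for (const msg of this.pending) this.send(msg);
    this.pending = [];
  }

  /** Call when the offscreen document is closed or recreated. */
  reset(): void {
    this.ready = false;
  }

  /** Sends immediately if the receiver is ready, otherwise queues the message. */
  post(msg: T): void {
    if (this.ready) this.send(msg);
    else this.pending.push(msg);
  }
}
```

In the SW, the OFFSCREEN_READY listener would call `markReady()`, and `startRecording()` would call `post(...)` instead of sending directly.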

Pattern 5: Long-lived port as SW keepalive + buffer-fetch channel

What: Offscreen opens chrome.runtime.connect({ name: 'video-keepalive' }). Each side periodically postMessages to reset the SW's 30 s idle timer. SW also uses the port to one-shot-request the buffer on export.

When to use: Always-on, for the lifetime of the recording session.

Example:

// Offscreen
const port = chrome.runtime.connect({ name: 'video-keepalive' });
setInterval(() => port.postMessage({ type: 'PING' }), 25_000);    // < 30 s idle
port.onMessage.addListener((msg) => {
  if (msg.type === 'REQUEST_BUFFER') port.postMessage({ type: 'BUFFER', chunks: ringBuffer });
});

// SW
let videoPort: chrome.runtime.Port | null = null;
chrome.runtime.onConnect.addListener((p) => {
  if (p.name !== 'video-keepalive') return;
  videoPort = p;
  p.onMessage.addListener((msg) => {
    if (msg.type === 'BUFFER') { /* resolve pending export */ }
  });
  p.onDisconnect.addListener(() => { videoPort = null; });
});

async function exportBuffer(): Promise<Chunk[]> {
  if (!videoPort) await ensureOffscreenAndPort();
  return new Promise((resolve) => {
    const handler = (msg: any) => {
      if (msg.type === 'BUFFER') { videoPort!.onMessage.removeListener(handler); resolve(msg.chunks); }
    };
    videoPort!.onMessage.addListener(handler);
    videoPort!.postMessage({ type: 'REQUEST_BUFFER' });
  });
}

Why this beats chrome.alarms: [VERIFIED: developer.chrome.com/blog/longer-esw-lifetimes] As of Chrome 110, "All events reset the idle timer." Alarm events do reset the timer when they fire, but at the 20 s cadence the current code uses, there's a window after the alarm fires where the SW idle countdown restarts from 0 — if nothing else happens in the next 30 s the SW unloads anyway. Port postMessage traffic across both directions resets the timer continuously. [CITED: developer.chrome.com SW lifecycle] Chrome 114 change: "Sending a message with long-lived messaging keeps the service worker alive." Note: opening a port no longer resets the timers — messages across the port do. Be sure to ping, don't just connect.

Important: 5-minute port lifetime cap. [CITED: gist sunnyguan/f94058f66fab89e59e75b1ac1bf1a06e, multiple corroborating sources] Chrome closes long-lived ports after ~5 minutes regardless of traffic. Production extensions reconnect on onDisconnect to refresh the window. Implementation note: offscreen's port.onDisconnect handler should immediately call chrome.runtime.connect() again to mint a fresh port.
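
The reconnect-on-disconnect note above could look roughly like the following. It is a sketch under assumptions: `keepPortOpen` and `PortLike` are hypothetical names, and `PortLike` is a structural stand-in for the subset of chrome.runtime.Port used here, so the logic can be exercised outside Chrome.

```typescript
// Structural stand-in for the parts of chrome.runtime.Port this sketch needs.
interface PortLike {
  onDisconnect: { addListener(cb: () => void): void };
  postMessage(msg: unknown): void;
}

// Hypothetical helper: opens a port and opens a fresh one whenever it disconnects.
// onPort is called with every new port so the caller can swap its reference
// and re-register message listeners.
function keepPortOpen(connect: () => PortLike, onPort: (port: PortLike) => void): void {
  const port = connect();
  port.onDisconnect.addListener(() => keepPortOpen(connect, onPort));
  onPort(port);
}
```

In the offscreen document, `connect` would be `() => chrome.runtime.connect({ name: 'video-keepalive' })`, and the ping timer should post through the most recent port handed to `onPort` rather than capturing the first one.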

Pattern 6: Codec strict-mode (CONTEXT.md D-20)

What: Test MediaRecorder.isTypeSupported('video/webm;codecs=vp9') before constructing the recorder. If not supported, throw — no fallback to vp8/h264/default.

Example:

const MIME = 'video/webm;codecs=vp9';
if (!MediaRecorder.isTypeSupported(MIME)) {
  const err = `[Offscreen] vp9 unsupported. UA=${navigator.userAgent}`;
  chrome.runtime.sendMessage({ type: 'RECORDING_ERROR', error: err });
  throw new Error(err);
}
const recorder = new MediaRecorder(stream, {
  mimeType: MIME,
  videoBitsPerSecond: 400_000,    // CON-video-codec
});

vp9 has been supported in Chromium-based browsers since well before Chrome 116 (our minimum_chrome_version baseline). The "fail loud" is defensive against weird embeddings; in practice it should never trip.

Anti-Patterns to Avoid

  • Anti-pattern: Wrapping the offscreen recorder source code as a template string in vite.config.ts. This is the audit's P0 #1. The cost: no type-check, no source maps, no IDE, divergence from any reference TS file in the same repo. Solution: real src/offscreen/recorder.ts + src/offscreen/index.html + rollupOptions.input entry.
  • Anti-pattern: let mediaRecorder declared in both module scope and inside startRecording. Audit P0 #1 / vite.config.ts:113 vs 27. Shadowing makes stopRecording operate on a permanently-null reference. Solution: declare it ONCE at module scope. Use a different name (e.g. videoRecorder) to make the shadowing impossible.
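  • Anti-pattern: Sending Blob payloads over chrome.runtime.sendMessage. sendMessage JSON-serializes its payload; Blobs become {}. The current IndexedDB workaround in vite.config.ts:43-104 is a symptom of trying to ship Blobs through the wrong channel. Solution: the buffer never leaves the offscreen until export; on export, the SW pulls Chunks over the port. Caveat: Chrome's message-passing documentation describes port.postMessage payloads as JSON-serializable too, so do not assume Blobs survive the port. Verify it during implementation, and if chunks arrive as {}, encode each one (for example as base64) before posting.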
  • Anti-pattern: mediaRecorder.start(200). 200 ms is far below Chrome's keyframe cadence (kf_max_dist=100 → ~3-5 s on a 30 fps stream). Almost no chunk starts a cluster; concat fails. Solution: start(2000) per CON-video-codec, plus the ffprobe gate (D-12) and the D-13 fallback if it still doesn't decode.
  • Anti-pattern: chrome.alarms as the sole SW keepalive. Works in Chrome 110+ (the timer DOES reset on alarm fire) but is brittle — a single skipped alarm tick gives the SW 30 s of idle. Solution: long-lived port with periodic ping AND let alarms be deleted (CONTEXT.md D-18).
  • Anti-pattern: Trying to read chrome.tabs.onActivated and "re-attach" the recording. Made sense for chrome.tabCapture; with getDisplayMedia the stream is screen/window-scoped, not tab-scoped. Delete the listener wholesale.
  • Anti-pattern: Treating getDisplayMedia() as silent. Chrome's permanent "Sharing your screen" indicator is non-suppressible. The CONTEXT.md author has accepted this; planner should NOT add a task to "hide the indicator" — there is no API.

Don't Hand-Roll

| Problem | Don't Build | Use Instead | Why |
|---|---|---|---|
| WebM seekability fixup | A custom EBML parser to inject SeekHead / Cues into the saved file | If D-13 fails too: ts-ebml v2 (kept in deferred per CONTEXT.md). For Phase 1, the ffprobe gate is "playable" not "seekable": the file plays sequentially from byte 0 even without Cues. | Container fixup is a known well-explored space (ts-ebml, fix-webm-meta); hand-rolled EBML walkers reliably get cluster timestamps wrong. [VERIFIED via web search: legokichi/ts-ebml is the de-facto library.] |
| MV3 SW keepalive | A custom setInterval-based ping that posts to self from inside the SW | chrome.runtime.connect long-lived port from the offscreen (Pattern 5) | self.setTimeout and setInterval inside an MV3 SW are unreliable: the SW unloads and the timers die. The port-from-offscreen pattern survives SW restarts because Chrome auto-respawns the SW when the offscreen's port reconnects. [CITED: gist sunnyguan/f94058f66fab89e59e75b1ac1bf1a06e] |
| Offscreen → SW handshake | A Promise with a setTimeout-based retry loop hoping the offscreen is ready | The explicit OFFSCREEN_READY message (Pattern 4). The Message type is already declared in src/shared/types.ts:18. | Audit P1 #12 lists "Receiving end does not exist" as an intermittent surfacing of the race; explicit handshake eliminates it. |
| Build-time copy of offscreen HTML / JS into dist | The 174-line copy-offscreen Vite plugin (vite.config.ts:11-184) that this.emitFiles both HTML and a stringified JS module | crxjs's manifest-driven entry mechanism + a rollupOptions.input for the offscreen HTML | crxjs handles this exact case; the hand-rolled plugin is a maintenance trap. The canonical pattern is documented in crxjs discussion #1060 (src/offscreen/index.html referenced as chrome.runtime.getURL('src/offscreen/index.html') from SW). |
| Cluster-boundary aligned trimming | Walking the EBML to find cluster ends so we can trim mid-stream | The 30 s arrival-timestamp trim (Pattern 2). Verify via ffprobe gate (D-12). | Cluster-aware trimming would solve the playability problem perfectly but adds an EBML parser dependency we don't need if the simpler trim survives the ffprobe gate. Held in reserve. |

Key insight: every "hand-rolled" custom path in the current codebase maps to an audit P0 or P1 defect. The fix is almost always "delete it and use the standard API directly." Phase 1 is a subtraction phase.

Runtime State Inventory

This is a refactor phase (collapse two implementations into one, delete a vite plugin string, delete an IndexedDB code path) so the inventory matters.

| Category | Items Found | Action Required |
|---|---|---|
| Stored data | IndexedDB VideoRecorderDB/chunks store is created by vite.config.ts:43-60 at recorder start and cleared at every restart. No persisted state survives between runs by design; the store is created fresh on each load. | No data migration needed. After the inline-plugin deletion, the database name VideoRecorderDB becomes orphaned in any browser profile that ran the old extension at least once. Action: add a one-shot indexedDB.deleteDatabase('VideoRecorderDB') in SW onInstalled.addListener to clean up stragglers. Cheap idempotent cleanup. |
| Live service config | None. Mokosh has no external services (no n8n, no Datadog, no Tailscale). The extension is local-only by CON-no-server-upload. | None. |
| OS-registered state | None. The extension is loaded as unpacked in Chrome's chrome://extensions. No OS-level registration (no native-messaging host, no system service). | None. |
| Secrets/env vars | None. No secret keys, no env vars. manifest.json declares only permissions; no environment configuration. | None. |
| Build artifacts | (1) dist/offscreen/index.html and dist/assets/offscreen.js are emitted by the inline plugin today. After deleting the plugin, the next vite build rewrites dist/ entirely under crxjs's control, so old artifacts are replaced rather than orphaned. (2) node_modules/ is currently absent in the repo (ls confirms). npm install is a prerequisite to any verification. | Action: rm -rf dist/ before the first post-refactor vite build, just to be sure. Action: npm install before testing. |

Nothing in CI / no CD pipeline — the project has no CI per audit P2 #22.

Common Pitfalls

Pitfall 1: Concatenated WebM chunks don't decode past the first GoP

What goes wrong: You retain the first chunk (which has the EBML header), drop chunks until they age out at 30 s, and concatenate the remaining chunks into last_30sec.webm. The file plays for ~2 s and then the decoder gives up.

Why it happens: [CITED: bugzilla.mozilla.org/show_bug.cgi?id=1666487 comment from Andreas Pehrson] "There has been no intention to encode keyframes at the timeslice interval." Chrome's VP9 encoder defaults to kf_max_dist=100 (about 3-5 s on a 30 fps stream); chunks emitted at timeslice=2000 ms fall mid-cluster about half the time. A Blob concat of [header_chunk, mid_cluster_chunk, mid_cluster_chunk, ...] produces a byte stream where the decoder hits a SimpleBlock referencing a frame whose keyframe is in a chunk that's no longer there.

How to avoid: (1) Verify with ffprobe -v error at every build of the export path. (2) If ffprobe complains, fall back to D-13 (restart-segments) — each 10 s segment is its own self-contained WebM, concat is trivially safe (each segment has its own header), and acceptance criterion §10 #7 ("plays back in a browser") doesn't require a single continuous track. (3) Last-resort fallback: ts-ebml header injection (deferred).

Warning signs: ffprobe stderr contains "Length indicated by EBML number's first byte exceeds max length" or "Could not find codec parameters." VLC plays the first few frames then stops. Chrome's video tag shows the first frame then a black square.

Pitfall 2: getDisplayMedia rejects with NotAllowedError when there's no transient activation

What goes wrong: SW sends START_RECORDING to the offscreen "too late" (e.g. several seconds after the popup click, with awaits in between). Offscreen calls getDisplayMedia() and gets a NotAllowedError.

Why it happens: [CITED: chromestatus #5090735022407680 + intent-to-remove thread] getDisplayMedia() requires transient user activation, which expires ~5 s after the original gesture. If anything between the click and the offscreen's getDisplayMedia call takes too long (slow offscreen bootstrap, missing OFFSCREEN_READY handshake, network-bound await), the activation window closes.

How to avoid: (1) Implement Pattern 4 (OFFSCREEN_READY handshake) so the SW only sends START_RECORDING after the offscreen's listener is demonstrably ready. (2) Don't put any awaits between the popup click handler and the chrome.runtime.sendMessage call that sends START_RECORDING. (3) Pre-create the offscreen document early (for example from chrome.runtime.onInstalled and chrome.runtime.onStartup, since onInstalled alone fires only on install or update) so the create-document round-trip isn't on the critical path.

Warning signs: First-run works on the developer's machine because the offscreen bootstraps fast; CI / production fails because real-world extension startup is slower.

Pitfall 3: SW unloads mid-export and the popup gets "Receiving end does not exist"

What goes wrong: Operator clicks the popup save button after a long idle period. SW had unloaded; popup's chrome.runtime.sendMessage wakes it, but the SW's videoBuffer array (in the current code) was reset by the unload, so it returns an empty buffer.

Why it happens: The current code stores the buffer in the SW's top-level let videoBuffer = []. SW unload = lose array. CONTEXT.md D-16 fixes this by moving buffer ownership to the offscreen, which survives SW unloads because it holds the DISPLAY_MEDIA capture.

How to avoid: (1) Buffer ownership in offscreen, not SW (D-16). (2) Port keepalive from offscreen → SW (D-17/Pattern 5) — if the SW ever unloads, the offscreen's next port message wakes it. (3) On export, SW asks offscreen for the buffer over the port; this is a one-shot, SW-stateless lookup.

Warning signs: "Receiving end does not exist" in popup console after ~30 s of inactivity. Or: saved archive contains a tiny last_30sec.webm that only holds the very first chunk.

Pitfall 4: Long-lived port is closed by Chrome at ~5 minutes regardless of traffic

What goes wrong: You set up the port-based keepalive and confirm it works for a few minutes. Then at minute 5, the port silently disconnects and the SW unloads on the next idle window.

Why it happens: [CITED: gist sunnyguan & multiple Chromium-extensions threads] Chrome enforces a hard 5-minute lifetime on long-lived ports (an artifact of the SW ExtendableEvent time budget).

How to avoid: In the offscreen, listen to port.onDisconnect and immediately call chrome.runtime.connect() again. Reconnect every ~290 s pre-emptively as a belt-and-braces guard.

Warning signs: Buffer goes empty around minute 5 of a long recording session. Port is reported as disconnected in chrome://extensions service-worker inspect.

Pitfall 5: chrome.runtime.getURL('offscreen/index.html') returns a 404

What goes wrong: SW calls chrome.offscreen.createDocument({url: 'offscreen/index.html', ...}) and gets an ERR_FILE_NOT_FOUND.

Why it happens: crxjs places the bundled offscreen HTML under the src/-relative path you declared in rollupOptions.input. If you set input: { offscreen: 'src/offscreen/index.html' }, the runtime URL is chrome.runtime.getURL('src/offscreen/index.html'), NOT offscreen/index.html. [CITED: crxjs discussion #919 + #1060]

How to avoid: Make the URL the SW passes to createDocument match the path crxjs emits. That path follows the rollup input value (for example src/offscreen/index.html), not the input key (offscreen). Test by inspecting dist/ after npm run build; the HTML should be at exactly the path the SW expects.

Warning signs: SW console shows "Failed to load resource: net::ERR_FILE_NOT_FOUND", "Could not establish connection. Receiving end does not exist."

Pitfall 6: MediaStreamTrack.onended never fires

What goes wrong: Operator clicks Chrome's "Stop sharing" banner; you expect track.onended to fire so you can update state. Nothing happens.

Why it happens: (1) You attached the listener to the wrong track (the stream's audio track instead of the video track). (2) You used .onended = fn AFTER the event had already fired (race with the picker dismiss). (3) You destructured the track and the listener attached to the GC'd local.

How to avoid: Attach with addEventListener('ended', ...) (not .onended =); attach to ALL tracks (stream.getTracks().forEach(t => t.addEventListener('ended', onEnded))) so any track ending triggers cleanup; attach immediately after the getDisplayMedia() await resolves.

Warning signs: Operator stops sharing, the UI keeps saying "recording" in console logs, ffprobe-checking the next export shows the last 30 s of content from BEFORE the user stopped.

Code Examples

Verified patterns from official sources and a production extension.

Example A — Minimal offscreen HTML (NEW: src/offscreen/index.html)

<!-- Source: pattern from crxjs discussion #919 + Proscreen-S3 -->
<!doctype html>
<html>
<head><meta charset="UTF-8"><title>Mokosh Recorder</title></head>
<body>
  <script type="module" src="./recorder.ts"></script>
</body>
</html>

Example B — Minimal vite.config.ts (REPLACES the 184-line current one)

// Source: crxjs documentation + discussion #919
import { defineConfig } from 'vite';
import { crx } from '@crxjs/vite-plugin';
import manifest from './manifest.json';

export default defineConfig({
  plugins: [
    crx({ manifest, contentScripts: { injectCss: false } }),
  ],
  build: {
    rollupOptions: {
      input: {
        offscreen: 'src/offscreen/index.html',
      },
    },
  },
});

Example C — SW: ensure-offscreen pattern (snippet for src/background/index.ts)

// Source: github.com/GoogleChrome/chrome-extensions-samples/tree/main/functional-samples/sample.tabcapture-recorder/service-worker.js
// [VERIFIED: canonical Google sample, license Apache-2.0]
async function ensureOffscreenDocument(): Promise<void> {
  const existing = await chrome.runtime.getContexts({
    contextTypes: [chrome.runtime.ContextType.OFFSCREEN_DOCUMENT],
  });
  if (existing.length > 0) return;
  await chrome.offscreen.createDocument({
    url: 'src/offscreen/index.html',
    reasons: [chrome.offscreen.Reason.DISPLAY_MEDIA],
    justification: 'Continuous screen recording for operator session diagnostics',
  });
}

Example D — ffprobe verification (used in the acceptance gate D-12)

# Source: ffmpeg.org/ffprobe.html, exit code semantics:
# 0 = recognized media; >0 = could not open / not multimedia / decode error
# Force-format -f matroska because WebM is a Matroska subset and helps
# ffprobe choose the right demuxer when the file is "live" (no SeekHead).
ffprobe -v error -f matroska -i last_30sec.webm
echo "ffprobe exit: $?"

# Optional: dump cluster timeline for diagnosis if exit != 0
ffprobe -v error -show_packets -i last_30sec.webm 2>&1 | head -50

Example E — Codec capability strict-mode (CONTEXT.md D-20)

// Source: MDN MediaRecorder.isTypeSupported + CONTEXT.md D-20
const VIDEO_MIME = 'video/webm;codecs=vp9';
const VIDEO_BITRATE = 400_000;        // CON-video-codec
const TIMESLICE_MS = 2000;            // CON-video-codec / SPEC §4.1

if (!MediaRecorder.isTypeSupported(VIDEO_MIME)) {
  const ua = navigator.userAgent;
  chrome.runtime.sendMessage({
    type: 'RECORDING_ERROR',
    error: `vp9 unsupported. UA=${ua}`,
  });
  throw new Error(`MediaRecorder mime not supported: ${VIDEO_MIME}; UA=${ua}`);
}

const videoRecorder = new MediaRecorder(stream, {
  mimeType: VIDEO_MIME,
  videoBitsPerSecond: VIDEO_BITRATE,
});
videoRecorder.start(TIMESLICE_MS);

Example F — MediaStreamTrack.onended for "Stop sharing"

// Source: MDN MediaStreamTrack#ended_event
stream.getTracks().forEach((track) => {
  track.addEventListener('ended', () => {
    // Clear the buffer (the captured source is gone)
    ringBuffer.length = 0;
    // Disconnect the port so SW can clean up
    port?.disconnect();
    // Notify SW for state transition; popup state change is Phase 3 territory
    chrome.runtime.sendMessage({ type: 'RECORDING_ERROR', error: 'user-stopped-sharing' });
    // Stop the recorder explicitly
    if (videoRecorder.state !== 'inactive') videoRecorder.stop();
  }, { once: true });
});

State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|---|---|---|---|
| Background page (persistent) in MV2 | MV3 service worker | Chrome 88 → MV3 default; MV2 sunset 2024 | All capture APIs must be reachable from SW or offscreen, NOT a persistent page. Drives the SW + offscreen split. |
| chrome.desktopCapture.chooseDesktopMedia returning a streamId redeemable in any context | streamId from chrome.desktopCapture not usable in offscreen MV3 | Chrome 109+ offscreen API rollout | Forces the choice between (a) tabCapture + USER_MEDIA pattern (canonical Google sample) or (b) getDisplayMedia + DISPLAY_MEDIA pattern (CONTEXT.md D-01..D-05). [CITED: groups.google.com chromium-extensions/3RanHldyp9c] |
| chrome.alarms as the universal SW keepalive | Long-lived port postMessage traffic | Chrome 110+ "all events reset idle timer" + Chrome 114 "Sending a message with long-lived messaging keeps the service worker alive" + Chrome 116 WebSockets | Alarms still work in Chrome 110+ but are no longer the recommended primary keepalive for offscreen-paired extensions. [CITED: developer.chrome.com/blog/longer-esw-lifetimes] |
| rrweb.record({maskInputSelector: ...}) | rrweb.record({maskInputFn: ...}) | rrweb 2.0.0-alpha | Not Phase 1 territory (Phase 2 owns it), but flagged because the audit lists it as a P0. The current code uses maskTextSelector which is yet a third thing and is wrong (audit P0 #6). |
| Tab capture as active-tab-bound, requiring re-attach on chrome.tabs.onActivated | Display capture as screen/window-bound, NO re-attach (CONTEXT.md D-14/D-15) | This phase (DEC-003 AMENDED) | Deletes chrome.tabs.onActivated and chrome.tabs.onUpdated listener requirements from REQ-video-ring-buffer. |

Deprecated/outdated:

  • chrome.tabCapture.capture() (the legacy callback form) — replaced by chrome.tabCapture.getMediaStreamId + offscreen getUserMedia redemption. We're abandoning this whole path per CONTEXT.md D-01.
  • mandatory: { chromeMediaSource: 'tab' } constraint syntax — Chrome-specific extension to getUserMedia. Phase 1 doesn't use it (we use the standard getDisplayMedia).

Assumptions Log

| # | Claim | Section | Risk if Wrong |
|---|---|---|---|
| A1 | Restart-segments fallback structural sketch (Pattern 3) | Architecture Patterns / Pattern 3 | Low. The pattern is an inferred application of standard MediaRecorder semantics; if it fails, we have the third-tier ts-ebml deferred fallback. The risk is implementation-time, not phase-blocking. |
| A2 | Chrome enforces ~5 minute lifetime on long-lived ports (Pattern 5 / Pitfall 4) | Pitfall 4 | MEDIUM. Multiple community sources corroborate, but no canonical Chrome doc states the exact limit. If the limit is shorter, our reconnect should still recover. If longer, our 290 s reconnect is just defensive overhead. |
| A3 | MediaRecorder.start(2000) produces chunks that align with cluster boundaries about half the time (consequence of Chrome's kf_max_dist=100 and 30 fps default) | Pitfall 1 / Pattern 2 | HIGH. This is the load-bearing claim that makes Pattern 2 work at all. The ffprobe gate (D-12) is exactly the mitigation; if ffprobe rejects, we escalate to Pattern 3 by design. So the assumption is already mitigated by the plan's fallback structure. |
| A4 | Chrome propagates transient user activation through chrome.runtime.sendMessage for the SW → offscreen → getDisplayMedia chain | Pattern 1 + Pitfall 2 | LOW. Verified against a real production extension (Proscreen-S3) doing exactly this. Mitigation: OFFSCREEN_READY handshake (Pattern 4) tightens the timing window so we never exceed the ~5 s activation budget. |
| A5 | The 30-second window's "30" is an upper bound, not an exact target (CON-video-window allows ±10 s slack for the restart-segments fallback) | Pattern 3 | LOW. REQUIREMENTS.md says "the most recent 30 seconds" and "no more than 30 seconds", which our restart-segments stays inside (3×10 s = 30 s exactly at one phase of rotation, dropping to 20 s right after rotation). User confirmation desirable but the contract permits it. |
| A6 | getDisplayMedia() does NOT need desktopCapture permission in the manifest (it's a web standard API; desktopCapture only gates chrome.desktopCapture.chooseDesktopMedia) | Architectural Responsibility Map (Manifest row) + Standard Stack | LOW. Multiple sources confirm. CONTEXT.md D-05 chooses to declare desktopCapture anyway, which is harmless. If we DROPPED desktopCapture from the manifest, the only ill effect would be losing the option to call chrome.desktopCapture.chooseDesktopMedia (which we don't use). |
| A7 | The chrome.runtime.getContexts API is available in Chrome ≥ 116 and is the recommended way to test for an existing offscreen document (replaces chrome.offscreen.hasDocument) | Pattern 1 / Example C | MEDIUM. chrome.offscreen.hasDocument is the older, simpler check and still works. The canonical Google sample uses getContexts. Either works; planner can pick. |

If this table contains items: The planner should treat them as candidates for user verification during /gsd-plan-phase review.

Open Questions

  1. Will MediaRecorder.start(2000) produce ffprobe-clean WebM on a typical screen-cap?

    • What we know: Cluster boundaries align with keyframes; Chrome keyframes appear every ~3-5 s by default (vp9 kf_max_dist=100 on a 30 fps stream); timeslice does NOT force keyframes.
    • What's unclear: How often in practice does a 2 s timeslice happen to land at a cluster boundary for a desktop screen-cap (which has lots of static frames and may have different keyframe cadence than a webcam)?
    • Recommendation: Build Pattern 2 first; run the D-12 ffprobe gate; keep Pattern 3 (restart-segments) pre-staged in PLAN.md per CONTEXT.md D-13 so we don't re-plan if Pattern 2 fails. Plan-checker can ratchet this in the success criteria.
  2. Does the 5-minute port lifetime kill the recording session?

    • What we know: Multiple corroborating community sources cite a ~5 minute hard cap on long-lived ports.
    • What's unclear: Whether the cap applies to port lifetime (the port object dies and must be reconnected) OR to SW lifetime extension (after 5 minutes of port keepalive, the SW is killed anyway and the port goes with it).
    • Recommendation: Pessimistic — assume the worst, reconnect every ~290 s. Cheap defensive code. If we learn the cap is different, the reconnect is still harmless.
  3. What's the exact crxjs path-emit behavior for the offscreen entry?

    • What we know: The discussion #919 working answer uses input: { offscreen: 'src/offscreen/offscreen.html' } and SW fetches chrome.runtime.getURL('src/offscreen/offscreen.html').
    • What's unclear: Some crxjs versions strip the leading src/; the 2.0.0-beta vs 2.4.0 difference might matter.
    • Recommendation: After the first npm run build, inspect dist/ to confirm the actual emitted path, then encode that path as a constant in SW. This is a verifiable runtime check, not a design decision.

Environment Availability

| Dependency | Required By | Available | Version | Fallback |
|---|---|---|---|---|
| Node.js | Vite, TypeScript, npm | yes | v24.14.0 | |
| npm | Dep install | yes | 11.9.0 | |
| ffprobe (FFmpeg) | D-12 acceptance gate; ffprobe-based verification of every export sample | yes | 8.1.1 | None needed (ffprobe is the gate) |
| Chrome / Chromium | Manual smoke test (unpacked load → Сохранить отчёт ("Save report") → inspect dist) | no | | Plan must call out "manual test requires Chrome ≥ 116; install via apt install google-chrome-stable or note the gap to the operator." |
| Playwright / chromium-test-runner | Optional headed-Chrome integration tests (see Validation Architecture) | no | | Phase 1 acceptance does NOT require Playwright. Manual smoke is acceptable per ROADMAP Phase 4. If we want unit-test coverage for the trim logic, Vitest in node mode is enough. |
| node_modules/ | vite build, tsc | no | | Run npm install at start of phase; no fallback. |

Missing dependencies with no fallback (blocking execution):

  • node_modules/ — must run npm install once before any TS/Vite work. Add as Wave 0 task.

Missing dependencies with fallback (acceptable):

  • Chrome browser: manual smoke is Phase 4's job; for Phase 1, type-check + ffprobe-on-test-fixture is the deepest automated gate. If the developer doesn't have Chrome installed, the plan still completes; the Phase 4 ROADMAP item is where Chrome becomes mandatory.
  • Playwright — not needed; see Validation Architecture below for why.

Validation Architecture

Nyquist validation is enabled (workflow.tdd_mode: true in .planning/config.json). The validation strategy is layered:

Test Framework

| Property | Value |
|---|---|
| Framework | Vitest (Node mode for pure logic; Browser mode if needed for MediaRecorder mocks). Recommended, NOT currently installed. Vite is already a dev dep so Vitest is a zero-friction add. |
| Config file | NONE. Wave 0 creates vitest.config.ts. |
| Quick run command | `npx vitest run --reporter=dot` (after install) |
| Full suite command | `npx vitest run` + `npm run build` (typecheck via `tsc --noEmit`) + ffprobe gate (D-12) |

Why not Jest: vite is already the build tool; Vitest is the zero-config-mismatch choice. No transformer dance for TS.

Why not Playwright: MediaRecorder + getDisplayMedia ARE driveable in Chromium via Playwright with permissions auto-granted, but the acceptance gate (ffprobe on a real exported file) requires actually running the extension. Manual smoke + ffprobe is sufficient for Phase 1. Playwright-driven smoke tests are Phase 4/5 territory.

What's testable in Node-only Vitest:

  • Ring buffer logic (addChunk, trimAged) — pure function, takes {data: {size: number}, timestamp: number, isHeader: boolean}[] and returns the trimmed array. Mock Blob as {size: N, type: 'video/webm'}.
  • Message handlers (mock chrome.runtime with vitest-chrome or a lightweight stub).
  • Port lifecycle / reconnect logic.
  • Codec strict-mode error path (mock MediaRecorder.isTypeSupported → false).
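
The ring-buffer item above might be sketched as the pure functions below. This is illustrative only: the `Chunk` shape follows the description in the list, while `WINDOW_MS` and the exact signatures of `addChunk` and `trimAged` are assumptions, not the project's code. `data` is typed as `{ size: number }` so a plain object can stand in for a Blob under Node.

```typescript
// Shape taken from the test plan above; data is Blob-like (only size is needed here).
interface Chunk {
  data: { size: number };
  timestamp: number;   // arrival time in ms
  isHeader: boolean;   // true only for the first chunk, which carries the EBML header
}

const WINDOW_MS = 30_000; // assumed name for the 30 s window

// Drops chunks older than the window, always keeping the header chunk.
function trimAged(chunks: Chunk[], now: number, windowMs: number = WINDOW_MS): Chunk[] {
  return chunks.filter((c) => c.isHeader || now - c.timestamp <= windowMs);
}

// Appends a chunk (marking the very first one as the header) and trims the result.
function addChunk(chunks: Chunk[], data: { size: number }, now: number): Chunk[] {
  const next = [...chunks, { data, timestamp: now, isHeader: chunks.length === 0 }];
  return trimAged(next, now);
}
```

Passing `now` in explicitly, rather than reading Date.now() inside, keeps both functions deterministic, so the Vitest cases need no fake timers.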

What's NOT testable in Vitest, requires manual smoke / Phase 4:

  • The actual getDisplayMedia flow (browser picker).
  • Real WebM playability (covered by ffprobe gate on a test-fixture file).
  • SW idle-unload survival (covered by manual DevTools "Force stop" test in Phase 4 smoke checklist).

Phase Requirements → Test Map

| Req ID | Behavior | Test Type | Automated Command | File Exists? |
| --- | --- | --- | --- | --- |
| REQ-video-ring-buffer | Ring buffer adds chunk; first chunk gets isHeader: true | unit | npx vitest run tests/offscreen/ring-buffer.test.ts -t "first chunk is header" | Wave 0 |
| REQ-video-ring-buffer | Ring buffer evicts chunks older than 30 s; keeps header | unit | npx vitest run tests/offscreen/ring-buffer.test.ts -t "trim 30s" | Wave 0 |
| REQ-video-ring-buffer | Codec strict-mode throws when vp9 unsupported (D-20) | unit | npx vitest run tests/offscreen/codec-check.test.ts | Wave 0 |
| REQ-video-ring-buffer | OFFSCREEN_READY message sent on listener registration | unit | npx vitest run tests/offscreen/handshake.test.ts | Wave 0 |
| REQ-video-ring-buffer | Port reconnect on disconnect within 1 s | unit | npx vitest run tests/offscreen/port.test.ts -t "reconnects" | Wave 0 |
| REQ-video-ring-buffer | Alarms keepalive removed from SW (D-18) | type-check / grep | ! grep -RIn "chrome.alarms" src/background/ | NO CODE NEEDED (CI grep) |
| REQ-video-ring-buffer | IndexedDB code path removed (D-19) | grep | ! grep -RIn -e "VideoRecorderDB" -e "openIndexedDB" src/ | NO CODE NEEDED (CI grep) |
| REQ-video-ring-buffer | vite.config.ts:11-184 inline plugin deleted (D-08) | grep | ! grep -RIn -e "copy-offscreen" -e "chromeMediaSource" vite.config.ts | NO CODE NEEDED |
| REQ-video-ring-buffer (acceptance gate D-12) | last_30sec.webm plays ffprobe-clean | integration (manual smoke + ffprobe) | ffprobe -v error -f matroska -i sample/last_30sec.webm; echo $? | Sample fixture produced manually for this gate, OR captured by Playwright in Phase 4. For Phase 1, run on the file the manual smoke produces. |
| REQ-video-ring-buffer | Type-check passes with zero as any and zero @ts-ignore regressions | static | npx tsc --noEmit && ! grep -RIn -e "as any" -e "@ts-ignore" src/ | EXISTS (tsc --noEmit in npm run build) |
| REQ-video-ring-buffer | Manifest permission swap (D-A6 / D-05) | grep | ! grep "tabCapture" manifest.json && grep "desktopCapture" manifest.json | NO CODE NEEDED |
| REQ-video-ring-buffer | Build produces a loadable extension | manual | npm run build && ls dist/manifest.json dist/src/offscreen/index.html dist/assets/*.js | NO TEST FILE; CI shell check |
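
The two ring-buffer rows above (header flag on the first chunk, 30 s age trim that keeps the header) can be sketched as follows. This is a minimal illustration, not the project's actual module: the names AgeTrimRingBuffer, BufferedChunk, add and snapshot are hypothetical, and the clock is passed in explicitly so a unit test can drive it without timers. It covers only the unit-tested behavior; whether the trimmed output is playable is still decided by the ffprobe gate (D-12).

```typescript
// Hypothetical shapes for illustration; the real types live in the offscreen module.
interface BufferedChunk<T> {
  data: T;
  timestampMs: number;
  isHeader: boolean;
}

class AgeTrimRingBuffer<T> {
  private header: BufferedChunk<T> | null = null;
  private tail: BufferedChunk<T>[] = [];
  private readonly maxAgeMs: number;

  constructor(maxAgeMs: number = 30000) {
    this.maxAgeMs = maxAgeMs;
  }

  // Store a chunk. The first chunk ever added is pinned as the header
  // and is never evicted by the age trim.
  add(data: T, nowMs: number): void {
    if (this.header === null) {
      this.header = { data, timestampMs: nowMs, isHeader: true };
    } else {
      this.tail.push({ data, timestampMs: nowMs, isHeader: false });
    }
    this.trim(nowMs);
  }

  // Drop non-header chunks older than maxAgeMs relative to nowMs.
  private trim(nowMs: number): void {
    const cutoff = nowMs - this.maxAgeMs;
    while (true) {
      const oldest = this.tail[0];
      if (oldest === undefined || oldest.timestampMs >= cutoff) break;
      this.tail.shift();
    }
  }

  // Header first, then the surviving chunks in arrival order.
  snapshot(): BufferedChunk<T>[] {
    return this.header === null ? [] : [this.header, ...this.tail];
  }
}
```

Passing the timestamp into add keeps the "trim 30s" test deterministic: the test can add a header at 0 ms, a chunk at 2 000 ms and another at 40 000 ms, then assert that only the header and the last chunk remain.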

Sampling Rate

  • Per task commit: npx vitest run --reporter=dot && npx tsc --noEmit (≤ 10 s).
  • Per wave merge: Full Vitest + npm run build + grep guards (≤ 30 s).
  • Phase gate (D-12): Manually load dist/ into Chrome, capture a test session, click save, extract video/last_30sec.webm from ~/Downloads/session_report_*.zip (for example with unzip -p redirected to a file), run ffprobe -v error -f matroska -i on the extracted file, and confirm exit 0 with zero stderr lines.

Wave 0 Gaps

  • Install Vitest: npm install -D vitest@^3 @vitest/ui (verify current major via npm view vitest version at the time of install).
  • vitest.config.ts — pull in path aliases from tsconfig.json.
  • tests/offscreen/ directory with at minimum:
    • ring-buffer.test.ts — covers REQ-video-ring-buffer trim & header pinning.
    • codec-check.test.ts — covers D-20 strict-mode error path.
    • handshake.test.ts — covers Pattern 4 OFFSCREEN_READY.
    • port.test.ts — covers Pattern 5 reconnect.
  • tests/fixtures/ — keep a known-good WebM for ffprobe sanity (e.g. produced once on a developer machine and committed). Used by CI to verify the ffprobe gate runs at all.
  • npm test script in package.json: "test": "vitest run".
  • CI? — out of scope per audit P2 #22 (Phase 5).
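
As one illustration of what codec-check.test.ts would exercise, here is a hedged sketch of a strict-mode codec check. The function name assertCodecSupported, the error class, and the exact MIME string are assumptions for illustration; D-20 only states that an unsupported vp9 must be an error. The support probe is injected so the check runs under Vitest in Node, where MediaRecorder does not exist.

```typescript
// Assumption: the exact MIME string is illustrative, not taken from the codebase.
const REQUIRED_MIME = "video/webm;codecs=vp9";

// A probe answers "can this runtime record the given MIME type?".
type SupportProbe = (mimeType: string) => boolean;

class CodecUnsupportedError extends Error {
  constructor(mimeType: string) {
    super("Required codec not supported: " + mimeType);
    this.name = "CodecUnsupportedError";
  }
}

// Strict mode: no silent fallback to another codec. Either the required
// MIME type is supported and returned, or the call throws.
function assertCodecSupported(
  probe: SupportProbe,
  mimeType: string = REQUIRED_MIME,
): string {
  if (!probe(mimeType)) {
    throw new CodecUnsupportedError(mimeType);
  }
  return mimeType;
}
```

In the offscreen document the probe would presumably wrap the browser's static check, for example (m) => MediaRecorder.isTypeSupported(m); in the unit test it is a stub returning true or false.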

Security Domain

Default per .planning/config.json: security_enforcement is absent → treated as enabled (per researcher contract).

Applicable ASVS Categories

| ASVS Category | Applies | Standard Control |
| --- | --- | --- |
| V2 Authentication | No | No authentication surface in Phase 1 (local-only, no server). |
| V3 Session Management | No | No sessions. |
| V4 Access Control | Yes (limited) | Manifest permissions are the access-control boundary. Minimize: desktopCapture is unnecessary if we use only getDisplayMedia (web API), but harmless. tabCapture is being REMOVED. host_permissions: ["<all_urls>"] remains for content-script injection (Phase 2 territory). |
| V5 Input Validation | Yes (limited) | The only input Phase 1 handles is inter-context messages; streamId validation is not applicable because the new path does not use streamIds. Each chrome.runtime.onMessage handler should validate msg.type against the typed MessageType enum (already exists in src/shared/types.ts). |
| V6 Cryptography | No | No crypto. |
| V14 Configuration | Yes | manifest.json enumerates the permission set verbatim. The Doc-Cascade tasks (D-A1..D-A6) keep .planning/intel/constraints.md in lockstep with manifest.json. |
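
The V5 control (validate msg.type before acting on a message) can be sketched as a type guard. The list of message kinds below is hypothetical apart from OFFSCREEN_READY; the real source of truth is the MessageType enum in src/shared/types.ts, which this sketch replaces with a const array only to stay self-contained.

```typescript
// Hypothetical message kinds for illustration; only OFFSCREEN_READY is
// named in this research. The real list is the MessageType enum.
const MESSAGE_TYPES = ["OFFSCREEN_READY", "START_CAPTURE", "SAVE_REPORT"] as const;
type MessageType = (typeof MESSAGE_TYPES)[number];

interface TypedMessage {
  type: MessageType;
}

// Returns true only for non-null objects whose `type` is a known kind.
// Anything else (strings, null, unknown types) is rejected up front.
function isTypedMessage(msg: unknown): msg is TypedMessage {
  if (typeof msg !== "object" || msg === null) return false;
  const type = (msg as { type?: unknown }).type;
  return (
    typeof type === "string" &&
    (MESSAGE_TYPES as readonly string[]).indexOf(type) !== -1
  );
}
```

A handler would call isTypedMessage(msg) first and return early on false, so the rest of the handler can switch on msg.type with full type narrowing.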

Known Threat Patterns for Chrome MV3 extensions

| Pattern | STRIDE | Standard Mitigation |
| --- | --- | --- |
| Untrusted message origin (cross-extension message injection) | Spoofing | Every chrome.runtime.onMessage listener should check sender.id === chrome.runtime.id. The current code doesn't; Phase 1 should add it where it adds new listeners (low effort). |
| <all_urls> host permission exposes the SW to messages from any content script on any site | Tampering | Already in design (REQ-manifest-permissions). The mitigation is that the SW only processes messages from its own content script (validated by sender.id check). |
| Stored video buffer contains sensitive operator session data | Information Disclosure | CON-buffer-storage: in-memory only, no persistence. CONTEXT.md D-19 reinforces (no IndexedDB, no chrome.storage.session). |
| Captured video may show passwords typed into other apps (since getDisplayMedia can grab the whole screen) | Information Disclosure | OUT OF SCOPE per Phase 1: this is exactly the trade-off accepted in CONTEXT.md D-04. The Chrome "Sharing" banner is the user-facing mitigation. Phase 2's password masking applies to rrweb / event-log, not to video pixels. |
| eval or string-injected code | Tampering | The vite.config.ts:35-213 inline-string offscreen JS is effectively static (no user input), but it IS string-injected build output. CSP for MV3 extensions disallows eval, but a long template literal is allowed. Phase 1 DELETES this, which is also a security improvement. |

Phase 1 has no novel security surface beyond the manifest swap (D-A6) and the sender-id check best-practice.
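
The sender-id best practice can be sketched as a small wrapper around a listener. The names guardBySender and SenderLike are hypothetical; SenderLike mirrors only the optional id field of the extension API's message sender so the sketch runs outside a browser.

```typescript
// Minimal stand-in for the sender object passed to message listeners.
// Assumption: only the optional `id` field matters for this check.
interface SenderLike {
  id?: string;
}

type Handler<M> = (msg: M, sender: SenderLike) => void;

// Wrap a handler so it only runs for messages whose sender id equals
// the extension's own id. Messages with a different or missing id are
// dropped without invoking the handler.
function guardBySender<M>(ownId: string, handler: Handler<M>): Handler<M> {
  return (msg: M, sender: SenderLike): void => {
    if (sender.id !== ownId) return;
    handler(msg, sender);
  };
}
```

In the extension this would presumably be wired as chrome.runtime.onMessage.addListener(guardBySender(chrome.runtime.id, handler)). Since messages from other extensions arrive through chrome.runtime.onMessageExternal rather than onMessage, the check acts as defense in depth on the internal channel.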

Sources

Primary (HIGH confidence)

Secondary (MEDIUM confidence — verified with multiple sources)

Tertiary (LOW confidence — flagged for cross-validation)

  • chromium-extensions group thread — getDisplayMedia in offscreen: https://groups.google.com/a/chromium.org/g/chromium-extensions/c/V09VMCLzvWM — one thread suggests user-gesture issues in offscreen; this appears contradicted by Proscreen-S3 working. Resolution: empirical testing during Wave 1 (manual smoke).
  • recall.ai blog post on how to build a Chrome recording extension: https://www.recall.ai/blog/how-to-build-a-chrome-recording-extension — uses tabCapture pattern (not our path), but confirms the high-level three-component split.
  • Stack Overflow #62236838 — concatenation of MediaRecorder WebM chunks: cited content via WebSearch results only (no direct fetch — site blocked); pattern matches what I confirmed via Graham King's blog and ts-ebml docs.

Metadata

Confidence breakdown:

  • Standard stack & versions: HIGH — all verified via npm view.
  • Architecture (offscreen + DISPLAY_MEDIA + port keepalive): HIGH — verified against (a) official Chrome docs, (b) Google sample (offscreen + USER_MEDIA — same architectural shape), (c) at least two in-the-wild production extensions doing the exact DISPLAY_MEDIA path.
  • Ring-buffer pattern: MEDIUM-HIGH — the structural pattern is solid; the open question is cluster-boundary alignment of start(2000), which is the assumption the ffprobe gate (D-12) and the D-13 fallback are designed to handle.
  • Common pitfalls: HIGH — every pitfall ties to a specific audit defect or a citable Chrome doc / Chromium bug.
  • Validation strategy: MEDIUM — the unit-testable surface is real and documented; the integration test gap (browser/picker) is genuine but accepted (Phase 4 territory).
  • Security: HIGH for what's in scope; nothing exotic.

Research date: 2026-05-15

Valid until: 2026-06-15 (30 days, stable-ecosystem assumption). Re-validate sooner if Chrome releases a 12X version that changes SW lifecycle rules or the offscreen API stability promise. The most volatile finding is A2 (5-minute port lifetime cap) — Chrome team has been actively tuning this.


Phase: 01-stabilize-video-pipeline

Research completed: 2026-05-15