Phase 1: Stabilize Video Pipeline — Research
Researched: 2026-05-15
Domain: Chrome MV3 extension, offscreen documents, getDisplayMedia,
MediaRecorder ring buffer, WebM container, SW lifecycle, Vite + crxjs.
Confidence: HIGH on Chrome API contracts; HIGH on canonical patterns
(verified against an in-the-wild production extension); MEDIUM on
MediaRecorder cluster-boundary alignment with timeslice=2000ms (the
spec is silent and Chromium docs are silent — published evidence is
indirect; we have a mitigation in place via the D-13 fallback).
<user_constraints>
User Constraints (from CONTEXT.md)
Locked Decisions
Capture API — AMENDS DEC-003
This phase REPLACES the SPEC-locked chrome.tabCapture choice with
getDisplayMedia() capture. Done eyes-open: the operator gains broader
capture coverage at the cost of the SPEC §1 "silent operation" property.
The doc cascade is enumerated in the Doc Amendments (precede code)
subsection below.
- D-01: Capture mechanism is navigator.mediaDevices.getDisplayMedia() invoked inside the offscreen document. No more chrome.tabCapture.getMediaStreamId, no more SW-side gesture juggling.
- D-02: Offscreen document is created with chrome.offscreen.Reason.DISPLAY_MEDIA (replaces USER_MEDIA).
- D-03: One-time source picker on session start; the operator picks "screen" or "window" once. If they later click the Chrome "Stop sharing" banner or the captured source disappears, the offscreen surfaces an error to the SW and the popup re-prompts on next interaction. (Exact error-UX copy is deferred to Phase 3; see Deferred Ideas.)
- D-04: Operator UX is NOT silent. Chrome's permanent "Sharing your screen" indicator is shown while recording. We accept this as the cost of the API choice.
- D-05: manifest.json permissions follow the new API: desktopCapture replaces tabCapture; activeTab becomes unnecessary for the video pipeline but stays for chrome.tabs.captureVisibleTab (screenshot path, Phase 3 concern; kept).
Offscreen source-of-truth location
- D-06: Recorder code lives at src/offscreen/recorder.ts as a real TypeScript module with strict type-check, source maps, and IDE support.
- D-07: offscreen/index.html is rewritten to load the bundled module via crxjs. The runtime path remains offscreen/index.html (referenced from SW via chrome.runtime.getURL('offscreen/index.html')).
- D-08: DELETE offscreen/index.ts (orphaned dead code) and the entire copy-offscreen plugin block in vite.config.ts:11-184. crxjs picks up the new TS entry through the HTML reference.
Ring-buffer mechanism
- D-09: Single continuous MediaRecorder for the whole session. mediaRecorder.start(2000) so chunks land on cluster boundaries per the spec timeslice (DEC-003, SPEC §4.1). No restart strategy at this point.
- D-10: Retain the first emitted chunk (the chunk produced by the first dataavailable event after start()) indefinitely; it carries the EBML header plus the initial cluster. CON-webm-header-retention.
- D-11: Drop later chunks once they are older than 30 s, by chunk arrival timestamp. Keep header + every chunk newer than now - 30000 ms.
- D-12: Acceptance gate for Phase 1: ffprobe -v error -f matroska -i <last_30sec.webm> must return exit 0 with no decoder warnings on a fresh-export sample. Plan-checker enforces this as a phase success criterion.
- D-13: Fallback if D-12 fails: revise the plan mid-phase to use restart-segments (stop + restart the MediaRecorder every 10 s, keep the 3 most-recent self-contained segments, concat on save). Documented as a known fallback so the planner can pre-stage the alternative structure in PLAN.md.
Tab-switch behavior
- D-14: Not applicable under the new capture API. getDisplayMedia() captures a screen or window, not a tab, so there is nothing to re-attach on chrome.tabs.onActivated. Phase 1 explicitly removes any tab-switch handling from src/background/index.ts.
- D-15: Operator switching tabs no longer interrupts the recording; the buffer keeps filling regardless of active tab.
State survival across SW unload
- D-16: Video buffer ownership moves to the offscreen document. The offscreen survives SW unloads because it holds the DISPLAY_MEDIA-reason capture; chunks accumulate there.
- D-17: A long-lived chrome.runtime.connect port from offscreen → SW serves as the keepalive (this is the only mechanism that actually resets the SW idle timer; chrome.alarms callbacks do not, contrary to DEC-010).
- D-18: DELETE the chrome.alarms keepalive (src/background/index.ts:171-178). DEC-010 and CON-service-worker-keepalive are amended in the doc cascade below.
- D-19: On export, SW requests the buffer from offscreen over the port (or one-shot chrome.runtime.sendMessage). SW does NOT cache chunks. CON-buffer-storage is honored: the buffer is a plain JS variable in offscreen memory, no chrome.storage.session, no IndexedDB. The existing IndexedDB code path in vite.config.ts:43-104 is DELETED along with the inline plugin.
Doc Amendments (precede code)
These document edits MUST ship before any code-touching task in this phase, so downstream phases see a consistent baseline:
- D-A1: Amend .planning/intel/decisions.md DEC-003 to record the getDisplayMedia replacement, with rationale and the explicit silent-operation trade-off. Amend DEC-010 to record port keepalive replacing alarms keepalive.
- D-A2: Amend .planning/intel/constraints.md to RETIRE CON-tab-capture-binding and CON-service-worker-keepalive. Add new CON-display-capture-binding (one-time picker, "Sharing" indicator).
- D-A3: Amend .planning/PROJECT.md Key Decisions table (DEC-003, DEC-010) and Constraints section accordingly.
- D-A4: Amend .planning/REQUIREMENTS.md REQ-video-ring-buffer to remove "active-tab" wording and update API binding.
- D-A5: Amend .planning/ROADMAP.md Phase 1 description and Success Criterion #2 (drop the "tab re-attach" clause).
- D-A6: Amend manifest.json: swap tabCapture → desktopCapture in permissions. Keep activeTab for the screenshot path.
Claude's Discretion
- Exact protocol choice for offscreen↔SW messaging (port for keepalive + sendMessage for one-shot vs port-only).
- Codec strictness: enforce video/webm; codecs=vp9 via MediaRecorder.isTypeSupported; fail loud if unsupported (no fallback chain: the current code's vp9→vp8→h264→default fallback is removed).
- Internal naming for the new buffer-owning module (offscreen-recorder vs display-recorder etc.).
- Code-style choices around TS strictness within src/offscreen/ (already on "strict": true per tsconfig).
Deferred Ideas (OUT OF SCOPE)
- Error UX for "user stopped sharing" mid-session. The popup needs a state for this — Phase 3 territory (REQ-popup-ui state machine extension).
- Audio capture. getDisplayMedia() makes audio capture trivial (audio: true), but SPEC §9 explicitly excludes audio from Phase 1 (Phase 2 work, CAP-01). Capture this as an easier-now-than-before follow-up.
- Per-tab silent capture mode as an opt-in via config.json. Could re-introduce tabCapture for installations that prioritize silent operation over broad coverage. Future phase if there's demand.
- Cluster-aware EBML trim (ts-ebml). Not needed for Phase 1 if continuous + age-trim verifies via ffprobe. Keep on the shelf as a third fallback under D-13.
- chrome.storage.session cold-start recovery. Buffer pointer rehydration after offscreen crash. Phase 5 (Harden + clean up) territory.
</user_constraints>
<phase_requirements>
Phase Requirements
| ID | Description | Research Support |
|---|---|---|
| REQ-video-ring-buffer | 30 s active-tab video ring buffer captured via MediaRecorder at video/webm; codecs=vp9 @ 400 kbps with 2 s timeslice. AMENDED: capture API is getDisplayMedia() (D-01), not chrome.tabCapture. First chunk (WebM header) retained indefinitely (CON-webm-header-retention); subsequent chunks rotate out by 30 s TTL. Capture is always-on: starts on first popup invocation, runs continuously regardless of which tab the operator is on (no tab re-attach needed: display capture is screen/window-bound, not tab-bound). | (1) Canonical pattern for SW + offscreen + getDisplayMedia confirmed by Google sample + working production extension (Proscreen-S3). (2) WebM header / cluster trim semantics documented under "Pitfall 1" + "Validation Architecture". (3) Port-keepalive replaces alarm-keepalive per Chrome 110+ docs. (4) MediaRecorder.start(2000) semantics documented under Pitfall 1 with D-13 fallback if cluster alignment fails ffprobe gate. |
</phase_requirements>
Project Constraints (from CLAUDE.md)
No project-level CLAUDE.md exists at /home/parf/projects/work/repremium/CLAUDE.md. User's global ~/.claude/CLAUDE.md applies; relevant excerpts:
- Iterative development: Small, reviewable changes. Break large work into phases. Plans should be concise (< 100 lines); detail goes into context/research files.
- Extension over duplication: Add functionality to existing code via options/parameters rather than parallel implementations. (Applies to reusing videoBuffer / cleanupVideoBuffer patterns from the current SW: preserve structure, relocate to offscreen.)
- Defensive coding: Validate external dependencies and environment early; fail fast with clear error messages. (Codec fail-loud via MediaRecorder.isTypeSupported; track-ended detection.)
- Naming: Full words, isFoo / hasFoo / shouldFoo for booleans, SCREAMING_SNAKE for true constants.
- Tools first: Use automated tools before manual edits. (crxjs handles the offscreen build; do not hand-roll Vite plugins.)
- Verify claims before presenting. Cite authoritative sources.
- TypeScript: Type arrow-function parameters explicitly.
- Don't ignore lint/type errors without research. (Maps to audit P1 #13: no as any, no @ts-ignore in new code.)
- Naming convention violation already in repo: mediaRecorder (camel) shadowing module-level let mediaRecorder is the exact P0 #2 defect we are fixing; rename module-level to avoid recurrence.
Note on the codebase's Russian inline comments: The user's global rule prefers Python/Google style guides, but this repo is a TypeScript extension built to a Russian-authored SPEC. Inline Russian comments are idiomatic and preserved per the SPEC's source-of-truth language (also reaffirmed in CONTEXT.md "Established patterns"). User-facing strings ("Сохранить отчёт об ошибке" etc.) are part of the contract.
Summary
The audit's seven P0 defects boil down to two structural problems in this
phase:
(a) The offscreen runtime lives as a string literal inside
vite.config.ts:11-184 and shadows the real offscreen/index.ts, with a
shadow let mediaRecorder that makes stopRecording a no-op.
(b) The ring-buffer math is right in src/background/index.ts but the
lifecycle plumbing is wrong. mediaRecorder.start(200) produces too-short
chunks that mostly don't start on WebM cluster boundaries. Capture only
begins when the popup is opened. The SW's chrome.alarms keepalive does run,
but the SW still loses its videoBuffer array between idle unloads. And the
SW's VIDEO_CHUNK message handler expects a Blob that
chrome.runtime.sendMessage cannot transmit (forcing the buggy IndexedDB
workaround in vite.config.ts:43-104).
CONTEXT.md amends DEC-003 to getDisplayMedia() instead of chrome.tabCapture
— eyes-open trade-off, broader capture coverage at the cost of the Chrome
"Sharing your screen" banner. This is a canonical Chrome MV3 pattern:
[CITED: developer.chrome.com/docs/extensions/how-to/web-platform/screen-capture]
"To record in the background and across navigations, use an offscreen
document with the DISPLAY_MEDIA reason." We have at least one in-the-wild
production extension (Proscreen-S3) confirming the exact architecture
works.
Primary recommendation: Build src/offscreen/recorder.ts as a real
TS module that owns: (1) a single continuous MediaRecorder started with
timeslice=2000, (2) the in-memory ring buffer with WebM-header pinning
and 30 s arrival-timestamp trim, (3) a long-lived chrome.runtime.connect
port to the SW that doubles as the SW keepalive, and (4) a single
on-demand GET_BUFFER handler that returns the chunks for ZIP packaging.
The SW shrinks to: offscreen lifecycle management + port handling +
manifest-time recording bootstrap. The verification gate is ffprobe -v error
on a fresh export sample — if that fails because cluster boundaries don't
align with the 2 s timeslice, fall back to D-13's restart-segments
strategy (pre-staged in PLAN.md so we don't have to re-plan mid-phase).
Architectural Responsibility Map
| Capability | Primary Tier | Secondary Tier | Rationale |
|---|---|---|---|
| Display capture (getDisplayMedia) | Offscreen Document | — | SW has no DOM and cannot hold a MediaStream. Chrome 116+ requires chrome.offscreen.Reason.DISPLAY_MEDIA. [CITED: developer.chrome.com/docs/extensions/reference/api/offscreen] |
| MediaRecorder lifecycle | Offscreen Document | — | MediaRecorder instances are tied to a MediaStream which lives in the offscreen DOM context. |
| In-memory ring buffer | Offscreen Document | — | SW unloads after ~30 s idle (Chrome 110+ rules); offscreen survives because it owns the DISPLAY_MEDIA capture. |
| Codec capability check (isTypeSupported) | Offscreen Document | — | API is on MediaRecorder, which is offscreen-bound. SW reports the result for telemetry. |
| Offscreen lifecycle (create / close / hasDocument) | Service Worker | — | chrome.offscreen.* API is SW-bound. |
| Long-lived port keepalive | Offscreen Document → SW | — | Offscreen initiates chrome.runtime.connect() because it is the long-living party with a real reason to stay alive. SW receives the port. |
| Buffer export on user action | Service Worker | Offscreen Document | SW receives popup message, requests buffer from offscreen over the port, returns chunks to popup. |
| Manifest permission boundary | Manifest | — | desktopCapture for the API name (CONTEXT.md D-A6); offscreen to gate chrome.offscreen.*. Note: getDisplayMedia() itself is a web standard API and does NOT require desktopCapture (which gates only chrome.desktopCapture.chooseDesktopMedia). Including desktopCapture is harmless and matches CONTEXT.md D-05. [VERIFIED: chrome.desktopCapture API docs] |
| Stop-sharing recovery | Offscreen Document | Service Worker | MediaStreamTrack.onended fires inside offscreen; offscreen messages SW; SW updates state for popup (popup state machine is Phase 3 territory). |
Standard Stack
Core
| Library | Version | Purpose | Why Standard |
|---|---|---|---|
| @crxjs/vite-plugin | ^2.4.0 (currently ^2.0.0-beta.25 in package.json) | Vite plugin that reads manifest.json, bundles each entry (SW, content scripts, popup, offscreen HTML), and produces a Chrome-loadable dist/. | Standard build for MV3 + TS + Vite per the project's existing setup (DEC-012). [VERIFIED: npm view @crxjs/vite-plugin version returned 2.4.0 on 2026-05-15] |
| @types/chrome | ^0.1.42 (currently ^0.0.268 in package.json) | Type definitions for the chrome.* namespace including chrome.offscreen.Reason.DISPLAY_MEDIA. | Audit P1 #13 calls out that the current 0.0.268 is stale; the project needs to bump to drop the as any on reasons: ['USER_MEDIA']. [VERIFIED: npm view @types/chrome version returned 0.1.42 on 2026-05-15] |
| vite | ^8.0.13 (currently ^5.4.2 in package.json) | Bundler. | Already a hard project decision (DEC-012). Phase 1 does NOT mandate a Vite bump; sticking with 5.4 is fine, and the bump is a Phase 5 housekeeping task. [VERIFIED: npm view vite version returned 8.0.13 on 2026-05-15] |
| typescript | ^6.0.3 (currently ^5.5.4 in package.json) | Type-check. Strict mode is already enabled in tsconfig.json. | Project decision. Phase 1 keeps 5.5; same Phase 5 housekeeping observation. [VERIFIED: npm view typescript version returned 6.0.3 on 2026-05-15] |
No new dependencies are needed for Phase 1. JSZip and rrweb stay untouched (Phase 2 / 3 territory). All new code uses the standard Web Platform APIs (MediaRecorder, navigator.mediaDevices, chrome.offscreen, chrome.runtime.connect).
Supporting (Phase 1 specifically uses)
| Library | Version | Purpose | When to Use |
|---|---|---|---|
| Web Platform: MediaRecorder | Built-in | Encode the captured MediaStream into a chunked WebM stream. | Inside the offscreen, after getDisplayMedia() returns a stream. |
| Web Platform: navigator.mediaDevices.getDisplayMedia | Built-in | Acquire the operator's choice of screen/window/tab as a MediaStream. | Inside the offscreen, once on session start, in the message handler for START_RECORDING. |
| Chrome API: chrome.offscreen.{createDocument, closeDocument, hasDocument, Reason} | Chrome 109+ for API; Chrome 116+ recommended baseline (matches the canonical Google sample's minimum_chrome_version). | Create + tear down the offscreen runtime. | SW only. |
| Chrome API: chrome.runtime.{connect, sendMessage, onConnect, onMessage} | Built-in | Cross-context messaging. | Both SW and offscreen. |
Alternatives Considered (Honored CONTEXT.md, recorded for completeness)
| Instead of | Could Use | Tradeoff |
|---|---|---|
| getDisplayMedia() in offscreen | chrome.tabCapture.getMediaStreamId in SW + getUserMedia({chromeMediaSource: 'tab'}) in offscreen (canonical Google sample pattern) | Tab-scoped only; silent (no Chrome banner); requires user-gesture juggling on first activation; loses capture on tab switch. Rejected per CONTEXT.md D-01. |
| getDisplayMedia() in offscreen | chrome.desktopCapture.chooseDesktopMedia in SW + redeem ID in offscreen | Chrome-specific; doc explicitly says streamId not usable in offscreen MV3 [CITED: groups.google.com chromium-extensions/3RanHldyp9c]. Not viable. |
| Single continuous recorder + age-trim | Restart-segments (10 s self-contained segments, keep 3 most-recent) | Each segment is its own valid WebM, concat-on-save is trivial, but burns ~3× more keyframes (bigger files). Held in reserve as D-13 fallback if ffprobe -v error fails on the simpler approach. |
| Restart-segments | ts-ebml header injection on save | More plumbing, dependency, and runtime cost. Held in reserve as third fallback per CONTEXT.md deferred. |
Installation: No npm install needed for Phase 1 (zero new deps).
Type-bump for @types/chrome (^0.0.268 → ^0.1.42) is a one-line
package.json edit, optional within this phase but recommended.
Version verification: All package versions in the table above are
verified via npm view <pkg> version on 2026-05-15.
Architecture Patterns
System Architecture Diagram
┌────────────────────────────────────────────────────────────────────────┐
│ Operator interactions │
└────────────────────────────────────────────────────────────────────────┘
│ click popup
▼
┌────────────┐ REQUEST_PERMISSIONS / GET_VIDEO_BUFFER ┌──────────────┐
│ popup │ ──────────────────────────────────────────► │ Service │
│ (Russian │ ◄────────────────────────────────────────── │ Worker │
│ state-mc) │ responses │ (background) │
└────────────┘ └──────┬───────┘
│
chrome.offscreen.createDocument
({reasons:['DISPLAY_MEDIA']})
│
▼
┌──────────────────┐
long-lived │ Offscreen Doc │
port (keepalive + │ (DOM context) │
buffer fetch) │ │
SW ◄──────────────────────────────►│ recorder.ts │
│ - getDisplayMedia
│ - MediaRecorder │
│ - ring buffer │
│ - track.onended │
└─────┬────────────┘
│
navigator.mediaDevices
.getDisplayMedia()
│
▼
[ Chrome native ]
[ source picker ]
[ + Sharing UI ]
│
▼
┌──────────────────┐
│ MediaStream │
│ (screen/window) │
└─────┬────────────┘
│
MediaRecorder.start(2000)
│
▼
dataavailable chunks
(every ~2000 ms)
│
▼
in-memory ring buffer
(offscreen JS array)
Data flow on export (Phase 3 territory but the SW↔offscreen contract is
locked here):
popup --SAVE_ARCHIVE--> SW --GET_BUFFER--> offscreen
offscreen --VIDEO_CHUNKS--> SW --(merge)--> popup --(jszip + download)
| Component | File | Responsibilities |
|---|---|---|
| Operator-facing popup | src/popup/index.{ts,html,css} | UI state machine, click handlers, archive trigger. Phase 3 owns most edits; Phase 1 touches it only minimally to unwire the dead REQUEST_PERMISSIONS path. |
| Service Worker (background coordinator) | src/background/index.ts | Offscreen lifecycle (createDocument / closeDocument / hasDocument), port handling, buffer-fetch on export, message routing. Shrinks substantially in this phase. |
| Offscreen recorder (NEW) | src/offscreen/recorder.ts | getDisplayMedia call, MediaRecorder instance, ring buffer, codec capability check, port to SW (keepalive + on-demand buffer push), MediaStreamTrack.onended handler. |
| Offscreen page (NEW) | src/offscreen/index.html | Minimal HTML referencing recorder.ts via <script type="module" src="./recorder.ts"></script>. crxjs picks it up. |
| Manifest | manifest.json | Swap tabCapture → desktopCapture. Add nothing else; offscreen is already declared. |
| Vite config | vite.config.ts | Collapse to a clean crx({manifest, contentScripts: {injectCss: false}}) + rollupOptions.input entry for offscreen HTML. Delete the entire 174-line copy-offscreen plugin block. |
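
As a rough sketch of the collapsed config the last table row describes: this is a config fragment, not a tested build. It assumes resolveJsonModule is enabled so manifest.json can be imported, and that the extra offscreen page is registered through rollupOptions.input. The input key name offscreen is arbitrary, and the HTML path follows D-06/D-07.

```typescript
// vite.config.ts (sketch; replaces the inline copy-offscreen plugin)
import { defineConfig } from 'vite';
import { crx } from '@crxjs/vite-plugin';
import manifest from './manifest.json';

export default defineConfig({
  plugins: [
    // Options as named in the component table above.
    crx({ manifest, contentScripts: { injectCss: false } }),
  ],
  build: {
    rollupOptions: {
      // Extra HTML entry that is not referenced from manifest.json.
      input: { offscreen: 'src/offscreen/index.html' },
    },
  },
});
```

Whether the emitted runtime path ends up as offscreen/index.html or src/offscreen/index.html should be checked against dist/ after the first build, since D-07 and Pattern 1 name different paths.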
Recommended Project Structure
repremium/
├── manifest.json # swap tabCapture→desktopCapture
├── vite.config.ts # collapse to ~30 lines
├── src/
│ ├── background/
│ │ └── index.ts # shrinks: lifecycle + port + export
│ ├── content/
│ │ └── index.ts # untouched in Phase 1
│ ├── popup/
│ │ ├── index.html # untouched in Phase 1
│ │ ├── index.ts # minor: drop dead REQUEST_PERMISSIONS path
│ │ └── style.css # untouched
│ ├── offscreen/ # NEW directory (replaces top-level offscreen/)
│ │ ├── index.html # NEW: <script src="./recorder.ts" type="module">
│ │ └── recorder.ts # NEW: the real source-of-truth
│ └── shared/
│ ├── logger.ts # add OffscreenLogger or reuse with prefix
│ └── types.ts # wire up OFFSCREEN_READY; rename VIDEO_CHUNK
└── offscreen/ # DELETE (entire directory)
Pattern 1: Offscreen + DISPLAY_MEDIA bootstrap
What: SW ensures a single offscreen document exists, then asks it to
start recording. Offscreen calls getDisplayMedia(), which triggers the
Chrome native picker.
When to use: Once per session (and again if the operator clicked "Stop sharing" and a fresh popup interaction happens).
Example (canonical pattern from production extension):
// Source: github.com/ngocquy020196/Proscreen-S3/blob/main/src/background/recording.ts
// [VERIFIED: in-the-wild MV3 extension using exactly this pattern]
// SW side
async function createOffscreenIfNeeded() {
  const existingContexts = await chrome.runtime.getContexts({
    contextTypes: [chrome.runtime.ContextType.OFFSCREEN_DOCUMENT],
  });
  if (existingContexts.length > 0) return;
  await chrome.offscreen.createDocument({
    url: 'src/offscreen/index.html', // crxjs-emitted path
    reasons: [chrome.offscreen.Reason.DISPLAY_MEDIA],
    justification: 'Continuous screen recording for operator session diagnostics',
  });
}

async function handleStartRecording() {
  await createOffscreenIfNeeded();
  // Wait briefly for offscreen's onMessage listener, OR use the OFFSCREEN_READY
  // handshake (preferred: see Pattern 4).
  chrome.runtime.sendMessage({ type: 'START_RECORDING', target: 'offscreen' });
}

// Source: same repo, src/offscreen/recorder.ts
// Offscreen side
chrome.runtime.onMessage.addListener((msg) => {
  if (msg.target !== 'offscreen') return;
  if (msg.type === 'START_RECORDING') startRecording();
  if (msg.type === 'STOP_RECORDING') stopRecording();
});

async function startRecording() {
  const stream = await navigator.mediaDevices.getDisplayMedia({
    video: true,
    audio: false, // SPEC §9: audio is Phase 2/CAP-01 territory
  });
  const recorder = new MediaRecorder(stream, { mimeType: 'video/webm;codecs=vp9' });
  recorder.ondataavailable = (e) => { if (e.data.size > 0) ringBuffer.push(e.data); };
  recorder.start(2000);
  // Track end detection: fires when operator clicks Chrome "Stop sharing"
  stream.getVideoTracks()[0].addEventListener('ended', onUserStoppedSharing);
}
Important: [VERIFIED: tested pattern] Chrome carries the popup's
transient user activation across chrome.runtime.sendMessage. The
chain "popup click → SW message → SW creates offscreen → SW sends start
message → offscreen calls getDisplayMedia" works because it stays
inside one transient-activation window (within ~5 s of the click).
This is the same mechanism the canonical Google sample relies on via
chrome.action.onClicked → SW → offscreen.
Pattern 2: Single continuous recorder + age-trim ring buffer
What: One MediaRecorder started once with timeslice=2000 for the
whole session. Pin the first emitted chunk (EBML header + initial
cluster). Drop later chunks once they age past 30 s.
When to use: Phase 1 baseline (CONTEXT.md D-09..D-11).
Example:
// Source: structural pattern from src/background/index.ts:21-66 (current code)
// [VERIFIED: pattern works locally; only the location/lifecycle needs fixing]
const RING_WINDOW_MS = 30_000;
type Chunk = { data: Blob; timestamp: number; isHeader: boolean };
const ringBuffer: Chunk[] = [];

recorder.ondataavailable = (event: BlobEvent) => {
  if (event.data.size === 0) return;
  const isHeader = ringBuffer.length === 0; // first chunk = WebM header
  ringBuffer.push({ data: event.data, timestamp: Date.now(), isHeader });
  trimAged();
};

function trimAged(): void {
  const cutoff = Date.now() - RING_WINDOW_MS;
  // Keep header chunk + every chunk newer than cutoff.
  // Iterate backwards so splice() does not shift indices still to be visited.
  for (let i = ringBuffer.length - 1; i >= 0; i--) {
    const c = ringBuffer[i];
    if (!c.isHeader && c.timestamp < cutoff) ringBuffer.splice(i, 1);
  }
}
Why this works (in theory) AND its risk:
[CITED: stackoverflow #62236838] "MediaRecorder API inserts header
information into the first chunk (WebM file) only, so rest of the chunks
do not play individually without the header information." Concatenating
[header] + [aged-out tail] produces a playable file IF the
post-header chunks each start on a WebM cluster boundary (each cluster
begins with a keyframe). [CITED: bugzilla.mozilla.org #1666487, Andreas
Pehrson] "There has been no intention to encode keyframes at the
timeslice interval … Google chrome outputs chunks at approx. timeslice
interval, even if clusters haven't finished then, so keyframe intervals
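are much longer there." Chrome sets kf_max_dist=100 (at most 100 frames
between keyframes), so keyframes land roughly every 3-5 s at 20-30 fps.
With timeslice=2000 ms, roughly every 2nd chunk will contain the start of
a fresh cluster; the others fall entirely mid-cluster, and even a chunk
that contains a cluster start need not begin on it.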
Risk: at any given moment, the chunks newer than now - 30000 ms
might NOT begin with a cluster boundary. The pinned header chunk + a
mid-cluster body chunk = corrupt input that decoders refuse past the
first GoP.
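
The arithmetic behind the "every 3-5 s" and "every 2nd chunk" figures can be made explicit. The sketch below assumes kf_max_dist is a cap in frames and that capture runs at 20-30 fps; the frame rates and both function names are illustrative, not measured values from this project.

```typescript
/** Longest gap between keyframes, in ms, for an encoder capped at kfMaxDist frames. */
function maxKeyframeIntervalMs(kfMaxDist: number, framesPerSecond: number): number {
  return (kfMaxDist / framesPerSecond) * 1000;
}

/**
 * Rough share of timeslice chunks that contain a keyframe at all, assuming
 * evenly spaced keyframes. A chunk that contains a keyframe still need not
 * START on it, so this is an upper bound on "clean" chunk starts.
 */
function shareOfChunksWithKeyframe(keyframeIntervalMs: number, timesliceMs: number): number {
  return Math.min(1, timesliceMs / keyframeIntervalMs);
}

const intervalAt30Fps = maxKeyframeIntervalMs(100, 30); // about 3333 ms
const intervalAt20Fps = maxKeyframeIntervalMs(100, 20); // 5000 ms
const shareAt20Fps = shareOfChunksWithKeyframe(intervalAt20Fps, 2000); // 0.4
```

Under these assumptions somewhere between 40% and 60% of 2 s chunks contain a keyframe, which is why the age-trimmed tail cannot be assumed to start cleanly.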
Verification gate (D-12): ffprobe -v error -f matroska -i last_30sec.webm must exit 0. If it doesn't, escalate to Pattern 3.
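
Before running ffprobe, a cheap in-process heuristic can flag the risk described above by looking at the first four bytes of a chunk. This is only a sketch: it assumes a chunk that begins on an element boundary starts with the element's ID bytes (EBML header 1A 45 DF A3, Cluster 1F 43 B6 75), the function and type names are hypothetical, and it does not replace the D-12 gate.

```typescript
const EBML_HEADER_ID = [0x1a, 0x45, 0xdf, 0xa3] as const;
const CLUSTER_ID = [0x1f, 0x43, 0xb6, 0x75] as const;

type ChunkStart = 'ebml-header' | 'cluster' | 'mid-cluster';

/** True when bytes begins with the given element ID. */
function startsWithId(bytes: Uint8Array, id: readonly number[]): boolean {
  if (bytes.length < id.length) return false;
  return id.every((value: number, index: number): boolean => bytes[index] === value);
}

/** Classifies a chunk by its leading bytes; anything else is treated as mid-cluster data. */
function classifyChunkStart(bytes: Uint8Array): ChunkStart {
  if (startsWithId(bytes, EBML_HEADER_ID)) return 'ebml-header';
  if (startsWithId(bytes, CLUSTER_ID)) return 'cluster';
  return 'mid-cluster';
}
```

In the offscreen this would be fed the first bytes of each Blob (for example via blob.slice(0, 4).arrayBuffer()), and a 'mid-cluster' result on the oldest retained body chunk is an early warning that the export may fail ffprobe.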
Pattern 3: Restart-segments (D-13 fallback, pre-stage in PLAN.md)
What: Stop + restart the MediaRecorder every 10 s. Each "segment"
is a self-contained playable WebM. Keep the 3 most-recent segments and
concatenate them on export (using Blob concatenation is enough; each
segment has its own header so playback is sequential, not a single
seamless track).
When to use: If Pattern 2 + ffprobe fails. CONTEXT.md D-13 declares this the documented fallback so the planner pre-stages the alternative file structure in PLAN.md, avoiding a mid-phase re-plan.
Example:
// [ASSUMED] No external citation; this is the well-known structural fallback
// inferred from spec § (kf_max_dist=100) + the verified behavior that each
// MediaRecorder.start() emits a complete EBML preamble.
const SEGMENT_MS = 10_000;
const MAX_SEGMENTS = 3;
const MIME = 'video/webm;codecs=vp9';
const segments: Blob[] = [];
let currentChunks: Blob[] = [];
let recorder: MediaRecorder | null = null;

// Call once after getDisplayMedia() resolves; later segments start from onstop.
function startSegment(stream: MediaStream): void {
  currentChunks = [];
  const next = new MediaRecorder(stream, { mimeType: MIME });
  next.ondataavailable = (e: BlobEvent) => { if (e.data.size > 0) currentChunks.push(e.data); };
  next.onstop = () => onSegmentStopped(stream);
  next.start(); // no timeslice: one self-contained blob per segment
  recorder = next;
  setTimeout(rotateSegment, SEGMENT_MS);
}

function rotateSegment(): void {
  // stop() flushes a final dataavailable event, then fires onstop
  if (recorder && recorder.state !== 'inactive') recorder.stop();
}

function onSegmentStopped(stream: MediaStream): void {
  segments.push(new Blob(currentChunks, { type: 'video/webm' }));
  if (segments.length > MAX_SEGMENTS) segments.shift();
  // Do not restart once the operator has stopped sharing (stream inactive).
  if (stream.active) startSegment(stream);
}
Trade-off vs Pattern 2: ~3× the keyframes (bigger file), but every output WebM is independently valid. ffprobe-clean by construction. The "30 s window" becomes ~30 s ± 10 s depending on phase of rotation; the CON-video-window contract allows this slack (it says "the most recent 30 seconds" not "exactly 30 seconds").
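
The segment bookkeeping can be kept separate from MediaRecorder so it is unit-testable. The sketch below is one possible shape, with hypothetical names (Segment, SegmentRing); in the real recorder each entry would also carry the segment Blob.

```typescript
interface Segment {
  startedAtMs: number;
  durationMs: number;
}

/** Keeps only the most recent maxSegments completed segments. */
class SegmentRing<T extends Segment> {
  private readonly items: T[] = [];
  private readonly maxSegments: number;

  constructor(maxSegments: number) {
    this.maxSegments = maxSegments;
  }

  push(segment: T): void {
    this.items.push(segment);
    while (this.items.length > this.maxSegments) this.items.shift();
  }

  list(): readonly T[] {
    return this.items;
  }

  /** Total recorded time currently held, in ms. */
  coveredMs(): number {
    return this.items.reduce((total: number, s: T): number => total + s.durationMs, 0);
  }
}

const ring = new SegmentRing<Segment>(3);
for (let i = 0; i < 4; i++) ring.push({ startedAtMs: i * 10_000, durationMs: 10_000 });
// ring now holds the segments that started at 10 s, 20 s and 30 s.
```

With three completed 10 s segments the ring covers 30 s of footage; how much of the in-progress segment is added on export determines where in the ±10 s slack a given archive lands.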
Pattern 4: OFFSCREEN_READY handshake
What: Offscreen sends OFFSCREEN_READY to SW after its onMessage
listener is registered. SW waits for that signal before sending
START_RECORDING. Avoids the race the audit flagged at P1 #12.
When to use: Anywhere the SW would otherwise chrome.runtime.sendMessage
to the offscreen immediately after chrome.offscreen.createDocument()
resolves.
Example:
// SW
let offscreenReadyResolve: (() => void) | null = null;
const offscreenReady = new Promise<void>((res) => { offscreenReadyResolve = res; });

chrome.runtime.onMessage.addListener((msg) => {
  if (msg.type === 'OFFSCREEN_READY') offscreenReadyResolve?.();
});

async function startRecording() {
  await createOffscreenIfNeeded();
  await offscreenReady;
  chrome.runtime.sendMessage({ type: 'START_RECORDING', target: 'offscreen' });
}

// Offscreen (top of recorder.ts, after listener registration)
chrome.runtime.onMessage.addListener((msg) => { /* ... */ });
chrome.runtime.sendMessage({ type: 'OFFSCREEN_READY' }); // tell SW we are listening
The OFFSCREEN_READY Message type is already declared in
src/shared/types.ts:18 but unused. Phase 1 wires it up.
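
One limitation of a single module-level promise is that it resolves once: if the offscreen document is closed and recreated (for example after "Stop sharing"), the stale resolved promise would let START_RECORDING go out before the new listener exists. A resettable latch is one way around that. The sketch below uses a hypothetical ReadyLatch class and is independent of chrome.* so it can be tested in isolation.

```typescript
/** One-shot readiness signal that can be re-armed before each createDocument(). */
class ReadyLatch {
  private ready = false;
  private waiters: Array<() => void> = [];

  get isReady(): boolean {
    return this.ready;
  }

  /** Call when OFFSCREEN_READY arrives. */
  markReady(): void {
    this.ready = true;
    const pending = this.waiters;
    this.waiters = [];
    pending.forEach((wake: () => void): void => wake());
  }

  /** Call right before (re)creating the offscreen document. */
  reset(): void {
    this.ready = false;
  }

  /** Resolves immediately if already ready, otherwise on the next markReady(). */
  whenReady(): Promise<void> {
    if (this.ready) return Promise.resolve();
    return new Promise<void>((resolve) => {
      this.waiters.push(() => resolve());
    });
  }
}
```

In the SW this would read: latch.reset() before createDocument, latch.markReady() in the OFFSCREEN_READY handler, and await latch.whenReady() before sending START_RECORDING. A timeout around whenReady() is left out here and would be needed to fail loudly if the offscreen never reports in.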
Pattern 5: Long-lived port as SW keepalive + buffer-fetch channel
What: Offscreen opens chrome.runtime.connect({ name: 'video-keepalive' }).
Each side periodically postMessages to reset the SW's 30 s idle timer.
SW also uses the port to one-shot-request the buffer on export.
When to use: Always-on, for the lifetime of the recording session.
Example:
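// Offscreen
const port = chrome.runtime.connect({ name: 'video-keepalive' });
setInterval(() => port.postMessage({ type: 'PING' }), 25_000); // < 30 s idle
port.onMessage.addListener((msg: { type: string }) => {
  // NOTE: Chrome documents port messages as JSON-serializable, so Blob chunks
  // must be encoded (or verified to survive the hop) before this postMessage.
  if (msg.type === 'REQUEST_BUFFER') port.postMessage({ type: 'BUFFER', chunks: ringBuffer });
});

// SW
type PortMessage = { type: string; chunks?: Chunk[] };
let videoPort: chrome.runtime.Port | null = null;

chrome.runtime.onConnect.addListener((p: chrome.runtime.Port) => {
  if (p.name !== 'video-keepalive') return;
  videoPort = p;
  p.onMessage.addListener((msg: PortMessage) => {
    if (msg.type === 'BUFFER') { /* resolve pending export */ }
  });
  p.onDisconnect.addListener(() => { videoPort = null; });
});

async function exportBuffer(): Promise<Chunk[]> {
  // ensureOffscreenAndPort() is not shown: it must (re)create the offscreen
  // document and wait for onConnect to populate videoPort.
  if (!videoPort) await ensureOffscreenAndPort();
  const activePort = videoPort;
  if (!activePort) throw new Error('video-keepalive port is not connected');
  return new Promise<Chunk[]>((resolve) => {
    const handler = (msg: PortMessage): void => {
      if (msg.type !== 'BUFFER') return;
      activePort.onMessage.removeListener(handler);
      resolve(msg.chunks ?? []);
    };
    activePort.onMessage.addListener(handler);
    activePort.postMessage({ type: 'REQUEST_BUFFER' });
  });
}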
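
One caveat on the BUFFER message above: Chrome's documentation for Port.postMessage describes the message as JSON-ifiable, so Blob payloads may arrive as empty objects just as they do with sendMessage. A conservative option is to read each chunk into bytes and ship it as a base64 string. The helpers below are a dependency-free sketch with hypothetical names; btoa/atob or another transport would work equally well if available.

```typescript
const BASE64_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

/** Encodes raw bytes as a padded base64 string. */
function bytesToBase64(bytes: Uint8Array): string {
  let output = '';
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i] ?? 0;
    const b1 = bytes[i + 1] ?? 0;
    const b2 = bytes[i + 2] ?? 0;
    output += BASE64_ALPHABET.charAt(b0 >> 2);
    output += BASE64_ALPHABET.charAt(((b0 & 0x03) << 4) | (b1 >> 4));
    output += i + 1 < bytes.length ? BASE64_ALPHABET.charAt(((b1 & 0x0f) << 2) | (b2 >> 6)) : '=';
    output += i + 2 < bytes.length ? BASE64_ALPHABET.charAt(b2 & 0x3f) : '=';
  }
  return output;
}

/** Decodes a padded base64 string back into bytes. */
function base64ToBytes(text: string): Uint8Array {
  const clean = text.replace(/=+$/, '');
  const bytes = new Uint8Array(Math.floor((clean.length * 3) / 4));
  let buffer = 0;
  let bits = 0;
  let offset = 0;
  for (let i = 0; i < clean.length; i++) {
    const value = BASE64_ALPHABET.indexOf(clean.charAt(i));
    if (value < 0) throw new Error(`Invalid base64 character at index ${i}`);
    buffer = ((buffer << 6) | value) & 0xffff; // keep only the bits still needed
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      bytes[offset++] = (buffer >> bits) & 0xff;
    }
  }
  return bytes;
}
```

On the offscreen side each chunk would go through new Uint8Array(await blob.arrayBuffer()) before encoding; the SW or popup decodes and rebuilds the Blob. Base64 inflates the payload by about a third, which is acceptable for a buffer of roughly 1.5 MB but worth measuring.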
Why this beats chrome.alarms:
[VERIFIED: developer.chrome.com/blog/longer-esw-lifetimes] As of Chrome
110, "All events reset the idle timer." Alarm events do reset the timer
when they fire, but each alarm only restarts the 30 s idle countdown from
0: if nothing else happens in the next 30 s the SW unloads anyway. The
20 s cadence the current code requests cannot close that gap, because
Chrome enforces a minimum alarm period for packed extensions (30 s as of
Chrome 120, 1 minute before). Port postMessage traffic across both
directions resets the timer continuously. [CITED: developer.chrome.com SW
lifecycle] Chrome 114 change: "Sending a message with long-lived messaging
keeps the service worker alive." Note: opening a port no longer resets the
timers; messages across the port do. Be sure to ping, don't just connect.
Important: 5-minute port lifetime cap. [CITED: gist
sunnyguan/f94058f66fab89e59e75b1ac1bf1a06e, multiple corroborating
sources] Chrome closes long-lived ports after ~5 minutes regardless of
traffic. Production extensions reconnect on onDisconnect to refresh the
window. Implementation note: offscreen's port.onDisconnect handler should
immediately call chrome.runtime.connect() again to mint a fresh port.
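The reconnect behaviour described above can be sketched as a small wrapper. This is a minimal sketch under stated assumptions, not the project's implementation: `createKeepalivePort`, `PortLike` and `Connect` are hypothetical names, and the connect function is injected so the logic can run outside Chrome (in the offscreen document it would be `chrome.runtime.connect`).

```typescript
// Minimal structural types so the sketch runs outside Chrome; in the
// extension these correspond to chrome.runtime.Port / chrome.runtime.connect.
interface PortLike {
  postMessage(msg: unknown): void;
  disconnect(): void;
  onDisconnect: { addListener(cb: () => void): void };
}
type Connect = (info: { name: string }) => PortLike;

// Hypothetical helper: keeps exactly one live port and replaces it when
// the current one disconnects or when refresh() is called.
function createKeepalivePort(connect: Connect, name = 'video-keepalive') {
  let port: PortLike | null = null;
  let stopped = false;

  const open = (): void => {
    if (stopped) return;
    const p = connect({ name });
    port = p;
    p.onDisconnect.addListener(() => {
      // Ignore disconnects from ports that were already replaced.
      if (port === p) {
        port = null;
        open();
      }
    });
  };

  open();
  return {
    ping: (): void => { port?.postMessage({ type: 'PING' }); },
    // Pre-emptive refresh: drop the current port and mint a new one.
    refresh: (): void => {
      const old = port;
      port = null;
      old?.disconnect();
      open();
    },
    stop: (): void => {
      stopped = true;
      const old = port;
      port = null;
      old?.disconnect();
    },
    current: (): PortLike | null => port,
  };
}
```

In the offscreen document this would be driven by two timers, for example `setInterval(keepalive.ping, 25_000)` and `setInterval(keepalive.refresh, 290_000)`. The 290 s figure follows the pre-emptive reconnect interval discussed in this document and is itself an assumption about the port lifetime cap.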
Pattern 6: Codec strict-mode (CONTEXT.md D-20)
What: Test MediaRecorder.isTypeSupported('video/webm;codecs=vp9')
before constructing the recorder. If not supported, throw — no fallback
to vp8/h264/default.
Example:
```typescript
const MIME = 'video/webm;codecs=vp9';
if (!MediaRecorder.isTypeSupported(MIME)) {
  const err = `[Offscreen] vp9 unsupported. UA=${navigator.userAgent}`;
  chrome.runtime.sendMessage({ type: 'RECORDING_ERROR', error: err });
  throw new Error(err);
}
const recorder = new MediaRecorder(stream, {
  mimeType: MIME,
  videoBitsPerSecond: 400_000, // CON-video-codec
});
```
vp9 has been supported in Chromium-based browsers since well before
Chrome 116 (our minimum_chrome_version baseline). The "fail loud" is
defensive against weird embeddings; in practice it should never trip.
Anti-Patterns to Avoid
- Anti-pattern: Wrapping the offscreen recorder source code as a template string in `vite.config.ts`. This is the audit's P0 #1. The cost: no type-check, no source maps, no IDE, divergence from any reference TS file in the same repo. Solution: real `src/offscreen/recorder.ts` + `src/offscreen/index.html` + `rollupOptions.input` entry.
- Anti-pattern: `let mediaRecorder` declared in both module scope and inside `startRecording`. Audit P0 #1 / vite.config.ts:113 vs 27. Shadowing makes `stopRecording` operate on a permanently-null reference. Solution: declare it ONCE at module scope. Use a different name (e.g. `videoRecorder`) to make the shadowing impossible.
- Anti-pattern: Sending `Blob` payloads over `chrome.runtime.sendMessage`. `sendMessage` JSON-serializes its payload; Blobs become `{}`. The current IndexedDB workaround in `vite.config.ts:43-104` is a symptom of trying to ship Blobs through the wrong channel. Solution: the buffer never leaves the offscreen until export; on export, the SW pulls chunks over the port. Port `postMessage` is documented as taking a JSON-ifiable message as well, so Blob data should be encoded (for example as base64) before it crosses the port rather than sent as a raw Blob; confirm the behavior on the target Chrome version.
- Anti-pattern: `mediaRecorder.start(200)`. 200 ms is far below Chrome's keyframe cadence (`kf_max_dist=100` → ~3-5 s on a 30 fps stream). Almost no chunk starts a cluster; concat fails. Solution: `start(2000)` per CON-video-codec, plus the ffprobe gate (D-12) and the D-13 fallback if it still doesn't decode.
- Anti-pattern: `chrome.alarms` as the sole SW keepalive. Works in Chrome 110+ (the timer DOES reset on alarm fire) but is brittle: a single skipped alarm tick gives the SW 30 s of idle. Solution: long-lived port with periodic ping AND let alarms be deleted (CONTEXT.md D-18).
- Anti-pattern: Trying to read `chrome.tabs.onActivated` and "re-attach" the recording. Made sense for `chrome.tabCapture`; with `getDisplayMedia` the stream is screen/window-scoped, not tab-scoped. Delete the listener wholesale.
- Anti-pattern: Treating `getDisplayMedia()` as silent. Chrome's permanent "Sharing your screen" indicator is non-suppressible. The CONTEXT.md author has accepted this; planner should NOT add a task to "hide the indicator" because there is no API.
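Because extension messages are expected to be JSON-ifiable, binary chunk data needs a text encoding before it leaves the offscreen document. The following is a minimal sketch of one option (base64), assuming the offscreen first reads each Blob into bytes with `new Uint8Array(await blob.arrayBuffer())`. The `WireChunk` shape and the helper names are hypothetical, not part of the codebase.

```typescript
// Hypothetical wire format for one ring-buffer chunk (illustrative names).
interface WireChunk {
  timestamp: number;
  isHeader: boolean;
  b64: string;
}

// Encode raw bytes as base64 using the standard btoa global.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = '';
  for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
  return btoa(binary);
}

// Decode base64 back into raw bytes using the standard atob global.
function base64ToBytes(b64: string): Uint8Array {
  const binary = atob(b64);
  const out = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) out[i] = binary.charCodeAt(i);
  return out;
}

// Offscreen side: turn one chunk into a JSON-safe object.
function encodeChunk(bytes: Uint8Array, timestamp: number, isHeader: boolean): WireChunk {
  return { timestamp, isHeader, b64: bytesToBase64(bytes) };
}

// SW side: recover the byte arrays, ready to be wrapped in a Blob for export.
function decodeChunks(wire: WireChunk[]): Uint8Array[] {
  return wire.map((c) => base64ToBytes(c.b64));
}
```

Base64 inflates the payload by about a third: 30 s at 400 kbps is roughly 1.5 MB of video, so about 2 MB on the wire. That cost is paid once per export, not per chunk.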
Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---|---|---|---|
| WebM seekability fixup | A custom EBML parser to inject SeekHead / Cues into the saved file | If D-13 fails too: ts-ebml v2 (kept in deferred per CONTEXT.md). For Phase 1, the ffprobe gate is "playable" not "seekable": the file plays sequentially from byte 0 even without Cues. | Container fixup is a known well-explored space (ts-ebml, fix-webm-meta); hand-rolled EBML walkers reliably get cluster timestamps wrong. [VERIFIED via web search: legokichi/ts-ebml is the de-facto library.] |
| MV3 SW keepalive | A custom `setInterval`-based ping that posts to self from inside the SW | `chrome.runtime.connect` long-lived port from the offscreen (Pattern 5) | `self.setTimeout` and `setInterval` inside an MV3 SW are unreliable: the SW unloads and the timers die. The port-from-offscreen pattern survives SW restarts because Chrome auto-respawns the SW when the offscreen's port reconnects. [CITED: gist sunnyguan/f94058f66fab89e59e75b1ac1bf1a06e] |
| Offscreen → SW handshake | A Promise with a `setTimeout`-based retry loop hoping the offscreen is ready | The explicit `OFFSCREEN_READY` message (Pattern 4). The Message type is already declared in `src/shared/types.ts:18`. | Audit P1 #12 lists "Receiving end does not exist" as an intermittent surfacing of the race; explicit handshake eliminates it. |
| Build-time copy of offscreen HTML / JS into dist | The 174-line `copy-offscreen` Vite plugin (`vite.config.ts:11-184`) that calls `this.emitFile` for both the HTML and a stringified JS module | crxjs's manifest-driven entry mechanism + a `rollupOptions.input` for the offscreen HTML | crxjs handles this exact case; the hand-rolled plugin is a maintenance trap. The canonical pattern is documented in crxjs discussion #1060 (`src/offscreen/index.html` referenced as `chrome.runtime.getURL('src/offscreen/index.html')` from SW). |
| Cluster-boundary aligned trimming | Walking the EBML to find cluster ends so we can trim mid-stream | The 30 s arrival-timestamp trim (Pattern 2). Verify via ffprobe gate (D-12). | Cluster-aware trimming would solve the playability problem perfectly but adds an EBML parser dependency we don't need if the simpler trim survives the ffprobe gate. Held in reserve. |
Key insight: every "hand-rolled" custom path in the current codebase maps to an audit P0 or P1 defect. The fix is almost always "delete it and use the standard API directly." Phase 1 is a subtraction phase.
Runtime State Inventory
This is a refactor phase (collapse two implementations into one, delete a vite plugin string, delete an IndexedDB code path) so the inventory matters.
| Category | Items Found | Action Required |
|---|---|---|
| Stored data | IndexedDB `VideoRecorderDB/chunks` store is created by `vite.config.ts:43-60` at recorder start and cleared at every restart. No persisted state survives between runs by design; the store is created fresh on each load. | No data migration needed. After the inline-plugin deletion, the database name `VideoRecorderDB` becomes orphaned in any browser profile that ran the old extension at least once. Action: add a one-shot `indexedDB.deleteDatabase('VideoRecorderDB')` in the SW's `chrome.runtime.onInstalled` listener to clean up stragglers. Cheap idempotent cleanup. |
| Live service config | None: Mokosh has no external services (no n8n, no Datadog, no Tailscale). The extension is local-only by CON-no-server-upload. | None. |
| OS-registered state | None: the extension is loaded as unpacked in Chrome's `chrome://extensions`. No OS-level registration (no native-messaging host, no system service). | None. |
| Secrets/env vars | None: no secret keys, no env vars. `manifest.json` declares only permissions; no environment configuration. | None. |
| Build artifacts | (1) `dist/offscreen/index.html` and `dist/assets/offscreen.js` are emitted by the inline plugin today. After deleting the plugin, the next `vite build` rewrites `dist/` entirely under crxjs's control, so old artifacts are replaced rather than orphaned. (2) `node_modules/` is currently absent in the repo (`ls` confirms). `npm install` is a prerequisite to any verification. | Action: `rm -rf dist/` before the first post-refactor `vite build`, just to be sure. Action: `npm install` before testing. |
Nothing in CI / no CD pipeline — the project has no CI per audit P2 #22.
Common Pitfalls
Pitfall 1: Concatenated WebM chunks don't decode past the first GoP
What goes wrong: You retain the first chunk (which has the EBML
header), drop chunks until they age out at 30 s, and concatenate the
remaining chunks into last_30sec.webm. The file plays for ~2 s and
then the decoder gives up.
Why it happens: [CITED: bugzilla.mozilla.org/show_bug.cgi?id=1666487
comment from Andreas Pehrson] "There has been no intention to encode
keyframes at the timeslice interval." Chrome's VP9 encoder defaults to
kf_max_dist=100 (about 3-5 s on a 30 fps stream); chunks emitted at
timeslice=2000 ms fall mid-cluster about half the time. A Blob
concat of [header_chunk, mid_cluster_chunk, mid_cluster_chunk, ...]
produces a byte stream where the decoder hits a SimpleBlock referencing
a frame whose keyframe is in a chunk that's no longer there.
How to avoid: (1) Verify with ffprobe -v error at every build of
the export path. (2) If ffprobe complains, fall back to D-13
(restart-segments) — each 10 s segment is its own self-contained WebM,
concat is trivially safe (each segment has its own header), and
acceptance criterion §10 #7 ("plays back in a browser") doesn't require
a single continuous track. (3) Last-resort fallback: ts-ebml
header injection (deferred).
Warning signs: ffprobe stderr contains "Length indicated by EBML number's first byte exceeds max length" or "Could not find codec parameters." VLC plays the first few frames then stops. Chrome's video tag shows the first frame then a black square.
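To see how often chunks actually begin on a cluster boundary before trusting the concat, the first bytes of each chunk can be inspected. This is a minimal diagnostic sketch, assuming that a chunk aligned to a cluster starts with the Matroska Cluster element ID and that the first chunk starts with the EBML header ID. The function names are hypothetical, and the check is a heuristic, not a substitute for the ffprobe gate.

```typescript
// Element IDs from the EBML / Matroska specifications.
const EBML_HEADER_ID = [0x1a, 0x45, 0xdf, 0xa3];
const CLUSTER_ID = [0x1f, 0x43, 0xb6, 0x75];

type ChunkStart = 'ebml-header' | 'cluster' | 'other';

// True when `bytes` begins with the given element ID.
function startsWithId(bytes: Uint8Array, id: number[]): boolean {
  if (bytes.length < id.length) return false;
  for (let i = 0; i < id.length; i++) {
    if (bytes[i] !== id[i]) return false;
  }
  return true;
}

// Classify one chunk by its leading bytes. 'other' covers chunks that
// begin somewhere inside a cluster (the case that breaks naive concat).
function classifyChunkStart(bytes: Uint8Array): ChunkStart {
  if (startsWithId(bytes, EBML_HEADER_ID)) return 'ebml-header';
  if (startsWithId(bytes, CLUSTER_ID)) return 'cluster';
  return 'other';
}

// Count how many chunks fall into each class, for a quick console report.
function summarizeChunkStarts(chunks: Uint8Array[]): Record<ChunkStart, number> {
  const counts: Record<ChunkStart, number> = { 'ebml-header': 0, cluster: 0, other: 0 };
  for (const c of chunks) counts[classifyChunkStart(c)] += 1;
  return counts;
}
```

Logging the summary during a manual smoke session gives a rough answer to how often a 2 s timeslice lands on a boundary for a real screen capture, which is the open question that decides between the age-trim and restart-segments patterns.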
Pitfall 2: getDisplayMedia rejects with NotAllowedError when there's no transient activation
What goes wrong: SW sends START_RECORDING to the offscreen "too
late" (e.g. several seconds after the popup click, with awaits in
between). Offscreen calls getDisplayMedia() and gets a
NotAllowedError.
Why it happens: [CITED: chromestatus #5090735022407680 + intent-to-remove
thread] getDisplayMedia() requires transient user activation, which
expires ~5 s after the original gesture. If anything between the click
and the offscreen's getDisplayMedia call takes too long (slow
offscreen bootstrap, missing OFFSCREEN_READY handshake, network-bound
await), the activation window closes.
How to avoid: (1) Implement Pattern 4 (OFFSCREEN_READY handshake) so
the SW only sends START_RECORDING after the offscreen's listener is
demonstrably ready. (2) Don't put any awaits between the popup click
handler and the chrome.runtime.sendMessage({ type: 'START_RECORDING' })
call. (3) Pre-create the offscreen when the SW starts (for example from
chrome.runtime.onInstalled and chrome.runtime.onStartup, since
onInstalled alone does not fire on an ordinary browser restart) so the
create-document round-trip isn't on the critical path.
Warning signs: First-run works on the developer's machine because the offscreen bootstraps fast; CI / production fails because real-world extension startup is slower.
Pitfall 3: SW unloads mid-export and the popup gets "Receiving end does not exist"
What goes wrong: Operator clicks the popup save button after a long
idle period. SW had unloaded; popup's chrome.runtime.sendMessage wakes
it, but the SW's videoBuffer array (in the current code) was reset by
the unload, so it returns an empty buffer.
Why it happens: The current code stores the buffer in the SW's
top-level let videoBuffer = []. SW unload = lose array. CONTEXT.md
D-16 fixes this by moving buffer ownership to the offscreen, which
survives SW unloads because it holds the DISPLAY_MEDIA capture.
How to avoid: (1) Buffer ownership in offscreen, not SW (D-16). (2) Port keepalive from offscreen → SW (D-17/Pattern 5) — if the SW ever unloads, the offscreen's next port message wakes it. (3) On export, SW asks offscreen for the buffer over the port; this is a one-shot, SW-stateless lookup.
Warning signs: "Receiving end does not exist" in popup console after
~30 s of inactivity. Or: saved archive contains a tiny last_30sec.webm
that only holds the very first chunk.
Pitfall 4: Long-lived port is closed by Chrome at ~5 minutes regardless of traffic
What goes wrong: You set up the port-based keepalive and confirm it works for a few minutes. Then at minute 5, the port silently disconnects and the SW unloads on the next idle window.
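Why it happens: [CITED: gist sunnyguan & multiple Chromium-extensions
threads] Chrome is reported to close long-lived ports after about
5 minutes (described in those threads as an artifact of the SW
ExtendableEvent time budget). No canonical Chrome doc states the exact
limit; see Assumption A2.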
How to avoid: In the offscreen, listen to port.onDisconnect and
immediately call chrome.runtime.connect() again. Reconnect every
~290 s pre-emptively as a belt-and-braces guard.
Warning signs: Buffer goes empty around minute 5 of a long
recording session. Port is reported as disconnected in
chrome://extensions service-worker inspect.
Pitfall 5: chrome.runtime.getURL('offscreen/index.html') returns a 404
What goes wrong: SW calls chrome.offscreen.createDocument({url: 'offscreen/index.html', ...})
and gets an ERR_FILE_NOT_FOUND.
Why it happens: crxjs places the bundled offscreen HTML under the
src/-relative path you declared in rollupOptions.input. If you set
input: { offscreen: 'src/offscreen/index.html' }, the runtime URL is
chrome.runtime.getURL('src/offscreen/index.html'), NOT
offscreen/index.html. [CITED: crxjs discussion #919 + #1060]
How to avoid: Match the input key (or the relative path crxjs emits)
to what the SW passes to createDocument. The path crxjs emits is the
same path you give as the rollup input value. Test by inspecting
dist/ after npm run build — the HTML should be at exactly the path
the SW expects.
Warning signs: SW console shows "Failed to load resource: net::ERR_FILE_NOT_FOUND", "Could not establish connection. Receiving end does not exist."
Pitfall 6: MediaStreamTrack.onended never fires
What goes wrong: Operator clicks Chrome's "Stop sharing" banner; you
expect track.onended to fire so you can update state. Nothing happens.
Why it happens: (1) You attached the listener to the wrong track (the
stream's audio track instead of the video track). (2) You used .onended = fn AFTER the event had already fired (race with the picker dismiss).
(3) You destructured the track and the listener attached to the GC'd
local.
How to avoid: Attach with addEventListener('ended', ...) (not
.onended =); attach to ALL tracks (stream.getTracks().forEach(t => t.addEventListener('ended', onEnded)))
so any track ending triggers cleanup; attach immediately after the
getDisplayMedia() await resolves.
Warning signs: Operator stops sharing, the UI keeps saying "recording" in console logs, ffprobe-checking the next export shows the last 30 s of content from BEFORE the user stopped.
Code Examples
Verified patterns from official sources and a production extension.
Example A — Minimal offscreen HTML (NEW: src/offscreen/index.html)
```html
<!-- Source: pattern from crxjs discussion #919 + Proscreen-S3 -->
<!doctype html>
<html>
  <head><meta charset="UTF-8"><title>Mokosh Recorder</title></head>
  <body>
    <script type="module" src="./recorder.ts"></script>
  </body>
</html>
```
Example B — Minimal vite.config.ts (REPLACES the 184-line current one)
```typescript
// Source: crxjs documentation + discussion #919
import { defineConfig } from 'vite';
import { crx } from '@crxjs/vite-plugin';
import manifest from './manifest.json';

export default defineConfig({
  plugins: [
    crx({ manifest, contentScripts: { injectCss: false } }),
  ],
  build: {
    rollupOptions: {
      input: {
        offscreen: 'src/offscreen/index.html',
      },
    },
  },
});
```
Example C — SW: ensure-offscreen pattern (snippet for src/background/index.ts)
```typescript
// Source: github.com/GoogleChrome/chrome-extensions-samples/tree/main/functional-samples/sample.tabcapture-recorder/service-worker.js
// [VERIFIED: canonical Google sample, license Apache-2.0]
async function ensureOffscreenDocument(): Promise<void> {
  const existing = await chrome.runtime.getContexts({
    contextTypes: [chrome.runtime.ContextType.OFFSCREEN_DOCUMENT],
  });
  if (existing.length > 0) return;
  await chrome.offscreen.createDocument({
    url: 'src/offscreen/index.html',
    reasons: [chrome.offscreen.Reason.DISPLAY_MEDIA],
    justification: 'Continuous screen recording for operator session diagnostics',
  });
}
```
Example D — ffprobe verification (used in the acceptance gate D-12)
```shell
# Source: ffmpeg.org/ffprobe.html, exit code semantics:
#   0 = recognized media; >0 = could not open / not multimedia / decode error
# Force the format with -f matroska: WebM is a Matroska subset, and this
# helps ffprobe choose the right demuxer when the file is "live" (no SeekHead).
ffprobe -v error -f matroska -i last_30sec.webm
echo "ffprobe exit: $?"
# Optional: dump the packet timeline for diagnosis if exit != 0
ffprobe -v error -show_packets -i last_30sec.webm 2>&1 | head -50
# Optional stricter check: ffprobe mostly inspects container and stream
# headers, so a full decode pass can surface mid-file errors it misses.
ffmpeg -v error -i last_30sec.webm -f null -
```
Example E — Codec capability strict-mode (CONTEXT.md D-20)
```typescript
// Source: MDN MediaRecorder.isTypeSupported + CONTEXT.md D-20
const VIDEO_MIME = 'video/webm;codecs=vp9';
const VIDEO_BITRATE = 400_000; // CON-video-codec
const TIMESLICE_MS = 2000;     // CON-video-codec / SPEC §4.1

if (!MediaRecorder.isTypeSupported(VIDEO_MIME)) {
  const ua = navigator.userAgent;
  chrome.runtime.sendMessage({
    type: 'RECORDING_ERROR',
    error: `vp9 unsupported. UA=${ua}`,
  });
  throw new Error(`MediaRecorder mime not supported: ${VIDEO_MIME}; UA=${ua}`);
}

const videoRecorder = new MediaRecorder(stream, {
  mimeType: VIDEO_MIME,
  videoBitsPerSecond: VIDEO_BITRATE,
});
videoRecorder.start(TIMESLICE_MS);
```
Example F — MediaStreamTrack.onended for "Stop sharing"
```typescript
// Source: MDN MediaStreamTrack#ended_event
let sharingEnded = false; // several tracks may end; run the cleanup once
stream.getTracks().forEach((track) => {
  track.addEventListener('ended', () => {
    if (sharingEnded) return;
    sharingEnded = true;
    // Clear the buffer (the captured source is gone)
    ringBuffer.length = 0;
    // Disconnect the port so SW can clean up
    port?.disconnect();
    // Notify SW for state transition; popup state change is Phase 3 territory
    chrome.runtime.sendMessage({ type: 'RECORDING_ERROR', error: 'user-stopped-sharing' });
    // Stop the recorder explicitly
    if (videoRecorder.state !== 'inactive') videoRecorder.stop();
  }, { once: true });
});
```
State of the Art
| Old Approach | Current Approach | When Changed | Impact |
|---|---|---|---|
| Background page (persistent) in MV2 | MV3 service worker | Chrome 88 → MV3 default; MV2 sunset 2024 | All capture APIs must be reachable from SW or offscreen, NOT a persistent page. Drives the SW + offscreen split. |
| `chrome.desktopCapture.chooseDesktopMedia` returning a streamId redeemable in any context | streamId from `chrome.desktopCapture` not usable in offscreen MV3 | Chrome 109+ offscreen API rollout | Forces the choice between (a) tabCapture + USER_MEDIA pattern (canonical Google sample) or (b) getDisplayMedia + DISPLAY_MEDIA pattern (CONTEXT.md D-01..D-05). [CITED: groups.google.com chromium-extensions/3RanHldyp9c] |
| `chrome.alarms` as the universal SW keepalive | Long-lived port `postMessage` traffic | Chrome 110+ "all events reset idle timer" + Chrome 114 "Sending a message with long-lived messaging keeps the service worker alive" + Chrome 116 WebSockets | Alarms still work in Chrome 110+ but are no longer the recommended primary keepalive for offscreen-paired extensions. [CITED: developer.chrome.com/blog/longer-esw-lifetimes] |
| `rrweb.record({maskInputSelector: ...})` | `rrweb.record({maskInputFn: ...})` | rrweb 2.0.0-alpha | Not Phase 1 territory (Phase 2 owns it), but flagged because the audit lists it as a P0. The current code uses `maskTextSelector` which is yet a third thing and is wrong (audit P0 #6). |
| Tab capture as active-tab-bound, requiring re-attach on `chrome.tabs.onActivated` | Display capture as screen/window-bound, NO re-attach (CONTEXT.md D-14/D-15) | This phase (DEC-003 AMENDED) | Deletes `chrome.tabs.onActivated` and `chrome.tabs.onUpdated` listener requirements from REQ-video-ring-buffer. |
Deprecated/outdated:
- `chrome.tabCapture.capture()` (the legacy callback form): replaced by `chrome.tabCapture.getMediaStreamId` + offscreen `getUserMedia` redemption. We're abandoning this whole path per CONTEXT.md D-01.
- `mandatory: { chromeMediaSource: 'tab' }` constraint syntax: a Chrome-specific extension to `getUserMedia`. Phase 1 doesn't use it (we use the standard `getDisplayMedia`).
Assumptions Log
| # | Claim | Section | Risk if Wrong |
|---|---|---|---|
| A1 | Restart-segments fallback structural sketch (Pattern 3) | Architecture Patterns / Pattern 3 | LOW: pattern is an inferred application of standard MediaRecorder semantics; if it fails, we have the third-tier ts-ebml deferred fallback. The risk is implementation-time, not phase-blocking. |
| A2 | Chrome enforces ~5 minute lifetime on long-lived ports (Pattern 5 / Pitfall 4) | Pitfall 4 | MEDIUM: multiple community sources corroborate, but no canonical Chrome doc states the exact limit. If the limit is shorter, our reconnect should still recover. If longer, our 290 s reconnect is just defensive overhead. |
| A3 | `MediaRecorder.start(2000)` produces chunks that align with cluster boundaries about half the time (consequence of Chrome's `kf_max_dist=100` and 30 fps default) | Pitfall 1 / Pattern 2 | HIGH: this is the load-bearing claim that makes Pattern 2 work at all. The ffprobe gate (D-12) is exactly the mitigation; if ffprobe rejects, we escalate to Pattern 3 by design. So the assumption is already mitigated by the plan's fallback structure. |
| A4 | Chrome propagates transient user activation through `chrome.runtime.sendMessage` for the SW → offscreen → `getDisplayMedia` chain | Pattern 1 + Pitfall 2 | LOW: verified against a real production extension (Proscreen-S3) doing exactly this. Mitigation: OFFSCREEN_READY handshake (Pattern 4) tightens the timing window so we never exceed the ~5 s activation budget. |
| A5 | The 30-second window's "30" is an upper bound, not an exact target (CON-video-window allows ±10 s slack for the restart-segments fallback) | Pattern 3 | LOW: REQUIREMENTS.md says "the most recent 30 seconds" and "no more than 30 seconds", which our restart-segments stays inside (3×10 s = 30 s exactly at one phase of rotation, dropping to 20 s right after rotation). User confirmation desirable but the contract permits it. |
| A6 | `getDisplayMedia()` does NOT need `desktopCapture` permission in the manifest (it's a web standard API; `desktopCapture` only gates `chrome.desktopCapture.chooseDesktopMedia`) | Architectural Responsibility Map (Manifest row) + Standard Stack | LOW: multiple sources confirm. CONTEXT.md D-05 chooses to declare `desktopCapture` anyway, which is harmless. If we DROPPED `desktopCapture` from the manifest, the only ill effect would be losing the option to call `chrome.desktopCapture.chooseDesktopMedia` (which we don't use). |
| A7 | The `chrome.runtime.getContexts` API is available in Chrome ≥ 116 and is the recommended way to test for an existing offscreen document (replaces `chrome.offscreen.hasDocument`) | Pattern 1 / Example C | MEDIUM: `chrome.offscreen.hasDocument` is the older, simpler check and still works. The canonical Google sample uses `getContexts`. Either works; planner can pick. |
If this table contains items: The planner should treat them as
candidates for user verification during /gsd-plan-phase review.
Open Questions
1. Will `MediaRecorder.start(2000)` produce ffprobe-clean WebM on a typical screen-cap?
   - What we know: Cluster boundaries align with keyframes; Chrome keyframes appear every ~3-5 s by default (vp9 `kf_max_dist=100` on a 30 fps stream); timeslice does NOT force keyframes.
   - What's unclear: How often in practice does a 2 s timeslice happen to land at a cluster boundary for a desktop screen-cap (which has lots of static frames and may have different keyframe cadence than a webcam)?
   - Recommendation: Build Pattern 2 first; run the D-12 ffprobe gate; keep Pattern 3 (restart-segments) pre-staged in PLAN.md per CONTEXT.md D-13 so we don't re-plan if Pattern 2 fails. Plan-checker can ratchet this in the success criteria.
2. Does the 5-minute port lifetime kill the recording session?
   - What we know: Multiple corroborating community sources cite a ~5 minute hard cap on long-lived ports.
   - What's unclear: Whether the cap applies to port lifetime (the port object dies and must be reconnected) OR to SW lifetime extension (after 5 minutes of port keepalive, the SW is killed anyway and the port goes with it).
   - Recommendation: Be pessimistic: assume the worst and reconnect every ~290 s. Cheap defensive code. If we learn the cap is different, the reconnect is still harmless.
3. What's the exact crxjs path-emit behavior for the offscreen entry?
   - What we know: The discussion #919 working answer uses `input: { offscreen: 'src/offscreen/offscreen.html' }` and SW fetches `chrome.runtime.getURL('src/offscreen/offscreen.html')`.
   - What's unclear: Some crxjs versions strip the leading `src/`; the 2.0.0-beta vs 2.4.0 difference might matter.
   - Recommendation: After the first `npm run build`, inspect `dist/` to confirm the actual emitted path, then encode that path as a constant in SW. This is a verifiable runtime check, not a design decision.
Environment Availability
| Dependency | Required By | Available | Version | Fallback |
|---|---|---|---|---|
| Node.js | Vite, TypeScript, npm | ✓ | v24.14.0 | — |
| npm | Dep install | ✓ | 11.9.0 | — |
| ffprobe (FFmpeg) | D-12 acceptance gate; ffprobe-based verification of every export sample | ✓ | 8.1.1 | None needed (ffprobe is the gate) |
| Chrome / Chromium | Manual smoke test (unpacked load → Сохранить отчёт ("Save report") → inspect dist) | ✗ | — | Plan must call out "manual test requires Chrome ≥ 116; install via `apt install google-chrome-stable` or note the gap to the operator." |
| Playwright / chromium-test-runner | Optional headed-Chrome integration tests (see Validation Architecture) | ✗ | — | Phase 1 acceptance does NOT require Playwright. Manual smoke is acceptable per ROADMAP Phase 4. If we want unit-test coverage for the trim logic, Vitest in node mode is enough. |
| `node_modules/` | `vite build`, `tsc` | ✗ | — | Run `npm install` at start of phase; no fallback. |
Missing dependencies with no fallback (blocking execution):
- `node_modules/`: must run `npm install` once before any TS/Vite work. Add as Wave 0 task.

Missing dependencies with fallback (acceptable):
- Chrome browser: manual smoke is Phase 4's job; for Phase 1, type-check + ffprobe-on-test-fixture is the deepest automated gate. If the developer doesn't have Chrome installed, the plan still completes; the Phase 4 ROADMAP item is where Chrome becomes mandatory.
- Playwright: not needed; see Validation Architecture below for why.
Validation Architecture
Nyquist validation is enabled (workflow.tdd_mode: true in
.planning/config.json). The validation strategy is layered:
Test Framework
| Property | Value |
|---|---|
| Framework | Vitest (Node mode for pure logic; Browser mode if needed for MediaRecorder mocks) — recommended, NOT currently installed. Vite is already a dev dep so Vitest is a zero-friction add. |
| Config file | NONE — Wave 0 creates vitest.config.ts. |
| Quick run command | npx vitest run --reporter=dot (after install) |
| Full suite command | npx vitest run + npm run build (typecheck via tsc --noEmit) + ffprobe gate (D-12) |
Why not Jest: vite is already the build tool; Vitest is the
zero-config-mismatch choice. No transformer dance for TS.
Why not Playwright: MediaRecorder + getDisplayMedia ARE driveable
in Chromium via Playwright with permissions auto-granted, but the
acceptance gate (ffprobe on a real exported file) requires actually
running the extension. Manual smoke + ffprobe is sufficient for Phase 1.
Playwright-driven smoke tests are Phase 4/5 territory.
What's testable in Node-only Vitest:
- Ring buffer logic (`addChunk`, `trimAged`): pure function, takes `{data: {size: number}, timestamp: number, isHeader: boolean}[]` and returns the trimmed array. Mock `Blob` as `{size: N, type: 'video/webm'}`.
- Message handlers (mock `chrome.runtime` with `vitest-chrome` or a lightweight stub).
- Port lifecycle / reconnect logic.
- Codec strict-mode error path (mock `MediaRecorder.isTypeSupported` → false).
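The ring-buffer item above can be made concrete with a pair of pure functions. This is a minimal sketch under the chunk shape listed there. `addChunk` and `trimAged` are the names this document uses, but the bodies below are an illustration, not the project's code, and the 30 s constant follows the window stated in the requirements.

```typescript
// Chunk shape as described in the test plan; `data` is a Blob in the
// browser and a `{ size }` stand-in in Node tests.
interface Chunk {
  data: { size: number };
  timestamp: number; // arrival time in ms
  isHeader: boolean; // true only for the first chunk (carries the EBML header)
}

const WINDOW_MS = 30_000;

// Append a chunk; the first chunk ever added is pinned as the header.
function addChunk(buffer: Chunk[], data: { size: number }, now: number): Chunk[] {
  return buffer.concat([{ data, timestamp: now, isHeader: buffer.length === 0 }]);
}

// Drop chunks older than the window, but never drop the header chunk.
function trimAged(buffer: Chunk[], now: number, windowMs: number = WINDOW_MS): Chunk[] {
  return buffer.filter((c) => c.isHeader || now - c.timestamp <= windowMs);
}
```

Both functions return new arrays and take the clock as a parameter, so a Vitest case can feed synthetic timestamps without fake timers. Whether the trimmed sequence still decodes is a separate question that only the ffprobe gate answers.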
What's NOT testable in Vitest, requires manual smoke / Phase 4:
- The actual `getDisplayMedia` flow (browser picker).
- Real WebM playability (covered by ffprobe gate on a test-fixture file).
- SW idle-unload survival (covered by manual DevTools "Force stop" test in Phase 4 smoke checklist).
Phase Requirements → Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|---|---|---|---|---|
| REQ-video-ring-buffer | Ring buffer adds chunk; first chunk gets `isHeader: true` | unit | `npx vitest run tests/offscreen/ring-buffer.test.ts -t "first chunk is header"` | ❌ Wave 0 |
| REQ-video-ring-buffer | Ring buffer evicts chunks older than 30 s; keeps header | unit | `npx vitest run tests/offscreen/ring-buffer.test.ts -t "trim 30s"` | ❌ Wave 0 |
| REQ-video-ring-buffer | Codec strict-mode throws when vp9 unsupported (D-20) | unit | `npx vitest run tests/offscreen/codec-check.test.ts` | ❌ Wave 0 |
| REQ-video-ring-buffer | OFFSCREEN_READY message sent on listener registration | unit | `npx vitest run tests/offscreen/handshake.test.ts` | ❌ Wave 0 |
| REQ-video-ring-buffer | Port reconnect on disconnect within 1 s | unit | `npx vitest run tests/offscreen/port.test.ts -t "reconnects"` | ❌ Wave 0 |
| REQ-video-ring-buffer | SW deletes alarms keepalive (D-18) | type-check / grep | `! grep -RIn "chrome.alarms" src/background/` | NO CODE NEEDED (CI grep) |
| REQ-video-ring-buffer | SW deletes IndexedDB code path (D-19) | grep | `! grep -RIn -e "VideoRecorderDB" -e "openIndexedDB" src/` | NO CODE NEEDED (CI grep) |
| REQ-video-ring-buffer | `vite.config.ts:11-184` inline plugin deleted (D-08) | grep | `! grep -RIn -e "copy-offscreen" -e "chromeMediaSource" vite.config.ts` | NO CODE NEEDED |
| REQ-video-ring-buffer (acceptance gate D-12) | `last_30sec.webm` plays ffprobe-clean | integration (manual smoke + ffprobe) | `ffprobe -v error -f matroska -i sample/last_30sec.webm; echo $?` | ❌ Sample fixture produced manually for this gate, OR captured by Playwright in Phase 4. For Phase 1, run on the file the manual smoke produces. |
| REQ-video-ring-buffer | Type-check passes with zero `as any` and zero `@ts-ignore` regressions | static | `npx tsc --noEmit && ! grep -RIn -e "as any" -e "@ts-ignore" src/` | EXISTS (`tsc --noEmit` in `npm run build`) |
| REQ-video-ring-buffer | Manifest permission swap (D-A6 / D-05) | grep | `! grep "tabCapture" manifest.json && grep "desktopCapture" manifest.json` | NO CODE NEEDED |
| REQ-video-ring-buffer | Build produces a loadable extension | manual | `npm run build && ls dist/manifest.json dist/src/offscreen/index.html dist/assets/*.js` | NO TEST FILE; CI shell check |
Sampling Rate
- Per task commit: `npx vitest run --reporter=dot && npx tsc --noEmit` (≤ 10 s).
- Per wave merge: Full Vitest + `npm run build` + grep guards (≤ 30 s).
- Phase gate (D-12): Manually load `dist/` into Chrome, capture a test session, click save, extract `video/last_30sec.webm` from `~/Downloads/session_report_*.zip` (for example with `unzip -p`), run `ffprobe -v error -f matroska -i` on the extracted file, and confirm exit 0 with zero stderr lines.
Wave 0 Gaps
- Install Vitest: `npm install -D vitest@^3 @vitest/ui` (verify current major via `npm view vitest version` at the time of install).
- `vitest.config.ts`: pull in path aliases from `tsconfig.json`.
- `tests/offscreen/` directory with at minimum:
  - `ring-buffer.test.ts`: covers REQ-video-ring-buffer trim & header pinning.
  - `codec-check.test.ts`: covers D-20 strict-mode error path.
  - `handshake.test.ts`: covers Pattern 4 OFFSCREEN_READY.
  - `port.test.ts`: covers Pattern 5 reconnect.
- `tests/fixtures/`: keep a known-good WebM for ffprobe sanity (e.g. produced once on a developer machine and committed). Used by CI to verify the ffprobe gate runs at all.
- `npm test` script in `package.json`: `"test": "vitest run"`.
- CI? Out of scope per audit P2 #22 (Phase 5).
Security Domain
Default per `.planning/config.json`: `security_enforcement` is absent → treated as enabled (per researcher contract).
Applicable ASVS Categories
| ASVS Category | Applies | Standard Control |
|---|---|---|
| V2 Authentication | No | No authentication surface in Phase 1 (local-only, no server). |
| V3 Session Management | No | No sessions. |
| V4 Access Control | Yes (limited) | Manifest permissions are the access-control boundary. Minimize: desktopCapture is unnecessary if we use only getDisplayMedia (web API), but harmless. tabCapture is being REMOVED. host_permissions: ["<all_urls>"] remains for content-script injection (Phase 2 territory). |
| V5 Input Validation | Yes (limited) | The only input Phase 1 handles is inter-context messages; streamId validation is not applicable because the new path doesn't use streamIds. Each `chrome.runtime.onMessage` handler should validate `msg.type` against the typed `MessageType` enum (already exists in `src/shared/types.ts`). |
| V6 Cryptography | No | No crypto. |
| V14 Configuration | Yes | manifest.json enumerates the permission set verbatim. The Doc-Cascade tasks (D-A1..D-A6) keep .planning/intel/constraints.md in lockstep with manifest.json. |
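
The V5 row's `msg.type` check could take the shape of a type guard. This is a hedged sketch: the real `MessageType` lives in `src/shared/types.ts` and is not reproduced in this research, so the constant below is a stand-in. Only `OFFSCREEN_READY` is named in this document; the other two members and the `isTypedMessage` helper are hypothetical.

```typescript
// Stand-in for the MessageType enum in src/shared/types.ts.
// OFFSCREEN_READY comes from this research; the other members are hypothetical.
const MESSAGE_TYPES = ["OFFSCREEN_READY", "START_CAPTURE", "SAVE_BUFFER"] as const;
type MessageType = (typeof MESSAGE_TYPES)[number];

interface TypedMessage {
  type: MessageType;
  payload?: unknown;
}

// Narrows an untrusted value to TypedMessage only if `type` is a known member.
function isTypedMessage(msg: unknown): msg is TypedMessage {
  if (typeof msg !== "object" || msg === null) return false;
  const type = (msg as { type?: unknown }).type;
  return (
    typeof type === "string" &&
    (MESSAGE_TYPES as readonly string[]).includes(type)
  );
}
```

A listener would call `isTypedMessage(msg)` first and return early on `false`, so handlers never branch on an unknown `type`.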
Known Threat Patterns for Chrome MV3 extension
| Pattern | STRIDE | Standard Mitigation |
|---|---|---|
| Untrusted message origin (cross-extension message injection) | Spoofing | Every chrome.runtime.onMessage listener should check sender.id === chrome.runtime.id. The current code doesn't; Phase 1 should add it where it adds new listeners (low effort). |
| `<all_urls>` host permission exposes the SW to messages from any content script on any site | Tampering | Already in design (REQ-manifest-permissions). The mitigation is that the SW only processes messages from its own content script (validated by sender.id check). |
| Stored video buffer contains sensitive operator session data | Information Disclosure | CON-buffer-storage: in-memory only, no persistence. CONTEXT.md D-19 reinforces (no IndexedDB, no chrome.storage.session). |
| Captured video may show passwords typed into other apps (since getDisplayMedia can grab the whole screen) | Information Disclosure | OUT OF SCOPE per Phase 1: this is exactly the trade-off accepted in CONTEXT.md D-04. The Chrome "Sharing" banner is the user-facing mitigation. Phase 2's password masking applies to rrweb / event-log, not to video pixels. |
| `eval` or string-injected code | Tampering | The vite.config.ts:35-213 inline-string offscreen JS is effectively static (no user input), but it IS string-injected build output. CSP for MV3 extensions disallows eval, but a long template literal is allowed. Phase 1 DELETES this, which is also a security improvement. |
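
The `sender.id === chrome.runtime.id` check from the table can be factored into a small wrapper so that every new listener gets it for free. This is a sketch under assumptions: `SenderLike`, `isOwnExtension`, and `guardSender` are hypothetical names, and `SenderLike` models only the `id` field of `chrome.runtime.MessageSender` so the code runs without the Chrome typings.

```typescript
// Minimal structural subset of chrome.runtime.MessageSender used by the check.
interface SenderLike {
  id?: string;
}

function isOwnExtension(sender: SenderLike, ownExtensionId: string): boolean {
  return sender.id === ownExtensionId;
}

// Wraps a handler so messages from any other sender are ignored
// before the payload is looked at.
function guardSender<M>(
  ownExtensionId: string,
  handler: (msg: M, sender: SenderLike) => void,
): (msg: M, sender: SenderLike) => void {
  return (msg, sender) => {
    if (!isOwnExtension(sender, ownExtensionId)) return;
    handler(msg, sender);
  };
}
```

In the service worker this would be wired as `chrome.runtime.onMessage.addListener(guardSender(chrome.runtime.id, handleMessage))`, where `handleMessage` is whatever handler the listener already uses.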
Phase 1 has no novel security surface beyond the manifest swap (D-A6) and the sender-id check best-practice.
Sources
Primary (HIGH confidence)
- developer.chrome.com, `chrome.offscreen` API reference, `Reason` enum values, including `DISPLAY_MEDIA`: https://developer.chrome.com/docs/extensions/reference/api/offscreen (confirmed via direct fetch on 2026-05-15).
- developer.chrome.com, Audio recording and screen capture guide, including the canonical "use offscreen + DISPLAY_MEDIA" sentence: https://developer.chrome.com/docs/extensions/how-to/web-platform/screen-capture (fetched verbatim via gh API on 2026-05-15).
- developer.chrome.com, Service worker lifecycle: https://developer.chrome.com/docs/extensions/develop/concepts/service-workers/lifecycle (fetched; confirms Chrome 110 "all events reset idle timer", Chrome 114 "message via long-lived messaging keeps SW alive").
- developer.chrome.com, Longer extension SW lifetimes blog: https://developer.chrome.com/blog/longer-esw-lifetimes.
- developer.chrome.com, `chrome.alarms` API reference: https://developer.chrome.com/docs/extensions/reference/api/alarms (confirms 30 s minimum period (Chrome 120+) for store-loaded; unpacked has no limit).
- GoogleChrome/chrome-extensions-samples, `functional-samples/sample.tabcapture-recorder/`: https://github.com/GoogleChrome/chrome-extensions-samples/tree/main/functional-samples/sample.tabcapture-recorder (fetched all files via gh API; confirms the offscreen + USER_MEDIA pattern, the close cousin of our DISPLAY_MEDIA pattern).
- MDN, `MediaRecorder.start()`: https://developer.mozilla.org/en-US/docs/Web/API/MediaRecorder/start (confirms timeslice is purely time-based, NOT codec-aware).
- ffmpeg.org, ffprobe documentation: https://ffmpeg.org/ffprobe.html (exit code semantics for the D-12 gate).
Secondary (MEDIUM confidence — verified with multiple sources)
- bugzilla.mozilla.org #1666487, quote from Andreas Pehrson: https://bugzilla.mozilla.org/show_bug.cgi?id=1666487 (Chrome's default keyframe cadence, `kf_max_dist=100`, cross-confirmed by Chrome's MediaRecorder README).
- crxjs/chrome-extension-tools, Discussion #919 "Set up offscreen with TypeScript": https://github.com/crxjs/chrome-extension-tools/discussions/919 and follow-up #1060 (working pattern for HTML + TS module entry).
- Mozilla Firefox bug #1666487: Pehrson's design rationale on timeslice-vs-keyframe.
- Graham King's blog, "Reading MediaRecorder's webm/opus output": https://darkcoding.net/software/reading-mediarecorders-webm-opus-output/ (third-party EBML walkthrough; confirms that MediaRecorder doesn't split on SimpleBlock).
- chrome-extensions-samples issue #1111, "Sample for chrome.offscreen": https://github.com/GoogleChrome/chrome-extensions-samples/issues/1111 (confirms there is NO official sample for DISPLAY_MEDIA + getDisplayMedia).
- ngocquy020196/Proscreen-S3, in-the-wild production extension: https://github.com/ngocquy020196/Proscreen-S3/blob/main/src/background/recording.ts , `src/offscreen/recorder.ts` (confirms the exact CONTEXT.md D-01..D-05 architecture works in practice).
- schniti269/meeting_mate, second corroborating real extension: https://github.com/schniti269/meeting_mate/blob/main/background.js.
- crxjs.dev, Vite plugin docs: https://crxjs.dev/vite-plugin/ (confirms manifest-driven entry, but multi-entry HTML needs `rollupOptions.input`).
- GitHub gist sunnyguan/f94058f66fab89e59e75b1ac1bf1a06e: MV3 keepalive patterns including port reconnect at 290 s.
- developer.chrome.com issue #2688: clarifies that the original "native messaging port keeps SW alive" claim has caveats.
Tertiary (LOW confidence — flagged for cross-validation)
- chromium-extensions group thread — getDisplayMedia in offscreen: https://groups.google.com/a/chromium.org/g/chromium-extensions/c/V09VMCLzvWM — one thread suggests user-gesture issues in offscreen; this appears contradicted by Proscreen-S3 working. Resolution: empirical testing during Wave 1 (manual smoke).
- recall.ai blog post on how to build a Chrome recording extension: https://www.recall.ai/blog/how-to-build-a-chrome-recording-extension — uses tabCapture pattern (not our path), but confirms the high-level three-component split.
- Stack Overflow #62236838 — concatenation of MediaRecorder WebM chunks: cited content via WebSearch results only (no direct fetch — site blocked); pattern matches what I confirmed via Graham King's blog and ts-ebml docs.
Metadata
Confidence breakdown:
- Standard stack & versions: HIGH, all verified via `npm view`.
- Architecture (offscreen + DISPLAY_MEDIA + port keepalive): HIGH, verified against (a) official Chrome docs, (b) Google sample (offscreen + USER_MEDIA, same architectural shape), (c) at least two in-the-wild production extensions doing the exact DISPLAY_MEDIA path.
- Ring-buffer pattern: MEDIUM-HIGH. The structural pattern is solid; the open question is cluster-boundary alignment of `start(2000)`, which is the assumption the ffprobe gate (D-12) and the D-13 fallback are designed to handle.
- Common pitfalls: HIGH, every pitfall ties to a specific audit defect or a citable Chrome doc / Chromium bug.
- Validation strategy: MEDIUM. The unit-testable surface is real and documented; the integration test gap (browser/picker) is genuine but accepted (Phase 4 territory).
- Security: HIGH for what's in scope; nothing exotic.
Research date: 2026-05-15
Valid until: 2026-06-15 (30 days, stable-ecosystem assumption). Re-validate sooner if Chrome releases a 12X version that changes SW lifecycle rules or the offscreen API stability promise. The most volatile finding is A2 (5-minute port lifetime cap) — Chrome team has been actively tuning this.
Phase: 01-stabilize-video-pipeline
Research completed: 2026-05-15