# Phase 1: Stabilize Video Pipeline — Research

**Researched:** 2026-05-15
**Domain:** Chrome MV3 extension, offscreen documents, `getDisplayMedia`,
`MediaRecorder` ring buffer, WebM container, SW lifecycle, Vite + crxjs.
**Confidence:** HIGH on Chrome API contracts; HIGH on canonical patterns
(verified against an in-the-wild production extension); MEDIUM on
`MediaRecorder` cluster-boundary alignment with `timeslice=2000ms` (the
spec and the Chromium docs are both silent and published evidence is
indirect; we have a mitigation in place via the D-13 fallback).

<user_constraints>

## User Constraints (from CONTEXT.md)

### Locked Decisions

**Capture API (AMENDS DEC-003)**

This phase REPLACES the SPEC-locked `chrome.tabCapture` choice with
`getDisplayMedia()` capture. Done eyes-open: the operator gains broader
capture coverage at the cost of the SPEC §1 "silent operation" property.
The doc cascade is enumerated in the **Doc Amendments (precede code)**
subsection below.

- **D-01:** Capture mechanism is `navigator.mediaDevices.getDisplayMedia()`
  invoked **inside the offscreen document**. No more
  `chrome.tabCapture.getMediaStreamId`, no more SW-side gesture juggling.
- **D-02:** Offscreen document is created with
  `chrome.offscreen.Reason.DISPLAY_MEDIA` (replaces `USER_MEDIA`).
- **D-03:** One-time source picker on session start; the operator picks
  "screen" or "window" once. If they later click the Chrome "Stop sharing"
  banner or the captured source disappears, the offscreen surfaces an error
  to the SW and the popup re-prompts on next interaction. (Exact error-UX
  copy is deferred to Phase 3; see Deferred Ideas.)
- **D-04:** Operator UX is **NOT** silent. Chrome's permanent "Sharing your
  screen" indicator is shown while recording. We accept this as the cost
  of the API choice.
- **D-05:** `manifest.json` permissions follow the new API: `desktopCapture`
  replaces `tabCapture`; `activeTab` becomes unnecessary for the video
  pipeline but stays for `chrome.tabs.captureVisibleTab` (screenshot path,
  a Phase 3 concern, so it is kept).

**Offscreen source-of-truth location**

- **D-06:** Recorder code lives at **`src/offscreen/recorder.ts`** as a real
  TypeScript module with strict type-check, source maps, and IDE support.
- **D-07:** `offscreen/index.html` is rewritten to load the bundled module
  via crxjs. The runtime path remains `offscreen/index.html` (referenced
  from SW via `chrome.runtime.getURL('offscreen/index.html')`).
- **D-08:** **DELETE** `offscreen/index.ts` (orphaned dead code) and the
  entire `copy-offscreen` plugin block in `vite.config.ts:11-184`. crxjs
  picks up the new TS entry through the HTML reference.

**Ring-buffer mechanism**

- **D-09:** **Single continuous MediaRecorder** for the whole session.
  `mediaRecorder.start(2000)` so chunks land on cluster boundaries per the
  spec timeslice (DEC-003, SPEC §4.1). No restart strategy at this point.
- **D-10:** Retain the **first emitted chunk** (the chunk produced by the
  first `dataavailable` event after `start()`) **indefinitely**: it carries
  the EBML header plus the initial cluster. CON-webm-header-retention.
- **D-11:** Drop later chunks once they are older than 30 s, by chunk
  arrival timestamp. Keep header + every chunk newer than `now - 30000 ms`.
- **D-12:** Acceptance gate for Phase 1:
  `ffprobe -v error -f matroska -i <last_30sec.webm>` must return exit 0
  with no decoder warnings on a fresh-export sample. Plan-checker enforces
  this as a phase success criterion.
- **D-13:** **Fallback if D-12 fails:** revise the plan mid-phase to use
  *restart-segments* (stop + restart the MediaRecorder every 10 s, keep
  the 3 most-recent self-contained segments, concat on save). Documented
  as a known fallback so the planner can pre-stage the alternative
  structure in PLAN.md.

**Tab-switch behavior**

- **D-14:** **Not applicable** under the new capture API. `getDisplayMedia()`
  captures a screen or window, not a tab, so there is nothing to re-attach
  on `chrome.tabs.onActivated`. Phase 1 explicitly **removes** any
  tab-switch handling from `src/background/index.ts`.
- **D-15:** Operator switching tabs no longer interrupts the recording;
  the buffer keeps filling regardless of active tab.

**State survival across SW unload**

- **D-16:** Video buffer **ownership moves to the offscreen document**. The
  offscreen survives SW unloads because it holds the
  `DISPLAY_MEDIA`-reason capture; chunks accumulate there.
- **D-17:** A long-lived `chrome.runtime.connect` port from offscreen → SW
  serves as the keepalive (this is the only mechanism that actually
  resets the SW idle timer; `chrome.alarms` callbacks do not, contrary
  to DEC-010).
- **D-18:** **DELETE** the `chrome.alarms` keepalive
  (`src/background/index.ts:171-178`). DEC-010 and CON-service-worker-keepalive
  are amended in the doc cascade below.
- **D-19:** On export, SW requests the buffer from offscreen over the port
  (or one-shot `chrome.runtime.sendMessage`). SW does **NOT** cache
  chunks. CON-buffer-storage is honored: the buffer is a plain JS variable
  in offscreen memory, no `chrome.storage.session`, no IndexedDB. The
  existing IndexedDB code path in `vite.config.ts:43-104` is **DELETED**
  along with the inline plugin.

**Doc Amendments (precede code)**

These document edits **MUST** ship before any code-touching task in this
phase, so downstream phases see a consistent baseline:

- **D-A1:** Amend `.planning/intel/decisions.md` DEC-003 to record the
  `getDisplayMedia` replacement, with rationale and the explicit
  silent-operation trade-off. Amend DEC-010 to record port keepalive
  replacing alarms keepalive.
- **D-A2:** Amend `.planning/intel/constraints.md` to **RETIRE**
  CON-tab-capture-binding and CON-service-worker-keepalive. Add new
  CON-display-capture-binding (one-time picker, "Sharing" indicator).
- **D-A3:** Amend `.planning/PROJECT.md` Key Decisions table (DEC-003,
  DEC-010) and Constraints section accordingly.
- **D-A4:** Amend `.planning/REQUIREMENTS.md` REQ-video-ring-buffer to
  remove "active-tab" wording and update API binding.
- **D-A5:** Amend `.planning/ROADMAP.md` Phase 1 description and Success
  Criterion #2 (drop the "tab re-attach" clause).
- **D-A6:** Amend `manifest.json`: swap `tabCapture` → `desktopCapture`
  in `permissions`. Keep `activeTab` for the screenshot path.

### Claude's Discretion

- Exact protocol choice for offscreen↔SW messaging (port for keepalive +
  sendMessage for one-shot vs port-only).
- Codec strictness: enforce `video/webm; codecs=vp9` via
  `MediaRecorder.isTypeSupported`; fail loud if unsupported (no fallback
  chain; the current code's vp9→vp8→h264→default fallback is removed).
- Internal naming for the new buffer-owning module (offscreen-recorder vs
  display-recorder etc.).
- Code-style choices around TS strictness within `src/offscreen/`
  (already on `"strict": true` per tsconfig).

### Deferred Ideas (OUT OF SCOPE)

- **Error UX for "user stopped sharing" mid-session.** The popup needs a
  state for this, which is Phase 3 territory (REQ-popup-ui state machine
  extension).
- **Audio capture.** `getDisplayMedia()` makes audio capture trivial
  (`audio: true`), but SPEC §9 explicitly excludes audio from Phase 1
  (Phase 2 work, CAP-01). Capture this as an easier-now-than-before
  follow-up.
- **Per-tab silent capture mode** as an opt-in via `config.json`. Could
  re-introduce tabCapture for installations that prioritize silent
  operation over broad coverage. Future phase if there's demand.
- **Cluster-aware EBML trim (ts-ebml).** Not needed for Phase 1 if
  continuous + age-trim verifies via ffprobe. Keep on the shelf as a
  third fallback under D-13.
- **`chrome.storage.session` cold-start recovery.** Buffer pointer
  rehydration after offscreen crash. Phase 5 (Harden + clean up)
  territory.

</user_constraints>

<phase_requirements>

## Phase Requirements

| ID | Description | Research Support |
|----|-------------|------------------|
| REQ-video-ring-buffer | 30 s active-tab video ring buffer captured via `MediaRecorder` at `video/webm; codecs=vp9` @ 400 kbps with 2 s timeslice. AMENDED: capture API is `getDisplayMedia()` (D-01), not `chrome.tabCapture`. First chunk (WebM header) retained indefinitely (CON-webm-header-retention); subsequent chunks rotate out by 30 s TTL. **Capture is always-on**: starts on first popup invocation, runs continuously regardless of which tab the operator is on (no tab re-attach needed, because display capture is screen/window-bound, not tab-bound). | (1) Canonical pattern for SW + offscreen + getDisplayMedia confirmed by Google sample + working production extension (Proscreen-S3). (2) WebM header / cluster trim semantics documented under "Pitfall 1" + "Validation Architecture". (3) Port-keepalive replaces alarm-keepalive per Chrome 110+ docs. (4) `MediaRecorder.start(2000)` semantics documented under Pitfall 1 with D-13 fallback if cluster alignment fails ffprobe gate. |

</phase_requirements>

## Project Constraints (from CLAUDE.md)

> No project-level CLAUDE.md exists at `/home/parf/projects/work/repremium/CLAUDE.md`.
> User's global `~/.claude/CLAUDE.md` applies; relevant excerpts:

- **Iterative development:** Small, reviewable changes. Break large work
  into phases. Plans should be concise (< 100 lines); detail goes into
  context/research files.
- **Extension over duplication:** Add functionality to existing code via
  options/parameters rather than parallel implementations. *(Applies to
  reusing `videoBuffer`/`cleanupVideoBuffer` patterns from the current
  SW: preserve structure, relocate to offscreen.)*
- **Defensive coding:** Validate external dependencies and environment
  early; fail fast with clear error messages. *(Codec fail-loud via
  `MediaRecorder.isTypeSupported`; track-ended detection.)*
- **Naming:** Full words, `isFoo`/`hasFoo`/`shouldFoo` for booleans,
  `SCREAMING_SNAKE` for true constants.
- **Tools first:** Use automated tools before manual edits. *(crxjs
  handles the offscreen build; do not hand-roll Vite plugins.)*
- **Verify claims before presenting.** Cite authoritative sources.
- **TypeScript:** Type arrow-function parameters explicitly.
- **Don't ignore lint/type errors without research.** *(Maps to audit
  P1 #13: no `as any`, no `@ts-ignore` in new code.)*
- **Naming convention violation already in repo:** `mediaRecorder` (camel)
  shadowing module-level `let mediaRecorder` is the exact P0 #2 defect we
  are fixing; rename the module-level variable to avoid recurrence.

> **Note on the codebase's Russian inline comments:** The user's global
> rule prefers Python/Google style guides, but this repo is a TypeScript
> extension built to a Russian-authored SPEC. Inline Russian comments are
> idiomatic and preserved per the SPEC's source-of-truth language (also
> reaffirmed in CONTEXT.md "Established patterns"). User-facing strings
> ("Сохранить отчёт об ошибке" etc.) are part of the contract.

## Summary

The audit's seven P0 defects boil down to two structural problems in this
phase: **(a) the offscreen runtime lives as a string literal inside
`vite.config.ts:11-184` and shadows the real `offscreen/index.ts`, with a
shadow `let mediaRecorder` that makes `stopRecording` a no-op**; **(b) the
ring-buffer math is right in `src/background/index.ts` but the lifecycle
plumbing is wrong**: `mediaRecorder.start(200)` produces too-short chunks
that mostly don't start on WebM cluster boundaries, capture only begins
when the popup is opened, the SW's `chrome.alarms` keepalive does run but
the SW still loses its `videoBuffer` array between idle unloads, and the
SW's `VIDEO_CHUNK` message handler expects a Blob that `chrome.runtime.sendMessage`
cannot transmit (forcing the buggy IndexedDB workaround in `vite.config.ts:43-104`).

CONTEXT.md amends DEC-003 to `getDisplayMedia()` instead of `chrome.tabCapture`.
This is an eyes-open trade-off: broader capture coverage at the cost of the
Chrome "Sharing your screen" banner. It is a canonical Chrome MV3 pattern:
[CITED: developer.chrome.com/docs/extensions/how-to/web-platform/screen-capture]
"To record in the background and across navigations, use an offscreen
document with the DISPLAY_MEDIA reason." We have at least one in-the-wild
production extension (Proscreen-S3) confirming the exact architecture
works.

**Primary recommendation:** Build `src/offscreen/recorder.ts` as a real
TS module that owns: (1) a single continuous `MediaRecorder` started with
`timeslice=2000`, (2) the in-memory ring buffer with WebM-header pinning
and 30 s arrival-timestamp trim, (3) a long-lived `chrome.runtime.connect`
port to the SW that doubles as the SW keepalive, and (4) a single
on-demand `GET_BUFFER` handler that returns the chunks for ZIP packaging.
The SW shrinks to: offscreen lifecycle management + port handling +
manifest-time recording bootstrap. The verification gate is `ffprobe -v error`
on a fresh export sample. If that fails because cluster boundaries don't
align with the 2 s timeslice, fall back to D-13's restart-segments
strategy (pre-staged in PLAN.md so we don't have to re-plan mid-phase).

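Item (3) of the recommendation, the keepalive port, can be sketched as follows. This is a minimal illustration under stated assumptions, not the planned implementation: `PortLike`, `KeepaliveDeps`, `startKeepalive`, and the port name are hypothetical, and the `chrome.runtime.connect` call and timers are injected so the logic can be exercised outside the extension. The 290 s interval is an assumption chosen to stay below a roughly five-minute port lifetime.

```typescript
// Minimal structural types so the sketch type-checks without @types/chrome.
// In the extension these map to chrome.runtime.Port and chrome.runtime.connect.
interface PortLike {
  onDisconnect: { addListener(callback: () => void): void };
  disconnect(): void;
}

interface KeepaliveDeps {
  connect: (info: { name: string }) => PortLike;
  setTimer: (callback: () => void, delayMs: number) => unknown;
  clearTimer: (handle: unknown) => void;
}

// Assumption: reconnect shortly before a roughly five-minute port lifetime.
const RECONNECT_AFTER_MS = 290_000;

function startKeepalive(deps: KeepaliveDeps): { stop(): void } {
  let port: PortLike | null = null;
  let timer: unknown = null;
  let isStopped = false;

  const open = (): void => {
    if (isStopped) return;
    const current = deps.connect({ name: 'offscreen-keepalive' });
    port = current;
    // If the SW side drops the port, reconnect synchronously.
    current.onDisconnect.addListener((): void => {
      if (port !== current) return; // stale port that we closed ourselves
      deps.clearTimer(timer);
      open();
    });
    // Pre-emptive reconnect: close the old port and open a fresh one.
    timer = deps.setTimer((): void => {
      port = null; // mark the old port as stale before closing it
      current.disconnect();
      open();
    }, RECONNECT_AFTER_MS);
  };

  open();
  return {
    stop(): void {
      isStopped = true;
      deps.clearTimer(timer);
      const current = port;
      port = null;
      current?.disconnect();
    },
  };
}
```

In the offscreen document the dependencies would be `chrome.runtime.connect`, `setTimeout`, and `clearTimeout`; the SW side only needs a `chrome.runtime.onConnect` listener that holds on to the port.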
## Architectural Responsibility Map

| Capability | Primary Tier | Secondary Tier | Rationale |
|------------|-------------|----------------|-----------|
| Display capture (`getDisplayMedia`) | Offscreen Document | - | SW has no DOM and cannot hold a `MediaStream`. Chrome 116+ requires `chrome.offscreen.Reason.DISPLAY_MEDIA`. [CITED: developer.chrome.com/docs/extensions/reference/api/offscreen] |
| MediaRecorder lifecycle | Offscreen Document | - | `MediaRecorder` instances are tied to a `MediaStream` which lives in the offscreen DOM context. |
| In-memory ring buffer | Offscreen Document | - | SW unloads after ~30 s idle (Chrome 110+ rules); offscreen survives because it owns the `DISPLAY_MEDIA` capture. |
| Codec capability check (`isTypeSupported`) | Offscreen Document | - | API is on `MediaRecorder`, which is offscreen-bound. SW reports the result for telemetry. |
| Offscreen lifecycle (create / close / hasDocument) | Service Worker | - | `chrome.offscreen.*` API is SW-bound. |
| Long-lived port keepalive | Offscreen Document → SW | - | Offscreen initiates `chrome.runtime.connect()` because it is the long-living party with a real reason to stay alive. SW receives the port. |
| Buffer export on user action | Service Worker | Offscreen Document | SW receives popup message, requests buffer from offscreen over the port, returns chunks to popup. |
| Manifest permission boundary | Manifest | - | `desktopCapture` for the API name (CONTEXT.md D-A6); `offscreen` to gate `chrome.offscreen.*`. Note: `getDisplayMedia()` itself is a web standard API and does NOT require `desktopCapture` (which gates only `chrome.desktopCapture.chooseDesktopMedia`). Including `desktopCapture` is harmless and matches CONTEXT.md D-05. [VERIFIED: chrome.desktopCapture API docs] |
| Stop-sharing recovery | Offscreen Document | Service Worker | `MediaStreamTrack.onended` fires inside offscreen; offscreen messages SW; SW updates state for popup (popup state machine is Phase 3 territory). |

## Standard Stack

### Core

| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| `@crxjs/vite-plugin` | `^2.4.0` (currently `^2.0.0-beta.25` in `package.json`) | Vite plugin that reads `manifest.json`, bundles each entry (SW, content scripts, popup, offscreen HTML), and produces a Chrome-loadable `dist/`. | Standard build for MV3 + TS + Vite per the project's existing setup (DEC-012). [VERIFIED: npm view @crxjs/vite-plugin version returned 2.4.0 on 2026-05-15] |
| `@types/chrome` | `^0.1.42` (currently `^0.0.268` in `package.json`) | Type definitions for the `chrome.*` namespace including `chrome.offscreen.Reason.DISPLAY_MEDIA`. | Audit P1 #13 calls out that the current `0.0.268` is stale; the project needs to bump to drop the `as any` on `reasons: ['USER_MEDIA']`. [VERIFIED: npm view @types/chrome version returned 0.1.42 on 2026-05-15] |
| `vite` | `^8.0.13` (currently `^5.4.2` in `package.json`) | Bundler. | Already a hard project decision (DEC-012). Phase 1 does NOT mandate a Vite bump: sticking with 5.4 is fine; the bump is a Phase 5 housekeeping task. [VERIFIED: npm view vite version returned 8.0.13 on 2026-05-15] |
| `typescript` | `^6.0.3` (currently `^5.5.4` in `package.json`) | Type-check. Strict mode is already enabled in `tsconfig.json`. | Project decision. Phase 1 keeps 5.5; same Phase 5 housekeeping observation. [VERIFIED: npm view typescript version returned 6.0.3 on 2026-05-15] |

> **No new dependencies are needed for Phase 1.** `JSZip` and `rrweb`
> stay untouched (Phase 2 / 3 territory). All new code uses the standard
> Web Platform APIs (`MediaRecorder`, `navigator.mediaDevices`,
> `chrome.offscreen`, `chrome.runtime.connect`).

### Supporting (Phase 1 specifically uses)

| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| Web Platform: `MediaRecorder` | Built-in | Encode the captured `MediaStream` into a chunked WebM stream. | Inside the offscreen, after `getDisplayMedia()` returns a stream. |
| Web Platform: `navigator.mediaDevices.getDisplayMedia` | Built-in | Acquire the operator's choice of screen/window/tab as a `MediaStream`. | Inside the offscreen, once on session start, in the message handler for `START_RECORDING`. |
| Chrome API: `chrome.offscreen.{createDocument, closeDocument, hasDocument, Reason}` | Chrome 109+ for API; Chrome 116+ recommended baseline (matches the canonical Google sample's `minimum_chrome_version`). | Create + tear down the offscreen runtime. | SW only. |
| Chrome API: `chrome.runtime.{connect, sendMessage, onConnect, onMessage}` | Built-in | Cross-context messaging. | Both SW and offscreen. |

### Alternatives Considered (Honored CONTEXT.md, recorded for completeness)

| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| `getDisplayMedia()` in offscreen | `chrome.tabCapture.getMediaStreamId` in SW + `getUserMedia({chromeMediaSource: 'tab'})` in offscreen (canonical Google sample pattern) | Tab-scoped only; silent (no Chrome banner); requires user-gesture juggling on first activation; loses capture on tab switch. **Rejected per CONTEXT.md D-01.** |
| `getDisplayMedia()` in offscreen | `chrome.desktopCapture.chooseDesktopMedia` in SW + redeem ID in offscreen | Chrome-specific; doc explicitly says streamId not usable in offscreen MV3 [CITED: groups.google.com chromium-extensions/3RanHldyp9c]. **Not viable.** |
| Single continuous recorder + age-trim | Restart-segments (10 s self-contained segments, keep 3 most-recent) | Each segment is its own valid WebM, concat-on-save is trivial, but burns ~3× more keyframes (bigger files). **Held in reserve as D-13 fallback** if `ffprobe -v error` fails on the simpler approach. |
| Restart-segments | ts-ebml header injection on save | More plumbing, dependency, and runtime cost. **Held in reserve as third fallback per CONTEXT.md deferred.** |

**Installation:** No `npm install` needed for Phase 1 (zero new deps).
Type-bump for `@types/chrome` (`^0.0.268` → `^0.1.42`) is a one-line
`package.json` edit, optional within this phase but recommended.

**Version verification:** All package versions in the table above are
verified via `npm view <pkg> version` on 2026-05-15.

## Architecture Patterns

### System Architecture Diagram

```
┌─────────────────────────────────────────────────────────────────────────┐
│                          Operator interactions                          │
└─────────────────────────────────────────────────────────────────────────┘
      │ click popup
      ▼
┌────────────┐   REQUEST_PERMISSIONS / GET_VIDEO_BUFFER    ┌──────────────┐
│ popup      │ ──────────────────────────────────────────► │ Service      │
│ (Russian   │ ◄────────────────────────────────────────── │ Worker       │
│ state-mc)  │                  responses                  │ (background) │
└────────────┘                                             └──────┬───────┘
                                                                  │
                                                   chrome.offscreen.createDocument
                                                    ({reasons:['DISPLAY_MEDIA']})
                                                                  │
                                                                  ▼
                                                       ┌──────────────────┐
                          long-lived                   │ Offscreen Doc    │
                          port (keepalive +            │ (DOM context)    │
                          buffer fetch)                │                  │
                    SW ◄──────────────────────────────►│ recorder.ts      │
                                                       │ - getDisplayMedia│
                                                       │ - MediaRecorder  │
                                                       │ - ring buffer    │
                                                       │ - track.onended  │
                                                       └─────┬────────────┘
                                                             │
                                                  navigator.mediaDevices
                                                    .getDisplayMedia()
                                                             │
                                                             ▼
                                                     [ Chrome native ]
                                                     [ source picker ]
                                                     [ + Sharing UI  ]
                                                             │
                                                             ▼
                                                       ┌──────────────────┐
                                                       │ MediaStream      │
                                                       │ (screen/window)  │
                                                       └─────┬────────────┘
                                                             │
                                                 MediaRecorder.start(2000)
                                                             │
                                                             ▼
                                                   dataavailable chunks
                                                     (every ~2000 ms)
                                                             │
                                                             ▼
                                                   in-memory ring buffer
                                                   (offscreen JS array)

Data flow on export (Phase 3 territory but the SW↔offscreen contract is
locked here):
  popup --SAVE_ARCHIVE--> SW --GET_BUFFER--> offscreen
  offscreen --VIDEO_CHUNKS--> SW --(merge)--> popup --(jszip + download)
```

| Component | File | Responsibilities |
|-----------|------|------------------|
| Operator-facing popup | `src/popup/index.{ts,html,css}` | UI state machine, click handlers, archive trigger. Phase 3 owns most edits; Phase 1 touches it only minimally to unwire the dead `REQUEST_PERMISSIONS` path. |
| Service Worker (background coordinator) | `src/background/index.ts` | Offscreen lifecycle (`createDocument` / `closeDocument` / `hasDocument`), port handling, buffer-fetch on export, message routing. **Shrinks substantially** in this phase. |
| Offscreen recorder (NEW) | `src/offscreen/recorder.ts` | `getDisplayMedia` call, `MediaRecorder` instance, ring buffer, codec capability check, port to SW (keepalive + on-demand buffer push), `MediaStreamTrack.onended` handler. |
| Offscreen page (NEW) | `src/offscreen/index.html` | Minimal HTML referencing `recorder.ts` via `<script type="module" src="./recorder.ts"></script>`. crxjs picks it up. |
| Manifest | `manifest.json` | Swap `tabCapture` → `desktopCapture`. Add nothing else; `offscreen` is already declared. |
| Vite config | `vite.config.ts` | Collapse to a clean `crx({manifest, contentScripts: {injectCss: false}})` + `rollupOptions.input` entry for offscreen HTML. Delete the entire 174-line `copy-offscreen` plugin block. |

### Recommended Project Structure

```
repremium/
├── manifest.json               # swap tabCapture→desktopCapture
├── vite.config.ts              # collapse to ~30 lines
├── src/
│   ├── background/
│   │   └── index.ts            # shrinks: lifecycle + port + export
│   ├── content/
│   │   └── index.ts            # untouched in Phase 1
│   ├── popup/
│   │   ├── index.html          # untouched in Phase 1
│   │   ├── index.ts            # minor: drop dead REQUEST_PERMISSIONS path
│   │   └── style.css           # untouched
│   ├── offscreen/              # NEW directory (replaces top-level offscreen/)
│   │   ├── index.html          # NEW: <script src="./recorder.ts" type="module">
│   │   └── recorder.ts         # NEW: the real source-of-truth
│   └── shared/
│       ├── logger.ts           # add OffscreenLogger or reuse with prefix
│       └── types.ts            # wire up OFFSCREEN_READY; rename VIDEO_CHUNK
└── offscreen/                  # DELETE (entire directory)
```

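The `types.ts` note in the tree ("wire up OFFSCREEN_READY; rename VIDEO_CHUNK") implies a shared message contract. A minimal sketch of what the offscreen-bound half could look like, assuming the message names used elsewhere in this document; the `ToOffscreenMessage` type and the `isToOffscreenMessage` guard are hypothetical names, not existing code:

```typescript
// Message names come from the diagram and Pattern 1; grouping them into one
// union is an assumption about how src/shared/types.ts might be organised.
type ToOffscreenMessage =
  | { type: 'START_RECORDING'; target: 'offscreen' }
  | { type: 'STOP_RECORDING'; target: 'offscreen' }
  | { type: 'GET_BUFFER'; target: 'offscreen' };

// Runtime guard for untyped values arriving through chrome.runtime.onMessage,
// so the offscreen listener can ignore messages meant for other contexts.
function isToOffscreenMessage(value: unknown): value is ToOffscreenMessage {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as { type?: unknown; target?: unknown };
  return (
    candidate.target === 'offscreen' &&
    (candidate.type === 'START_RECORDING' ||
      candidate.type === 'STOP_RECORDING' ||
      candidate.type === 'GET_BUFFER')
  );
}
```

A guard like this keeps `as any` out of the listener: the handler narrows the incoming value once and then switches on `type`.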
### Pattern 1: Offscreen + DISPLAY_MEDIA bootstrap

**What:** SW ensures a single offscreen document exists, then asks it to
start recording. Offscreen calls `getDisplayMedia()`, which triggers the
Chrome native picker.

**When to use:** Once per session (and again if the operator clicked
"Stop sharing" and a fresh popup interaction happens).

**Example (canonical pattern from production extension):**

```typescript
// Source: github.com/ngocquy020196/Proscreen-S3/blob/main/src/background/recording.ts
// [VERIFIED: in-the-wild MV3 extension using exactly this pattern]
// SW side
async function createOffscreenIfNeeded() {
  const existingContexts = await chrome.runtime.getContexts({
    contextTypes: [chrome.runtime.ContextType.OFFSCREEN_DOCUMENT],
  });
  if (existingContexts.length > 0) return;

  await chrome.offscreen.createDocument({
    url: 'src/offscreen/index.html', // crxjs-emitted path
    reasons: [chrome.offscreen.Reason.DISPLAY_MEDIA],
    justification: 'Continuous screen recording for operator session diagnostics',
  });
}

async function handleStartRecording() {
  await createOffscreenIfNeeded();
  // Wait briefly for offscreen's onMessage listener, OR use the OFFSCREEN_READY
  // handshake (preferred: see Pattern 4).
  chrome.runtime.sendMessage({ type: 'START_RECORDING', target: 'offscreen' });
}
```

```typescript
// Source: same repo, src/offscreen/recorder.ts
// Offscreen side
chrome.runtime.onMessage.addListener((msg) => {
  if (msg.target !== 'offscreen') return;
  if (msg.type === 'START_RECORDING') startRecording();
  if (msg.type === 'STOP_RECORDING') stopRecording();
});

async function startRecording() {
  const stream = await navigator.mediaDevices.getDisplayMedia({
    video: true,
    audio: false, // SPEC §9: Phase 2/CAP-01 territory
  });
  const recorder = new MediaRecorder(stream, { mimeType: 'video/webm;codecs=vp9' });
  recorder.ondataavailable = (e) => { if (e.data.size > 0) ringBuffer.push(e.data); };
  recorder.start(2000);
  // Track end detection: fires when operator clicks Chrome "Stop sharing"
  stream.getVideoTracks()[0].addEventListener('ended', onUserStoppedSharing);
}
```

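The snippet above constructs the recorder without first checking codec support. A minimal sketch of the fail-loud check called for under Claude's Discretion, assuming a hypothetical helper named `assertCodecSupported`; the capability probe is injected so the function can run outside a browser:

```typescript
const REQUIRED_MIME_TYPE = 'video/webm;codecs=vp9';

// `isTypeSupported` is injected for testability. In the offscreen document,
// pass (mimeType) => MediaRecorder.isTypeSupported(mimeType).
function assertCodecSupported(isTypeSupported: (mimeType: string) => boolean): string {
  if (!isTypeSupported(REQUIRED_MIME_TYPE)) {
    // Fail loud: there is deliberately no vp8/h264/default fallback chain.
    throw new Error(`MediaRecorder does not support ${REQUIRED_MIME_TYPE}`);
  }
  return REQUIRED_MIME_TYPE;
}
```

Calling this before `new MediaRecorder(...)` turns an unsupported codec into an immediate, reportable error instead of a silently different container.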
**Important:** [VERIFIED: tested pattern] Chrome carries the popup's
transient user activation across `chrome.runtime.sendMessage`. The
chain "popup click → SW message → SW creates offscreen → SW sends start
message → offscreen calls `getDisplayMedia`" works *because* it stays
inside one transient-activation window (within ~5 s of the click).
This is the same mechanism the canonical Google sample relies on via
`chrome.action.onClicked` → SW → offscreen.

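The offscreen snippet registers `onUserStoppedSharing` on the video track but does not define it. One possible shape for that handler is sketched below. Everything named here is an assumption for illustration: `watchForStopSharing`, the `CAPTURE_ENDED` message, and the injected callbacks are hypothetical, and the real error UX is deferred to Phase 3.

```typescript
// Structural stand-in for MediaStreamTrack so the sketch needs no DOM typings.
interface TrackLike {
  addEventListener(type: 'ended', listener: () => void): void;
}

// Hypothetical message shape; the real name belongs in src/shared/types.ts.
type CaptureEndedMessage = { type: 'CAPTURE_ENDED'; endedAt: number };

function watchForStopSharing(
  track: TrackLike,
  notifyServiceWorker: (message: CaptureEndedMessage) => void,
  stopRecorder: () => void,
): void {
  let hasEnded = false;
  track.addEventListener('ended', (): void => {
    if (hasEnded) return; // guard against duplicate events
    hasEnded = true;
    stopRecorder(); // stop first so the final chunk is flushed
    notifyServiceWorker({ type: 'CAPTURE_ENDED', endedAt: Date.now() });
  });
}
```

In the offscreen document the callbacks would wrap `recorder.stop()` and `chrome.runtime.sendMessage`; the SW then records the state so the popup can re-prompt on the next interaction (D-03).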
### Pattern 2: Single continuous recorder + age-trim ring buffer

**What:** One `MediaRecorder` started once with `timeslice=2000` for the
whole session. Pin the first emitted chunk (EBML header + initial
cluster). Drop later chunks once they age past 30 s.

**When to use:** Phase 1 baseline (CONTEXT.md D-09..D-11).

**Example:**

```typescript
// Source: structural pattern from src/background/index.ts:21-66 (current code)
// [VERIFIED: pattern works locally; only the location/lifecycle needs fixing]
const RING_WINDOW_MS = 30_000;
type Chunk = { data: Blob; timestamp: number; isHeader: boolean };
const ringBuffer: Chunk[] = [];

recorder.ondataavailable = (event: BlobEvent) => {
  if (event.data.size === 0) return;
  const isHeader = ringBuffer.length === 0; // first chunk = WebM header
  ringBuffer.push({ data: event.data, timestamp: Date.now(), isHeader });
  trimAged();
};

function trimAged(): void {
  const cutoff = Date.now() - RING_WINDOW_MS;
  // Keep header chunk + every chunk newer than cutoff
  for (let i = ringBuffer.length - 1; i >= 0; i--) {
    const c = ringBuffer[i];
    if (!c.isHeader && c.timestamp < cutoff) ringBuffer.splice(i, 1);
  }
}
```

**Why this works (in theory) AND its risk:**
[CITED: stackoverflow #62236838] "MediaRecorder API inserts header
information into the first chunk (WebM file) only, so rest of the chunks
do not play individually without the header information." Concatenating
`[header] + [retained tail]` (the pinned first chunk followed by the
chunks that survived the age trim) produces a playable file *IF* the
post-header chunks each start on a WebM cluster boundary (each cluster
begins with a keyframe). [CITED: bugzilla.mozilla.org #1666487, Andreas
Pehrson] "There has been no intention to encode keyframes at the
timeslice interval … Google chrome outputs chunks at approx. timeslice
interval, even if clusters haven't finished then, so keyframe intervals
are much longer there." Chrome sets `kf_max_dist=100` so keyframes land
roughly every 3-5 s. With `timeslice=2000` ms, roughly every 2nd chunk
will start a fresh cluster; the others fall mid-cluster.

**Risk:** at any given moment, the chunks newer than `now - 30000 ms`
might NOT begin with a cluster boundary. The pinned header chunk + a
mid-cluster body chunk = corrupt input that decoders refuse past the
first GoP.

**Verification gate (D-12):**
`ffprobe -v error -f matroska -i last_30sec.webm` must exit 0. If it
doesn't, escalate to Pattern 3.

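Before running the gate, a cheap diagnostic can show how often chunks actually begin on a cluster boundary. The sketch below classifies a chunk by its first four bytes using the Matroska element IDs for the EBML header (`0x1A45DFA3`) and Cluster (`0x1F43B675`). The function names are hypothetical, and the check only says whether a chunk *starts* with a Cluster element; it is a measurement aid under that assumption, not a substitute for ffprobe.

```typescript
// Matroska/WebM element IDs (fixed by the container format).
const EBML_HEADER_ID: number[] = [0x1a, 0x45, 0xdf, 0xa3];
const CLUSTER_ID: number[] = [0x1f, 0x43, 0xb6, 0x75];

function startsWithBytes(bytes: Uint8Array, prefix: number[]): boolean {
  if (bytes.length < prefix.length) return false;
  return prefix.every((value: number, index: number): boolean => bytes[index] === value);
}

type ChunkStart = 'ebml-header' | 'cluster' | 'other';

// Classifies a chunk by its leading bytes. 'other' means the chunk begins
// somewhere inside an element, for example in the middle of a cluster.
function classifyChunkStart(bytes: Uint8Array): ChunkStart {
  if (startsWithBytes(bytes, EBML_HEADER_ID)) return 'ebml-header';
  if (startsWithBytes(bytes, CLUSTER_ID)) return 'cluster';
  return 'other';
}
```

In the offscreen document the input would come from `new Uint8Array(await chunk.data.slice(0, 4).arrayBuffer())`. Logging the classification per chunk during a test session gives direct evidence for or against the "every 2nd chunk" estimate above.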
### Pattern 3: Restart-segments (D-13 fallback, pre-stage in PLAN.md)

**What:** Stop + restart the `MediaRecorder` every 10 s. Each "segment"
is a self-contained playable WebM. Keep the 3 most-recent segments and
concatenate them on export (using `Blob` concatenation is enough; each
segment has its own header so playback is sequential, not a single
seamless track).

**When to use:** If Pattern 2 + ffprobe fails. CONTEXT.md D-13 declares
this the documented fallback so the planner pre-stages the alternative
file structure in PLAN.md, avoiding a mid-phase re-plan.

**Example:**

```typescript
// [ASSUMED] No external citation; this is the well-known structural fallback
// inferred from spec § (kf_max_dist=100) + the verified behavior that each
// MediaRecorder.start() emits a complete EBML preamble.
const SEGMENT_MS = 10_000;
const MAX_SEGMENTS = 3;
declare const stream: MediaStream; // the getDisplayMedia() stream from Pattern 1
let recorder: MediaRecorder;       // replaced on every rotation
let segments: Blob[] = [];
let currentChunks: Blob[] = [];

function rotateSegment(): void {
  recorder.stop(); // flushes a final dataavailable event
  // onstop will assemble currentChunks into one Blob and push to segments
}

function onSegmentStopped(): void {
  segments.push(new Blob(currentChunks, { type: 'video/webm' }));
  if (segments.length > MAX_SEGMENTS) segments.shift();
  currentChunks = [];
  recorder = new MediaRecorder(stream, { mimeType: 'video/webm;codecs=vp9' });
  recorder.ondataavailable = (e: BlobEvent) => { if (e.data.size > 0) currentChunks.push(e.data); };
  recorder.onstop = onSegmentStopped;
  recorder.start();
  setTimeout(rotateSegment, SEGMENT_MS);
}
```

**Trade-off vs Pattern 2:** ~3× the keyframes (bigger file), but every
|
||
output WebM is independently valid. ffprobe-clean by construction. The
|
||
"30 s window" becomes ~30 s ± 10 s depending on phase of rotation; the
|
||
CON-video-window contract allows this slack (it says "the most recent
|
||
30 seconds" not "exactly 30 seconds").
|
||
|
||
### Pattern 4: OFFSCREEN_READY handshake

**What:** Offscreen sends `OFFSCREEN_READY` to SW *after* its `onMessage`
listener is registered. SW waits for that signal before sending
`START_RECORDING`. Avoids the race the audit flagged at P1 #12.

**When to use:** Anywhere the SW would otherwise `chrome.runtime.sendMessage`
to the offscreen immediately after `chrome.offscreen.createDocument()`
resolves.

**Example:**

```typescript
// SW
let offscreenReadyResolve: (() => void) | null = null;
const offscreenReady = new Promise<void>((res) => { offscreenReadyResolve = res; });

chrome.runtime.onMessage.addListener((msg) => {
  if (msg.type === 'OFFSCREEN_READY') offscreenReadyResolve?.();
});

async function startRecording() {
  await createOffscreenIfNeeded();
  await offscreenReady;
  chrome.runtime.sendMessage({ type: 'START_RECORDING', target: 'offscreen' });
}
```

```typescript
// Offscreen (top of recorder.ts, after listener registration)
chrome.runtime.onMessage.addListener((msg) => { /* ... */ });
chrome.runtime.sendMessage({ type: 'OFFSCREEN_READY' }); // tell SW we are listening
```

The `OFFSCREEN_READY` Message type is already declared in
`src/shared/types.ts:18` but unused. Phase 1 wires it up.

### Pattern 5: Long-lived port as SW keepalive + buffer-fetch channel

**What:** Offscreen opens `chrome.runtime.connect({ name: 'video-keepalive' })`.
Each side periodically `postMessage`s to reset the SW's 30 s idle timer.
SW also uses the port to one-shot-request the buffer on export.

**When to use:** Always-on, for the lifetime of the recording session.

**Example:**

```typescript
// Offscreen
const port = chrome.runtime.connect({ name: 'video-keepalive' });
setInterval(() => port.postMessage({ type: 'PING' }), 25_000); // < 30 s idle
port.onMessage.addListener((msg) => {
  if (msg.type === 'REQUEST_BUFFER') port.postMessage({ type: 'BUFFER', chunks: ringBuffer });
});

// SW
let videoPort: chrome.runtime.Port | null = null;
chrome.runtime.onConnect.addListener((p) => {
  if (p.name !== 'video-keepalive') return;
  videoPort = p;
  p.onMessage.addListener((msg) => {
    if (msg.type === 'BUFFER') { /* resolve pending export */ }
  });
  p.onDisconnect.addListener(() => { videoPort = null; });
});

async function exportBuffer(): Promise<Chunk[]> {
  if (!videoPort) await ensureOffscreenAndPort();
  return new Promise((resolve) => {
    const handler = (msg: any) => {
      if (msg.type === 'BUFFER') { videoPort!.onMessage.removeListener(handler); resolve(msg.chunks); }
    };
    videoPort!.onMessage.addListener(handler);
    videoPort!.postMessage({ type: 'REQUEST_BUFFER' });
  });
}
```

**Why this beats `chrome.alarms`:**
[VERIFIED: developer.chrome.com/blog/longer-esw-lifetimes] As of Chrome
110, "All events reset the idle timer." Alarm events do reset the timer
when they fire, but at the 20 s cadence the current code uses, there's a
window after the alarm fires where the SW idle countdown restarts from 0;
if nothing else happens in the next 30 s the SW unloads anyway. Port
`postMessage` traffic across both directions resets the timer
continuously. [CITED: developer.chrome.com SW lifecycle] Chrome 114
change: "Sending a message with long-lived messaging keeps the service
worker alive." Note: opening a port no longer resets the timers; *messages
across the port* do. Be sure to ping, don't just connect.

**Important: 5-minute port lifetime cap.** [CITED: gist
sunnyguan/f94058f66fab89e59e75b1ac1bf1a06e, multiple corroborating
sources] Chrome closes long-lived ports after ~5 minutes regardless of
traffic. Production extensions reconnect on `onDisconnect` to refresh the
window. Implementation note: offscreen's `port.onDisconnect` handler should
immediately call `chrome.runtime.connect()` again to mint a fresh port.

### Pattern 6: Codec strict-mode (CONTEXT.md D-20)

**What:** Test `MediaRecorder.isTypeSupported('video/webm;codecs=vp9')`
before constructing the recorder. If not supported, throw; no fallback
to vp8/h264/default.

**Example:**

```typescript
const MIME = 'video/webm;codecs=vp9';
if (!MediaRecorder.isTypeSupported(MIME)) {
  const err = `[Offscreen] vp9 unsupported. UA=${navigator.userAgent}`;
  chrome.runtime.sendMessage({ type: 'RECORDING_ERROR', error: err });
  throw new Error(err);
}
const recorder = new MediaRecorder(stream, {
  mimeType: MIME,
  videoBitsPerSecond: 400_000, // CON-video-codec
});
```

vp9 has been supported in Chromium-based browsers since well before
Chrome 116 (our `minimum_chrome_version` baseline). The "fail loud" is
defensive against weird embeddings; in practice it should never trip.

### Anti-Patterns to Avoid

- **Anti-pattern: Wrapping the offscreen recorder source code as a
  template string in `vite.config.ts`.** This is the audit's P0 #1. The
  cost: no type-check, no source maps, no IDE, divergence from any
  reference TS file in the same repo. **Solution: real `src/offscreen/recorder.ts`
  + `src/offscreen/index.html` + `rollupOptions.input` entry.**
- **Anti-pattern: `let mediaRecorder` declared in both module scope and
  inside `startRecording`.** Audit P0 #1 / vite.config.ts:113 vs 27.
  Shadowing makes `stopRecording` operate on a permanently-null reference.
  **Solution: declare it ONCE at module scope. Use a different name (e.g.
  `videoRecorder`) to make the shadowing impossible.**
- **Anti-pattern: Sending `Blob` payloads over `chrome.runtime.sendMessage`.**
  `sendMessage` JSON-serializes its payload; Blobs become `{}`. The current
  IndexedDB workaround in `vite.config.ts:43-104` is a symptom of trying
  to ship Blobs through the wrong channel. **Solution: the buffer never
  leaves the offscreen until export; on export, SW pulls Chunks via port
  `postMessage`.** The plan assumes port `postMessage` can carry Blobs.
  Chrome's `runtime.Port` documentation describes port messages as
  JSON-serializable, so confirm this at runtime; if Blobs arrive as `{}`,
  convert the chunks to a serializable form before sending.
- **Anti-pattern: `mediaRecorder.start(200)`.** 200 ms is far below
  Chrome's keyframe cadence (`kf_max_dist=100` → ~3-5 s on a 30 fps
  stream). Almost no chunk starts a cluster; concat fails. **Solution:
  `start(2000)` per CON-video-codec, plus the ffprobe gate (D-12) and the
  D-13 fallback if it still doesn't decode.**
- **Anti-pattern: `chrome.alarms` as the sole SW keepalive.** Works in
  Chrome 110+ (the timer DOES reset on alarm fire) but is brittle: a
  single skipped alarm tick gives the SW 30 s of idle. **Solution:
  long-lived port with periodic ping AND let alarms be deleted (CONTEXT.md
  D-18).**
- **Anti-pattern: Trying to read `chrome.tabs.onActivated` and "re-attach"
  the recording.** Made sense for `chrome.tabCapture`; with
  `getDisplayMedia` the stream is screen/window-scoped, not tab-scoped.
  Delete the listener wholesale.
- **Anti-pattern: Treating `getDisplayMedia()` as silent.** Chrome's
  permanent "Sharing your screen" indicator is non-suppressible. The
  CONTEXT.md author has accepted this; planner should NOT add a task to
  "hide the indicator" because there is no API.

## Don't Hand-Roll

| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| WebM seekability fixup | A custom EBML parser to inject SeekHead / Cues into the saved file | If D-13 fails too: `ts-ebml` v2 (kept in deferred per CONTEXT.md). For Phase 1, the ffprobe gate is "playable" not "seekable"; the file plays sequentially from byte 0 even without Cues. | Container fixup is a known well-explored space (`ts-ebml`, `fix-webm-meta`); hand-rolled EBML walkers reliably get cluster timestamps wrong. [VERIFIED via web search: legokichi/ts-ebml is the de-facto library.] |
| MV3 SW keepalive | A custom `setInterval`-based ping that posts to `self` from inside the SW | `chrome.runtime.connect` long-lived port from the offscreen (Pattern 5) | `self.setTimeout` and `setInterval` inside an MV3 SW are unreliable: the SW unloads and the timers die. The port-from-offscreen pattern survives SW restarts because Chrome auto-respawns the SW when the offscreen's port reconnects. [CITED: gist sunnyguan/f94058f66fab89e59e75b1ac1bf1a06e] |
| Offscreen → SW handshake | A `Promise` with a `setTimeout`-based retry loop hoping the offscreen is ready | The explicit `OFFSCREEN_READY` message (Pattern 4). The Message type is already declared in `src/shared/types.ts:18`. | Audit P1 #12 lists "Receiving end does not exist" as an intermittent surfacing of the race; explicit handshake eliminates it. |
| Build-time copy of offscreen HTML / JS into dist | The 174-line `copy-offscreen` Vite plugin (`vite.config.ts:11-184`) that `this.emitFile`s both HTML and a stringified JS module | crxjs's manifest-driven entry mechanism + a `rollupOptions.input` for the offscreen HTML | crxjs handles this exact case; the hand-rolled plugin is a maintenance trap. The canonical pattern is documented in crxjs discussion #1060 (`src/offscreen/index.html` referenced as `chrome.runtime.getURL('src/offscreen/index.html')` from SW). |
| Cluster-boundary aligned trimming | Walking the EBML to find cluster ends so we can trim mid-stream | The 30 s arrival-timestamp trim (Pattern 2). Verify via ffprobe gate (D-12). | Cluster-aware trimming would solve the playability problem perfectly but adds an EBML parser dependency we don't need if the simpler trim survives the ffprobe gate. Held in reserve. |

**Key insight:** every "hand-rolled" custom path in the current codebase
maps to an audit P0 or P1 defect. The fix is almost always "delete it
and use the standard API directly." Phase 1 is a subtraction phase.

## Runtime State Inventory

> This is a refactor phase (collapse two implementations into one,
> delete a vite plugin string, delete an IndexedDB code path) so the
> inventory matters.

| Category | Items Found | Action Required |
|----------|-------------|------------------|
| Stored data | **IndexedDB `VideoRecorderDB`/`chunks` store** is created by `vite.config.ts:43-60` at recorder start and cleared at every restart. No persisted state survives between runs by design; the store is created fresh on each load. | **No data migration needed.** After the inline-plugin deletion, the database name `VideoRecorderDB` becomes orphaned in any browser profile that ran the old extension at least once. **Action: add a one-shot `indexedDB.deleteDatabase('VideoRecorderDB')` in the SW's `chrome.runtime.onInstalled` listener** to clean up stragglers. Cheap idempotent cleanup. |
| Live service config | None. Mokosh has no external services (no n8n, no Datadog, no Tailscale). The extension is local-only by CON-no-server-upload. | None. |
| OS-registered state | None. The extension is loaded as unpacked in Chrome's `chrome://extensions`. No OS-level registration (no native-messaging host, no system service). | None. |
| Secrets/env vars | None. No secret keys, no env vars. `manifest.json` declares only permissions; no environment configuration. | None. |
| Build artifacts | (1) `dist/offscreen/index.html` and `dist/assets/offscreen.js` are emitted by the inline plugin today. After deleting the plugin, the next `vite build` rewrites `dist/` entirely under crxjs's control, so old artifacts are replaced rather than orphaned. (2) `node_modules/` is currently absent in the repo (`ls` confirms). `npm install` is a prerequisite to any verification. | Action: `rm -rf dist/` before the first post-refactor `vite build`, just to be sure. Action: `npm install` before testing. |

**Nothing in CI / no CD pipeline:** the project has no CI per audit P2 #22.

## Common Pitfalls

### Pitfall 1: Concatenated WebM chunks don't decode past the first GoP

**What goes wrong:** You retain the first chunk (which has the EBML
header), drop chunks until they age out at 30 s, and concatenate the
remaining chunks into `last_30sec.webm`. The file plays for ~2 s and
then the decoder gives up.

**Why it happens:** [CITED: bugzilla.mozilla.org/show_bug.cgi?id=1666487
comment from Andreas Pehrson] "There has been no intention to encode
keyframes at the timeslice interval." Chrome's VP9 encoder defaults to
`kf_max_dist=100` (about 3-5 s on a 30 fps stream); chunks emitted at
`timeslice=2000` ms fall mid-cluster about half the time. A `Blob`
concat of `[header_chunk, mid_cluster_chunk, mid_cluster_chunk, ...]`
produces a byte stream where the decoder hits a SimpleBlock referencing
a frame whose keyframe is in a chunk that's no longer there.

**How to avoid:** (1) Verify with `ffprobe -v error` at every build of
the export path. (2) If ffprobe complains, fall back to D-13
(restart-segments): each 10 s segment is its own self-contained WebM,
concat is trivially safe (each segment has its own header), and
acceptance criterion §10 #7 ("plays back in a browser") doesn't require
a single continuous track. (3) Last-resort fallback: `ts-ebml`
header injection (deferred).

**Warning signs:** ffprobe stderr contains "Length indicated by EBML
number's first byte exceeds max length" or "Could not find codec
parameters." VLC plays the first few frames then stops. Chrome's video
tag shows the first frame then a black square.

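The retention step described under "What goes wrong" can be sketched as a pure function, which makes the trim logic unit-testable without a browser. This is a minimal sketch under stated assumptions: the `TimedChunk` shape, the `arrivedAt` field, and the `trimRingBuffer` name are hypothetical and not taken from the codebase. It only decides which chunks to keep; whether the concatenated result decodes is exactly what the D-12 ffprobe gate checks.

```typescript
// Hypothetical chunk shape: `data` would be a Blob in the real recorder.
interface TimedChunk<T> {
  data: T;
  arrivedAt: number; // ms timestamp taken when dataavailable fired
}

const WINDOW_MS = 30_000;

// Keep the pinned header chunk (index 0) plus every later chunk that
// arrived within the last `windowMs` milliseconds.
function trimRingBuffer<T>(
  chunks: readonly TimedChunk<T>[],
  now: number,
  windowMs: number = WINDOW_MS,
): TimedChunk<T>[] {
  if (chunks.length === 0) return [];
  const [header, ...body] = chunks;
  const fresh = body.filter((c) => now - c.arrivedAt <= windowMs);
  return [header, ...fresh];
}
```

Keeping the function free of `MediaRecorder` and `Blob` dependencies is a design choice for testability; the first retained body chunk may still fall mid-cluster, which is the failure mode this pitfall describes.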
### Pitfall 2: `getDisplayMedia` rejects with `NotAllowedError` when there's no transient activation

**What goes wrong:** SW sends `START_RECORDING` to the offscreen "too
late" (e.g. several seconds after the popup click, with awaits in
between). Offscreen calls `getDisplayMedia()` and gets a
`NotAllowedError`.

**Why it happens:** [CITED: chromestatus #5090735022407680 + intent-to-remove
thread] `getDisplayMedia()` requires transient user activation, which
expires ~5 s after the original gesture. If anything between the click
and the offscreen's `getDisplayMedia` call takes too long (slow
offscreen bootstrap, missing OFFSCREEN_READY handshake, network-bound
`await`), the activation window closes.

**How to avoid:** (1) Implement Pattern 4 (OFFSCREEN_READY handshake) so
the SW only sends `START_RECORDING` after the offscreen's listener is
demonstrably ready. (2) Don't put any `await`s between the popup click
handler and the `chrome.runtime.sendMessage('START_RECORDING')`. (3)
Pre-create the offscreen at SW startup (in `chrome.runtime.onInstalled`)
so the create-document round-trip isn't on the critical path.

**Warning signs:** First-run works on the developer's machine because
the offscreen bootstraps fast; CI / production fails because real-world
extension startup is slower.

### Pitfall 3: SW unloads mid-export and the popup gets "Receiving end does not exist"

**What goes wrong:** Operator clicks the popup save button after a long
idle period. SW had unloaded; popup's `chrome.runtime.sendMessage` wakes
it, but the SW's `videoBuffer` array (in the current code) was reset by
the unload, so it returns an empty buffer.

**Why it happens:** The current code stores the buffer in the SW's
top-level `let videoBuffer = []`. SW unload = lose array. CONTEXT.md
D-16 fixes this by moving buffer ownership to the offscreen, which
survives SW unloads because it holds the `DISPLAY_MEDIA` capture.

**How to avoid:** (1) Buffer ownership in offscreen, not SW (D-16). (2)
Port keepalive from offscreen → SW (D-17/Pattern 5): if the SW ever
unloads, the offscreen's next port message wakes it. (3) On export, SW
asks offscreen for the buffer over the port; this is a one-shot,
SW-stateless lookup.

**Warning signs:** "Receiving end does not exist" in popup console after
~30 s of inactivity. Or: saved archive contains a tiny `last_30sec.webm`
that only holds the very first chunk.

### Pitfall 4: Long-lived port is closed by Chrome at ~5 minutes regardless of traffic

**What goes wrong:** You set up the port-based keepalive and confirm it
works for a few minutes. Then at minute 5, the port silently disconnects
and the SW unloads on the next idle window.

**Why it happens:** [CITED: gist sunnyguan & multiple Chromium-extensions
threads] Chrome enforces a hard 5-minute lifetime on long-lived ports
(an artifact of the SW `ExtendableEvent` time budget).

**How to avoid:** In the offscreen, listen to `port.onDisconnect` and
immediately call `chrome.runtime.connect()` again. Reconnect every
~290 s pre-emptively as a belt-and-braces guard.

**Warning signs:** Buffer goes empty around minute 5 of a long
recording session. Port is reported as `disconnected` in
chrome://extensions service-worker inspect.

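The reconnect-on-disconnect plus ~290 s pre-emptive rotation can be sketched with the port factory and the timer injected, so the logic is testable outside Chrome. This is a minimal sketch, not the planned implementation: `PortLike`, `keepPortAlive`, and the `schedule` parameter are hypothetical names, and `PortLike` is a structural stand-in for the subset of `chrome.runtime.Port` used here.

```typescript
// Structural stand-in for the parts of chrome.runtime.Port this sketch uses.
interface PortLike {
  onDisconnect: { addListener(cb: () => void): void };
  disconnect(): void;
}

const RECONNECT_MS = 290_000; // just under the reported ~5 min cap

function keepPortAlive(
  connect: () => PortLike,
  onPort: (port: PortLike) => void,
  schedule: (fn: () => void, ms: number) => unknown,
): void {
  let current: PortLike | null = null;

  const open = (): void => {
    const port = connect();
    current = port;
    onPort(port); // caller wires onMessage handlers and the PING interval here

    // Reactive path: reconnect only if the dropped port is still the active one.
    port.onDisconnect.addListener(() => {
      if (current === port) open();
    });

    // Pre-emptive path: mint the replacement first, then retire the old port.
    schedule(() => {
      if (current !== port) return; // already replaced, stale timer
      open();
      port.disconnect();
    }, RECONNECT_MS);
  };

  open();
}
```

In the offscreen document this would presumably be called with `() => chrome.runtime.connect({ name: 'video-keepalive' })` as `connect` and `setTimeout` as `schedule`. Opening the replacement before disconnecting the old port keeps at least one live port at all times during rotation.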
### Pitfall 5: `chrome.runtime.getURL('offscreen/index.html')` returns a 404

**What goes wrong:** SW calls `chrome.offscreen.createDocument({url: 'offscreen/index.html', ...})`
and gets an `ERR_FILE_NOT_FOUND`.

**Why it happens:** crxjs places the bundled offscreen HTML under the
`src/`-relative path you declared in `rollupOptions.input`. If you set
`input: { offscreen: 'src/offscreen/index.html' }`, the runtime URL is
`chrome.runtime.getURL('src/offscreen/index.html')`, NOT
`offscreen/index.html`. [CITED: crxjs discussion #919 + #1060]

**How to avoid:** Match the input key (or the relative path crxjs emits)
to what the SW passes to `createDocument`. The path crxjs emits is the
same path you give as the rollup input value. Test by inspecting
`dist/` after `npm run build`; the HTML should be at exactly the path
the SW expects.

**Warning signs:** SW console shows "Failed to load resource: net::ERR_FILE_NOT_FOUND",
"Could not establish connection. Receiving end does not exist."

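The "inspect `dist/` after the build" check can be turned into a small assertion. The sketch below is a hedged illustration: `resolveOffscreenPath` and `OFFSCREEN_CANDIDATES` are hypothetical names, and the function assumes the caller supplies the list of files emitted under `dist/` as paths relative to that directory (for example from a recursive directory listing in a build script).

```typescript
// The two emit locations discussed in this document; adjust if crxjs differs.
const OFFSCREEN_CANDIDATES: readonly string[] = [
  'src/offscreen/index.html',
  'offscreen/index.html',
];

// Returns the path the SW should pass to createDocument / getURL, or throws
// if zero or several candidates were emitted.
function resolveOffscreenPath(
  emitted: readonly string[],
  candidates: readonly string[] = OFFSCREEN_CANDIDATES,
): string {
  // Normalize Windows separators and a leading "./" or "/".
  const normalized = new Set(
    emitted.map((p) => p.replace(/\\/g, '/').replace(/^\.?\//, '')),
  );
  const hits = candidates.filter((c) => normalized.has(c));
  if (hits.length !== 1) {
    throw new Error(
      `expected exactly one offscreen HTML in dist/, found ${hits.length}: ${hits.join(', ')}`,
    );
  }
  return hits[0];
}
```

Failing on more than one hit is deliberate: a stale `dist/` from the old inline plugin could contain both paths, which is the situation `rm -rf dist/` is meant to rule out.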
### Pitfall 6: `MediaStreamTrack.onended` never fires

**What goes wrong:** Operator clicks Chrome's "Stop sharing" banner; you
expect `track.onended` to fire so you can update state. Nothing happens.

**Why it happens:** (1) You attached the listener to the wrong track (the
stream's audio track instead of the video track). (2) You assigned
`.onended = fn` AFTER the event had already fired (race with the picker
dismiss). (3) You destructured the track and attached the listener to a
local that was later garbage-collected.

**How to avoid:** Attach with `addEventListener('ended', ...)` (not
`.onended =`); attach to ALL tracks (`stream.getTracks().forEach(t => t.addEventListener('ended', onEnded))`)
so any track ending triggers cleanup; attach immediately after the
`getDisplayMedia()` await resolves.

**Warning signs:** Operator stops sharing, the UI keeps saying "recording"
in console logs, ffprobe-checking the next export shows the last 30 s of
content from BEFORE the user stopped.

## Code Examples

Verified patterns from official sources and a production extension.

### Example A: Minimal offscreen HTML (NEW: `src/offscreen/index.html`)

```html
<!-- Source: pattern from crxjs discussion #919 + Proscreen-S3 -->
<!doctype html>
<html>
  <head><meta charset="UTF-8"><title>Mokosh Recorder</title></head>
  <body>
    <script type="module" src="./recorder.ts"></script>
  </body>
</html>
```

### Example B: Minimal `vite.config.ts` (REPLACES the 184-line current one)

```typescript
// Source: crxjs documentation + discussion #919
import { defineConfig } from 'vite';
import { crx } from '@crxjs/vite-plugin';
import manifest from './manifest.json';

export default defineConfig({
  plugins: [
    crx({ manifest, contentScripts: { injectCss: false } }),
  ],
  build: {
    rollupOptions: {
      input: {
        offscreen: 'src/offscreen/index.html',
      },
    },
  },
});
```

### Example C: SW ensure-offscreen pattern (snippet for `src/background/index.ts`)

```typescript
// Source: github.com/GoogleChrome/chrome-extensions-samples/tree/main/functional-samples/sample.tabcapture-recorder/service-worker.js
// [VERIFIED: canonical Google sample, license Apache-2.0]
async function ensureOffscreenDocument(): Promise<void> {
  const existing = await chrome.runtime.getContexts({
    contextTypes: [chrome.runtime.ContextType.OFFSCREEN_DOCUMENT],
  });
  if (existing.length > 0) return;
  await chrome.offscreen.createDocument({
    url: 'src/offscreen/index.html',
    reasons: [chrome.offscreen.Reason.DISPLAY_MEDIA],
    justification: 'Continuous screen recording for operator session diagnostics',
  });
}
```

### Example D: ffprobe verification (used in the acceptance gate D-12)

```bash
# Source: ffmpeg.org/ffprobe.html, exit code semantics:
#   0 = recognized media; >0 = could not open / not multimedia / decode error
# Force the format with -f matroska: WebM is a Matroska subset, and this helps
# ffprobe choose the right demuxer when the file is "live" (no SeekHead).
ffprobe -v error -f matroska -i last_30sec.webm
echo "ffprobe exit: $?"

# Optional: dump cluster timeline for diagnosis if exit != 0
ffprobe -v error -show_packets -i last_30sec.webm 2>&1 | head -50
```

### Example E: Codec capability strict-mode (CONTEXT.md D-20)

```typescript
// Source: MDN MediaRecorder.isTypeSupported + CONTEXT.md D-20
const VIDEO_MIME = 'video/webm;codecs=vp9';
const VIDEO_BITRATE = 400_000; // CON-video-codec
const TIMESLICE_MS = 2000; // CON-video-codec / SPEC §4.1

if (!MediaRecorder.isTypeSupported(VIDEO_MIME)) {
  const ua = navigator.userAgent;
  chrome.runtime.sendMessage({
    type: 'RECORDING_ERROR',
    error: `vp9 unsupported. UA=${ua}`,
  });
  throw new Error(`MediaRecorder mime not supported: ${VIDEO_MIME}; UA=${ua}`);
}

const videoRecorder = new MediaRecorder(stream, {
  mimeType: VIDEO_MIME,
  videoBitsPerSecond: VIDEO_BITRATE,
});
videoRecorder.start(TIMESLICE_MS);
```

### Example F: `MediaStreamTrack.onended` for "Stop sharing"

```typescript
// Source: MDN MediaStreamTrack#ended_event
stream.getTracks().forEach((track) => {
  track.addEventListener('ended', () => {
    // Clear the buffer (the captured source is gone)
    ringBuffer.length = 0;
    // Disconnect the port so SW can clean up
    port?.disconnect();
    // Notify SW for state transition; popup state change is Phase 3 territory
    chrome.runtime.sendMessage({ type: 'RECORDING_ERROR', error: 'user-stopped-sharing' });
    // Stop the recorder explicitly
    if (videoRecorder.state !== 'inactive') videoRecorder.stop();
  }, { once: true });
});
```

## State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| Background page (persistent) in MV2 | MV3 service worker | Chrome 88 → MV3 default; MV2 sunset 2024 | All capture APIs must be reachable from SW or offscreen, NOT a persistent page. Drives the SW + offscreen split. |
| `chrome.desktopCapture.chooseDesktopMedia` returning a streamId redeemable in any context | streamId from `chrome.desktopCapture` not usable in offscreen MV3 | Chrome 109+ offscreen API rollout | Forces the choice between (a) tabCapture + USER_MEDIA pattern (canonical Google sample) or (b) getDisplayMedia + DISPLAY_MEDIA pattern (CONTEXT.md D-01..D-05). [CITED: groups.google.com chromium-extensions/3RanHldyp9c] |
| `chrome.alarms` as the universal SW keepalive | Long-lived port `postMessage` traffic | Chrome 110+ "all events reset idle timer" + Chrome 114 "Sending a message with long-lived messaging keeps the service worker alive" + Chrome 116 WebSockets | Alarms still work in Chrome 110+ but are no longer the recommended primary keepalive for offscreen-paired extensions. [CITED: developer.chrome.com/blog/longer-esw-lifetimes] |
| `rrweb.record({maskInputSelector: ...})` | `rrweb.record({maskInputFn: ...})` | rrweb 2.0.0-alpha | Not Phase 1 territory (Phase 2 owns it), but flagged because the audit lists it as a P0. The current code uses `maskTextSelector` which is yet a third thing and is wrong (audit P0 #6). |
| Tab capture as active-tab-bound, requiring re-attach on `chrome.tabs.onActivated` | Display capture as screen/window-bound, NO re-attach (CONTEXT.md D-14/D-15) | This phase (DEC-003 AMENDED) | Deletes `chrome.tabs.onActivated` and `chrome.tabs.onUpdated` listener requirements from REQ-video-ring-buffer. |

**Deprecated/outdated:**
- `chrome.tabCapture.capture()` (the legacy callback form): replaced by `chrome.tabCapture.getMediaStreamId` + offscreen `getUserMedia` redemption. We're abandoning this whole path per CONTEXT.md D-01.
- `mandatory: { chromeMediaSource: 'tab' }` constraint syntax: Chrome-specific extension to `getUserMedia`. Phase 1 doesn't use it (we use the standard `getDisplayMedia`).

## Assumptions Log

| # | Claim | Section | Risk if Wrong |
|---|-------|---------|---------------|
| A1 | Restart-segments fallback structural sketch (Pattern 3) | Architecture Patterns / Pattern 3 | Low: pattern is an inferred application of standard MediaRecorder semantics; if it fails, we have the third-tier ts-ebml deferred fallback. The risk is implementation-time, not phase-blocking. |
| A2 | Chrome enforces ~5 minute lifetime on long-lived ports (Pattern 5 / Pitfall 4) | Pitfall 4 | MEDIUM: multiple community sources corroborate, but no canonical Chrome doc states the exact limit. If the limit is shorter, our reconnect should still recover. If longer, our 290s reconnect is just defensive overhead. |
| A3 | `MediaRecorder.start(2000)` produces chunks that align with cluster boundaries about half the time (consequence of Chrome's `kf_max_dist=100` and 30 fps default) | Pitfall 1 / Pattern 2 | HIGH: this is the load-bearing claim that makes Pattern 2 work *at all*. The ffprobe gate (D-12) is exactly the mitigation; if ffprobe rejects, we escalate to Pattern 3 by design. So the assumption is **already mitigated by the plan's fallback structure**. |
| A4 | Chrome propagates transient user activation through `chrome.runtime.sendMessage` for the SW → offscreen → `getDisplayMedia` chain | Pattern 1 + Pitfall 2 | LOW: verified against a real production extension (Proscreen-S3) doing exactly this. Mitigation: OFFSCREEN_READY handshake (Pattern 4) tightens the timing window so we never exceed the ~5 s activation budget. |
| A5 | The 30-second window's "30" is an upper bound, not an exact target (CON-video-window allows ±10 s slack for the restart-segments fallback) | Pattern 3 | LOW: REQUIREMENTS.md says "the most recent 30 seconds" and "no more than 30 seconds", which our restart-segments stays inside (3×10 s = 30 s exactly at one phase of rotation, dropping to 20 s right after rotation). User confirmation desirable but the contract permits it. |
| A6 | `getDisplayMedia()` does NOT need `desktopCapture` permission in the manifest (it's a web standard API; `desktopCapture` only gates `chrome.desktopCapture.chooseDesktopMedia`) | Architectural Responsibility Map (Manifest row) + Standard Stack | LOW: multiple sources confirm. CONTEXT.md D-05 chooses to declare `desktopCapture` anyway, which is harmless. If we DROPPED `desktopCapture` from the manifest, the only ill effect would be losing the option to call `chrome.desktopCapture.chooseDesktopMedia` (which we don't use). |
| A7 | The `chrome.runtime.getContexts` API is available in Chrome ≥ 116 and is the recommended way to test for an existing offscreen document (replaces `chrome.offscreen.hasDocument`) | Pattern 1 / Example C | MEDIUM: `chrome.offscreen.hasDocument` is the older, simpler check and still works. The canonical Google sample uses `getContexts`. Either works; planner can pick. |

**If this table contains items:** The planner should treat them as
candidates for user verification during `/gsd-plan-phase` review.

## Open Questions (RESOLVED)
|
||
|
||
1. **Will `MediaRecorder.start(2000)` produce ffprobe-clean WebM on a
|
||
typical screen-cap?**
|
||
- **RESOLVED:** Cluster-boundary alignment is resolved by the D-12 ffprobe acceptance gate (enforced in Plan 03 Task 2 verify path + Plan 07 Task 1) and the D-13 restart-segments fallback (pre-staged as a commented skeleton in `src/offscreen/recorder.ts` per Plan 03; activated by re-plan after Plan 07 if the gate fails).
|
||
- What we know: Cluster boundaries align with keyframes; Chrome
|
||
keyframes appear every ~3-5 s by default (vp9 `kf_max_dist=100` on
|
||
a 30 fps stream); timeslice does NOT force keyframes.
|
||
- What's unclear: How often *in practice* does a 2 s timeslice happen
|
||
to land at a cluster boundary for a desktop screen-cap (which has
|
||
lots of static frames and may have different keyframe cadence than
|
||
a webcam)?
|
||
- Recommendation: Build Pattern 2 first; run the D-12 ffprobe gate;
|
||
keep Pattern 3 (restart-segments) pre-staged in PLAN.md per CONTEXT.md
|
||
D-13 so we don't re-plan if Pattern 2 fails. Plan-checker can ratchet
|
||
this in the success criteria.
|
||
|
||
2. **Does the 5-minute port lifetime kill the recording session?**
|
||
- **RESOLVED:** Plan 04's 290 s pre-emptive reconnect logic plus the synchronous onDisconnect → connectPort reconnect path mitigate the cap whether it applies to port lifetime or SW lifetime; either way the offscreen reconnects within seconds and the buffer is unaffected.
|
||
- What we know: Multiple corroborating community sources cite a ~5
|
||
minute hard cap on long-lived ports.
|
||
- What's unclear: Whether the cap applies to *port lifetime* (the
|
||
port object dies and must be reconnected) OR to *SW lifetime
|
||
extension* (after 5 minutes of port keepalive, the SW is killed
|
||
anyway and the port goes with it).
|
||
- Recommendation: Pessimistic — assume the worst, reconnect every
|
||
~290 s. Cheap defensive code. If we learn the cap is different, the
|
||
reconnect is still harmless.
|
||
|
||
3. **What's the exact crxjs path-emit behavior for the offscreen entry?**
   - **RESOLVED:** Plan 06 Task 2 performs runtime verification: it runs `npm run build`, inspects `dist/` for whichever of `dist/src/offscreen/index.html` or `dist/offscreen/index.html` was emitted, then edits the `chrome.runtime.getURL(...)` argument in `src/background/index.ts` to match (this is why Plan 06 now lists `src/background/index.ts` in `files_modified` per the iteration-1 dependency-correctness fix).
   - What we know: The discussion #919 working answer uses
     `input: { offscreen: 'src/offscreen/offscreen.html' }` and the SW
     fetches `chrome.runtime.getURL('src/offscreen/offscreen.html')`.
   - What's unclear: Some crxjs versions strip the leading `src/`; the
     2.0.0-beta vs 2.4.0 difference might matter.
   - Recommendation: After the first `npm run build`, inspect `dist/` to
     confirm the actual emitted path, then encode that path as a
     constant in the SW. This is a verifiable runtime check, not a design
     decision.

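The reconnect recommendation in open question 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Plan 04's actual code: `keepPortAlive`, `PortLike`, and the injected `connect` / `schedule` parameters are hypothetical names introduced here so the logic runs without the `chrome` global. In the extension, `connect` would presumably wrap `chrome.runtime.connect(...)`.

```typescript
// Hypothetical minimal shape of a runtime port: only the members used below.
interface PortLike {
  onDisconnect: { addListener(cb: () => void): void };
  disconnect(): void;
}

interface KeepalivePort {
  /** Stops reconnecting and closes the current port. */
  stop(): void;
}

// Pessimistic rotation interval from the recommendation (~290 s).
const RECONNECT_AFTER_MS = 290_000;

// Default scheduler: a plain timeout that returns its own cancel function.
const defaultSchedule = (cb: () => void, ms: number): (() => void) => {
  const id = setTimeout(cb, ms);
  return () => clearTimeout(id);
};

function keepPortAlive(
  connect: () => PortLike,
  schedule: (cb: () => void, ms: number) => () => void = defaultSchedule,
  reconnectAfterMs: number = RECONNECT_AFTER_MS,
): KeepalivePort {
  let stopped = false;
  let current: PortLike | null = null;
  let cancelTimer: () => void = () => {};

  const open = (): void => {
    if (stopped) return;
    const port = connect();
    current = port;

    // Unexpected drop (cap hit, SW restart): reconnect synchronously.
    port.onDisconnect.addListener(() => {
      if (stopped || current !== port) return; // stale port we replaced ourselves
      cancelTimer();
      open();
    });

    // Pre-emptive rotation: open the replacement first, then close the old port.
    cancelTimer = schedule(() => {
      open();
      port.disconnect();
    }, reconnectAfterMs);
  };

  open();

  return {
    stop(): void {
      stopped = true;
      cancelTimer();
      const port = current;
      current = null;
      port?.disconnect();
    },
  };
}
```

Injecting the scheduler keeps the 290 s timer out of unit tests: a fake `schedule` can fire the rotation callback immediately, and a fake `connect` can count how many ports were opened.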
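For open question 3, the post-build check could be expressed as a small pure function. This is a sketch only: `resolveOffscreenPath` and `OFFSCREEN_CANDIDATES` are hypothetical names, the candidate list simply mirrors the two locations named in the RESOLVED note, and the `exists` predicate is injected so the function has no file-system dependency of its own.

```typescript
// The two emitted locations named in the RESOLVED note, relative to dist/.
// Which one exists depends on the crxjs version (assumption to be verified).
const OFFSCREEN_CANDIDATES: readonly string[] = [
  "src/offscreen/index.html",
  "offscreen/index.html",
];

/**
 * Returns the single candidate path that exists in the build output.
 * Throws if none or more than one is present, so an ambiguous build fails loudly.
 */
function resolveOffscreenPath(
  exists: (relativePath: string) => boolean,
  candidates: readonly string[] = OFFSCREEN_CANDIDATES,
): string {
  const [only, ...rest] = candidates.filter((p) => exists(p));
  if (only === undefined) {
    throw new Error(`offscreen entry not found; tried: ${candidates.join(", ")}`);
  }
  if (rest.length > 0) {
    throw new Error(`ambiguous offscreen entry: ${[only, ...rest].join(", ")}`);
  }
  return only;
}
```

In a Node script run after `npm run build`, `exists` could be something like `(p) => fs.existsSync(path.join("dist", p))`; the returned string is then the value to hard-code as the `chrome.runtime.getURL(...)` argument in the SW.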
## Environment Availability

| Dependency | Required By | Available | Version | Fallback |
|------------|-------------|-----------|---------|----------|
| Node.js | Vite, TypeScript, npm | ✓ | v24.14.0 | n/a |
| npm | Dep install | ✓ | 11.9.0 | n/a |
| ffprobe (FFmpeg) | D-12 acceptance gate; ffprobe-based verification of every export sample | ✓ | 8.1.1 | None needed (ffprobe is the gate) |
| Chrome / Chromium | Manual smoke test (unpacked load → `Сохранить отчёт` ("Save report") → inspect dist) | ✗ | n/a | Plan must call out "manual test requires Chrome ≥ 116; install via `apt install google-chrome-stable` or note the gap to the operator." |
| Playwright / chromium-test-runner | Optional headed-Chrome integration tests (see Validation Architecture) | ✗ | n/a | Phase 1 acceptance does NOT require Playwright. Manual smoke is acceptable per ROADMAP Phase 4. If we want unit-test coverage for the trim logic, Vitest in node mode is enough. |
| node_modules/ | `vite build`, `tsc` | ✗ | n/a | Run `npm install` at start of phase; no fallback. |

**Missing dependencies with no fallback (blocking execution):**

- `node_modules/`: must run `npm install` once before any TS/Vite work.
  Add as Wave 0 task.

**Missing dependencies with fallback (acceptable):**

- Chrome browser: manual smoke is Phase 4's job; for Phase 1, type-check
  plus ffprobe-on-test-fixture is the deepest automated gate. If the
  developer doesn't have Chrome installed, the plan still completes; the
  Phase 4 ROADMAP item is where Chrome becomes mandatory.
- Playwright: not needed; see Validation Architecture below for why.

## Validation Architecture

Nyquist validation is enabled (`workflow.tdd_mode: true` in
`.planning/config.json`). The validation strategy is layered:

### Test Framework

| Property | Value |
|----------|-------|
| Framework | **Vitest** (Node mode for pure logic; Browser mode if needed for `MediaRecorder` mocks). Recommended, NOT currently installed. Vite is already a dev dep so Vitest is a zero-friction add. |
| Config file | NONE. Wave 0 creates `vitest.config.ts`. |
| Quick run command | `npx vitest run --reporter=dot` (after install) |
| Full suite command | `npx vitest run` + `npm run build` (typecheck via `tsc --noEmit`) + ffprobe gate (D-12) |

**Why not Jest:** `vite` is already the build tool; Vitest is the
zero-config-mismatch choice. No transformer dance for TS.

**Why not Playwright:** `MediaRecorder` + `getDisplayMedia` ARE driveable
in Chromium via Playwright with permissions auto-granted, but the
acceptance gate (ffprobe on a real exported file) requires actually
running the extension. Manual smoke + ffprobe is sufficient for Phase 1.
Playwright-driven smoke tests are Phase 4/5 territory.

**What's testable in Node-only Vitest:**

- Ring buffer logic (`addChunk`, `trimAged`): pure function, takes
  `{data: {size: number}, timestamp: number, isHeader: boolean}[]` and
  returns the trimmed array. Mock `Blob` as `{size: N, type: 'video/webm'}`.
- Message handlers (mock `chrome.runtime` with `vitest-chrome` or a
  lightweight stub).
- Port lifecycle / reconnect logic.
- Codec strict-mode error path (mock `MediaRecorder.isTypeSupported`
  → false).

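To make the first bullet concrete, here is one way the ring-buffer functions might look as pure logic. This is a sketch, not the code in `src/offscreen/`: the `BufferedChunk` name, the `WINDOW_MS` constant, and the exact signatures are assumptions chosen to match the chunk shape quoted above and the 30 s window from the test map.

```typescript
// Chunk shape quoted in the list above; `data` is a Blob in the browser,
// but any object with a `size` works in Node-only tests.
interface BufferedChunk {
  data: { size: number };
  timestamp: number; // capture time in ms
  isHeader: boolean; // true only for the first chunk (WebM header + first cluster)
}

// Retention window: 30 s, per REQ-video-ring-buffer.
const WINDOW_MS = 30_000;

/** Appends a chunk; the very first chunk of a recording is flagged as the header. */
function addChunk(
  buffer: readonly BufferedChunk[],
  data: { size: number },
  now: number,
): BufferedChunk[] {
  return [...buffer, { data, timestamp: now, isHeader: buffer.length === 0 }];
}

/** Drops chunks older than the window but always keeps the header chunk. */
function trimAged(
  buffer: readonly BufferedChunk[],
  now: number,
  windowMs: number = WINDOW_MS,
): BufferedChunk[] {
  return buffer.filter((c) => c.isHeader || now - c.timestamp <= windowMs);
}
```

Because both functions return new arrays and take `now` as a parameter, a Vitest case can feed synthetic timestamps without fake timers or a real `Blob`.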
**What's NOT testable in Vitest, requires manual smoke / Phase 4:**

- The actual `getDisplayMedia` flow (browser picker).
- Real WebM playability (covered by ffprobe gate on a test-fixture file).
- SW idle-unload survival (covered by manual DevTools "Force stop" test
  in Phase 4 smoke checklist).

### Phase Requirements → Test Map

| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|--------------|
| REQ-video-ring-buffer | Ring buffer adds chunk; first chunk gets `isHeader: true` | unit | `npx vitest run tests/offscreen/ring-buffer.test.ts -t "first chunk is header"` | ❌ Wave 0 |
| REQ-video-ring-buffer | Ring buffer evicts chunks older than 30 s; keeps header | unit | `npx vitest run tests/offscreen/ring-buffer.test.ts -t "trim 30s"` | ❌ Wave 0 |
| REQ-video-ring-buffer | Codec strict-mode throws when vp9 unsupported (D-20) | unit | `npx vitest run tests/offscreen/codec-check.test.ts` | ❌ Wave 0 |
| REQ-video-ring-buffer | OFFSCREEN_READY message sent on listener registration | unit | `npx vitest run tests/offscreen/handshake.test.ts` | ❌ Wave 0 |
| REQ-video-ring-buffer | Port reconnect on disconnect within 1 s | unit | `npx vitest run tests/offscreen/port.test.ts -t "reconnects"` | ❌ Wave 0 |
| REQ-video-ring-buffer | SW deletes alarms keepalive (D-18) | type-check / grep | `! grep -RIn "chrome.alarms" src/background/` | NO CODE NEEDED (CI grep) |
| REQ-video-ring-buffer | SW deletes IndexedDB code path (D-19) | grep | `! grep -RIn "VideoRecorderDB\|openIndexedDB" src/` | NO CODE NEEDED (CI grep) |
| REQ-video-ring-buffer | `vite.config.ts:11-184` inline plugin deleted (D-08) | grep | `! grep -RIn "copy-offscreen\|chromeMediaSource" vite.config.ts` | NO CODE NEEDED |
| REQ-video-ring-buffer (acceptance gate D-12) | `last_30sec.webm` plays ffprobe-clean | integration (manual smoke + ffprobe) | `ffprobe -v error -f matroska -i sample/last_30sec.webm; echo $?` | ❌ Sample fixture produced manually for this gate, OR captured by Playwright in Phase 4. **For Phase 1, run on the file the manual smoke produces.** |
| REQ-video-ring-buffer | Type-check passes with zero `as any` and zero `@ts-ignore` regressions | static | `npx tsc --noEmit && ! grep -RIn "as any\|@ts-ignore" src/` | EXISTS (`tsc --noEmit` in `npm run build`) |
| REQ-video-ring-buffer | Manifest permission swap (D-A6 / D-05) | grep | `! grep "tabCapture" manifest.json && grep "desktopCapture" manifest.json` | NO CODE NEEDED |
| REQ-video-ring-buffer | Build produces a loadable extension | manual | `npm run build && ls dist/manifest.json dist/src/offscreen/index.html dist/assets/*.js` | NO TEST FILE; CI shell check |

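The D-20 row in the table above (codec strict-mode) lends itself to a tiny injectable check. The sketch below is illustrative only: `assertCodecSupported` is a hypothetical name, and the MIME string `video/webm;codecs=vp9` is an assumption about how the vp9 requirement is spelled in the recorder. In the offscreen document the predicate would be `MediaRecorder.isTypeSupported`.

```typescript
// Assumed spelling of the required container/codec pair (vp9 per D-20).
const REQUIRED_MIME = "video/webm;codecs=vp9";

/**
 * Strict mode: returns the MIME type if supported, otherwise throws.
 * There is deliberately no silent fallback to another codec.
 */
function assertCodecSupported(
  isTypeSupported: (mime: string) => boolean,
  mime: string = REQUIRED_MIME,
): string {
  if (!isTypeSupported(mime)) {
    throw new Error(`Required codec not supported by this browser: ${mime}`);
  }
  return mime;
}
```

Passing the predicate in, rather than reading the `MediaRecorder` global, is what lets `codec-check.test.ts` exercise the error path in Node with `() => false`.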
### Sampling Rate

- **Per task commit:** `npx vitest run --reporter=dot && npx tsc --noEmit`
  (≤ 10 s).
- **Per wave merge:** Full Vitest + `npm run build` + grep guards
  (≤ 30 s).
- **Phase gate (D-12):** Manually load `dist/` into Chrome, capture a
  test session, click save, extract `video/last_30sec.webm` from
  `~/Downloads/session_report_*.zip` (via `unzip -p`), run
  `ffprobe -v error -f matroska -i` on the extracted file, and confirm
  exit 0 with zero stderr lines.

### Wave 0 Gaps

- [ ] Install Vitest: `npm install -D vitest@^3 @vitest/ui` (verify
  current major via `npm view vitest version` at the time of install).
- [ ] `vitest.config.ts`: pull in path aliases from `tsconfig.json`.
- [ ] `tests/offscreen/` directory with at minimum:
  - `ring-buffer.test.ts`: covers REQ-video-ring-buffer trim & header
    pinning.
  - `codec-check.test.ts`: covers D-20 strict-mode error path.
  - `handshake.test.ts`: covers Pattern 4 OFFSCREEN_READY.
  - `port.test.ts`: covers Pattern 5 reconnect.
- [ ] `tests/fixtures/`: keep a known-good WebM for ffprobe sanity
  (e.g. produced once on a developer machine and committed). Used by
  CI to verify the ffprobe gate runs at all.
- [ ] `npm test` script in `package.json`: `"test": "vitest run"`.
- [ ] CI? Out of scope per audit P2 #22 (Phase 5).

## Security Domain

> Default per `.planning/config.json`: `security_enforcement` is absent →
> treated as enabled (per researcher contract).

### Applicable ASVS Categories

| ASVS Category | Applies | Standard Control |
|---------------|---------|------------------|
| V2 Authentication | No | No authentication surface in Phase 1 (local-only, no server). |
| V3 Session Management | No | No sessions. |
| V4 Access Control | Yes (limited) | Manifest permissions are the access-control boundary. Minimize: `desktopCapture` is unnecessary if we use only `getDisplayMedia` (web API), but harmless. `tabCapture` is being REMOVED. `host_permissions: ["<all_urls>"]` remains for content-script injection (Phase 2 territory). |
| V5 Input Validation | Yes (limited) | Stream IDs are not applicable (the new path doesn't use them), so the only "input" Phase 1 handles is inter-context messages. Each `chrome.runtime.onMessage` handler should validate `msg.type` against the typed `MessageType` enum (already exists in `src/shared/types.ts`). |
| V6 Cryptography | No | No crypto. |
| V14 Configuration | Yes | `manifest.json` enumerates the permission set verbatim. The Doc-Cascade tasks (D-A1..D-A6) keep `.planning/intel/constraints.md` in lockstep with `manifest.json`. |

### Known Threat Patterns for Chrome MV3 Extensions

| Pattern | STRIDE | Standard Mitigation |
|---------|--------|---------------------|
| Untrusted message origin (cross-extension message injection) | Spoofing | Every `chrome.runtime.onMessage` listener should check `sender.id === chrome.runtime.id`. The current code doesn't; Phase 1 should add it where it adds new listeners (low effort). |
| `<all_urls>` host permission exposes the SW to messages from any content script on any site | Tampering | Already in design (REQ-manifest-permissions). The mitigation is that the SW only processes messages from its own content script (validated by `sender.id` check). |
| Stored video buffer contains sensitive operator session data | Information Disclosure | CON-buffer-storage: in-memory only, no persistence. CONTEXT.md D-19 reinforces (no IndexedDB, no `chrome.storage.session`). |
| Captured video may show passwords typed into other apps (since `getDisplayMedia` can grab the whole screen) | Information Disclosure | OUT OF SCOPE per Phase 1: this is exactly the trade-off accepted in CONTEXT.md D-04. The Chrome "Sharing" banner is the user-facing mitigation. Phase 2's password masking applies to rrweb / event-log, not to video pixels. |
| `eval` or string-injected code | Tampering | The `vite.config.ts:35-213` inline-string offscreen JS is effectively static (no user input), but it IS string-injected build output. CSP for MV3 extensions disallows `eval`, but a long template literal is allowed. Phase 1 DELETES this, which is also a security improvement. |

**Phase 1 has no novel security surface** beyond the manifest swap (D-A6)
and the sender-id check best-practice.

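The sender-id check and the `msg.type` validation from the V5 row can be combined in one guard that each new listener calls first. The sketch below is an assumption-laden illustration: `isTrustedMessage` is a hypothetical helper, and `KNOWN_TYPES` is a stand-in for the real `MessageType` enum in `src/shared/types.ts` (only `OFFSCREEN_READY` is taken from this document; `EXAMPLE_REQUEST` is invented for the example).

```typescript
// Stand-in for the MessageType enum; EXAMPLE_REQUEST is a made-up placeholder.
const KNOWN_TYPES: readonly string[] = ["OFFSCREEN_READY", "EXAMPLE_REQUEST"];

// Only the part of chrome.runtime.MessageSender that the guard needs.
interface SenderLike {
  id?: string;
}

/**
 * Returns true only when the message comes from this extension
 * (sender.id equals our own id) and carries a known `type` string.
 */
function isTrustedMessage(
  sender: SenderLike,
  msg: unknown,
  ownId: string,
): msg is { type: string } {
  if (sender.id !== ownId) return false; // spoofing: another extension or page
  if (typeof msg !== "object" || msg === null) return false;
  const type = (msg as { type?: unknown }).type;
  return typeof type === "string" && KNOWN_TYPES.some((t) => t === type);
}
```

Inside a listener this would be called as `isTrustedMessage(sender, msg, chrome.runtime.id)`, returning early when it is false. Taking `ownId` as a parameter keeps the guard testable without the `chrome` global.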
## Sources

### Primary (HIGH confidence)

- developer.chrome.com: `chrome.offscreen` API reference, `Reason` enum
  values, including `DISPLAY_MEDIA`:
  <https://developer.chrome.com/docs/extensions/reference/api/offscreen>
  (confirmed via direct fetch on 2026-05-15).
- developer.chrome.com: Audio recording and screen capture guide,
  including the canonical "use offscreen + DISPLAY_MEDIA" sentence:
  <https://developer.chrome.com/docs/extensions/how-to/web-platform/screen-capture>
  (fetched verbatim via gh API on 2026-05-15).
- developer.chrome.com: Service worker lifecycle:
  <https://developer.chrome.com/docs/extensions/develop/concepts/service-workers/lifecycle>
  (fetched; confirms Chrome 110 "all events reset idle timer", Chrome
  114 "message via long-lived messaging keeps SW alive").
- developer.chrome.com: Longer extension SW lifetimes blog:
  <https://developer.chrome.com/blog/longer-esw-lifetimes>.
- developer.chrome.com: `chrome.alarms` API reference:
  <https://developer.chrome.com/docs/extensions/reference/api/alarms>
  (confirms 30 s minimum period (Chrome 120+) for store-loaded;
  unpacked has no limit).
- GoogleChrome/chrome-extensions-samples: `functional-samples/sample.tabcapture-recorder/`:
  <https://github.com/GoogleChrome/chrome-extensions-samples/tree/main/functional-samples/sample.tabcapture-recorder>
  (fetched all files via gh API; confirms the offscreen + USER_MEDIA
  pattern, the close cousin of our DISPLAY_MEDIA pattern).
- MDN: `MediaRecorder.start()`:
  <https://developer.mozilla.org/en-US/docs/Web/API/MediaRecorder/start>
  (confirms timeslice is purely time-based, NOT codec-aware).
- ffmpeg.org: ffprobe documentation:
  <https://ffmpeg.org/ffprobe.html> (exit code semantics for the D-12
  gate).

### Secondary (MEDIUM confidence, verified with multiple sources)

- bugzilla.mozilla.org #1666487: quote from Andreas Pehrson:
  <https://bugzilla.mozilla.org/show_bug.cgi?id=1666487>. Chrome's
  default keyframe cadence (`kf_max_dist=100`) cross-confirmed by
  Chrome's MediaRecorder README.
- crxjs/chrome-extension-tools: Discussion #919 "Set up offscreen with
  TypeScript":
  <https://github.com/crxjs/chrome-extension-tools/discussions/919>
  and follow-up #1060: working pattern for HTML + TS module entry.
- Mozilla Firefox bug #1666487: Pehrson's design rationale on
  timeslice-vs-keyframe.
- Graham King's blog: "Reading MediaRecorder's webm/opus output":
  <https://darkcoding.net/software/reading-mediarecorders-webm-opus-output/>
  (third-party EBML walkthrough; confirms that MediaRecorder doesn't
  split on SimpleBlock).
- chrome-extensions-samples issue #1111: "Sample for chrome.offscreen":
  <https://github.com/GoogleChrome/chrome-extensions-samples/issues/1111>
  (confirms there is NO official sample for DISPLAY_MEDIA + getDisplayMedia).
- ngocquy020196/Proscreen-S3: in-the-wild production extension:
  <https://github.com/ngocquy020196/Proscreen-S3/blob/main/src/background/recording.ts>
  plus `src/offscreen/recorder.ts`
  (confirms the exact CONTEXT.md D-01..D-05 architecture works in practice).
- schniti269/meeting_mate: second corroborating real extension:
  <https://github.com/schniti269/meeting_mate/blob/main/background.js>.
- crxjs.dev: Vite plugin docs:
  <https://crxjs.dev/vite-plugin/> (confirms manifest-driven entry, but
  multi-entry HTML needs `rollupOptions.input`).
- GitHub gist sunnyguan/f94058f66fab89e59e75b1ac1bf1a06e:
  MV3 keepalive patterns including port reconnect at 290 s.
- developer.chrome.com issue #2688: clarifies that the original
  "native messaging port keeps SW alive" claim has caveats.

### Tertiary (LOW confidence, flagged for cross-validation)

- chromium-extensions group thread on getDisplayMedia in offscreen:
  <https://groups.google.com/a/chromium.org/g/chromium-extensions/c/V09VMCLzvWM>.
  One thread suggests user-gesture issues in offscreen; this
  appears contradicted by Proscreen-S3 working. Resolution: empirical
  testing during Wave 1 (manual smoke).
- recall.ai blog post on how to build a Chrome recording extension:
  <https://www.recall.ai/blog/how-to-build-a-chrome-recording-extension>.
  Uses the tabCapture pattern (not our path), but confirms the high-level
  three-component split.
- Stack Overflow #62236838, concatenation of MediaRecorder WebM chunks:
  cited content via WebSearch results only (no direct fetch; site
  blocked); pattern matches what I confirmed via Graham King's blog and
  ts-ebml docs.

## Metadata

**Confidence breakdown:**

- Standard stack & versions: HIGH. All verified via `npm view`.
- Architecture (offscreen + DISPLAY_MEDIA + port keepalive): HIGH.
  Verified against (a) official Chrome docs, (b) Google sample
  (offscreen + USER_MEDIA, same architectural shape), (c) at least two
  in-the-wild production extensions doing the exact DISPLAY_MEDIA path.
- Ring-buffer pattern: MEDIUM-HIGH. The structural pattern is solid;
  the open question is cluster-boundary alignment of `start(2000)`,
  which is *the* assumption the ffprobe gate (D-12) and the D-13
  fallback are designed to handle.
- Common pitfalls: HIGH. Every pitfall ties to a specific audit defect
  or a citable Chrome doc / Chromium bug.
- Validation strategy: MEDIUM. The unit-testable surface is real and
  documented; the integration test gap (browser/picker) is genuine but
  accepted (Phase 4 territory).
- Security: HIGH for what's in scope; nothing exotic.

**Research date:** 2026-05-15

**Valid until:** 2026-06-15 (30 days, stable-ecosystem assumption).
Re-validate sooner if Chrome releases a 12X version that changes SW
lifecycle rules or the offscreen API stability promise. The most
volatile finding is A2 (5-minute port lifetime cap); the Chrome team has
been actively tuning this.

---

*Phase: 01-stabilize-video-pipeline*
*Research completed: 2026-05-15*