Files
mokosh/.planning/phases/01-stabilize-video-pipeline/01-04-PLAN.md
Mark e55c1ae5d6 docs(01): revise plan 04 wave per checker iteration 1
Plan 04 depends_on: ["02", "03"], so wave must be max(1, 2)+1 = 3, not 1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 16:50:10 +02:00

404 lines
21 KiB
Markdown

---
phase: 01-stabilize-video-pipeline
plan: 04
type: tdd
wave: 3
depends_on: ["02", "03"]
files_modified:
- src/offscreen/recorder.ts
autonomous: true
requirements:
- REQ-video-ring-buffer
requirements_addressed:
- REQ-video-ring-buffer
must_haves:
truths:
- "On module import, the offscreen opens exactly one port via `chrome.runtime.connect({ name: 'video-keepalive' })` AND emits exactly one `OFFSCREEN_READY` message via `chrome.runtime.sendMessage`"
- "Both the port-connect and the OFFSCREEN_READY send happen AFTER `chrome.runtime.onMessage.addListener` registration (Pattern 4 ordering)"
- "When the open port fires its registered `onDisconnect` listener, the module synchronously reconnects (Pitfall 4 mitigation)"
- "The port emits a `{ type: 'PING' }` postMessage on a ≤ 25 s interval (RESEARCH.md Pattern 5)"
- "Pre-emptive reconnect runs every 290 s (belt-and-braces against the ~5 min port-lifetime cap)"
- "The port handles incoming `REQUEST_BUFFER` messages by responding with `{ type: 'BUFFER', chunks: <getBuffer()> }`"
- "Running `npx vitest run tests/offscreen/handshake.test.ts tests/offscreen/port.test.ts` exits 0 (Plan 02 RED → GREEN)"
artifacts:
- path: "src/offscreen/recorder.ts"
provides: "Full port keepalive + handshake + REQUEST_BUFFER handler"
contains: "video-keepalive"
key_links:
- from: "src/offscreen/recorder.ts (port.onDisconnect)"
to: "src/offscreen/recorder.ts (connectPort)"
via: "synchronous reconnect call"
pattern: "onDisconnect"
- from: "src/offscreen/recorder.ts (REQUEST_BUFFER handler)"
to: "getBuffer()"
via: "port.postMessage({ type: 'BUFFER', chunks: getBuffer() })"
pattern: "REQUEST_BUFFER"
---
<objective>
Replace the import-time STUB bootstrap in Plan 03's `src/offscreen/recorder.ts`
with the full long-lived-port keepalive + OFFSCREEN_READY handshake +
REQUEST_BUFFER handler. This flips the two remaining RED tests from Plan 02
(`tests/offscreen/handshake.test.ts` and `tests/offscreen/port.test.ts`) to
GREEN.
Purpose: REQ-video-ring-buffer also covers the SW-keepalive contract per
D-17 (long-lived port replaces `chrome.alarms`) and the OFFSCREEN_READY
handshake per RESEARCH.md Pattern 4. Both behaviors are observable from a
pure-Node test that mocks `chrome.runtime` — exactly the TDD sweet spot for
a contract that survives SW unloads, port 5-min cap, and offscreen
bootstrap races.
Output: Updated `src/offscreen/recorder.ts` — only this single file is
touched. Plan 04 is an isolated unit-of-work; Plan 05 picks up the
SW-side `onConnect` handler in parallel (Plan 04 runs alongside Plan 05 in
Wave 1 with NO file overlap — Plan 04 owns offscreen-side, Plan 05 owns
SW-side).
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/phases/01-stabilize-video-pipeline/01-CONTEXT.md
@.planning/phases/01-stabilize-video-pipeline/01-RESEARCH.md
@.planning/phases/01-stabilize-video-pipeline/01-PATTERNS.md
@tests/offscreen/handshake.test.ts
@tests/offscreen/port.test.ts
@src/offscreen/recorder.ts
</context>
<interfaces>
<!-- Pinned by Plan 02 tests. Plan 04 must satisfy these and ONLY these. -->
Module-import side-effects (from tests/offscreen/handshake.test.ts):
- `chrome.runtime.onMessage.addListener` is called at least once
- `chrome.runtime.sendMessage({ type: 'OFFSCREEN_READY' })` is called exactly once
Module-import side-effects (from tests/offscreen/port.test.ts):
- `chrome.runtime.connect({ name: 'video-keepalive' })` is called exactly once (first test)
- Firing the registered onDisconnect listener causes `chrome.runtime.connect` to be called a second time within the synchronous tick (second test)
In-port message contract (per RESEARCH.md Pattern 5):
- Outbound: `{ type: 'PING' }` every 25 s (configurable constant)
- Outbound on REQUEST_BUFFER: `{ type: 'BUFFER', chunks: VideoChunk[] }` (the chunks are `getBuffer()`)
- Inbound listener: handles `{ type: 'REQUEST_BUFFER' }` from SW side
Existing exports unchanged: `addChunk`, `trimAged`, `getBuffer`, `resetBuffer`, `assertCodecSupported`, `VIDEO_BUFFER_DURATION_MS` (Plan 03's tests must continue to pass).
</interfaces>
<threat_model>
## Trust Boundaries
| Boundary | Description |
|----------|-------------|
| port connection origin | Any context inside the extension can call `chrome.runtime.connect`; non-extension contexts cannot (Chrome enforces). The offscreen-side connect is initiated BY the offscreen, so the offscreen is the trusting party (it knows where the port goes). The SW-side `onConnect` (in Plan 05) is the listening party — that side validates `sender.id === chrome.runtime.id`. |
| inbound port `onMessage` payload | The offscreen receives `REQUEST_BUFFER` over the port; the SW is the only caller (sender validated on SW side); inbound message validation here is defense-in-depth |
## STRIDE Threat Register
| Threat ID | Category | Component | Disposition | Mitigation Plan |
|-----------|----------|-----------|-------------|-----------------|
| T-1-04 | Spoofing — port hijack | Offscreen-side `port.onMessage` listener | mitigate | The port name `'video-keepalive'` is paired with the SW-side `onConnect` filter `p.name === 'video-keepalive' && p.sender?.id === chrome.runtime.id` (added in Plan 05). On the offscreen side: the port is opened BY the offscreen, so the offscreen is the trusting party and the SW is the only legitimate counterparty. As defense in depth, the offscreen-side `port.onMessage` handler explicitly switches on `msg.type` and ignores any unknown shape. Grep gate (Plan 05 owns the sender check, Plan 04 owns the type-switch on inbound port traffic): `grep -v '^#' src/offscreen/recorder.ts \| grep -cE "REQUEST_BUFFER" ` returns at least 1, AND any branch outside the known PortMessageType union is treated as no-op. |
| T-1-NEW-04-01 | DoS — port reconnect storm | offscreen `connectPort` → port.onDisconnect | mitigate | The reconnect path is idempotent: when `onDisconnect` fires, the existing port reference is cleared, a fresh port is opened, and the new port's listeners are attached. If reconnect throws (e.g., SW gone), the failure is caught and logged; the next pre-emptive 290 s reconnect cycle will retry. Grep gate: `grep -v '^#' src/offscreen/recorder.ts \| grep -cE "function connectPort"` returns 1. |
</threat_model>
<feature>
<name>Port keepalive + OFFSCREEN_READY handshake + REQUEST_BUFFER handler</name>
<files>src/offscreen/recorder.ts</files>
<behavior>
RED already exists (Plan 02 handshake + port tests). GREEN must:
Handshake (from tests/offscreen/handshake.test.ts):
- On module import, register `chrome.runtime.onMessage.addListener` BEFORE calling `chrome.runtime.sendMessage({ type: 'OFFSCREEN_READY' })`
- `OFFSCREEN_READY` is sent EXACTLY ONCE during module bootstrap (not on every subsequent message)
Port (from tests/offscreen/port.test.ts):
- On module import, call `chrome.runtime.connect({ name: 'video-keepalive' })` exactly once
- Register `onDisconnect` listener on the connected port
- When `onDisconnect` fires, immediately call `chrome.runtime.connect({ name: 'video-keepalive' })` again (synchronous reconnect)
- The reconnect attaches the same set of listeners (onMessage, onDisconnect, etc.) to the fresh port
Production behavior (not directly tested in Plan 02, but verifiable):
- Every 25 s while a port is open: `port.postMessage({ type: 'PING' })` (keepalive against SW idle)
- Every 290 s: pre-emptively close + reopen the port (Pitfall 4 — beat the ~5 min lifetime cap)
- On inbound port message `{ type: 'REQUEST_BUFFER' }`: respond with `{ type: 'BUFFER', chunks: getBuffer() }` (this is the SW's export-time data-pull path; Plan 05 wires the SW caller)
</behavior>
<implementation>
See task list — three tasks: RED-verify the two port + handshake test files, GREEN refactor the bootstrap section of recorder.ts to add the port lifecycle, then REFACTOR (skipped or minimal).
</implementation>
</feature>
<tasks>
<task type="auto">
<name>Task 1: RED-verify — confirm port + handshake tests fail on Plan 03's stub bootstrap</name>
<files>(none modified)</files>
<read_first>
- tests/offscreen/handshake.test.ts (Plan 02 output)
- tests/offscreen/port.test.ts (Plan 02 output)
- src/offscreen/recorder.ts (Plan 03 stub bootstrap — lines 144-155 of the Task-2 verbatim module)
</read_first>
<action>
Run the two test files this plan owns:
```bash
npx vitest run tests/offscreen/handshake.test.ts tests/offscreen/port.test.ts 2>&1 | tee /tmp/01-04-red.log
```
Analyse the failure mode:
- **Handshake test:** Plan 03's stub bootstrap already calls `chrome.runtime.sendMessage({ type: 'OFFSCREEN_READY' })` once after the listener registration, so this test MAY ALREADY PASS as of Plan 03's GREEN landing. If it does — that is acceptable. The handshake contract is sufficiently met by Plan 03's stub.
- **Port test (`connects on module load`):** Plan 03's stub calls `chrome.runtime.connect({ name: PORT_NAME })` once — this test MAY ALREADY PASS too.
- **Port test (`reconnects when port disconnects`):** Plan 03's stub does NOT register `onDisconnect`, so when the test fires the registered listeners, `connectCount` stays at 1 — this test MUST FAIL.
The intended RED gate for Plan 04 is the reconnect test. Confirm:
```bash
grep -q "reconnects when port disconnects" /tmp/01-04-red.log && grep -q "FAIL" /tmp/01-04-red.log
```
If the reconnect test passes spuriously (e.g., because Plan 03 already added reconnect): STOP — the plan ordering may have shifted; reconcile with Plan 03 before continuing.
If the handshake test or the `connects on module load` test fail: STOP — Plan 03 broke its own contract; revisit Plan 03 instead.
Otherwise, proceed to Task 2.
No commits in this task.
</action>
<verify>
<automated>npx vitest run tests/offscreen/port.test.ts -t "reconnects when port disconnects" 2>&1 | grep -qE "FAIL|✗"</automated>
</verify>
<acceptance_criteria>
- `/tmp/01-04-red.log` exists and is non-empty
- The reconnect test is in a FAIL state
- The handshake test and the `connects on module load` port test outcomes are recorded in /tmp/01-04-red.log (whether they pass or fail — we don't gate on them here)
</acceptance_criteria>
<done>RED gate confirmed: the reconnect test is failing because Plan 03's stub bootstrap doesn't reconnect on disconnect. Proceed to GREEN.</done>
</task>
<task type="auto" tdd="true">
<name>Task 2: GREEN — replace stub bootstrap with full port lifecycle</name>
<files>src/offscreen/recorder.ts</files>
<read_first>
- src/offscreen/recorder.ts (Plan 03 output — the bootstrap section starting at the `// ─── Bootstrap` comment through the `void keepalivePort;` line)
- .planning/phases/01-stabilize-video-pipeline/01-RESEARCH.md §"Pattern 5: Long-lived port" (lines 590-648) and §"Pitfall 4" (lines 820-836)
- .planning/phases/01-stabilize-video-pipeline/01-PATTERNS.md §`src/offscreen/recorder.ts` — port keepalive snippet (lines 185-198)
- tests/offscreen/handshake.test.ts (the test that pins OFFSCREEN_READY ordering)
- tests/offscreen/port.test.ts (the test that pins reconnect-on-disconnect)
</read_first>
<behavior>
After Task 2 lands:
```bash
npx vitest run tests/offscreen/handshake.test.ts tests/offscreen/port.test.ts
# exit 0 — all 3 tests pass (1 in handshake, 2 in port)
npx vitest run tests/offscreen/ring-buffer.test.ts tests/offscreen/codec-check.test.ts
# exit 0 — ring-buffer and codec tests still green (NO regression from Plan 03)
```
</behavior>
<action>
Edit `src/offscreen/recorder.ts` to REPLACE the entire bootstrap section (currently starting at the `// ─── Bootstrap (Plan 04 wires the full port + handshake)` comment through the `void keepalivePort;` line) with the VERBATIM expanded bootstrap below. The pure-function exports above the bootstrap section and the D-13 fallback comment block below it are UNCHANGED.
Also: at the top of the file, ADD a new constant block (after the existing `const PORT_NAME = ...` line and before the `const logger = new OffscreenLogger(...)` line):
```typescript
const PORT_PING_MS = 25_000; // < 30 s SW idle threshold
const PORT_RECONNECT_MS = 290_000; // pre-empt the ~5 min port cap (Pitfall 4)
```
The bootstrap section becomes VERBATIM:
```typescript
// ─── Bootstrap: handshake + port lifecycle (D-17, RESEARCH.md Patterns 4 & 5) ──
function isFromOwnExtension(sender: chrome.runtime.MessageSender | undefined): boolean {
return sender?.id === chrome.runtime.id;
}
chrome.runtime.onMessage.addListener((message: Message, sender, sendResponse) => {
if (!isFromOwnExtension(sender)) {
return false;
}
switch (message.type) {
case 'START_RECORDING':
startRecording().then(() => sendResponse({ ok: true })).catch((err) => sendResponse({ ok: false, error: String(err) }));
return true;
case 'STOP_RECORDING':
stopRecording();
sendResponse({ ok: true });
return false;
default:
return false;
}
});
// Stable handles for the ping interval and the pre-emptive reconnect timer,
// so we can clear them on disconnect / re-init.
let pingIntervalId: ReturnType<typeof setInterval> | null = null;
let preemptiveReconnectId: ReturnType<typeof setTimeout> | null = null;
function teardownPortTimers(): void {
if (pingIntervalId !== null) {
clearInterval(pingIntervalId);
pingIntervalId = null;
}
if (preemptiveReconnectId !== null) {
clearTimeout(preemptiveReconnectId);
preemptiveReconnectId = null;
}
}
function onPortMessage(message: unknown): void {
// Defense-in-depth: explicit shape check before destructuring
if (typeof message !== 'object' || message === null) {
return;
}
const type = (message as { type?: unknown }).type;
if (type === 'REQUEST_BUFFER') {
if (keepalivePort !== null) {
keepalivePort.postMessage({ type: 'BUFFER', chunks: getBuffer() });
}
}
// Any unknown port message type is silently dropped (T-1-04 defense-in-depth).
}
function connectPort(): void {
teardownPortTimers();
try {
keepalivePort = chrome.runtime.connect({ name: PORT_NAME });
} catch (err) {
logger.error('port connect failed:', err);
keepalivePort = null;
return;
}
keepalivePort.onMessage.addListener(onPortMessage);
keepalivePort.onDisconnect.addListener(() => {
logger.warn('port disconnected — reconnecting');
teardownPortTimers();
keepalivePort = null;
// Synchronous reconnect — tests/offscreen/port.test.ts pins this
connectPort();
});
pingIntervalId = setInterval(() => {
keepalivePort?.postMessage({ type: 'PING' });
}, PORT_PING_MS);
preemptiveReconnectId = setTimeout(() => {
logger.log('pre-emptive port reconnect (290 s cap)');
keepalivePort?.disconnect();
// onDisconnect handler above triggers a fresh connectPort() call
}, PORT_RECONNECT_MS);
}
// Open the first port and emit OFFSCREEN_READY. Order matters per Pattern 4:
// the onMessage listener registration above already happened, so the SW can
// safely send START_RECORDING the moment it sees OFFSCREEN_READY.
connectPort();
chrome.runtime.sendMessage({ type: 'OFFSCREEN_READY' });
```
After this replacement, the previous `let keepalivePort: chrome.runtime.Port | null = null;` declaration in the module-level state block is UNCHANGED — it stays. The previous `void keepalivePort;` line is REMOVED (the variable is now used by `onPortMessage` and `connectPort`, so the no-unused-locals workaround is no longer needed).
Run:
```bash
npx vitest run
npx tsc --noEmit
```
Both MUST exit 0.
Notes:
- `setInterval` / `setTimeout` in offscreen documents: offscreen runs in a real DOM context (NOT in the SW), so timers are reliable — unlike `setTimeout` inside a SW. RESEARCH.md "Don't Hand-Roll" table confirms this.
- The `disconnect()` call inside the pre-emptive reconnect uses optional chaining; if the port is somehow already null, the call is a no-op. The `onDisconnect` listener fires synchronously in Chrome's port-disconnect path, which transitions us back through `connectPort()` cleanly.
- `PortMessage` shape: outbound traffic uses the `PortMessage` type added to `src/shared/types.ts` in Plan 03. No `as any`. Inbound messages over the port are typed `unknown` and shape-checked.
</action>
<verify>
<automated>npx vitest run && npx tsc --noEmit</automated>
</verify>
<acceptance_criteria>
- `npx vitest run` exits 0 with ALL test files in tests/offscreen/ passing (8 tests total across 4 files: 4 ring-buffer + 2 codec-check + 1 handshake + 2 port)
- `npx tsc --noEmit` exits 0
- `grep -c "function connectPort" src/offscreen/recorder.ts` returns 1
- `grep -c "PORT_PING_MS = 25_000" src/offscreen/recorder.ts` returns 1
- `grep -c "PORT_RECONNECT_MS = 290_000" src/offscreen/recorder.ts` returns 1
- `grep -c "REQUEST_BUFFER" src/offscreen/recorder.ts` returns at least 1 (the inbound handler)
- `grep -c "'BUFFER'" src/offscreen/recorder.ts` returns at least 1 (the outbound response shape)
- `grep -c "isFromOwnExtension" src/offscreen/recorder.ts` returns at least 1 (sender check on onMessage — T-1-04 defense-in-depth)
- `grep -c "as any" src/offscreen/recorder.ts` returns 0
- `grep -c "@ts-ignore" src/offscreen/recorder.ts` returns 0
- `grep -c "void keepalivePort" src/offscreen/recorder.ts` returns 0 (the no-unused-locals workaround is removed)
</acceptance_criteria>
<done>All 4 test files from Plan 02 are now GREEN. The offscreen module owns the full port lifecycle. Plan 05 can wire the SW-side `onConnect` handler against the same port name with confidence.</done>
</task>
<task type="auto">
<name>Task 3: REFACTOR — optional cleanup or no-op</name>
<files>src/offscreen/recorder.ts (only if changed)</files>
<read_first>
- src/offscreen/recorder.ts (Task 2 output)
</read_first>
<action>
Per the TDD REFACTOR phase rules:
1. Re-read `src/offscreen/recorder.ts` (the full module now that Plan 04 has landed).
2. Look ONLY for obvious improvements:
- Constants that should be hoisted into a shared block
- Type assertions that could be replaced with proper type narrowing
- Comments that became stale after the bootstrap refactor
3. If no obvious improvements: SKIP THIS TASK (do not commit). Note in summary: "Refactor: no changes needed."
4. If improvements found: make them; run `npx vitest run && npx tsc --noEmit`; commit ONLY if both exit 0.
Do NOT use this task to add features. Plan 05 picks up the SW-side wiring — do not anticipate it in this module.
</action>
<verify>
<automated>npx vitest run && npx tsc --noEmit</automated>
</verify>
<acceptance_criteria>
- All 4 test files still pass after the refactor pass (or after the no-op decision)
- `npx tsc --noEmit` still exits 0
- The SUMMARY.md documents whether refactor happened or was skipped
</acceptance_criteria>
<done>REFACTOR phase complete (or explicitly skipped with rationale).</done>
</task>
</tasks>
<verification>
After all three tasks land:
1. `npx vitest run` — exits 0 with ALL 8 tests passing across the 4 test files in tests/offscreen/.
2. `npx tsc --noEmit` — exits 0.
3. `grep -c "as any\|@ts-ignore" src/offscreen/recorder.ts` returns 0.
4. The grep gates listed in Task 2's `acceptance_criteria` all return their expected values.
5. The module's line count grew modestly (Task 2 adds ~40 lines net to the bootstrap section); the file remains well under 250 lines total.
Commit cadence:
- Task 1: no commit.
- Task 2: ONE commit (`feat(01-04): wire offscreen port keepalive and OFFSCREEN_READY handshake`).
- Task 3: zero or one commit (`refactor(01-04): {what was cleaned}`).
Total: 1-2 commits.
</verification>
<success_criteria>
- All 4 Plan-02 test files green (ring-buffer, codec-check, handshake, port) — 8 tests total
- `src/offscreen/recorder.ts` contains a single `connectPort()` function that handles connect, reconnect, REQUEST_BUFFER response, and pre-emptive reconnect
- T-1-04 mitigations in place: sender-id check on onMessage; type-switch on inbound port messages
- No `as any`, no `@ts-ignore`
- D-17 contract (long-lived port replacing alarms) is implemented at the offscreen side; SW side lands in Plan 05
</success_criteria>
<output>
After completion, create `.planning/phases/01-stabilize-video-pipeline/01-04-SUMMARY.md` with:
- RED: log excerpt from Task 1 showing the reconnect test failing
- GREEN: full vitest output showing all 8 tests passing
- REFACTOR: what changed (if anything) in Task 3
- Final line count for src/offscreen/recorder.ts
- Note: "Plan 05 must add a corresponding `chrome.runtime.onConnect.addListener` in `src/background/index.ts` that (a) filters by `port.name === 'video-keepalive'`, (b) validates `port.sender?.id === chrome.runtime.id` (T-1-04), (c) sends `{ type: 'REQUEST_BUFFER' }` over the port when SAVE_ARCHIVE arrives, and (d) handles the incoming `BUFFER` response by resolving the saveArchive Promise."
- Commit list (1-2 commits)
</output>