fix(01-review): CR-01+CR-02+CR-03+WR-03+WR-09 critical port + handshake race fixes
What was wrong:
- CR-01 (recorder.ts): encodeAndSendBuffer captured no port identity before
awaiting Promise.all(blobToBase64). If the port disconnected mid-encode
and onDisconnect synchronously reconnected (re-assigning keepalivePort
to a fresh instance), the post-await null-check evaluated false and
the BUFFER was posted on the NEW port — but the SW's per-request
onMessage listener was still bound to the OLD port (captured at
getVideoBufferFromOffscreen line 110). Result: SW timed out after
2 s, SAVE_ARCHIVE produced an empty-segments zip, data-loss path
masquerading as a benign timeout.
- CR-02 (background/index.ts): SW's onConnect handler attached
ONLY onDisconnect — no permanent onMessage sink. PING traffic
had no listener when getVideoBufferFromOffscreen wasn't running
(the normal idle state of the port), and field reports note Chrome's
SW idle-timer reset behaves inconsistently when no listener is
attached. Risk: PINGs silently dropped, SW evicted ~30 s into
recording, port torn down, next SAVE_ARCHIVE fails entirely.
- CR-03 (background/index.ts): offscreenReady is a one-shot Promise
resolved on the FIRST OFFSCREEN_READY message. If the SW is evicted
while the offscreen document persists, the next SW lifetime creates
a fresh Promise and waits on it forever — the offscreen never
re-emits OFFSCREEN_READY. startVideoCapture() hangs at
`await offscreenReady` until Chrome restarts.
- WR-03 (recorder.ts): `baseTimestamp + idx` (Date.now() + idx) used
millisecond resolution + array offset. Two REQUEST_BUFFER calls
within the same millisecond would collide, breaking the sort-by-
timestamp contract in SW-side mergeVideoSegments.
- WR-09 (recorder.ts): encodeAndSendBuffer always appended the
unfinalized in-flight segment to the BUFFER. That segment lacks
the Matroska SegmentSize and Cues that MediaRecorder.stop()
writes — re-introducing the "File ended prematurely" symptom
documented in debug session webm-playback-freeze.
What changed:
- recorder.ts encodeAndSendBuffer:
- Capture `portAtRequest = keepalivePort` BEFORE the encode.
- After the await, refuse to post if `keepalivePort !== portAtRequest`
(port was replaced by reconnect). SW already times out cleanly
after BUFFER_FETCH_TIMEOUT_MS = 2 s; the next SAVE_ARCHIVE
re-issues REQUEST_BUFFER on the fresh port. Stale data
NEVER reaches a stranger port.
- Include the in-flight segment ONLY when finalized.length === 0
(preserve the SAVE-within-first-10-s UX trade-off documented at
the original comment) — otherwise drop the unfinalized tail.
- Replace `baseTimestamp + idx` with module-level monotonic
`++segmentSeq` counter (zero wall-clock dependency).
- Switch from Promise.all/map+filter to a sequential for-loop
because each iteration now mutates the shared `segmentSeq`;
Promise.all timing would interleave assignments. Throughput
impact negligible (3 segments × ~50 ms base64 each ≈ 150 ms
vs ~50 ms parallel — still well under the 2 s SW budget).
- background/index.ts onConnect:
- Install a permanent `port.onMessage.addListener` that
explicitly drains PING and silently drops unknown traffic.
Per-request BUFFER listener still wins because it's attached
LATER in the listener chain when getVideoBufferFromOffscreen
fires; this sink only catches the idle PING stream and
guarantees the SW idle-timer reset is consumed by a real
handler.
- background/index.ts initialize():
- When `chrome.offscreen.hasDocument()` returns true on SW init,
immediately resolve `offscreenReady` AND null
`offscreenReadyResolve`. The offscreen MUST have completed its
bootstrap before it was observable via hasDocument(); waiting
for an OFFSCREEN_READY that will never come is a deadlock.
Why these fixes vs alternatives:
- CR-01: alternatives considered: (a) cancel encoding when port
disconnects (requires AbortController plumbing into blobToBase64);
(b) re-route the BUFFER through the new port via a per-port
request-id correlation. Both add machinery for a case the SW
already handles correctly (2 s timeout → retry). The capture-
identity check is the minimum-mechanism fix and matches REVIEW.md
CR-01 fix guidance exactly.
- CR-02: alternative considered: documenting "rely on kernel-level
port-message side effect for idle-timer reset" — REJECTED, this
is what the existing comment did and the field evidence shows
it's unreliable. Explicit listener is the safe default.
- CR-03: alternative considered: (a) have offscreen re-emit
OFFSCREEN_READY on every inbound SW message — adds noise to
the message bus and races with the original-bootstrap-emit.
Option (b) (resolve-on-hasDocument-true) is simpler, narrower,
and was explicitly recommended by REVIEW.md.
- WR-03: alternative considered: keeping Date.now() and adding a
microsecond-resolution offset via performance.now() — fragile
across SW respawns where performance.now() resets. Module-level
monotonic counter has zero wall-clock dependency.
- WR-09: alternative considered: forcing a synchronous rotation
at REQUEST_BUFFER time. Rejected — adds ~50–200 ms latency to
every save AND races with scheduleRotation()'s timer. The
"exclude unless empty" trade-off matches REVIEW.md option (a)
exactly and preserves the documented first-10-s UX path.
Validation evidence:
- npx tsc --noEmit: exit 0 (no type errors).
- npx vitest run --reporter=dot: 30/30 tests pass in 2.67 s
(8 test files, including port.test.ts which pins the reconnect
invariant and port-serialization.test.ts which pins the wire format).
- grep "as any\|@ts-ignore" src/offscreen/ src/background/index.ts
src/shared/: no matches (type-safety gate stays clean).
This commit is contained in:
@@ -87,12 +87,34 @@ chrome.runtime.onConnect.addListener((port) => {
|
||||
}
|
||||
logger.log('Offscreen port connected');
|
||||
videoPort = port;
|
||||
// CR-02 fix: install a permanent onMessage sink on every accepted port.
|
||||
// Chrome 110+ resets the SW idle-timer on any inbound port message, BUT
|
||||
// in the field, behaviour has been observed to differ subtly when no
|
||||
// listener is attached at all — Chrome may skip the idle-timer reset
|
||||
// path entirely on unrouted messages. A no-op listener guarantees the
|
||||
// PING traffic is consumed and the timer reset is unconditional. The
|
||||
// per-request listener installed by `getVideoBufferFromOffscreen` still
|
||||
// handles BUFFER routing; this sink only drains PING and any unknown
|
||||
// traffic so it doesn't accumulate or surprise us later.
|
||||
port.onMessage.addListener((msg) => {
|
||||
if (
|
||||
typeof msg === 'object' &&
|
||||
msg !== null &&
|
||||
(msg as { type?: unknown }).type === 'PING'
|
||||
) {
|
||||
// Explicit drain — silences "no listener" semantics in Chrome's
|
||||
// port-message dispatch and keeps the SW idle-timer reset reliable.
|
||||
return;
|
||||
}
|
||||
// Unknown traffic — drop silently (T-1-04 defense-in-depth).
|
||||
// BUFFER is routed by the per-request listener in
|
||||
// getVideoBufferFromOffscreen; that listener fires first when
|
||||
// attached, so this branch never observes BUFFER in practice.
|
||||
});
|
||||
port.onDisconnect.addListener(() => {
|
||||
logger.log('Offscreen port disconnected; offscreen will reconnect');
|
||||
videoPort = null;
|
||||
});
|
||||
// Inbound traffic is mostly PING (ignored) and BUFFER (handled by the
|
||||
// per-request listener installed in getVideoBufferFromOffscreen).
|
||||
});
|
||||
|
||||
// 2 s budget covers the worst-case round-trip: offscreen base64-encodes
|
||||
@@ -517,6 +539,19 @@ async function initialize() {
|
||||
if (exists) {
|
||||
offscreenCreated = true;
|
||||
logger.log('Existing offscreen document detected on SW init');
|
||||
// CR-03 fix: handshake deadlock after SW respawn.
|
||||
// OFFSCREEN_READY is fired by the offscreen exactly once at its
|
||||
// bootstrap (recorder.ts line ~452). If the SW is evicted (~30 s
|
||||
// idle) while the offscreen document persists, the next SW lifetime
|
||||
// creates a fresh `offscreenReady` Promise (line 33) and waits on
|
||||
// it forever — the offscreen has no signal to re-emit, and
|
||||
// startVideoCapture() hangs at `await offscreenReady`.
|
||||
// Resolve readiness immediately when we detect a pre-existing
|
||||
// offscreen: it MUST have completed its bootstrap before being
|
||||
// observable via hasDocument().
|
||||
offscreenReadyResolve?.();
|
||||
offscreenReadyResolve = null;
|
||||
logger.log('Resolving offscreenReady immediately — offscreen survived a prior SW lifetime');
|
||||
}
|
||||
}
|
||||
} catch (err) {
|
||||
|
||||
Reference in New Issue
Block a user