feat(option-c-sw): request-id'd BUFFER routing + retry on port replacement + PONG echo
Implements the SW-side architectural refactor per
.planning/debug/empty-archive-port-race.md "Fix Strategy: Option C":
1. **Request-id'd protocol** — getVideoBufferFromOffscreen generates a
request id (crypto.randomUUID, else a Date.now() + Math.random string) and sends
{type:'REQUEST_BUFFER', requestId} on the live videoPort. The
per-request listener pattern is GONE; replaced by a module-level
pendingBufferRequests Map<requestId, PendingBufferRequest>. The
onConnect-level message sink routes BUFFER -> resolve by id.
2. **Stale BUFFER routing** — BUFFER messages without a matching
requestId in the Map are silently dropped (no cross-talk). BUFFER
without a valid requestId at all is rejected with a warn (Option C
protocol requires the id).
3. **Retry on port replacement** — every onConnect (post-bootstrap)
scans pendingBufferRequests and re-issues REQUEST_BUFFER on the
fresh port with the SAME requestId. The offscreen posts BUFFER on
the current keepalivePort (see prior offscreen commit), the sink
matches by id, and the request resolves. This retires the H2
silent-drop class architecturally — the BUFFER reaches the SW
regardless of port-replacement timing.
4. **PING -> PONG echo** — the sink replies to every PING with PONG.
This closes the offscreen's health-probe loop (it counts missed PONGs
and reconnects once MAX_MISSED_PONGS is exceeded; see prior offscreen
commit). The PONG post is wrapped in try/catch to absorb the same
port-closed-mid-response race that the offscreen ping path handles.
5. **Outer hard-timeout bumped 2s -> 10s** — the legacy per-port
BUFFER_FETCH_TIMEOUT_MS = 2000 was too tight to retry across a
reconnect. The new outer budget covers EVERY retry across port
replacements; the inner round-trip is still ~100-200 ms.
6. **decodeBufferSegments extracted** — pulled out of the legacy
inline handler so the new onConnect sink can decode wire segments
without duplicating the logic. Preserves WR-07 (empty wire segment
filter) and base64ToBlob defensive catch behaviour. Closes the
pre-existing implicit-undefined-return path the legacy flatMap
catch had (tsc happy but semantically ambiguous).
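
As a rough illustration of items 1 to 4 (not the committed code), the following self-contained TypeScript sketch models the routing. `PortLike`, `WireMessage` and `BufferRouter` are hypothetical stand-ins for `chrome.runtime.Port` and the module-level state; segments are plain strings, the id generator is a counter, and a request made while no port is connected simply waits for the next connect. All of these are simplifications assumed for the example.

```typescript
// Hypothetical stand-in for chrome.runtime.Port: only postMessage is modelled.
interface PortLike {
  postMessage(msg: unknown): void;
}

// Hypothetical wire shape; the real BUFFER payload carries base64 segments.
interface WireMessage {
  type?: unknown;
  requestId?: unknown;
  segments?: string[];
}

interface Pending {
  requestId: string;
  resolve: (segments: string[]) => void;
  hardTimer: ReturnType<typeof setTimeout>;
}

class BufferRouter {
  private pending = new Map<string, Pending>();
  private port: PortLike | null = null;
  private nextId = 0;
  private timeoutMs: number;

  constructor(timeoutMs: number) {
    this.timeoutMs = timeoutMs;
  }

  // Adopt the fresh port, then re-issue every in-flight request on it
  // with its ORIGINAL requestId (retry on port replacement).
  connect(port: PortLike): void {
    this.port = port;
    for (const p of this.pending.values()) {
      this.post({ type: 'REQUEST_BUFFER', requestId: p.requestId });
    }
  }

  // Only clear the current port if it is the one that went away.
  disconnect(port: PortLike): void {
    if (this.port === port) this.port = null;
  }

  // Register the request, arm one outer timer covering all retries,
  // and post on whatever port is current. Returns the request id.
  request(resolve: (segments: string[]) => void): string {
    const requestId = `req-${this.nextId++}`;
    const hardTimer = setTimeout(() => {
      this.pending.delete(requestId);
      resolve([]); // outer budget exhausted: resolve empty
    }, this.timeoutMs);
    this.pending.set(requestId, { requestId, resolve, hardTimer });
    this.post({ type: 'REQUEST_BUFFER', requestId });
    return requestId;
  }

  // Shared message sink. Returns true only when a pending request resolved.
  handle(msg: WireMessage, from: PortLike): boolean {
    if (msg.type === 'PING') {
      try {
        from.postMessage({ type: 'PONG' });
      } catch {
        // Port closed between PING and PONG: nothing to do.
      }
      return false;
    }
    if (msg.type !== 'BUFFER' || typeof msg.requestId !== 'string') return false;
    const p = this.pending.get(msg.requestId);
    if (p === undefined) return false; // stale or unknown id: drop
    clearTimeout(p.hardTimer);
    this.pending.delete(msg.requestId);
    p.resolve(msg.segments ?? []);
    return true;
  }

  private post(msg: unknown): void {
    try {
      this.port?.postMessage(msg);
    } catch {
      // Post failed: the next connect() re-issues the request.
    }
  }
}
```

Because the pending map is keyed by id rather than by Port object, a response arriving on a replacement port resolves the original request, and a second BUFFER with the same id finds no entry and is dropped.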
Status: 51 GREEN, 1 RED. The remaining RED (createArchive must throw
on empty video, surfacing to operator) is addressed in the next commit.
Pinning contracts (D-12 port-serialization, D-13 segment-rotation,
A3 webm-playback) untouched. tsc --noEmit exit 0; type-safety grep clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -71,108 +71,55 @@ async function ensureOffscreen() {
   }
 }
 
-// SW-side port host (D-17, RESEARCH.md Pattern 5). The offscreen opens this
-// port on bootstrap and reconnects on disconnect. We use it for: (a)
-// keepalive traffic (PING) — Chrome 110+ resets the SW idle timer on every
-// port message; (b) on-demand REQUEST_BUFFER round-trip during SAVE_ARCHIVE.
-chrome.runtime.onConnect.addListener((port) => {
-  // T-1-04 mitigation: only accept ports from this extension
-  if (port.name !== 'video-keepalive') {
-    return;
-  }
-  if (port.sender?.id !== chrome.runtime.id) {
-    logger.warn('Rejecting port with mismatched sender:', port.sender?.id);
-    port.disconnect();
-    return;
-  }
-  logger.log('Offscreen port connected');
-  videoPort = port;
-  // CR-02 fix: install a permanent onMessage sink on every accepted port.
-  // Chrome 110+ resets the SW idle-timer on any inbound port message, BUT
-  // in the field, behaviour has been observed to differ subtly when no
-  // listener is attached at all — Chrome may skip the idle-timer reset
-  // path entirely on unrouted messages. A no-op listener guarantees the
-  // PING traffic is consumed and the timer reset is unconditional. The
-  // per-request listener installed by `getVideoBufferFromOffscreen` still
-  // handles BUFFER routing; this sink only drains PING and any unknown
-  // traffic so it doesn't accumulate or surprise us later.
-  port.onMessage.addListener((msg) => {
-    if (
-      typeof msg === 'object' &&
-      msg !== null &&
-      (msg as { type?: unknown }).type === 'PING'
-    ) {
-      // Explicit drain — silences "no listener" semantics in Chrome's
-      // port-message dispatch and keeps the SW idle-timer reset reliable.
-      return;
-    }
-    // Unknown traffic — drop silently (T-1-04 defense-in-depth).
-    // BUFFER is routed by the per-request listener in
-    // getVideoBufferFromOffscreen; that listener fires first when
-    // attached, so this branch never observes BUFFER in practice.
-  });
-  port.onDisconnect.addListener(() => {
-    logger.log('Offscreen port disconnected; offscreen will reconnect');
-    videoPort = null;
-  });
-});
+// Outer-bound buffer fetch budget. Larger than the legacy
+// BUFFER_FETCH_TIMEOUT_MS (was 2 s; per-port-attempt) because the new
+// architecture covers MULTIPLE port-replacement retries inside one outer
+// budget. 10 s is generous: the inner per-port encode round-trip is
+// still ~100-200 ms; the extra headroom covers up to ~50 reconnect
+// cycles before the operator-visible error surfaces.
+const BUFFER_FETCH_TIMEOUT_MS = 10_000;
 
-// 2 s budget covers the worst-case round-trip: offscreen base64-encodes
-// up to ~15 chunks of ~100 KB each (~1.5 MB raw → ~2 MB base64) in
-// well under 100 ms, post-message + JSON parse adds < 50 ms, leaving
-// plenty of headroom. Bumping later is cheap if real-world recordings
-// produce significantly larger buffers; today this is sufficient.
-const BUFFER_FETCH_TIMEOUT_MS = 2_000;
+// Option C: in-flight REQUEST_BUFFER requests keyed by requestId. The
+// onConnect-level message sink routes BUFFER -> resolve by id, so port
+// replacement (videoPort changes mid-request) does NOT lose the
+// response — the offscreen posts BUFFER on the CURRENT port (whichever
+// that is) and our sink picks it up regardless of which Port object it
+// arrives on.
+interface PendingBufferRequest {
+  resolve: (resp: VideoBufferResponse) => void;
+  hardTimer: ReturnType<typeof setTimeout>;
+  requestId: string;
+}
+const pendingBufferRequests: Map<string, PendingBufferRequest> = new Map();
 
-async function getVideoBufferFromOffscreen(): Promise<VideoBufferResponse> {
-  if (videoPort === null) {
-    logger.warn('No offscreen port available; returning empty buffer');
-    return { segments: [] };
-  }
-  const port = videoPort;
-  return new Promise<VideoBufferResponse>((resolve) => {
-    const timer = setTimeout(() => {
-      port.onMessage.removeListener(handler);
-      // Sweep #5 fix: surface the diagnostic when the timeout fires
-      // because the port was replaced by a reconnect mid-request.
-      // The OLD port (captured as `port`) has a dead listener; the
-      // offscreen will encode-and-send on the NEW port but the
-      // listener installed there belongs to a different
-      // getVideoBufferFromOffscreen call (if any). Without this
-      // diagnostic the operator sees a silent timeout that masquerades
-      // as an offscreen-side problem. With it, the SW log shows the
-      // reconnect timing was the proximate cause.
-      const portReplaced = videoPort !== port;
-      logger.warn(
-        `Buffer fetch timed out after ${BUFFER_FETCH_TIMEOUT_MS} ms`,
-        'port_replaced_during_fetch:', portReplaced,
-      );
-      resolve({ segments: [] });
-    }, BUFFER_FETCH_TIMEOUT_MS);
-    const handler = (msg: unknown) => {
+// Generate a per-request correlation id. Uses crypto.randomUUID when
+// available (Chrome 92+ in SW context per
+// https://developer.chrome.com/docs/extensions/reference/api/runtime#secure_origin),
+// with a Math.random fallback that's still unique enough for in-process
+// routing — collisions would require simultaneous in-flight requests
+// within the same millisecond on the same SW lifetime, vanishingly
+// improbable for this UI flow.
+function generateRequestId(): string {
   if (
-        typeof msg === 'object' &&
-        msg !== null &&
-        (msg as { type?: unknown }).type === 'BUFFER'
+    typeof crypto !== 'undefined' &&
+    typeof crypto.randomUUID === 'function'
   ) {
-        clearTimeout(timer);
-        port.onMessage.removeListener(handler);
-        // D-12 wire format + D-13 segment lifecycle: payload arrives
-        // as TransferredVideoSegment[] (base64 string + MIME). Decode
-        // each entry back into a VideoSegment — each is a
-        // self-contained ~10 s WebM (EBML header + seed keyframe).
-        // Concatenating them sequentially produces a multi-EBML-header
-        // file Chrome plays natively. See src/shared/binary.ts +
-        // RESEARCH.md Pattern 3.
-        const wireSegments =
-          (msg as { segments?: TransferredVideoSegment[] }).segments ?? [];
-        // WR-07 fix: filter empty wire segments BEFORE base64 decode.
-        // An empty wire.data would decode to a zero-byte Blob; the
-        // SW-side mergeVideoSegments would then concat it into the
-        // output WebM, producing a stray empty EBML segment that
-        // breaks Chrome playback. We split into two passes (filter →
-        // decode → filter-non-empty) so the iteration semantics stay
-        // declarative (no early-return in the loop body).
+    return crypto.randomUUID();
+  }
+  return `req-${Date.now()}-${Math.random().toString(36).slice(2)}`;
+}
+
+// Decodes a BUFFER message's wire-format segments into VideoSegment[].
+// Extracted from the legacy inline handler so the onConnect sink can
+// resolve a pending request without duplicating the decode logic.
+function decodeBufferSegments(
+  wireSegments: TransferredVideoSegment[],
+): VideoSegment[] {
+  // WR-07 fix: filter empty wire segments BEFORE base64 decode. An empty
+  // wire.data would decode to a zero-byte Blob; mergeVideoSegments would
+  // then concat it into the output WebM, producing a stray empty EBML
+  // segment that breaks Chrome playback. Two passes (filter -> decode ->
+  // filter-non-empty) keep the iteration semantics declarative.
   const nonEmptyWires = wireSegments.filter((wire) => {
     const isEmpty = !wire.data || wire.data.length === 0;
     if (isEmpty) {
@@ -203,11 +150,140 @@ async function getVideoBufferFromOffscreen(): Promise<VideoBufferResponse> {
       return [];
     }
   });
-        resolve({ segments });
+  return segments;
+}
+
+// SW-side port host (D-17, RESEARCH.md Pattern 5). The offscreen opens this
+// port on bootstrap and reconnects on disconnect. We use it for: (a)
+// keepalive traffic (PING/PONG health probe — Option C) — Chrome 110+
+// resets the SW idle timer on every port message, AND the PONG reply
+// closes the offscreen's health-probe loop; (b) on-demand REQUEST_BUFFER
+// round-trip during SAVE_ARCHIVE, routed by requestId so port
+// replacement mid-request does not drop the response.
+chrome.runtime.onConnect.addListener((port) => {
+  // T-1-04 mitigation: only accept ports from this extension.
+  if (port.name !== 'video-keepalive') {
+    return;
+  }
+  if (port.sender?.id !== chrome.runtime.id) {
+    logger.warn('Rejecting port with mismatched sender:', port.sender?.id);
+    port.disconnect();
+    return;
+  }
+  logger.log('Offscreen port connected');
+  videoPort = port;
+  // CR-02 fix: install a permanent onMessage sink on every accepted port.
+  // Chrome 110+ resets the SW idle-timer on any inbound port message, BUT
+  // in the field, behaviour has been observed to differ subtly when no
+  // listener is attached at all — Chrome may skip the idle-timer reset
+  // path entirely on unrouted messages.
+  //
+  // Option C: this sink ALSO routes BUFFER responses to the matching
+  // pending request by requestId (the per-request listener pattern is
+  // gone — it could not handle port replacement). And it echoes PONG on
+  // every PING so the offscreen's health probe sees life.
+  port.onMessage.addListener((msg) => {
+    if (typeof msg !== 'object' || msg === null) {
+      return;
+    }
+    const type = (msg as { type?: unknown }).type;
+    if (type === 'PING') {
+      // Health-probe echo (Option C). Wrapped in try/catch because the
+      // port may have been disconnected between the inbound PING and
+      // our response — silently drop in that race window.
+      try {
+        port.postMessage({ type: 'PONG' });
+      } catch (err) {
+        logger.warn('PONG postMessage failed (port closed):', err);
+      }
+      return;
+    }
+    if (type === 'BUFFER') {
+      const requestId = (msg as { requestId?: unknown }).requestId;
+      if (typeof requestId !== 'string' || requestId.length === 0) {
+        // Defense-in-depth: BUFFER without a valid requestId is invalid
+        // under the Option C protocol — drop with a warn. (Legacy
+        // offscreen code that didn't carry requestId is gone.)
+        logger.warn('BUFFER without a valid requestId — dropping');
+        return;
+      }
+      const pending = pendingBufferRequests.get(requestId);
+      if (pending === undefined) {
+        // Stale BUFFER (request already resolved or timed out). Silently
+        // drop — this is the no-cross-talk property the request-id
+        // routing guarantees.
+        return;
+      }
+      const wireSegments =
+        (msg as { segments?: TransferredVideoSegment[] }).segments ?? [];
+      const segments = decodeBufferSegments(wireSegments);
+      clearTimeout(pending.hardTimer);
+      pendingBufferRequests.delete(requestId);
+      pending.resolve({ segments });
+      return;
+    }
+    // Unknown traffic — drop silently (T-1-04 defense-in-depth).
+  });
+  port.onDisconnect.addListener(() => {
+    logger.log('Offscreen port disconnected; offscreen will reconnect');
+    if (videoPort === port) {
+      videoPort = null;
+    }
+  });
+  // If there are pending REQUEST_BUFFER requests at the moment this port
+  // connects, re-issue them on the fresh port with the SAME requestId.
+  // This is the architectural mechanism that retires the H2 silent-drop
+  // class — the BUFFER reaches the SW regardless of port-replacement
+  // timing. (Note: the FIRST onConnect has pendingBufferRequests.size
+  // === 0 so this branch correctly does nothing on bootstrap.)
+  if (pendingBufferRequests.size > 0) {
+    for (const pending of pendingBufferRequests.values()) {
+      try {
+        port.postMessage({
+          type: 'REQUEST_BUFFER',
+          requestId: pending.requestId,
+        });
+      } catch (err) {
+        // The fresh port disconnected synchronously — the outer hard
+        // timer will fire and surface the error.
+        logger.warn('REQUEST_BUFFER retry post failed:', err);
+      }
+    }
+  }
+});
+
+async function getVideoBufferFromOffscreen(): Promise<VideoBufferResponse> {
+  if (videoPort === null) {
+    logger.warn('No offscreen port available; returning empty buffer');
+    return { segments: [] };
+  }
+  const requestId = generateRequestId();
+  return new Promise<VideoBufferResponse>((resolve) => {
+    const hardTimer = setTimeout(() => {
+      pendingBufferRequests.delete(requestId);
+      // Outer hard-timeout: covers EVERY retry across port replacements
+      // (the legacy per-port BUFFER_FETCH_TIMEOUT_MS was 2 s per
+      // attempt — too tight to retry across a reconnect). 10 s is
+      // generous; the inner round-trip is still ~100-200 ms.
+      logger.warn(
+        `Buffer fetch outer timeout (${BUFFER_FETCH_TIMEOUT_MS} ms) — no BUFFER for requestId ${requestId}`,
+      );
+      resolve({ segments: [] });
+    }, BUFFER_FETCH_TIMEOUT_MS);
+    pendingBufferRequests.set(requestId, {
+      resolve,
+      hardTimer,
+      requestId,
+    });
+    try {
+      videoPort?.postMessage({ type: 'REQUEST_BUFFER', requestId });
+    } catch (err) {
+      // The current port disconnected synchronously. Don't resolve here
+      // — the offscreen's reconnect will fire a fresh onConnect, the
+      // sink will detect the in-flight request, and the retry path will
+      // re-post REQUEST_BUFFER on the new port.
+      logger.warn('Initial REQUEST_BUFFER post failed (port disconnected):', err);
     }
-    };
-    port.onMessage.addListener(handler);
-    port.postMessage({ type: 'REQUEST_BUFFER' });
   });
 }
 