fix(01-review): CR-01+CR-02+CR-03+WR-03+WR-09 critical port + handshake race fixes

What was wrong:
- CR-01 (recorder.ts): encodeAndSendBuffer captured no port identity before
  awaiting Promise.all(blobToBase64). If the port disconnected mid-encode
  and onDisconnect synchronously reconnected (re-assigning keepalivePort
  to a fresh instance), the post-await null-check evaluated false and
  the BUFFER was posted on the NEW port — but the SW's per-request
  onMessage listener was still bound to the OLD port (captured at
  getVideoBufferFromOffscreen line 110). Result: SW timed out after
  2 s, SAVE_ARCHIVE produced an empty-segments zip, data-loss path
  masquerading as a benign timeout.
- CR-02 (background/index.ts): SW's onConnect handler attached
  ONLY onDisconnect — no permanent onMessage sink. PING traffic
  had no listener when getVideoBufferFromOffscreen wasn't running
  (the normal idle state of the port), and field reports note Chrome's
  SW idle-timer reset behaves inconsistently when no listener is
  attached. Risk: PINGs silently dropped, SW evicted ~30 s into
  recording, port torn down, next SAVE_ARCHIVE fails entirely.
- CR-03 (background/index.ts): offscreenReady is a one-shot Promise
  resolved on the FIRST OFFSCREEN_READY message. If the SW is evicted
  while the offscreen document persists, the next SW lifetime creates
  a fresh Promise and waits on it forever — the offscreen never
  re-emits OFFSCREEN_READY. startVideoCapture() hangs at
  `await offscreenReady` until Chrome restarts.
- WR-03 (recorder.ts): `baseTimestamp + idx` (Date.now() + idx) used
  millisecond resolution + array offset. Two REQUEST_BUFFER calls
  within the same millisecond would collide, breaking the sort-by-
  timestamp contract in SW-side mergeVideoSegments.
- WR-09 (recorder.ts): encodeAndSendBuffer always appended the
  unfinalized in-flight segment to the BUFFER. That segment lacks
  the Matroska SegmentSize and Cues that MediaRecorder.stop()
  writes — re-introducing the "File ended prematurely" symptom
  documented in debug session webm-playback-freeze.

What changed:
- recorder.ts encodeAndSendBuffer:
  - Capture `portAtRequest = keepalivePort` BEFORE the encode.
  - After the await, refuse to post if `keepalivePort !== portAtRequest`
    (port was replaced by reconnect). SW already times out cleanly
    after BUFFER_FETCH_TIMEOUT_MS = 2 s; the next SAVE_ARCHIVE
    re-issues REQUEST_BUFFER on the fresh port. Stale data
    NEVER reaches a stranger port.
  - Include the in-flight segment ONLY when finalized.length === 0
    (preserve the SAVE-within-first-10-s UX trade-off documented at
    the original comment) — otherwise drop the unfinalized tail.
  - Replace `baseTimestamp + idx` with module-level monotonic
    `++segmentSeq` counter (zero wall-clock dependency).
  - Switch from Promise.all/map+filter to a sequential for-loop
    because each iteration now mutates the shared `segmentSeq`;
    Promise.all timing would interleave assignments. Throughput
    impact negligible (3 segments × ~50 ms base64 each ≈ 150 ms
    vs ~50 ms parallel — still well under the 2 s SW budget).

- background/index.ts onConnect:
  - Install a permanent `port.onMessage.addListener` that
    explicitly drains PING and silently drops unknown traffic.
    Per-request BUFFER listener still wins because it's attached
    LATER in the listener chain when getVideoBufferFromOffscreen
    fires; this sink only catches the idle PING stream and
    guarantees the SW idle-timer reset is consumed by a real
    handler.

- background/index.ts initialize():
  - When `chrome.offscreen.hasDocument()` returns true on SW init,
    immediately resolve `offscreenReady` AND null
    `offscreenReadyResolve`. The offscreen MUST have completed its
    bootstrap before it was observable via hasDocument(); waiting
    for an OFFSCREEN_READY that will never come is a deadlock.

Why these fixes vs alternatives:
- CR-01: alternatives considered: (a) cancel encoding when port
  disconnects (requires AbortController plumbing into blobToBase64);
  (b) re-route the BUFFER through the new port via a per-port
  request-id correlation. Both add machinery for a case the SW
  already handles correctly (2 s timeout → retry). The capture-
  identity check is the minimum-mechanism fix and matches REVIEW.md
  CR-01 fix guidance exactly.
- CR-02: alternative considered: documenting "rely on kernel-level
  port-message side effect for idle-timer reset" — REJECTED, this
  is what the existing comment did and the field evidence shows
  it's unreliable. Explicit listener is the safe default.
- CR-03: alternative considered: (a) have offscreen re-emit
  OFFSCREEN_READY on every inbound SW message — adds noise to
  the message bus and races with the original-bootstrap-emit.
  Option (b) (resolve-on-hasDocument-true) is simpler, narrower,
  and was explicitly recommended by REVIEW.md.
- WR-03: alternative considered: keeping Date.now() and adding a
  microsecond-resolution offset via performance.now() — fragile
  across SW respawns where performance.now() resets. Module-level
  monotonic counter has zero wall-clock dependency.
- WR-09: alternative considered: forcing a synchronous rotation
  at REQUEST_BUFFER time. Rejected — adds ~50–200 ms latency to
  every save AND races with scheduleRotation()'s timer. The
  "exclude unless empty" trade-off matches REVIEW.md option (a)
  exactly and preserves the documented first-10-s UX path.

Validation evidence:
- npx tsc --noEmit: exit 0 (no type errors).
- npx vitest run --reporter=dot: 30/30 tests pass in 2.67 s
  (8 test files, including port.test.ts which pins the reconnect
  invariant and port-serialization.test.ts which pins the wire format).
- grep "as any\|@ts-ignore" src/offscreen/ src/background/index.ts
  src/shared/: no matches (type-safety gate stays clean).
This commit is contained in:
2026-05-16 09:21:34 +02:00
parent bf00bf65aa
commit 2e3f5248ce
2 changed files with 103 additions and 46 deletions

View File

@@ -87,12 +87,34 @@ chrome.runtime.onConnect.addListener((port) => {
} }
logger.log('Offscreen port connected'); logger.log('Offscreen port connected');
videoPort = port; videoPort = port;
// CR-02 fix: install a permanent onMessage sink on every accepted port.
// Chrome 110+ resets the SW idle-timer on any inbound port message, BUT
// in the field, behaviour has been observed to differ subtly when no
// listener is attached at all — Chrome may skip the idle-timer reset
// path entirely on unrouted messages. A no-op listener guarantees the
// PING traffic is consumed and the timer reset is unconditional. The
// per-request listener installed by `getVideoBufferFromOffscreen` still
// handles BUFFER routing; this sink only drains PING and any unknown
// traffic so it doesn't accumulate or surprise us later.
port.onMessage.addListener((msg) => {
if (
typeof msg === 'object' &&
msg !== null &&
(msg as { type?: unknown }).type === 'PING'
) {
// Explicit drain — silences "no listener" semantics in Chrome's
// port-message dispatch and keeps the SW idle-timer reset reliable.
return;
}
// Unknown traffic — drop silently (T-1-04 defense-in-depth).
// BUFFER is routed by the per-request listener in
// getVideoBufferFromOffscreen; that listener fires first when
// attached, so this branch never observes BUFFER in practice.
});
port.onDisconnect.addListener(() => { port.onDisconnect.addListener(() => {
logger.log('Offscreen port disconnected; offscreen will reconnect'); logger.log('Offscreen port disconnected; offscreen will reconnect');
videoPort = null; videoPort = null;
}); });
// Inbound traffic is mostly PING (ignored) and BUFFER (handled by the
// per-request listener installed in getVideoBufferFromOffscreen).
}); });
// 2 s budget covers the worst-case round-trip: offscreen base64-encodes // 2 s budget covers the worst-case round-trip: offscreen base64-encodes
@@ -517,6 +539,19 @@ async function initialize() {
if (exists) { if (exists) {
offscreenCreated = true; offscreenCreated = true;
logger.log('Existing offscreen document detected on SW init'); logger.log('Existing offscreen document detected on SW init');
// CR-03 fix: handshake deadlock after SW respawn.
// OFFSCREEN_READY is fired by the offscreen exactly once at its
// bootstrap (recorder.ts line ~452). If the SW is evicted (~30 s
// idle) while the offscreen document persists, the next SW lifetime
// creates a fresh `offscreenReady` Promise (line 33) and waits on
// it forever — the offscreen has no signal to re-emit, and
// startVideoCapture() hangs at `await offscreenReady`.
// Resolve readiness immediately when we detect a pre-existing
// offscreen: it MUST have completed its bootstrap before being
// observable via hasDocument().
offscreenReadyResolve?.();
offscreenReadyResolve = null;
logger.log('Resolving offscreenReady immediately — offscreen survived a prior SW lifetime');
} }
} }
} catch (err) { } catch (err) {

View File

@@ -55,6 +55,12 @@ let segments: Blob[] = [];
let currentChunks: Blob[] = []; let currentChunks: Blob[] = [];
let rotationTimerId: ReturnType<typeof setTimeout> | null = null; let rotationTimerId: ReturnType<typeof setTimeout> | null = null;
let keepalivePort: chrome.runtime.Port | null = null; // long-lived SW keepalive (D-17, Pattern 5) let keepalivePort: chrome.runtime.Port | null = null; // long-lived SW keepalive (D-17, Pattern 5)
// WR-03 fix: strictly-monotonic per-process counter for segment timestamps.
// Replaces the previous `Date.now() + idx` scheme, which could collide
// across two REQUEST_BUFFER calls within the same millisecond. The SW-side
// `mergeVideoSegments` sorts by this `timestamp` ascending; a pure counter
// guarantees deterministic ordering with zero wall-clock dependency.
let segmentSeq = 0;
// ─── Сегментный буфер (pure functions — testable in Node) ─────────────── // ─── Сегментный буфер (pure functions — testable in Node) ───────────────
@@ -333,58 +339,74 @@ function onPortMessage(message: unknown): void {
} }
async function encodeAndSendBuffer(): Promise<void> { async function encodeAndSendBuffer(): Promise<void> {
// Снимок завершённых сегментов + опциональный «свежий» текущий // CR-01 fix: capture the port identity BEFORE the await. If `keepalivePort`
// сегмент, если уже накопились dataavailable-чанки. Это нужно, чтобы // is replaced by a fresh reconnect during base64 encoding, posting on the
// SAVE_ARCHIVE через 3 секунды после старта первой сессии не вернул // new port would silently leak the BUFFER to a stranger — the SW's
// пустой буфер — операторская UX страдает иначе. Если currentChunks // per-request listener is still bound to the OLD port. The SW already
// пуст, in-flight сегмент не добавляем. // times out cleanly (BUFFER_FETCH_TIMEOUT_MS = 2 s), so dropping a stale
// response on the floor is the correct behaviour: the next SAVE_ARCHIVE
// round-trip will REQUEST_BUFFER on the fresh port.
const portAtRequest = keepalivePort;
if (portAtRequest === null) {
logger.warn('encodeAndSendBuffer called without an active port — drop');
return;
}
// WR-09 fix: an in-flight segment lacks the Matroska finalization that
// MediaRecorder.stop() performs (SegmentSize, Cues) — splicing it onto
// a finalized tail re-introduces the "File ended prematurely" symptom
// documented in the debug session webm-playback-freeze. We still want
// SOME data if SAVE_ARCHIVE fires within the first 10 s window (before
// any segment rotation has completed); in that single case we accept
// the trade-off and ship the in-flight Blob alone. Once we have at
// least one finalized segment, we drop the in-flight tail unconditionally.
const finalized = getSegments(); const finalized = getSegments();
const inFlight = const inFlight =
currentChunks.length > 0 finalized.length === 0 && currentChunks.length > 0
? new Blob(currentChunks, { type: 'video/webm' }) ? new Blob(currentChunks, { type: 'video/webm' })
: null; : null;
const allSegments: Blob[] = const allSegments: Blob[] =
inFlight !== null ? [...finalized, inFlight] : finalized; inFlight !== null ? [inFlight] : finalized;
// Метка времени для каждого сегмента — момент текущего экспорта; даёт // WR-03 fix: monotonically-increasing per-process counter, NOT
// SW-side `mergeVideoSegments` стабильный порядок сортировки и не зависит // `Date.now() + idx`. Date.now() at millisecond resolution collides
// от часов внутри MediaRecorder. Старший сегмент — самый старый. // across two REQUEST_BUFFER calls within the same millisecond (e.g.,
const baseTimestamp = Date.now(); // diagnostic prefetch + real save). The merge code on the SW side
// sorts by `timestamp` ascending, so a strictly monotonic counter
// Per-segment defensive encode: если одиночный Blob не зенкодится // guarantees a deterministic order independent of wall-clock skew.
// (например, неожиданный detach ArrayBuffer-а на лету), логируем и const transferred: TransferredVideoSegment[] = [];
// пропускаем — частичное видео > отсутствующее видео. for (let idx = 0; idx < allSegments.length; idx++) {
const encodeResults = await Promise.all( const segment = allSegments[idx];
allSegments.map(async (segment, idx): Promise<TransferredVideoSegment | null> => {
try { try {
const data = await blobToBase64(segment); const data = await blobToBase64(segment);
return { transferred.push({
data, data,
type: segment.type || 'video/webm', type: segment.type || 'video/webm',
// Порядковый offset — старшие сегменты получают меньший timestamp. timestamp: ++segmentSeq,
// Шаг произвольный, лишь бы сохранил порядок при сортировке в SW. });
timestamp: baseTimestamp + idx,
};
} catch (err) { } catch (err) {
// Per-segment defensive encode: a single Blob failing to encode
// (e.g. unexpected ArrayBuffer detach) is logged and skipped —
// partial video > no video at all.
logger.error( logger.error(
'blobToBase64 failed; skipping segment', 'blobToBase64 failed; skipping segment',
'index:', idx, 'index:', idx,
'size:', segment.size, 'size:', segment.size,
'error:', err, 'error:', err,
); );
return null;
} }
}), }
// After the await: refuse to post on a port that has been REPLACED
// by reconnect. The SW listens on the OLD port; posting on the NEW
// port would silently lose the data. Letting the SW time out is
// correct — the next SAVE_ARCHIVE will re-issue REQUEST_BUFFER on
// the fresh port.
if (keepalivePort !== portAtRequest) {
logger.warn(
'port replaced during encode; dropping BUFFER response (SW will time out and retry)',
); );
const transferred: TransferredVideoSegment[] = encodeResults.filter(
(c): c is TransferredVideoSegment => c !== null,
);
// Re-check port AFTER the await: it may have disconnected during encoding.
if (keepalivePort === null) {
logger.warn('port disconnected during base64 encoding; dropping BUFFER response');
return; return;
} }
keepalivePort.postMessage({ type: 'BUFFER', segments: transferred }); portAtRequest.postMessage({ type: 'BUFFER', segments: transferred });
} }
function connectPort(): void { function connectPort(): void {