From 2e3f5248cee472fd3bc43c0434c6b132cbd32bc3 Mon Sep 17 00:00:00 2001 From: Mark Date: Sat, 16 May 2026 09:21:34 +0200 Subject: [PATCH] fix(01-review): CR-01+CR-02+CR-03+WR-03+WR-09 critical port + handshake race fixes MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit What was wrong: - CR-01 (recorder.ts): encodeAndSendBuffer captured no port identity before awaiting Promise.all(blobToBase64). If the port disconnected mid-encode and onDisconnect synchronously reconnected (re-assigning keepalivePort to a fresh instance), the post-await null-check evaluated false and the BUFFER was posted on the NEW port — but the SW's per-request onMessage listener was still bound to the OLD port (captured at getVideoBufferFromOffscreen line 110). Result: SW timed out after 2 s, SAVE_ARCHIVE produced an empty-segments zip, data-loss path masquerading as a benign timeout. - CR-02 (background/index.ts): SW's onConnect handler attached ONLY onDisconnect — no permanent onMessage sink. PING traffic had no listener when getVideoBufferFromOffscreen wasn't running (the normal idle state of the port), and field reports note Chrome's SW idle-timer reset behaves inconsistently when no listener is attached. Risk: PINGs silently dropped, SW evicted ~30 s into recording, port torn down, next SAVE_ARCHIVE fails entirely. - CR-03 (background/index.ts): offscreenReady is a one-shot Promise resolved on the FIRST OFFSCREEN_READY message. If the SW is evicted while the offscreen document persists, the next SW lifetime creates a fresh Promise and waits on it forever — the offscreen never re-emits OFFSCREEN_READY. startVideoCapture() hangs at `await offscreenReady` until Chrome restarts. - WR-03 (recorder.ts): `baseTimestamp + idx` (Date.now() + idx) used millisecond resolution + array offset. Two REQUEST_BUFFER calls within the same millisecond would collide, breaking the sort-by- timestamp contract in SW-side mergeVideoSegments. - WR-09 (recorder.ts): encodeAndSendBuffer always appended the unfinalized in-flight segment to the BUFFER. That segment lacks the Matroska SegmentSize and Cues that MediaRecorder.stop() writes — re-introducing the "File ended prematurely" symptom documented in debug session webm-playback-freeze. What changed: - recorder.ts encodeAndSendBuffer: - Capture `portAtRequest = keepalivePort` BEFORE the encode. - After the await, refuse to post if `keepalivePort !== portAtRequest` (port was replaced by reconnect). SW already times out cleanly after BUFFER_FETCH_TIMEOUT_MS = 2 s; the next SAVE_ARCHIVE re-issues REQUEST_BUFFER on the fresh port. Stale data NEVER reaches a stranger port. - Include the in-flight segment ONLY when finalized.length === 0 (preserve the SAVE-within-first-10-s UX trade-off documented at the original comment) — otherwise drop the unfinalized tail. - Replace `baseTimestamp + idx` with module-level monotonic `++segmentSeq` counter (zero wall-clock dependency). - Switch from Promise.all/map+filter to a sequential for-loop because each iteration now mutates the shared `segmentSeq`; Promise.all timing would interleave assignments. Throughput impact negligible (3 segments × ~50 ms base64 each ≈ 150 ms vs ~50 ms parallel — still well under the 2 s SW budget). - background/index.ts onConnect: - Install a permanent `port.onMessage.addListener` that explicitly drains PING and silently drops unknown traffic. Per-request BUFFER listener still wins because it's attached LATER in the listener chain when getVideoBufferFromOffscreen fires; this sink only catches the idle PING stream and guarantees the SW idle-timer reset is consumed by a real handler. - background/index.ts initialize(): - When `chrome.offscreen.hasDocument()` returns true on SW init, immediately resolve `offscreenReady` AND null `offscreenReadyResolve`. The offscreen MUST have completed its bootstrap before it was observable via hasDocument(); waiting for an OFFSCREEN_READY that will never come is a deadlock. Why these fixes vs alternatives: - CR-01: alternatives considered: (a) cancel encoding when port disconnects (requires AbortController plumbing into blobToBase64); (b) re-route the BUFFER through the new port via a per-port request-id correlation. Both add machinery for a case the SW already handles correctly (2 s timeout → retry). The capture- identity check is the minimum-mechanism fix and matches REVIEW.md CR-01 fix guidance exactly. - CR-02: alternative considered: documenting "rely on kernel-level port-message side effect for idle-timer reset" — REJECTED, this is what the existing comment did and the field evidence shows it's unreliable. Explicit listener is the safe default. - CR-03: alternative considered: (a) have offscreen re-emit OFFSCREEN_READY on every inbound SW message — adds noise to the message bus and races with the original-bootstrap-emit. Option (b) (resolve-on-hasDocument-true) is simpler, narrower, and was explicitly recommended by REVIEW.md. - WR-03: alternative considered: keeping Date.now() and adding a microsecond-resolution offset via performance.now() — fragile across SW respawns where performance.now() resets. Module-level monotonic counter has zero wall-clock dependency. - WR-09: alternative considered: forcing a synchronous rotation at REQUEST_BUFFER time. Rejected — adds ~50–200 ms latency to every save AND races with scheduleRotation()'s timer. The "exclude unless empty" trade-off matches REVIEW.md option (a) exactly and preserves the documented first-10-s UX path. Validation evidence: - npx tsc --noEmit: exit 0 (no type errors). - npx vitest run --reporter=dot: 30/30 tests pass in 2.67 s (8 test files, including port.test.ts which pins the reconnect invariant and port-serialization.test.ts which pins the wire format). - grep "as any\|@ts-ignore" src/offscreen/ src/background/index.ts src/shared/: no matches (type-safety gate stays clean). --- src/background/index.ts | 39 +++++++++++++- src/offscreen/recorder.ts | 110 +++++++++++++++++++++++--------------- 2 files changed, 103 insertions(+), 46 deletions(-) diff --git a/src/background/index.ts b/src/background/index.ts index 94662ed..8ebd119 100644 --- a/src/background/index.ts +++ b/src/background/index.ts @@ -87,12 +87,34 @@ chrome.runtime.onConnect.addListener((port) => { } logger.log('Offscreen port connected'); videoPort = port; + // CR-02 fix: install a permanent onMessage sink on every accepted port. + // Chrome 110+ resets the SW idle-timer on any inbound port message, BUT + // in the field, behaviour has been observed to differ subtly when no + // listener is attached at all — Chrome may skip the idle-timer reset + // path entirely on unrouted messages. A no-op listener guarantees the + // PING traffic is consumed and the timer reset is unconditional. The + // per-request listener installed by `getVideoBufferFromOffscreen` still + // handles BUFFER routing; this sink only drains PING and any unknown + // traffic so it doesn't accumulate or surprise us later. + port.onMessage.addListener((msg) => { + if ( + typeof msg === 'object' && + msg !== null && + (msg as { type?: unknown }).type === 'PING' + ) { + // Explicit drain — silences "no listener" semantics in Chrome's + // port-message dispatch and keeps the SW idle-timer reset reliable. + return; + } + // Unknown traffic — drop silently (T-1-04 defense-in-depth). + // BUFFER is routed by the per-request listener in + // getVideoBufferFromOffscreen; that listener fires first when + // attached, so this branch never observes BUFFER in practice. + }); port.onDisconnect.addListener(() => { logger.log('Offscreen port disconnected; offscreen will reconnect'); videoPort = null; }); - // Inbound traffic is mostly PING (ignored) and BUFFER (handled by the - // per-request listener installed in getVideoBufferFromOffscreen). }); // 2 s budget covers the worst-case round-trip: offscreen base64-encodes @@ -517,6 +539,19 @@ async function initialize() { if (exists) { offscreenCreated = true; logger.log('Existing offscreen document detected on SW init'); + // CR-03 fix: handshake deadlock after SW respawn. + // OFFSCREEN_READY is fired by the offscreen exactly once at its + // bootstrap (recorder.ts line ~452). If the SW is evicted (~30 s + // idle) while the offscreen document persists, the next SW lifetime + // creates a fresh `offscreenReady` Promise (line 33) and waits on + // it forever — the offscreen has no signal to re-emit, and + // startVideoCapture() hangs at `await offscreenReady`. + // Resolve readiness immediately when we detect a pre-existing + // offscreen: it MUST have completed its bootstrap before being + // observable via hasDocument(). + offscreenReadyResolve?.(); + offscreenReadyResolve = null; + logger.log('Resolving offscreenReady immediately — offscreen survived a prior SW lifetime'); } } } catch (err) { diff --git a/src/offscreen/recorder.ts b/src/offscreen/recorder.ts index 6c59881..3fab2f4 100644 --- a/src/offscreen/recorder.ts +++ b/src/offscreen/recorder.ts @@ -55,6 +55,12 @@ let segments: Blob[] = []; let currentChunks: Blob[] = []; let rotationTimerId: ReturnType | null = null; let keepalivePort: chrome.runtime.Port | null = null; // long-lived SW keepalive (D-17, Pattern 5) +// WR-03 fix: strictly-monotonic per-process counter for segment timestamps. +// Replaces the previous `Date.now() + idx` scheme, which could collide +// across two REQUEST_BUFFER calls within the same millisecond. The SW-side +// `mergeVideoSegments` sorts by this `timestamp` ascending; a pure counter +// guarantees deterministic ordering with zero wall-clock dependency. +let segmentSeq = 0; // ─── Сегментный буфер (pure functions — testable in Node) ─────────────── @@ -333,58 +339,74 @@ function onPortMessage(message: unknown): void { } async function encodeAndSendBuffer(): Promise { - // Снимок завершённых сегментов + опциональный «свежий» текущий - // сегмент, если уже накопились dataavailable-чанки. Это нужно, чтобы - // SAVE_ARCHIVE через 3 секунды после старта первой сессии не вернул - // пустой буфер — операторская UX страдает иначе. Если currentChunks - // пуст, in-flight сегмент не добавляем. + // CR-01 fix: capture the port identity BEFORE the await. If `keepalivePort` + // is replaced by a fresh reconnect during base64 encoding, posting on the + // new port would silently leak the BUFFER to a stranger — the SW's + // per-request listener is still bound to the OLD port. The SW already + // times out cleanly (BUFFER_FETCH_TIMEOUT_MS = 2 s), so dropping a stale + // response on the floor is the correct behaviour: the next SAVE_ARCHIVE + // round-trip will REQUEST_BUFFER on the fresh port. + const portAtRequest = keepalivePort; + if (portAtRequest === null) { + logger.warn('encodeAndSendBuffer called without an active port — drop'); + return; + } + // WR-09 fix: an in-flight segment lacks the Matroska finalization that + // MediaRecorder.stop() performs (SegmentSize, Cues) — splicing it onto + // a finalized tail re-introduces the "File ended prematurely" symptom + // documented in the debug session webm-playback-freeze. We still want + // SOME data if SAVE_ARCHIVE fires within the first 10 s window (before + // any segment rotation has completed); in that single case we accept + // the trade-off and ship the in-flight Blob alone. Once we have at + // least one finalized segment, we drop the in-flight tail unconditionally. const finalized = getSegments(); const inFlight = - currentChunks.length > 0 + finalized.length === 0 && currentChunks.length > 0 ? new Blob(currentChunks, { type: 'video/webm' }) : null; const allSegments: Blob[] = - inFlight !== null ? [...finalized, inFlight] : finalized; + inFlight !== null ? [inFlight] : finalized; - // Метка времени для каждого сегмента — момент текущего экспорта; даёт - // SW-side `mergeVideoSegments` стабильный порядок сортировки и не зависит - // от часов внутри MediaRecorder. Старший сегмент — самый старый. - const baseTimestamp = Date.now(); - - // Per-segment defensive encode: если одиночный Blob не зенкодится - // (например, неожиданный detach ArrayBuffer-а на лету), логируем и - // пропускаем — частичное видео > отсутствующее видео. - const encodeResults = await Promise.all( - allSegments.map(async (segment, idx): Promise => { - try { - const data = await blobToBase64(segment); - return { - data, - type: segment.type || 'video/webm', - // Порядковый offset — старшие сегменты получают меньший timestamp. - // Шаг произвольный, лишь бы сохранил порядок при сортировке в SW. - timestamp: baseTimestamp + idx, - }; - } catch (err) { - logger.error( - 'blobToBase64 failed; skipping segment', - 'index:', idx, - 'size:', segment.size, - 'error:', err, - ); - return null; - } - }), - ); - const transferred: TransferredVideoSegment[] = encodeResults.filter( - (c): c is TransferredVideoSegment => c !== null, - ); - // Re-check port AFTER the await: it may have disconnected during encoding. - if (keepalivePort === null) { - logger.warn('port disconnected during base64 encoding; dropping BUFFER response'); + // WR-03 fix: monotonically-increasing per-process counter, NOT + // `Date.now() + idx`. Date.now() at millisecond resolution collides + // across two REQUEST_BUFFER calls within the same millisecond (e.g., + // diagnostic prefetch + real save). The merge code on the SW side + // sorts by `timestamp` ascending, so a strictly monotonic counter + // guarantees a deterministic order independent of wall-clock skew. + const transferred: TransferredVideoSegment[] = []; + for (let idx = 0; idx < allSegments.length; idx++) { + const segment = allSegments[idx]; + try { + const data = await blobToBase64(segment); + transferred.push({ + data, + type: segment.type || 'video/webm', + timestamp: ++segmentSeq, + }); + } catch (err) { + // Per-segment defensive encode: a single Blob failing to encode + // (e.g. unexpected ArrayBuffer detach) is logged and skipped — + // partial video > no video at all. + logger.error( + 'blobToBase64 failed; skipping segment', + 'index:', idx, + 'size:', segment.size, + 'error:', err, + ); + } + } + // After the await: refuse to post on a port that has been REPLACED + // by reconnect. The SW listens on the OLD port; posting on the NEW + // port would silently lose the data. Letting the SW time out is + // correct — the next SAVE_ARCHIVE will re-issue REQUEST_BUFFER on + // the fresh port. + if (keepalivePort !== portAtRequest) { + logger.warn( + 'port replaced during encode; dropping BUFFER response (SW will time out and retry)', + ); return; } - keepalivePort.postMessage({ type: 'BUFFER', segments: transferred }); + portAtRequest.postMessage({ type: 'BUFFER', segments: transferred }); } function connectPort(): void {