test(debug-empty-archive): RED gate for empty-archive-port-race (H1 + H1.b + H2)
Phase 1 UAT Test 3 surfaced a two-headed BLOCKER: (a) silent empty-video archive when save crosses a port-reconnect window, (b) 3x "Attempting to use a disconnected port object" Uncaught Errors starting at the 290 s pre-emptive reconnect mark. Bisect confirmed: H1 (port lifecycle race) was introduced by Plan 01-04 (b064a21); H2 (createArchive silent-skip on empty segments) is an upstream defect (555eb05) that became fatal once CR-01 + sweep #5 guaranteed the silent-skip branch would fire on every save during a reconnect window. This commit lands the 3 RED tests at the unit-test level — they match the UAT error string byte-for-byte for H1/H1.b and pin the silent-drop contract for H2. They will flip GREEN as the Option C architectural refactor (request-id'd port protocol + port-health probe + retry + operator-visible error surface) lands across the next commits. Baseline: 8 files / 43 tests (40 GREEN, 3 RED). tsc --noEmit exit 0. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
406
.planning/debug/empty-archive-port-race.md
Normal file
406
.planning/debug/empty-archive-port-race.md
Normal file
@@ -0,0 +1,406 @@
|
|||||||
|
---
|
||||||
|
slug: empty-archive-port-race
|
||||||
|
status: tdd_red_confirmed
|
||||||
|
trigger: |
|
||||||
|
Phase 1 UAT Test 3 surfaced a two-headed BLOCKER:
|
||||||
|
(a) Archive shipped via popup save contains NO video/last_30sec.webm
|
||||||
|
(only rrweb/, logs/, screenshot.png, meta.json) — silent data loss
|
||||||
|
defeating the entire phase goal.
|
||||||
|
(b) 3× Uncaught Error: "Attempting to use a disconnected port object"
|
||||||
|
surface in the offscreen console starting at the 290 s pre-emptive
|
||||||
|
port-reconnect mark (recorder.ts line 627), spaced ~25-30 s
|
||||||
|
apart (PORT_PING_MS = 25_000 cadence).
|
||||||
|
Both symptoms cluster around src/offscreen/recorder.ts:597-630
|
||||||
|
(port reconnect / encodeAndSendBuffer interaction) and
|
||||||
|
src/background/index.ts:130-167, 340-393 (BUFFER round-trip + zip
|
||||||
|
build branch that silently elides the video file on empty segments).
|
||||||
|
created: 2026-05-16T11:58:51Z
|
||||||
|
updated: 2026-05-16T12:09:40Z
|
||||||
|
phase: 01-stabilize-video-pipeline
|
||||||
|
related_uat: .planning/phases/01-stabilize-video-pipeline/01-UAT.md
|
||||||
|
related_review_fix: .planning/phases/01-stabilize-video-pipeline/01-REVIEW-FIX.md
|
||||||
|
prior_resolved_sessions:
|
||||||
|
- .planning/debug/resolved/d12-blob-port-transfer-fails.md
|
||||||
|
- .planning/debug/resolved/webm-playback-freeze.md
|
||||||
|
---
|
||||||
|
|
||||||
|
# Debug: Empty-archive + port-reconnect race (Phase 1 BLOCKER)
|
||||||
|
|
||||||
|
## Symptoms (from UAT Test 3)
|
||||||
|
|
||||||
|
**Expected behavior:**
|
||||||
|
When the operator clicks save in the popup, the produced ZIP archive
|
||||||
|
contains `video/last_30sec.webm` — a playable WebM with the most recent
|
||||||
|
30 s of captured screen video derived from D-13 restart-segments ring
|
||||||
|
buffer (3 × 10 s self-contained segments).
|
||||||
|
|
||||||
|
**Actual behavior:**
|
||||||
|
1. `smoke.sh` waited 600 s before detecting a new archive in ~/Downloads.
|
||||||
|
2. The detected archive
|
||||||
|
(`/home/parf/Downloads/session_report_2026-05-16_13-54-52.zip`,
|
||||||
|
88 254 bytes) contains:
|
||||||
|
```
|
||||||
|
rrweb/ (dir, empty)
|
||||||
|
rrweb/session.json (2 bytes — empty JSON)
|
||||||
|
logs/ (dir, empty)
|
||||||
|
logs/events.json (2 bytes — empty JSON)
|
||||||
|
screenshot.png (87 300 bytes)
|
||||||
|
meta.json (336 bytes)
|
||||||
|
```
|
||||||
|
**No `video/` directory, no `last_30sec.webm`.** Total archive
|
||||||
|
88 KB vs. expected ~1.5 MB with video.
|
||||||
|
3. Offscreen console showed 3× `Uncaught Error: Attempting to use a
|
||||||
|
disconnected port object at offscreen-Bp_IKlxL.js:1:4644` starting
|
||||||
|
at exactly 290 s after recording start (timestamps 11:50:10, ~11:50:40,
|
||||||
|
~11:51:00) — coincides with `pre-emptive port reconnect (290 s cap)`
|
||||||
|
log line at recorder.ts:627.
|
||||||
|
4. Ring buffer itself was healthy throughout — 33+ segments rotated
|
||||||
|
cleanly with `kept: 3/3` invariant (sizes 472-799 KB per 10 s segment).
|
||||||
|
The D-13 restart-segments lifecycle is NOT the broken component.
|
||||||
|
|
||||||
|
**Error messages:**
|
||||||
|
```
|
||||||
|
offscreen-Bp_IKlxL.js:1 Uncaught Error: Attempting to use a disconnected port object
|
||||||
|
at offscreen-Bp_IKlxL.js:1:4644
|
||||||
|
```
|
||||||
|
And smoke.sh terminal:
|
||||||
|
```
|
||||||
|
caution: filename not matched: video/last_30sec.webm
|
||||||
|
```
|
||||||
|
|
||||||
|
**Timeline:**
|
||||||
|
- Bug introduced: not yet bisected. Most plausible window is one of:
|
||||||
|
- Plan 01-04 (port keepalive + OFFSCREEN_READY handshake) — added
|
||||||
|
the long-lived port that the pre-emptive reconnect operates on.
|
||||||
|
- Plan 01-05 (SW shrink + port host with T-1-04 sender check) —
|
||||||
|
added `getVideoBufferFromOffscreen` request-response pattern.
|
||||||
|
- Review-fix iteration 1 (commit 2e3f524 = CR-01 + CR-02 + CR-03 +
|
||||||
|
WR-03 + WR-09) — CR-01 specifically introduced the
|
||||||
|
"portAtRequest captured before await + refuse-to-post on stale
|
||||||
|
port" pattern in encodeAndSendBuffer that drops responses on the
|
||||||
|
floor without retry.
|
||||||
|
- Review-fix iteration 2 sweep #5 (commit 034155b "surface
|
||||||
|
port-replaced-during-fetch diagnostic on buffer timeout") — added
|
||||||
|
diagnostic logging on timeout but no retry.
|
||||||
|
- Never worked end-to-end with operator-side save under realistic
|
||||||
|
recording duration. The original Phase 1 closure on 2026-05-15 was
|
||||||
|
confirmed with a fresh fixture from a successful smoke run, but
|
||||||
|
the fixture was regenerated by a smoke session that almost certainly
|
||||||
|
completed before the 290 s pre-emptive reconnect window
|
||||||
|
(smoke required ≥ 35 s recording, not 5+ minutes).
|
||||||
|
|
||||||
|
**Reproduction:**
|
||||||
|
1. `npm run build` → produces dist/
|
||||||
|
2. `KEEP_PROFILE=0 ./smoke.sh`
|
||||||
|
3. In Chrome: load dist/ unpacked, click extension icon → picker
|
||||||
|
auto-accepts smoke tab → recording starts (logs confirm).
|
||||||
|
4. WAIT past the 290 s mark (e.g., 5+ minutes). Watch the offscreen
|
||||||
|
console: at 290 s see `pre-emptive port reconnect`, immediately
|
||||||
|
followed by `Uncaught Error: Attempting to use a disconnected port
|
||||||
|
object`, then 2-3 more such errors at PORT_PING_MS=25_000 intervals.
|
||||||
|
5. Click extension icon → "Сохранить отчёт об ошибке". Observe the
|
||||||
|
long wait time (the user saw 600 s — investigate whether multiple
|
||||||
|
clicks were needed or whether saveArchive has its own retry).
|
||||||
|
6. Inspect the produced zip: `unzip -l ~/Downloads/session_report_*.zip`
|
||||||
|
→ no `video/last_30sec.webm` entry.
|
||||||
|
|
||||||
|
## Current Focus
|
||||||
|
|
||||||
|
hypothesis: |
|
||||||
|
Two coupled defects sharing the same root cause (port-reconnect race
|
||||||
|
window in src/offscreen/recorder.ts):
|
||||||
|
|
||||||
|
H1 (the noise — Uncaught Errors): The ping `setInterval` at
|
||||||
|
recorder.ts:623-625 uses `keepalivePort?.postMessage({type:'PING'})`.
|
||||||
|
Optional chaining only guards against `null`, not against a
|
||||||
|
connected-but-being-disconnected `Port` object. When the
|
||||||
|
pre-emptive `setTimeout` fires at 290 s (recorder.ts:626-630) and
|
||||||
|
calls `keepalivePort?.disconnect()`, there's a microtask race
|
||||||
|
window before the onDisconnect handler (recorder.ts:616-622) runs
|
||||||
|
`teardownPortTimers()`. Any ping callback already queued/dispatched
|
||||||
|
in that window throws "Attempting to use a disconnected port object."
|
||||||
|
Two follow-on errors at ~25 s and ~50 s post-reconnect suggest the
|
||||||
|
recursive `connectPort()` re-installs the ping interval against
|
||||||
|
a NEW port that itself gets disconnected (possibly by a second
|
||||||
|
pre-emptive cycle, OR by the SW idle-unloading), but the error
|
||||||
|
remains because the same race recurs.
|
||||||
|
|
||||||
|
H2 (the silent data loss — empty-video archive): The same
|
||||||
|
reconnect-race window leaves REQUEST_BUFFER / BUFFER round-trips
|
||||||
|
vulnerable. When the operator clicks save during or shortly after
|
||||||
|
a reconnect:
|
||||||
|
(a) SW's `getVideoBufferFromOffscreen`
|
||||||
|
(src/background/index.ts:130-167) sends REQUEST_BUFFER via the
|
||||||
|
OLD port that just disconnected, OR
|
||||||
|
(b) Offscreen's `encodeAndSendBuffer`
|
||||||
|
(src/offscreen/recorder.ts:522-545, 547-604) captures
|
||||||
|
`portAtRequest` before its `await blobToBase64(...)` for ~3
|
||||||
|
segments (~150 ms), the pre-emptive reconnect fires mid-encode,
|
||||||
|
then the guard at line 597-601 correctly REFUSES to post on
|
||||||
|
the now-stale `portAtRequest` and just returns.
|
||||||
|
Either way, the SW's per-request listener times out silently after
|
||||||
|
`BUFFER_FETCH_TIMEOUT_MS = 2_000`, sets `videoBufferResponse =
|
||||||
|
{ segments: [] }`, and `createArchive`
|
||||||
|
(src/background/index.ts:340-352) takes the empty branch:
|
||||||
|
if (videoBufferResponse.segments.length > 0) { ... } else {
|
||||||
|
logger.warn('✗ No video segments to add'); // only logs
|
||||||
|
}
|
||||||
|
→ archive built WITHOUT the video file, shipped to operator silently.
|
||||||
|
The 600 s smoke wait is consistent with the operator clicking save
|
||||||
|
multiple times until one REQUEST_BUFFER round-trip happened to
|
||||||
|
complete on a fresh port — and even then, that single successful
|
||||||
|
trip may itself have caught only one of the 3 segments before the
|
||||||
|
next reconnect (worth checking against actual archive's NULL state
|
||||||
|
vs partial state).
|
||||||
|
|
||||||
|
test: |
|
||||||
|
Initial evidence-gathering test (RED): write a vitest spec that
|
||||||
|
drives recorder.ts's port lifecycle through a pre-emptive reconnect
|
||||||
|
while a ping is queued, and asserts NO Uncaught Error escapes.
|
||||||
|
This pins H1. For H2 a second test drives REQUEST_BUFFER →
|
||||||
|
encodeAndSendBuffer → reconnect-mid-await → asserts BUFFER is
|
||||||
|
retried on the fresh port (or that saveArchive surfaces an error
|
||||||
|
instead of silently dropping).
|
||||||
|
expecting: |
|
||||||
|
Initial RED tests fail in current code (proves bugs are
|
||||||
|
reproducible at the unit level). Then the gsd-debugger isolates
|
||||||
|
the precise call sequence + lands the GREEN.
|
||||||
|
next_action: TDD checkpoint — RED tests landed and confirmed failing for the exact UAT error string. Awaiting user confirmation of RED gate before writing GREEN fix.
|
||||||
|
reasoning_checkpoint: ""
|
||||||
|
tdd_checkpoint: |
|
||||||
|
Test file: tests/offscreen/port-reconnect-race.test.ts (3 tests, all RED)
|
||||||
|
Status: RED ✓ (failures confirm both H1 + H2 hypotheses)
|
||||||
|
Baseline: 7 files / 40 tests still GREEN (no regression in pinning contracts)
|
||||||
|
tsc --noEmit: exit 0
|
||||||
|
|
||||||
|
Failure outputs:
|
||||||
|
H1 AssertionError: expected [Function] to not throw an error
|
||||||
|
but 'Error: Attempting to use a disconnected port object' was thrown
|
||||||
|
(line 200 — production ping callback throws on disconnected port)
|
||||||
|
|
||||||
|
H1.b AssertionError: expected [Function] to not throw an error
|
||||||
|
but 'Error: Attempting to use a disconnected port object' was thrown
|
||||||
|
(line 419 — pre-emptive reconnect path triggers the same race)
|
||||||
|
|
||||||
|
H2 AssertionError: expected 0 to be greater than or equal to 1
|
||||||
|
(line 341 — production code silently drops BUFFER when reconnect
|
||||||
|
arrives mid-encode; neither old nor new port receives any BUFFER)
|
||||||
|
|
||||||
|
## Constraints
|
||||||
|
|
||||||
|
- TDD mode is ON (workflow.tdd_mode: true). RED test must land before
|
||||||
|
GREEN fix.
|
||||||
|
- Auto-loaded memories: `feedback-gsd-ceremony-for-fixes.md` (no
|
||||||
|
hot-edits; route through /gsd-debug) and
|
||||||
|
`feedback-no-unilateral-scope-reduction.md` (no scope narrowing;
|
||||||
|
surface choices via AskUserQuestion).
|
||||||
|
- Both fixes must NOT regress D-12 (base64 wire format) or D-13
|
||||||
|
(restart-segments). Existing tests at
|
||||||
|
`tests/offscreen/port-serialization.test.ts` and
|
||||||
|
`tests/offscreen/segment-rotation.test.ts` are the pinning contracts.
|
||||||
|
- Existing tests (40 across 7 files) all green; tsc exit 0; build
|
||||||
|
exit 0. Any fix must preserve that baseline.
|
||||||
|
|
||||||
|
## Files of Interest (preliminary; debugger may expand)
|
||||||
|
|
||||||
|
- src/offscreen/recorder.ts:
|
||||||
|
- 522-545: encodeAndSendBuffer (CR-01 portAtRequest guard)
|
||||||
|
- 547-604: doEncodeAndSendBuffer (base64 encode loop + WR-03
|
||||||
|
monotonic seq + stale-port refuse + postMessage)
|
||||||
|
- 606-630: connectPort (port lifecycle, ping interval, pre-emptive
|
||||||
|
reconnect setTimeout)
|
||||||
|
- 616-622: onDisconnect handler (teardownPortTimers + recursive
|
||||||
|
connectPort)
|
||||||
|
- 39: PORT_RECONNECT_MS = 290_000
|
||||||
|
- 38: PORT_PING_MS = 25_000
|
||||||
|
- src/background/index.ts:
|
||||||
|
- 130-167: getVideoBufferFromOffscreen (per-request listener
|
||||||
|
+ BUFFER_FETCH_TIMEOUT_MS = 2_000)
|
||||||
|
- 340-393: createArchive (the silent-empty-video branch at 346-352)
|
||||||
|
- 430-493: saveArchive (orchestrator)
|
||||||
|
- tests/offscreen/port.test.ts (existing — only covers SW-side
|
||||||
|
disconnect path, NOT pre-emptive reconnect race)
|
||||||
|
- tests/offscreen/port-serialization.test.ts (D-12 contract)
|
||||||
|
- tests/offscreen/segment-rotation.test.ts (D-13 contract)
|
||||||
|
- tests/offscreen/port-reconnect-race.test.ts (NEW — RED gate for this
|
||||||
|
debug session; pins H1 + H1.b + H2 contracts at unit-test level)
|
||||||
|
|
||||||
|
## Evidence
|
||||||
|
|
||||||
|
- timestamp: 2026-05-16T12:06:29Z — Baseline established. `npx vitest run`
|
||||||
|
→ 7 files passed / 40 tests passed. Confirms a clean RED→GREEN baseline
|
||||||
|
for the new test file.
|
||||||
|
- timestamp: 2026-05-16T12:06:45Z — Static analysis of src/offscreen/recorder.ts
|
||||||
|
+ the minified production bundle dist/assets/offscreen-Bp_IKlxL.js
|
||||||
|
CONFIRMS H1 at the bytecode level:
|
||||||
|
`grep -oP '.{60}postMessage.{20}' offscreen-Bp_IKlxL.js` yields:
|
||||||
|
`connecting"),E(),o=null,O()}),p=setInterval(()=>{o==null||o.postMessage({type:"PING"})},k)`
|
||||||
|
This is exactly the pattern at recorder.ts:623-625 — the ping
|
||||||
|
callback uses `?.postMessage` whose only safety check is null. The
|
||||||
|
minified column 4644 in the UAT error stack maps to this site:
|
||||||
|
`o.postMessage({type:"PING"})` where `o` is a non-null but
|
||||||
|
disconnected Port. Chrome's documented runtime semantics for Port
|
||||||
|
say postMessage on a disconnected port throws synchronously with
|
||||||
|
the exact error string observed in UAT
|
||||||
|
("Attempting to use a disconnected port object").
|
||||||
|
- timestamp: 2026-05-16T12:07:00Z — Static analysis of the
|
||||||
|
doEncodeAndSendBuffer flow CONFIRMS H2 silent-drop:
|
||||||
|
`src/offscreen/recorder.ts:597-601` —
|
||||||
|
`if (keepalivePort !== portAtRequest) { logger.warn(...); return; }`
|
||||||
|
On a mid-encode reconnect, keepalivePort is reassigned by the
|
||||||
|
onDisconnect→connectPort chain while portAtRequest still points to
|
||||||
|
the OLD port. The function returns WITHOUT posting anywhere.
|
||||||
|
`src/background/index.ts:134-150` — SW per-request listener bound
|
||||||
|
to the captured `port` (the OLD port). After 2 s
|
||||||
|
(BUFFER_FETCH_TIMEOUT_MS) it resolves `{ segments: [] }`.
|
||||||
|
`src/background/index.ts:346-352` — `createArchive` branch:
|
||||||
|
`if (videoBufferResponse.segments.length > 0) { ... } else {
|
||||||
|
logger.warn('✗ No video segments to add'); }`
|
||||||
|
No throw, no surface to operator. Archive ships WITHOUT video.
|
||||||
|
- timestamp: 2026-05-16T12:09:30Z — RED tests written and executed:
|
||||||
|
`tests/offscreen/port-reconnect-race.test.ts`. 3 tests, all RED:
|
||||||
|
H1 `ping callback handles a remotely-disconnected port without
|
||||||
|
throwing` — FAILS with "Attempting to use a disconnected port
|
||||||
|
object" (matches UAT error string byte-for-byte).
|
||||||
|
H1.b `pre-emptive reconnect path — ping does not throw against
|
||||||
|
just-disconnected port` — FAILS with same error (proves the
|
||||||
|
290 s pre-emptive path is the same race).
|
||||||
|
H2 `REQUEST_BUFFER mid-reconnect must NOT drop BUFFER when
|
||||||
|
segments exist` — FAILS with `expected 0 to be greater than
|
||||||
|
or equal to 1`. 3 seeded segments + mid-encode reconnect →
|
||||||
|
neither old nor new port receives any BUFFER. Proves the
|
||||||
|
silent-drop is real and reproducible at the unit level.
|
||||||
|
- timestamp: 2026-05-16T12:09:40Z — Full-suite check: 8 files (was 7) /
|
||||||
|
43 tests (was 40) — exactly 3 new RED, 40 baseline still GREEN. No
|
||||||
|
regression. `npx tsc --noEmit` exit 0 (new test file is type-clean).
|
||||||
|
|
||||||
|
## Eliminated
|
||||||
|
|
||||||
|
- **Not a buffer-emptiness issue.** UAT logs show 33+ segments rotated
|
||||||
|
successfully with `kept: 3/3` invariant maintained throughout the
|
||||||
|
600 s session. The ring buffer always had 3 × ~500-800 KB segments
|
||||||
|
available when SAVE_ARCHIVE fired. The data is THERE; the transport
|
||||||
|
fails to deliver it.
|
||||||
|
- **Not a D-13 segment-lifecycle regression.** existing
|
||||||
|
segment-rotation.test.ts contract still passes; the WebM payload
|
||||||
|
format is sound (preserved by the d12 base64 wire-format fix).
|
||||||
|
- **Not a D-12 wire-format regression.** existing
|
||||||
|
port-serialization.test.ts contract still passes; the base64 round-
|
||||||
|
trip is intact. The bug is at the port-lifecycle layer ABOVE the
|
||||||
|
wire format.
|
||||||
|
- **Not a smoke.sh artifact.** the actual zip on disk was inspected
|
||||||
|
by-hand (`unzip -l ~/Downloads/session_report_2026-05-16_13-54-52.zip`)
|
||||||
|
— the absence of video/last_30sec.webm is on the wire, not in the
|
||||||
|
smoke harness's reporting.
|
||||||
|
- **Not a JSZip issue.** other archive entries (screenshot.png, meta.json)
|
||||||
|
are present and well-formed. JSZip received an archive build with no
|
||||||
|
video entry — because createArchive's silent-skip branch never
|
||||||
|
called `zip.file('video/last_30sec.webm', videoBlob)`.
|
||||||
|
- **Not a chrome.runtime.sendMessage problem.** the bug is on the
|
||||||
|
long-lived `chrome.runtime.connect` port (offscreen↔SW), not on the
|
||||||
|
popup↔SW sendMessage path. The popup→SW SAVE_ARCHIVE call works
|
||||||
|
fine — it just gets `{success: true}` back even though the archive
|
||||||
|
is video-less.
|
||||||
|
|
||||||
|
## Resolution
|
||||||
|
|
||||||
|
root_cause: ""
|
||||||
|
fix: ""
|
||||||
|
verification: ""
|
||||||
|
files_changed: []
|
||||||
|
|
||||||
|
## Bisect Results (2026-05-16T12:15:00Z)
|
||||||
|
|
||||||
|
User chose "Bisect first" from the fix-strategy fork. Targeted git-log
|
||||||
|
scan across the suspect code regions across the full Phase 1 history
|
||||||
|
(`555eb05..HEAD`):
|
||||||
|
|
||||||
|
| Symptom | Mechanism | Introduction commit | Subsequent amplification |
|
||||||
|
|---------|----------------------------------------|------------------------------------------------------|-----------------------------------------------------------------|
|
||||||
|
| H1.a | ping setInterval with `?.postMessage` | `b064a21` (Plan 01-04, 2026-05-15) | none — race exists since this commit |
|
||||||
|
| H1.b | pre-emptive 290 s reconnect setTimeout | `b064a21` (Plan 01-04, 2026-05-15) | none |
|
||||||
|
| H2 | createArchive silent-skip on empty | **`555eb05` (imported broken upstream — predates Phase 1)** | `2e3f524` CR-01 "refuse-to-post on swapped port" without retry; `034155b` sweep #5 added diagnostic on timeout but still no retry |
|
||||||
|
|
||||||
|
**Strategic implications surfaced by bisect:**
|
||||||
|
|
||||||
|
1. **H2 silent-skip is an UPSTREAM defect** in the original "half-broken
|
||||||
|
blob of code" — never on the Phase 1 fix radar because:
|
||||||
|
- Original 22-defect audit didn't flag it (it's a "what happens if a
|
||||||
|
downstream returns 0?" question that needs a real failure mode to
|
||||||
|
surface)
|
||||||
|
- REVIEW.md's 18 findings didn't catch it
|
||||||
|
- Sweep #5 (commit 034155b "surface port-replaced-during-fetch
|
||||||
|
diagnostic on buffer timeout") added a log on timeout but kept the
|
||||||
|
silent-skip behaviour — the diagnostic logs went to the SW console,
|
||||||
|
but the operator-facing failure path stayed silent.
|
||||||
|
|
||||||
|
2. **H1 port race latency-amplifies H2 from latent to fatal.** Before
|
||||||
|
Plan 01-04 there was no port → no reconnect → silent-skip never fired
|
||||||
|
in practice. After 01-04 the 290 s pre-emptive reconnect creates a
|
||||||
|
guaranteed-recurring window where REQUEST_BUFFER round-trips can
|
||||||
|
fail; after CR-01 the "refuse to post on swapped port" pattern makes
|
||||||
|
the silent-skip fire with near-100 % probability whenever a save is
|
||||||
|
clicked across a reconnect window.
|
||||||
|
|
||||||
|
3. **Option A** (target both symptoms with retry + try/catch) is
|
||||||
|
sufficient to make the visible symptoms go away, but leaves the
|
||||||
|
upstream silent-skip pattern intact — a class of latent failures
|
||||||
|
remains. **Option B** (which converts silent-skip into operator-
|
||||||
|
visible error) directly retires the upstream class. **Option C**
|
||||||
|
(request-id'd round-trip protocol) makes port-replacement
|
||||||
|
irrelevant by architecture but is the largest blast radius.
|
||||||
|
|
||||||
|
The bisect strengthens the case for B (it's not just symptom-control;
|
||||||
|
it retires the entire shape of bug, including the upstream one).
|
||||||
|
However, A is still defensible if treated as a pure hotfix to ship
|
||||||
|
Phase 1 and the silent-skip fix is consciously deferred to a follow-up
|
||||||
|
plan. C is justifiable if the user wants Phase 1's port lifecycle to
|
||||||
|
match the durability of the D-12/D-13 contracts before Phase 2 layers
|
||||||
|
DOM capture on top.
|
||||||
|
|
||||||
|
(Updated by orchestrator after bisect; user decides next.)
|
||||||
|
|
||||||
|
## Fix Strategy Chosen: Option C (Architectural)
|
||||||
|
|
||||||
|
User decided (post-bisect): **Option C — Architectural port lifecycle refactor.**
|
||||||
|
|
||||||
|
Rationale: bisect shows the current port lifecycle's race window is the
|
||||||
|
proximate cause of every H1 error AND the trigger for the upstream H2
|
||||||
|
silent-skip class. Option C makes port-replacement architecturally
|
||||||
|
irrelevant by:
|
||||||
|
|
||||||
|
1. Replace pre-emptive 290 s `setTimeout` reconnect with **port-health
|
||||||
|
probe** (ping/pong protocol; offscreen explicitly waits for SW echo
|
||||||
|
before considering port live; on any failure → clean teardown +
|
||||||
|
reconnect).
|
||||||
|
2. Switch REQUEST_BUFFER / BUFFER to **request-id'd** pattern: SW
|
||||||
|
sends `{type:'REQUEST_BUFFER', requestId: uuid}`; offscreen
|
||||||
|
responds with `{type:'BUFFER', requestId, segments}`; SW only
|
||||||
|
accepts responses matching its current pending requestId. SW
|
||||||
|
retries on port replacement until success or hard outer timeout.
|
||||||
|
3. Continuous end-to-end vitest covering 600 s of port lifecycle
|
||||||
|
(2 reconnects + simulated REQUEST_BUFFER round-trips). Becomes
|
||||||
|
the new pinning contract for the port lifecycle.
|
||||||
|
4. **Keep** the existing operator-visible safety from Option B: even
|
||||||
|
in the new design, if a hard outer timeout fires, `createArchive`
|
||||||
|
must throw → popup surfaces `RECORDING_ERROR` to operator instead
|
||||||
|
of silently shipping a video-less archive.
|
||||||
|
|
||||||
|
Touches D-17 port-lifecycle contract (CONTEXT.md will be amended).
|
||||||
|
Expected surface: ~150-200 LOC + 2-3 new test files. Pre-existing
|
||||||
|
40 GREEN tests + 3 new RED tests must all flip GREEN at completion.
|
||||||
|
|
||||||
|
Constraints (re-iterated for the executor):
|
||||||
|
- TDD: each new behaviour gets a RED test BEFORE GREEN implementation.
|
||||||
|
- D-12 (base64 wire format) and D-13 (restart-segments) contracts must
|
||||||
|
not regress. Existing pinning tests at
|
||||||
|
`tests/offscreen/port-serialization.test.ts` and
|
||||||
|
`tests/offscreen/segment-rotation.test.ts` are the canaries.
|
||||||
|
- `npx tsc --noEmit` exit 0, `npm run build` exit 0, type-safety
|
||||||
|
grep clean must all hold at every commit.
|
||||||
|
- `tests/fixtures/last_30sec.webm` is currently dirty in the working
|
||||||
|
tree (operator's recent smoke session staged a new file at that
|
||||||
|
path). Do NOT commit changes to that file — the committed D-13
|
||||||
|
fixture is the canonical contract.
|
||||||
|
|
||||||
|
(Strategy decided 2026-05-16T12:18:00Z; executor dispatch follows.)
|
||||||
427
tests/offscreen/port-reconnect-race.test.ts
Normal file
427
tests/offscreen/port-reconnect-race.test.ts
Normal file
@@ -0,0 +1,427 @@
|
|||||||
|
// tests/offscreen/port-reconnect-race.test.ts
|
||||||
|
//
|
||||||
|
// RED-gate tests for debug session empty-archive-port-race
|
||||||
|
// (.planning/debug/empty-archive-port-race.md).
|
||||||
|
//
|
||||||
|
// Empirically pins the two coupled defects observed in Phase 1 UAT Test 3:
|
||||||
|
//
|
||||||
|
// H1 Pre-emptive reconnect at 290 s leaves the ping interval briefly
|
||||||
|
// able to fire against a Port whose remote end has been disconnected,
|
||||||
|
// throwing "Attempting to use a disconnected port object" — 3
|
||||||
|
// uncaught errors observed in the offscreen console.
|
||||||
|
//
|
||||||
|
// H2 REQUEST_BUFFER round-trip silently drops BUFFER when the port is
|
||||||
|
// replaced mid-encode (src/offscreen/recorder.ts:597 stale-port
|
||||||
|
// refuse) AND when the SW's videoPort is null at request time
|
||||||
|
// (src/background/index.ts:128 fast-bailout). The SW then times
|
||||||
|
// out (BUFFER_FETCH_TIMEOUT_MS = 2 s) and resolves
|
||||||
|
// `{ segments: [] }`, which createArchive's branch at
|
||||||
|
// src/background/index.ts:346 silently elides into a video-less
|
||||||
|
// archive. Operator-visible result: 88 KB zip with NO
|
||||||
|
// video/last_30sec.webm.
|
||||||
|
//
|
||||||
|
// These tests drive the REAL src/offscreen/recorder.ts module via the
|
||||||
|
// established chrome-stub harness (same scaffold as
|
||||||
|
// tests/offscreen/port.test.ts + handshake.test.ts). No source-code
|
||||||
|
// re-implementation — the production module is what we're testing.
|
||||||
|
|
||||||
|
import { describe, it, expect, vi, beforeEach } from 'vitest';
|
||||||
|
|
||||||
|
interface PortStub {
|
||||||
|
name: string;
|
||||||
|
// Throw-on-disconnected-postMessage: simulates the real Chrome behaviour
|
||||||
|
// documented at https://developer.chrome.com/docs/extensions/develop/concepts/messaging
|
||||||
|
// — postMessage on a disconnected Port throws synchronously with
|
||||||
|
// "Attempting to use a disconnected port object".
|
||||||
|
postMessage: ReturnType<typeof vi.fn>;
|
||||||
|
onMessage: {
|
||||||
|
addListener: ReturnType<typeof vi.fn>;
|
||||||
|
// capture for synthetic dispatch
|
||||||
|
_listeners: Array<(msg: unknown) => void>;
|
||||||
|
};
|
||||||
|
onDisconnect: {
|
||||||
|
addListener: (fn: () => void) => void;
|
||||||
|
// capture for synthetic dispatch
|
||||||
|
_listeners: Array<() => void>;
|
||||||
|
};
|
||||||
|
disconnect: ReturnType<typeof vi.fn>;
|
||||||
|
// test seam — flip this to true to simulate the remote disconnect
|
||||||
|
// having taken effect before the next postMessage runs.
|
||||||
|
_disconnected: boolean;
|
||||||
|
}
|
||||||
|
|
||||||
|
interface ChromeStub {
|
||||||
|
runtime: {
|
||||||
|
id: string;
|
||||||
|
sendMessage: ReturnType<typeof vi.fn>;
|
||||||
|
onMessage: { addListener: ReturnType<typeof vi.fn> };
|
||||||
|
connect: () => PortStub;
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
interface GlobalWithChrome {
|
||||||
|
chrome?: ChromeStub;
|
||||||
|
MediaRecorder?: { isTypeSupported: (mime: string) => boolean };
|
||||||
|
}
|
||||||
|
|
||||||
|
// Build a PortStub whose postMessage throws once `_disconnected` flips true,
|
||||||
|
// matching Chrome's runtime semantics. The throw is the exact error string
|
||||||
|
// the offscreen console showed in UAT Test 3 ("Attempting to use a
|
||||||
|
// disconnected port object"), so the test fails loudly when the bug is
|
||||||
|
// reproduced — and is silent (pass) once the production code clears the
|
||||||
|
// ping interval BEFORE the disconnect-detection race window.
|
||||||
|
function makePortStub(): PortStub {
|
||||||
|
const port: PortStub = {
|
||||||
|
name: 'video-keepalive',
|
||||||
|
postMessage: vi.fn(),
|
||||||
|
onMessage: {
|
||||||
|
addListener: vi.fn(),
|
||||||
|
_listeners: [],
|
||||||
|
},
|
||||||
|
onDisconnect: {
|
||||||
|
addListener: (fn: () => void) => {
|
||||||
|
port.onDisconnect._listeners.push(fn);
|
||||||
|
},
|
||||||
|
_listeners: [],
|
||||||
|
},
|
||||||
|
disconnect: vi.fn(() => {
|
||||||
|
port._disconnected = true;
|
||||||
|
}),
|
||||||
|
_disconnected: false,
|
||||||
|
};
|
||||||
|
// Wire onMessage.addListener to capture into _listeners so the test can
|
||||||
|
// dispatch REQUEST_BUFFER synthetically.
|
||||||
|
port.onMessage.addListener.mockImplementation(
|
||||||
|
(fn: (msg: unknown) => void) => {
|
||||||
|
port.onMessage._listeners.push(fn);
|
||||||
|
}
|
||||||
|
);
|
||||||
|
// Wire postMessage to throw if the port is disconnected — Chrome's
|
||||||
|
// documented behaviour. This is the smoking gun for H1.
|
||||||
|
port.postMessage.mockImplementation((msg: unknown) => {
|
||||||
|
if (port._disconnected) {
|
||||||
|
throw new Error('Attempting to use a disconnected port object');
|
||||||
|
}
|
||||||
|
return msg;
|
||||||
|
});
|
||||||
|
return port;
|
||||||
|
}
|
||||||
|
|
||||||
|
function buildChromeStub(ports: PortStub[]): ChromeStub {
|
||||||
|
return {
|
||||||
|
runtime: {
|
||||||
|
id: 'ext-id-test',
|
||||||
|
sendMessage: vi.fn(),
|
||||||
|
onMessage: { addListener: vi.fn() },
|
||||||
|
connect: () => {
|
||||||
|
const port = makePortStub();
|
||||||
|
ports.push(port);
|
||||||
|
return port;
|
||||||
|
},
|
||||||
|
},
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
describe('port reconnect race (RED — confirms empty-archive-port-race H1+H2)', () => {
|
||||||
|
let originalSetInterval: typeof globalThis.setInterval;
|
||||||
|
let originalClearInterval: typeof globalThis.clearInterval;
|
||||||
|
let originalSetTimeout: typeof globalThis.setTimeout;
|
||||||
|
let originalClearTimeout: typeof globalThis.clearTimeout;
|
||||||
|
|
||||||
|
beforeEach(() => {
|
||||||
|
vi.resetModules();
|
||||||
|
(globalThis as unknown as GlobalWithChrome).MediaRecorder = {
|
||||||
|
isTypeSupported: vi.fn().mockReturnValue(true),
|
||||||
|
};
|
||||||
|
originalSetInterval = globalThis.setInterval;
|
||||||
|
originalClearInterval = globalThis.clearInterval;
|
||||||
|
originalSetTimeout = globalThis.setTimeout;
|
||||||
|
originalClearTimeout = globalThis.clearTimeout;
|
||||||
|
});
|
||||||
|
|
||||||
|
// ──────────────────────────────────────────────────────────────────────
|
||||||
|
// H1 — ping fires against disconnected port and throws
|
||||||
|
// ──────────────────────────────────────────────────────────────────────
|
||||||
|
//
|
||||||
|
// Production code path (src/offscreen/recorder.ts:623-625):
|
||||||
|
// pingIntervalId = setInterval(() => {
|
||||||
|
// keepalivePort?.postMessage({ type: 'PING' }); // throws if disconnected
|
||||||
|
// }, PORT_PING_MS);
|
||||||
|
//
|
||||||
|
// The `?.` only guards against keepalivePort === null. After the SW
|
||||||
|
// (or the SW's idle-eviction, or a remote disconnect) tears down the
|
||||||
|
// port, the offscreen-side Port reference is still non-null but
|
||||||
|
// postMessage throws synchronously with "Attempting to use a
|
||||||
|
// disconnected port object". The interval is cleared only in the
|
||||||
|
// onDisconnect handler — there's a race window during which the next
|
||||||
|
// queued interval callback fires before onDisconnect runs.
|
||||||
|
//
|
||||||
|
// Test strategy: intercept setInterval to capture the ping callback,
|
||||||
|
// then synthetically disconnect the port (without firing onDisconnect
|
||||||
|
// first — mimicking the race), then invoke the captured ping callback
|
||||||
|
// and assert it does NOT throw an uncaught error.
|
||||||
|
it('H1: ping callback handles a remotely-disconnected port without throwing', async () => {
|
||||||
|
const ports: PortStub[] = [];
|
||||||
|
const stub = buildChromeStub(ports);
|
||||||
|
(globalThis as unknown as GlobalWithChrome).chrome = stub;
|
||||||
|
|
||||||
|
// Intercept setInterval to capture the ping callback for direct invocation.
|
||||||
|
const intervalCallbacks: Array<() => void> = [];
|
||||||
|
globalThis.setInterval = ((cb: () => void, _ms: number) => {
|
||||||
|
intervalCallbacks.push(cb);
|
||||||
|
return 999 as unknown as ReturnType<typeof setInterval>;
|
||||||
|
}) as typeof globalThis.setInterval;
|
||||||
|
// No-op clearInterval — the test wants to keep the callback callable
|
||||||
|
// even after the production code attempts a clear, to expose the
|
||||||
|
// race-window assumption.
|
||||||
|
globalThis.clearInterval = (() => {}) as typeof globalThis.clearInterval;
|
||||||
|
|
||||||
|
await import('../../src/offscreen/recorder');
|
||||||
|
|
||||||
|
expect(ports.length).toBe(1);
|
||||||
|
expect(intervalCallbacks.length).toBeGreaterThanOrEqual(1);
|
||||||
|
|
||||||
|
const port = ports[0];
|
||||||
|
const pingCb = intervalCallbacks[0];
|
||||||
|
|
||||||
|
// Simulate the race: the remote side of the port has been
|
||||||
|
// disconnected (e.g. SW eviction, ~5 min cap, port glitch) but the
|
||||||
|
// offscreen's onDisconnect handler has NOT yet run, so the interval
|
||||||
|
// is still installed and keepalivePort is still non-null on the
|
||||||
|
// offscreen side.
|
||||||
|
port._disconnected = true;
|
||||||
|
|
||||||
|
// The PRODUCTION ping callback must not throw — either by checking
|
||||||
|
// port.postMessage safety first, or by wrapping the call in
|
||||||
|
// try/catch, or by removing the interval at first postMessage
|
||||||
|
// failure. Whatever the fix, an uncaught throw here would surface
|
||||||
|
// as the "Attempting to use a disconnected port object" error the
|
||||||
|
// operator observed in UAT Test 3.
|
||||||
|
expect(() => pingCb()).not.toThrow();
|
||||||
|
|
||||||
|
// Restore globals.
|
||||||
|
globalThis.setInterval = originalSetInterval;
|
||||||
|
globalThis.clearInterval = originalClearInterval;
|
||||||
|
});
|
||||||
|
|
||||||
|
// ──────────────────────────────────────────────────────────────────────
|
||||||
|
// H2 — BUFFER lost on stale-port reconnect; SW left with empty segments
|
||||||
|
// ──────────────────────────────────────────────────────────────────────
|
||||||
|
//
|
||||||
|
// Production code path (src/offscreen/recorder.ts:547-604):
|
||||||
|
// const finalized = getSegments();
|
||||||
|
// ...
|
||||||
|
// for (let idx = 0; idx < allSegments.length; idx++) {
|
||||||
|
// const data = await blobToBase64(segment); // ←── async window
|
||||||
|
// transferred.push(...);
|
||||||
|
// }
|
||||||
|
// if (keepalivePort !== portAtRequest) {
|
||||||
|
// logger.warn('port replaced during encode; dropping BUFFER response');
|
||||||
|
// return;
|
||||||
|
// }
|
||||||
|
// portAtRequest.postMessage({ type: 'BUFFER', segments: transferred });
|
||||||
|
//
|
||||||
|
// When the SW issues REQUEST_BUFFER during the ~10-50 ms reconnect
|
||||||
|
// window, the offscreen captures `portAtRequest = keepalivePort`,
|
||||||
|
// awaits the per-segment base64 encode (~150 ms for 3 segments), and
|
||||||
|
// by the time the encode completes the pre-emptive reconnect has
|
||||||
|
// swapped `keepalivePort` to a fresh Port. The guard correctly
|
||||||
|
// refuses to post on the stale Port. HOWEVER, no retry is issued —
|
||||||
|
// the SW's getVideoBufferFromOffscreen times out after 2 s and
|
||||||
|
// resolves `{ segments: [] }`. createArchive then takes the silent
|
||||||
|
// no-video branch (src/background/index.ts:350-352). The archive
|
||||||
|
// ships without video.
|
||||||
|
//
|
||||||
|
// This RED test pins the contract: when REQUEST_BUFFER arrives AND
|
||||||
|
// segments exist, BUFFER (with the encoded segments) MUST reach SOME
|
||||||
|
// live port, either:
|
||||||
|
// (a) by retrying on the freshly-connected port, OR
|
||||||
|
// (b) by re-issuing REQUEST_BUFFER on the new port from the
|
||||||
|
// reconnect handler when there's an unanswered request in
|
||||||
|
// flight.
|
||||||
|
// Whatever the strategy, silent loss with non-empty buffer is the
|
||||||
|
// bug. The current production code drops the response on the floor.
|
||||||
|
it('H2: REQUEST_BUFFER mid-reconnect must NOT drop BUFFER when segments exist', async () => {
|
||||||
|
const ports: PortStub[] = [];
|
||||||
|
const stub = buildChromeStub(ports);
|
||||||
|
(globalThis as unknown as GlobalWithChrome).chrome = stub;
|
||||||
|
|
||||||
|
const recorder = await import('../../src/offscreen/recorder');
|
||||||
|
|
||||||
|
// 1) Seed the ring buffer with 3 real Blob segments via the test
|
||||||
|
// seam (recorder.ts:85 — pushSegmentForTest). Each segment is a
|
||||||
|
// plausible WebM head (EBML magic + filler bytes). The 3 ×
|
||||||
|
// payload ensures the for-loop body actually awaits — opening
|
||||||
|
// the real reconnect race window.
|
||||||
|
const ebmlMagic = new Uint8Array([0x1a, 0x45, 0xdf, 0xa3]);
|
||||||
|
recorder.resetBuffer();
|
||||||
|
for (let i = 0; i < 3; i++) {
|
||||||
|
const payload = new Uint8Array(1024);
|
||||||
|
payload.set(ebmlMagic, 0);
|
||||||
|
payload.fill(i + 1, ebmlMagic.length);
|
||||||
|
recorder.pushSegmentForTest(
|
||||||
|
new Blob([payload], { type: 'video/webm' })
|
||||||
|
);
|
||||||
|
}
|
||||||
|
expect(recorder.getSegments().length).toBe(3);
|
||||||
|
|
||||||
|
expect(ports.length).toBe(1);
|
||||||
|
const oldPort = ports[0];
|
||||||
|
expect(oldPort.onMessage._listeners.length).toBeGreaterThanOrEqual(1);
|
||||||
|
|
||||||
|
// 2) SW issues REQUEST_BUFFER — synthetically dispatch to the
|
||||||
|
// captured onPortMessage listener. The production code kicks off
|
||||||
|
// encodeAndSendBuffer which awaits 3 blobToBase64 calls.
|
||||||
|
oldPort.onMessage._listeners[0]({ type: 'REQUEST_BUFFER' });
|
||||||
|
|
||||||
|
// 3) Mid-encode, the pre-emptive reconnect fires. Simulate by
|
||||||
|
// firing onDisconnect on the old port — production code's
|
||||||
|
// handler runs `keepalivePort = null` then `connectPort()`
|
||||||
|
// which spawns a new PortStub. The keepalivePort reference now
|
||||||
|
// points to the new port; the in-flight portAtRequest in the
|
||||||
|
// encode loop still points to oldPort, so the stale-port guard
|
||||||
|
// at line 597 will trigger.
|
||||||
|
oldPort.onDisconnect._listeners.forEach((fn) => fn());
|
||||||
|
|
||||||
|
// After reconnect a NEW port exists.
|
||||||
|
expect(ports.length).toBeGreaterThanOrEqual(2);
|
||||||
|
const newPort = ports[ports.length - 1];
|
||||||
|
|
||||||
|
// 4) Drain enough microtasks for the encode loop to complete and
|
||||||
|
// the post-await guard to evaluate. blobToBase64 awaits
|
||||||
|
// `blob.arrayBuffer()` (1 microtask) ×3 + the guard check.
|
||||||
|
// Generous: drain 32 microtasks then a macrotask. This is
|
||||||
|
// deterministic — vitest's Node env exposes setImmediate
|
||||||
|
// semantics via setTimeout(_, 0).
|
||||||
|
for (let i = 0; i < 32; i++) {
|
||||||
|
await Promise.resolve();
|
||||||
|
}
|
||||||
|
await new Promise<void>((resolve) => setTimeout(resolve, 0));
|
||||||
|
for (let i = 0; i < 32; i++) {
|
||||||
|
await Promise.resolve();
|
||||||
|
}
|
||||||
|
|
||||||
|
// CONTRACT — pick exactly one:
|
||||||
|
//
|
||||||
|
// (a) BUFFER posted on the NEW port (segments encoded, retry on
|
||||||
|
// fresh port works), OR
|
||||||
|
// (b) BUFFER posted on the OLD port (encode completed and the
|
||||||
|
// post squeezed in before the swap took effect — acceptable
|
||||||
|
// because the SW's per-request listener is still bound to it
|
||||||
|
// and the SW gets the data).
|
||||||
|
//
|
||||||
|
// The current production code posts NEITHER when the reconnect
|
||||||
|
// arrives between the await and the post (the line 597 guard
|
||||||
|
// returns silently). This assertion FAILS today, proving the H2
|
||||||
|
// bug. The fix must surface SOME signal that the SW can act on —
|
||||||
|
// silent loss when the buffer is non-empty is the bug.
|
||||||
|
|
||||||
|
const oldPortBuffers = oldPort.postMessage.mock.calls.filter(
|
||||||
|
(call: unknown[]) =>
|
||||||
|
typeof call[0] === 'object' &&
|
||||||
|
call[0] !== null &&
|
||||||
|
(call[0] as { type?: unknown }).type === 'BUFFER'
|
||||||
|
);
|
||||||
|
const newPortBuffers = newPort.postMessage.mock.calls.filter(
|
||||||
|
(call: unknown[]) =>
|
||||||
|
typeof call[0] === 'object' &&
|
||||||
|
call[0] !== null &&
|
||||||
|
(call[0] as { type?: unknown }).type === 'BUFFER'
|
||||||
|
);
|
||||||
|
|
||||||
|
const totalBuffersPosted = oldPortBuffers.length + newPortBuffers.length;
|
||||||
|
// Stronger assertion: when BUFFER reaches a port, it must carry
|
||||||
|
// the 3 segments we seeded. A 0-segment BUFFER is the silent-loss
|
||||||
|
// shape and should not satisfy the contract.
|
||||||
|
const allBufferCalls = [...oldPortBuffers, ...newPortBuffers];
|
||||||
|
const segmentBearingBuffers = allBufferCalls.filter((call) => {
|
||||||
|
const msg = call[0] as { segments?: unknown };
|
||||||
|
return Array.isArray(msg.segments) && msg.segments.length > 0;
|
||||||
|
});
|
||||||
|
expect(totalBuffersPosted).toBeGreaterThanOrEqual(1);
|
||||||
|
expect(segmentBearingBuffers.length).toBeGreaterThanOrEqual(1);
|
||||||
|
|
||||||
|
// Cleanup test seam state so we don't bleed segments across tests.
|
||||||
|
recorder.resetBuffer();
|
||||||
|
});
|
||||||
|
|
||||||
|
// ──────────────────────────────────────────────────────────────────────
|
||||||
|
// H1.b — pre-emptive reconnect path specifically. Walk the same
|
||||||
|
// sequence the operator hit at 290 s: the production code calls
|
||||||
|
// `keepalivePort?.disconnect()` and the test then invokes the ping
|
||||||
|
// callback that was installed BEFORE the reconnect. This proves the
|
||||||
|
// race window in the production timer-clear ordering.
|
||||||
|
// ──────────────────────────────────────────────────────────────────────
|
||||||
|
it('H1.b: pre-emptive reconnect path — ping does not throw against just-disconnected port', async () => {
|
||||||
|
const ports: PortStub[] = [];
|
||||||
|
const stub = buildChromeStub(ports);
|
||||||
|
(globalThis as unknown as GlobalWithChrome).chrome = stub;
|
||||||
|
|
||||||
|
// Capture both interval callbacks AND timeout callbacks for direct
|
||||||
|
// invocation. We need to fire the pre-emptive setTimeout
|
||||||
|
// synthetically (it would otherwise wait 290 s of wall-clock time).
|
||||||
|
const intervalCallbacks: Array<() => void> = [];
|
||||||
|
const timeoutCallbacks: Array<() => void> = [];
|
||||||
|
|
||||||
|
globalThis.setInterval = ((cb: () => void, _ms: number) => {
|
||||||
|
intervalCallbacks.push(cb);
|
||||||
|
return 100 as unknown as ReturnType<typeof setInterval>;
|
||||||
|
}) as typeof globalThis.setInterval;
|
||||||
|
globalThis.setTimeout = ((cb: () => void, _ms: number) => {
|
||||||
|
timeoutCallbacks.push(cb);
|
||||||
|
return 200 as unknown as ReturnType<typeof setTimeout>;
|
||||||
|
}) as typeof globalThis.setTimeout;
|
||||||
|
// No-op the clear* so the captured callbacks remain callable —
|
||||||
|
// surfaces the race-window assumption in the production code.
|
||||||
|
globalThis.clearInterval = (() => {}) as typeof globalThis.clearInterval;
|
||||||
|
globalThis.clearTimeout = (() => {}) as typeof globalThis.clearTimeout;
|
||||||
|
|
||||||
|
await import('../../src/offscreen/recorder');
|
||||||
|
|
||||||
|
expect(ports.length).toBe(1);
|
||||||
|
const port = ports[0];
|
||||||
|
|
||||||
|
// intervalCallbacks[0] is the ping callback (first setInterval call).
|
||||||
|
// timeoutCallbacks contains at least the pre-emptive reconnect timer
|
||||||
|
// (recorder.ts:626). Other setTimeouts may exist (e.g. queueMicrotask
|
||||||
|
// shims, scheduleRotation — but the latter is gated on mediaStream
|
||||||
|
// which is null in this test).
|
||||||
|
expect(intervalCallbacks.length).toBeGreaterThanOrEqual(1);
|
||||||
|
expect(timeoutCallbacks.length).toBeGreaterThanOrEqual(1);
|
||||||
|
|
||||||
|
const pingCb = intervalCallbacks[0];
|
||||||
|
// The pre-emptive reconnect timeout is the only setTimeout the
|
||||||
|
// bootstrap installs (rotation is gated on mediaStream). Pick the
|
||||||
|
// first to keep the test resilient.
|
||||||
|
const preemptiveReconnect = timeoutCallbacks[0];
|
||||||
|
|
||||||
|
// 1) Fire pre-emptive reconnect — the production code calls
|
||||||
|
// keepalivePort?.disconnect(). Our stub flips _disconnected=true
|
||||||
|
// synchronously (matching Chrome's local disconnect semantics).
|
||||||
|
preemptiveReconnect();
|
||||||
|
expect(port._disconnected).toBe(true);
|
||||||
|
expect(port.disconnect).toHaveBeenCalled();
|
||||||
|
|
||||||
|
// 2) Now the race: an already-queued ping callback (e.g. scheduled
|
||||||
|
// before the reconnect) fires against the just-disconnected port.
|
||||||
|
// Production code: keepalivePort?.postMessage({type:'PING'}) —
|
||||||
|
// keepalivePort is still the old non-null reference at this point
|
||||||
|
// (the onDisconnect handler may not have run yet in the same
|
||||||
|
// macrotask), so postMessage runs against the disconnected port
|
||||||
|
// and our stub throws — exactly matching the UAT error message.
|
||||||
|
//
|
||||||
|
// The fix must make this safe. Acceptable strategies:
|
||||||
|
// - Clear the interval as the FIRST step of the pre-emptive
|
||||||
|
// reconnect, before .disconnect().
|
||||||
|
// - Wrap postMessage in try/catch inside the ping callback.
|
||||||
|
// - Switch to a per-port closure (capturedPort) so a stale ping
|
||||||
|
// can detect it's pinging a swapped port.
|
||||||
|
expect(() => pingCb()).not.toThrow();
|
||||||
|
|
||||||
|
// Restore globals.
|
||||||
|
globalThis.setInterval = originalSetInterval;
|
||||||
|
globalThis.clearInterval = originalClearInterval;
|
||||||
|
globalThis.setTimeout = originalSetTimeout;
|
||||||
|
globalThis.clearTimeout = originalClearTimeout;
|
||||||
|
});
|
||||||
|
});
|
||||||
Reference in New Issue
Block a user