docs(option-c): archive empty-archive-port-race + amend CONTEXT.md D-17 port lifecycle

Two doc updates closing the debug session, following the resolved-session
pattern this phase has established (cf. resolved/d12-blob-port-transfer-fails.md
and resolved/webm-playback-freeze.md):

1. **Move debug session to resolved/** with the Resolution section
   filled in (root_cause, fix, verification, files_changed). Status
   flipped tdd_red_confirmed -> resolved. Original investigation
   notes, bisect results, and the Option C strategy spec are all
   preserved in place; the file is the full provenance trail.

2. **Amend 01-CONTEXT.md D-17** with the new port lifecycle commitments.
   Append-only (D-17 itself untouched) per the doc cascade rule
   established earlier this phase ("amendments append, do not replace,
   to preserve SPEC provenance"). The amendment narrates:
   - What was Claude's-discretion at Phase 1 plan time has been
     specified by Option C.
   - The 290 s pre-emptive setTimeout reconnect (Pitfall 4) is RETIRED.
   - The architectural commitments added: PING/PONG health probe,
     request-id'd REQUEST_BUFFER/BUFFER, SW retry on port replacement,
     outer 10 s hard-timeout, operator-visible EmptyVideoBufferError
     surface.
   - The 4 pinning contracts added (port-health-probe,
     request-id-protocol, port-lifecycle-continuous, plus the
     refactored port-reconnect-race).

Suite remains 11 files / 53 tests, all GREEN. Quality gates intact.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -299,5 +299,74 @@ phase, so downstream phases see a consistent baseline:
---
## Amendment (Phase 01-stabilize-video-pipeline, 2026-05-16) — D-17 port lifecycle
- **AMENDED-BY:** debug session `empty-archive-port-race` (Option C, resolved 2026-05-16)
- **Replaces nothing.** D-17 above stands as the original port-keepalive
  contract. This amendment narrows the port LIFECYCLE shape that was
  Claude's-discretion at Phase 1 plan time.
- **Background.** UAT Test 3 surfaced two coupled defects: 3× Uncaught
  "Attempting to use a disconnected port object" in the offscreen console
  starting at the 290 s pre-emptive reconnect mark, AND a silent
  empty-video archive shipping to the operator. Bisect proved the H1
  port race was introduced by Plan 01-04 (commit `b064a21`) and the H2
  silent-skip in `createArchive` was an upstream defect from commit
  `555eb05` that the port race amplified from latent to fatal. Full
  lineage and strategy rationale: `.planning/debug/resolved/empty-archive-port-race.md`.
- **Architectural commitments retired:**
  - The legacy 290 000 ms pre-emptive `setTimeout` reconnect (Pitfall 4)
    is RETIRED. Its race window between the synchronous `.disconnect()`
    and the `onDisconnect` handler firing was the proximate cause of
    H1; see the bisect notes.
- **Architectural commitments added:**
  - **Port health probe (PING ↔ PONG).** The offscreen `PORT_PING_MS`
    interval doubles as a liveness probe; each PING expects a PONG echo
    from the SW. The offscreen tracks `missedPongs` and reconnects
    cleanly when the count exceeds `MAX_MISSED_PONGS = 3` (~75 s of
    unresponsive SW, well past Chrome's ~30 s idle threshold, so a
    real disconnect would already have surfaced its own
    `onDisconnect`). The SW echoes PONG on every PING via the
    onConnect-level message sink.
  - **Request-id'd REQUEST_BUFFER / BUFFER.** Every `REQUEST_BUFFER`
    carries a SW-generated `requestId` (`crypto.randomUUID` with a
    `Math.random` fallback). The offscreen echoes the same `requestId`
    on the BUFFER response. The SW routes BUFFER → pending request via
    a module-level `Map<requestId, PendingBufferRequest>`. The map is
    port-agnostic, so port replacement does not lose the response.
  - **Retry on port replacement.** Every `onConnect` (post-bootstrap)
    scans `pendingBufferRequests` and re-issues REQUEST_BUFFER on the
    fresh port with the SAME requestId. The offscreen posts BUFFER on
    the CURRENT `keepalivePort`, the sink matches by id, and the
    request resolves. This retires the H2 silent-drop class
    architecturally.
  - **Outer hard-timeout bumped 2 s → 10 s.** The legacy per-port
    `BUFFER_FETCH_TIMEOUT_MS = 2000` was too tight to cover a retry
    across a reconnect. The new outer budget covers EVERY retry across
    port replacements; the inner per-port round-trip is still
    ~100-200 ms.
  - **Operator-visible failure surface.** `createArchive` throws
    `EmptyVideoBufferError` when the video buffer is empty. `saveArchive`
    catches it and emits `{type:'RECORDING_ERROR', error:'empty-video-buffer'}`
    via `chrome.runtime.sendMessage` for the popup, AND returns
    `{success:false, error}` in the direct-response path. This replaces
    the upstream silent-skip branch in `createArchive` that shipped a
    video-less archive in silence.
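
The request-id routing, retry-on-replacement, and outer-timeout commitments above can be sketched together as follows. This is a minimal illustration under stated assumptions, not the code from the commit: `PortLike`, `requestBuffer`, `onPortReplaced`, `onBufferMessage`, and `OUTER_TIMEOUT_MS` are hypothetical names, and the segment payload type (`Uint8Array[]`) is assumed. Only `requestId`, `pendingBufferRequests`, `PendingBufferRequest`, and the 10 s budget come from the amendment.

```typescript
// Hypothetical minimal port shape standing in for chrome.runtime.Port.
interface PortLike {
  postMessage(msg: unknown): void;
}

interface PendingBufferRequest {
  resolve: (segments: Uint8Array[]) => void;
  reject: (err: Error) => void;
  timer: ReturnType<typeof setTimeout>;
}

const OUTER_TIMEOUT_MS = 10_000; // outer budget named in the amendment

// Module-level and keyed by id only, so it survives port replacement.
const pendingBufferRequests = new Map<string, PendingBufferRequest>();

function newRequestId(): string {
  // crypto.randomUUID where available, Math.random fallback otherwise.
  const c = (globalThis as unknown as { crypto?: { randomUUID?: () => string } }).crypto;
  return c?.randomUUID?.() ?? `req-${Math.random().toString(36).slice(2)}`;
}

function requestBuffer(port: PortLike, timeoutMs = OUTER_TIMEOUT_MS): Promise<Uint8Array[]> {
  return new Promise<Uint8Array[]>((resolve, reject) => {
    const requestId = newRequestId();
    // One timer covers the whole request, including retries on new ports.
    const timer = setTimeout(() => {
      pendingBufferRequests.delete(requestId);
      reject(new Error(`REQUEST_BUFFER ${requestId} timed out`));
    }, timeoutMs);
    pendingBufferRequests.set(requestId, { resolve, reject, timer });
    port.postMessage({ type: 'REQUEST_BUFFER', requestId });
  });
}

// Assumed to be called from each post-bootstrap onConnect:
// re-issue every pending request on the fresh port with the SAME id.
function onPortReplaced(freshPort: PortLike): void {
  for (const requestId of pendingBufferRequests.keys()) {
    freshPort.postMessage({ type: 'REQUEST_BUFFER', requestId });
  }
}

// Message sink: match BUFFER to its pending request by id, not by port.
function onBufferMessage(msg: { type: 'BUFFER'; requestId: string; segments: Uint8Array[] }): void {
  const pending = pendingBufferRequests.get(msg.requestId);
  if (!pending) return; // unknown or already-settled id: ignore
  clearTimeout(pending.timer);
  pendingBufferRequests.delete(msg.requestId);
  pending.resolve(msg.segments);
}
```

Because the pending map is keyed by `requestId` rather than by port, a BUFFER that arrives on a replacement port settles the original promise, and a duplicate BUFFER for an already-settled id is dropped.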
- **Pinning contracts added:**
  - `tests/offscreen/port-health-probe.test.ts`: pins the PING/PONG +
    request-id'd encode contract on the offscreen side (4 tests).
  - `tests/background/request-id-protocol.test.ts`: pins the SW-side
    request-id routing + retry + error-surface contract (5 tests).
  - `tests/background/port-lifecycle-continuous.test.ts`: continuous
    600 s end-to-end simulation: 12 ping/pong cycles + 2 forced
    reconnects + 3 SAVE_ARCHIVE round-trips, asserts no Uncaught and
    every BUFFER carries segments.
  - `tests/offscreen/port-reconnect-race.test.ts` (refactored): H1.b
    no longer pins the retired 290 s setTimeout path; it now pins
    the externally-disconnected-port → ping try/catch → reconnect path
    that the H1 fix delivers.
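
As a rough illustration of the behavior the `port-health-probe` and `port-reconnect-race` contracts pin, the PING/PONG probe and the ping try/catch reconnect path might look like the sketch below. The class name `PortHealthProbe`, the `PortLike` shape, `tick`, `onPong`, and `reconnectCount` are hypothetical. The exact threshold semantics are also an assumption: here a reconnect fires once three consecutive PINGs have gone unanswered.

```typescript
// Hypothetical minimal port shape standing in for chrome.runtime.Port.
interface PortLike {
  postMessage(msg: unknown): void;
  disconnect(): void;
}

const MAX_MISSED_PONGS = 3; // value stated in the amendment

class PortHealthProbe {
  reconnectCount = 0;
  private missedPongs = 0;
  private awaitingPong = false;
  private port: PortLike;
  private readonly connect: () => PortLike;

  constructor(connect: () => PortLike) {
    this.connect = connect;
    this.port = connect();
  }

  // Assumed to be called once per PORT_PING_MS interval tick.
  tick(): void {
    if (this.awaitingPong) this.missedPongs += 1; // previous PING went unanswered
    if (this.missedPongs >= MAX_MISSED_PONGS) {
      this.reconnect();
      return;
    }
    this.sendPing();
  }

  // Called when the SW echoes PONG.
  onPong(): void {
    this.awaitingPong = false;
    this.missedPongs = 0;
  }

  private sendPing(): void {
    try {
      this.port.postMessage({ type: 'PING' });
      this.awaitingPong = true;
    } catch {
      // Port was disconnected externally: recover instead of letting
      // "Attempting to use a disconnected port object" go uncaught.
      this.reconnect();
    }
  }

  private reconnect(): void {
    try {
      this.port.disconnect();
    } catch {
      // Already disconnected; nothing to clean up.
    }
    this.port = this.connect();
    this.missedPongs = 0;
    this.awaitingPong = false;
    this.reconnectCount += 1;
  }
}
```

Under these assumptions, four ticks with no PONG send three PINGs and then trigger one reconnect, while a PONG after every PING never reconnects. The sketch has no scheduled reconnect timer; it reconnects only from the missed-PONG count or from a throwing `postMessage`.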
---
*Phase: 01-stabilize-video-pipeline*
*Context gathered: 2026-05-15*
*Amended: 2026-05-16 (debug session empty-archive-port-race, Option C)*