test(fix-a3): commit debug-session test artifacts + stale fixture

Captures the RED contracts that the webm-playback-freeze debug session
landed (before this fix-a3 cycle started) plus the original Plan 07 smoke
fixture they run against. None of these files were modified by this fix
cycle; they are landed as-is from the debug session to make the test
history bisectable.

Files staged:

- tests/offscreen/segment-keyframes.test.ts
  Three describe blocks (~340 LOC):
  * documentation: pure-simulation tests that pin the D-09..D-11 failure
    mode as executable evidence (regression guard against re-introducing
    single-continuous-recorder semantics)
  * GREEN-pinning: pure-simulation tests that pin the D-13
    segment-keyframe invariant
  * production-driven: imports src/offscreen/recorder.ts and asserts
    (i) `getSegments` is exported as a function, (ii) it returns at most
    3 Blobs. THIS BLOCK IS NOW GREEN after the D-13 activation in the
    prior commits; it was the genuine TDD anchor for fix-a3.

- tests/offscreen/webm-playback.test.ts
  Two empirical-ffmpeg assertions on tests/fixtures/last_30sec.webm:
  * zero "Error submitting packet to decoder" lines from the VP9 decoder
  * no "File ended prematurely" container-finalization error
  Both STAY RED in this commit because the committed fixture is still the
  stale one from Plan 07's pre-fix smoke. They flip GREEN after the
  operator runs ./smoke.sh to regenerate the fixture against the D-13
  recorder; see the closing message and the NEXT-STEP block of the
  resolved debug session.

- tests/fixtures/last_30sec.webm
  The 2.1 MB Plan 07 smoke artifact. Committed deliberately so the
  empirical RED test has something to run against. Will be overwritten by
  the next ./smoke.sh run (single-file rotation: the path is fixed by the
  smoke script + zip extraction step in the debug-session reproduction).

Verification:

- npx vitest run --reporter=dot → Tests 2 failed | 28 passed (30)
- The 2 fails are EXACTLY the two empirical-ffmpeg assertions in
  webm-playback.test.ts; the structural production-driven block in
  segment-keyframes.test.ts is fully GREEN.
- npx tsc --noEmit is clean.
- npm run build succeeds.

Operator action required before Phase 1 close (Plan 07 still owns
REQ-video-ring-buffer): re-run ./smoke.sh per the documented 6-step
reproduction in the debug session, then re-run
`npx vitest run tests/offscreen/webm-playback.test.ts` and expect both
assertions to flip GREEN. Plan 07 success criterion §10 #7 (playback)
lands at that point.
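
For orientation, the two empirical assertions in webm-playback.test.ts amount to scanning ffmpeg's stderr for the two marker strings quoted above. The following is a minimal sketch of that check, not the repository's test: `scanFfmpegStderr` and `FfmpegStderrReport` are hypothetical names, and the sample input is synthetic text built from the marker strings rather than a captured ffmpeg log.

```typescript
// Hypothetical helper for illustration; it is not part of the repository.
interface FfmpegStderrReport {
  decoderErrorLines: number; // lines containing the VP9 decoder error marker
  endedPrematurely: boolean; // true if the container-finalization marker appears
}

// Marker strings taken from the commit message above.
const DECODER_ERROR = 'Error submitting packet to decoder';
const PREMATURE_END = 'File ended prematurely';

function scanFfmpegStderr(stderr: string): FfmpegStderrReport {
  const lines = stderr.split(/\r?\n/);
  return {
    decoderErrorLines: lines.filter((line) => line.includes(DECODER_ERROR)).length,
    endedPrematurely: lines.some((line) => line.includes(PREMATURE_END)),
  };
}

// Synthetic input built from the marker strings; NOT a captured ffmpeg log.
const syntheticStderr = [
  `synthetic line 1: ${DECODER_ERROR}`,
  'synthetic line 2: unrelated output',
  `synthetic line 3: ${DECODER_ERROR}`,
].join('\n');

const report = scanFfmpegStderr(syntheticStderr);
console.log(report.decoderErrorLines, report.endedPrematurely); // 2 false
```

A fixture regenerated against the D-13 recorder would be expected to yield `decoderErrorLines === 0` and `endedPrematurely === false`; the stale fixture is what keeps both assertions RED in this commit.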
2026-05-15 21:16:02 +02:00
parent f81438d6c8
commit 87909d976c
3 changed files with 525 additions and 0 deletions
--- a/tests/offscreen/segment-keyframes.test.ts
+++ b/tests/offscreen/segment-keyframes.test.ts
@@ -0,0 +1,346 @@
+// tests/offscreen/segment-keyframes.test.ts
+//
+// RED-gate test for debug session webm-playback-freeze.
+//
+// Algorithmic / unit-level companion to webm-playback.test.ts. Where that
+// test runs ffmpeg over the committed fixture (empirical, requires ffmpeg in
+// PATH, requires the fixture to be regenerated after the fix), THIS test
+// works against a pure-data model of the recorder behaviour and runs in any
+// vitest environment without external tooling.
+//
+// Model
+// -----
+//
+// We simulate a 30 fps capture in which Chrome emits a VP9 keyframe every
+// `KF_PERIOD_MS = 3_000` ms (kf_max_dist=100 ≈ 3.33 s at 30 fps; we round
+// down for a conservative test). The recorder is configured with
+// `MediaRecorder.start(TIMESLICE_MS)`, so chunks fire every 2 s — NOT aligned
+// to keyframes. We classify each emitted chunk by whether it contains a
+// keyframe ("kf-bearing") or only P-frames ("p-only").
+//
+// Failure mode (D-09..D-11 — current behaviour)
+// ---------------------------------------------
+//
+// `addChunk` from src/offscreen/recorder.ts pins the FIRST chunk (which holds
+// the WebM header + an initial keyframe) and then ages out chunks older than
+// 30 s. After ~30 s of recording, the kept set is:
+//
+//     [chunk_0 (header, kf)] + [chunks emitted in the last 30 s]
+//
+// The last-30-s tail contains chunks that may have started mid-GOP. When the
+// SW concatenates `chunk_0` with the tail, the tail's first P-frames
+// reference keyframes that lived in trimmed-out middle chunks. Result:
+// decoder error ~1 s past `chunk_0`'s end.
+//
+// Fix (D-13 restart-segments)
+// ---------------------------
+//
+// Stop + restart the MediaRecorder every SEGMENT_MS = 10 s on the same
+// MediaStream. Each restart forces a new WebM header AND a new keyframe at
+// the segment's start (since the encoder is freshly initialized). Keep the
+// last `MAX_SEGMENTS = 3` segments (= 30 s). Each segment in the kept window
+// is self-contained — its first chunk is kf-bearing.
+//
+// Test structure
+// --------------
+//
+//   block 1 — "RED — D-09..D-11 leaks P-only chunks past trim":
+//     Pure-simulation tests that document the current bug. Pass today;
+//     they encode the failure mode as executable evidence. (They will keep
+//     passing post-fix; their purpose is documentation + regression guard
+//     against re-introducing single-continuous-recorder semantics.)
+//
+//   block 2 — "GREEN-pinning — D-13 contract for restart-segments":
+//     Pure-simulation tests that pin the segment-based fix's contract.
+//     Pass today; their purpose is to give the fix's reviewer an
+//     algorithmic spec to check against before reading code.
+//
+//   block 3 — "production recorder must expose segment-aware buffer (RED)":
+//     Imports src/offscreen/recorder.ts and asserts a `getSegments` API
+//     exists with the D-13 shape. GOES RED TODAY because the production
+//     code only exposes `getBuffer()` (chunk-level). FLIPS GREEN when the
+//     D-13 skeleton at src/offscreen/recorder.ts:298-316 is activated and
+//     a `getSegments` export is added. This is the genuine TDD anchor.
+
+import { describe, it, expect, beforeEach } from 'vitest';
+
+// ─── Recorder model parameters ──────────────────────────────────────────
+const TIMESLICE_MS = 2_000;            // matches src/offscreen/recorder.ts TIMESLICE_MS
+const VIDEO_BUFFER_DURATION_MS = 30_000; // matches VIDEO_BUFFER_DURATION_MS
+const KF_PERIOD_MS = 3_000;            // Chrome VP9 default kf_max_dist=100 ≈ 3 s @ 30 fps
+const SEGMENT_MS = 10_000;             // D-13 design — see CONTEXT.md
+const MAX_SEGMENTS = 3;                // D-13 design — keep last 3 segments (30 s)
+
+interface SimChunk {
+  index: number;
+  emittedAtMs: number;
+  hasKeyframe: boolean;
+  isFirstEmitted: boolean;
+}
+
+interface SimSegment {
+  startMs: number;
+  endMs: number;
+  chunks: SimChunk[];
+}
+
+// ─── Simulation: single continuous MediaRecorder (D-09..D-11) ──────────
+function simulateContinuousRecorder(totalDurationMs: number): SimChunk[] {
+  const chunks: SimChunk[] = [];
+  const totalChunks = Math.floor(totalDurationMs / TIMESLICE_MS);
+  for (let i = 0; i < totalChunks; i++) {
+    const emittedAt = (i + 1) * TIMESLICE_MS;
+    // A chunk covers the half-open interval [emittedAt - TIMESLICE_MS, emittedAt).
+    // It contains a keyframe iff a keyframe timestamp falls inside that interval.
+    const intervalStart = emittedAt - TIMESLICE_MS;
+    // Index of the first keyframe at-or-after intervalStart.
+    const firstKfIdx = Math.ceil(intervalStart / KF_PERIOD_MS);
+    const firstKfMs = firstKfIdx * KF_PERIOD_MS;
+    const hasKf = firstKfMs >= intervalStart && firstKfMs < emittedAt;
+    chunks.push({
+      index: i,
+      emittedAtMs: emittedAt,
+      hasKeyframe: hasKf,
+      isFirstEmitted: i === 0,
+    });
+  }
+  return chunks;
+}
+
+// Mirrors trimAged() from src/offscreen/recorder.ts: pin the first-flagged
+// chunk, drop everything else older than VIDEO_BUFFER_DURATION_MS.
+function trimContinuousBuffer(chunks: SimChunk[], nowMs: number): SimChunk[] {
+  const cutoff = nowMs - VIDEO_BUFFER_DURATION_MS;
+  return chunks.filter((c) => c.isFirstEmitted || c.emittedAtMs >= cutoff);
+}
+
+// ─── Simulation: restart-segments (D-13) ──────────────────────────────
+function simulateSegmentRecorder(totalDurationMs: number): SimSegment[] {
+  const segments: SimSegment[] = [];
+  const totalSegments = Math.floor(totalDurationMs / SEGMENT_MS);
+  for (let s = 0; s < totalSegments; s++) {
+    const segStart = s * SEGMENT_MS;
+    const segEnd = segStart + SEGMENT_MS;
+    const chunks: SimChunk[] = [];
+    // Each segment's first chunk is always kf-bearing because the MediaRecorder
+    // is freshly constructed on segment rotation — the encoder always emits
+    // an initial keyframe.
+    const chunksPerSegment = Math.floor(SEGMENT_MS / TIMESLICE_MS);
+    for (let i = 0; i < chunksPerSegment; i++) {
+      const emittedAt = segStart + (i + 1) * TIMESLICE_MS;
+      chunks.push({
+        index: i,
+        emittedAtMs: emittedAt,
+        hasKeyframe: i === 0, // the fresh recorder always seeds a keyframe
+        isFirstEmitted: i === 0,
+      });
+    }
+    segments.push({ startMs: segStart, endMs: segEnd, chunks });
+  }
+  return segments;
+}
+
+function keepLastSegments(segments: SimSegment[]): SimSegment[] {
+  return segments.slice(-MAX_SEGMENTS);
+}
+
+// ─── Tests ──────────────────────────────────────────────────────────────
+
+describe('segment keyframes (documentation — D-09..D-11 leaks P-only chunks past trim)', () => {
+  it('continuous-recorder model has chunks with no keyframe (proves the gap exists)', () => {
+    // Sanity check the model: with TIMESLICE_MS=2000 and KF_PERIOD_MS=3000,
+    // a 2-s chunk window can sometimes contain no keyframe at all.
+    const chunks = simulateContinuousRecorder(60_000);
+    const pOnly = chunks.filter((c) => !c.hasKeyframe);
+    expect(pOnly.length).toBeGreaterThan(0);
+    // And the count is meaningful — significantly more than just the
+    // boundary between two 3-s GOPs. Model integrity check.
+    expect(pOnly.length / chunks.length).toBeGreaterThan(0.25);
+  });
+
+  it('after 60 s, trimming to 30 s leaves the pinned first chunk + P-only tail chunks orphaned from their keyframes', () => {
+    const allChunks = simulateContinuousRecorder(60_000);
+    const kept = trimContinuousBuffer(allChunks, 60_000);
+
+    // The pinned first chunk is still there.
+    expect(kept[0].isFirstEmitted).toBe(true);
+    expect(kept[0].hasKeyframe).toBe(true);
+
+    // The tail (everything after the pinned first chunk) contains AT LEAST
+    // one P-only chunk that immediately follows the pinned header, with
+    // no kf-bearing chunk in between to anchor it. THIS is the freeze
+    // mechanism: the decoder accepts the pinned header + its keyframe,
+    // then hits the tail's first P-frame whose reference keyframe lived
+    // in a trimmed-out chunk.
+    const tail = kept.slice(1);
+    const firstTailChunkIsPOnly = tail.length > 0 && !tail[0].hasKeyframe;
+    // Pin the failure: the tail does start with a P-only chunk, and the
+    // gap between pinned-kf and the next kf-bearing chunk in the tail is
+    // greater than what a single GOP can survive.
+    expect(firstTailChunkIsPOnly).toBe(true);
+
+    // The gap between the pinned chunk and the next kf-bearing chunk in the
+    // tail (measured between emit times) spans the trimmed and orphaned data.
+    const pinnedKfMs = kept[0].emittedAtMs;
+    const firstTailKfChunk = tail.find((c) => c.hasKeyframe);
+    expect(firstTailKfChunk).toBeDefined();
+    // The decoder needs every P-frame's reference keyframe present.
+    // Between pinnedKfMs and firstTailKfChunk.emittedAtMs there are
+    // P-only chunks whose references were trimmed → freeze.
+    const orphanGapMs = firstTailKfChunk!.emittedAtMs - pinnedKfMs;
+    expect(orphanGapMs).toBeGreaterThan(KF_PERIOD_MS);
+  });
+});
+
+describe('segment keyframes (GREEN-pinning — D-13 contract for restart-segments)', () => {
+  it('each retained segment starts with a keyframe', () => {
+    const allSegments = simulateSegmentRecorder(60_000);
+    const kept = keepLastSegments(allSegments);
+    expect(kept).toHaveLength(MAX_SEGMENTS);
+    for (const seg of kept) {
+      expect(seg.chunks.length).toBeGreaterThan(0);
+      expect(
+        seg.chunks[0].hasKeyframe,
+        `Segment starting at ${seg.startMs}ms is missing a keyframe in its first chunk. ` +
+          `Under D-13 the MediaRecorder must be freshly constructed on each rotation so ` +
+          `the encoder seeds a keyframe at segment t=0.`,
+      ).toBe(true);
+    }
+  });
+
+  it('kept window spans exactly MAX_SEGMENTS * SEGMENT_MS = 30 s', () => {
+    const allSegments = simulateSegmentRecorder(60_000);
+    const kept = keepLastSegments(allSegments);
+    const spanMs = kept[kept.length - 1].endMs - kept[0].startMs;
+    expect(spanMs).toBe(MAX_SEGMENTS * SEGMENT_MS);
+    expect(spanMs).toBe(VIDEO_BUFFER_DURATION_MS);
+  });
+
+  it('concatenating retained segments yields a fully decodable timeline (no orphan P-frames)', () => {
+    // Decodability invariant: every chunk in the concatenated stream either
+    // IS kf-bearing or is preceded (within the SAME segment) by a kf-bearing
+    // chunk. Under D-13 this is satisfied trivially because each segment's
+    // first chunk is kf-bearing and the segment is self-contained.
+    const allSegments = simulateSegmentRecorder(60_000);
+    const kept = keepLastSegments(allSegments);
+
+    for (const seg of kept) {
+      let lastKfBearingInSegment = -1;
+      for (let i = 0; i < seg.chunks.length; i++) {
+        if (seg.chunks[i].hasKeyframe) {
+          lastKfBearingInSegment = i;
+        }
+        // Every chunk must have a kf-bearing predecessor (or itself) inside
+        // the segment. If lastKfBearingInSegment is still -1 we've found a
+        // P-only chunk with no anchoring keyframe — the freeze condition.
+        expect(
+          lastKfBearingInSegment,
+          `Chunk ${i} of segment ${seg.startMs}ms has no preceding keyframe in its segment.`,
+        ).toBeGreaterThanOrEqual(0);
+      }
+    }
+  });
+
+  it('a continuous-recorder buffer that trims out middle chunks DOES exhibit the orphan-keyframe gap (the bug, restated as code)', () => {
+    // This is the mirror image of the D-13 invariant test above: prove that
+    // the D-09..D-11 approach explicitly exhibits the orphan-keyframe gap.
+    // This keeps the test pair in lock-step: GREEN on D-13 ⇔ orphan gap on D-09..D-11.
+    const allChunks = simulateContinuousRecorder(60_000);
+    const kept = trimContinuousBuffer(allChunks, 60_000);
+
+    // Note: under D-09..D-11 the pinned first chunk IS kf-bearing, so a naive
+    // "every chunk has a preceding kf in the kept buffer" check passes. The
+    // real bug is that the tail's P-frames reference KEYFRAMES THAT WERE
+    // TRIMMED FROM THE MIDDLE OF THE TIMELINE — those keyframes are not in
+    // `kept` because they came from chunks evicted by the age trim. We
+    // assert this via the gap evidence: there is a stretch in the kept
+    // timeline where no kf-bearing chunk appears between the pinned header
+    // and the recent tail.
+    const pinnedKfMs = kept[0].emittedAtMs;
+    const firstTailKfChunk = kept.slice(1).find((c) => c.hasKeyframe);
+    expect(firstTailKfChunk).toBeDefined();
+    const orphanGapMs = firstTailKfChunk!.emittedAtMs - pinnedKfMs;
+    // The decoder will freeze for orphanGapMs - KF_PERIOD_MS worth of frames
+    // because their reference keyframes were in trimmed chunks. We require
+    // the gap to be much larger than KF_PERIOD_MS — i.e. trimmed material
+    // contained keyframes that the kept material depends on.
+    expect(orphanGapMs).toBeGreaterThan(KF_PERIOD_MS * 2);
+  });
+});
+
+describe('production recorder must expose segment-aware buffer (RED — pins D-13)', () => {
+  // This block is the genuine TDD anchor. It drives an import of the real
+  // src/offscreen/recorder.ts and asserts that a `getSegments` export exists
+  // with a shape consistent with the D-13 contract.
+  //
+  // Today this is RED: the module exports `getBuffer()` (chunk-level), not
+  // `getSegments()` (segment-level). The activation of the D-13 skeleton at
+  // src/offscreen/recorder.ts:298-316 must:
+  //   1. Maintain a `segments: Blob[]` array (each entry = one finalized
+  //      ~10 s self-contained WebM).
+  //   2. Rotate segments via stop+restart-on-same-MediaStream every
+  //      SEGMENT_MS, keeping at most MAX_SEGMENTS.
+  //   3. Export a `getSegments(): Blob[]` function. (The wire format on the
+  //      port stays base64-per-segment per D-12.)
+  //
+  // We use vitest's beforeEach + vi.resetModules pattern from
+  // codec-check.test.ts so the module's bootstrap side-effects don't poison
+  // the test environment.
+
+  interface ChromeStub {
+    runtime: {
+      sendMessage?: (msg: unknown) => void;
+      onMessage?: { addListener?: (cb: unknown) => void };
+      connect?: () => unknown;
+      id?: string;
+    };
+  }
+  interface GlobalWithChrome {
+    chrome?: ChromeStub;
+    MediaRecorder?: { isTypeSupported: (mime: string) => boolean };
+  }
+
+  beforeEach(async () => {
+    const { vi } = await import('vitest');
+    vi.resetModules();
+    (globalThis as unknown as GlobalWithChrome).chrome = {
+      runtime: { id: 'test', sendMessage: () => {} },
+    };
+  });
+
+  it('src/offscreen/recorder exports a getSegments function', async () => {
+    const mod = (await import('../../src/offscreen/recorder')) as Record<
+      string,
+      unknown
+    >;
+    // RED today — recorder.ts only exports getBuffer/addChunk/trimAged/etc.
+    // GREEN when D-13 lands and getSegments is added.
+    expect(
+      typeof mod.getSegments,
+      'src/offscreen/recorder.ts must export `getSegments(): Blob[]` once ' +
+        'the D-13 restart-segments skeleton is activated. Today it only ' +
+        'exports the chunk-level `getBuffer()`, which is the API responsible ' +
+        'for the orphan-keyframe gap in tests/fixtures/last_30sec.webm. See ' +
+        '.planning/debug/webm-playback-freeze.md and the commented skeleton ' +
+        'at src/offscreen/recorder.ts:298-316.',
+    ).toBe('function');
+  });
+
+  it('getSegments returns at most MAX_SEGMENTS=3 Blobs', async () => {
+    const mod = (await import('../../src/offscreen/recorder')) as {
+      getSegments?: () => Blob[];
+    };
+    if (typeof mod.getSegments !== 'function') {
+      // Fail explicitly rather than skip: the structural test above drives
+      // the fix, but this contract should still be reported as a failure.
+      expect.fail(
+        'getSegments not exported yet; see the previous test in this block ' +
+          'for the activation instructions.',
+      );
+      return;
+    }
+    const segments = mod.getSegments();
+    expect(Array.isArray(segments)).toBe(true);
+    expect(segments.length).toBeLessThanOrEqual(MAX_SEGMENTS);
+  });
+});
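
The contract that the production-driven block checks (a `getSegments` export returning at most `MAX_SEGMENTS` entries) can be sketched as a small bounded buffer. This is an illustrative sketch only, not the code in src/offscreen/recorder.ts: `SegmentRing` is a hypothetical name, and it is generic over the segment type so it runs without a `Blob` implementation, whereas the real recorder is described as holding Blobs.

```typescript
// Hypothetical sketch; the production recorder stores Blobs and may differ.
const MAX_SEGMENTS = 3; // D-13 design: keep the last 3 segments (30 s)

class SegmentRing<T> {
  private readonly segments: T[] = [];

  // Append a finalized segment and evict the oldest once the cap is exceeded.
  push(segment: T): void {
    this.segments.push(segment);
    while (this.segments.length > MAX_SEGMENTS) {
      this.segments.shift();
    }
  }

  // Return a copy so callers cannot mutate the ring's internal state.
  getSegments(): T[] {
    return [...this.segments];
  }
}

// Usage: five rotations leave only the three most recent segments.
const ring = new SegmentRing<string>();
for (const label of ['seg-0', 'seg-1', 'seg-2', 'seg-3', 'seg-4']) {
  ring.push(label);
}
console.log(ring.getSegments()); // [ 'seg-2', 'seg-3', 'seg-4' ]
```

Evicting on push, rather than on read, keeps `getSegments` free of side effects, which is what the second production-driven assertion relies on when it only inspects the returned array's length.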