Mark 519a0d8a99 docs(01): revise plan 07 wave + ffprobe verify guard per checker iteration 1
Two changes:
1. wave: 3 → 6 (cascade: max(wave(05)=4, wave(06)=5)+1 = 6).
2. Task 1 <automated> verify now prefixes the ffprobe invocation with
   test -f tests/fixtures/last_30sec.webm && which ffprobe so the gate
   fails fast with a clear signal if the human checkpoint never produced
   the fixture (instead of ffprobe blowing up with a cryptic file-not-found).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 16:50:32 +02:00

---
phase: 01-stabilize-video-pipeline
plan: 07
type: execute
wave: 6
depends_on: ["05", "06"]
files_modified:
- tests/fixtures/last_30sec.webm
autonomous: false
requirements:
- REQ-video-ring-buffer
requirements_addressed:
- REQ-video-ring-buffer
must_haves:
truths:
- "`dist/` loads unpacked into Chrome ≥ 116 with no errors at chrome://extensions"
- "On extension load, the offscreen prompts the operator with Chrome's native screen-share picker"
- "After selecting a screen/window, Chrome's permanent 'Sharing your screen' indicator is shown (D-04 accepted residual risk)"
- "After ≥ 30 s of recording, clicking SAVE_ARCHIVE produces a `session_report_*.zip` in the user's Downloads folder containing `video/last_30sec.webm`"
- "Extracting that webm and running `ffprobe -v error -f matroska -i last_30sec.webm` exits 0 with EMPTY stderr (D-12 acceptance gate)"
- "The captured webm is preserved at `tests/fixtures/last_30sec.webm` as a known-good sample for regression"
- "If the ffprobe gate FAILS, the developer is presented with the D-13 fallback skeleton (already pre-staged in src/offscreen/recorder.ts) and re-plans Phase 1 closure"
artifacts:
- path: "tests/fixtures/last_30sec.webm"
provides: "Captured + verified ffprobe-clean WebM sample"
contains: ""
key_links:
- from: "dist/manifest.json"
to: "dist/src/offscreen/index.html (or dist/offscreen/index.html — whichever Plan 06 produced)"
via: "chrome.offscreen.createDocument at runtime"
pattern: "DISPLAY_MEDIA"
- from: "ffprobe -v error"
to: "last_30sec.webm playability"
via: "exit code 0 + empty stderr"
pattern: ""
---
<objective>
Manual smoke test against a real Chrome instance + ffprobe verification of
the assembled `last_30sec.webm`. This is the D-12 acceptance gate that
ROADMAP.md Phase 1 Success Criterion #3 names verbatim ("concatenating
header + buffered chunks yields a byte sequence a browser would play").
Purpose: Code-level verification (vitest, tsc) covers correctness of pure
logic and types. Playability of an actual WebM file requires running the
extension against a real browser, recording, and exporting. Phase 1 is not
"done" until ffprobe accepts the output — that is the criterion the
plan-checker is bound to enforce.
This plan is **`autonomous: false`**: it has a `checkpoint:human-verify`
because loading the unpacked build in Chrome, clicking through the screen
picker, and clicking save cannot be driven from the CLI. ffprobe IS a CLI
tool, so the ffprobe step is automated AFTER the human steps complete.
The plan also handles the BRANCH where ffprobe fails — in that case, the
developer SHALL switch to the D-13 fallback (already pre-staged as a
commented skeleton in `src/offscreen/recorder.ts`) and the orchestrator
re-plans Phase 1 closure as a follow-up.
Output: A committed `tests/fixtures/last_30sec.webm` known-good sample (if
ffprobe passes) OR a clear escalation to D-13 (if ffprobe fails).
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/phases/01-stabilize-video-pipeline/01-CONTEXT.md
@.planning/phases/01-stabilize-video-pipeline/01-RESEARCH.md
@.planning/phases/01-stabilize-video-pipeline/01-VALIDATION.md
@.planning/phases/01-stabilize-video-pipeline/01-PATTERNS.md
@dist/manifest.json
@src/offscreen/recorder.ts
</context>
<interfaces>
This plan runs against a built `dist/` from Plan 06. No new code interfaces
are introduced. The plan exercises the contracts already implemented:
- SW `ensureOffscreen` → `chrome.offscreen.createDocument({reasons: [DISPLAY_MEDIA]})`
- Offscreen on START_RECORDING: `getDisplayMedia` → MediaRecorder.start(2000)
- Offscreen ondataavailable: addChunk(blob, Date.now())
- Offscreen track.onended: clear buffer + RECORDING_ERROR to SW
- SW on SAVE_ARCHIVE: getVideoBufferFromOffscreen() via port → REQUEST_BUFFER → BUFFER → mergeVideoChunks → JSZip → chrome.downloads.download
Manual interactions:
1. `chrome://extensions` → "Load unpacked" → select `dist/`
2. Trigger `startVideoCapture` through whatever path currently exists: in the existing flow in `src/popup/index.ts`, opening the popup probably sends `REQUEST_PERMISSIONS`. The exact UX wiring is Phase 3 territory, so Plan 07 accepts that the trigger may be clunky; the goal is only to get a recording running and exporting.
3. Click through Chrome's native screen picker
4. Wait ≥ 35 seconds (so the buffer has rotated at least once and the trim path executed)
5. Click whichever popup control triggers SAVE_ARCHIVE (likely the button labelled "Сохранить отчёт об ошибке", i.e. "Save error report")
6. Verify a `session_report_*.zip` lands in Downloads
7. Extract and run ffprobe
</interfaces>
<threat_model>
## Trust Boundaries
| Boundary | Description |
|----------|-------------|
| developer's local Chrome ↔ unpacked extension | The developer is running the extension on their own machine; loading unpacked is the recommended Phase 1 deployment per audit P2 #22 (no CI). No new trust boundary introduced by this plan. |
## STRIDE Threat Register
| Threat ID | Category | Component | Disposition | Mitigation Plan |
|-----------|----------|-----------|-------------|-----------------|
| T-1-NEW-07-01 | Information Disclosure — committed test fixture | `tests/fixtures/last_30sec.webm` | mitigate | The developer chooses what to share with Chrome's picker when running the smoke test. Recommended share: a fresh test browser window with no sensitive content (e.g., the README rendered in Markdown preview). The committed fixture MUST NOT contain identifying information. Reviewer guidance documented in SUMMARY. |
</threat_model>
<tasks>
<task type="checkpoint:human-verify" gate="blocking">
<name>Task 1: HUMAN — load dist/ unpacked, capture a recording, export, save the archive</name>
<files>tests/fixtures/last_30sec.webm</files>
<read_first>
- dist/manifest.json (verify the build is loadable)
- src/popup/index.html and src/popup/index.ts (understand the existing UX trigger — Phase 3 reworks this; for now, click whatever button exists)
</read_first>
<action>
This is a `checkpoint:human-verify` task. The "action" is the manual sequence below — Claude prepares the environment (pre-flight checks), the human drives Chrome, then Claude resumes for the ffprobe gate. Full details in `<how-to-verify>` and `<what-built>` below.
Step summary:
1. Run pre-flight automation (manifest check + ffprobe presence check).
2. Pause and hand off to the human: load dist/ unpacked, click through the popup, accept the screen-share picker, wait ≥ 35 s, click save.
3. Resume automation: extract the saved archive, run `ffprobe -v error -f matroska -i last_30sec.webm`, capture the exit code + stderr.
4. If ffprobe exits 0 with no stderr lines: copy the captured WebM to `tests/fixtures/last_30sec.webm` (regression fixture for future phases); Task 2 commits it.
5. If ffprobe fails: escalate to the orchestrator with the D-13 fallback signal.
</action>
<what-built>
The full Phase-1 Mokosh build:
- `dist/manifest.json` declares `desktopCapture` (not `tabCapture`)
- The offscreen document calls `getDisplayMedia` once per session
- Captured stream is recorded by a single `MediaRecorder` at vp9 / 400 kbps / 2 s timeslice
- Buffer holds first chunk (WebM header) + every chunk newer than now-30s
- SW receives the buffer over the `'video-keepalive'` port on SAVE_ARCHIVE
- Archive is delivered via `chrome.downloads`
</what-built>
<verify>
<automated>test -f tests/fixtures/last_30sec.webm && which ffprobe && ffprobe -v error -f matroska -i tests/fixtures/last_30sec.webm; test $? -eq 0</automated>
</verify>
<how-to-verify>
**Pre-flight (automated, do these in a bash shell before opening Chrome):**
```bash
# Confirm the build is fresh
ls -la dist/manifest.json dist/assets/*.js
node -e "const m=require('./dist/manifest.json'); if(!m.permissions.includes('desktopCapture')) process.exit(1); if(m.permissions.includes('tabCapture')) process.exit(2); console.log('manifest OK')"
# Confirm ffprobe is installed
which ffprobe || { echo "ffprobe missing — install with: sudo apt install ffmpeg"; exit 1; }
ffprobe -version | head -1
```
**Manual smoke (in Chrome):**
1. Open `chrome://extensions` in a Chromium-based browser (Chrome / Chromium ≥ 116).
2. Enable "Developer mode" (top right).
3. Click "Load unpacked".
4. Select the project's `dist/` directory.
5. Confirm the extension card shows the name "AI Call Recorder" (from `manifest.json`) with NO errors or warnings (red text in the card == FAIL).
6. Open the extension's popup (click the toolbar icon).
7. Click whichever button triggers permission/recording. The popup predates this phase and Phase 3 reworks it; for Phase 1 the button may say "Сохранить отчёт" ("Save report") or "Запросить разрешения" ("Request permissions").
8. Chrome SHOULD show its native screen-share picker. **If it doesn't**: open chrome://extensions, click "service worker" on the extension card, inspect the SW console for the `[SW:Main]` log line that should say "Sending START_RECORDING to offscreen..." — that tells us where the flow stalled.
9. Pick "Entire screen" or a specific window (your choice — recommend an innocuous one like a code editor showing this PLAN.md).
10. Confirm Chrome's permanent "Sharing your screen" indicator appears at the top of the screen. (This is the accepted D-04 trade-off.)
11. **Wait at least 35 seconds.** Move the mouse around or scroll a page so the captured stream has actual visual change (helps vp9 produce useful frames).
12. Open the popup again and click the SAVE_ARCHIVE button (likely "Сохранить отчёт об ошибке", i.e. "Save error report").
13. Within ~5 seconds, a `session_report_YYYY-MM-DD_HH-MM-SS.zip` SHOULD appear in your Downloads folder.
**Verification of the exported archive:**
```bash
# Find the latest archive
LATEST=$(ls -t ~/Downloads/session_report_*.zip 2>/dev/null | head -1)
echo "Latest archive: $LATEST"
[ -z "$LATEST" ] && { echo "FAIL — no session_report archive found in Downloads"; exit 1; }
# Extract the video
unzip -p "$LATEST" video/last_30sec.webm > /tmp/last_30sec.webm
ls -la /tmp/last_30sec.webm
# Run the D-12 acceptance gate
ffprobe -v error -f matroska -i /tmp/last_30sec.webm
GATE=$?
echo "ffprobe exit: $GATE"
ffprobe -v error -show_format -show_streams /tmp/last_30sec.webm 2>&1 | head -30
```
**The acceptance criterion is `GATE == 0` AND zero lines of stderr from the `ffprobe -v error` invocation above.** If both hold:
- Copy the verified webm into the project as a regression fixture: `cp /tmp/last_30sec.webm tests/fixtures/last_30sec.webm`
- Report success.
**If `GATE != 0`** (the ffprobe gate fails):
- Capture the stderr output and the structural dump (`ffprobe -v error -show_packets -i /tmp/last_30sec.webm 2>&1 | head -50`).
- This is the D-13 escalation: the simple continuous-recorder + age-trim approach didn't survive vp9's keyframe cadence (RESEARCH.md Pitfall 1).
- DO NOT delete the existing recorder code. The D-13 fallback skeleton is already pre-staged in `src/offscreen/recorder.ts` as a commented block. Surface the failure to the orchestrator with a precise summary so a new plan (08?) can be drafted to activate the fallback.
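The two-part acceptance criterion in this section (`GATE == 0` plus zero stderr lines) can be checked mechanically instead of by eye. The following is a minimal sketch, not part of the plan's tooling: `gate_passes` and `run_ffprobe_gate` are hypothetical helper names introduced here for illustration; only the `ffprobe -v error -f matroska -i` invocation comes from the plan.

```bash
# Hypothetical helper: succeeds only when the recorded exit code is 0
# AND the captured stderr file is empty (zero bytes).
gate_passes() {
  local exit_code="$1" stderr_file="$2"
  [ "$exit_code" -eq 0 ] && [ ! -s "$stderr_file" ]
}

# Hypothetical wrapper: runs the plan's ffprobe invocation on FILE,
# captures stderr to a temp file, and reports PASS/FAIL for the D-12 gate.
run_ffprobe_gate() {
  local webm="$1" err_file exit_code=0 status=0
  err_file=$(mktemp)
  ffprobe -v error -f matroska -i "$webm" 2>"$err_file" || exit_code=$?
  if gate_passes "$exit_code" "$err_file"; then
    echo "D-12 gate: PASS"
  else
    echo "D-12 gate: FAIL (ffprobe exit $exit_code)"
    cat "$err_file"   # surface the captured stderr for the escalation report
    status=1
  fi
  rm -f "$err_file"
  return "$status"
}
```

Assuming the webm was extracted as described above, `run_ffprobe_gate /tmp/last_30sec.webm` would print the verdict and return non-zero on failure, so it can gate the `cp` into `tests/fixtures/`.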
</how-to-verify>
<resume-signal>
Report ONE of:
- **`approved` + paste the ffprobe exit code (must be 0) + confirm stderr was empty + confirm the verified WebM is in place at `tests/fixtures/last_30sec.webm` (Task 2 commits it).**
- **`ffprobe-failed` + paste the stderr + paste the first 20 lines of `ffprobe -v error -show_packets`. Orchestrator escalates to D-13 fallback re-plan.**
- **`load-failed` + paste the chrome://extensions error text or the SW console error. Orchestrator escalates to a previous-plan revision (likely Plan 05 or 06).**
- **`picker-rejected` + paste the offscreen console line containing `RECORDING_ERROR`. Surface to user as a one-off; retry by reloading the extension card and clicking the popup again.**
Phase 1 is **DONE** when this checkpoint returns `approved`.
</resume-signal>
<acceptance_criteria>
- The build at dist/ loads unpacked with NO red error text on the extension card.
- getDisplayMedia picker shows on a popup-triggered start path.
- Chrome's "Sharing your screen" indicator appears.
- After ≥ 35 s of recording, SAVE_ARCHIVE produces a session_report_*.zip in Downloads within 5 seconds.
- ffprobe -v error -f matroska -i last_30sec.webm exits 0 with NO stderr lines.
- tests/fixtures/last_30sec.webm is in place as the regression fixture (committed in Task 2).
</acceptance_criteria>
<done>Phase 1 acceptance gate met. WebM ring buffer ships playable output.</done>
</task>
<task type="auto">
<name>Task 2: AUTOMATED — commit the fixture and update STATE.md</name>
<files>tests/fixtures/last_30sec.webm, .planning/STATE.md</files>
<read_first>
- tests/fixtures/last_30sec.webm (verify it exists from Task 1)
- .planning/STATE.md (current state — phase 1 progress)
</read_first>
<action>
This task ONLY runs if Task 1 returned `approved`. If Task 1 returned `ffprobe-failed`, `load-failed`, or `picker-rejected`, SKIP this task — the orchestrator handles the escalation.
Two edits:
(1) Ensure `tests/fixtures/last_30sec.webm` is committed to git. Check its size:
```bash
ls -lh tests/fixtures/last_30sec.webm
# Expected: between ~300 KB and ~2 MB for a 30-second vp9 capture at 400 kbps
```
If the file is suspiciously small (< 100 KB) or large (> 5 MB), STOP and audit: the file may be effectively empty (only the header chunk) or may contain far more than 30 s of content. At the target bitrate the expected size is ~400 kbps × 30 s = ~12 Mbit, i.e. ~1.5 MB.
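The size window can be derived and checked mechanically. The sketch below is illustrative only: `expected_bytes`, `size_in_window`, and `check_fixture_size` are hypothetical names, and treating 1 KB as 1000 bytes is an assumption; the 400 kbps, 30 s, 100 KB, and 5 MB figures are the plan's.

```bash
# Expected payload: bitrate (bits/s) x duration (s) / 8 bits per byte.
expected_bytes() {
  local bitrate_bps="$1" seconds="$2"
  echo $(( bitrate_bps * seconds / 8 ))
}

# Succeeds when a byte count lies inside the plan's audit window
# (100 KB .. 5 MB, assuming 1 KB = 1000 bytes).
size_in_window() {
  local bytes="$1"
  [ "$bytes" -ge 100000 ] && [ "$bytes" -le 5000000 ]
}

# Reads the size of FILE and reports whether it falls inside the window.
check_fixture_size() {
  local file="$1" bytes
  bytes=$(wc -c < "$file" | tr -d ' ')   # some wc implementations pad with spaces
  if size_in_window "$bytes"; then
    echo "size OK: $bytes bytes"
  else
    echo "size SUSPICIOUS: $bytes bytes"
    return 1
  fi
}
```

For the plan's figures, `expected_bytes 400000 30` prints `1500000`, which sits inside the window; `check_fixture_size tests/fixtures/last_30sec.webm` would apply the same bounds to the captured file.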
Also verify it's tracked by git:
```bash
git status tests/fixtures/last_30sec.webm
```
(2) Update `.planning/STATE.md` to mark Phase 1 as complete. Use the Edit tool to update the relevant fields:
(2a) `stopped_at:` — replace `"Phase 1 context gathered"` with `"Phase 1 closure: ffprobe acceptance gate passed; tests/fixtures/last_30sec.webm committed"`.
(2b) `last_activity:` — replace the existing line with `2026-05-15 — Phase 1 closure: D-12 ffprobe gate green; ready for Phase 2`.
(2c) `## Current Position` block:
- `Phase: 1 of 5 (Stabilize video pipeline)` → `Phase: 2 of 5 (next: Stabilize DOM + event capture privacy)`
- `Plan: 0 of TBD in current phase` → `Plan: 7 of 7 complete (Phase 1 closed)`
- `Status: Ready to plan` → `Status: Phase 1 complete; ready to plan Phase 2`
(2d) Update the `## Performance Metrics` Phase 1 row (currently `| 1. Stabilize video pipeline | 0 | — | — |`) to reflect 7 plans completed.
(2e) Append a `## Phase 1 Closure Notes` block at the bottom of STATE.md with:
- ffprobe exit code (must be 0)
- Size of the committed fixture
- "Phase 1 outcome: SPEC §10 acceptance criteria #2, #3, #7 are functionally green pending Phase 4 end-to-end smoke verification."
- Note: "Phase 2 owns the DOM/event-capture privacy slice; Phase 3 owns the popup state machine + base64-URL replacement; Phase 4 runs the full SPEC §10 smoke pass."
Do NOT touch the `## Deferred Items` or `## Accumulated Context > ## Decisions` sections — those are owned by the orchestrator across phases.
After both edits, run:
```bash
git status
```
The output should show `tests/fixtures/last_30sec.webm` (new) and `.planning/STATE.md` (modified).
</action>
<verify>
<automated>[ -f tests/fixtures/last_30sec.webm ] && grep -q "Phase 1 closure" .planning/STATE.md</automated>
</verify>
<acceptance_criteria>
- `tests/fixtures/last_30sec.webm` exists, size between 100 KB and 5 MB
- `grep -c "Phase 1 closure" .planning/STATE.md` returns at least 1
- `grep -c "ffprobe acceptance gate passed" .planning/STATE.md` returns at least 1
- `git status .planning/STATE.md tests/fixtures/last_30sec.webm` shows both files staged or modified
</acceptance_criteria>
<done>Phase 1 closure recorded in STATE.md; regression fixture committed; the orchestrator can transition to Phase 2 cleanly.</done>
</task>
</tasks>
<verification>
After Task 1 returns `approved` AND Task 2 lands:
1. `git log --oneline -5` shows the Phase 1 commits (Plan 01 through Plan 07).
2. `tests/fixtures/last_30sec.webm` is in the repo, size sensible (~1-2 MB).
3. `ffprobe -v error -f matroska -i tests/fixtures/last_30sec.webm; echo $?` prints `0` with no stderr.
4. `.planning/STATE.md` reflects Phase 1 closure.
5. `npx vitest run` exits 0 (8 tests passing).
6. `npx tsc --noEmit` exits 0.
7. `npm run build` exits 0.
8. `grep -RnE "tabCapture|chrome\.alarms|VideoRecorderDB|copy-offscreen|openIndexedDB" src/ vite.config.ts manifest.json | grep -v '^#'` returns nothing.
If ANY of (1)-(8) fail, Phase 1 is NOT closed; orchestrator escalates.
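The "any failure blocks closure" rule can be sketched as a small gate runner. This is an illustration under stated assumptions, not the orchestrator's actual mechanism: `run_gate`, `phase1_gates`, and the `FAILED` counter are hypothetical names. The runner discards command output, so it does not cover the empty-stderr part of item (3) or the piped grep in item (8); those still need their own checks.

```bash
FAILED=0

# run_gate NAME CMD [ARGS...]
# Runs CMD quietly, prints PASS/FAIL for NAME, and counts failures in FAILED.
run_gate() {
  local name="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $name"
  else
    echo "FAIL  $name"
    FAILED=$(( FAILED + 1 ))
  fi
}

# Hypothetical aggregate over the gates that are single commands.
# Returns non-zero if any gate failed, i.e. Phase 1 is NOT closed.
phase1_gates() {
  FAILED=0
  run_gate "fixture present"          test -f tests/fixtures/last_30sec.webm
  run_gate "STATE.md records closure" grep -q "Phase 1 closure" .planning/STATE.md
  run_gate "vitest"                   npx vitest run
  run_gate "tsc"                      npx tsc --noEmit
  run_gate "build"                    npm run build
  [ "$FAILED" -eq 0 ]
}
```

Calling `phase1_gates` from the repository root prints one line per gate; a non-zero return corresponds to the escalation branch.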
Commit cadence: ONE commit at the end of Task 2 (`feat(01-07): close phase 1 — ffprobe gate green, fixture committed`). The fixture and STATE.md are committed together.
If Task 1 escalates (D-13 fallback or earlier-plan rework needed), this verification block is SKIPPED — the orchestrator carries the escalation.
</verification>
<success_criteria>
- Manual smoke + ffprobe gate green
- `tests/fixtures/last_30sec.webm` committed as regression-fixture
- `.planning/STATE.md` records Phase 1 closure
- All grep gates from Plans 01..06 still pass
- Build clean; vitest green; tsc clean
- Phase 2 can begin (the next milestone)
</success_criteria>
<output>
After completion, create `.planning/phases/01-stabilize-video-pipeline/01-07-SUMMARY.md` with:
- The exact `chrome://extensions` card state observed (errors? warnings?)
- The exact stderr captured from `ffprobe -v error -f matroska -i /tmp/last_30sec.webm` (should be empty if green)
- The size of `tests/fixtures/last_30sec.webm`
- The ffprobe stream/format dump (so future regressions have a reference)
- Whether the D-13 fallback was activated (and if so, the exact escalation path)
- ONE commit SHA (the Task-2 commit)
- A "What Phase 2 needs to know" section:
- The offscreen now owns capture; Phase 2 (DOM + event-capture privacy) plugs into the existing content-script architecture and does NOT touch the offscreen
- The port `'video-keepalive'` keeps the SW alive; Phase 2's content-script work should NOT add competing keepalives
- The SW's `onMessage` listener now validates `sender.id === chrome.runtime.id`; Phase 2's content-script messages already pass this check (content scripts have the same extension ID), but if Phase 2 adds a new sender path (e.g., from an injected page-world script), Phase 2 must respect the guard
</output>