Mark 9dcfcf0793 fix(02): revise plans per checker (B1 + 4 flags) — add tabs permission for D-P2-02
- BLOCKER B1: add `tabs` to manifest.json permissions (DEC-011 Amendment 1
  cites Phase 2 D-P2-02 meta.urls feature as justification). Honors
  D-P2-02 "all tabs visible" wording verbatim. Updates manifest-i18n test
  expected permission list lockstep.
- F1: add A28 harness assertion for REQ-archive-layout strict zip-layout
  verification (5 entries, no extras).
- F2: createArchive empty-tracker fallback removed; logs warn + sets
  urls:[] instead of fake [extension-origin URL]. 02-01 RED test pins
  empty-tracker → urls:[].
- F3: 02-02 Task 3 prose deliberation struck; typed `blob-url-mint-failed`
  throw is the resolved-only contract.
- F4: 02-02 Task 3 verify block adds full-suite `npm test` after focused
  test runs.
- A27 strict-mode (Plan 02-04): REQUIRES both URLs in meta.urls; FAILS
  on length < 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 14:25:20 +02:00


# Phase 2: Stabilize export pipeline - Context
**Gathered:** 2026-05-20
**Status:** Ready for planning
<domain>
## Phase Boundary
The export pipeline — saveArchive coordination of video buffer fetch, screenshot capture, content-script rrweb messaging, JSZip assembly, and Blob URL download. Phase 2 closes the audit's last functional export defect (P0-6: base64→Blob URL migration), corrects the meta.json captured-URL bug (P1 #10), and locks the meta.json schema with strict validation + harness <5s latency assertion.
Most of REQ-popup-ui + REQ-archive-layout + REQ-screenshot-on-export was substantively shipped via Plans 01-08 (webm-remux + JSZip), 01-09 (popup state machine + SAVE-only UI), and 01-12 (manifest i18n). Phase 2 scope is the residual gaps, NOT a from-scratch implementation.
**2026-05-20 re-phasing context:** Original ROADMAP Phase 3 (Stabilize export pipeline) renumbered to Phase 2 after the original Phase 2 (DOM/event-capture privacy) was removed entirely per the "log is internal" charter shift. REQ-rrweb-dom-buffer + REQ-user-event-log verification absorbed by new Phase 3 (smoke). REQ-password-confidentiality moved to Out of Scope (v1).
</domain>
<decisions>
## Implementation Decisions
### Blob URL migration architecture (P0-6 fix)
- **D-P2-01: Offscreen-minted Blob URL pipeline.** Replace the current base64 data: URL download path (src/background/index.ts:709-710) with: SW packages zip via JSZip → SW sends Blob to offscreen via port → offscreen calls `URL.createObjectURL(blob)` → offscreen sends URL back to SW → SW calls `chrome.downloads.download({ url, filename })` → SW `chrome.downloads.onChanged` listener triggers `URL.revokeObjectURL` after the download completes. This matches the original audit P0-6 recommendation and removes the archive size limit: real bug-report archives are ~5-10 MB, while the current base64 path hits Chrome's ~2 MB data-URL cap.
- **Rationale:** user direction "up to you. If you think we need to migrate — good let's do it." Combined with analysis: real archives EXCEED base64 limit; without migration the canonical use case breaks; offscreen has `URL.createObjectURL` (SW does not — DEC-006 binding).
- **Port-transfer plumbing:** existing offscreen↔SW port from Plan 01-04/07 (D-12 binary wire format at `src/shared/binary.ts` blobToBase64/base64ToBlob); needs new message types for Blob transfer (reverse direction from existing SW-bound video segments). Likely uses base64 wire-encoding for the Blob (same pattern as video segments) OR direct Blob via port (since both contexts are in the same extension origin).
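
The revoke step at the end of the D-P2-01 chain can be sketched as a small registry keyed by download id. This is a minimal sketch under stated assumptions, not the project's implementation: `BlobUrlRegistry`, `DownloadDeltaLike`, and the injected `revoke` callback are hypothetical names. In the SW, `track` would receive the id returned by `chrome.downloads.download`, `handleChanged` would be registered on `chrome.downloads.onChanged`, and `revoke` would presumably forward to the offscreen document, since that is where the URL was minted.

```typescript
// Minimal shape of the chrome.downloads.onChanged delta that this sketch reads.
interface DownloadDeltaLike {
  id: number;
  state?: { previous?: string; current?: string };
}

// Hypothetical helper: remembers which Blob URL belongs to which download
// and releases it once the download reaches a terminal state.
class BlobUrlRegistry {
  private readonly pending = new Map<number, string>();
  private readonly revoke: (url: string) => void;

  constructor(revoke: (url: string) => void) {
    this.revoke = revoke;
  }

  // Call with the download id and the URL that was handed to the download API.
  track(downloadId: number, url: string): void {
    this.pending.set(downloadId, url);
  }

  // Returns true when a tracked URL was revoked by this call.
  handleChanged(delta: DownloadDeltaLike): boolean {
    const state = delta.state?.current;
    // "complete" and "interrupted" are the terminal download states.
    if (state !== 'complete' && state !== 'interrupted') return false;
    const url = this.pending.get(delta.id);
    if (url === undefined) return false;
    this.pending.delete(delta.id);
    this.revoke(url);
    return true;
  }
}
```

Revoking on `interrupted` as well as `complete` is a design choice of this sketch: it avoids leaking the Blob when a download is cancelled or fails.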
### meta.json captured-URL fix (P1 #10)
- **D-P2-02: meta.json schema migrates from singular `url: string` to plural `urls: string[]`.** The array records the URLs of all tabs visible during the 30s recording window, giving the operator's full multi-tab context rather than only the tab active at save time.
- **Schema-breaking change** — CON-meta-json-schema currently mandates 7 fields including `url: string`. Phase 2 MUST amend REQUIREMENTS.md REQ-meta-json-schema to reflect the new shape (8 fields, `urls: string[]` replacing `url: string`).
- **Tab-tracking infrastructure:** SW needs to maintain a Set of tab URLs seen during the 30s window. Simplest: chrome.tabs.onUpdated + chrome.tabs.onActivated listeners maintain `tabUrlsSeen: Set<string>`; pruned alongside the video segment ring buffer. Alternative: query chrome.windows.getAll + iterate tabs at SAVE time (simpler but doesn't capture history during the window).
- **Rationale:** user direction "All tabs' URLs as an array (meta.json.urls)" — highest informational fidelity for multi-tab bug-reproduction workflows.
- **Privacy note:** since "log is internal" per re-phasing, tab URLs may include sensitive operator state — acceptable per current charter.
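
The listener-maintained set described above could look roughly like the following. It is a sketch, assuming that "pruned alongside the video segment ring buffer" means dropping URLs last seen more than 30 s ago; `TabUrlTracker`, `record`, and `prune` are hypothetical names, while `getTabUrlsSeen` mirrors the accessor named in this document's Revision Log. Timestamps are passed in explicitly so the class has no dependency on `chrome.*` or the system clock.

```typescript
// Hypothetical tracker: deduplicated, first-seen-ordered tab URLs with a sliding window.
class TabUrlTracker {
  // url -> last time it was seen (ms). A Map keeps insertion order, and
  // re-setting an existing key does not move it, so order stays first-seen.
  private readonly lastSeen = new Map<string, number>();
  private readonly windowMs: number;

  constructor(windowMs: number = 30000) {
    this.windowMs = windowMs;
  }

  // Intended to be called from tab listeners with the tab's URL, if it has one.
  record(url: string | undefined, nowMs: number): void {
    if (!url) return;
    this.lastSeen.set(url, nowMs);
  }

  // Drops every URL whose last sighting is older than the window.
  prune(nowMs: number): void {
    this.lastSeen.forEach((seenAt, url) => {
      if (nowMs - seenAt > this.windowMs) this.lastSeen.delete(url);
    });
  }

  getTabUrlsSeen(): string[] {
    return Array.from(this.lastSeen.keys());
  }
}
```

With this design, a tab that stays open without firing any event eventually ages out, so a caller would likely want to `record` the active tab again at SAVE time before reading the list.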
### Phase 2 scope tightness
- **D-P2-03: Full Phase 2 scope.** Four workstreams:
1. Blob URL migration (D-P2-01)
2. meta.json `urls` array migration + REQUIREMENTS.md schema amendment (D-P2-02)
3. meta.json strict schema validation (new test file: ISO-8601 timestamp regex, urls array non-empty, version semver, totalEvents non-negative integer, 8 fields exact)
4. UAT harness <5s latency assertion (A24+ extension, consistent with Phase 1's Approach-B harness pattern)
- **Rationale:** user direction "Full scope: bug fixes + schema + harness latency assertion". Most comprehensive Phase 2 closure.
- **Estimated plans:** ~3-4 plans. Wave structure likely: Wave 0 RED tests → Wave 1 meta.urls + tab-tracking infra → Wave 2 Blob URL pipeline (offscreen + SW + port plumbing) → Wave 3 schema validation + harness extension → Wave 4 operator empirical checkpoint.
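
The strict-validation workstream can be illustrated with a small checker. This is a sketch under assumptions: the field name `timestamp` is hypothetical (the document names only `urls`, `version`, and `totalEvents`), the full 8-field list is not given here and is therefore passed in by the caller, and both regexes are simplified approximations of ISO-8601 and semver. Following the Revision Log, an empty `urls` array is accepted and only non-empty entries are checked for URL format.

```typescript
// Accepts e.g. 2026-05-20T14:25:20Z or 2026-05-20T14:25:20.123+02:00 (simplified).
const ISO_8601 = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/;
// MAJOR.MINOR.PATCH with optional pre-release and build metadata (simplified).
const SEMVER = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;

function isParsableUrl(value: unknown): boolean {
  if (typeof value !== 'string') return false;
  try {
    new URL(value);
    return true;
  } catch {
    return false;
  }
}

// Returns a list of human-readable problems; an empty list means the object passed.
function validateMeta(meta: Record<string, unknown>, expectedKeys: readonly string[]): string[] {
  const errors: string[] = [];

  // Exact field set: no missing keys and no extras.
  const actual = Object.keys(meta).sort().join(',');
  const expected = expectedKeys.slice().sort().join(',');
  if (actual !== expected) errors.push(`fields: expected [${expected}], got [${actual}]`);

  const timestamp = meta['timestamp']; // hypothetical field name
  if (typeof timestamp !== 'string' || !ISO_8601.test(timestamp)) errors.push('timestamp: not ISO-8601');

  const urls = meta['urls'];
  if (!Array.isArray(urls) || !urls.every(isParsableUrl)) errors.push('urls: not an array of URLs');

  const version = meta['version'];
  if (typeof version !== 'string' || !SEMVER.test(version)) errors.push('version: not semver');

  const total = meta['totalEvents'];
  if (typeof total !== 'number' || !Number.isInteger(total) || total < 0) {
    errors.push('totalEvents: not a non-negative integer');
  }
  return errors;
}
```

Returning a list of errors instead of throwing on the first one lets a test report every schema violation in a single run.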
### Claude's Discretion
- Wave structure + plan granularity (planner figures out)
- Tab-tracking infrastructure exact implementation (researcher may scope; SW state vs onActivated-based)
- Blob transfer wire format choice (base64 vs direct Blob port message) — both have precedent in Plan 01-07 D-12; planner picks
- Specific harness assertion shapes for A24+ (planner figures out)
- Whether to spawn researcher for Blob URL implementation patterns OR proceed directly to planner (likely direct: implementation is well-understood)
- Whether to bundle the offscreen Blob URL change with the SW download change in a single commit OR atomic per-step
- README.md updates if the archive size limit lifts (informational only)
</decisions>
<canonical_refs>
## Canonical References
**Downstream agents MUST read these before planning or implementing.**
### Roadmap + Requirements
- `.planning/ROADMAP.md` §"Phase 2: Stabilize export pipeline" — phase boundary, depends_on Phase 1, scope note from 2026-05-20 re-phasing
- `.planning/REQUIREMENTS.md` — REQ-popup-ui, REQ-screenshot-on-export, REQ-archive-layout, REQ-meta-json-schema (needs amendment for `urls`), REQ-archive-export-latency, REQ-manifest-permissions (Complete)
- `.planning/PROJECT.md` — DEC-005 (JSZip), DEC-006 (chrome.downloads), DEC-008 (screenshot via captureVisibleTab)
### Audit + Original Defects
- `/home/parf/.claude/plans/dear-claude-there-is-snazzy-fox.md` (the original manifest.zip audit) — P0-6 (base64 download), P1 #10 (meta.json url + version), P1 #11 (fetch Request handling — moved to Phase 4 per re-phasing)
### Source code surfaces
- `src/background/index.ts` saveArchive (line 736), createArchive (line 608), downloadArchive (line 695), captureScreenshot (line 568), meta.json construction (line 674-682)
- `src/shared/binary.ts` — base64ToBlob / blobToBase64; existing wire-format for offscreen→SW; Phase 2 may reuse for SW→offscreen reverse direction OR introduce a new message type
- `src/offscreen/recorder.ts` — current offscreen context; Phase 2 adds Blob URL minting handler
- `src/shared/types.ts` — SessionMetadata interface; Phase 2 amends `url: string` → `urls: string[]`
### Plan precedents
- `.planning/phases/01-stabilize-video-pipeline/01-08-SUMMARY.md` — webm-remux + JSZip integration
- `.planning/phases/01-stabilize-video-pipeline/01-09-SUMMARY.md` — SAVE-only popup + SAVE_ARCHIVE message channel; Amendment 3 always-on charter
- `.planning/phases/01-stabilize-video-pipeline/01-07-SUMMARY.md` — D-12 base64 wire-format precedent for port-transfer plumbing
- `.planning/phases/01-stabilize-video-pipeline/01-13-SUMMARY.md` — Approach B UAT harness pattern for A24+ extension
### Architectural constraints (from Phase 1 SUMMARYs)
- Never `await import(...)` in `src/background/index.ts` (Plan 01-11 SUMMARY)
- Test-mode symbols stay in dist-test/ only via `__MOKOSH_UAT__` define-token (Plan 01-11 SUMMARY)
- Tier-1 FORBIDDEN_HOOK_STRINGS gate (currently 12 entries; Phase 2 may add A24+ entries, which must be updated in lockstep)
- Pre-checkpoint bundle gates per saved memory `feedback-pre-checkpoint-bundle-gates.md` before any operator checkpoint
</canonical_refs>
<code_context>
## Existing Code Insights
### Reusable Assets
- `chrome.tabs.query` + `chrome.tabs.captureVisibleTab`: already wired in `captureScreenshot` (line 568). Tab-tracking for D-P2-02 can reuse the active-tab query pattern.
- `chrome.tabs.onUpdated` + `chrome.tabs.onActivated` listeners: not currently used in SW; D-P2-02 introduces them for tab-URL set maintenance.
- `chrome.runtime.connect` long-lived port (D-17 keepalive): existing offscreen↔SW connection; can carry new SW→offscreen Blob transfer message.
- `src/shared/binary.ts` `blobToBase64` / `base64ToBlob`: reusable wire-format for SW→offscreen Blob transfer (same encoding as Plan 01-07 D-12).
- `chrome.downloads.onChanged` listener (NOT yet wired): Phase 2 adds for `URL.revokeObjectURL` lifecycle.
- JSZip in `createArchive` (line 617): Blob output via `zip.generateAsync({ type: 'blob' })` already produces the right type.
### Established Patterns
- D-06 always-on charter: SAVE creates a new zip; recorder stays in REC. Phase 2's Blob URL pipeline MUST preserve this (no recorder side-effects in downloadArchive).
- Atomic commits per task per plan: Phase 1 precedent; Phase 2 follows.
- UAT harness Approach B (Plan 01-13): page-side assertions + driver wrappers + harness.test.ts orchestrator. Phase 2's A24+ extends this pattern.
- Pre-checkpoint bundle gates per `feedback-pre-checkpoint-bundle-gates.md`: SW CSP grep + Node-globals grep + DOM-globals grep + Tier-1 SW-bundle-import gate + manifest validation before operator checkpoints.
### Integration Points
- saveArchive → createArchive → downloadArchive chain (line 736 → 608 → 695): Blob URL migration changes downloadArchive but createArchive's interface (returns Blob) is preserved.
- Offscreen recorder: existing message handler at `src/offscreen/recorder.ts` listens for `START_RECORDING`, `STOP_RECORDING`; Phase 2 adds a new handler for `CREATE_DOWNLOAD_URL` (or similar) that mints + returns Blob URL.
- meta.json construction in createArchive (line 674-682): D-P2-02 changes `url: ...` to `urls: ...`; SessionMetadata type updates lockstep.
- `chrome.runtime.connect` port: Plan 01-04/07 binary wire-format precedent; Phase 2's Blob → URL transfer rides this OR a new port (planner picks).
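
On the SW side, the request for a minted URL has to cope with an offscreen document that answers with nothing or never answers. The sketch below shows one way to turn both cases into the typed `blob-url-mint-failed` error that this document's Revision Log (F3) settles on. `BlobUrlMintError`, `requestBlobUrl`, the injected `ask` callback, and the timeout value are all hypothetical; `ask` stands in for whatever sends `CREATE_DOWNLOAD_URL` (or similar) over the port and resolves with the reply.

```typescript
// Typed error so callers can branch on `code` rather than parsing messages.
class BlobUrlMintError extends Error {
  readonly code = 'blob-url-mint-failed';

  constructor(reason: string) {
    super(`blob-url-mint-failed: ${reason}`);
    this.name = 'BlobUrlMintError';
  }
}

// Resolves with the Blob URL, or rejects with BlobUrlMintError on an
// empty reply or when no reply arrives within timeoutMs.
async function requestBlobUrl(
  ask: () => Promise<string | undefined>,
  timeoutMs: number,
): Promise<string> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new BlobUrlMintError('timeout')), timeoutMs);
  });
  try {
    const url = await Promise.race([ask(), timeout]);
    if (!url) throw new BlobUrlMintError('empty response');
    return url;
  } finally {
    // Always clear the timer so a fast reply does not leave it pending.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```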
</code_context>
<specifics>
## Specific Ideas
- **Real-archive-size assumption:** operators in production may capture sessions yielding 5-10 MB archives (~1.5-5 MB VP9 video + screenshot + rrweb + events + meta). The base64 data: URL path has a ~2 MB cap; this is the GATING constraint for the Blob URL migration. Phase 2's success criterion includes empirically saving a real archive >2 MB without failure.
- **Tab-URL set semantics:** D-P2-02's `urls: string[]` should be DEDUPLICATED + ORDERED (first-seen first). Empty array is acceptable for sessions where no tab events were observed (purely whole-desktop recording without browser-tab interaction). The operator's primary tab at SAVE time should always be in the array if it has a valid URL.
- **URL filter:** chrome-extension://, chrome://, devtools://, about:, file:// URLs may or may not be included. Default: exclude chrome:// and about: (low diagnostic value); INCLUDE chrome-extension:// (so the welcome tab or popup URLs show up if the operator was there). Planner-determined.
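
The default URL filter above is small enough to sketch directly. `shouldRecordUrl` and `EXCLUDED_PREFIXES` are hypothetical names, and letting `devtools://` and `file://` through is an assumption of this sketch, since the text leaves those schemes to the planner.

```typescript
// Schemes with low diagnostic value, excluded by default.
const EXCLUDED_PREFIXES: readonly string[] = ['chrome://', 'about:'];

// Returns true when a tab URL should be added to meta.json `urls`.
function shouldRecordUrl(url: string | undefined): boolean {
  if (!url) return false; // tabs without a readable URL are skipped
  // "chrome-extension://" does not start with "chrome://", so it passes.
  return !EXCLUDED_PREFIXES.some((prefix) => url.startsWith(prefix));
}
```

Applying the filter before a URL enters the tracked set, rather than at SAVE time, keeps the stored set and the emitted array identical.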
</specifics>
<deferred>
## Deferred Ideas
- **REQ-password-confidentiality** — Out of Scope (v1) per 2026-05-20 re-phasing ("log is internal"). Phase 4 (optional) or v2 candidate.
- **rrweb 2.0.0-alpha.4 → stable v2 upgrade research** — Phase 3 (smoke) absorbs this when DOM verification is planned; researcher spawn deferred until then.
- **Audit P1 #11 fetch Request handling** — Phase 4 hardening (operator-event log polish; not export-pipeline scope).
- **Audit P1 #14 navigation URL fix** — Phase 4 hardening.
- **Audit P1 #15 rrweb timestamp semantics** — Phase 4 hardening.
- **Dark-surface logo contrast** — Phase 4 hardening (Plan 01-10 operator observation 2026-05-20).
- **ROADMAP backfill** for Plans 01-08..01-13 entries (Plan 01-13 plan-checker flag #4) — Phase 4 or housekeeping commit.
- **getDisplayMedia cursor visibility** (`video: { cursor: 'always' }`) — Phase 4 hardening (Plan 01-07 operator observation 2026-05-15).
- **setimmediate polyfill `new Function`** in SW chunk via `vite-plugin-node-polyfills` — Phase 4 hardening (verified pre-existing across Phase 1).
### Reviewed Todos (not folded)
None — discussion stayed within phase scope.
</deferred>
## Revision Log
### 2026-05-20 — DEC-011 Amendment 1 + plan-checker iteration 1
- **DEC-011 Amendment 1 (Phase 2 scope addition):** `tabs` permission ADDED to manifest.json
per user direction during plan-checker iteration 1. Justification: Phase 2's D-P2-02
meta.urls feature REQUIRES tab URL visibility beyond active-tab semantics. Audit T-1-02
("declaring unused permissions expands attack surface") is acknowledged but overridden for
this Phase 2 feature; the meta.urls feature is now genuinely USED, so the permission is not
unused. tests/i18n/manifest-i18n.test.ts pins the new 8-entry permission set as a
regression guard. PROJECT.md DEC-011 row rewritten with Amendment 1 prose.
- **Plan 02-04 A27 strict-mode:** harness MUST observe BOTH tab URLs in meta.urls after a
multi-tab session. meta.urls.length >= 2 REQUIRED; test FAILS on length < 2. No
extension-origin sentinels permitted. Empty-tracker case (no browser interaction during
recording) still produces urls:[] per F2 resolution — but A27 explicitly EXERCISES the
multi-tab path so its meta.urls is NEVER empty.
- **createArchive empty-tracker fallback removed (F2):** tracker.getTabUrlsSeen() returning
empty array is meaningful (whole-desktop-no-tab session) and meta.urls: [] is the canonical
representation. tests/background/meta-json-urls-schema.test.ts adds Test 5 pinning this
contract; tests/build/strict-meta-json-validation.test.ts Test 3 relaxed to PERMIT empty
urls[] (still validates URL format on non-empty arrays). createArchive calls a new
snapshotOpenTabs() helper (chrome.tabs.query({}) defensive enumeration via DEC-011
Amendment 1) BEFORE reading getTabUrlsSeen() so any tab the operator opened but never
activated is still captured. Empty array IS the result when no tabs are open at SAVE time.
- **Plan 02-04 A28 added (F1):** REQ-archive-layout strict zip-layout pin. Harness driver
enumerates zip entries and asserts EXACTLY 5 paths (`video/last_30sec.webm`,
`rrweb/session.json`, `logs/events.json`, `screenshot.png`, `meta.json`). Cross-references
REQ-archive-layout + REQ-popup-ui + REQ-screenshot-on-export. UAT target: 28→29 GREEN.
- **Plan 02-02 Task 3 F3 resolved:** OR-deliberation in createArchive prose struck. The
resolved-only contract is: throw typed `blob-url-mint-failed` error on empty/timeout
response from offscreen; NO data: URL fallback for any archive size.
- **Plan 02-02 Task 3 F4 resolved:** verify block extended with full-suite `npm test` after
the focused-test runs so unrelated regressions surface during execution.
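
The A28 strict zip-layout pin amounts to a set comparison against the five paths listed above. The following is a sketch with hypothetical names (`EXPECTED_ARCHIVE_PATHS`, `diffArchiveLayout`); it assumes the harness driver can supply the entry names as plain strings, and that directory entries (names ending in `/`), if the zip library reports any, should be ignored.

```typescript
// The five paths named by REQ-archive-layout in this document.
const EXPECTED_ARCHIVE_PATHS: readonly string[] = [
  'video/last_30sec.webm',
  'rrweb/session.json',
  'logs/events.json',
  'screenshot.png',
  'meta.json',
];

// Compares actual zip entry names to the expected layout.
// The layout is exact when both returned arrays are empty.
function diffArchiveLayout(entries: readonly string[]): { missing: string[]; extra: string[] } {
  // Assumption: skip directory entries such as "video/".
  const files = entries.filter((path) => !path.endsWith('/'));
  const missing = EXPECTED_ARCHIVE_PATHS.filter((path) => !files.includes(path));
  const extra = files.filter((path) => !EXPECTED_ARCHIVE_PATHS.includes(path));
  return { missing, extra };
}
```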
---
*Phase: 2-stabilize-export-pipeline*
*Context gathered: 2026-05-20*
*Revised: 2026-05-20 (plan-checker iteration 1)*