Milestone v1 (v2.0.0): Mokosh — Session Capture #1

Merged
strategy155 merged 297 commits from gsd/phase-04-harden-clean-up-optional into main 2026-05-31 15:34:17 +00:00
2 changed files with 144 additions and 2 deletions
Showing only changes of commit cc042a5583

@@ -3,8 +3,8 @@ gsd_state_version: 1.0
milestone: v2.0.0
milestone_name: milestone
status: executing
-stopped_at: Phase 1 FULLY CLOSED 2026-05-20. Roadmap RE-PHASED 2026-05-20 per charter shift — original Phase 2 (DOM/event-capture privacy) REMOVED; DOM/event-log verification absorbed by new Phase 3 (SPEC §10 smoke); REQ-password-confidentiality moved to Out of Scope (v1) per "we don't care about privacy hardening. At least here." Total phases now 4. Phase 2 (export pipeline) discussion next.
-last_updated: "2026-05-20T14:30:00.000Z"
+stopped_at: Phase 2 (Stabilize export pipeline) discuss-phase complete 2026-05-20. 02-CONTEXT.md captures 3 locked decisions — D-P2-01 offscreen-minted Blob URL pipeline (closes P0-6); D-P2-02 meta.json schema migrates url→urls[]; D-P2-03 full scope ~3-4 plans. Next: /gsd-plan-phase 2.
+last_updated: "2026-05-20T15:00:00.000Z"
last_activity: 2026-05-20
progress:
total_phases: 4

@@ -0,0 +1,142 @@
# Phase 2: Stabilize export pipeline - Context
**Gathered:** 2026-05-20
**Status:** Ready for planning
<domain>
## Phase Boundary
The export pipeline — saveArchive coordination of video buffer fetch, screenshot capture, content-script rrweb messaging, JSZip assembly, and Blob URL download. Phase 2 closes the audit's last functional export defect (P0-6: base64→Blob URL migration), corrects the meta.json captured-URL bug (P1 #10), and locks the meta.json schema with strict validation + harness <5s latency assertion.
Most of REQ-popup-ui + REQ-archive-layout + REQ-screenshot-on-export was substantively shipped via Plans 01-08 (webm-remux + JSZip), 01-09 (popup state machine + SAVE-only UI), and 01-12 (manifest i18n). Phase 2 scope is the residual gaps, NOT a from-scratch implementation.
**2026-05-20 re-phasing context:** Original ROADMAP Phase 3 (Stabilize export pipeline) renumbered to Phase 2 after the original Phase 2 (DOM/event-capture privacy) was removed entirely per the "log is internal" charter shift. REQ-rrweb-dom-buffer + REQ-user-event-log verification absorbed by new Phase 3 (smoke). REQ-password-confidentiality moved to Out of Scope (v1).
</domain>
<decisions>
## Implementation Decisions
### Blob URL migration architecture (P0-6 fix)
- **D-P2-01: Offscreen-minted Blob URL pipeline.** Replace the current base64 data: URL download path (src/background/index.ts:709-710) with: SW packages zip via JSZip → SW sends Blob to offscreen via port → offscreen calls `URL.createObjectURL(blob)` → offscreen sends URL back to SW → SW calls `chrome.downloads.download({ url, filename })` → SW `onChanged` listener triggers `URL.revokeObjectURL` after download completes. Matches the original audit P0-6 recommendation. Properly unblocks any archive size (real bug-report archives are ~5-10 MB; current base64 path hits Chrome's ~2 MB data-URL cap).
- **Rationale:** user direction "up to you. If you think we need to migrate — good let's do it." Combined with analysis: real archives EXCEED base64 limit; without migration the canonical use case breaks; offscreen has `URL.createObjectURL` (SW does not — DEC-006 binding).
- **Port-transfer plumbing:** existing offscreen↔SW port from Plan 01-04/07 (D-12 binary wire format at `src/shared/binary.ts` blobToBase64/base64ToBlob); needs new message types for Blob transfer (reverse direction from existing SW-bound video segments). Likely uses base64 wire-encoding for the Blob (same pattern as video segments) OR direct Blob via port (since both contexts are in the same extension origin).
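
The last hop of D-P2-01 (releasing the Blob URL once the download settles) can be sketched as a small registry keyed by download id. This is a minimal sketch under stated assumptions, not the project's code: `DownloadUrlRegistry`, `DownloadDeltaLike`, and the injected `revoke` callback are hypothetical names. The delta shape mirrors the `id` and `state.current` fields reported by `chrome.downloads.onChanged`. The callback is injected on the assumption that `URL.revokeObjectURL`, like `URL.createObjectURL`, has to run in the offscreen document, so in the extension it would presumably post a message over the existing port.

```typescript
// Hypothetical subset of the delta object passed to chrome.downloads.onChanged.
interface DownloadDeltaLike {
  id: number;
  state?: { previous?: string; current?: string };
}

// Tracks which Blob URL backs which download so it is revoked exactly once.
class DownloadUrlRegistry {
  private readonly urlsByDownloadId = new Map<number, string>();
  private readonly revoke: (blobUrl: string) => void;

  // `revoke` is injected: in the extension it would ask the offscreen document
  // to call URL.revokeObjectURL (assumption, not confirmed by the source).
  constructor(revoke: (blobUrl: string) => void) {
    this.revoke = revoke;
  }

  // Call once chrome.downloads.download has resolved with a download id.
  track(downloadId: number, blobUrl: string): void {
    this.urlsByDownloadId.set(downloadId, blobUrl);
  }

  // Intended to be wired to chrome.downloads.onChanged.addListener.
  handleChanged(delta: DownloadDeltaLike): void {
    const current = delta.state?.current;
    // Only terminal states release the URL; "in_progress" keeps it alive.
    if (current !== "complete" && current !== "interrupted") return;
    const blobUrl = this.urlsByDownloadId.get(delta.id);
    if (blobUrl === undefined) return; // not a download this registry started
    this.urlsByDownloadId.delete(delta.id);
    this.revoke(blobUrl);
  }

  get pendingCount(): number {
    return this.urlsByDownloadId.size;
  }
}
```

Revoking on `interrupted` as well as `complete` is a design choice in this sketch so that a failed download does not leak the Blob; the planner may prefer to keep the URL alive for a retry.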
### meta.json captured-URL fix (P1 #10)
- **D-P2-02: meta.json schema migrates from singular `url: string` to plural `urls: string[]`.** The array captures every tab visible during the 30s recording window, giving the operator's full multi-tab context rather than only the tab active at save time.
- **Schema-breaking change** — CON-meta-json-schema currently mandates 7 fields including `url: string`. Phase 2 MUST amend REQUIREMENTS.md REQ-meta-json-schema to reflect the new shape (8 fields, `urls: string[]` replacing `url: string`).
- **Tab-tracking infrastructure:** SW needs to maintain a Set of tab URLs seen during the 30s window. Simplest: chrome.tabs.onUpdated + chrome.tabs.onActivated listeners maintain `tabUrlsSeen: Set<string>`; pruned alongside the video segment ring buffer. Alternative: query chrome.windows.getAll + iterate tabs at SAVE time (simpler but doesn't capture history during the window).
- **Rationale:** user direction "All tabs' URLs as an array (meta.json.urls)" — highest informational fidelity for multi-tab bug-reproduction workflows.
- **Privacy note:** since "log is internal" per re-phasing, tab URLs may include sensitive operator state — acceptable per current charter.
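
The `tabUrlsSeen` idea (listeners feed a set that is pruned on the same schedule as the video segment ring buffer) could look roughly like the following. This is a sketch under assumptions: `TabUrlWindow` and its method names are hypothetical, the 30000 ms default comes from the 30s window above, and the `chrome.tabs.onUpdated` / `chrome.tabs.onActivated` wiring is left to the caller. One caveat of this shape: a tab that stays open without firing events ages out of the window, so the SAVE path would probably need to re-record the active tab before taking the snapshot.

```typescript
// Hypothetical helper: remembers tab URLs seen inside a sliding time window.
class TabUrlWindow {
  // Map iterates in insertion order, which gives first-seen ordering for free.
  private readonly lastSeenMsByUrl = new Map<string, number>();
  private readonly windowMs: number;

  constructor(windowMs: number = 30000) {
    this.windowMs = windowMs;
  }

  // Call from the tab listeners with the tab's URL and the current time.
  record(url: string, nowMs: number): void {
    // Re-setting an existing key refreshes its timestamp but keeps its position.
    this.lastSeenMsByUrl.set(url, nowMs);
  }

  // Call whenever the video ring buffer is trimmed.
  prune(nowMs: number): void {
    this.lastSeenMsByUrl.forEach((seenMs, url) => {
      if (nowMs - seenMs > this.windowMs) this.lastSeenMsByUrl.delete(url);
    });
  }

  // Candidate value for meta.json `urls`.
  snapshot(): string[] {
    return Array.from(this.lastSeenMsByUrl.keys());
  }
}
```

Storing a last-seen timestamp per URL, rather than a bare `Set<string>`, is what makes pruning possible; a plain set has no way to tell which entries fell out of the window.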
### Phase 2 scope tightness
- **D-P2-03: Full Phase 2 scope.** Four workstreams:
1. Blob URL migration (D-P2-01)
2. meta.json `urls` array migration + REQUIREMENTS.md schema amendment (D-P2-02)
3. meta.json strict schema validation (new test file: ISO-8601 timestamp regex, urls array non-empty, version semver, totalEvents non-negative integer, 8 fields exact)
4. UAT harness <5s latency assertion (A24+ extension, consistent with Phase 1's Approach-B harness pattern)
- **Rationale:** user direction "Full scope: bug fixes + schema + harness latency assertion". Most comprehensive Phase 2 closure.
- **Estimated plans:** ~3-4 plans. Wave structure likely: Wave 0 RED tests → Wave 1 meta.urls + tab-tracking infra → Wave 2 Blob URL pipeline (offscreen + SW + port plumbing) → Wave 3 schema validation + harness extension → Wave 4 operator empirical checkpoint.
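
The strict schema checks listed in workstream 3 could be expressed as a single validator along these lines. This is a sketch, not the planned test file: `validateMetaJson` is a hypothetical name, the timestamp field is assumed to be called `timestamp`, the regexes are simplified approximations of ISO-8601 and semver, and the exact list of 8 field names is passed in by the caller because this document does not enumerate them. The non-empty `urls` rule follows workstream 3; the "Specific Ideas" section allows an empty array for desktop-only sessions, so the planner would need to settle which rule wins.

```typescript
// Simplified patterns (assumptions): full ISO-8601 and semver grammars are broader.
const ISO_8601_PATTERN = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/;
const SEMVER_PATTERN = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;

// Returns a list of human-readable problems; an empty list means the object passed.
function validateMetaJson(
  meta: Record<string, unknown>,
  expectedKeys: readonly string[],
): string[] {
  const problems: string[] = [];

  // "Fields exact": no missing keys and no extras (for example a leftover `url`).
  const actual = Object.keys(meta).sort().join(",");
  const expected = expectedKeys.slice().sort().join(",");
  if (actual !== expected) problems.push(`keys must be exactly: ${expected}`);

  const timestamp = meta["timestamp"]; // field name is an assumption
  if (typeof timestamp !== "string" || !ISO_8601_PATTERN.test(timestamp)) {
    problems.push("timestamp must be an ISO-8601 string");
  }

  const urls = meta["urls"];
  if (!Array.isArray(urls) || urls.length === 0 || !urls.every((u) => typeof u === "string")) {
    problems.push("urls must be a non-empty array of strings");
  }

  const version = meta["version"];
  if (typeof version !== "string" || !SEMVER_PATTERN.test(version)) {
    problems.push("version must be a semver string");
  }

  const totalEvents = meta["totalEvents"];
  if (typeof totalEvents !== "number" || !Number.isInteger(totalEvents) || totalEvents < 0) {
    problems.push("totalEvents must be a non-negative integer");
  }

  return problems;
}
```

Returning every problem instead of throwing on the first keeps a failing test readable when several fields regress at once.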
### Claude's Discretion
- Wave structure + plan granularity (planner figures out)
- Tab-tracking infrastructure exact implementation (researcher may scope; SW state vs onActivated-based)
- Blob transfer wire format choice (base64 vs direct Blob port message) — both have precedent in Plan 01-07 D-12; planner picks
- Specific harness assertion shapes for A24+ (planner figures out)
- Whether to spawn researcher for Blob URL implementation patterns OR proceed directly to planner (likely direct: implementation is well-understood)
- Whether to bundle the offscreen Blob URL change with the SW download change in a single commit OR atomic per-step
- README.md updates if the archive size limit lifts (informational only)
</decisions>
<canonical_refs>
## Canonical References
**Downstream agents MUST read these before planning or implementing.**
### Roadmap + Requirements
- `.planning/ROADMAP.md` §"Phase 2: Stabilize export pipeline" — phase boundary, depends_on Phase 1, scope note from 2026-05-20 re-phasing
- `.planning/REQUIREMENTS.md` — REQ-popup-ui, REQ-screenshot-on-export, REQ-archive-layout, REQ-meta-json-schema (needs amendment for `urls`), REQ-archive-export-latency, REQ-manifest-permissions (Complete)
- `.planning/PROJECT.md` — DEC-005 (JSZip), DEC-006 (chrome.downloads), DEC-008 (screenshot via captureVisibleTab)
### Audit + Original Defects
- `/home/parf/.claude/plans/dear-claude-there-is-snazzy-fox.md` (the original manifest.zip audit) — P0-6 (base64 download), P1 #10 (meta.json url + version), P1 #11 (fetch Request handling — moved to Phase 4 per re-phasing)
### Source code surfaces
- `src/background/index.ts` saveArchive (line 736), createArchive (line 608), downloadArchive (line 695), captureScreenshot (line 568), meta.json construction (line 674-682)
- `src/shared/binary.ts` — base64ToBlob / blobToBase64; existing wire-format for offscreen→SW; Phase 2 may reuse for SW→offscreen reverse direction OR introduce a new message type
- `src/offscreen/recorder.ts` — current offscreen context; Phase 2 adds Blob URL minting handler
- `src/shared/types.ts` — SessionMetadata interface; Phase 2 amends `url: string` → `urls: string[]`
### Plan precedents
- `.planning/phases/01-stabilize-video-pipeline/01-08-SUMMARY.md` — webm-remux + JSZip integration
- `.planning/phases/01-stabilize-video-pipeline/01-09-SUMMARY.md` — SAVE-only popup + SAVE_ARCHIVE message channel; Amendment 3 always-on charter
- `.planning/phases/01-stabilize-video-pipeline/01-07-SUMMARY.md` — D-12 base64 wire-format precedent for port-transfer plumbing
- `.planning/phases/01-stabilize-video-pipeline/01-13-SUMMARY.md` — Approach B UAT harness pattern for A24+ extension
### Architectural constraints (from Phase 1 SUMMARYs)
- Never `await import(...)` in `src/background/index.ts` (Plan 01-11 SUMMARY)
- Test-mode symbols stay in dist-test/ only via `__MOKOSH_UAT__` define-token (Plan 01-11 SUMMARY)
- Tier-1 FORBIDDEN_HOOK_STRINGS gate (currently 12 entries; Phase 2 may add A24+ entries — must update lockstep)
- Pre-checkpoint bundle gates per saved memory `feedback-pre-checkpoint-bundle-gates.md` before any operator checkpoint
</canonical_refs>
<code_context>
## Existing Code Insights
### Reusable Assets
- `chrome.tabs.query` + `chrome.tabs.captureVisibleTab`: already wired in `captureScreenshot` (line 568). Tab-tracking for D-P2-02 can reuse the active-tab query pattern.
- `chrome.tabs.onUpdated` + `chrome.tabs.onActivated` listeners: not currently used in SW; D-P2-02 introduces them for tab-URL set maintenance.
- `chrome.runtime.connect` long-lived port (D-17 keepalive): existing offscreen↔SW connection; can carry new SW→offscreen Blob transfer message.
- `src/shared/binary.ts` `blobToBase64` / `base64ToBlob`: reusable wire-format for SW→offscreen Blob transfer (same encoding as Plan 01-07 D-12).
- `chrome.downloads.onChanged` listener (NOT yet wired): Phase 2 adds for `URL.revokeObjectURL` lifecycle.
- JSZip in `createArchive` (line 617): Blob output via `zip.generateAsync({ type: 'blob' })` already produces the right type.
### Established Patterns
- D-06 always-on charter: SAVE creates a new zip; recorder stays in REC. Phase 2's Blob URL pipeline MUST preserve this (no recorder side-effects in downloadArchive).
- Atomic commits per task per plan: Phase 1 precedent; Phase 2 follows.
- UAT harness Approach B (Plan 01-13): page-side assertions + driver wrappers + harness.test.ts orchestrator. Phase 2's A24+ extends this pattern.
- Pre-checkpoint bundle gates per `feedback-pre-checkpoint-bundle-gates.md`: SW CSP grep + Node-globals grep + DOM-globals grep + Tier-1 SW-bundle-import gate + manifest validation before operator checkpoints.
### Integration Points
- saveArchive → createArchive → downloadArchive chain (line 736 → 608 → 695): Blob URL migration changes downloadArchive but createArchive's interface (returns Blob) is preserved.
- Offscreen recorder: existing message handler at `src/offscreen/recorder.ts` listens for `START_RECORDING`, `STOP_RECORDING`; Phase 2 adds a new handler for `CREATE_DOWNLOAD_URL` (or similar) that mints + returns Blob URL.
- meta.json construction in createArchive (line 674-682): D-P2-02 changes `url: ...` to `urls: ...`; SessionMetadata type updates lockstep.
- `chrome.runtime.connect` port: Plan 01-04/07 binary wire-format precedent; Phase 2's Blob → URL transfer rides this OR a new port (planner picks).
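
The new offscreen handler could be sketched as a pure function over a discriminated message union. Everything here is an assumption for illustration: the type names, the `requestId` correlation field, the `DOWNLOAD_URL_CREATED` reply type, and the choice of a base64 payload (one of the two wire formats left open above). `mintUrl` is injected; in the extension it would presumably decode the payload with the helper in `src/shared/binary.ts` and call `URL.createObjectURL`.

```typescript
// Hypothetical message shapes; only START_RECORDING, STOP_RECORDING and
// CREATE_DOWNLOAD_URL are named in this document, the payload fields are not.
type OffscreenRequestSketch =
  | { type: "START_RECORDING" }
  | { type: "STOP_RECORDING" }
  | { type: "CREATE_DOWNLOAD_URL"; requestId: number; zipBase64: string };

interface DownloadUrlReplySketch {
  type: "DOWNLOAD_URL_CREATED";
  requestId: number; // echoed back so the SW can match reply to request
  url: string;
}

// Handles only the new message type and leaves recorder messages untouched,
// so the always-on recorder state is not affected by a SAVE.
function handleCreateDownloadUrl(
  msg: OffscreenRequestSketch,
  mintUrl: (zipBase64: string) => string,
): DownloadUrlReplySketch | undefined {
  if (msg.type !== "CREATE_DOWNLOAD_URL") return undefined;
  return {
    type: "DOWNLOAD_URL_CREATED",
    requestId: msg.requestId,
    url: mintUrl(msg.zipBase64),
  };
}
```

Echoing a `requestId` is a design choice for this sketch: port messages are asynchronous, so without a correlation id two overlapping SAVE requests could receive each other's URLs.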
</code_context>
<specifics>
## Specific Ideas
- **Real-archive-size assumption:** operators in production may capture sessions yielding 5-10 MB archives (~1.5-5 MB VP9 video + screenshot + rrweb + events + meta). Base64 data: URL has ~2 MB cap; this is the GATING constraint for Blob URL migration. Phase 2's success criterion includes empirically saving a real archive >2 MB without failure.
- **Tab-URL set semantics:** D-P2-02's `urls: string[]` should be DEDUPLICATED + ORDERED (first-seen first). Empty array is acceptable for sessions where no tab events were observed (purely whole-desktop recording without browser-tab interaction). The operator's primary tab at SAVE time should always be in the array if it has a valid URL.
- **URL filter:** chrome-extension://, chrome://, devtools://, about:, file:// URLs may or may not be included. Default: exclude chrome:// and about: (low diagnostic value); INCLUDE chrome-extension:// (so the welcome tab or popup URLs show up if the operator was there). Planner-determined.
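
The tab-URL semantics above (deduplicated, first-seen order, default scheme filter, active tab always present) can be sketched as one pure function. `collectTabUrls` and `EXCLUDED_URL_PREFIXES` are hypothetical names; the exclusion list encodes only the stated default (drop `chrome://` and `about:`), and `devtools://` and `file://` are passed through here purely because the text leaves them undecided.

```typescript
// Default exclusions from the text above; other schemes are still planner-determined.
const EXCLUDED_URL_PREFIXES: readonly string[] = ["chrome://", "about:"];

// Builds the meta.json `urls` value from URLs in first-seen order, plus the
// tab that is active at SAVE time (if it has a URL).
function collectTabUrls(seenInOrder: readonly string[], activeAtSave?: string): string[] {
  const candidates = activeAtSave === undefined ? seenInOrder : seenInOrder.concat(activeAtSave);
  const alreadyKept = new Set<string>();
  const kept: string[] = [];
  for (const url of candidates) {
    if (url === "") continue; // tabs without a URL contribute nothing
    if (EXCLUDED_URL_PREFIXES.some((prefix) => url.startsWith(prefix))) continue;
    if (alreadyKept.has(url)) continue; // dedupe, keeping the first occurrence
    alreadyKept.add(url);
    kept.push(url);
  }
  return kept;
}
```

Note that `chrome-extension://` URLs survive the filter because the match is on the full `chrome://` prefix, not on the word `chrome`. An input with no qualifying URLs yields an empty array, matching the desktop-only case described above.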
</specifics>
<deferred>
## Deferred Ideas
- **REQ-password-confidentiality** — Out of Scope (v1) per 2026-05-20 re-phasing ("log is internal"). Phase 4 (optional) or v2 candidate.
- **rrweb 2.0.0-alpha.4 → stable v2 upgrade research** — Phase 3 (smoke) absorbs this when DOM verification is planned; researcher spawn deferred until then.
- **Audit P1 #11 fetch Request handling** — Phase 4 hardening (operator-event log polish; not export-pipeline scope).
- **Audit P1 #14 navigation URL fix** — Phase 4 hardening.
- **Audit P1 #15 rrweb timestamp semantics** — Phase 4 hardening.
- **Dark-surface logo contrast** — Phase 4 hardening (Plan 01-10 operator observation 2026-05-20).
- **ROADMAP backfill** for Plans 01-08..01-13 entries (Plan 01-13 plan-checker flag #4) — Phase 4 or housekeeping commit.
- **getDisplayMedia cursor visibility** (`video: { cursor: 'always' }`) — Phase 4 hardening (Plan 01-07 operator observation 2026-05-15).
- **setimmediate polyfill `new Function`** in SW chunk via `vite-plugin-node-polyfills` — Phase 4 hardening (verified pre-existing across Phase 1).
### Reviewed Todos (not folded)
None — discussion stayed within phase scope.
</deferred>
---
*Phase: 2-stabilize-export-pipeline*
*Context gathered: 2026-05-20*