Milestone v1 (v2.0.0): Mokosh — Session Capture #1

Merged
strategy155 merged 297 commits from gsd/phase-04-harden-clean-up-optional into main 2026-05-31 15:34:17 +00:00
2 changed files with 299 additions and 0 deletions
Showing only changes of commit dba51ea233


@@ -0,0 +1,191 @@
# Phase 3: SPEC §10 smoke verification + DOM/event-log verification - Context
**Gathered:** 2026-05-20
**Status:** Ready for planning
<domain>
## Phase Boundary
End-to-end verification phase: all 9 SPEC §10 acceptance criteria must pass against an unpacked load of the build into a real Chrome instance. The phase also absorbs the DOM (§10 #4 rrweb) + event-log (§10 #5) verification originally scoped for the original Phase 2, which was removed on 2026-05-20.
Phase 3 is verification-heavy with a thin slice of UAT harness extension (Approach B pattern from Plan 01-13 / 02-04). No new production-code feature work; the rrweb wiring + event-log wiring already shipped in Phase 1 (`src/content/index.ts`). Phase 3 confirms what's shipped + writes the §10 sweep VERIFICATION.
**In scope:**
- Verify rrweb 2.0.0-alpha.4 records DOM events without errors on typical pages (form + table + modal) per §10 #4
- Verify event log captures click + input (non-password) + navigation (popstate/hashchange) + js_error + network_error per §10 #5 + CON-event-log-schema
- Verify SPEC §10 #8 *partially* via the existing `target.type === 'password'` filter at src/content/index.ts:82 (no new masking; charter-aligned)
- Verify SPEC §10 #9 RAM ≤ 50 MB via operator/alpha-tester observation (best-effort; no programmatic measurement)
- Full §10 sweep VERIFICATION.md aggregating Phase 1 + 2 + 3 evidence
- UAT harness extension following Plan 01-13 / 02-04 Approach B (A29+ assertions)
**Out of scope:**
- rrweb v2 stable upgrade — DEFER TO PHASE 4 (alpha-pin stable across 9 plans + 29/29 UAT GREEN; ROADMAP "Deferred to Phase 4 if Phase 3 plans are tight" honored)
- Full rrweb maskInputFn + data-sensitive guards — REQ-password-confidentiality is Out of Scope v1 per charter shift "we don't care about privacy hardening" 2026-05-20
- Programmatic RAM measurement (puppeteer.Page.metrics, chrome.devtools Memory API) — DEFER TO PHASE 4 if best-effort operator observation surfaces concern
- Any new production-surface feature work — Phase 3 is verification, not implementation
</domain>
<decisions>
## Implementation Decisions
### Phase 3 plan structure (5 plans, tightly scoped)
- **D-P3-01:** Phase 3 = exactly 5 atomic plans, each scoped per the per-area decisions below:
- **03-01** rrweb DOM verification harness extension (§10 #4)
- **03-02** event-log verification harness extension (§10 #5)
- **03-03** §10 #8 password-filter verification (verify existing minimum; PARTIAL mark in VERIFICATION)
- **03-04** §10 #9 RAM ceiling best-effort (operator instructions in VERIFICATION + optional harness scaffolding via puppeteer.Page.metrics if practical without research budget)
- **03-05** full §10 sweep VERIFICATION.md aggregating §10 #1-#9 evidence across Phase 1 + 2 + 3
- **Rationale:** ROADMAP explicitly notes "Deferred to Phase 4 if Phase 3 plans are tight"; user explicitly chose 5-plan structural granularity over 3-plan thin OR 5-plan-with-rrweb-v2-upgrade-thick. Tight per-plan scoping per Q2/Q3/Q4 individual answers.
### SPEC §10 #8 password handling (charter tension surface)
- **D-P3-02:** §10 #8 = verify existing minimum + PARTIAL mark in VERIFICATION.md
- `src/content/index.ts:82` already has `if (target.type === 'password') return;` skip in setupInputLogging — Plan 03-03 ships a harness assertion that VERIFIES this filter fires (synthetic password input on a probe page + grep events.json for absence of the entered value).
- VERIFICATION.md marks SPEC §10 #8 as PARTIAL with explicit charter citation: "Full masking deferred per 'we don't care about privacy hardening. At least here.' 2026-05-20 charter + REQ-password-confidentiality moved to Out of Scope v1."
- **NOT in scope:** rrweb v2 `maskInputFn`, `data-sensitive` HTML attribute guards, full §10 #8 closure — all deferred to Phase 4 if charter reverses.
- **Rationale:** Honor charter literally while preserving the existing defense-in-depth. The filter is already there; verifying it doesn't expand scope and produces a real partial-mitigation record.
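A minimal sketch of the behavior Plan 03-03 is meant to confirm, assuming the filter works like the `target.type === 'password'` skip quoted above. `InputLike`, `shouldLogInput`, and `loggedValues` are hypothetical names for illustration and are not taken from `src/content/index.ts`.

```typescript
// Hypothetical stand-in for the fields of an input element that the filter reads.
interface InputLike {
  type: string;
  value: string;
}

// Mirrors the quoted guard: password fields are skipped, everything else is logged.
function shouldLogInput(target: InputLike): boolean {
  return target.type !== 'password';
}

// Returns the values that would reach the event log for a batch of inputs.
function loggedValues(targets: InputLike[]): string[] {
  return targets.filter(shouldLogInput).map((t) => t.value);
}
```

This only models the existing minimum. It says nothing about what rrweb itself records for the same field, which is why §10 #8 stays PARTIAL.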
### rrweb v2 upgrade scope (defer)
- **D-P3-03:** rrweb 2.0.0-alpha.4 stays pinned through Phase 3. Upgrade research + implementation DEFER TO PHASE 4 hardening.
- **Rationale:** rrweb alpha-pin has been stable across 5 Phase 1 plans + 4 Phase 2 plans + 29/29 UAT GREEN end-to-end. No known production bugs in current pin. Moving from the alpha to stable v2 is a breaking surface (`maskInputFn` API change is one known migration); risk-vs-reward favors deferring to Phase 4, when more harness time + research budget are available.
- Phase 4 candidate task: gsd-phase-researcher spawn to verify alpha-pin safety + check stable v2 ship status; if stable, schedule upgrade plan.
### RAM ceiling §10 #9 verification approach (best-effort)
- **D-P3-04:** §10 #9 = best-effort + operator instructions in VERIFICATION.md
- VERIFICATION.md marks §10 #9 as N/A-in-harness with explicit operator instruction text: "Load extension; idle 5 min; open chrome://memory-internals (or chrome://extensions → service-worker memory display); verify extension background RAM ≤ 50 MB."
- Alpha distribution build covers real-world observation across multiple operator profiles.
- **Optional scaffolding** (Plan 03-04 may include if practical without research budget): puppeteer.Page.metrics() returning JSHeapUsedSize for SW + offscreen context. If straightforward, add as A29-style assertion; if research-heavy, skip and defer to Phase 4.
- **Rationale:** Honors saved memory feedback-trust-harness-over-manual-uat.md (manual surface limited to genuine non-automatable case). chrome.devtools Memory API is full-fat measurement but is research-heavy + out of charter for verification-phase scope.
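If the optional scaffolding turns out to be practical, the comparison itself could look like the sketch below. It assumes a metrics object shaped like the result of Puppeteer's `Page.metrics()` (heap sizes in bytes) and reads "50 MB" as 50 MiB. `HeapMetrics`, `heapUsedMb`, and `isUnderRamCeiling` are hypothetical names. JS heap is only one component of process RAM, and whether such metrics can be obtained for the SW + offscreen contexts is the open question left to Plan 03-04.

```typescript
// Subset of the object returned by Puppeteer's Page.metrics(); sizes are in bytes.
interface HeapMetrics {
  JSHeapUsedSize?: number;
  JSHeapTotalSize?: number;
}

// Assumption: "MB" in SPEC §10 #9 is read as MiB (1024 * 1024 bytes).
const BYTES_PER_MB = 1024 * 1024;

// Converts the used JS heap to MB, or returns undefined when the metric is absent.
function heapUsedMb(metrics: HeapMetrics): number | undefined {
  return metrics.JSHeapUsedSize === undefined
    ? undefined
    : metrics.JSHeapUsedSize / BYTES_PER_MB;
}

// A missing metric counts as "not verified" rather than as a pass.
function isUnderRamCeiling(metrics: HeapMetrics, ceilingMb = 50): boolean {
  const used = heapUsedMb(metrics);
  return used !== undefined && used <= ceilingMb;
}
```

Treating a missing metric as a failure keeps the assertion from going GREEN when the measurement silently returns nothing.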
### Claude's Discretion
- **Harness assertion numbering:** A29+ for new Phase 3 assertions (continues A24..A28 sequence from Phase 2 Plan 02-04). Exact A-numbers per plan are planner-side.
- **Probe page composition** (Plan 03-01): synthetic HTML inline in extension-page-harness.ts (form + table + modal) vs real-world page navigation — planner's call based on existing harness page patterns.
- **Event-log trigger strategy** (Plan 03-02): synthetic browser events injected via Puppeteer page.click / page.type vs natural extension lifecycle observation — planner's call; both are valid Approach B applications.
- **Sequencing** (Plan 03-05 vs 03-01..04): Plan 03-05 is the synthesis plan; runs after 03-01..04. Whether 03-01..04 can parallelize within wave 2 depends on `files_modified` overlap audit at plan time (per Phase 2 Wave 2 lesson — plan-checker should catch overlaps).
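The `files_modified` overlap audit mentioned under Sequencing can be sketched as a small check. This is an illustration only: `findOverlaps` and the input shape (plan id mapped to its `files_modified` list) are assumptions, not the plan-checker's actual implementation.

```typescript
// Maps each file touched by more than one plan to the plans that touch it.
// Input shape is hypothetical: plan id -> list of files that plan modifies.
function findOverlaps(plans: Record<string, string[]>): Record<string, string[]> {
  const owners: Record<string, string[]> = {};
  for (const plan of Object.keys(plans)) {
    for (const file of plans[plan]) {
      if (!owners[file]) owners[file] = [];
      // Guard against the same file being listed twice within one plan.
      if (owners[file].indexOf(plan) === -1) owners[file].push(plan);
    }
  }
  const overlaps: Record<string, string[]> = {};
  for (const file of Object.keys(owners)) {
    if (owners[file].length > 1) overlaps[file] = owners[file];
  }
  return overlaps;
}
```

An empty result would suggest the plans can share a wave; any entry names the file that forces sequencing.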
</decisions>
<canonical_refs>
## Canonical References
**Downstream agents MUST read these before planning or implementing.**
### Phase scope + acceptance source
- `.planning/ROADMAP.md` §"Phase 3" — phase goal + dependencies + absorbed Phase-2 scope + P0-7 framing
- `.planning/REQUIREMENTS.md` §"DOM Capture" (REQ-rrweb-dom-buffer) + §"Event Logging" (REQ-user-event-log) + §"Phase 1 Acceptance Criteria (SPEC §10 verbatim)" §10 #1-#9 + §"Traceability" table
- `.planning/PROJECT.md` §"Validated" (Phase 1 + 2 closure record) + §"Key Decisions" DEC-001 to DEC-012 + Amendment 1 (tabs permission for Phase 2)
- `Тз расширение фаза1.md` (if present at repo root or `.planning/intel/`) — original Russian SPEC §10 source
### Prior-phase verification + harness pattern
- `.planning/phases/01-stabilize-video-pipeline/01-VERIFICATION.md` — Phase 1 verifier audit GREEN 17/17; prior must-haves coverage; D-13 restart-segments architecture preserved
- `.planning/phases/02-stabilize-export-pipeline/02-VERIFICATION.md` — Phase 2 verifier audit PASSED 5/5; T5 override pattern (user delegation + harness coverage); use as VERIFICATION.md structural template for Plan 03-05
- `.planning/phases/01-stabilize-video-pipeline/01-13-SUMMARY.md` — Approach B harness ESTABLISHED (assertA* + driveA* + orchestrator pattern); Tier-1 FORBIDDEN_HOOK_STRINGS inventory baseline
- `.planning/phases/01-stabilize-video-pipeline/01-12-SUMMARY.md` — UAT harness 16→21 GREEN extension precedent (A18-A22)
- `.planning/phases/01-stabilize-video-pipeline/01-14-SUMMARY.md` — A23 single-assertion plan precedent
- `.planning/phases/02-stabilize-export-pipeline/02-04-SUMMARY.md` — A24-A28 harness extension + bundle gates precedent (most recent Approach B plan; closest analog for Phase 3 plans)
### Production code surfaces being verified (read-only for Phase 3)
- `src/content/index.ts` — rrweb wiring (line 1 import + lines 20-41 buffer cleanup + lines 284-311 init + line 82 password filter); event log via setupInputLogging + click + navigation + fetch/XHR/error capture
- `src/background/index.ts` §"GET_RRWEB_EVENTS" handler — SW pulls events from content script at SAVE time
- `src/shared/types.ts` — EventWithTime, UserEvent (from rrweb + project types)
- `package.json` line 11 — `"rrweb": "^2.0.0-alpha.4"` (pin source of D-P3-03 defer rationale)
### UAT harness extension surfaces (Phase 3 modifies these)
- `tests/uat/extension-page-harness.ts` — page-side `assertA*` host
- `tests/uat/lib/harness-page-driver.ts` — host-side `driveA*` host
- `tests/uat/harness.test.ts` — orchestrator wiring
- `tests/uat/lib/assertions.ts` — shared harness assertion helpers
- `tests/uat/lib/launch.ts` — Puppeteer Chrome launch + extension load
- `tests/uat/lib/zip.ts` — jszip-based archive parsing helpers
- `tests/uat/lib/test-hook-contract.d.ts` — Tier-1 hook surface contract
- `tests/background/no-test-hooks-in-prod-bundle.test.ts` — FORBIDDEN_HOOK_STRINGS inventory (currently 12 entries; Plan 02-04 added none)
### Decision/operating-principle references
- `/home/parf/.claude/projects/-home-parf-projects-work-repremium/memory/feedback-trust-harness-over-manual-uat.md` — when harness comprehensively covers checklist surfaces, run it instead of asking operator (T5-style override pattern; use for §10 #4/#5/#7 closure; §10 #9 RAM is the genuine exception)
- `/home/parf/.claude/projects/-home-parf-projects-work-repremium/memory/feedback-pre-checkpoint-bundle-gates.md` — pre-checkpoint bundle gates (6/6 standard inventory: Tier-1 grep + SW CSP + Node-globals + DOM-globals + manifest + en↔ru parity) — apply before any operator step
- `/home/parf/.claude/projects/-home-parf-projects-work-repremium/memory/feedback-gsd-ceremony-for-fixes.md` — debug discoveries route through /gsd-debug, never hot-edits
- `/home/parf/.claude/projects/-home-parf-projects-work-repremium/memory/feedback-no-unilateral-scope-reduction.md` — user validates every scope decision; surface broadest scope; don't pre-filter
### Charter-shift provenance
- 2026-05-20 charter shift "we don't care about privacy hardening" → REQ-password-confidentiality moved Out of Scope v1; original-Phase-2 (DOM/privacy) removed; renumbering documented in `.planning/STATE.md` Session Continuity + `.planning/ROADMAP.md` re-phasing commit `6dbed91`
</canonical_refs>
<code_context>
## Existing Code Insights
### Reusable Assets
- **UAT harness Approach B**: page-side `assertA*` (named `assertA29`, `assertA30`, ... per Phase 3 plan numbering) + host-side `driveA*` + `harness.test.ts` orchestrator. Plan 02-04 extended this 24→29 with 5 new assertions; Plan 03-01..03-04 extend 29→3X further. Pattern is mature; planner can replicate verbatim.
- **JSZip archive parsing** (`tests/uat/lib/zip.ts`): jszip-based reading; Plan 02-04 used for A28 set-equality. Plan 03-01 / 03-02 may read events.json + session.json from inside the assembled archive for verification.
- **Puppeteer page.evaluate + page.type + page.click** (per Plan 02-04 `driveA*` precedent): standard surface for synthetic event injection (Plan 03-02 trigger source).
- **chrome.i18n.getMessage fallback pattern** (Plan 01-12 + 01-10): `chrome.i18n.getMessage('<key>') || '<en-const>'` — Plan 03-* may need operator-facing copy for VERIFICATION.md instructions; keep i18n-safe.
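The i18n fallback pattern in the last bullet can be wrapped in a helper, sketched here with the lookup function injected so it runs outside an extension context. `localized` and `GetMessage` are hypothetical names. The sketch relies on `chrome.i18n.getMessage` returning an empty string for a missing message.

```typescript
// Lookup function with the same first parameter as chrome.i18n.getMessage.
type GetMessage = (key: string) => string;

// Returns the localized string, or the English constant when the lookup is empty.
function localized(getMessage: GetMessage, key: string, enFallback: string): string {
  return getMessage(key) || enFallback;
}

// In extension code the call might look like:
//   localized(chrome.i18n.getMessage, 'someKey', 'English fallback text')
```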
### Established Patterns
- **Tier-1 FORBIDDEN_HOOK_STRINGS lockstep**: new harness probes that ride existing production surfaces (data-mks-*) are NOT hooks. New test-only surfaces gated by `__MOKOSH_UAT__` Vite-define-token go into FORBIDDEN_HOOK_STRINGS and must be hook-free in production bundle. Plan 02-04 stayed at 12 entries because A24-A28 used production surfaces only; Phase 3 should follow same pattern.
- **Pre-checkpoint bundle gates** per saved memory: SW CSP-safety + Node-globals + DOM-globals + Tier-1 grep + manifest validation + en↔ru parity. Phase 3 doesn't introduce new bundle surfaces (verification-only), but the gates still apply before Plan 03-05 closes the phase.
- **VERIFICATION.md override pattern** (per Phase 2 T5 override): when verifier returns `human_needed` but harness already covers the surface, document override in `overrides_applied` + `override_notes` fields with explicit user-delegation citation + saved-memory reference. Use this pattern for §10 #9 RAM (genuinely manual) explicitly, NOT for §10 #4/#5/#8 (harness coverage exists).
- **rrweb cleanup mechanics** (src/content/index.ts:20-41): rolling 10-min window, 5000-event cap, oldest-dropped-on-overflow. Phase 3 verifies these contracts via Plan 03-01 harness assertions.
- **CON-event-log-schema** (REQUIREMENTS.md §"Event Logging" + intel): 5 event types (click, input, navigation, js_error, network_error). Plan 03-02 verifies all 5 are captured during a synthetic session.
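The rrweb cleanup contract listed above (rolling 10-min window, 5000-event cap, oldest dropped on overflow) can be stated as a pure function for a harness assertion to compare against. This is a sketch under assumptions: `pruneBuffer` and `TimedEvent` are hypothetical names, events are assumed to be in chronological order, and the code is not copied from `src/content/index.ts:20-41`.

```typescript
// Minimal event shape: rrweb events carry a numeric timestamp in milliseconds.
interface TimedEvent {
  timestamp: number;
}

const WINDOW_MS = 10 * 60 * 1000; // rolling 10-minute window
const MAX_EVENTS = 5000;          // event cap

// Drops events older than the window, then trims the oldest entries over the cap.
// Assumes `events` is sorted oldest-first.
function pruneBuffer<T extends TimedEvent>(
  events: T[],
  now: number,
  windowMs: number = WINDOW_MS,
  maxEvents: number = MAX_EVENTS,
): T[] {
  const cutoff = now - windowMs;
  const recent = events.filter((e) => e.timestamp >= cutoff);
  return recent.length > maxEvents
    ? recent.slice(recent.length - maxEvents)
    : recent;
}
```

A Plan 03-01 assertion could feed a synthetic buffer through a function like this and compare the result with what the extension actually retains.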
### Integration Points
- **harness.test.ts orchestrator**: Plan 03-* adds new harness orchestration paths; existing A0-A28 must remain GREEN (no regression).
- **`__MOKOSH_UAT__` Vite-define-token**: shared between Plan 01-13 + 02-04; Phase 3 follows same convention if adding test-only surfaces.
- **`dist/` build output**: Phase 3 harness runs against `dist/` after `npm run build`; SKIP_PROD_REBUILD=1 + HEADLESS=1 env flags per Plan 02-04 precedent.
- **`chrome://extensions/` service-worker memory display**: Plan 03-04 operator instructions cite this; familiar surface from Plan 01-10 operator UAT cycles.
</code_context>
<specifics>
## Specific Ideas
- **Probe page candidates for §10 #4** (planner discretion): form (input variety) + table (tbody/tr/td rendering) + modal (z-index + focus trap). Real-world example: https://example.com is too minimal; https://www.iana.org has a form. Synthetic HTML in extension-page-harness.ts is cleanest (no external network dependency in CI).
- **Synthetic password input for §10 #8**: standard `<input type="password">` element. Plan 03-03 types a non-trivial string ("secret-do-not-log-123") + saves archive + greps events.json for absence. Negative-assertion test pattern.
- **Synthetic event triggers for §10 #5**: page.click on a button (click event) + page.type on an input (input event) + page.goto/page.reload (navigation) + window.dispatchEvent(new ErrorEvent('error', ...)) (js_error) + fetch to a 404 endpoint (network_error). All 5 event types in one drive.
- **VERIFICATION.md §10 aggregator format** (Plan 03-05): replicate Phase 1 + Phase 2 VERIFICATION.md structure (frontmatter status + score + must_haves table); add §10 #1-#9 row-by-row evidence column with Phase-citation + plan-citation + commit-citation.
- **Alpha re-distribution after Phase 3**: build/zip pattern from Plan 01-10 closure (mokosh-build-2026-05-20-6dbed91.zip + INSTALL.md). User may want refreshed zip for testers after Phase 3 closes (separate workstream; not Phase 3 plan).
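The §10 #5 and §10 #8 ideas above reduce to two checks on the parsed `events.json`: every one of the 5 event types appears, and the typed secret does not. A minimal sketch, assuming each logged event exposes its kind in a `type` field (a hypothetical field name; the real shape lives in `src/shared/types.ts`). `missingEventTypes` and `leaksSecret` are illustrative names.

```typescript
// The 5 event types named by CON-event-log-schema.
const REQUIRED_TYPES = ['click', 'input', 'navigation', 'js_error', 'network_error'] as const;
type RequiredType = (typeof REQUIRED_TYPES)[number];

// Hypothetical minimal shape of one entry in events.json.
interface LoggedEvent {
  type: string;
}

// Returns the required types that never appear in the log (empty means full coverage).
function missingEventTypes(events: LoggedEvent[]): RequiredType[] {
  return REQUIRED_TYPES.filter((t) => !events.some((e) => e.type === t));
}

// Negative assertion helper: true when the raw JSON text contains the secret.
function leaksSecret(eventsJson: string, secret: string): boolean {
  return eventsJson.indexOf(secret) !== -1;
}
```

Searching the raw JSON text rather than individual fields mirrors the "grep for absence" idea: the secret must not appear anywhere in the file, whatever the field.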
</specifics>
<deferred>
## Deferred Ideas
### To Phase 4 (Hardening + clean up — optional)
- **rrweb v2 stable upgrade** — research + implementation (D-P3-03 defer rationale); gsd-phase-researcher spawn to verify alpha-pin safety + stable v2 ship status; if stable, schedule upgrade plan
- **Programmatic RAM measurement** — puppeteer.Page.metrics() OR chrome.devtools Memory API (D-P3-04 defer rationale); upgrade from best-effort if alpha-tester observation surfaces concern
- **REQ-password-confidentiality v2 candidate** — full rrweb v2 maskInputFn + data-sensitive HTML attribute guards (D-P3-02 defer rationale); only if charter reverses ("we don't care about privacy hardening" reversal)
- **Audit P1 #11/#14/#15 polish** — fetch Request→[object Request], navigation URL tracking, rrweb timestamp semantics; pre-existing Phase 4 backlog
- **2 pre-existing ffprobe/ffmpeg vitest flakes** — Phase 4 hardening list
- **getDisplayMedia cursor visibility refinement** (Plan 01-07 operator observation 2026-05-15) — Phase 4 hardening
- **Dark-surface logo contrast** (Plan 01-10 operator observation 2026-05-20) — Phase 4 hardening
- **setimmediate polyfill `new Function` in SW chunk** via vite-plugin-node-polyfills (Plan 01-12 disclosure) — Phase 4 hardening; logged at .planning/phases/01-stabilize-video-pipeline/deferred-items.md
- **ROADMAP backfill for Plans 01-08..01-13 entries** (Plan 01-13 plan-checker flag #4) — Phase 4 docs polish
### To v2 / SRV milestone (per SPEC §9 Out of Scope)
- Server upload of captured archives (SRV-01)
- AI-driven diagnostics (SRV-02)
- Automatic ticket creation (SRV-03)
- Analytics dashboard (SRV-04)
- Audio recording (CAP-01)
</deferred>
---
*Phase: 03-spec-10-smoke-verification-dom-event-log-verification*
*Context gathered: 2026-05-20*


@@ -0,0 +1,108 @@
# Phase 3: SPEC §10 smoke verification + DOM/event-log verification - Discussion Log
> **Audit trail only.** Do not use as input to planning, research, or execution agents.
> Decisions are captured in CONTEXT.md — this log preserves the alternatives considered.
**Date:** 2026-05-20
**Phase:** 03-spec-10-smoke-verification-dom-event-log-verification
**Areas discussed:** Phase 3 plan structure & breadth, SPEC §10 #8 password handling, rrweb v2 upgrade scope, RAM ceiling §10 #9 verification approach
---
## Gray area selection (multiSelect)
User picked all 4 gray areas presented — consistent with feedback-no-unilateral-scope-reduction memory pattern.
---
## Phase 3 plan structure & breadth
| Option | Description | Selected |
|--------|-------------|----------|
| Thin — 3 plans | rrweb DOM harness + event-log harness + §10 sweep VERIFICATION. rrweb v2 + RAM slip to Phase 4. | |
| Standard — 4 plans | Thin + 1 of (rrweb v2 upgrade OR RAM verification). | |
| Thick — 5 plans | Standard + both deferred items (rrweb v2 upgrade + RAM programmatic measurement). | ✓ |
**User's choice:** Thick — 5 plans (with clarification — see follow-up below)
**Follow-up clarification (composition mismatch with Q3/Q4):**
Read literally, Q1 "Thick" would include the rrweb v2 upgrade + programmatic RAM measurement as separate plans, but Q3 chose "Defer rrweb v2 to Phase 4" and Q4 chose "RAM = best-effort + operator note". Spirit interpretation: 5 atomic plans for structural granularity, with each plan tightly scoped per Q2/Q3/Q4.
| Sub-option | Description | Selected |
|--------|-------------|----------|
| Spirit interpretation | 5 atomic plans tightly scoped per Q2/Q3/Q4 (rrweb DOM harness + event-log harness + §10 #8 verify + §10 #9 best-effort + §10 sweep VERIFICATION). | ✓ |
| Literal Q1 reading | 5 plans incl. rrweb v2 upgrade + RAM programmatic measurement; treats the Q3/Q4 answers as misclicks. | |
| Compromise mix | 5 plans with rrweb v2 RESEARCH ONLY (no upgrade) + RAM best-effort folded into §10 sweep. | |
**Notes:** Spirit interpretation locked. Final 5-plan composition documented in CONTEXT.md D-P3-01.
---
## SPEC §10 #8 password handling (charter tension surface)
| Option | Description | Selected |
|--------|-------------|----------|
| Verify existing minimum + mark partial | Harness assertion verifies existing target.type==='password' filter at src/content/index.ts:82; mark PARTIAL in VERIFICATION.md with charter citation. | ✓ |
| Explicit OUT OF SCOPE | No harness assertion; existing filter is incidental defense-in-depth. | |
| Full mitigation (rrweb maskInputFn) | Wire rrweb v2 maskInputFn + extend filter to data-sensitive guards; contradicts charter. | |
**User's choice:** Verify existing minimum + mark partial (Recommended)
**Notes:** Honor charter literally ("we don't care about privacy hardening" 2026-05-20) while preserving existing defense-in-depth. Plan 03-03 verifies the existing filter without expanding scope. VERIFICATION.md marks #8 as PARTIAL with charter citation.
---
## rrweb v2 upgrade scope
| Option | Description | Selected |
|--------|-------------|----------|
| Defer to Phase 4 | rrweb 2.0.0-alpha.4 stable across 9 plans; defer research + upgrade to Phase 4 hardening. | ✓ |
| Research only in Phase 3 (light) | gsd-phase-researcher spawn to verify alpha-pin safety + stable v2 ship status; no code change. | |
| Research + upgrade in Phase 3 (full) | Research + upgrade now; 1 extra plan; risks regression. | |
**User's choice:** Defer to Phase 4 (Recommended)
**Notes:** alpha-pin proven stable through Phase 1 + Phase 2 + 29/29 UAT GREEN. ROADMAP explicitly notes "Deferred to Phase 4 if Phase 3 plans are tight" — honored.
---
## RAM ceiling §10 #9 verification approach
| Option | Description | Selected |
|--------|-------------|----------|
| Best-effort + operator/alpha note | Mark §10 #9 N/A-in-harness in VERIFICATION.md with operator instructions for chrome://memory-internals or chrome://extensions service-worker memory display; alpha-tester observation. | ✓ |
| Programmatic via puppeteer.Page.metrics | Add A29 harness assertion: puppeteer.Page.metrics() returns JSHeapUsedSize/JSHeapTotalSize; assert < 50 MB. | |
| chrome.devtools Memory API — full path | Most accurate; longest dev path; research-heavy. | |
**User's choice:** Best-effort + operator/alpha note (Recommended)
**Notes:** Honors feedback-trust-harness-over-manual-uat.md (manual surface limited to genuine non-automatable case). chrome.devtools Memory API is research-heavy + out of charter for verification-phase scope. Optional puppeteer.Page.metrics scaffolding may be added in Plan 03-04 if practical without research budget.
---
## Claude's Discretion
Captured in the CONTEXT.md "Claude's Discretion" subsection (under `<decisions>`):
- Harness assertion numbering (A29+ for new Phase 3 assertions)
- Probe page composition for §10 #4 (synthetic vs real-world)
- Event-log trigger strategy for §10 #5 (synthetic vs natural)
- Sequencing within Wave 2 (parallelization audit per plan-checker)
---
## Deferred Ideas
All deferred items captured in CONTEXT.md `<deferred>` section:
- Phase 4: rrweb v2 stable upgrade, programmatic RAM measurement, REQ-password-confidentiality v2 candidate, audit P1 #11/#14/#15 polish, 2 ffprobe/ffmpeg vitest flakes, cursor visibility refinement, dark-surface logo contrast, setimmediate polyfill hardening, ROADMAP backfill
- v2 / SRV milestone: server upload (SRV-01), AI diagnostics (SRV-02), automatic ticketing (SRV-03), analytics dashboard (SRV-04), audio recording (CAP-01)
---
## Session efficiency notes
- User batched all 4 gray areas in 1 AskUserQuestion (multiSelect)
- All 4 area-specific questions presented in 1 AskUserQuestion call (4 questions batch)
- 1 follow-up clarification for Q1↔Q3/Q4 composition tension
- Total: 3 user-interaction turns (gray-area selection + question batch + clarification)
- Aligned with user's "minimum friction everywhere" pattern + saved memory feedback-no-unilateral-scope-reduction