Plan 01-14 ships W3C Screen Capture monitorTypeSurfaces: 'include' (Chrome
119+) on the offscreen getDisplayMedia call, plus an A23 harness regression
assertion that verifies the constraint reaches the call site via the
existing offscreen-hooks bridge.

Scope: 1 source line + A23 wiring + Tier-1 grep gate inventory update
(lockstep extension of unit-gate + UAT A0 FORBIDDEN_HOOK_STRINGS).
Autonomous, single executor; no operator empirical checkpoint (UAT 16/16
harness coverage suffices per feedback-pre-checkpoint-bundle-gates.md).

Canonical sources:
- Plan 01-10 RESEARCH section 5 ('monitorTypeSurfaces: include' recommendation)
- Plan 01-10 RESEARCH section Pitfall-5 ('Misinterpreting displaySurface
  as a hard constraint': monitorTypeSurfaces is the picker-UI complement
  to D-15's post-grant validation)
- W3C Screen Capture spec section 6.1 DisplayMediaStreamOptions
- developer.chrome.com/docs/web-platform/screen-sharing-controls

Decisions honored:
- D-01 (whole-desktop only via getDisplayMedia; reject window/tab): the
  new constraint is the picker-UI realization of D-01's intent.
- D-15 (post-grant displaySurface validation): UNCHANGED; remains the
  enforcement (this plan is belt-and-suspenders at the picker UI level).

Ceremony note: this plan replaces the prior AMENDMENT-A.md improvisation
path retired per 01-11-SUMMARY Architectural Notes. Canonical GSD ceremony
(plan -> checker -> executor -> SUMMARY).

Validations:
- gsd-sdk frontmatter.validate -> valid: true (8/8 required fields).
- gsd-sdk verify.plan-structure -> valid: true (1 task; hasFiles/hasAction
  /hasVerify/hasDone all true).
- ROADMAP.md Phase 1 plans list extended with 01-14 entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Roadmap: Mokosh

## Overview

The Mokosh codebase is a partially-broken first attempt at SPEC
`Тз расширение фаза1.md`. An external audit identified 7 P0 defects that
prevent SPEC §10 acceptance. This roadmap therefore frames Phase 1 as
**stabilization to spec**, not greenfield: phases 1–3 each remediate a
tightly-grouped subset of the P0 defects along sensible commit boundaries;
phase 4 runs the SPEC §10 smoke pass end-to-end. An optional phase 5 absorbs
the P1/P2 follow-ups (SW state persistence, `fetch` interception fix,
`meta.json` field hardening, `generate-icons.js` ESM/CJS, dead-code cleanup).

The journey: **broken-but-installable → playable video → masked DOM + log →
working export → green §10 smoke → harden + clean up**.

## Phases

**Phase Numbering:**
- Integer phases (1, 2, 3): Planned milestone work.
- Decimal phases (2.1, 2.2): Urgent insertions (marked with INSERTED).

Decimal phases appear between their surrounding integers in numeric order.

- [ ] **Phase 1: Stabilize video pipeline** - Collapse offscreen duality, fix MediaRecorder shadow, fix WebM ring buffer playability, replace `chrome.tabCapture` with offscreen `getDisplayMedia` (AMENDED from original DEC-003). **Closed 2026-05-15 then REOPENED 2026-05-16**: the 2026-05-15 closure was based on insufficient operator playback verification; D-13's concat-of-self-contained-segments architecture produces a multi-EBML WebM that plays only ~9 s instead of ~30 s in standards-compliant parsers (mpv, ffmpeg, Chrome HTMLMediaElement). UAT Test 3 retest on 2026-05-16 confirmed via byte-level EBML probe. SPEC §10 #7 not actually satisfied. Plan 01-08 (WebM remux via ts-ebml + webm-muxer) replaces `mergeVideoSegments`'s file-concat with a real single-EBML remux. See `.planning/debug/d13-multi-ebml-concat-unplayable.md`. Option C port-lifecycle refactor (debug session `empty-archive-port-race`) DID land cleanly and is retained. Phase 1 will additionally absorb whole-desktop + auto-start UX work (Plans 01-09/01-10) per the 2026-05-16 amended charter.
- [ ] **Phase 2: Stabilize DOM + event capture privacy** - Migrate rrweb to v2 `maskInputFn`, plug `content/index.ts setupInputLogging` password leak
- [ ] **Phase 3: Stabilize export pipeline** - Restore user-activation gesture in popup, delete dead `permissions.request`, replace base64 `data:` URL with Blob URL minted in offscreen
- [ ] **Phase 4: SPEC §10 smoke verification** - End-to-end install-and-record-and-export pass against all 9 acceptance criteria
- [ ] **Phase 5: Harden + clean up** _(optional)_ - P1/P2 follow-ups: SW state persistence, fetch interception, `meta.json` fields, `generate-icons.js` ESM/CJS, dead-code

## Phase Details

### Phase 1: Stabilize video pipeline
**Goal**: The video ring buffer captures the most recent 30 s of the active
tab's video continuously across tab switches, with a playable WebM header
retained, so that on export the assembled `last_30sec.webm` will play.

**Depends on**: Nothing (first phase). Operates on the existing `offscreen/`
directory + `vite.config.ts` inline string + `src/background/`.

**Requirements**: REQ-video-ring-buffer

**P0 defects addressed**:
- P0-1: Collapse the offscreen duality (`offscreen/index.ts` + inline string in
  `vite.config.ts`) into a single source of truth; fix the `mediaRecorder`
  shadow that breaks `stopRecording`.
- P0-2: Fix WebM ring buffer playability: single continuous `MediaRecorder`,
  2000 ms timeslice per spec (CON-video-codec), cluster-timestamp-based rolling
  window honouring the WebM header retention rule (DEC-009).
- P0-3: Make capture always-on with `chrome.tabs.onActivated` /
  `chrome.tabs.onUpdated` re-attachment; start on `onInstalled` / `onStartup`,
  not on popup open (CON-tab-capture-binding, REQ-video-ring-buffer).

**Success Criteria** (what must be TRUE):
1. [x] There is exactly one source of truth for the offscreen document; rebuilding
   `vite.config.ts` does not regenerate a divergent inline duplicate, and
   `stopRecording` runs without `mediaRecorder is undefined` shadow errors.
2. [x] With the extension loaded and an operator session active, a `MediaRecorder`
   is running on the operator-selected screen/window source. AMENDED 2026-05-15
   (D-13 fix-a3 activation): the recorder cycles in 10 s self-contained segments
   (stop+restart on the same `MediaStream`) instead of a single continuous
   recorder; this replaces D-09..D-11 to fix VP9 keyframe orphan-P-frame freezes.
   Recording continues unchanged across tab switches (no tab re-attach logic;
   AMENDED from the original wording).
3. [x] The in-memory video ring buffer at any instant contains at most 3 of the
   most recent 10 s WebM segments (3 × 10 s = 30 s window, no more, no less);
   concatenating segments sequentially yields a multi-EBML-header WebM that
   Chrome plays end-to-end (SPEC §10 #7: operator confirmed clean playback
   2026-05-15; ffmpeg `-v warning -i fixture -f null -` exit 0 with zero
   decoder errors, only expected muxer DTS-monotonicity warnings at segment
   join boundaries).

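The segment window in criterion 3 can be pictured with a minimal sketch like the one below. It assumes each finished 10 s segment arrives as one self-contained byte array; the names `Segment` and `SegmentRingBuffer` are hypothetical and are not taken from the project's offscreen recorder.

```typescript
// Hypothetical sketch of a fixed-capacity segment window (3 x 10 s = 30 s).
// Names and shapes are illustrative assumptions, not the project's code.
interface Segment {
  startedAtMs: number; // wall-clock start of the segment
  bytes: Uint8Array;   // one self-contained WebM segment
}

class SegmentRingBuffer {
  private readonly capacity: number;
  private readonly segments: Segment[] = [];

  constructor(capacity: number = 3) {
    this.capacity = capacity;
  }

  // Append the newest segment and evict the oldest ones beyond capacity.
  push(segment: Segment): void {
    this.segments.push(segment);
    while (this.segments.length > this.capacity) {
      this.segments.shift();
    }
  }

  // Return the retained segments, oldest first, as a defensive copy.
  snapshot(): Segment[] {
    return [...this.segments];
  }
}
```

Pushing a fourth segment evicts the first, so an export taken at any moment sees at most the three most recent segments.
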
**Plans**: 14 plans (01-01 through 01-14)
- [x] 01-01-PLAN.md - Doc cascade: amend DEC-003 / DEC-010 / RETIRE constraints / swap manifest permissions (D-A1..D-A6)
- [x] 01-02-PLAN.md - Wave-0 test infrastructure: Vitest install + 4 RED test files + fixtures placeholder
- [x] 01-03-PLAN.md - Offscreen recorder TDD: ring buffer + codec strict-mode + getDisplayMedia + track-ended cleanup; D-13 fallback skeleton pre-staged
- [x] 01-04-PLAN.md - Port keepalive + OFFSCREEN_READY handshake (TDD): replaces alarms keepalive on offscreen side
- [x] 01-05-PLAN.md - SW shrink: delete legacy buffer + alarms + IndexedDB + tabCapture paths; wire SW-side onConnect host
- [x] 01-06-PLAN.md - Build pipeline collapse: delete vite.config.ts inline plugin + top-level offscreen/ dir; declare rollupOptions.input
- [x] 01-07-PLAN.md - Manual smoke + ffprobe D-12 acceptance gate + A3 empirical-playback gate; D-12 + A3 debug sessions resolved mid-execution via pre-staged base64 wire format + D-13 restart-segments; regression fixture committed; SPEC §10 #2/#3/#7 functionally green (Closed 2026-05-15)
- [x] 01-08-PLAN.md - WebM remux via ts-ebml + webm-muxer (replaces D-13 file-concat; closes SPEC §10 #7 playability per debug d13-multi-ebml-concat-unplayable.md)
- [x] 01-09-PLAN.md - Toolbar onClicked direct flow + monitor-only picker + onStartup notification + 3-state badge state machine; closure-by-harness Amendment 2 (Plan 01-13 PASS substitutes for operator UAT)
- [ ] 01-10-PLAN.md - Welcome tab (Hero + Loom dial per D-02; first-install onboarding; harness A15-A17)
- [x] 01-11-PLAN.md - UAT harness Approach-A spike (PIVOTED to 01-13; carries forward Wave 0 infrastructure + Tier-1 grep gate; falsified hypotheses recorded)
- [ ] 01-12-PLAN.md - Design integration (R2 Lora self-host, src/shared/tokens.css canonical, 8 i18n strings + 4 supporting keys, branded Loom icons, manifest i18n; harness A18-A22)
- [x] 01-13-PLAN.md - UAT harness via Approach B (extension-internal-page driver + offscreen synthetic stream; 15/15 GREEN; Plan 01-09 functional closure)
- [ ] 01-14-PLAN.md - Picker narrowing via monitorTypeSurfaces:'include' (Chrome 119+ picker enhancement; A23 harness regression)

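As a rough illustration of how Plan 01-14's picker hint and the D-15 post-grant validation could fit together, the sketch below builds the options object and checks the granted surface. The helper names (`buildDesktopCaptureOptions`, `isWholeDesktopGrant`) and the locally declared option type are hypothetical; only `monitorTypeSurfaces` and `displaySurface` come from the Screen Capture spec, and the real offscreen call site may look different.

```typescript
// Hypothetical sketch: helper and type names are illustrative assumptions.
// The option type is declared locally so the snippet does not depend on
// which DOM typings are installed.
type MonitorTypeSurfaces = "include" | "exclude";

interface DesktopCaptureOptions {
  video: boolean;
  monitorTypeSurfaces: MonitorTypeSurfaces;
}

// Picker hint (Plan 01-14): ask the browser to offer whole-screen surfaces.
// This is a hint to the picker UI, not a guarantee about what is granted.
function buildDesktopCaptureOptions(): DesktopCaptureOptions {
  return { video: true, monitorTypeSurfaces: "include" };
}

// Post-grant validation (D-15): accept only a whole-monitor surface.
// `displaySurface` would be read from the granted video track's settings.
function isWholeDesktopGrant(settings: { displaySurface?: string }): boolean {
  return settings.displaySurface === "monitor";
}
```

In a browser context the options would be passed to `navigator.mediaDevices.getDisplayMedia(...)` and the granted track's `getSettings()` result fed to the check, so a `window` or `browser` grant is still rejected per D-01 even when the picker offered it.
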
### Phase 2: Stabilize DOM + event capture privacy
**Goal**: rrweb captures DOM events on typical pages and the user-event log
captures clicks/navigation/network errors, and in neither stream do password
values appear.

**Depends on**: Phase 1 (no functional dependency, but Phase 1 establishes the
"capture is always-on" baseline that this phase plugs into).

**Requirements**: REQ-rrweb-dom-buffer, REQ-user-event-log,
REQ-password-confidentiality

**P0 defects addressed**:
- P0-5: rrweb data-sensitive leak: migrate to rrweb v2 `maskInputFn` (the
  legacy `maskInputSelector` is gone in v2.0.0-alpha.4 per `package.json`); fix
  the parallel leak in `src/content/index.ts setupInputLogging` so password
  field values are dropped at logger entry, not just at rrweb level.

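The two-layer fix in P0-5 could look roughly like the sketch below: one masking function shaped like a text-in/text-out callback, and one guard that drops sensitive input events before they reach the event log. Everything here is a hypothetical illustration. `FieldInfo` stands in for the DOM element, and the assumption that rrweb's `maskInputFn` receives the text plus the element should be checked against the installed rrweb version.

```typescript
// Hypothetical field descriptor; a real implementation would read these
// values from the DOM element (its `type` and `data-sensitive` attributes).
interface FieldInfo {
  type: string;
  dataSensitive?: string;
}

// A field is sensitive if it is a password input or is marked
// data-sensitive="true".
function isSensitiveField(field: FieldInfo): boolean {
  return field.type.toLowerCase() === "password" || field.dataSensitive === "true";
}

// Masking layer: replace every character of a sensitive value with "*".
function maskInputValue(text: string, field: FieldInfo): string {
  return isSensitiveField(field) ? "*".repeat(text.length) : text;
}

interface InputLogEntry {
  type: "input";
  value: string;
}

// Logger-entry guard: sensitive fields produce no log entry at all, so the
// typed value never enters the user-event log.
function toInputLogEntry(value: string, field: FieldInfo): InputLogEntry | null {
  return isSensitiveField(field) ? null : { type: "input", value };
}
```

Keeping `isSensitiveField` as the single predicate for both layers is one way to avoid the rrweb stream and the event log drifting apart on what counts as sensitive.
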
**Success Criteria** (what must be TRUE):
1. On a page containing `<input type="password">` and elements with
   `data-sensitive="true"`, rrweb snapshots for that page mask the value of
   both kinds of fields (verified by inspecting exported `rrweb/session.json`).
2. On the same page, typing into a password field produces no `input` event
   entry containing the typed value in the user-event log
   (`logs/events.json`).
3. On a typical page with forms, tables, and a modal, rrweb records DOM
   events without throwing in the Content Script console; the event log
   captures clicks, navigations (`popstate`/`hashchange`), and network errors
   (`fetch` / `XHR` >= 400).

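The ">= 400" rule for network errors in criterion 3 can be expressed as a small pure function that both a `fetch` wrapper and an `XHR` listener could share. This is only a sketch: the `network_error` type string appears in this roadmap, but the other entry fields and the function name are assumptions.

```typescript
// Hypothetical log entry shape; only the "network_error" type string is
// taken from the roadmap, the remaining fields are illustrative.
interface NetworkErrorEntry {
  type: "network_error";
  method: string;
  url: string;
  status: number;
  timestamp: number;
}

// Turn a completed HTTP exchange into a log entry when the status is an
// error (>= 400); successful and redirect responses yield null.
function toNetworkErrorEntry(
  method: string,
  url: string,
  status: number,
  nowMs: number
): NetworkErrorEntry | null {
  if (status < 400) {
    return null;
  }
  return {
    type: "network_error",
    method: method.toUpperCase(),
    url,
    status,
    timestamp: nowMs,
  };
}
```

Requests that fail without any HTTP status (for example a rejected `fetch`) are outside this sketch and would need their own handling.
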
**Plans**: TBD
**UI hint**: yes

### Phase 3: Stabilize export pipeline
**Goal**: A click on "Сохранить отчёт об ошибке" ("Save error report") produces
a SPEC-conformant ZIP archive on disk in under 5 s, containing a screenshot
taken at click time, laid out per CON-archive-layout, with `meta.json` per
CON-meta-json-schema, and declared by a manifest carrying exactly the
permission set in DEC-011.

**Depends on**: Phase 1, Phase 2 (export consumes the video + rrweb + event-log
buffers established by phases 1 and 2).

**Requirements**: REQ-popup-ui, REQ-screenshot-on-export, REQ-archive-layout,
REQ-meta-json-schema, REQ-archive-export-latency, REQ-manifest-permissions

**P0 defects addressed**:
- P0-4: Restore the user-activation gesture for `getMediaStreamId` by moving
  the call to the popup-click handler; delete the dead `permissions.request`
  dance that was masking the missing gesture (REQ-popup-ui,
  CON-tab-capture-binding).
- P0-6: Replace the base64 `data:` URL download with a Blob URL minted in the
  offscreen document, because the Service Worker lacks `URL.createObjectURL`
  (DEC-006, REQ-archive-export-latency).

**Success Criteria** (what must be TRUE):
1. Opening the popup shows a button reading "Сохранить отчёт об ошибке"
   ("Save error report") with sub-label "Последние 30 сек видео + 10 мин лога"
   ("Last 30 s of video + 10 min of log"); clicking it transitions
   idle → "Сохраняю..." ("Saving...") → "Готово! ✓" ("Done! ✓") → idle (with
   3 s revert) and triggers a `chrome.downloads` download.
2. The downloaded file lands in the user's Downloads folder, named
   `session_report_YYYY-MM-DD_HH-MM-SS.zip`, in under 5 seconds from click;
   opening it reveals exactly the layout in REQ-archive-layout
   (`video/last_30sec.webm`, `rrweb/session.json`, `logs/events.json`,
   `screenshot.png`, `meta.json` at the root) with no extra entries.
3. `meta.json` validates against the verbatim CON-meta-json-schema (all 7
   fields present, types correct, `timestamp` is ISO-8601 with `Z`).
4. `manifest.json` in `dist/` after `npm run build` declares exactly the
   permission set in DEC-011 with no additional or missing entries; loading
   unpacked into Chrome produces no permission-related warnings or errors in
   `chrome://extensions/`.

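The archive name and `meta.json` timestamp formats in criteria 2 and 3 can be sketched as two small formatters. The function names are hypothetical, and the roadmap does not say whether the filename uses local time or UTC; UTC is assumed here purely so the output is deterministic.

```typescript
// Hypothetical formatters; names are illustrative. ASSUMPTION: the filename
// uses UTC components (the roadmap does not specify the time zone).
function buildReportFilename(at: Date): string {
  const pad = (n: number): string => String(n).padStart(2, "0");
  const date = `${at.getUTCFullYear()}-${pad(at.getUTCMonth() + 1)}-${pad(at.getUTCDate())}`;
  const time = `${pad(at.getUTCHours())}-${pad(at.getUTCMinutes())}-${pad(at.getUTCSeconds())}`;
  return `session_report_${date}_${time}.zip`;
}

// ISO-8601 timestamp with a trailing "Z", as Date.prototype.toISOString
// always produces.
function buildMetaTimestamp(at: Date): string {
  return at.toISOString();
}
```

Deriving both strings from the same `Date` captured at click time would keep the archive name and the `timestamp` field consistent with each other.
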
**Plans**: TBD
**UI hint**: yes

### Phase 4: SPEC §10 smoke verification
**Goal**: All 9 SPEC §10 acceptance criteria pass against an unpacked load of
the build into a real Chrome instance.

**Depends on**: Phase 1, Phase 2, Phase 3.

**Requirements**: REQ-install-clean (and end-to-end verification of all
preceding REQs)

**P0 defects addressed**:
- P0-7: End-to-end smoke verification against §10. This is a verification
  phase, not a new implementation; it confirms the cumulative output of
  phases 1–3 actually satisfies the SPEC.

**Success Criteria** (what must be TRUE):
1. The extension installs into Chrome via "Load unpacked" against `dist/`
   with no errors or warnings in `chrome://extensions/`.
2. With the extension loaded and a normal browsing session under way, the
   video buffer runs continuously across tab switches and never holds more
   than 30 s of footage (confirmed by inspecting the SW console / a debug
   export).
3. On a typical page (form + table + modal) rrweb records without throwing,
   the event log captures clicks/navigation/network errors, and passwords
   are absent from both streams.
4. A click on the popup button produces a ZIP in Downloads in under 5 s; the
   ZIP opens; `video/last_30sec.webm` plays in a browser.
5. Background RAM consumption (measured via Chrome Task Manager) does not
   exceed 50 MB during a sustained recording session (CON-ram-ceiling).

**Plans**: TBD

### Phase 5: Harden + clean up _(optional)_
**Goal**: Eliminate the P1/P2 follow-ups identified in the audit so that the
codebase is not just spec-conformant but maintainable. This phase has no new
v1 requirements; it improves robustness and removes technical debt around
already-shipped behaviour.

**Depends on**: Phase 4 (do not harden until §10 is green).

**Requirements**: none (no new v1 REQs; all v1 REQs are covered by phases 1–4)

**P1/P2 items addressed** (informative list from the audit, exact scope
finalized at plan time):
- SW state persistence around the 30 s idle unload edge cases.
- `fetch` interception fix in the network-error path of REQ-user-event-log.
- `meta.json` field hardening (timestamp source, version source, totalEvents
  derivation).
- `generate-icons.js` ESM/CJS compatibility with the rest of the toolchain.
- Dead-code cleanup (the `permissions.request` dance removed in Phase 3 may
  have stranded helpers; the offscreen duality removed in Phase 1 may have
  stranded shims).
- `getDisplayMedia` cursor visibility constraint (`video: { cursor: 'always' }`):
  refines capture quality for diagnostic UX; surfaced during Phase 1 smoke
  (2026-05-15) as a user observation. Operator's screen cursor was absent
  from captured frames despite being the highest-signal cue when reproducing
  pointer-driven bugs. Constraint is opt-in per the `getDisplayMedia` spec
  and Chrome implements it via the `CursorCaptureConstraint` enum (`always`
  / `motion` / `never`).

**Success Criteria** (what must be TRUE):
1. After running the extension idle for >5 minutes, then exporting, the
   archive still contains a non-empty video buffer (proves SW state
   persistence works across one or more SW unload/reload cycles).
2. A page that issues a failing `fetch` (response code >= 400) produces a
   `network_error` entry in `events.json`; a failing `XMLHttpRequest` does
   too.
3. `npm run build` and `node generate-icons.js` both succeed under the
   project's module setting (`"type": "module"` in `package.json`) with no
   `require is not defined` or `Cannot use import statement outside a module`
   errors.
4. A repo grep for the symbols deleted in phases 1 and 3
   (`permissions.request`, the duplicate offscreen inline string) returns no
   live references.

**Plans**: TBD

## Progress

**Execution Order:**
Phases execute in numeric order: 1 → 2 → 3 → 4 → 5.

| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. Stabilize video pipeline | 11/14 | In progress (reopened 2026-05-16) | - |
| 2. Stabilize DOM + event capture privacy | 0/TBD | Not started | - |
| 3. Stabilize export pipeline | 0/TBD | Not started | - |
| 4. SPEC §10 smoke verification | 0/TBD | Not started | - |
| 5. Harden + clean up (optional) | 0/TBD | Not started | - |