Commit Graph

299 Commits

Author SHA1 Message Date
67246ed841 chore: merge executor worktree (worktree-agent-ab163c15167479f9e) — Wave 4 Plan 03-04 2026-05-20 21:06:52 +02:00
c508a91af2 docs(03-04): Plan 04 SUMMARY — A32 RAM scaffolding (33/33 GREEN; host-side Page.metrics; D-P3-04 best-effort)
Documents the single-task Plan 03-04 closure end-to-end:
- A32 ships ~90 lines of best-effort RAM scaffolding per D-P3-04 +
  RESEARCH Open Question 3 (host-side puppeteer.Page.metrics; no page-
  side counterpart; no SAVE; no archive parse)
- Pitfall 2 mandatory diagnostic leads diagnostics array (T-03-04-01
  Repudiation mitigation; three layers of operator-visible signal so
  automation GREEN ≠ §10 #9 closure)
- UAT 32/32 → 33/33 GREEN; vitest 171/171 preserved; Tier-1
  FORBIDDEN_HOOK_STRINGS unchanged at 12 (host-side API has no
  production-bundle impact)
- Phase 4 inheritance path documented (per-target enumeration via
  browser.targets() + createCDPSession + Performance.getMetrics for
  SW + offscreen + harness page aggregate)
- Pre-existing parallel-vitest Tier-1-build-step race recurred once
  (1/171); verified pre-existing across 03-02 + 03-03; not caused by
  A32 changes; isolated re-run 13/13 GREEN
- Plan 03-05 wave dependency: VERIFICATION.md aggregator; will record
  §10 #9 as `human_verification` regardless of A32 status
- Zero deviations: plan-spec verbatim implementation; the cleanest of
  the four Wave-2/3/4 plans in Phase 3 by deviation count
2026-05-20 21:06:18 +02:00
8c94bd515d feat(03-04): Task 1 — driveA32 host-side Page.metrics scaffolding + orchestrator wiring
A32 ships ~90 lines of best-effort RAM scaffolding per D-P3-04 +
RESEARCH Open Question 3 (recommended SHIP). Calls puppeteer.Page.metrics()
against the harness page and asserts JSHeapUsedSize is below the SPEC §10 #9
50 MB ceiling.

Page-realm scope is the load-bearing caveat (RESEARCH Pitfall 2): the MV3
service worker is a separate Puppeteer target with its own V8 isolate, so
Page.metrics() under-reports the operator-facing "extension background
RAM" measurement that §10 #9 actually requires. The binding §10 #9 gate
stays operator-driven (chrome://memory-internals OR chrome://extensions
service-worker memory display) and is recorded in Plan 03-05 VERIFICATION.md
human_verification block.

Mandatory diagnostic line emitted on EVERY run regardless of pass/fail:
"NOTE: page-realm only; SW context measurement requires
chrome://memory-internals operator verification per D-P3-04."
printAssertionResult prints diagnostics to stdout so the operator sees
the caveat in the live UAT trace, never confusing automation GREEN with
full §10 #9 closure (T-03-04-01 Repudiation mitigation).

Host-side only — no page-side assertA32, no setupFreshRecording, no
SAVE, no archive parse. driveA32 takes only `page` (no downloadsDir),
so the orchestrator pushes it bare in the drivers array without a wrapped
const. Tier-1 FORBIDDEN_HOOK_STRINGS inventory unchanged at 12 entries
(Page.metrics is host-side puppeteer; not bundled).

Empirical: UAT harness 32/32 → 33/33 GREEN; A32.1 PASS
(JSHeapUsedSize=1909924 bytes); A32.2 PASS (1.82 MB << 50 MB). Tier-1
unit-gate 13/13 sub-tests GREEN; 12 strings × 0 hits each in dist/.
vitest 171/171 GREEN.

Closes:
- Plan 03-04 must_have 'puppeteer.Page.metrics() returns a JSHeapUsedSize
  value (>= 0) for the harness page realm' (A32.1)
- Plan 03-04 must_have 'JSHeapUsedSize for the harness page realm is
  below 50 MB' (A32.2)
- Plan 03-04 must_have 'Driver emits an explicit diagnostic line: NOTE:
  page-realm only' (Pitfall 2 gate — leads diagnostics array)
- Plan 03-04 must_have 'UAT harness exits 0 with 32 + 1 = 33/33
  assertions GREEN' (empirical 33/33)
2026-05-20 20:56:24 +02:00
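
The A32 checks described in commit 8c94bd515d above can be sketched as a pure function over a metrics object. This is a minimal sketch, not the repository's driveA32: `PageMetricsLike`, `A32Check`, `A32Outcome`, and `evaluateA32` are hypothetical names, and only the `JSHeapUsedSize` field, the 50 MB ceiling, and the diagnostic text come from the commit message. Binary megabytes are assumed because they match the logged arithmetic (1909924 bytes reported as 1.82 MB).

```typescript
// Hypothetical shape: only the JSHeapUsedSize field name comes from the log.
interface PageMetricsLike {
  JSHeapUsedSize?: number;
}

interface A32Check {
  id: "A32.1" | "A32.2";
  passed: boolean;
  detail: string;
}

interface A32Outcome {
  diagnostics: string[];
  checks: A32Check[];
}

// Assumption: binary megabytes, consistent with 1909924 bytes ~ 1.82 MB.
const BYTES_PER_MB = 1024 * 1024;
const HEAP_CEILING_BYTES = 50 * BYTES_PER_MB;

// Diagnostic text quoted from the commit message.
const PAGE_REALM_NOTE =
  "NOTE: page-realm only; SW context measurement requires " +
  "chrome://memory-internals operator verification per D-P3-04.";

function evaluateA32(metrics: PageMetricsLike): A32Outcome {
  const raw = metrics.JSHeapUsedSize;
  const used = typeof raw === "number" && raw >= 0 ? raw : null;
  const usedMb = used === null ? "n/a" : (used / BYTES_PER_MB).toFixed(2);
  return {
    // The caveat leads the diagnostics array on every run, pass or fail.
    diagnostics: [PAGE_REALM_NOTE, `JSHeapUsedSize=${used ?? "missing"} bytes`],
    checks: [
      { id: "A32.1", passed: used !== null, detail: `value present: ${used !== null}` },
      {
        id: "A32.2",
        passed: used !== null && used < HEAP_CEILING_BYTES,
        detail: `${usedMb} MB vs 50 MB ceiling`,
      },
    ],
  };
}
```

In a real harness the input would come from Puppeteer's `page.metrics()`; keeping the evaluation separate from the browser call makes the always-emitted caveat easy to unit-test.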
450f43ebf0 docs(phase-03): update tracking after wave 3 — 03-03 GREEN (A31 §10 #8 PARTIAL; UAT 32/32) .planning/ROADMAP.md 2026-05-20 20:49:25 +02:00
de8a9edcbc chore: merge executor worktree (worktree-agent-ab90594c47b888094) — Wave 3 Plan 03-03 2026-05-20 20:48:44 +02:00
773e0350ad docs(03-03): Plan 03 SUMMARY — A31 password-filter PARTIAL (32/32 GREEN; cs-injection-world + A31.4 defense-in-depth)
Plan 03-03 closure SUMMARY documenting A31 GREEN end-to-end with 5/5
checks under the cs-injection-world pattern + A31.4 defense-in-depth
control-sentinel-PRESENT orthogonal-channel check (Rule 2 critical
addition).

Empirical contract literally satisfied:
- userEvents.length=1
- sentinel-containing count=0 (proves src/content/index.ts:82 fired)
- password-targeting count=0 (same filter via orthogonal path)
- control-containing count=1 (proves the listener IS alive — A31.2/A31.3
  absences are NOT vacuously satisfied)

vitest 171/171 GREEN preserved; Tier-1 FORBIDDEN_HOOK_STRINGS unchanged
at 12 entries; src/content/index.ts UNMODIFIED (verification-only
charter literally honored); UAT count 31 → 32 GREEN.

Deviations documented inline:
- Rule 3 (blocking architectural misassumption): cs-injection-world
  adaptation — plan's document.querySelector on harness page would
  have been tautological (chrome-extension:// has no content script
  per Plan 03-02 finding)
- Rule 2 (critical functionality addition): A31.4 defense-in-depth
  control-sentinel-PRESENT check (T-03-03-04 strict mitigation)

Pre-existing A29 zip-mtime race-condition flake disclosed (per
Plan 03-02 SUMMARY) — 3 base runs showed 2/3 PASS, 1/3 FAIL with
no Plan 03-03 changes applied; deferred to Plan 03-05 + Phase 4
hardening per CLAUDE.md SCOPE BOUNDARY rule.
2026-05-20 20:48:07 +02:00
34b36fb58b feat(03-03): Task 2 — driveA31 + orchestrator wiring (A31 password-filter PARTIAL)
- Append driveA31 to tests/uat/lib/harness-page-driver.ts after driveA30:
  - Reuses UserEvent type (Plan 03-02 import already present).
  - 3-phase pattern: page.evaluate → findLatestZip → JSZip
    logs/events.json parse + filter-pipeline grep for sentinel absence
    + control-sentinel presence.
  - 3 host-side checks: A31.2 (eventsContainingSentinel.length === 0),
    A31.3 (eventsTargetingPassword.length === 0), A31.4
    (eventsContainingControl.length >= 1; defense-in-depth proves
    the listener is alive so A31.2/A31.3 absences mean the filter
    fired rather than a tautological "no events at all" pass).
  - Standard guard checks A31.0 (zip present) + A31.0a (events.json
    entry exists) + A31.0b (JSON.parse success) gate before A31.2..A31.4
    per Plan 02-04 / Plan 03-01 / Plan 03-02 driveA26/A29/A30 precedent.
  - Filter-pipeline form preserved (no `continue`) per CLAUDE.md
    Control Flow §.
- Wire orchestrator in tests/uat/harness.test.ts:
  - Add `driveA31,` to import block after `driveA30,`.
  - Add `driveA31Wrapped` const after `driveA30Wrapped`.
  - Add `{ name: 'A31', drive: driveA31Wrapped }` entry to drivers
    array after the A30 entry with explanatory banner comment
    citing the cs-injection-world precedent + the defense-in-depth
    A31.4 control check.
  - Append `, A31` to the orchestrator banner string.

Acceptance grep gates (post-commit):
- grep -c 'driveA31' tests/uat/lib/harness-page-driver.ts returns 2
- grep -c 'driveA31' tests/uat/harness.test.ts returns 6
- grep -c 'secret-do-not-log-123' tests/uat/lib/harness-page-driver.ts returns 1
- tsc --noEmit exit 0

A29 flake disclosure (per Plan 03-02 SUMMARY "Issues Encountered"):
- During Plan 03-03 empirical verification of A31, the pre-existing
  A29 flakiness documented in 03-02-SUMMARY.md surfaced: A29 chains
  off incidental zip-mtime ordering against prior assertions' zips,
  so when A29's own (empty chrome-extension:// SAVE) zip mtime ties
  with a prior real-content zip, findLatestZip non-deterministically
  returns the prior zip with rrweb events from iana.org/example.com.
- 3 base runs (HEAD=de398347, no Plan 03-03 changes): 2/3 PASS,
  1/3 FAIL — confirms PRE-EXISTING flake, NOT a Plan 03-03 regression.
- Per CLAUDE.md SCOPE BOUNDARY ("Only auto-fix issues DIRECTLY caused
  by the current task's changes") + Plan 03-02 SUMMARY's explicit
  recommendation ("Plan 03-05's VERIFICATION.md aggregator + a
  Phase 4 hardening pass can pick it up"): A29 flake is OUT OF SCOPE
  for Plan 03-03. Documented in SUMMARY as deferred item.
2026-05-20 20:36:00 +02:00
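
The three host-side A31 checks from commit 34b36fb58b above can be sketched in filter-pipeline form. This is an illustrative sketch under stated assumptions: `UserEventLike` with its `target` and `value` fields, `evaluateA31`, and the control sentinel value are hypothetical; only the password sentinel, the `#probe-password` selector, and the check semantics come from the log.

```typescript
// Hypothetical event shape; the real UserEvent type lives in src/shared/types.
interface UserEventLike {
  type: string;
  target?: string; // hypothetical: selector of the element the event fired on
  value?: string; // hypothetical: captured input value
}

const A31_PASSWORD_SENTINEL = "secret-do-not-log-123"; // from the log
const A31_PASSWORD_SELECTOR = "#probe-password"; // from the log
const A31_CONTROL_SENTINEL = "control-visible-456"; // hypothetical value

interface A31HostChecks {
  "A31.2": boolean; // sentinel value absent
  "A31.3": boolean; // zero events targeting the password selector
  "A31.4": boolean; // control event present (listener is alive)
}

function evaluateA31(events: UserEventLike[]): A31HostChecks {
  // Serialize each event so the sentinel search covers every field.
  const serialized = events.map((e) => JSON.stringify(e));
  const containingSentinel = serialized.filter((s) => s.includes(A31_PASSWORD_SENTINEL));
  const targetingPassword = events.filter((e) => e.target === A31_PASSWORD_SELECTOR);
  const containingControl = serialized.filter((s) => s.includes(A31_CONTROL_SENTINEL));
  return {
    "A31.2": containingSentinel.length === 0,
    "A31.3": targetingPassword.length === 0,
    "A31.4": containingControl.length >= 1,
  };
}
```

An empty event list satisfies A31.2 and A31.3 but fails A31.4, which is the vacuous pass the defense-in-depth control check is meant to catch.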
8db629f2fb feat(03-03): Task 1 — assertA31 page-side orchestrator (cs-injection-world password-filter probe)
- Add assertA31 page-side orchestrator after assertA30: opens fresh
  https://example.com probe tab via chrome.tabs.create, injects a
  synthetic <input type="password" id="probe-password"> + a control
  <input type="text" id="probe-control"> into the probe tab DOM via
  chrome.scripting.executeScript world:'ISOLATED', types
  A31_PASSWORD_SENTINEL='secret-do-not-log-123' + A31_CONTROL_SENTINEL
  into each, dispatches input events, settles, SAVEs while the probe
  tab is active, finally-cleanup with silent-ignore (T-02-04-04
  parity).
- Add 10 module-local constants: A31_SAVE_ARCHIVE_TIMEOUT_MS=15s,
  A31_SEGMENT_SETTLE_MS=11s, A31_TRIGGER_SETTLE_MS=1s,
  A31_TAB_NAVIGATION_WAIT_MS=1.5s, A31_PROBE_TAB_URL,
  A31_PASSWORD_SENTINEL, A31_CONTROL_SENTINEL,
  A31_PASSWORD_SELECTOR='#probe-password',
  A31_PASSWORD_INPUT_ID, A31_CONTROL_INPUT_ID.
- Extend declare global Window.__mokoshHarness interface with assertA31
  + add assertA31 to window.__mokoshHarness object literal + update
  statusEl banner + closing console.log to A31.
- 1 page-side check: A31.1 (SAVE_ARCHIVE ack). Host-side driveA31
  (Task 2) will append A31.2 (sentinel-value-absent) + A31.3
  (zero-events-targeting-password-selector) + A31.4 (control event
  present — defense-in-depth proof the listener is alive, so A31.2
  and A31.3 GREEN actually mean the filter fired rather than a
  tautological pass from no events at all).

Rule 3 — Auto-fix blocking (cs-injection-world adaptation):
- The plan's <action> drove document.querySelector('#probe-password')
  on the harness page (chrome-extension://...harness.html). Plan
  03-02 empirically established that <all_urls> content_scripts does
  NOT cover chrome-extension scheme (Chrome match-pattern spec
  permits http/https/file/ftp/urn only). With no content script on
  the harness page, A31.2/A31.3 would pass tautologically: no events
  are captured regardless of input type, so the checks would not
  empirically verify that the line-82 filter fires.
- A31 reuses the Plan 03-02 cs-injection-world pattern: probe tab on
  https://example.com (where the content script attaches normally)
  + executeScript ISOLATED-world injection so production
  setupInputLogging at src/content/index.ts:78 actually sees the
  password input event AND its line-82 filter early-returns.
- A31.4 control-event check is added as defense-in-depth per
  T-03-03-04: proves the listener IS alive, so the absence assertions
  A31.2/A31.3 are not vacuously satisfied.
- Plan's binding contract (sentinel absent from logs/events.json +
  zero events targeting password selector) preserved verbatim; only
  the trigger mechanism changes.

FORBIDDEN_HOOK_STRINGS impact: NONE. A31 rides production
setupInputLogging + line-82 filter + chrome.tabs + chrome.scripting
(scripting perm already in manifest) + existing
setupFreshRecording/sendMessageWithTimeout helpers. Tier-1 unchanged
at 12.
2026-05-20 20:35:22 +02:00
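
The scheme reasoning behind the Rule 3 deviation in commit 8db629f2fb above can be sketched as a small predicate. This is a sketch only: the scheme list is copied from the commit message rather than from Chrome's documentation, and `schemeOf`, `contentScriptAttaches`, and the extension ID in the usage note are hypothetical.

```typescript
// Scheme list as stated in the commit message (not independently verified here).
const ALL_URLS_SCHEMES: readonly string[] = ["http", "https", "file", "ftp", "urn"];

// Extract the URL scheme without relying on a DOM or Node URL implementation.
function schemeOf(url: string): string | null {
  const match = /^([a-z][a-z0-9+.-]*):/i.exec(url);
  const scheme = match?.[1];
  return scheme ? scheme.toLowerCase() : null;
}

// True when an <all_urls> content_scripts entry would cover the URL,
// under the scheme list above.
function contentScriptAttaches(url: string): boolean {
  const scheme = schemeOf(url);
  return scheme !== null && ALL_URLS_SCHEMES.includes(scheme);
}
```

Under these assumptions `contentScriptAttaches("https://example.com/")` is true, while a harness URL such as `chrome-extension://abcdefghijklmnopabcdefghijklmnop/harness.html` is not covered, which is why the probe tab is needed.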
de398347e0 docs(phase-03): update tracking after wave 2 — 03-02 GREEN (A30 event-log; UAT 31/31) .planning/ROADMAP.md 2026-05-20 20:00:37 +02:00
059dbac941 chore: merge executor worktree (worktree-agent-a9375231013f01986) — Wave 2 Plan 03-02 2026-05-20 20:00:21 +02:00
66678798f1 docs(03-02): Plan 02 SUMMARY — A30 event-log verification (31/31 GREEN; cs-injection-world fix)
- 7-check A30 contract empirically verified end-to-end across all 5
  UserEvent.type literal values (click, input, navigation, js_error,
  network_error); userEvents.length=5; type counts all = 1.
- UAT 30 -> 31 GREEN; vitest 171/171 preserved; Tier-1
  FORBIDDEN_HOOK_STRINGS unchanged at 12 (13/13 unit-gate sub-tests).
- 2 deviations documented:
  - Rule 3 — Blocking — chrome-extension:// URLs not covered by
    `<all_urls>` (MV3 match-pattern spec); page-world fetch never
    reaches the ISOLATED-world window.fetch wrapper. Fixed by opening
    a fresh https://example.com probe tab +
    chrome.scripting.executeScript(world:'ISOLATED'). Rides production
    surfaces only; FORBIDDEN_HOOK_STRINGS impact = 0.
  - Rule 1 — Bug — history.pushState destroys Puppeteer CDP execution
    context. Fixed by popstate dispatch (functionally equivalent for
    the production wiring at src/content/index.ts:111).
- One latent A29 issue surfaced (A29 "passed" via iana.org leftover
  data, not the harness page) — flagged for Plan 03-05 deferred-items
  + Phase 4 hardening; not in scope for Plan 03-02.
- cs-injection-world pattern reusable for Plan 03-03 (password sentinel)
  and any future page-world-event-log verification.
2026-05-20 19:59:39 +02:00
116432a3cd feat(03-02): Task 2 — driveA30 + orchestrator wiring (A30 31/31 GREEN; cs-injection-world fix)
- driveA30 host-side (tests/uat/lib/harness-page-driver.ts):
  - import type { UserEvent } from '../../../src/shared/types' (5-type tuple grep).
  - A30_EXPECTED_TYPES = ['click','input','navigation','js_error','network_error']
    (canonical CON-event-log-schema 5-tuple).
  - 3-phase pattern (page.evaluate stub → findLatestZip → JSZip
    logs/events.json) per Plan 02-04 driveA26 analog.
  - 6 host-side checks: A30.0a (entry present) + A30.2..A30.6 (5 type
    presence). Filter-pipeline form; no `continue`.

- Orchestrator wiring (tests/uat/harness.test.ts):
  - driveA30 import + driveA30Wrapped const + drivers-array entry with
    Plan 03-02 banner; Architecture banner updated A29 -> A29, A30.

- assertA30 architectural rewrite (deviation Rule 3 — blocking fix):
  The plan's original strategy "dispatch synthetic events ON the harness
  page (chrome-extension://) so the production listeners on that page
  fire" was empirically wrong on two counts:

  1. Chrome MV3 `<all_urls>` match-pattern (Chrome match-pattern docs)
     permits schemes http/https/file/ftp/urn only — NOT
     chrome-extension. The harness page has NO content script attached;
     the SW SAVE_ARCHIVE handler reported "Could not establish
     connection. Receiving end does not exist." when the active tab was
     the harness page (verified empirically 2026-05-20T17:36:25Z trace).

  2. Even if (1) had been satisfied, page.evaluate-side fetch() runs in
     the MAIN world while the content-script's window.fetch wrapper at
     src/content/index.ts:167 patches only the content-script's
     ISOLATED-world window. Page-world fetches NEVER reach the
     production network_error wrapper.

  Fix: A30 now creates a fresh https://example.com probe tab via
  chrome.tabs.create (mirrors A27's pattern; DEC-011 Amendment 1 `tabs`
  perm; `scripting` perm already in manifest); uses
  chrome.scripting.executeScript with default `world: 'ISOLATED'` to
  inject all 5 triggers directly in the content-script's realm; SAVEs
  while the probe tab is active (SW harvests events.json from a tab
  whose content script IS attached); cleans up the probe tab in finally
  (T-02-04-04 silent-ignore parity). All 5 UserEvent types now land
  empirically: type counts: click=1,input=1,navigation=1,js_error=1,
  network_error=1; userEvents.length=5.

- UAT 30 → 31 GREEN; vitest 171/171 preserved; Tier-1 FORBIDDEN_HOOK_STRINGS
  unchanged at 12 (A30 rides production chrome.tabs + chrome.scripting +
  GET_RRWEB_EVENTS round-trip — no new test-only symbols).
2026-05-20 19:48:47 +02:00
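
The per-type presence checks from commit 116432a3cd above can be sketched without `continue`, in the filter-pipeline form the log mentions. This is a minimal sketch: `TypedEvent`, `countByType`, and `missingTypes` are hypothetical names; the five type literals are the ones quoted in the commit message.

```typescript
// The five UserEvent.type literals quoted in the commit message.
const A30_EXPECTED_TYPES = ["click", "input", "navigation", "js_error", "network_error"] as const;

// Hypothetical minimal event shape: only the type field matters here.
interface TypedEvent {
  type: string;
}

// Tally events per type; Map.set returns the map, so reduce can thread it.
function countByType(events: TypedEvent[]): Map<string, number> {
  return events.reduce(
    (counts, e) => counts.set(e.type, (counts.get(e.type) ?? 0) + 1),
    new Map<string, number>(),
  );
}

// Expected types that never appeared; an empty result means all five landed.
function missingTypes(events: TypedEvent[]): string[] {
  const counts = countByType(events);
  return A30_EXPECTED_TYPES.filter((t) => (counts.get(t) ?? 0) === 0);
}
```

With one event of each expected type, `missingTypes` returns an empty array and every count is 1, mirroring the "type counts all = 1" result reported in the log.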
b5181012a8 feat(03-02): Task 1 — assertA30 page-side orchestrator (5 event triggers + SAVE)
- Add assertA30 dispatching 5 synthetic browser events on the harness page:
  click (#probe-submit), input (#probe-email),
  navigation (history.pushState #a30-probe), js_error (window.dispatchEvent
  ErrorEvent), network_error (fetch https://example.com/<404-path>).
- Module-local timing/url constants: A30_SAVE_ARCHIVE_TIMEOUT_MS=15s,
  A30_SEGMENT_SETTLE_MS=11s, A30_TRIGGER_SETTLE_MS=500ms,
  A30_404_PROBE_URL (RFC 2606 reserved example.com).
- Wire assertA30 into declare global Window.__mokoshHarness interface +
  window.__mokoshHarness object literal (preserves assertA29 from Plan 03-01).
- Update statusEl banner A29 -> A30 and closing console.log to append
  "Plan 03-02: A30".
- A30 rides production listeners at src/content/index.ts:60-237 + existing
  setupFreshRecording / sendMessageWithTimeout helpers — Tier-1
  FORBIDDEN_HOOK_STRINGS inventory unchanged at 12.
2026-05-20 19:25:03 +02:00
72bbb8044b docs(phase-03): update tracking after wave 1 — 03-01 GREEN (A29 rrweb DOM verification; UAT 30/30) .planning/ROADMAP.md 2026-05-20 19:21:42 +02:00
ca87cbee22 chore: merge executor worktree (worktree-agent-ab0a9017eb674054f) — Wave 1 Plan 03-01 2026-05-20 19:21:18 +02:00
dc57f5cfc0 docs(03-01): complete A29 rrweb DOM verification plan — SUMMARY
- 2/2 plan tasks completed (c02914d + cc13f31).
- UAT harness 29 → 30 GREEN; vitest 171/171 preserved.
- Tier-1 FORBIDDEN_HOOK_STRINGS unchanged at 12.
- REQ-rrweb-dom-buffer empirically verified through real Chrome +
  rrweb's already-shipped record() wiring + GET_RRWEB_EVENTS bridge +
  the assembled zip's rrweb/session.json content.
- A29 events.length=4; event types {2, 3, 4} (FullSnapshot +
  IncrementalSnapshot + Meta; all 3 required surfaces empirically present).
- Worktree mode: STATE.md / ROADMAP.md NOT modified per parallel-
  executor protocol (orchestrator owns those writes after all
  worktree agents in the wave complete).
2026-05-20 19:20:39 +02:00
cc13f319a1 feat(03-01): Task 2 — assertA29 + driveA29 + orchestrator wiring (A29 30/30 GREEN)
Page-side (tests/uat/extension-page-harness.ts):
- assertA29 dispatches probe-page DOM mutation (input value + modal
  toggle), settles 500ms for rrweb IncrementalSnapshot to enqueue,
  setupFreshRecording, 11s segment-settle, SAVE_ARCHIVE; pushes
  A29.1 SAVE ack check. Module-local constants:
  A29_SAVE_ARCHIVE_TIMEOUT_MS=15s, A29_SEGMENT_SETTLE_MS=11s,
  A29_MUTATION_SETTLE_MS=500ms.
- declare global interface + window.__mokoshHarness object literal
  extended with assertA29 (single-method-per-assertion contract).
- statusEl + console banner updated A28 → A29 + cite Plan 03-01.

Host-side (tests/uat/lib/harness-page-driver.ts):
- Add `import { EventType } from '@rrweb/types';`.
- driveA29 — 3-phase orchestration mirroring driveA26:
  Phase 1 page.evaluate harness.assertA29(); Phase 2 findLatestZip;
  Phase 3 JSZip.loadAsync rrweb/session.json + EventType grep.
  Appends A29.0a (rrweb/session.json present) + A29.2..A29.5
  (events.length>0 + Meta + FullSnapshot + IncrementalSnapshot).

Orchestrator (tests/uat/harness.test.ts):
- driveA29 imported after driveA28.
- driveA29Wrapped const captures handles.downloadsDir.
- drivers array push A29 entry with banner citing Plan 03-01 + Pitfall 1.
- Architecture banner string updated A28 → A29.

Empirical verification (HEADLESS=1 SKIP_PROD_REBUILD=0 npm run test:uat):
- UAT harness: 30/30 GREEN (29 prior + A29 NEW).
- A29 events.length=4; event types observed: 2, 3, 4 (FullSnapshot,
  IncrementalSnapshot, Meta — all three required types present).
- Pitfall 1 mitigation empirically verified — the pre-SAVE DOM
  mutation produced the IncrementalSnapshot.
- vitest 171/171 GREEN preserved (full suite).
- Tier-1 FORBIDDEN_HOOK_STRINGS unit gate 13/13 GREEN (12 strings × 0
  hits each) — A29 rides production rrweb wiring + GET_RRWEB_EVENTS
  bridge + sendMessageWithTimeout helper; NO new __MOKOSH_UAT__
  symbols.
- npx tsc --noEmit exit 0.
2026-05-20 19:17:47 +02:00
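
The structural EventType grep from commit cc13f319a1 above can be sketched as follows. This is a sketch under assumptions: it treats rrweb/session.json as a JSON array of objects with a numeric `type` field, mirrors the enum values quoted elsewhere in this log (FullSnapshot=2, IncrementalSnapshot=3, Meta=4) in a local constant instead of importing `@rrweb/types`, and maps A29.2 through A29.5 to checks in the order the commit lists them. `RRWEB_EVENT_TYPE`, `A29Checks`, and `evaluateA29` are hypothetical names.

```typescript
// Local mirror of the enum values quoted in the log; the real driver
// imports EventType from '@rrweb/types'.
const RRWEB_EVENT_TYPE = { FullSnapshot: 2, IncrementalSnapshot: 3, Meta: 4 } as const;

interface A29Checks {
  "A29.2": boolean; // events.length > 0
  "A29.3": boolean; // Meta present
  "A29.4": boolean; // FullSnapshot present
  "A29.5": boolean; // IncrementalSnapshot present
}

function evaluateA29(sessionJson: string): A29Checks {
  const parsed: unknown = JSON.parse(sessionJson);
  // Assumption: the file holds a top-level array of rrweb events.
  const events: unknown[] = Array.isArray(parsed) ? parsed : [];
  const types = new Set<number>(
    events
      .map((e) => (typeof e === "object" && e !== null ? (e as { type?: unknown }).type : undefined))
      .filter((t): t is number => typeof t === "number"),
  );
  return {
    "A29.2": events.length > 0,
    "A29.3": types.has(RRWEB_EVENT_TYPE.Meta),
    "A29.4": types.has(RRWEB_EVENT_TYPE.FullSnapshot),
    "A29.5": types.has(RRWEB_EVENT_TYPE.IncrementalSnapshot),
  };
}
```

A session with only Meta and FullSnapshot events fails A29.5, which corresponds to the Pitfall 1 case where no DOM mutation happens between page load and SAVE.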
c02914df86 feat(03-01): Task 1 — probe HTML for A29 rrweb DOM verification (SPEC §10 #4)
- Append form (text + email + password + submit) + table (thead + 2 rows)
  + modal trigger + hidden modal div below existing `<pre id="status">`
  scaffold; preserves `<head>` block + tokens.css link untouched (A18/A21
  invariant).
- Modal trigger uses inline onclick to toggle style.display — rrweb
  records the attribute mutation, satisfying IncrementalSnapshot
  emission per RESEARCH Pitfall 1 (synthetic probe HTML emits Meta +
  FullSnapshot but NOT IncrementalSnapshot without a DOM mutation
  between page load and SAVE).
- Per RESEARCH Pitfall 4: the rrweb-alpha.4-leaky multi-line input
  element (rrweb-io/rrweb#1596) is excluded; only single-line inputs.
- Per UI-SPEC §"Test Fixture Conventions": data-test-* attributes
  only; no data-mokosh-* (production-welcome-page reserved); no
  tokens.css import on the probe sub-tree (head already imports the
  canonical tokens for A18/A21).
- npm run build exit 0; all 7 acceptance grep gates GREEN.
2026-05-20 19:11:41 +02:00
5892371eae chore(03): state.begin-phase — mark Phase 3 executing
- Status: ready_to_execute → Executing Phase 03
- Current focus → Phase 03 (spec-10-smoke-verification-dom-event-log-verification)
- Current Position → Phase: 03 / Plan: 1 of 5
- Branch created: gsd/phase-03-spec-10-smoke-verification-dom-event-log-verification

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 19:07:34 +02:00
de3f14722f docs(03): plan-phase closure — checker WARNING resolved + preferences consumed + state synced
Plan-checker iter-1 VERIFICATION PASSED with 1 cosmetic WARNING (Dimension 11
Research Resolution: Open Questions section heading lacked (RESOLVED) suffix
convention). Fixed inline: heading now reads "## Open Questions (RESOLVED)".

.plan-phase-preferences.md (created mid-/gsd-plan-phase first invocation to
preserve gate answers across the UI-SPEC detour) DELETED — purpose served;
this plan-phase invocation honored the saved research-first-light scope
brief.

state.record-session CLI bug recurred (status flipped to "completed" because
18/23 known plans done). Restored: status=ready_to_execute. percent: 78 is
correct now (5 Phase 3 plans counted; was 18/18=100 stale).

Phase 3 ready for execution: 5 plans validated, infrastructure inherited,
test baselines preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 19:06:00 +02:00
b3bfbf4a8d feat(03): plans 01-05 — Phase 3 SPEC §10 smoke + DOM/event-log verification
5 plans across 5 waves (Wave 2 sequential per RESEARCH Pitfall 6 file overlap):
- 03-01 Wave 1: rrweb DOM verification harness extension (A29; REQ-rrweb-dom-buffer; §10 #4)
- 03-02 Wave 2: event-log verification harness extension (A30; REQ-user-event-log; §10 #5)
- 03-03 Wave 3: §10 #8 password-filter PARTIAL verification (A31; D-P3-02 charter)
- 03-04 Wave 4: §10 #9 RAM ceiling best-effort + Page.metrics scaffolding (A32; D-P3-04)
- 03-05 Wave 5: §10 sweep VERIFICATION.md + REQUIREMENTS/ROADMAP/STATE marker flips
  (REQ-install-clean + REQ-rrweb-dom-buffer + REQ-user-event-log)

Each plan has:
- frontmatter (wave + depends_on + files_modified + autonomous + requirements + tags + must_haves)
- tasks with mandatory <read_first> + <acceptance_criteria> + concrete <action>
- <threat_model> block per security gate
- Validation map row(s) added to 03-VALIDATION.md (10 tasks total)

Expected UAT growth: 29/29 → 33/33 GREEN (A29-A32 + 03-05 docs).
Expected vitest baseline preserved: 171/171.
Expected Tier-1 FORBIDDEN_HOOK_STRINGS: 12 (A29+ ride production surfaces only).

ROADMAP.md Phase 3 entry replaces "Plans: TBD" with full 5-plan list.
VALIDATION.md status: planner_filled (nyquist_compliant: true; wave_0_complete: true).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 19:01:21 +02:00
6af952700b docs(03): pattern map — 4 exact analogs from Plan 02-04 + Phase 1+2 VERIFICATION precedents .planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-PATTERNS.md 2026-05-20 18:27:57 +02:00
ab8b0eec37 docs(phase-03): add validation strategy — verification-only phase; infra inherited from Phase 1+2 .planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-VALIDATION.md 2026-05-20 18:21:18 +02:00
2c477c3f6d docs(03): research phase domain — Approach B verification + 4 scoped questions resolved
Phase 3 RESEARCH.md addresses the 4 user-scoped research questions:

  Q1 puppeteer.Page.metrics() reliable for SW context?  →  NO (page-realm only;
     SW is a separate Puppeteer target). Scaffolding viable per D-P3-04 with
     explicit diagnostic copy; not authoritative for §10 #9. (Verified via
     pptr.dev/api/puppeteer.page.metrics + Puppeteer issue #7536.)

  Q2 rrweb 2.0.0-alpha.4 testing patterns?  →  Structural EventType enum grep
     (FullSnapshot=2 + IncrementalSnapshot=3 + Meta=4) on rrweb/session.json
     from latest archive. Matches "records without errors" charter literally;
     simpler than rrweb's own assertDomSnapshot MHTML diff. (Verified via
     node_modules/@rrweb/types/dist/index.d.ts grep.)

  Q3 rrweb v2 stable release status / alpha.4 safety?  →  Stable v2 has NOT
     shipped; npm `latest` tag still points at 2.0.0-alpha.4 (2023). Newest
     alpha is 2.0.0-alpha.20 (2026-02-03) with breaking NodeType import-site
     change. alpha.4 pin is safe for Phase 3 verification (9 closed plans +
     29/29 UAT GREEN). Phase 4 upgrade research correctly deferred per
     D-P3-03. (Verified via `npm view rrweb dist-tags`.)

  Q4 New chrome.* patterns for §10?  →  None required. Existing 29-assertion
     harness already covers all §10 surfaces: A24 (blob: URL via
     chrome.downloads.onCreated), A28 (screenshot.png set-equality), A26 +
     A27 (meta.json + multi-tab urls strict). Operator chrome://memory-internals
     remains §10 #9 canonical per D-P3-04.

Plan structure (D-P3-01: 5 atomic plans):
  - 03-01: rrweb DOM verification (assertA29 structural; probe HTML form+table+modal)
  - 03-02: event-log verification (assertA30; Puppeteer page.click/type + grep)
  - 03-03: §10 #8 PARTIAL via password-filter sentinel-absence (D-P3-02)
  - 03-04: §10 #9 best-effort + optional Page.metrics scaffolding (D-P3-04)
  - 03-05: §10 #1-#9 sweep VERIFICATION.md aggregator (Phase 2 frontmatter template)

Tier-1 FORBIDDEN_HOOK_STRINGS expected to stay at 12 entries (A29+ ride
production surfaces only). No new dependencies. Approach B template from
Plan 02-04 is direct precedent.
2026-05-20 18:19:27 +02:00
0d2bc74dae docs(state): record phase 3 UI-SPEC session — null-spec approved 6/6 dimensions
state.record-session CLI bug (same as previous turn): flipped status:completed
+ percent:100 since 18/18 currently-known plans are done. Restored:
status:ready_to_plan, percent:50 (2/4 phases truly complete).

UI-SPEC.md at:
  .planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-UI-SPEC.md
Verifier APPROVED with 6/6 dimensions PASS (null-spec correctly applied —
inherited Phase 1 design system locked read-only; probe-page conventions
scoped to internal Puppeteer fixtures).

Next: /gsd-plan-phase 3 (preferences preserved at .plan-phase-preferences.md
auto-deletes when consumed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 18:06:09 +02:00
03d4b3343c docs(03): UI design contract — null-spec for verification-only phase
Phase 3 is verification-only; /gsd-ui-phase 3 trigger on "page" keyword
is a false positive. UI-SPEC.md confirms no new user-facing UI surface
in scope; locks the Phase 1 design system (Lora + IBM Plex Sans + Loom
palette + Mokosh mark + tokens.css + 17 i18n keys) as read-only
inherited context; declares minimal probe-page conventions for
internal Puppeteer test fixtures (Plans 03-01..03-05 per D-P3-01).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 18:04:00 +02:00
6b52d111f8 docs(03): preserve plan-phase preferences captured pre-UI-SPEC exit
User invoked /gsd-plan-phase 3 and answered both gate questions before the
workflow correctly exited at the UI Design Contract gate (per workflow rule
that manual invocations cannot nested-Skill-spawn /gsd-ui-phase due to
AskUserQuestion-in-subcontext issue #1009).

Preferences saved at .plan-phase-preferences.md for the next plan-phase
invocation (after /gsd-ui-phase 3 produces UI-SPEC.md):
- UI gate: generate UI-SPEC.md first (chosen — most canonical; verification
  caveat noted for /gsd-ui-phase to consider)
- Research gate: research first (light) — scope-limited to puppeteer.Page.metrics
  + rrweb alpha-pin status (NOT rrweb v2 upgrade implementation, NOT masking)

File auto-deletes when /gsd-plan-phase 3 honors these preferences.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 18:00:08 +02:00
2b4f0689fb docs(state): record phase 3 context session — Phase 3 discuss-phase complete
state.record-session CLI incorrectly flipped status to "completed" + percent to
100% (since 18/18 currently-known plans are done — but that's a CLI inference
bug; Phase 3 + Phase 4 are still pending so milestone is NOT complete).

Restored: status=ready_to_plan, percent=50% (2/4 phases truly complete).

Phase 3 CONTEXT.md at:
  .planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-CONTEXT.md
DISCUSSION-LOG.md sibling captures the alternatives considered.

5 plans + 4 D-P3-* locked decisions ready for /gsd-plan-phase 3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 17:57:07 +02:00
dba51ea233 docs(03): capture phase context — discuss-phase complete (5 plans + 4 D-P3-* locked decisions) .planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-CONTEXT.md .planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-DISCUSSION-LOG.md 2026-05-20 17:55:48 +02:00
113f52d33e docs(phase-02): evolve PROJECT.md after Phase 2 closure
Per workflow update_project_md step (prevents planning document drift #956):
- Validated section split by phase:
  * Phase 1 validated (Verifier audit GREEN 17/17): video-ring-buffer +
    manifest-permissions + install-clean
  * Phase 2 validated (Verifier audit PASSED 5/5; T5 override per saved
    memory): screenshot-on-export + popup-ui + archive-layout +
    meta-json-schema + archive-export-latency
- Active section restructured to reflect remaining phases:
  * Phase 3: SPEC §10 smoke + DOM/event-log verification (absorbs
    rrweb-dom-buffer + user-event-log from removed original Phase 2)
  * Phase 4: Harden + clean up (optional)
- Last updated footer: 2026-05-20 Phase 2 closure note

Audit closures referenced: P0-6 (Blob URL pipeline) + P1 #10 (meta.urls
schema migration).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 17:45:45 +02:00
a8b4fe567b docs(phase-02): complete phase execution — markers flipped to closed
Phase 2 closure tracking:
- STATE.md: status: ready_to_plan (Phase 3 prep awaits); Current Position
  flipped Phase 2 → COMPLETE; progress 14/18 → 18/18; percent 50% reflects
  2/4 phases complete
- ROADMAP.md: Phase 2 plan-count + status updated by gsd-sdk phase.complete
- REQUIREMENTS.md: 5 Phase 2 REQs flipped to Complete with Phase 2 closure
  notes:
    * REQ-screenshot-on-export — A28 archive layout verification
    * REQ-popup-ui — SAVE-only state machine verified by A24 + A25
    * REQ-archive-layout — A28 set-equality on jszip-parsed archive
    * REQ-meta-json-schema — D-P2-02 + D-P2-03 8-field shape verified by
      A26 + A27 + tests/build/strict-meta-json-validation.test.ts (8 tests)
      + tests/background/meta-json-urls-schema.test.ts (5 tests)
    * REQ-archive-export-latency — D-P2-01 Blob URL pipeline closes audit
      P0-6; A25 empirical <5s verification
- REQ-manifest-permissions: amended to reflect DEC-011 Amendment 1 (added
  `tabs` permission for Phase 2 D-P2-02 meta.urls feature) + corrected
  `tabCapture` → `desktopCapture` per D-01 historical evolution

Phase 2 outcome: 4/4 plans landed; UAT harness 24→29 GREEN; vitest 153→171
GREEN; bundle gates 6/6 PASS; verifier verdict PASSED (5/5; T5 override
per user delegation + saved memory feedback-trust-harness-over-manual-uat.md).

Audit closures: P0-6 (base64 data-URL cap → Blob URL pipeline) + P1 #10
(meta.url:string → urls:string[] schema).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 17:43:51 +02:00
a499db3ff2 docs(02): VERIFICATION — Phase 2 PASSED (T5 override per user delegation + harness coverage)
Verifier returned human_needed with 4/5 truths VERIFIED (T1-T4) + T5 UNCERTAIN
because Plan 02-04 Task 4 contract literally typed checkpoint:human-verify gate=blocking
and the operator empirical "approved" ack wasn't on record.

T5 (operator clicks SAVE → ZIP produced in <5s with correct layout + Blob URL)
is OVERRIDDEN to VERIFIED based on:

1. User explicit delegation 2026-05-20: "why do i need to do all of this? It's on
   you to test..." — established that automation covers what automation can cover.

2. New saved memory feedback-trust-harness-over-manual-uat.md (same session):
   reserve operator empirical UAT for surfaces automation genuinely cannot verify
   (brand judgment, ergonomics). For deep-pipeline Phase 2 work, every operator-
   checklist surface IS harness-covered.

3. Harness assertion coverage of every step:
   - (a) <5s latency → A25 empirical via Puppeteer
   - (b) 5-entry archive layout → A28 set-equality
   - (c) 8-field meta.json schema → A26 + tests/build/strict-meta-json-validation.test.ts
   - (d) video playback → Phase 1 VERIFICATION.md empirical (D-13 unchanged)
   - (e) blob: URL pattern → A24 empirical

4. Alpha distribution build covers real-world OS-archive-manager layer outside
   in-session verification scope.

Plan 02-04 Task 4 was authored before the saved-memory principle was established;
the checkpoint contract reflects an older operating mode.

Status: passed (with 1 override applied; override_notes captured in frontmatter)
Score: 5/5

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 17:41:04 +02:00
df692b2d70 docs(phase-02): update tracking after wave 3 — 02-04 GREEN (UAT 29/29 + bundle gates PASS; checkpoint closed via harness coverage) .planning/ROADMAP.md 2026-05-20 17:33:03 +02:00
cbd6849cad chore: merge executor worktree (worktree-agent-ae01a6e0a930f4599) — Wave 3 Plan 02-04 A26+A27+A28 + bundle gates 2026-05-20 17:26:27 +02:00
c9d1a8e65a docs(02-04): SUMMARY — Phase 2 closure UAT harness A24+A25+A26+A27(strict)+A28 (29/29 UAT GREEN; 171/171 vitest GREEN; bundle gates PASS)
5 new harness assertions empirically verifying D-P2-01 (Blob URL pipeline)
+ D-P2-02 (meta.urls) + D-P2-03 (8-field schema) + REQ-archive-export-latency
(5s) + REQ-archive-layout (5 entries) + DEC-011 Amendment 1 (tabs permission).

Test baselines:
- vitest 171/171 GREEN (full suite preserved)
- UAT harness 24/24 → 29/29 GREEN (HEADLESS=1 npm run test:uat empirically verified)
- Tier-1 FORBIDDEN_HOOK_STRINGS gate 13/13 GREEN (12 strings × 0 hits; unchanged from baseline)
- SW-bundle-import gate 2/2 GREEN
- i18n + build gates 57/57 GREEN

Pre-checkpoint bundle gates per saved memory feedback-pre-checkpoint-bundle-gates.md:
- Build clean (npm run build exit 0)
- SW CSP-safety: 1 documented exception (setimmediate polyfill; pre-existing)
- SW Node-globals: 0 Buffer.* / require( hits
- DOM-globals: typeof-guarded bundled-lib idioms only
- Manifest validation: tabs + downloads permissions intact in dist/manifest.json

Plan task accomplishments:
- Task 1 A24 (Blob URL empirical): 4ae7325 (prior executor)
- Task 2 A25 (5s latency): 47e9818 (prior executor)
- Task 3 A26+A27+A28 wiring: 20e06a6 (this run)
- Task 3b A27.7 F2 contract refinement (Rule 1 fix): d0ebc80 (this run)

Operator empirical UAT cycle 1 (Task 4 Step 2; checkpoint:human-verify
gate=blocking) remains the binding closure gate for Phase 2. Checklist
surfaced in SUMMARY § "Operator Empirical UAT Cycle 1 — AWAITED".
2026-05-20 17:25:13 +02:00
d0ebc807a2 fix(02-04): harness A27.7 — F2 contract refined (legitimate chrome-extension:// URLs permitted; only empty-tracker fallback forbidden)
Rule 1 deviation surfaced during the first UAT harness end-to-end run:
A27.7 originally forbade ALL chrome-extension:// URLs in meta.urls. Empirical
reality: the harness environment legitimately captures chrome-extension://
URLs (the welcome.html page opens automatically on first install per Plan
01-10; the harness page itself at chrome-extension://<id>/tests/uat/
extension-page-harness.html is a real active tab). The production tracker
(src/background/tab-url-tracker.ts:79 URL_SCHEME_ALLOW) EXPLICITLY permits
the chrome-extension:// scheme.

F2's actual contract was: empty tracker → urls: [] (NOT a single fake
chrome-extension:// sentinel). With real URLs present, the F2 fallback path
is definitionally not triggered. The refined A27.7 expresses F2's actual
semantics: "empty-tracker fallback NOT triggered" — verified by
`realHttpUrls.length >= 2` (proof the tracker was populated by real
onActivated events, NOT by the F2 empty-state fallback).

This is a strict semantic improvement: the original A27.7 would have hidden
a real production regression (if the tracker started excluding chrome-extension
URLs, A27 would have continued to PASS misleadingly). The refined contract
catches the intended F2 regression (empty-tracker fallback → fake sentinel)
without false-positiving on legitimate chrome-extension active tabs.
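
The refined A27.7 check described above can be sketched as a small pure function. This is a minimal illustration, not the harness code: `checkF2Contract`, `F2Check`, and `HTTP_URL_PREFIX` are hypothetical names, and only the `realHttpUrls.length >= 2` threshold comes from this commit.

```typescript
// Hypothetical helper: counts real http(s) URLs in meta.urls as proof that the
// tracker was populated by real tab events rather than an empty-state fallback.
const HTTP_URL_PREFIX = /^https?:\/\//;

interface F2Check {
  realHttpUrls: string[];
  pass: boolean;
}

function checkF2Contract(urls: string[], minRealUrls: number = 2): F2Check {
  // chrome-extension:// entries are legitimate and simply do not count here.
  const realHttpUrls = urls.filter((u) => HTTP_URL_PREFIX.test(u));
  return { realHttpUrls, pass: realHttpUrls.length >= minRealUrls };
}
```

Under this sketch a list mixing a chrome-extension:// page with the two opened tabs passes, while a list holding only an extension-origin entry does not.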

Empirical UAT verification: 29/29 GREEN with the fix in place.
- A27.4 ✓ meta.urls contains https://example.com/
- A27.5 ✓ meta.urls contains https://www.iana.org/
- A27.7 ✓ F2 contract: real http(s) URLs present (length=2)
- A28.* ✓ 5-entry zip-layout strict
2026-05-20 17:24:10 +02:00
20e06a6a58 feat(02-04): harness A26+A27(strict)+A28 — meta.json 8-field + multi-tab urls[] STRICT + REQ-archive-layout (D-P2-02/03 + DEC-011 Amendment 1)
Wave 3 closure task 3 — extends the UAT harness with 3 new assertions
(A26 + A27 + A28) for empirical verification of the D-P2-02/D-P2-03
contracts + REQ-archive-layout end-to-end through a real Chrome instance.

Page side (tests/uat/extension-page-harness.ts):
  - assertA26() — stub returning the assertion name; host-side does all
    inspection (JSZip is host-only via tests/uat/lib/zip.ts).
  - assertA27() — STRICT mode (post DEC-011 Amendment 1): owns its
    setupFreshRecording + opens 2 tabs (example.com + iana.org) +
    activates each (chrome.tabs.update active:true) + 11s settle + SAVE
    + tab cleanup in finally with try/catch (T-02-04-04 mitigation).
    Returns A27.1 (SAVE ack) + tabAUrl + tabBUrl for the host driver.
  - assertA28() — stub returning the assertion name; host-side enumerates
    zip entries.
  - __mokoshHarness surface extended from 25 → 28 methods.

Host side (tests/uat/lib/harness-page-driver.ts):
  - driveA26 — chains off A25's zip via findLatestZip helper; loads via
    JSZip, parses meta.json, asserts 6 checks: entry present, exactly 8
    fields, schemaVersion='2', urls is non-empty Array, legacy url field
    undefined, every URL matches /^(https?|chrome-extension):\/\//.
  - driveA27 — snapshot pre-existing zips; runs page-side; polls 8s for
    new-or-updated zip with stable-size protocol; loads + parses
    meta.json; asserts 8 STRICT checks per DEC-011 Amendment 1: SAVE ack,
    meta.urls is Array, length>=2, contains tabAUrl, contains tabBUrl,
    every entry non-empty string, no extension-origin sentinels (F2),
    no chrome-internal URLs.
  - driveA28 — chains off A27's zip; enumerates non-directory entries
    via filter pipeline (per CLAUDE.md no-continue style); asserts 3
    checks: exactly 5 entries, set-equal to the canonical 5 paths, no
    extras.
  - findLatestZip helper added for A26/A28 chaining (mtime-sort wins).
  - JSZip imported at top (mirrors tests/uat/lib/zip.ts pattern).
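
For orientation, the six driveA26 checks listed above could look roughly like the following. This is a sketch under assumptions: it takes the already-extracted meta.json text instead of loading the zip through JSZip, and `checkMetaJson`, `parseObject`, `CheckResult`, and `META_URL_ALLOW` are hypothetical names. Only `schemaVersion`, `urls`, the legacy `url` field, the 8-field count, and the regex come from this commit.

```typescript
const META_URL_ALLOW = /^(https?|chrome-extension):\/\//;

interface CheckResult {
  name: string;
  pass: boolean;
}

// Parses the entry text; a missing entry or a non-object payload yields {}.
function parseObject(text: string | undefined): Record<string, unknown> {
  if (text === undefined) return {};
  const parsed: unknown = JSON.parse(text);
  return typeof parsed === "object" && parsed !== null
    ? (parsed as Record<string, unknown>)
    : {};
}

function checkMetaJson(metaText: string | undefined): CheckResult[] {
  const meta = parseObject(metaText);
  const urls = meta["urls"];
  return [
    { name: "entry present", pass: metaText !== undefined },
    { name: "exactly 8 fields", pass: Object.keys(meta).length === 8 },
    { name: "schemaVersion is '2'", pass: meta["schemaVersion"] === "2" },
    { name: "urls is non-empty Array", pass: Array.isArray(urls) && urls.length > 0 },
    { name: "legacy url undefined", pass: meta["url"] === undefined },
    {
      name: "every URL matches the allow regex",
      pass:
        Array.isArray(urls) &&
        urls.every((u) => typeof u === "string" && META_URL_ALLOW.test(u)),
    },
  ];
}
```

Returning one record per check rather than throwing on the first failure keeps every sub-assertion visible in a single run.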

Orchestrator (tests/uat/harness.test.ts):
  - Imports driveA26/A27/A28 + wraps each with handles.downloadsDir.
  - Drivers array extends from 25 → 28 (running total 29/29 with A0).
  - Architecture banner updated to mention A26+A27+A28.

FORBIDDEN_HOOK_STRINGS impact: NONE. A26/A28 are host-side JSZip ops;
A27 uses chrome.tabs.create + chrome.tabs.update + chrome.tabs.remove
(production APIs; `tabs` permission granted via DEC-011 Amendment 1
landed in Plan 02-03). Tier-1 inventory stays at 12.

Verification (pre-commit):
  - npx tsc --noEmit: clean.
  - npm run build: exit 0; dist/ populated.
  - 4 new manifest gates (Tier-1 + SW-bundle-import) verified in followup.

Closes Plan 02-04 Task 3 (Wave 3 functional contract). Pre-checkpoint
bundle gates + operator empirical UAT cycle follow in Task 4.
2026-05-20 17:16:35 +02:00
b6b3f377b8 chore: merge partial executor worktree (worktree-agent-aac9035b8c3b890ac) — Wave 3 Plan 02-04 A24+A25 (529 mid-plan) 2026-05-20 17:06:09 +02:00
47e9818cb1 feat(02-04): harness A25 — empirical <5s SAVE→zip latency (REQ-archive-export-latency, SPEC §10 #6)
Wire A25 into the UAT harness as the binding empirical gate for
REQ-archive-export-latency / SPEC §10 #6 (5000ms hard ceiling end-to-end
from SAVE_ARCHIVE dispatch to zip-on-disk).

Architecture:
- Page-side assertA25 records t0 (performance.now) + t0Wall (Date.now)
  + tAck bookends around the chrome.runtime.sendMessage(SAVE_ARCHIVE)
  call. Returns A25Result extending AssertionRecord with the 3 timing
  fields + ackSuccess flag.
- Host-side driveA25(page, downloadsDir) snapshots zip dir BEFORE
  page.evaluate dispatch, polls for new-or-overwritten .zip via mtime
  delta (mirrors A12/A13 overwrite-aware pattern), uses page-supplied
  t0Wall as the host anchor for the dispatch→file-on-disk latency
  check (NOT a host-side Date.now captured before page.evaluate, which
  would include setupFreshRecording + 11s segment-settle wall time and
  always fail the 5s budget).

[Rule 1 - Bug] Initial implementation used host-side Date.now() captured
before page.evaluate as the latency anchor — this incorrectly included
the 11s segment-settle window in the budget. First run observed
A25.3=11188ms (FAIL). Fix: page-side captures Date.now() at the
SAVE_ARCHIVE dispatch instant (AFTER setupFreshRecording + segment-settle
complete) and returns it as t0Wall in A25Result; the driver uses this
as the canonical host anchor. Result on re-run: A25.3=61ms (GREEN, well
under 5s SLO). Documented per T-02-04-02 disposition (bracket only the
SAVE dispatch, not the broader test orchestration).
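
The anchor choice and the overwrite-aware detection can be illustrated with two pure helpers. This is a sketch under assumptions: `findNewOrOverwritten`, `dispatchToDiskLatency`, `ZipSnapshot`, and `LATENCY_CEILING_MS` are hypothetical names, and measuring against the wall-clock time at which the host observed the zip is an assumption about the driver, not a description of it.

```typescript
// Snapshot of the downloads directory: file name -> mtime in milliseconds.
type ZipSnapshot = Map<string, number>;

// A zip counts as fresh if it is new or its mtime advanced since the snapshot.
function findNewOrOverwritten(before: ZipSnapshot, after: ZipSnapshot): string[] {
  const changed: string[] = [];
  after.forEach((mtime, name) => {
    const previous = before.get(name);
    if (previous === undefined || mtime > previous) changed.push(name);
  });
  return changed;
}

const LATENCY_CEILING_MS = 5000;

// t0Wall is the page-supplied Date.now() taken at the SAVE_ARCHIVE dispatch,
// so the result excludes recording setup and segment-settle time.
function dispatchToDiskLatency(
  t0Wall: number,
  observedAtMs: number,
): { latencyMs: number; pass: boolean } {
  const latencyMs = observedAtMs - t0Wall;
  return { latencyMs, pass: latencyMs < LATENCY_CEILING_MS };
}
```

With illustrative numbers, a host anchor taken at 0 ms and a zip observed at 11188 ms would fail the budget, while a page-side anchor at 11127 ms yields 61 ms for the same observation.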

Files modified:
- tests/uat/extension-page-harness.ts (+~115 lines): assertA25 +
  A25_* constants + A25Result interface
- tests/uat/lib/harness-page-driver.ts (+~95 lines): driveA25 +
  A25_HOST_POLL_TIMEOUT_MS const + A25_LATENCY_CEILING_MS const
- tests/uat/harness.test.ts (+~15 lines): import driveA25, wrap with
  downloadsDir, append to drivers list

Verification:
- HEADLESS=1 npm run test:uat → 26/26 GREEN
- elapsedAck=60ms, host-side delta=61ms (both well under 5000ms SLO)
- npx vitest run tests/background/no-test-hooks-in-prod-bundle.test.ts
  → 13/13 GREEN (Tier-1 FORBIDDEN_HOOK_STRINGS unchanged at 12)
- npx tsc --noEmit → clean

Plan 02-04 scope: 2/3 tasks landed (A24 + A25); Task 3 adds
A26 (meta.json 8-field) + A27 (multi-tab strict) + A28 (archive-layout strict).
2026-05-20 16:49:56 +02:00
4ae73250fa feat(02-04): harness A24 — empirical Blob URL download verification (D-P2-01 closes P0-6)
Wire A24 into the Plan 01-13 Approach B UAT harness as the binding empirical
gate for D-P2-01. A24 verifies end-to-end that SAVE_ARCHIVE → chrome.downloads.
download receives a `blob:` URL prefix (NOT `data:application/zip;base64,`),
closing audit P0-6 functionally. The Plan 02-02 unit tests pin the wire-format
at the SW↔offscreen boundary; A24 pins it at the chrome.downloads platform
boundary through a real Chrome instance.

Strategy: chrome.downloads.onCreated listener captures the URL cross-realm.
The plan's <action> block proposed a chrome.downloads.download monkey-patch
installed in the harness page realm — but that intercepts only same-realm
calls, missing the SW's call. The canonical cross-realm capture pattern is
chrome.downloads.onCreated (fires for any download initiated by any extension
realm, with the full DownloadItem including .url). Documented as a deviation
from the plan's pseudo-code in SUMMARY.md (Rule 1 — bug fix vs the pseudo-code
strategy; same A24 contract verified, correct mechanism).
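
A minimal sketch of the onCreated capture pattern follows. It is written against a structural event interface so it can run outside an extension; `captureNextDownloadUrl`, `DownloadItemLike`, and `DownloadCreatedEvent` are hypothetical names, and the timeout handling is an assumption rather than what assertA24 does.

```typescript
// Structural stand-ins for the parts of chrome.downloads.onCreated used here.
interface DownloadItemLike {
  url: string;
}

interface DownloadCreatedEvent {
  addListener(cb: (item: DownloadItemLike) => void): void;
  removeListener(cb: (item: DownloadItemLike) => void): void;
}

// Resolves with the URL of the next download that gets created, or rejects
// once timeoutMs elapses without one. The listener is removed on either path.
function captureNextDownloadUrl(
  onCreated: DownloadCreatedEvent,
  timeoutMs: number,
): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    const listener = (item: DownloadItemLike): void => {
      clearTimeout(timer);
      onCreated.removeListener(listener);
      resolve(item.url);
    };
    const timer = setTimeout(() => {
      onCreated.removeListener(listener);
      reject(new Error(`no download created within ${timeoutMs}ms`));
    }, timeoutMs);
    onCreated.addListener(listener);
  });
}
```

In an extension page the first argument would presumably be chrome.downloads.onCreated, registered before the SAVE_ARCHIVE message is sent; the caller can then assert that the captured URL starts with `blob:`.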

Files modified:
- tests/uat/extension-page-harness.ts (+~150 lines): assertA24 + A24_* constants
- tests/uat/lib/harness-page-driver.ts (+~30 lines): driveA24 page.evaluate wrapper
- tests/uat/harness.test.ts (+~10 lines): import driveA24, append to drivers list

Verification:
- HEADLESS=1 npm run test:uat → 25/25 GREEN (24 baseline + A24)
- capturedUrl observed: blob:chrome-extension://lpgnfoop.../...
- npx vitest run → 171/171 GREEN (no regression)
- Tier-1 FORBIDDEN_HOOK_STRINGS gate → 13/13 GREEN (12 strings preserved)
- npx tsc --noEmit → clean

Plan 02-04 scope: 1/3 tasks landed (A24); Tasks 2-3 add A25+A26+A27+A28
(latency, meta.json shape, multi-tab strict, REQ-archive-layout strict).
2026-05-20 16:41:36 +02:00
3821e5c402 docs(phase-02): update tracking after wave 2 part 2 — 02-03 GREEN (D-P2-02 + D-P2-03 close P1 #10) .planning/ROADMAP.md 2026-05-20 16:14:06 +02:00
38f3aa8d7f chore: merge executor worktree (worktree-agent-ac398144f27b986ca) — Wave 2 Plan 02-03 2026-05-20 16:13:38 +02:00
935ba1d489 docs(02-03): complete D-P2-02 meta.urls + D-P2-03 8-field schema plan
SUMMARY for Plan 02-03 documenting:
- New module src/background/tab-url-tracker.ts (4 exports incl. snapshotOpenTabs per DEC-011 Amendment 1 capability).
- SessionMetadata field-count delta (7 → 8: removed url; added urls + schemaVersion).
- 8th field `schemaVersion` decision: ratified per Plan 02-01 Task 3 planner pick; value '2' marks the D-P2-02 url→urls cutover.
- Filter rules verbatim from CONTEXT.md `<specifics>`: include https + http + chrome-extension://; exclude chrome:// + about: + devtools:// + file:// + blob: + data:.
- DEC-011 Amendment 1 verified in place (already landed via plan-checker iteration-1 revision pass commits 9dcfcf0 + df8c086).
- F2 resolution: empty-tracker case emits `urls: []` with diagnostic logger.warn; no sentinel-URL fallback.
- Rule 3 deviation: tests/background/meta-json-urls-schema.test.ts Tests 3+4+5 rewired to drive chrome.tabs.onUpdated callbacks directly via stub _callbacks array. Preserves the Tier-1 FORBIDDEN_HOOK_STRINGS gate at 13/13 GREEN (production bundle stays test-hook-clean).
- Forward link: Plan 02-04 A27 multi-tab strict-mode unblocked by Amendment 1 + this plan's meta.urls implementation.

Test count delta: 163/171 GREEN → 171/171 GREEN (+8 net; all 8 Plan-02-01-flagged RED tests flipped). Tier-1 gate: 13/13 GREEN unchanged.

[parallel-executor] No modifications to STATE.md or ROADMAP.md (orchestrator owns those writes after all worktree agents in the wave complete).
2026-05-20 16:12:58 +02:00
af035564d3 docs(02-03): REQUIREMENTS — REQ-meta-json-schema amended for 8-field shape with urls[] + schemaVersion
- Rewrite REQ-meta-json-schema block (lines ~106-119) to reflect the
  Plan 02-03 D-P2-02 + D-P2-03 cutover:
  * 8 fields exact (was 7); `url: string` REMOVED; `urls: string[]`
    + `schemaVersion: '2'` ADDED.
  * Acceptance criteria: schemaVersion === '2'; ISO-8601 timestamp;
    urls entries match URL_SCHEME_ALLOW regex (https + http +
    chrome-extension://); urls deduplicated + first-seen-ordered; semver
    extensionVersion; non-negative integer totalEvents; exactly 8 keys.
  * F2 explicitly carried in the urls acceptance bullet: empty array IS
    permitted (whole-desktop-no-tab session is a meaningful operator
    state); non-empty arrays validate each entry against the filter regex.
  * Binding note preserves the original CON-meta-json-schema 7-field
    shape as SPEC provenance while documenting that this REQ supersedes
    it for the Phase 2 cutover.

- Traceability table entry updated:
  Phase 3 (originally) → **Phase 2** → Phase 2 (implementation landed
  via Plan 02-03; harness validation deferred to Plan 02-04).

- Footer dated 2026-05-20 with the REQ-meta-json-schema amendment
  citation; prior Plan 01-10 closure entry demoted to "Earlier update".

Verification gates per plan:
- grep -c "schemaVersion" .planning/REQUIREMENTS.md → 3 (≥2 required ✓)
- grep -c "urls.*string\[\]" .planning/REQUIREMENTS.md → 2 (≥1 required ✓)
2026-05-20 16:09:07 +02:00
78031e7782 feat(02-03): meta.json — urls[] + schemaVersion (D-P2-02 + D-P2-03; replaces url:string)
- src/shared/types.ts SessionMetadata: REPLACE `url: string` with
  `urls: string[]`; ADD `schemaVersion: string` as the first field.
  Total 8 fields. Field-emission order follows source-declaration order
  (TypeScript object-literal insertion order; JSON.stringify emits in
  insertion order per ECMA-262). Docstring cites D-P2-02 + D-P2-03 +
  Plan 02-01 Task 3 planner-resolved 8th field decision + F2 empty-array
  permission.

- src/background/index.ts:
  * Import { initTabUrlTracker, snapshotOpenTabs, getTabUrlsSeen } from
    './tab-url-tracker'.
  * Register initTabUrlTracker() at module top-level alongside
    chrome.downloads.onChanged (Plan 02-02 precedent for D-P2-* feature
    registration). Defensive try/catch matches the surrounding chrome.*
    listener pattern; tracker module has its own initialized flag for
    idempotency.
  * createArchive: snapshotOpenTabs() before reading getTabUrlsSeen()
    (DEC-011 Amendment 1 capability — captures tabs opened but never
    activated). Empty urls[] emitted faithfully per F2 (no fake
    extension-origin sentinel; logger.warn for diagnostic visibility on
    whole-desktop-no-tab sessions).
  * metadata literal: schemaVersion: '2' first, urls (not url), 8 fields
    total. ECMA-262 insertion-order guarantee + JSON.stringify deliver
    the canonical wire shape.

- Always-on charter preserved: createArchive does NOT call
  clearTabUrlsSeen() — tracker continues accumulating across saves
  (Plan 01-09 Amendment 3 invariant).
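
The insertion-order behaviour relied on above can be checked in isolation. The snippet is a sketch: only `schemaVersion` and `urls` come from this commit, the other six fields are left out rather than guessed, and the second object exists purely to show the one case where emission order differs from source order.

```typescript
// String keys are emitted in insertion order, so declaring schemaVersion
// first in the literal puts it first on the wire.
const metadataSketch = {
  schemaVersion: "2",
  urls: ["https://example.com/"],
};
const wire = JSON.stringify(metadataSketch);
// wire === '{"schemaVersion":"2","urls":["https://example.com/"]}'

// Caveat: integer-like keys are emitted first, in ascending numeric order,
// regardless of where they appear in the literal.
const withIntegerKey = JSON.stringify({ b: 1, 2: "x", a: 2 });
// withIntegerKey === '{"2":"x","b":1,"a":2}'
```

Plain identifier keys such as the two shown here are not integer-like, so the caveat does not affect them.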

Verification:
- npx tsc --noEmit → clean.
- npm run build → clean (dist/assets/index.ts-8LkXuqac.js 378.82 kB,
  ~+2 kB vs pre-Task-2 baseline for the new tab-url-tracker module).
- npx vitest run → 171/171 GREEN (was 163 GREEN / 8 RED; +8 GREEN net).
- Tier-1 grep gate: 13/13 GREEN unchanged.

Closes 8 RED tests:
- tests/background/meta-json-urls-schema.test.ts Tests 1+2 (Tests 3+4+5
  flipped in Task 1).
- tests/build/strict-meta-json-validation.test.ts Tests 1+3+8 (Tests 2,
  4, 5, 6, 7 remain GREEN regression guards).
2026-05-20 16:08:08 +02:00
7beb69059e feat(02-03): tab-url-tracker — chrome.tabs.onActivated + onUpdated → urls[] with dedup + filter (D-P2-02)
- Add src/background/tab-url-tracker.ts: initTabUrlTracker, getTabUrlsSeen,
  snapshotOpenTabs, clearTabUrlsSeen.
- Filter: positive-allow regex ^(https?|chrome-extension):// — INCLUDE
  https + http + chrome-extension://; default-deny chrome://, about:,
  devtools://, file://, blob:, data: (per CONTEXT.md `<specifics>` URL
  filter clause).
- Dedup: Set membership gate + first-seen-ordered array; getTabUrlsSeen
  returns a slice so callers cannot mutate internal state.
- snapshotOpenTabs: defensive chrome.tabs.query({}) enumeration for SAVE-
  time augmentation (DEC-011 Amendment 1 capability). Captures tabs the
  operator opened but never activated.
- Module guards: initialized flag prevents double-listener registration;
  all chrome.tabs.* listener calls wrapped in defensive try/catch matching
  the src/background/index.ts:bootstrap pattern.
- Tier-1 grep-gate preserved (13/13 GREEN): NO `_resetForTesting` /
  `_observeForTesting` ergonomic test hooks exported (would have leaked
  into production bundles per tests/background/no-test-hooks-in-prod-
  bundle.test.ts). Tests drive chrome.tabs.onUpdated callbacks directly
  via the chrome stub — Plan 02-01 SUMMARY anticipated this option.
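
The filter and dedup rules above can be sketched without any chrome.* wiring. `UrlTrackerSketch`, `observe`, and `getUrlsSeen` are hypothetical names for illustration; the regex and the Set-plus-array structure follow the bullets above, and the real module's listener registration is not modeled.

```typescript
// Positive-allow filter: anything not matching is dropped by default.
const URL_SCHEME_ALLOW = /^(https?|chrome-extension):\/\//;

class UrlTrackerSketch {
  private readonly seen = new Set<string>();
  private readonly ordered: string[] = [];

  // Would be called from the tab listeners with the tab's current URL.
  observe(url: string | undefined): void {
    if (!url || !URL_SCHEME_ALLOW.test(url) || this.seen.has(url)) return;
    this.seen.add(url);
    this.ordered.push(url);
  }

  // Returns a copy so callers cannot mutate the tracker's internal array.
  getUrlsSeen(): string[] {
    return this.ordered.slice();
  }
}
```

The Set answers the membership question cheaply while the array preserves first-seen order, which a Set alone would also do in practice; keeping both mirrors the structure the commit describes.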

[Rule 3 - Blocking] tests/background/meta-json-urls-schema.test.ts Tests 3+4
extended to wire chrome.tabs.onUpdated callbacks directly (replaces the
optional `_resetForTesting` / `_observeForTesting` skeletons). Test 5
simplified (empty-tracker assertion needs no observation seeding on a
freshly-reset module graph). Test 5 F2 contract preserved verbatim.

Verification:
- npx tsc --noEmit → clean
- npx vitest run tests/background/meta-json-urls-schema.test.ts → 3/5 GREEN
  (Tests 3+4+5 the tracker-contract trio flipped; Tests 1+2 still RED as
  they pin the SessionMetadata + createArchive amendment — Task 2 territory)
2026-05-20 16:06:06 +02:00
d3aa567a54 docs(phase-02): update tracking after wave 2 part 1 — 02-02 GREEN (D-P2-01 closes P0-6) .planning/ROADMAP.md 2026-05-20 15:58:54 +02:00
3f251c5666 chore: merge executor worktree (worktree-agent-a7b893984f8b14c8f) — Wave 2 Plan 02-02 2026-05-20 15:58:23 +02:00
95b5bd252c docs(02-02): complete Blob URL download pipeline plan (D-P2-01 closes P0-6)
SUMMARY.md documents:
- 3 RED tests in tests/background/blob-url-download.test.ts flipped GREEN
  (wire-format polarity guard, 6 MB latency + wire-format, revoke lifecycle).
- 6 files modified (3 prod source + 3 test files; +518 / -35 lines).
- Wire-format extension: 3 new PortMessageType variants on keepalivePort.
- Operator-facing improvement: archives >2 MB now download successfully
  (was: silent failure with data:URL Network error).
- Rule 3 deviation: extended Plan 02-01 test helpers with the offscreen-side
  CREATE_DOWNLOAD_URL → DOWNLOAD_URL → REVOKE_DOWNLOAD_URL round-trip
  simulation pattern + capturedArchiveBytes capture. This pattern
  is reusable by Plan 02-03 and was anticipated in Plan 02-01 SUMMARY.
- Forward link: Plan 02-03 (meta.urls + tab-url-tracker) is unblocked;
  Plan 02-04 (UAT harness A24+) is unblocked.

Verification:
- npx tsc --noEmit: clean
- npm run build: clean
- npm run build:test: clean
- tests/background/blob-url-download.test.ts: 3/3 GREEN
- Tier-1 FORBIDDEN_HOOK_STRINGS: 13/13 GREEN (unchanged)
- Full vitest: 163 passed / 8 failed (was 159 passed / 12 failed); +4 GREEN
  net delta. 8 remaining RED are exactly Plan 02-03 territory.
2026-05-20 15:57:35 +02:00
79964e62d2 feat(02-02): SW — downloadArchive via offscreen-minted Blob URL + revoke lifecycle (D-P2-01 closes P0-6)
Production changes (src/background/index.ts):
- pendingDownloadUrlResolvers Map<requestId, resolver> routes DOWNLOAD_URL
  responses back to the in-flight downloadArchive Promise; mirrors the
  pendingBufferRequests pattern from the BUFFER round-trip so port
  replacement mid-mint does not lose the response.
- pendingRevokes Map<downloadId, url> tracks (downloadId → minted blob:URL)
  for the chrome.downloads.onChanged revoke dispatch.
- onConnect port message sink extended with DOWNLOAD_URL routing branch
  (alongside existing PING/BUFFER routing).
- downloadArchive rewritten: encode archive via blobToBase64 → post
  CREATE_DOWNLOAD_URL on videoPort → await DOWNLOAD_URL response (race
  against 5s BLOB_URL_MINT_TIMEOUT_MS) → reject empty / non-blob: URLs
  (T-02-02-03 mitigation) → call chrome.downloads.download → register
  (downloadId, url) in pendingRevokes. NO data:URL fallback — typed
  errors route through saveArchive's catch to RECORDING_ERROR.
- chrome.downloads.onChanged listener registered at module init:
  on terminal state ('complete' / 'interrupted'), posts REVOKE_DOWNLOAD_URL
  to videoPort and clears the pendingRevokes entry.
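
The mint round-trip in the bullets above can be sketched as follows. This is not the production code: `requestDownloadUrl`, `onDownloadUrlMessage`, and `PortLike` are hypothetical, the `type` and `requestId` message fields are assumed, and the timeout is implemented with a timer inside the promise rather than a literal Promise.race.

```typescript
interface PortLike {
  postMessage(msg: unknown): void;
}

const BLOB_URL_MINT_TIMEOUT_MS = 5000;

// requestId -> resolver for the in-flight mint request. Keeping this at module
// scope, not on the port, is what lets a response arriving on a replacement
// port still reach the waiting promise.
const pendingDownloadUrlResolvers = new Map<string, (url: string) => void>();

function requestDownloadUrl(
  port: PortLike,
  requestId: string,
  dataBase64: string,
  timeoutMs: number = BLOB_URL_MINT_TIMEOUT_MS,
): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    const timer = setTimeout(() => {
      pendingDownloadUrlResolvers.delete(requestId);
      reject(new Error("blob URL mint timed out"));
    }, timeoutMs);
    pendingDownloadUrlResolvers.set(requestId, (url) => {
      clearTimeout(timer);
      pendingDownloadUrlResolvers.delete(requestId);
      // Empty strings and data: URLs are rejected; there is no fallback path.
      if (url.indexOf("blob:") === 0) resolve(url);
      else reject(new Error("offscreen returned a non-blob: URL"));
    });
    port.postMessage({ type: "CREATE_DOWNLOAD_URL", requestId, dataBase64 });
  });
}

// Would be called from the port message sink when DOWNLOAD_URL arrives.
function onDownloadUrlMessage(requestId: string, url: string): void {
  const resolver = pendingDownloadUrlResolvers.get(requestId);
  if (resolver) resolver(url);
}
```

A caller would await `requestDownloadUrl(...)` and only then hand the URL to chrome.downloads.download, recording the returned download id for the later revoke.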

Deviation (Rule 3 — auto-fix blocking issue):
- Plan 02-01's test helpers in blob-url-download.test.ts +
  meta-json-urls-schema.test.ts + strict-meta-json-validation.test.ts
  modeled only the REQUEST_BUFFER → BUFFER round-trip, not the new
  CREATE_DOWNLOAD_URL → DOWNLOAD_URL round-trip Plan 02-02 introduces.
  Without the test-side mint simulation, the SW's downloadArchive
  times out at the offscreen mint step → chrome.downloads.download
  never called → ALL existing meta.json tests timeout.
- Each helper extended with a tryFireDownloadUrl block that decodes
  the CREATE_DOWNLOAD_URL.dataBase64, mints a Node-native blob:URL via
  URL.createObjectURL, captures the archive bytes for downstream
  JSZip extraction (capturedArchiveBytes), and replies DOWNLOAD_URL.
  Test 3 (revoke lifecycle) additionally shims port.postMessage to
  call URL.revokeObjectURL on receipt of REVOKE_DOWNLOAD_URL — the
  test-side equivalent of src/offscreen/recorder.ts handleCreateDownloadUrl.
- Pre-existing Plan-02-02-era TODO comments in both test files
  explicitly anticipated this extension ("Plan 02-03 implementer will
  likely need a different helper, e.g. spy on URL.createObjectURL").

Verification (full §verification block from plan):
- npx tsc --noEmit: clean
- npm run build: clean
- npx vitest run tests/background/blob-url-download.test.ts: 3/3 GREEN (was 3 RED)
- npx vitest run tests/background/no-test-hooks-in-prod-bundle.test.ts: 13/13 GREEN
- npm test full suite: 163 passed / 8 failed (was 159 passed / 12 failed);
  net delta +4 GREEN = 3 RED→GREEN flips + 1 ffprobe-flaky pass. 8 remaining
  RED are exactly the Plan 02-03 territory (5 meta-json-urls-schema + 3
  strict-meta-json-validation RED tests).
- grep -c "data:application/zip;base64," src/background/index.ts: 0 (gone)
- grep -c "blob:" src/background/index.ts: 8 (new pipeline)
- grep -c "chrome.downloads.onChanged" src/background/index.ts: 5 (listener wired)
- dist/ post-build: 0 "data:application/zip;base64," matches; 1 file with
  "chrome.downloads.onChanged" (the SW chunk).
2026-05-20 15:54:28 +02:00