Phase 4 carries one genuine designer-side decision: the dark-surface logo
contrast strategy. The recommendation is Option A: a `currentColor` SVG whose
CSS color is driven by the existing `.dark, [data-theme="dark"]` block in
tokens.css (lines 234-251). Post-research amendment: welcome.ts must swap `?url`
(data URL → <img>) for `?raw` (inline <svg> via DOMParser), because
<img>-rendered SVGs do not inherit the parent's CSS color; `currentColor` only
resolves on inline DOM SVG.
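A minimal sketch of the `?raw` inline path described above, not the actual
welcome.ts code. `parseInlineSvg` and `SvgParserLike` are hypothetical names;
the parser is passed in so the sketch carries no DOM dependency, and in a
browser that argument would be `new DOMParser()`.

```typescript
// Hypothetical helper: turns raw SVG markup into an element that can be
// mounted inline, so `fill="currentColor"` resolves against the parent's
// CSS `color`. An <img> pointing at the same file would not inherit it.
interface SvgParserLike<R extends { nodeName: string }> {
  parseFromString(markup: string, type: "image/svg+xml"): { documentElement: R };
}

function parseInlineSvg<R extends { nodeName: string }>(
  rawSvg: string,
  parser: SvgParserLike<R>,
): R {
  const root = parser.parseFromString(rawSvg, "image/svg+xml").documentElement;
  // Reject anything whose root is not <svg> (for example a parse-error document).
  if (root.nodeName.toLowerCase() !== "svg") {
    throw new Error(`expected <svg> root, got <${root.nodeName}>`);
  }
  return root;
}

// Browser usage (assumed wiring, not taken from the source):
//   const logo = parseInlineSvg(logoRaw, new DOMParser());
//   host.replaceChildren(logo); // the inline <svg> now follows the host's `color`
```

With the mark inline, the existing dark-theme block only has to set `color` on
the host element; no new token or color is needed.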
The cursor visibility constraint (Plan 01-07 observation, 2026-05-15) is listed
as an inherited behavioral change, not a design surface: a 1-line change in
src/offscreen/recorder.ts per the Chrome CursorCaptureConstraint enum.
Inherits Phase 1 design system as read-only (Lora display + IBM Plex Sans UI
+ Loom palette + Mokosh mark + canonical tokens.css + 17-key i18n matrix).
Zero new tokens, zero new copy, zero new colors. PNG icons unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User invoked /gsd-plan-phase 4 and answered both gate questions before the
workflow correctly exited at the UI Design Contract gate (per the workflow rule
that manual invocations cannot spawn /gsd-ui-phase as a nested Skill, due to
AskUserQuestion-in-subcontext issue #1009).
Preferences saved at .plan-phase-preferences.md for the next plan-phase
invocation (after /gsd-ui-phase 4 produces UI-SPEC.md):
- UI gate: generate UI-SPEC.md first — unlike Phase 3 (false positive),
Phase 4 has genuine dark-logo work; UI-SPEC should be thin-but-real
(dark-logo design only; cursor visibility listed as inherited behavioral
change, not a design surface)
- Research gate: research first (light, ~10-20 min) — scope-limited to:
setimmediate polyfill replacement strategy + SW state persistence 5min
idle test patterns + chrome.scripting.executeScript world:'ISOLATED'
best practices for A29 cs-injection-world fix. Researcher NOT to
investigate already-deferred items (rrweb v2, SW-RAM, masking).
File auto-deletes when /gsd-plan-phase 4 honors these preferences.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Documents the single-task Plan 03-04 closure end-to-end:
- A32 ships ~90 lines of best-effort RAM scaffolding per D-P3-04 +
RESEARCH Open Question 3 (host-side puppeteer.Page.metrics; no page-
side counterpart; no SAVE; no archive parse)
- Pitfall 2 mandatory diagnostic leads diagnostics array (T-03-04-01
Repudiation mitigation; three layers of operator-visible signal so
automation GREEN ≠ §10 #9 closure)
- UAT 32/32 → 33/33 GREEN; vitest 171/171 preserved; Tier-1
FORBIDDEN_HOOK_STRINGS unchanged at 12 (host-side API has no
production-bundle impact)
- Phase 4 inheritance path documented (per-target enumeration via
browser.targets() + createCDPSession + Performance.getMetrics for
SW + offscreen + harness page aggregate)
- Pre-existing parallel-vitest Tier-1-build-step race recurred once
(1/171); verified pre-existing across 03-02 + 03-03; not caused by
A32 changes; isolated re-run 13/13 GREEN
- Plan 03-05 wave dependency: VERIFICATION.md aggregator; will record
§10 #9 as `human_verification` regardless of A32 status
- Zero deviations: plan-spec verbatim implementation; the cleanest of
the four Wave-2/3/4 plans in Phase 3 by deviation count
A32 ships ~90 lines of best-effort RAM scaffolding per D-P3-04 +
RESEARCH Open Question 3 (recommended SHIP). Calls puppeteer.Page.metrics()
against the harness page and asserts JSHeapUsedSize is below the SPEC §10 #9
50 MB ceiling.
Page-realm scope is the load-bearing caveat (RESEARCH Pitfall 2): the MV3
service worker is a separate Puppeteer target with its own V8 isolate, so
Page.metrics() under-reports the operator-facing "extension background
RAM" measurement that §10 #9 actually requires. The binding §10 #9 gate
stays operator-driven (chrome://memory-internals OR chrome://extensions
service-worker memory display) and is recorded in Plan 03-05 VERIFICATION.md
human_verification block.
Mandatory diagnostic line emitted on EVERY run regardless of pass/fail:
"NOTE: page-realm only; SW context measurement requires
chrome://memory-internals operator verification per D-P3-04."
printAssertionResult prints diagnostics to stdout so the operator sees
the caveat in the live UAT trace, never confusing automation GREEN with
full §10 #9 closure (T-03-04-01 Repudiation mitigation).
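A minimal sketch of the A32 evaluation described above, assuming the host side
has already obtained a metrics object from `page.metrics()`. `evaluateA32`,
`HeapMetrics` and the result shape are hypothetical names, not the real
harness-page-driver code; only the 50 MB ceiling and the NOTE text come from
the source.

```typescript
// Hypothetical slice of the Page.metrics() result: only the field A32 reads.
interface HeapMetrics {
  JSHeapUsedSize?: number;
}

interface A32Result {
  checks: { id: string; pass: boolean; detail: string }[];
  diagnostics: string[];
}

const BYTES_PER_MB = 1024 * 1024;
const HEAP_CEILING_MB = 50; // SPEC §10 #9 ceiling

const PAGE_REALM_NOTE =
  "NOTE: page-realm only; SW context measurement requires " +
  "chrome://memory-internals operator verification per D-P3-04.";

function evaluateA32(metrics: HeapMetrics): A32Result {
  const used = metrics.JSHeapUsedSize;
  const hasValue = typeof used === "number" && used >= 0;
  const usedMb = typeof used === "number" ? used / BYTES_PER_MB : Number.NaN;
  return {
    checks: [
      { id: "A32.1", pass: hasValue, detail: `JSHeapUsedSize=${used}` },
      {
        id: "A32.2",
        pass: hasValue && usedMb < HEAP_CEILING_MB,
        detail: `${usedMb.toFixed(2)} MB vs ${HEAP_CEILING_MB} MB ceiling`,
      },
    ],
    // The caveat leads the array and is emitted on every run, pass or fail.
    diagnostics: [PAGE_REALM_NOTE],
  };
}
```

In the harness this would be called as `evaluateA32(await page.metrics())`;
keeping the evaluation pure means the caveat cannot be skipped on a failing
branch.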
Host-side only — no page-side assertA32, no setupFreshRecording, no
SAVE, no archive parse. driveA32 takes only `page` (no downloadsDir),
so the orchestrator pushes it bare in the drivers array without a wrapped
const. Tier-1 FORBIDDEN_HOOK_STRINGS inventory unchanged at 12 entries
(Page.metrics is host-side puppeteer; not bundled).
Empirical: UAT harness 32/32 → 33/33 GREEN; A32.1 PASS (JSHeapUsedSize=
1909924 bytes); A32.2 PASS (1.82 MB << 50 MB). Tier-1 unit-gate 13/13
sub-tests GREEN; 12 strings × 0 hits each in dist/. vitest 171/171 GREEN.
Closes:
- Plan 03-04 must_have 'puppeteer.Page.metrics() returns a JSHeapUsedSize
value (>= 0) for the harness page realm' (A32.1)
- Plan 03-04 must_have 'JSHeapUsedSize for the harness page realm is
below 50 MB' (A32.2)
- Plan 03-04 must_have 'Driver emits an explicit diagnostic line: NOTE:
page-realm only' (Pitfall 2 gate — leads diagnostics array)
- Plan 03-04 must_have 'UAT harness exits 0 with 32 + 1 = 33/33
assertions GREEN' (empirical 33/33)
- Append driveA31 to tests/uat/lib/harness-page-driver.ts after driveA30:
- Reuses UserEvent type (Plan 03-02 import already present).
- 3-phase pattern: page.evaluate → findLatestZip → JSZip
logs/events.json parse + filter-pipeline grep for sentinel absence
+ control-sentinel presence.
- 3 host-side checks: A31.2 (eventsContainingSentinel.length === 0),
A31.3 (eventsTargetingPassword.length === 0), A31.4
(eventsContainingControl.length >= 1; defense-in-depth proves
the listener is alive so A31.2/A31.3 absences mean the filter
fired rather than a tautological "no events at all" pass).
- Standard guard checks A31.0 (zip present) + A31.0a (events.json
entry exists) + A31.0b (JSON.parse success) gate before A31.2..A31.4
per Plan 02-04 / Plan 03-01 / Plan 03-02 driveA26/A29/A30 precedent.
- Filter-pipeline form preserved (no `continue`) per CLAUDE.md
Control Flow §.
- Wire orchestrator in tests/uat/harness.test.ts:
- Add `driveA31,` to import block after `driveA30,`.
- Add `driveA31Wrapped` const after `driveA30Wrapped`.
- Add `{ name: 'A31', drive: driveA31Wrapped }` entry to drivers
array after the A30 entry with explanatory banner comment
citing the cs-injection-world precedent + the defense-in-depth
A31.4 control check.
- Append `, A31` to the orchestrator banner string.
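The three host-side A31 checks above can be sketched as a filter pipeline over
the parsed logs/events.json entries. This is an illustration under assumptions:
`evaluateA31`, `LoggedEvent` and the substring-match strategy are hypothetical,
the real UserEvent type lives in src/shared/types, and the control sentinel
value is not given in the source.

```typescript
// Hypothetical event shape; the sketch only needs something JSON-serializable.
type LoggedEvent = Record<string, unknown>;

interface A31Inputs {
  events: LoggedEvent[];
  passwordSentinel: string; // 'secret-do-not-log-123' in the source
  passwordSelector: string; // '#probe-password' in the source
  controlSentinel: string; // value not stated in the source
}

function evaluateA31(input: A31Inputs): { id: string; pass: boolean }[] {
  // Serialize once, then derive each check with a filter (no `continue`).
  const serialized = input.events.map((event) => JSON.stringify(event));
  const eventsContainingSentinel = serialized.filter((s) => s.includes(input.passwordSentinel));
  const eventsTargetingPassword = serialized.filter((s) => s.includes(input.passwordSelector));
  const eventsContainingControl = serialized.filter((s) => s.includes(input.controlSentinel));
  return [
    { id: "A31.2", pass: eventsContainingSentinel.length === 0 },
    { id: "A31.3", pass: eventsTargetingPassword.length === 0 },
    // Control presence proves the listener was alive, so the two absence
    // checks above are not satisfied merely because nothing was logged.
    { id: "A31.4", pass: eventsContainingControl.length >= 1 },
  ];
}
```

An empty events array passes A31.2 and A31.3 but fails A31.4, which is the
tautology the control check exists to catch.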
Acceptance grep gates (post-commit):
- grep -c 'driveA31' tests/uat/lib/harness-page-driver.ts returns 2
- grep -c 'driveA31' tests/uat/harness.test.ts returns 6
- grep -c 'secret-do-not-log-123' tests/uat/lib/harness-page-driver.ts returns 1
- tsc --noEmit exit 0
A29 flake disclosure (per Plan 03-02 SUMMARY "Issues Encountered"):
- During Plan 03-03 empirical verification of A31, the pre-existing
A29 flakiness documented in 03-02-SUMMARY.md surfaced: A29 chains
off incidental zip-mtime ordering against prior assertions' zips,
so when A29's own (empty chrome-extension:// SAVE) zip mtime ties
with a prior real-content zip, findLatestZip non-deterministically
returns the prior zip with rrweb events from iana.org/example.com.
- 3 base runs (HEAD=de398347, no Plan 03-03 changes): 2/3 PASS,
1/3 FAIL — confirms PRE-EXISTING flake, NOT a Plan 03-03 regression.
- Per CLAUDE.md SCOPE BOUNDARY ("Only auto-fix issues DIRECTLY caused
by the current task's changes") + Plan 03-02 SUMMARY's explicit
recommendation ("Plan 03-05's VERIFICATION.md aggregator + a
Phase 4 hardening pass can pick it up"): A29 flake is OUT OF SCOPE
for Plan 03-03. Documented in SUMMARY as deferred item.
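To illustrate the mtime tie described above: when two zips share an mtime,
"newest by mtime" has no unique answer. The sketch below shows one hypothetical
way a later hardening pass could make the choice deterministic, by only
considering zips that did not exist before the SAVE. The source defers the fix
and does not say which approach will be used; `pickZipForThisSave` and
`ZipEntry` are made-up names.

```typescript
interface ZipEntry {
  name: string;
  mtimeMs: number;
}

// `before` is a directory listing taken just before SAVE, `after` just after.
function pickZipForThisSave(before: ZipEntry[], after: ZipEntry[]): ZipEntry | undefined {
  const known = new Set(before.map((zip) => zip.name));
  const fresh = after.filter((zip) => !known.has(zip.name));
  // Newest first; the name tie-break keeps the result stable on equal mtimes.
  const ordered = fresh.sort(
    (a, b) => b.mtimeMs - a.mtimeMs || a.name.localeCompare(b.name),
  );
  return ordered[0];
}
```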
- Add assertA31 page-side orchestrator after assertA30: opens fresh
https://example.com probe tab via chrome.tabs.create, injects a
synthetic <input type="password" id="probe-password"> + a control
<input type="text" id="probe-control"> into the probe tab DOM via
chrome.scripting.executeScript world:'ISOLATED', types
A31_PASSWORD_SENTINEL='secret-do-not-log-123' + A31_CONTROL_SENTINEL
into each, dispatches input events, settles, SAVEs while the probe
tab is active, finally-cleanup with silent-ignore (T-02-04-04
parity).
- Add 10 module-local constants: A31_SAVE_ARCHIVE_TIMEOUT_MS=15s,
A31_SEGMENT_SETTLE_MS=11s, A31_TRIGGER_SETTLE_MS=1s,
A31_TAB_NAVIGATION_WAIT_MS=1.5s, A31_PROBE_TAB_URL,
A31_PASSWORD_SENTINEL, A31_CONTROL_SENTINEL,
A31_PASSWORD_SELECTOR='#probe-password',
A31_PASSWORD_INPUT_ID, A31_CONTROL_INPUT_ID.
- Extend declare global Window.__mokoshHarness interface with assertA31
+ add assertA31 to window.__mokoshHarness object literal + update
statusEl banner + closing console.log to A31.
- 1 page-side check: A31.1 (SAVE_ARCHIVE ack). Host-side driveA31
(Task 2) will append A31.2 (sentinel-value-absent) + A31.3
(zero-events-targeting-password-selector) + A31.4 (control event
present — defense-in-depth proof the listener is alive, so A31.2
and A31.3 GREEN actually mean the filter fired rather than a
tautological pass from no events at all).
Rule 3 — Auto-fix blocking (cs-injection-world adaptation):
- The plan's <action> drove document.querySelector('#probe-password')
on the harness page (chrome-extension://...harness.html). Plan
03-02 empirically established that <all_urls> content_scripts does
NOT cover chrome-extension scheme (Chrome match-pattern spec
permits http/https/file/ftp/urn only). With no content script on
the harness page, A31.2/A31.3 would pass tautologically: no events
are captured regardless of input type, so the run would not
empirically verify that the line-82 filter fires.
- A31 reuses the Plan 03-02 cs-injection-world pattern: probe tab on
https://example.com (where the content script attaches normally)
+ executeScript ISOLATED-world injection so production
setupInputLogging at src/content/index.ts:78 actually sees the
password input event AND its line-82 filter early-returns.
- A31.4 control-event check is added as defense-in-depth per
T-03-03-04: proves the listener IS alive, so the absence assertions
A31.2/A31.3 are not vacuously satisfied.
- Plan's binding contract (sentinel absent from logs/events.json +
zero events targeting password selector) preserved verbatim; only
the trigger mechanism changes.
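The probe-tab pattern above (create tab, inject triggers in the ISOLATED world,
SAVE, then clean up in finally with silent-ignore) can be sketched as follows.
`withProbeTab` and `ProbeApis` are hypothetical; the extension calls sit behind
a small interface so the sketch runs outside a browser, and the chrome.* wiring
in the comment is an assumption rather than the harness code.

```typescript
// Hypothetical seam over the three extension calls the pattern needs.
// Assumed production wiring (not taken from the source):
//   createTab:          (url) => chrome.tabs.create({ url, active: true })
//   runInIsolatedWorld: (tabId, func) =>
//     chrome.scripting
//       .executeScript({ target: { tabId }, world: "ISOLATED", func })
//       .then(() => undefined)
//   removeTab:          (tabId) => chrome.tabs.remove(tabId)
interface ProbeApis {
  createTab(url: string): Promise<{ id?: number }>;
  runInIsolatedWorld(tabId: number, func: () => void): Promise<void>;
  removeTab(tabId: number): Promise<void>;
}

async function withProbeTab<T>(
  apis: ProbeApis,
  url: string,
  triggers: () => void,
  save: () => Promise<T>,
): Promise<T> {
  const tab = await apis.createTab(url);
  if (tab.id === undefined) throw new Error("probe tab has no id");
  const tabId = tab.id;
  try {
    // Triggers run in the content script's realm, so production listeners see them.
    await apis.runInIsolatedWorld(tabId, triggers);
    // SAVE while the probe tab is still open and active.
    return await save();
  } finally {
    // Silent-ignore: a tab that is already gone must not mask the real result.
    await apis.removeTab(tabId).catch(() => undefined);
  }
}
```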
FORBIDDEN_HOOK_STRINGS impact: NONE. A31 rides production
setupInputLogging + line-82 filter + chrome.tabs + chrome.scripting
(scripting perm already in manifest) + existing
setupFreshRecording/sendMessageWithTimeout helpers. Tier-1 unchanged
at 12.
- driveA30 host-side (tests/uat/lib/harness-page-driver.ts):
- import type { UserEvent } from '../../../src/shared/types' (5-type tuple grep).
- A30_EXPECTED_TYPES = ['click','input','navigation','js_error','network_error']
(canonical CON-event-log-schema 5-tuple).
- 3-phase pattern (page.evaluate stub → findLatestZip → JSZip
logs/events.json) per Plan 02-04 driveA26 analog.
- 6 host-side checks: A30.0a (entry present) + A30.2..A30.6 (5 type
presence). Filter-pipeline form; no `continue`.
- Orchestrator wiring (tests/uat/harness.test.ts):
- driveA30 import + driveA30Wrapped const + drivers-array entry with
Plan 03-02 banner; Architecture banner updated A29 -> A29, A30.
- assertA30 architectural rewrite (deviation Rule 3 — blocking fix):
The plan's original strategy "dispatch synthetic events ON the harness
page (chrome-extension://) so the production listeners on that page
fire" was empirically wrong on two counts:
1. Chrome MV3 `<all_urls>` match-pattern (Chrome match-pattern docs)
permits schemes http/https/file/ftp/urn only — NOT
chrome-extension. The harness page has NO content script attached;
the SW SAVE_ARCHIVE handler reported "Could not establish
connection. Receiving end does not exist." when the active tab was
the harness page (verified empirically 2026-05-20T17:36:25Z trace).
2. Even if (1) had been satisfied, page.evaluate-side fetch() runs in
the MAIN world while the content-script's window.fetch wrapper at
src/content/index.ts:167 patches only the content-script's
ISOLATED-world window. Page-world fetches NEVER reach the
production network_error wrapper.
Fix: A30 now creates a fresh https://example.com probe tab via
chrome.tabs.create (mirrors A27's pattern; DEC-011 Amendment 1 `tabs`
perm; `scripting` perm already in manifest); uses
chrome.scripting.executeScript with default `world: 'ISOLATED'` to
inject all 5 triggers directly in the content-script's realm; SAVEs
while the probe tab is active (SW harvests events.json from a tab
whose content script IS attached); cleans up the probe tab in finally
(T-02-04-04 silent-ignore parity). All 5 UserEvent types now land
empirically: type counts: click=1,input=1,navigation=1,js_error=1,
network_error=1; userEvents.length=5.
- UAT 30 → 31 GREEN; vitest 171/171 preserved; Tier-1 FORBIDDEN_HOOK_STRINGS
unchanged at 12 (A30 rides production chrome.tabs + chrome.scripting +
GET_RRWEB_EVENTS round-trip — no new test-only symbols).
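A minimal sketch of the A30 type-presence checks described above, written in
filter-pipeline form. `evaluateA30`, `formatTypeCounts` and the mapping of
check ids A30.2..A30.6 onto the tuple order are assumptions; only the five type
names come from the source.

```typescript
const A30_EXPECTED_TYPES = ["click", "input", "navigation", "js_error", "network_error"] as const;

interface TypeCheck {
  id: string;
  type: string;
  count: number;
  pass: boolean;
}

// One check per expected type: at least one logged event of that type.
function evaluateA30(events: { type: string }[]): TypeCheck[] {
  return A30_EXPECTED_TYPES.map((type, index) => {
    const count = events.filter((event) => event.type === type).length;
    return { id: `A30.${index + 2}`, type, count, pass: count >= 1 };
  });
}

// Produces a "type counts" summary line for the diagnostics.
function formatTypeCounts(checks: TypeCheck[]): string {
  return checks.map((check) => `${check.type}=${check.count}`).join(",");
}
```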
- Append form (text + email + password + submit) + table (thead + 2 rows)
+ modal trigger + hidden modal div below existing `<pre id="status">`
scaffold; preserves `<head>` block + tokens.css link untouched (A18/A21
invariant).
- Modal trigger uses inline onclick to toggle style.display — rrweb
records the attribute mutation, satisfying IncrementalSnapshot
emission per RESEARCH Pitfall 1 (synthetic probe HTML emits Meta +
FullSnapshot but NOT IncrementalSnapshot without a DOM mutation
between page load and SAVE).
- Per RESEARCH Pitfall 4: the rrweb-alpha.4-leaky multi-line input
element (rrweb-io/rrweb#1596) is excluded; only single-line inputs.
- Per UI-SPEC §"Test Fixture Conventions": data-test-* attributes
only; no data-mokosh-* (production-welcome-page reserved); no
tokens.css import on the probe sub-tree (head already imports the
canonical tokens for A18/A21).
- npm run build exit 0; all 7 acceptance grep gates GREEN.
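The Pitfall 1 point above (Meta + FullSnapshot appear on load, but
IncrementalSnapshot needs a DOM mutation before SAVE) can be checked with a
small helper over the recorded events. `missingSnapshotKinds` is a hypothetical
name; the numeric codes are rrweb's EventType values.

```typescript
// rrweb EventType codes for the three kinds the probe page must produce.
const RRWEB_EVENT_TYPE = { Meta: 4, FullSnapshot: 2, IncrementalSnapshot: 3 } as const;
type SnapshotKind = keyof typeof RRWEB_EVENT_TYPE;

// Returns the kinds that never occur in the recording (empty means all present).
function missingSnapshotKinds(events: { type: number }[]): SnapshotKind[] {
  const seen = new Set<number>(events.map((event) => event.type));
  return (Object.keys(RRWEB_EVENT_TYPE) as SnapshotKind[]).filter(
    (kind) => !seen.has(RRWEB_EVENT_TYPE[kind]),
  );
}
```

A recording taken without clicking the modal trigger would be expected to
report IncrementalSnapshot as missing, which is why the fixture includes the
inline onclick toggle.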
Plan-checker iter-1 VERIFICATION PASSED with 1 cosmetic WARNING (Dimension 11
Research Resolution: Open Questions section heading lacked (RESOLVED) suffix
convention). Fixed inline: heading now reads "## Open Questions (RESOLVED)".
.plan-phase-preferences.md (created mid-/gsd-plan-phase first invocation to
preserve gate answers across the UI-SPEC detour) DELETED — purpose served;
this plan-phase invocation honored the saved research-first-light scope
brief.
state.record-session CLI bug recurred (status flipped to "completed" with
18/23 known plans done). Restored: status=ready_to_execute. percent: 78 is
now correct (18/23 with the 5 Phase 3 plans counted; the stale value was
18/18 = 100).
Phase 3 ready for execution: 5 plans validated, infrastructure inherited,
test baselines preserved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 3 is verification-only; /gsd-ui-phase 3 trigger on "page" keyword
is a false positive. UI-SPEC.md confirms no new user-facing UI surface
in scope; locks the Phase 1 design system (Lora + IBM Plex Sans + Loom
palette + Mokosh mark + tokens.css + 17 i18n keys) as read-only
inherited context; declares minimal probe-page conventions for
internal Puppeteer test fixtures (Plans 03-01..03-05 per D-P3-01).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User invoked /gsd-plan-phase 3 and answered both gate questions before the
workflow correctly exited at the UI Design Contract gate (per the workflow rule
that manual invocations cannot spawn /gsd-ui-phase as a nested Skill, due to
AskUserQuestion-in-subcontext issue #1009).
Preferences saved at .plan-phase-preferences.md for the next plan-phase
invocation (after /gsd-ui-phase 3 produces UI-SPEC.md):
- UI gate: generate UI-SPEC.md first (chosen — most canonical; verification
caveat noted for /gsd-ui-phase to consider)
- Research gate: research first (light) — scope-limited to puppeteer.Page.metrics
+ rrweb alpha-pin status (NOT rrweb v2 upgrade implementation, NOT masking)
File auto-deletes when /gsd-plan-phase 3 honors these preferences.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
state.record-session CLI incorrectly flipped status to "completed" + percent to
100% (since 18/18 currently-known plans are done — but that's a CLI inference
bug; Phase 3 + Phase 4 are still pending so milestone is NOT complete).
Restored: status=ready_to_plan, percent=50% (2/4 phases truly complete).
Phase 3 CONTEXT.md at:
.planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-CONTEXT.md
DISCUSSION-LOG.md sibling captures the alternatives considered.
5 plans + 4 D-P3-* locked decisions ready for /gsd-plan-phase 3.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Verifier returned human_needed with 4/5 truths VERIFIED (T1-T4) + T5 UNCERTAIN,
because the Plan 02-04 Task 4 contract is literally typed
checkpoint:human-verify gate=blocking and the operator's empirical "approved"
ack was not on record.
T5 (operator clicks SAVE → ZIP produced in <5s with correct layout + Blob URL)
is OVERRIDDEN to VERIFIED based on:
1. User explicit delegation 2026-05-20: "why do i need to do all of this? It's on
you to test..." — established that automation covers what automation can cover.
2. New saved memory feedback-trust-harness-over-manual-uat.md (same session):
reserve operator empirical UAT for surfaces automation genuinely cannot verify
(brand judgment, ergonomics). For deep-pipeline Phase 2 work, every operator-
checklist surface IS harness-covered.
3. Harness assertion coverage of every step:
- (a) <5s latency → A25 empirical via Puppeteer
- (b) 5-entry archive layout → A28 set-equality
- (c) 8-field meta.json schema → A26 + tests/build/strict-meta-json-validation.test.ts
- (d) video playback → Phase 1 VERIFICATION.md empirical (D-13 unchanged)
- (e) blob: URL pattern → A24 empirical
4. Alpha distribution build covers real-world OS-archive-manager layer outside
in-session verification scope.
Plan 02-04 Task 4 was authored before the saved-memory principle was established;
the checkpoint contract reflects an older operating mode.
Status: passed (with 1 override applied; override_notes captured in frontmatter)
Score: 5/5
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>