Mark b3bfbf4a8d feat(03): plans 01-05 — Phase 3 SPEC §10 smoke + DOM/event-log verification
5 plans across 5 waves (Wave 2 sequential per RESEARCH Pitfall 6 file overlap):
- 03-01 Wave 1: rrweb DOM verification harness extension (A29; REQ-rrweb-dom-buffer; §10 #4)
- 03-02 Wave 2: event-log verification harness extension (A30; REQ-user-event-log; §10 #5)
- 03-03 Wave 3: §10 #8 password-filter PARTIAL verification (A31; D-P3-02 charter)
- 03-04 Wave 4: §10 #9 RAM ceiling best-effort + Page.metrics scaffolding (A32; D-P3-04)
- 03-05 Wave 5: §10 sweep VERIFICATION.md + REQUIREMENTS/ROADMAP/STATE marker flips
  (REQ-install-clean + REQ-rrweb-dom-buffer + REQ-user-event-log)

Each plan has:
- frontmatter (wave + depends_on + files_modified + autonomous + requirements + tags + must_haves)
- tasks with mandatory <read_first> + <acceptance_criteria> + concrete <action>
- <threat_model> block per security gate
- Validation map row(s) added to 03-VALIDATION.md (10 tasks total)

Expected UAT growth: 29/29 → 33/33 GREEN (A29-A32 + 03-05 docs).
Expected vitest baseline preserved: 171/171.
Expected Tier-1 FORBIDDEN_HOOK_STRINGS: 12 (A29+ ride production surfaces only).

ROADMAP.md Phase 3 entry replaces "Plans: TBD" with full 5-plan list.
VALIDATION.md status: planner_filled (nyquist_compliant: true; wave_0_complete: true).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 19:01:21 +02:00


---
phase: "03"
slug: spec-10-smoke-verification-dom-event-log-verification
plan: "04"
type: execute
wave: 4
depends_on:
  - "01"
  - "02"
  - "03"
files_modified:
  - tests/uat/lib/harness-page-driver.ts
  - tests/uat/harness.test.ts
autonomous: true
requirements: []
tags:
  - uat-harness
  - a32
  - ram-ceiling
  - spec-10-9-best-effort
  - approach-b
  - page-metrics
  - charter-d-p3-04
user_setup: []
must_haves:
  truths:
    - "puppeteer.Page.metrics() returns a JSHeapUsedSize value (>= 0) for the harness page realm"
    - "JSHeapUsedSize for the harness page realm is below 50 MB (page-realm only; SW context excluded per RESEARCH Pitfall 2)"
    - "Driver emits an explicit diagnostic line: 'NOTE: page-realm only; SW context excluded' (prevents operator misinterpretation)"
    - "UAT harness exits 0 with 32 + 1 = 33/33 assertions GREEN (A31 baseline preserved + new A32)"
  artifacts:
    - path: tests/uat/lib/harness-page-driver.ts
      provides: "driveA32 host-side Page.metrics scaffolding (best-effort; explicit page-realm-only diagnostic)"
      contains: driveA32
    - path: tests/uat/harness.test.ts
      provides: "driveA32 import + drivers-array push entry (no wrapped driver; Page.metrics needs only page, not downloadsDir)"
      contains: driveA32
  key_links:
    - from: tests/uat/harness.test.ts
      to: tests/uat/lib/harness-page-driver.ts
      via: "driveA32 import + drivers-array push"
      pattern: driveA32
    - from: "tests/uat/lib/harness-page-driver.ts driveA32"
      to: "puppeteer.Page.metrics() CDP Performance.getMetrics"
      via: "await page.metrics()"
      pattern: "page.metrics()"
---

Extend the UAT harness with A32 — best-effort scaffolding for SPEC §10 #9 (extension background RAM ≤ 50 MB). Per D-P3-04 locked decision: this is best-effort + operator-driven. The harness DOES NOT measure the MV3 service worker heap (RESEARCH Pitfall 2: Page.metrics is page-realm only). The genuine binding §10 #9 gate is the operator's `chrome://memory-internals` observation, recorded in Plan 03-05 VERIFICATION.md `human_verification` block.

A32 SHIPS the optional Page.metrics scaffolding per the RESEARCH Open Question 3 recommendation (~30 lines; cheap; informational value). The diagnostic output explicitly states the page-realm scope so the operator never confuses an automation GREEN with full §10 #9 closure.

Purpose: Provides a low-cost informational floor for page-realm heap usage and exercises the puppeteer.Page.metrics API end-to-end so Phase 4 (programmatic RAM measurement upgrade) inherits a working scaffold.

Output: A32 assertion with 2 host-side checks (Page.metrics returned JSHeapUsedSize >= 0 + JSHeapUsedSize < 50 MB) + an explicit diagnostic line about page-realm scope; UAT count 32 → 33 GREEN.

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/REQUIREMENTS.md @.planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-CONTEXT.md @.planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-RESEARCH.md @.planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-PATTERNS.md @.planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-01-PLAN.md @.planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-02-PLAN.md @.planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-03-PLAN.md

From puppeteer ^25.0.2 (Page.metrics):

interface Metrics {
  Timestamp?: number;
  Documents?: number;
  Frames?: number;
  JSEventListeners?: number;
  Nodes?: number;
  LayoutCount?: number;
  RecalcStyleCount?: number;
  LayoutDuration?: number;
  RecalcStyleDuration?: number;
  ScriptDuration?: number;
  TaskDuration?: number;
  JSHeapUsedSize?: number; // <- bytes; the field A32 reads
  JSHeapTotalSize?: number;
}

page.metrics(): Promise<Metrics>;
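
Every field on `Metrics` is optional, so a reader has to deal with `undefined` before comparing against the ceiling. The following is a minimal sketch of that arithmetic, not the harness code itself: `MetricsLike` is a hypothetical local stand-in for the puppeteer interface, and the helper names are illustrative only.

```typescript
// Hypothetical stand-in for the two puppeteer Metrics fields A32 reads.
interface MetricsLike {
  JSHeapUsedSize?: number; // bytes
  JSHeapTotalSize?: number; // bytes
}

const BYTES_PER_MB = 1024 * 1024;
const RAM_CEILING_BYTES = 50 * BYTES_PER_MB; // 52,428,800 bytes

/** Returns the used heap in bytes, or the -1 sentinel when the field is absent. */
function readUsedHeapBytes(metrics: MetricsLike): number {
  return metrics.JSHeapUsedSize ?? -1;
}

/** Formats a byte count as MB, or 'unavailable' for the -1 sentinel. */
function formatHeapMB(bytes: number): string {
  return bytes >= 0 ? `${(bytes / BYTES_PER_MB).toFixed(2)} MB` : 'unavailable';
}

/** True only when a real reading exists and sits strictly below the ceiling. */
function isBelowCeiling(bytes: number): boolean {
  return bytes >= 0 && bytes < RAM_CEILING_BYTES;
}
```

Keeping the comparison in bytes avoids rounding at the boundary: a reading of exactly 50 MB fails, and the sentinel can never pass.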

From RESEARCH.md §"Code Example A3X":

  • Page.metrics is page-realm only — JSHeapUsedSize covers V8 isolate of THIS Page, NOT the MV3 service worker (separate target).
  • 50 MB threshold per SPEC §10 #9; treat as best-effort floor for the page realm alone.
  • Diagnostic copy gate: emit 'NOTE: page-realm only; SW context measurement requires chrome://memory-internals operator verification per D-P3-04.'

From src/shared/types.ts: no UserEvent / type changes for A32.

Plan Anchors

  • Sequential wave assignment (per RESEARCH Pitfall 6 + file-overlap rule): Plan 03-04 lives in wave 4 and modifies tests/uat/lib/harness-page-driver.ts + tests/uat/harness.test.ts (the SAME files as Plans 03-01..03; depends_on enforces sequential execution).
  • NO page-side assertion needed. Page.metrics is a host-side puppeteer API. Unlike A24..A31, A32 does NOT call assertA32 inside page.evaluate — there's no need for a window.__mokoshHarness method. This is consistent with how the host-side latency portion of A25 is computed; A32 is similar but skips the page-side entirely.
  • No setupFreshRecording, no SAVE, no zip read. A32 measures the current heap state of the harness page; no archive is produced.
  • RESEARCH Pitfall 2 mitigation (HARD): the diagnostic line about page-realm scope MUST be emitted regardless of pass/fail. This prevents an operator from glancing at "A32 GREEN" and concluding §10 #9 is closed.
  • 50 MB threshold: SPEC §10 #9 + CON-ram-ceiling. Page-realm typical values: a few MB (Plan 02-04 harness measurements show ~2-8 MB page-realm heap during recording). Far below the 50 MB ceiling on any reasonable run.
  • FORBIDDEN_HOOK_STRINGS lockstep: A32 is host-side only; Page.metrics is not bundled to the page. Tier-1 inventory stays at 12 entries.
  • A6 in the RESEARCH Assumptions Log (MEDIUM risk) noted: "if Plan 03-04 scaffolding requires a new bridge op (e.g., get-page-metrics from offscreen → harness), that would add 1-2 entries." This plan AVOIDS that: Page.metrics is read from the host puppeteer object directly; no new bridge ops are added and no new MOKOSH_UAT symbols are introduced.
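
The Pitfall 2 rule in the anchors above (the page-realm note must be emitted whether A32 passes, fails, or the metrics call throws) is easier to unit-test when record building is separated from the `page.metrics()` call. The sketch below assumes hypothetical names (`HeapOutcome`, `buildA32Record`) and record shapes that only mirror the fields this plan mentions; it is not the harness implementation.

```typescript
// Hypothetical record shapes, mirroring only the fields named in this plan.
interface CheckRecord {
  name: string;
  expected: string;
  actual: string | number;
  passed: boolean;
}
interface AssertionRecord {
  name: string;
  passed: boolean;
  checks: CheckRecord[];
  diagnostics: string[];
  error?: string;
}

// Outcome of the metrics read: a byte count, or the message of the thrown error.
type HeapOutcome = { ok: true; usedBytes: number } | { ok: false; error: string };

const PAGE_REALM_NOTE =
  'NOTE: page-realm only; SW context measurement requires chrome://memory-internals operator verification per D-P3-04.';
const CEILING_BYTES = 50 * 1024 * 1024;

function buildA32Record(outcome: HeapOutcome): AssertionRecord {
  // The caveat is pushed first and unconditionally, so it leads the output.
  const diagnostics: string[] = [PAGE_REALM_NOTE];
  const usedBytes = outcome.ok ? outcome.usedBytes : -1;
  if (!outcome.ok) {
    diagnostics.push(`A32 Page.metrics threw: ${outcome.error}`);
  }
  const checks: CheckRecord[] = [
    { name: 'A32.1', expected: '>= 0', actual: usedBytes, passed: usedBytes >= 0 },
    {
      name: 'A32.2',
      expected: '< 50 MB',
      actual: usedBytes,
      passed: usedBytes >= 0 && usedBytes < CEILING_BYTES,
    },
  ];
  return {
    name: 'A32',
    passed: checks.every((c) => c.passed),
    checks,
    diagnostics,
    error: outcome.ok ? undefined : outcome.error,
  };
}
```

With this split, a failing or throwing metrics read can be simulated by passing `{ ok: false, error: '...' }`, without launching a browser.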
Task 1: Add driveA32 host-side (puppeteer.Page.metrics scaffolding) + orchestrator wiring

Files: tests/uat/lib/harness-page-driver.ts, tests/uat/harness.test.ts

Read first:
- tests/uat/lib/harness-page-driver.ts (full sense of the file; in particular how driveA1 is a 1-line page.evaluate wrapper, contrasting with A32 which is pure host-side)
- tests/uat/harness.test.ts where Plan 03-03 added driveA31 + driveA31Wrapped + drivers-array entry (study shape)
- .planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-RESEARCH.md §"Code Example A3X" (canonical scaffolding shape; verbatim copy)
- .planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-RESEARCH.md §"Pitfall 2" (diagnostic-copy gate)

Host-side (`tests/uat/lib/harness-page-driver.ts`):
- Adds `export async function driveA32(page: Page): Promise<AssertionRecord>`:
  - Calls `const metrics = await page.metrics();`
  - Computes `const jsHeapBytes = metrics.JSHeapUsedSize ?? -1;`
  - Computes `const jsHeapMB = jsHeapBytes >= 0 ? jsHeapBytes / (1024 * 1024) : -1;`
  - Pushes A32.1 (Page.metrics returned JSHeapUsedSize): expected '>= 0', actual `jsHeapBytes`, passed `jsHeapBytes >= 0`
  - Pushes A32.2 (page-realm JS heap < 50 MB): expected '< 50 MB', actual `${jsHeapMB.toFixed(2)} MB`, passed `jsHeapMB >= 0 && jsHeapMB < 50`
  - Pushes the mandatory diagnostic: `'NOTE: page-realm only; SW context measurement requires chrome://memory-internals operator verification per D-P3-04.'`
  - Also pushes informational diagnostics: `JSHeapUsedSize=${jsHeapBytes} bytes` and `JSHeapTotalSize=${metrics.JSHeapTotalSize ?? -1} bytes`
  - Returns an AssertionRecord with `passed = checks.every(c => c.passed)`
- The new constant `A32_RAM_CEILING_BYTES = 50 * 1024 * 1024` makes the threshold readable.

Orchestrator (`tests/uat/harness.test.ts`):
- Adds `driveA32,` to import block (after `driveA31,`).
- NO `driveA32Wrapped` const needed (driveA32 takes only `page`).
- Adds `{ name: 'A32', drive: driveA32 },` to drivers array AFTER the A31 entry, with banner comment citing D-P3-04 + Pitfall 2.
- Updates orchestrator banner line to append `, A32`.
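
The wiring described above needs no wrapper because a drivers-array entry only has to accept the page. A rough sketch of that distinction follows, under loud assumptions: `PageStub`, `ResultStub`, `driveArchiveCheck`, `driveHostOnly`, and the `/tmp/uat-downloads` path are all hypothetical, and the real wrapped drivers may bind `downloadsDir` differently.

```typescript
// Hypothetical minimal shapes; the real Page and AssertionRecord types are richer.
interface PageStub {
  readonly url: string;
}
interface ResultStub {
  name: string;
  passed: boolean;
}
type Driver = (page: PageStub) => Promise<ResultStub>;

// Assumed shape of a driver that reads a saved archive and so needs downloadsDir.
async function driveArchiveCheck(page: PageStub, downloadsDir: string): Promise<ResultStub> {
  return { name: `archive check for ${page.url} in ${downloadsDir}`, passed: true };
}

// Host-only driver: everything it needs hangs off the page handle.
async function driveHostOnly(page: PageStub): Promise<ResultStub> {
  return { name: `host-only check for ${page.url}`, passed: true };
}

const downloadsDir = '/tmp/uat-downloads'; // hypothetical path

// A two-argument driver must be wrapped to fit the one-argument Driver type.
const driveArchiveCheckWrapped: Driver = (page) => driveArchiveCheck(page, downloadsDir);

const drivers: { name: string; drive: Driver }[] = [
  { name: 'A31', drive: driveArchiveCheckWrapped },
  // A one-argument driver already matches Driver and is pushed directly.
  { name: 'A32', drive: driveHostOnly },
];
```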
1. Open `tests/uat/lib/harness-page-driver.ts`. At the end of the file (AFTER driveA31 added by Plan 03-03), append:

/* ─── Plan 03-04 — driveA32 (RAM scaffolding best-effort) ──────────── */

/** RAM ceiling per SPEC §10 #9 + CON-ram-ceiling. */
const A32_RAM_CEILING_BYTES = 50 * 1024 * 1024;
/** Bytes-per-MB factor for diagnostic copy. */
const A32_BYTES_PER_MB = 1024 * 1024;

/**
 * Drive A32 (Plan 03-04 — SPEC §10 #9 RAM best-effort per D-P3-04).
 *
 * Reads puppeteer.Page.metrics() against the harness page and asserts
 * JSHeapUsedSize is below the 50 MB ceiling. This is informational
 * scaffolding ONLY:
 *
 *   - RESEARCH Pitfall 2: Page.metrics is page-realm only. The MV3
 *     service worker is a separate Puppeteer target with its own V8
 *     isolate; Page.metrics does not aggregate across workers/iframes.
 *   - The page-realm value reported here is NOT the operator-facing
 *     "extension background RAM" measurement that SPEC §10 #9 requires.
 *   - The binding §10 #9 gate lives in Plan 03-05 VERIFICATION.md
 *     `human_verification` block (operator runs chrome://memory-internals
 *     OR chrome://extensions service-worker memory display).
 *
 * Why ship this anyway (per RESEARCH Open Question 3):
 *   - Low cost (~30 lines; single API call; no new bundle surface).
 *   - Exercises the Page.metrics API end-to-end so Phase 4 (programmatic
 *     RAM measurement upgrade) inherits a working scaffold.
 *   - Provides a sanity floor — if the harness page-realm heap ever
 *     blows past 50 MB, something has gone catastrophically wrong in
 *     the test infrastructure itself (not necessarily a §10 #9 regression
 *     in production).
 *
 * The diagnostic line about page-realm scope MUST be emitted regardless
 * of pass/fail per Pitfall 2.
 *
 * @param page - The harness page from `launchHarnessBrowser`.
 * @returns AssertionRecord with 2 checks (heap returned + heap < 50 MB)
 *          + explicit page-realm-only diagnostic.
 */
export async function driveA32(page: Page): Promise<AssertionRecord> {
  const checks: CheckRecord[] = [];
  const diagnostics: string[] = [];

  // Pitfall 2 gate: emit the page-realm caveat BEFORE any other diagnostic
  // so it leads in the structured output (the operator sees it first).
  diagnostics.push(
    'NOTE: page-realm only; SW context measurement requires chrome://memory-internals operator verification per D-P3-04.',
  );

  let metricsErr: string | null = null;
  let jsHeapBytes = -1;
  let jsHeapTotal = -1;
  try {
    const metrics = await page.metrics();
    jsHeapBytes = metrics.JSHeapUsedSize ?? -1;
    jsHeapTotal = metrics.JSHeapTotalSize ?? -1;
  } catch (err) {
    metricsErr = err instanceof Error ? err.message : String(err);
  }

  const jsHeapMB = jsHeapBytes >= 0 ? jsHeapBytes / A32_BYTES_PER_MB : -1;
  diagnostics.push(`A32 JSHeapUsedSize=${jsHeapBytes} bytes (${jsHeapMB.toFixed(2)} MB)`);
  diagnostics.push(`A32 JSHeapTotalSize=${jsHeapTotal} bytes`);
  if (metricsErr !== null) {
    diagnostics.push(`A32 Page.metrics threw: ${metricsErr}`);
  }

  checks.push({
    name: 'A32.1: Page.metrics returned a JSHeapUsedSize value >= 0',
    expected: '>= 0',
    actual: jsHeapBytes,
    passed: jsHeapBytes >= 0,
  });
  checks.push({
    name: `A32.2: Page-realm JS heap < ${A32_RAM_CEILING_BYTES / A32_BYTES_PER_MB} MB (NOTE: scaffolding only; SW context excluded per D-P3-04)`,
    expected: `< ${A32_RAM_CEILING_BYTES / A32_BYTES_PER_MB} MB`,
    actual: jsHeapMB >= 0 ? `${jsHeapMB.toFixed(2)} MB` : 'unavailable',
    passed: jsHeapBytes >= 0 && jsHeapBytes < A32_RAM_CEILING_BYTES,
  });

  const passed = checks.every((c) => c.passed);
  return {
    passed,
    name: 'A32 — RAM scaffolding (best-effort; page-realm only per D-P3-04 / SPEC §10 #9)',
    checks,
    diagnostics,
    error: metricsErr ?? undefined,
  };
}
2. Open `tests/uat/harness.test.ts`. In the import block from `./lib/harness-page-driver`, after `driveA31,` and before `getManifestVersion`, add:
  // Plan 03-04 — RAM scaffolding best-effort (SPEC §10 #9 per D-P3-04)
  driveA32,
3. In the drivers array, after the `{ name: 'A31', ... }` entry from Plan 03-03, add:
    // Plan 03-04 A32: RAM scaffolding (SPEC §10 #9 best-effort per D-P3-04).
    // NOTE — Page.metrics is page-realm only; SW context is a separate
    // Puppeteer target (RESEARCH Pitfall 2). A32 is informational
    // scaffolding; the binding §10 #9 gate lives in Plan 03-05
    // VERIFICATION.md `human_verification` block. No wrapped const
    // needed — driveA32 takes only `page`.
    { name: 'A32', drive: driveA32 },
4. Update the orchestrator banner line (line 268) to append `, A32`:
  process.stdout.write('Architecture: A0 pre-flight + extension-internal page driver (A1..A14, A15..A17, A18..A22, A23, A24, A25, A26, A27, A28, A29, A30, A31, A32)\n');
5. Run `npx tsc --noEmit`. Expected: clean.
6. Run `HEADLESS=1 SKIP_PROD_REBUILD=0 npm run test:uat`. Expected: 33/33 GREEN.

Verify: `npx tsc --noEmit && D=$(grep -c "driveA32" tests/uat/lib/harness-page-driver.ts) && test "$D" -ge 2 && H=$(grep -c "driveA32" tests/uat/harness.test.ts) && test "$H" -ge 2 && grep -q "NOTE: page-realm only" tests/uat/lib/harness-page-driver.ts && HEADLESS=1 SKIP_PROD_REBUILD=0 npm run test:uat`

<acceptance_criteria>
    • npx tsc --noEmit exits 0.
    • grep -c 'driveA32' tests/uat/lib/harness-page-driver.ts returns >=2.
    • grep -c 'driveA32' tests/uat/harness.test.ts returns >=2 (import line + drivers-array push; no wrapped const).
    • grep -c 'NOTE: page-realm only' tests/uat/lib/harness-page-driver.ts returns exactly 1.
    • grep -c 'page.metrics()' tests/uat/lib/harness-page-driver.ts returns exactly 1.
    • grep -c 'A32_RAM_CEILING_BYTES' tests/uat/lib/harness-page-driver.ts returns >=2 (declaration + usage).
    • HEADLESS=1 SKIP_PROD_REBUILD=0 npm run test:uat exits 0 with stdout containing UAT harness: 33/33 assertions passed AND the diagnostic line NOTE: page-realm only; SW context measurement requires chrome://memory-internals operator verification per D-P3-04. (printed by printAssertionResult on A32).
    • npm test -- --run tests/background/no-test-hooks-in-prod-bundle.test.ts exits 0 (Tier-1 inventory stays at 12). </acceptance_criteria>

UAT harness runs 33/33 GREEN. A32 emits the page-realm-only diagnostic on EVERY run (pass or fail). FORBIDDEN_HOOK_STRINGS unchanged at 12. Page.metrics scaffolding lives in the harness for Phase 4 to upgrade. The binding §10 #9 gate remains operator-driven and is recorded as human_verification in Plan 03-05.
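
The stdout criterion above (summary line plus the page-realm diagnostic) can also be checked mechanically. The sketch below is one possible approach rather than part of the harness: `checkUatStdout` is a hypothetical helper, and the regular expression is inferred solely from the `UAT harness: 33/33 assertions passed` string quoted in the criteria.

```typescript
// Summary format inferred from the acceptance criteria; treat it as an assumption.
const SUMMARY_RE = /UAT harness: (\d+)\/(\d+) assertions passed/;
const PAGE_REALM_DIAGNOSTIC =
  'NOTE: page-realm only; SW context measurement requires chrome://memory-internals operator verification per D-P3-04.';

/** Returns a list of problems found in captured stdout; empty means OK. */
function checkUatStdout(stdout: string, expectedTotal: number): string[] {
  const problems: string[] = [];
  const match = SUMMARY_RE.exec(stdout);
  if (match === null) {
    problems.push('summary line missing');
  } else {
    const passed = Number(match[1]);
    const total = Number(match[2]);
    if (passed !== total) problems.push(`only ${passed}/${total} passed`);
    if (total !== expectedTotal) problems.push(`expected ${expectedTotal} assertions, saw ${total}`);
  }
  // The diagnostic is required even on a fully GREEN run.
  if (stdout.indexOf(PAGE_REALM_DIAGNOSTIC) === -1) {
    problems.push('page-realm diagnostic missing');
  }
  return problems;
}
```

For example, `checkUatStdout(capturedStdout, 33)` returning an empty array would correspond to the 33/33 GREEN run with the diagnostic present.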

<threat_model>

Trust Boundaries

| Boundary | Description |
|---|---|
| Puppeteer host ↔ CDP | Page.metrics is a thin wrapper over CDP Performance.getMetrics; runs in the puppeteer host process, no extension code path |
| Page realm ↔ host realm | A32 does NOT use page.evaluate; no new contract between page and host |
| dist-test/ ↔ dist/ | Two-bundle separation: Plan 03-04 adds NO test-only symbols; production bundle invariant unchanged |

STRIDE Threat Register

| Threat ID | Category | Component | Disposition | Mitigation Plan |
|---|---|---|---|---|
| T-03-04-01 | Repudiation | Operator interprets A32 GREEN as full §10 #9 closure, skips chrome://memory-internals check | mitigate | Mandatory diagnostic line 'NOTE: page-realm only; SW context measurement requires chrome://memory-internals operator verification per D-P3-04.' emitted on EVERY run; check name itself includes the caveat; Plan 03-05 VERIFICATION.md explicitly lists §10 #9 in human_verification block. Three layers of operator-visible signal. |
| T-03-04-02 | Information Disclosure | Test-only hook surface leaking to production bundle | mitigate | A32 is host-side only; Page.metrics is not bundled to the page realm. FORBIDDEN_HOOK_STRINGS unchanged at 12 entries. |
| T-03-04-03 | Denial of Service | Page.metrics returns 0 or throws on first call after browser launch | mitigate | A32 wraps the call in try/catch + falls through gracefully (jsHeapBytes stays -1; A32.1 RED with clear diagnostic). Per A3 in RESEARCH Assumptions Log, Page.metrics has been stable since Puppeteer 1.x; failure is extremely unlikely on 25.0.2. |
| T-03-04-04 | Elevation of Privilege | New chrome.* permission grant for measurement | accept | A32 uses zero chrome.* APIs. Page.metrics is a CDP call, not an extension API. No manifest delta. |

No new production surface; threat surface unchanged from Plan 03-03. UAT harness extension is test-only and adds no bundle surface (Page.metrics is host-side only). </threat_model>

- `npx tsc --noEmit` exits 0.
- `HEADLESS=1 SKIP_PROD_REBUILD=0 npm run test:uat` exits 0 with 33/33 GREEN.
- The diagnostic line `NOTE: page-realm only; SW context measurement requires chrome://memory-internals operator verification per D-P3-04.` appears in stdout from A32.
- `npm test -- --run tests/background/no-test-hooks-in-prod-bundle.test.ts` exits 0 (12 FORBIDDEN_HOOK_STRINGS × 0 hits each).

<success_criteria>

  • A32 GREEN with 2 checks (heap returned + heap < 50 MB).
  • Pitfall 2 diagnostic emitted on every run.
  • Page.metrics scaffolding in place for Phase 4 to upgrade.
  • FORBIDDEN_HOOK_STRINGS unchanged at 12 entries.
  • vitest baseline preserved (171/171 GREEN).
  • Plan 03-05 will record §10 #9 as human_verification regardless of A32 status — A32 is informational scaffolding, NOT the binding gate. </success_criteria>
After completion, create `.planning/phases/03-spec-10-smoke-verification-dom-event-log-verification/03-04-SUMMARY.md` documenting:
- A32 host-side-only scaffolding rationale (no page-side; Page.metrics is host)
- D-P3-04 + Pitfall 2 compliance (mandatory page-realm-only diagnostic)
- Phase 4 inheritance: programmatic RAM measurement upgrade path
- UAT 32 → 33 GREEN; Tier-1 inventory unchanged at 12
- Plan 03-05 wave dependency: VERIFICATION.md aggregator; depends on Plans 03-01..04 GREEN