---
phase: 04
slug: harden-clean-up-optional
status: draft
nyquist_compliant: false
wave_0_complete: false
created: 2026-05-21
---

# Phase 04 — Validation Strategy

> Per-phase validation contract for feedback sampling during execution.

**Phase 4 character:** Final v1 hardening phase. Mix of bug fixes + flake stabilization + audit P1 polish + visual polish + build hygiene + closure aggregator. Wave 0 RED test scaffolds for new audit-P1 fixes; existing harness extended for new A33+ assertions; spike-first for SW state persistence (Plan 04-04 per RESEARCH finding 2).

---

## Test Infrastructure

| Property | Value |
|----------|-------|
| **Framework** | vitest 4.x (unit) + custom Puppeteer harness (UAT — `npm run test:uat`) |
| **Config file** | `vitest.config.ts` + `tests/uat/harness.test.ts` (orchestrator) |
| **Quick run command** | `npm test -- --run tests/.test.ts` |
| **Full suite command** | `npm test -- --run` (vitest) + `HEADLESS=1 SKIP_PROD_REBUILD=1 npm run test:uat` (UAT harness) |
| **Estimated runtime** | ~50s (vitest 171→~190 tests post Wave 0) + ~95-300s (UAT harness 33→~37 assertions; 5-min idle test adds ~300s to single run if not gated) ≈ 2.5-6 min full sweep |

---

## Sampling Rate

- **After every task commit:** Run focused test command (vitest single-file OR `npm run test:uat -- --grep A` for harness)
- **After every plan wave:** Run full vitest + full UAT harness — both MUST be GREEN
- **Before `/gsd-verify-work 4`:** Full suite GREEN + pre-checkpoint bundle gates 6/6 PASS (per saved memory `feedback-pre-checkpoint-bundle-gates.md`)
- **5-min idle test (Plan 04-04):** dedicated `npm run test:uat:long` lane with 6-min timeout; NOT in default sample
- **Max feedback latency:** ~2.5 min (default full sweep); ~10s (focused vitest); ~25s (focused UAT assertion)

---

## Per-Task Verification Map

| Task ID | Plan | Wave | Requirement | Threat Ref | Secure Behavior | Test Type | Automated Command | File Exists | Status |
|---------|------|------|-------------|------------|-----------------|-----------|-------------------|-------------|--------|
| 04-01 T1 RED 3 tests | 04-01 | 1 | Audit P1 #11/#14/#15 | T-04-01-01..03 | URL extraction + previousUrl + epoch normalization | unit (vitest jsdom) | `npm test -- tests/content/ --run` | ❌ NEW (Wave 0) | ⬜ pending |
| 04-01 T2 GREEN edits | 04-01 | 1 | Audit P1 #11/#14/#15 | T-04-01-01..03 | Same; src/content/index.ts edits | unit (vitest) | `npm test -- tests/content/ --run` (+8 GREEN) + `npx tsc --noEmit` | ✗ EXISTS (modify) | ⬜ pending |
| 04-02 T1 RED build gates | 04-02 | 1 | SC #4 dead-code + setimmediate hygiene | T-04-02-01/03 | grep gate | unit (vitest + execFile build) | `npm test -- tests/build/no-new-function-in-sw-chunk.test.ts tests/build/dead-code-grep.test.ts --run` | ❌ NEW (Wave 0) | ⬜ pending |
| 04-02 T2 GREEN polyfill + rename + flip | 04-02 | 1 | SC #3 generate-icons + setimmediate Q1 | T-04-02-01/02/04 | queueMicrotask polyfill; .cjs rename | build-gate + unit | `npm run build && grep -c 'new Function' dist/assets/index.ts-*.js` -> 0 + `node generate-icons.cjs` exit 0 | ✗ EXISTS (modify + rename) | ⬜ pending |
| 04-03 T1 assertA29 rewrite | 04-03 | 2 | A29 flake stabilization | T-04-03-01/02 | cs-injection-world ISOLATED + sentinel | UAT (page-side) | `npx tsc --noEmit && npm run build:test` | ✗ EXISTS (modify) | ⬜ pending |
| 04-03 T2 driveA29 strict-sentinel | 04-03 | 2 | A29 sentinel filter | T-04-03-01 | rrweb IncrementalSource.Mutation filter | UAT (host-side) | `HEADLESS=1 SKIP_PROD_REBUILD=1 npm run test:uat` 33/33 GREEN; 5/5 stress | ✗ EXISTS (modify) | ⬜ pending |
| 04-04 T1 SPIKE | 04-04 | 3 | SC #1 SW state persistence empirical | T-04-04-01 | offscreen survives SW idle | spike script | `HEADLESS=1 tsx tests/uat/spike-a33-sw-persistence.ts` -> videoSize > 100_000 | ❌ NEW (Wave 0 spike) | ⬜ pending |
| 04-04 T2 A33 + stopServiceWorker + orchestrator | 04-04 | 3 | SC #1 5-min idle harness | T-04-04-02/03/04 | CDP worker.close() + 5-min wait + SAVE | UAT | `HEADLESS=1 SKIP_LONG_UAT=1 npm run test:uat` 34/34 GREEN (skip-mode); full-mode 34/34 ~6.5 min | ✗ EXISTS (modify) | ⬜ pending |
| 04-05 T1 assertA34 fetch+XHR | 04-05 | 4 | SC #2 fetch+XHR network_error | T-04-05-01 | cs-injection-world dual-trigger | UAT (page-side) | `npx tsc --noEmit && npm run build:test` | ✗ EXISTS (modify) | ⬜ pending |
| 04-05 T2 driveA34 + orchestrator | 04-05 | 4 | SC #2 + P1 #11 end-to-end empirical | T-04-05-01 | 2 network_error entries with status===404 | UAT | `HEADLESS=1 SKIP_LONG_UAT=1 npm run test:uat` 35/35 GREEN; full-mode ~7 min | ✗ EXISTS (modify) | ⬜ pending |
| 04-06 T1 RED inline-SVG + cursor-pin | 04-06 | 5 | UI-SPEC dark-logo + RESEARCH Finding 4 | T-04-06-01 | DOMParser inline injection (no innerHTML); cursor: 'always' literal | unit (vitest jsdom + build-grep) | `npm test -- tests/welcome/ tests/build/cursor-visibility.test.ts --run` | ❌ NEW (Wave 0) | ⬜ pending |
| 04-06 T2 GREEN SVG + welcome.ts + globals | 04-06 | 5 | UI-SPEC stroke recolor + ?raw import | T-04-06-01 | currentColor + DOMParser inline | unit | `npm test -- tests/welcome/inline-svg.test.ts --run` 3/3 GREEN | ✗ EXISTS (modify) | ⬜ pending |
| 04-06 T3 A17.8 + 01-07 back-patch | 04-06 | 5 | UI-SPEC harness invariant + docs hygiene | T-04-06-01 | A17.8 raw-source grep | UAT + docs | `HEADLESS=1 SKIP_LONG_UAT=1 npm run test:uat` 35/35 + grep verify | ✗ EXISTS (modify) | ⬜ pending |
| 04-06 T4 Operator empirical | 04-06 | 5 | UI-SPEC AC #6 aesthetic judgment | T-04-06-01 | dark-mode visual contrast | manual | operator returns "approved" or describes issue | n/a | ⬜ pending |
| 04-07 T1 04-VERIFICATION.md | 04-07 | 6 | Phase 4 closure aggregator | T-04-07-01 | scorecard + override notes + deferred items | docs aggregator | `test -f .planning/phases/04-harden-clean-up-optional/04-VERIFICATION.md && grep -cE '^## '` >= 5 | ❌ NEW | ⬜ pending |
| 04-07 T2 Marker flips | 04-07 | 6 | D-P4-05 + ROADMAP/STATE flips | T-04-07-02/03 | Phase 4 [x] + completed_phases: 4 | docs | `grep -c '\[x\] \*\*Phase 4' .planning/ROADMAP.md` >= 1 | ✗ EXISTS (modify) | ⬜ pending |

*Status: ⬜ pending · ✅ green · ❌ red · ⚠️ flaky*

**Planner instructions:** Populate one row per task. Per RESEARCH finding 5 (6 new Wave-0 test files anticipated), expect ~6 unit-test rows + ~4 harness-A33+ rows + ~4 bundle-gate rows + ~3 docs rows. Format per Phase 3 03-VALIDATION precedent.

---

## Wave 0 Requirements

Per RESEARCH finding 5, 6 new test files anticipated as Wave 0 RED scaffolds for audit-P1 / ROADMAP-SC items:

- ⬜ `tests/build/no-new-function-in-sw-chunk.test.ts` — grep `dist/assets/index.ts-*.js` for `new Function(` count = 0 after setimmediate polyfill replacement
- ⬜ `tests/build/dead-code-grep.test.ts` — ROADMAP SC #4: rg `permissions\.request` + duplicate offscreen inline string in src/ = 0 hits
- ⬜ `tests/content/fetch-interception.test.ts` — audit P1 #11: Request-arg vs string-arg case for `args[0]`
- ⬜ `tests/content/navigation-tracking.test.ts` — audit P1 #14: module-level previousUrl tracking
- ⬜ `tests/content/rrweb-timestamps.test.ts` — audit P1 #15: epoch vs page-load-relative
- ⬜ `tests/welcome/inline-svg.test.ts` — UI-SPEC: `?raw` import + DOMParser inline-SVG + currentColor resolves via CSS

Existing infrastructure already in place (inherited from Phases 1-3):

- ✅ `tests/uat/extension-page-harness.ts` — page-side assertA* host (extend with assertA33+)
- ✅ `tests/uat/lib/harness-page-driver.ts` — host-side driveA* host (extend with driveA33+ + improved driveA29)
- ✅ `tests/uat/harness.test.ts` — orchestrator (extend)
- ✅ `tests/uat/lib/assertions.ts` — shared helpers
- ✅ `tests/uat/lib/zip.ts` — jszip-based archive parsing
- ✅ `tests/uat/lib/launch.ts` — Puppeteer Chrome launch + extension load
- ✅ `tests/background/no-test-hooks-in-prod-bundle.test.ts` — FORBIDDEN_HOOK_STRINGS lockstep

`wave_0_complete: false` until 6 new test files committed per planner Wave 0.

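
The first Wave 0 item above (zero `new Function(` occurrences in the SW chunk) can be sketched as a pure string check. This is a minimal sketch, not the planned test file: `countLiteral` and `findOffendingChunks` are hypothetical helper names, and the chunk map is assumed to be filled by reading the build output matching `dist/assets/index.ts-*.js` inside the real vitest file.

```typescript
// Hypothetical helper (not from the codebase): counts non-overlapping
// occurrences of a literal string. A literal search is used instead of a
// regex so the "(" in "new Function(" needs no escaping.
export function countLiteral(source: string, needle: string): number {
  if (needle.length === 0) return 0;
  let count = 0;
  let from = source.indexOf(needle);
  while (from !== -1) {
    count += 1;
    from = source.indexOf(needle, from + needle.length);
  }
  return count;
}

// Hypothetical gate: given chunk file names mapped to their text, return the
// sorted names of chunks that still contain the forbidden literal.
// An empty result means the gate passes.
export function findOffendingChunks(
  chunks: Record<string, string>,
  needle: string = "new Function(",
): string[] {
  return Object.keys(chunks)
    .filter((name) => countLiteral(chunks[name], needle) > 0)
    .sort();
}
```

In a RED scaffold the assertion would be that `findOffendingChunks(chunks)` is an empty array; it is expected to fail until the setimmediate polyfill replacement lands, then flip to GREEN.
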

---

## Manual-Only Verifications

| Behavior | Requirement | Why Manual | Test Instructions |
|----------|-------------|------------|-------------------|
| Dark-surface logo visual contrast judgment (WCAG ratio not specified in UI-SPEC) | UI-SPEC acceptance criterion #2 + operator empirical | Aesthetic judgment of deep-indigo stroke on madder-orange wrapper in dark-OS context — UI-SPEC defers to operator empirical checkpoint | Load extension in OS dark mode; open welcome page; verify mark wrapper still madder, stroke is deep-indigo, contrast is acceptable. Plan 04-06 operator empirical UAT cycle covers this. |
| Cursor visibility verification | ROADMAP cursor visibility item per D-P4-03 | Already shipped at src/offscreen/recorder.ts:285 per Plan 01-09 (RESEARCH finding 4) — Plan 04-06 downgrades to verification + stale-note correction; minor manual check of one captured frame to confirm cursor present | Load extension, start recording, click anywhere, SAVE archive; open `video/last_30sec.webm` in Chrome; verify cursor visible in playback. |

*All other Phase 4 behaviors have automated verification via vitest (Wave 0 unit tests) + UAT harness (A33+) + bundle gates.*

---

## Validation Sign-Off

- [ ] All tasks have `<automated>` verify or Wave 0 dependencies — pending planner fill-in
- [ ] Sampling continuity: no 3 consecutive tasks without automated verify — verify after planner fills table
- [ ] Wave 0 covers all MISSING references — 6 new test files anticipated; planner confirms count
- [ ] No watch-mode flags — verify in planner output (focused commands use `--run`)
- [ ] Feedback latency < ~2.5 min default (5-min idle test on dedicated lane) — confirmed by RESEARCH
- [ ] `nyquist_compliant: true` set in frontmatter — pending sign-off after planner completes

**Approval:** pending (planner fills per-task map; checker validates)