mokosh/.planning/phases/02-stabilize-export-pipeline/02-04-PLAN.md
Mark 0608b22427 feat(02): plans 01-04 — Phase 2 export pipeline closure (Blob URL + meta.urls + schema + harness)
Wave structure (4 plans, 3 waves):
- 02-01 (Wave 1 RED): 15 RED tests pinning D-P2-01 (blob: URL contract), D-P2-02
  (meta.urls schema + dedup + filter), D-P2-03 (strict 8-field validation +
  schemaVersion '2' cutover marker).
- 02-02 (Wave 2): Offscreen-minted Blob URL pipeline — extends PortMessageType
  with CREATE/REVOKE messages; SW downloadArchive rewrite (data: → blob: via
  base64-on-wire to offscreen + URL.createObjectURL + chrome.downloads.onChanged
  revoke lifecycle). Closes audit P0-6; unblocks >2 MB archives.
- 02-03 (Wave 2): meta.urls schema migration + tab-url-tracker module
  (chrome.tabs.onActivated + onUpdated → deduplicated, filtered, first-seen-
  ordered string[]); SessionMetadata 7→8 fields with schemaVersion + urls;
  REQUIREMENTS.md REQ-meta-json-schema amendment. Closes P1 #10.
- 02-04 (Wave 3): UAT harness A24+A25+A26+A27 — blob: URL prefix, <5s SAVE→zip
  latency, meta.json 8-field shape, multi-tab dedup; pre-checkpoint bundle gates
  per saved memory + operator empirical UAT cycle 1. Tier-1 FORBIDDEN_HOOK_STRINGS
  inventory stays at 12 (no new hook symbols — chrome.* monkey-patches + JSZip
  + production APIs only).

Locked decisions honored (per 02-CONTEXT.md):
- D-P2-01: offscreen-minted Blob URL via existing keepalivePort + base64 wire
  format (reuses D-12 precedent at src/shared/binary.ts).
- D-P2-02: meta.json url:string → urls:string[]; URL filter per CONTEXT.md
  <specifics> (include https://, chrome-extension://; exclude chrome://, about:,
  devtools://, file://); dedup + first-seen ordering.
- D-P2-03: full scope; 8-field strict schema validation with schemaVersion='2'
  as the 8th field (planner-resolved tentative pick; revisable by plan-checker).

Architectural constraints preserved:
- Always-on charter (Plan 01-09 Amendment 3): no finally-block in saveArchive;
  no clearTabUrlsSeen on SAVE.
- Tier-1 FORBIDDEN_HOOK_STRINGS = 12 (no new test-hook symbols).
- Never await import(...) in src/background/index.ts (Plan 01-11 SUMMARY).
- Pre-checkpoint bundle gates per feedback-pre-checkpoint-bundle-gates.md (run
  in 02-04 Task 4 before operator surface).

Plan validation: gsd-sdk frontmatter.validate + verify.plan-structure GREEN
for all 4 plans.

ROADMAP updated: Phase 2 Plans list + Goal/Success Criteria block annotated
with D-P2-02/D-P2-03 amendments + 5th success criterion (Blob URL + revoke
lifecycle for >2 MB archives); Progress table 0/TBD → 0/4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 14:03:14 +02:00

---
phase: 02-stabilize-export-pipeline
plan: 04
type: auto
wave: 3
depends_on: [02, 03]
files_modified:
- tests/uat/extension-page-harness.ts
- tests/uat/lib/harness-page-driver.ts
- tests/uat/harness.test.ts
- tests/background/no-test-hooks-in-prod-bundle.test.ts
autonomous: false
requirements:
- REQ-archive-export-latency
- REQ-meta-json-schema
- REQ-popup-ui
- REQ-screenshot-on-export
- REQ-archive-layout
tags:
- uat-harness
- a24
- a25
- a26
- a27
- blob-url-empirical
- latency-5s
- meta-urls-shape
- operator-checkpoint
- phase-2-closure
- approach-b
must_haves:
  truths:
    - "UAT harness extends with A24+ assertions covering the D-P2-01 + D-P2-02 + D-P2-03 contracts empirically (page-side, no SW-side hooks per Approach B)."
    - "A24 verifies the SAVE_ARCHIVE → chrome.downloads.download call site receives a `blob:` URL prefix (NOT `data:application/zip;base64,`)."
    - "A25 verifies the SAVE_ARCHIVE → zip-on-disk latency (<5000ms from chrome.runtime.sendMessage({type:'SAVE_ARCHIVE'}) dispatch to file appearing in downloadsDir) per REQ-archive-export-latency."
    - "A26 verifies the produced meta.json has the 8-field D-P2-02/D-P2-03 shape: schemaVersion='2', urls is non-empty string[], no `url` key present."
    - "A27 verifies tab-URL tracking by activating 2 distinct tabs during the recording window, then SAVE, then inspecting meta.urls — expects both URLs to be present (deduplicated, first-seen-ordered)."
    - "Tier-1 FORBIDDEN_HOOK_STRINGS inventory extended IF new hook surfaces required for A24+; lockstep across tests/background/no-test-hooks-in-prod-bundle.test.ts AND tests/uat/harness.test.ts A0 mirror."
    - "Final operator empirical checkpoint validates: (a) saving a real ~6MB archive completes successfully (was: data:URL Network error pre-Plan-02-02); (b) the produced meta.json has the new 8-field shape (operator opens the zip with an archive-manager tool); (c) the saved zip opens cleanly in the OS file manager."
    - "Pre-checkpoint bundle gates per saved memory `feedback-pre-checkpoint-bundle-gates.md` are run BEFORE operator checkpoint: SW CSP grep (no `new Function`/`eval`), SW Node-globals grep (no `Buffer.from`), DOM-globals grep, SW-bundle-import gate, manifest validation."
    - "Phase 2 closes with 4/4 plans landed; 24/24 UAT baseline preserved or extended to 28/28 (A24+A25+A26+A27 inclusive); vitest baseline preserved or extended; operator empirical ack documented."
  artifacts:
    - path: "tests/uat/extension-page-harness.ts"
      provides: "assertA24 (blob: URL prefix) + assertA25 (5s latency) + assertA26 (meta.urls shape) + assertA27 (multi-tab URL dedup)"
      contains: "assertA24|assertA25|assertA26|assertA27"
    - path: "tests/uat/lib/harness-page-driver.ts"
      provides: "driveA24, driveA25, driveA26, driveA27 page.evaluate wrappers"
      contains: "driveA24|driveA25|driveA26|driveA27"
    - path: "tests/uat/harness.test.ts"
      provides: "Orchestrator runs A24+A25+A26+A27 after A23; FORBIDDEN_HOOK_STRINGS lockstep if new hook symbols"
      contains: "driveA24|driveA25|driveA26|driveA27"
    - path: "tests/background/no-test-hooks-in-prod-bundle.test.ts"
      provides: "Lockstep FORBIDDEN_HOOK_STRINGS update if new hook symbols introduced"
      contains: "FORBIDDEN_HOOK_STRINGS"
  key_links:
    - from: "tests/uat/harness.test.ts"
      to: "tests/uat/lib/harness-page-driver.ts:driveA24..A27"
      via: "import + sequential orchestrator dispatch after driveA23"
      pattern: "driveA24|driveA25|driveA26|driveA27"
    - from: "tests/uat/extension-page-harness.ts:assertA25"
      to: "performance.now() bookends around SAVE_ARCHIVE dispatch + host-side downloadsDir poll"
      via: "page-side measures dispatch→ack; host-side measures dispatch→file-on-disk; assert total < 5000ms"
      pattern: "assertA25"
    - from: "tests/uat/extension-page-harness.ts:assertA24"
      to: "chrome.downloads.onChanged listener OR chrome.downloads spy proxy"
      via: "harness page spies on chrome.downloads.download via Proxy/replace OR reads delta.url from onChanged event; verifies prefix is 'blob:'"
      pattern: "chrome\\.downloads"
---
<objective>
Wave 3 of Phase 2: extend the UAT harness with A24+A25+A26+A27 assertions covering the D-P2-01 +
D-P2-02 + D-P2-03 contracts end-to-end through a real Chrome instance. Run pre-checkpoint bundle
gates per saved memory before surfacing the final operator empirical checkpoint. Close Phase 2.
Purpose: empirical verification that Plans 02-02 (Blob URL pipeline) and 02-03 (meta.urls schema)
work in a real Chrome instance, not just in unit-test isolation. The harness extension is the
closure gate for Phase 2 — analogous to Plan 01-13's 15/15 harness PASS that closed Phase 1
functional contract.
Output:
- 4 new harness assertions (A24+A25+A26+A27) wired through page-harness + driver + orchestrator.
- Tier-1 FORBIDDEN_HOOK_STRINGS lockstep IF new hook symbols required.
- Pre-checkpoint bundle gate validation completed.
- Operator empirical checkpoint documenting Plan 02-02 + 02-03 + the 5s-latency contract on a
real 6MB archive.
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/REQUIREMENTS.md
@.planning/STATE.md
@.planning/phases/02-stabilize-export-pipeline/02-CONTEXT.md
@.planning/phases/02-stabilize-export-pipeline/02-01-PLAN.md
@.planning/phases/02-stabilize-export-pipeline/02-02-PLAN.md
@.planning/phases/02-stabilize-export-pipeline/02-03-PLAN.md
# Precedent for harness extension pattern (Approach B)
@.planning/phases/01-stabilize-video-pipeline/01-13-SUMMARY.md
@.planning/phases/01-stabilize-video-pipeline/01-14-SUMMARY.md
# Files under modification
@tests/uat/harness.test.ts
@tests/uat/lib/harness-page-driver.ts
@tests/background/no-test-hooks-in-prod-bundle.test.ts
<interfaces>
<!-- Existing harness orchestrator pattern (read these before extending). -->
From tests/uat/harness.test.ts FORBIDDEN_HOOK_STRINGS (lines 107-122):
```typescript
const FORBIDDEN_HOOK_STRINGS: ReadonlyArray<string> = [
  '__mokoshTest',
  'setCurrentStream',
  'setSegmentCountGetter',
  'installFakeDisplayMedia',
  'uninstallFakeDisplayMedia',
  'dispatchEndedOnTrack',
  'getSegmentCount',
  '__mokoshOffscreenQuery',
  'get-display-surface',
  'get-segment-count',
  'lastGetDisplayMediaConstraints',
  'get-last-getDisplayMedia-constraints',
];
```
Existing driver pattern (tests/uat/lib/harness-page-driver.ts:driveA23):
```typescript
export async function driveA23(page: Page): Promise<AssertionRecord> {
  return await page.evaluate(async () => {
    const harness = (window as any).__mokoshHarness;
    const r: AssertionRecord = await harness.assertA23();
    return r;
  }) as AssertionRecord;
}
```
A5 driver pattern (host-side + page-side merge) for downloadsDir polling — driveA24/A25
should reuse this pattern OR chain off A5's existing zip-on-disk verification. See
tests/uat/lib/harness-page-driver.ts:driveA5 (line 215) for the merged-checks pattern.
</interfaces>
</context>
<tasks>
<task type="auto" tdd="true">
<name>Task 1: assertA24 — chrome.downloads receives blob: URL prefix (D-P2-01 empirical)</name>
<files>tests/uat/extension-page-harness.ts, tests/uat/lib/harness-page-driver.ts</files>
<behavior>
- assertA24 page-side: at SAVE_ARCHIVE dispatch time, spy on chrome.downloads.download via a
page-context Proxy / monkey-patch installed BEFORE the dispatch. Wait for the SAVE_ARCHIVE
ack, then read the captured args[0].url. Assert:
(a) url.startsWith('blob:') === true
(b) url.startsWith('data:application/zip;base64,') === false
- driveA24: standard page.evaluate wrapper (mirrors driveA23 shape).
- A24 chains AFTER A23 in the orchestrator — does its own setupFreshRecording before SAVE
OR reuses A5's recording state if testable. PLANNER-DECISION: A24 does its OWN fresh
recording + save dispatch because the spy-installation requires controlling the window
RIGHT BEFORE the SAVE — chaining off A5's already-completed save misses the spy window.
- NO new test-hook symbols required: chrome.downloads.download is a chrome.* API that the
privileged extension-internal page can monkey-patch directly without going through the
__mokoshOffscreenQuery bridge. The Proxy replaces chrome.downloads.download with a wrapper
that records args + delegates to the original; assertA24 reads back the captured args + restores.
- This means Tier-1 FORBIDDEN_HOOK_STRINGS inventory STAYS at 12 (no new symbols).
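
If the service worker (rather than the harness page) turns out to be the context that calls `chrome.downloads.download`, a patch installed on the harness page's own `chrome` object would likely not observe the call, since each extension context gets its own bindings. A listener-based capture is one possible fallback, in the spirit of the `chrome.downloads.onChanged` alternative named in `key_links`. The sketch below assumes the `chrome.downloads.onCreated` event, which delivers a `DownloadItem` carrying a `url` field. `captureNextDownloadUrl`, the structural `DownloadCreatedEvent` type, and `FakeDownloadCreatedEvent` are hypothetical names used only for illustration.

```typescript
// Minimal structural types so the sketch compiles without @types/chrome.
interface DownloadItemLike { id: number; url: string; }
type DownloadListener = (item: DownloadItemLike) => void;
interface DownloadCreatedEvent {
  addListener(cb: DownloadListener): void;
  removeListener(cb: DownloadListener): void;
}

// Hypothetical helper: records the URL of the first download created after the call.
function captureNextDownloadUrl(event: DownloadCreatedEvent): { url(): string | null; dispose(): void } {
  const state: { url: string | null } = { url: null };
  const listener: DownloadListener = (item) => {
    if (state.url === null) state.url = item.url; // keep only the first download
  };
  event.addListener(listener);
  return {
    url: () => state.url,
    dispose: () => event.removeListener(listener),
  };
}

// Hypothetical stand-in for chrome.downloads.onCreated, used here to exercise the helper.
class FakeDownloadCreatedEvent implements DownloadCreatedEvent {
  private listeners: DownloadListener[] = [];
  addListener(cb: DownloadListener): void { this.listeners.push(cb); }
  removeListener(cb: DownloadListener): void {
    this.listeners = this.listeners.filter((l) => l !== cb);
  }
  emit(item: DownloadItemLike): void { this.listeners.forEach((l) => l(item)); }
  listenerCount(): number { return this.listeners.length; }
}

const fakeEvent = new FakeDownloadCreatedEvent();
const capture = captureNextDownloadUrl(fakeEvent);
fakeEvent.emit({ id: 1, url: 'blob:chrome-extension://abc/123' });
capture.dispose();
```

In the harness page the helper would presumably be called as `captureNextDownloadUrl(chrome.downloads.onCreated)` before the SAVE_ARCHIVE dispatch. Like the monkey-patch, it uses only production `chrome.*` APIs, so it would not add a hook symbol to the Tier-1 inventory.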
</behavior>
<action>
Edit `tests/uat/extension-page-harness.ts` to add `assertA24` (pattern after assertA23 — read
lines around 1906-1989 from the file for the existing assertA23 shape):
```typescript
/**
 * A24 — D-P2-01 empirical: SAVE_ARCHIVE → chrome.downloads.download is invoked with a
 * `blob:` URL (NOT `data:application/zip;base64,`). Closes audit P0-6 functionally.
 *
 * Pattern: install a chrome.downloads.download Proxy that records the first url arg,
 * dispatch SAVE_ARCHIVE, await ack, restore original, assert the captured url prefix.
 *
 * Chains: independent of A5 (does its own setupFreshRecording + SAVE) because the spy
 * must be installed BEFORE the SAVE dispatch.
 */
async function assertA24(): Promise<AssertionResult> {
  const checks: CheckRecord[] = [];
  const diagnostics: string[] = [];
  // Setup: fresh recording (mirrors A23's pattern; reuses setupFreshRecording helper).
  await setupFreshRecording();
  // Settle: let one segment land so SAVE has video to package.
  await new Promise(r => setTimeout(r, 11_000));
  // Install spy. The captured URL lives on a holder object because TypeScript does not
  // track assignments made inside the wrapper closure and would narrow a plain
  // `let capturedUrl: string | null = null` to `null` in the checks below.
  const original = chrome.downloads.download.bind(chrome.downloads);
  const captured: { url: string | null } = { url: null };
  (chrome.downloads as any).download = (opts: chrome.downloads.DownloadOptions, ...rest: unknown[]) => {
    captured.url = opts.url;
    // Forward any trailing callback argument unchanged.
    return (original as any)(opts, ...rest);
  };
  try {
    // The ack is undefined if the message could not be delivered; guard with `?.`.
    const ack: { success: boolean } | undefined = await new Promise((resolve) => {
      chrome.runtime.sendMessage({ type: 'SAVE_ARCHIVE' }, resolve);
    });
    checks.push({
      name: 'A24.1: SAVE_ARCHIVE ack received with success=true',
      expected: true,
      actual: ack?.success,
      passed: ack?.success === true,
    });
    // Poll up to 5s for the spy to fire (chrome.downloads.download is async-resolved
    // post the offscreen bridge round-trip).
    const pollStart = Date.now();
    while (captured.url === null && Date.now() - pollStart < 5000) {
      await new Promise(r => setTimeout(r, 100));
    }
    const capturedUrl = captured.url;
    checks.push({
      name: 'A24.2: chrome.downloads.download was invoked',
      expected: true,
      actual: capturedUrl !== null,
      passed: capturedUrl !== null,
    });
    const urlIsBlob = capturedUrl !== null && capturedUrl.startsWith('blob:');
    const urlIsDataBase64 = capturedUrl !== null && capturedUrl.startsWith('data:application/zip;base64,');
    checks.push({
      name: 'A24.3: download URL starts with "blob:" (D-P2-01)',
      expected: true,
      actual: urlIsBlob,
      passed: urlIsBlob,
    });
    checks.push({
      name: 'A24.4: download URL does NOT start with "data:application/zip;base64," (legacy path retired)',
      expected: true,
      actual: !urlIsDataBase64,
      passed: !urlIsDataBase64,
    });
    diagnostics.push(`capturedUrl prefix: ${capturedUrl?.substring(0, 40) ?? '<null>'}...`);
  } finally {
    (chrome.downloads as any).download = original;
  }
  return {
    name: 'A24 — D-P2-01 Blob URL download (closes P0-6)',
    checks,
    diagnostics,
  };
}
```
Register assertA24 on the __mokoshHarness window surface (mirror the assertA23 registration line).
Edit `tests/uat/lib/harness-page-driver.ts` to add driveA24 (standard wrapper, identical
to driveA23 shape).
Per D-P2-01: this is the empirical closure of P0-6. The unit test (Plan 02-01 Task 1) proves
the wire-format at the SW boundary; A24 proves it end-to-end through a real Chrome instance,
including the offscreen bridge round-trip + the chrome.downloads platform call.
</action>
<verify>
<automated>npm run build:test 2>&1 | tail -5 ; HEADLESS=1 npm run test:uat 2>&1 | grep -E "(A24|FAIL|PASS)" | tail -20</automated>
</verify>
<done>
A24 page-side + driver wired. Harness orchestrator runs through A24. ALL 4 A24 checks GREEN.
Tier-1 FORBIDDEN_HOOK_STRINGS unchanged at 12 (chrome.downloads spy is a chrome.* monkey-patch,
not a test-hook symbol).
Atomic commit:
`feat(02-04): harness A24 — empirical Blob URL download verification (D-P2-01)`.
</done>
</task>
<task type="auto" tdd="true">
<name>Task 2: assertA25 — SAVE→zip-on-disk latency <5s (REQ-archive-export-latency)</name>
<files>tests/uat/extension-page-harness.ts, tests/uat/lib/harness-page-driver.ts</files>
<behavior>
- assertA25 page-side: dispatch SAVE_ARCHIVE; record `t0 = performance.now()` before dispatch
and `tAck = performance.now()` after the ack. Return both timings in the page result.
- driveA25 host-side: also polls downloadsDir for the new zip file (mirrors A5's pattern) and
records `tFile = Date.now()` when the file is observed. Returns the page result merged with
the host-side timing.
- Assertions:
(a) tAck - t0 < 5000 (page-side: dispatch → ack)
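(b) tFile - t0_host < 5000 (host-side: dispatch → file on disk, where t0_host is the SAVE
dispatch instant on the host clock; it must NOT be captured before page.evaluate, because
page.evaluate also spans setupFreshRecording and the 11s settle)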
- Operator-facing tolerance: 5000ms is the SPEC §10 #6 + CON-archive-export-latency hard ceiling.
Recorded actual latency reported in diagnostics for retrospective tuning.
- A25 chains AFTER A24's SAVE (which already produced one zip in downloadsDir) — A25 does its
OWN setupFreshRecording + SAVE because the latency measurement must be on a clean save, not
compounded with A24's still-pending state.
- PLANNER-DECISION on the empty-buffer race: A25 settles for 11s after setupFreshRecording before
dispatching SAVE_ARCHIVE (mirrors A13's pattern from Plan 01-13 Wave 3D — let one segment land).
The 5s SAVE→file latency budget is measured FROM SAVE dispatch, NOT from the broader test orchestration.
</behavior>
<action>
Add `assertA25` to tests/uat/extension-page-harness.ts (mirror assertA24's structure):
```typescript
/**
 * A25 — REQ-archive-export-latency: SAVE_ARCHIVE → zip on disk in <5000ms.
 * CON-archive-export-latency + SPEC §10 #6 hard ceiling.
 *
 * Returns dispatch + ack timings (page-side); host-side driver merges the
 * dispatch→file-on-disk timing (mirrors A5's merged-checks pattern).
 */
async function assertA25(): Promise<AssertionResult & { t0: number; tAck: number }> {
  const checks: CheckRecord[] = [];
  const diagnostics: string[] = [];
  await setupFreshRecording();
  await new Promise(r => setTimeout(r, 11_000));
  const t0 = performance.now();
  // The ack is undefined if the message could not be delivered; guard with `?.`.
  const ack: { success: boolean } | undefined = await new Promise((resolve) => {
    chrome.runtime.sendMessage({ type: 'SAVE_ARCHIVE' }, resolve);
  });
  const tAck = performance.now();
  const elapsedAck = tAck - t0;
  checks.push({
    name: 'A25.1: SAVE_ARCHIVE ack received with success=true',
    expected: true,
    actual: ack?.success,
    passed: ack?.success === true,
  });
  checks.push({
    name: 'A25.2: dispatch → ack latency < 5000ms',
    expected: '<5000ms',
    actual: `${elapsedAck.toFixed(0)}ms`,
    passed: elapsedAck < 5000,
  });
  diagnostics.push(`page-side latency: t0=${t0.toFixed(0)} tAck=${tAck.toFixed(0)} delta=${elapsedAck.toFixed(0)}ms`);
  return {
    name: 'A25 — REQ-archive-export-latency: <5000ms (SPEC §10 #6)',
    checks,
    diagnostics,
    t0,
    tAck,
  };
}
```
Add `driveA25` with host-side latency merge (pattern after driveA5):
```typescript
export async function driveA25(page: Page, downloadsDir: string): Promise<AssertionRecord> {
  const preExisting = new Set(readdirSync(downloadsDir).filter(isZipFilename));
  const pageResult = await page.evaluate(async () => {
    const harness = (window as any).__mokoshHarness;
    return await harness.assertA25();
  }) as AssertionRecord & { t0: number; tAck: number };
  // page.evaluate spans setupFreshRecording + the 11s settle, so a timestamp taken before
  // it would add >11s to the measurement. Estimate the dispatch instant on the host clock
  // instead: assertA25 returns right after the ack, so subtract the page-side
  // dispatch→ack duration from the return time (error bounded by the evaluate return overhead).
  const tReturn = Date.now();
  const t0_host = Math.round(tReturn - (pageResult.tAck - pageResult.t0));
  // Host-side poll for new zip.
  let tFile: number | null = null;
  const pollStart = Date.now();
  while (Date.now() - pollStart < 6000 /* 1s budget over the 5s SLO for slack */) {
    const candidates = readdirSync(downloadsDir).filter(
      (name) => isZipFilename(name) && !preExisting.has(name),
    );
    if (candidates.length > 0) {
      tFile = Date.now();
      break;
    }
    await new Promise(r => setTimeout(r, 100));
  }
  const mergedChecks: CheckRecord[] = pageResult.checks.slice();
  const elapsedFile = tFile !== null ? tFile - t0_host : -1;
  mergedChecks.push({
    name: 'A25.3: host-side dispatch → zip-on-disk latency < 5000ms',
    expected: '<5000ms',
    actual: elapsedFile >= 0 ? `${elapsedFile}ms` : 'zip never appeared',
    passed: elapsedFile >= 0 && elapsedFile < 5000,
  });
  const mergedDiagnostics = pageResult.diagnostics.slice();
  mergedDiagnostics.push(`host-side latency: t0_host=${t0_host} tFile=${tFile ?? '<missing>'} delta=${elapsedFile}ms`);
  return {
    passed: mergedChecks.every(c => c.passed),
    name: pageResult.name,
    checks: mergedChecks,
    diagnostics: mergedDiagnostics,
    error: pageResult.error,
  };
}
```
Note: `isZipFilename` is already exported in harness-page-driver.ts (line 309) — reuse.
Per REQ-archive-export-latency + CON-archive-export-latency: this is the canonical SPEC §10 #6
empirical gate. The 5000ms budget is end-to-end (SAVE dispatch → file on disk).
</action>
<verify>
<automated>HEADLESS=1 npm run test:uat 2>&1 | grep -E "(A25|FAIL|PASS|latency)" | tail -20</automated>
</verify>
<done>
A25 page-side + driver wired with merged checks. 3 checks GREEN (ack received + page-side
latency + host-side latency, all <5000ms). Atomic commit:
`feat(02-04): harness A25 — empirical <5s SAVE→zip latency (REQ-archive-export-latency, SPEC §10 #6)`.
</done>
</task>
<task type="auto" tdd="true">
<name>Task 3: assertA26 + A27 — meta.urls shape + multi-tab dedup empirical (D-P2-02)</name>
<files>tests/uat/extension-page-harness.ts, tests/uat/lib/harness-page-driver.ts, tests/uat/harness.test.ts</files>
<behavior>
- assertA26 page-side: chain off A25's produced zip (read host-side filename via merged checks).
Use JSZip.loadAsync to read meta.json from the zip; JSON.parse; assert:
(a) Object.keys(meta).length === 8
(b) meta.schemaVersion === '2'
(c) Array.isArray(meta.urls) && meta.urls.length >= 1
(d) meta.url === undefined (the legacy single-URL field is gone)
(e) meta.urls.every(u => /^(https?|chrome-extension):\/\//.test(u))
- assertA27 page-side: requires multi-tab activation BEFORE save. Sequence:
(1) setupFreshRecording
(2) Open tab A (e.g., chrome.tabs.create({ url: 'https://example.com' })) and wait for it to load
(3) Activate tab A
(4) Open tab B (e.g., chrome.tabs.create({ url: 'https://www.iana.org' })) and wait
(5) Activate tab B
(6) Wait 11s for one segment to land
(7) Dispatch SAVE_ARCHIVE
(8) Read meta.urls from the produced zip
(9) Assert both example.com AND iana.org appear in meta.urls
(10) Cleanup: close both tabs
- PLANNER-NOTE on tabs permission limitation: A27 depends on chrome.tabs URL access. Per Plan
01-13 SUMMARY Known Limitations item 3, `tabs` permission is NOT declared and `chrome.tabs.query`
may return tabs without `.url`. RESOLUTION: A27 uses chrome.tabs.create + chrome.tabs.update
to drive tab activation directly, bypassing chrome.tabs.query. The tracker's chrome.tabs.onActivated
listener fires regardless of permission — it's the post-event chrome.tabs.get(tabId) inside the
tracker that may return undefined .url. If A27 reveals the gap empirically, surface a diagnostic
in the test result and defer the closure to Phase 4 hardening (CONTEXT.md `<deferred>` tabs
permission gap item). If A27 PASSES on the active-tab path, the limitation is non-blocking
for Phase 2.
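- driveA26 + driveA27: take (page, downloadsDir) and merge the host-side zip lookup + meta.json inspection into the page result (driveA5's merged-checks shape), rather than the bare driveA23/A24 wrapper.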
- Orchestrator update: tests/uat/harness.test.ts adds A24, A25, A26, A27 to the assertion
sequence AFTER A23. Total target: 24 (Phase 1 baseline) + 4 (Phase 2) = 28/28 GREEN.
- FORBIDDEN_HOOK_STRINGS lockstep: PLANNER-DECISION — A24/A25/A26/A27 use chrome.* monkey-patch
(A24's downloads spy) + JSZip + chrome.tabs.create/update which are production APIs. No new
test-hook surface required. Tier-1 inventory STAYS at 12. Plan-checker verifies via final grep.
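
The `meta.urls` contract that A26 and A27 check (scheme filter, dedup, first-seen order) can be modelled in a few lines. This is only a sketch of the expected output, not the Plan 02-03 tracker itself; `isTrackableUrl` and `recordUrl` are hypothetical names, and the allow-list reuses the regex from check (e) above.

```typescript
// Allow-list mirroring A26 check (e): web pages and extension pages only.
const TRACKABLE = /^(https?|chrome-extension):\/\//;

function isTrackableUrl(url: string): boolean {
  return TRACKABLE.test(url);
}

// Appends `url` to `seen` unless it is filtered out or already recorded.
// Returning the same array on a no-op keeps first-seen order intact.
function recordUrl(seen: string[], url: string): string[] {
  if (!isTrackableUrl(url) || seen.includes(url)) return seen;
  return [...seen, url];
}

// Simulated activation sequence during one recording window.
const visited = [
  'https://example.com/',
  'chrome://extensions/',
  'https://www.iana.org/',
  'https://example.com/',
  'about:blank',
];
const urls = visited.reduce(recordUrl, [] as string[]);
// urls → ['https://example.com/', 'https://www.iana.org/']
```

Under this model, A27's two-tab sequence should yield exactly two entries no matter how many times the operator switches between the tabs.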
</behavior>
<action>
Add `assertA26` and `assertA27` to tests/uat/extension-page-harness.ts:
```typescript
/**
 * A26 — D-P2-02 + D-P2-03 empirical: meta.json has the 8-field shape with urls[] (not url:string)
 * and schemaVersion='2'. Chains off A25's produced zip.
 */
async function assertA26(zipBytes: Uint8Array): Promise<AssertionResult> {
  const name = 'A26 — meta.json 8-field shape (D-P2-02/D-P2-03)';
  const checks: CheckRecord[] = [];
  const diagnostics: string[] = [];
  const zip = await JSZip.loadAsync(zipBytes);
  const metaFile = zip.file('meta.json');
  checks.push({
    name: 'A26.1: meta.json entry exists in zip',
    expected: true, actual: metaFile !== null, passed: metaFile !== null,
  });
  if (metaFile === null) {
    return { name, checks, diagnostics };
  }
  const metaText = await metaFile.async('string');
  const meta = JSON.parse(metaText);
  diagnostics.push(`meta.json keys: ${Object.keys(meta).join(',')}`);
  checks.push({
    name: 'A26.2: meta has exactly 8 fields',
    expected: 8, actual: Object.keys(meta).length, passed: Object.keys(meta).length === 8,
  });
  checks.push({
    name: 'A26.3: meta.schemaVersion === "2"',
    expected: '2', actual: meta.schemaVersion, passed: meta.schemaVersion === '2',
  });
  checks.push({
    name: 'A26.4: meta.urls is non-empty Array',
    expected: 'non-empty Array',
    actual: Array.isArray(meta.urls) ? `Array(${meta.urls.length})` : typeof meta.urls,
    passed: Array.isArray(meta.urls) && meta.urls.length >= 1,
  });
  checks.push({
    name: 'A26.5: meta.url (legacy field) is undefined',
    expected: 'undefined', actual: typeof meta.url, passed: meta.url === undefined,
  });
  // Evaluate the scheme allow-list once and reuse the result for actual + passed.
  const urlsWellFormed = Array.isArray(meta.urls)
    && meta.urls.every((u: string) => /^(https?|chrome-extension):\/\//.test(u));
  checks.push({
    name: 'A26.6: every meta.urls[i] matches /^(https?|chrome-extension):\\/\\//',
    expected: true,
    actual: urlsWellFormed,
    passed: urlsWellFormed,
  });
  return { name, checks, diagnostics };
}
/**
 * A27 — D-P2-02 empirical: multi-tab URL tracking. Activates two distinct
 * URLs during the recording window; meta.urls should contain both.
 */
async function assertA27(): Promise<AssertionResult> {
  const checks: CheckRecord[] = [];
  const diagnostics: string[] = [];
  await setupFreshRecording();
  const openedTabIds: number[] = [];
  try {
    // Open + activate 2 distinct tabs.
    const tabA = await chrome.tabs.create({ url: 'https://example.com', active: true });
    if (tabA.id !== undefined) openedTabIds.push(tabA.id);
    await new Promise(r => setTimeout(r, 1500));
    const tabB = await chrome.tabs.create({ url: 'https://www.iana.org', active: true });
    if (tabB.id !== undefined) openedTabIds.push(tabB.id);
    await new Promise(r => setTimeout(r, 1500));
    // Settle for one segment.
    await new Promise(r => setTimeout(r, 11_000));
    // SAVE. The ack is undefined if the message could not be delivered; guard with `?.`.
    const ack: { success: boolean } | undefined = await new Promise(resolve => {
      chrome.runtime.sendMessage({ type: 'SAVE_ARCHIVE' }, resolve);
    });
    checks.push({
      name: 'A27.1: SAVE_ARCHIVE ack received',
      expected: true, actual: ack?.success, passed: ack?.success === true,
    });
    diagnostics.push(`opened tabA=${tabA.id} tabB=${tabB.id}; meta.urls inspection deferred to host-side`);
  } finally {
    // Cleanup tabs even if SAVE throws, so later assertions start from a clean window.
    for (const id of openedTabIds) await chrome.tabs.remove(id);
  }
  return { name: 'A27 — multi-tab URL dedup (D-P2-02)', checks, diagnostics };
}
```
Add `driveA26(page, downloadsDir)` + `driveA27(page, downloadsDir)` with host-side zip-lookup
+ meta.urls inspection — pattern after driveA5's merged-checks shape.
For driveA27's host side: after the page returns the ack, locate the most-recent zip in
downloadsDir, load it via JSZip, parse meta.json, assert example.com AND iana.org are both
in meta.urls.
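
The host-side A27 check probably needs to match on host rather than on the exact string, since the tab is opened at `https://example.com` while Step 2.6 of the operator checkpoint expects `https://example.com/` in `meta.urls`. A minimal sketch, assuming the driver already holds the `meta.json` text extracted from the zip; `hostOf` and `missingHosts` are hypothetical helpers, not existing driver exports.

```typescript
// Extracts the host from an absolute URL; returns null when there is no scheme://host part.
function hostOf(url: string): string | null {
  const match = /^[a-z][a-z0-9+.-]*:\/\/([^\/?#:]+)/i.exec(url);
  return match?.[1]?.toLowerCase() ?? null;
}

// Returns the expected hosts that do NOT appear in meta.urls (empty array means pass).
function missingHosts(metaJsonText: string, expectedHosts: string[]): string[] {
  const meta: unknown = JSON.parse(metaJsonText);
  const urls = typeof meta === 'object' && meta !== null
    ? (meta as { urls?: unknown }).urls
    : undefined;
  const hosts = new Set<string | null>(
    Array.isArray(urls)
      ? urls.map((u: unknown) => (typeof u === 'string' ? hostOf(u) : null))
      : [],
  );
  return expectedHosts.filter((h) => !hosts.has(h));
}

const sampleMeta = JSON.stringify({
  schemaVersion: '2',
  urls: ['https://example.com/', 'https://www.iana.org/'],
});
const missing = missingHosts(sampleMeta, ['example.com', 'www.iana.org']);
// missing is an empty array when both hosts are present
```

A merged check could then report `missing.join(',')` as the `actual` value, which keeps the failure diagnostic readable when the `tabs` permission gap leaves one URL out.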
Update tests/uat/harness.test.ts orchestrator (search for the existing sequence around the
driveA23 dispatch) to add:
```typescript
drivers.push({ name: 'A24', fn: () => driveA24(page) });
drivers.push({ name: 'A25', fn: () => driveA25(page, handles.downloadsDir) });
drivers.push({ name: 'A26', fn: () => driveA26(page, handles.downloadsDir) });
drivers.push({ name: 'A27', fn: () => driveA27(page, handles.downloadsDir) });
```
Final target: 28/28 GREEN.
FORBIDDEN_HOOK_STRINGS verification: grep the new test files for any new symbols that might
leak into dist/. Expected: none (all symbols are local to extension-page-harness.ts which is
NOT built into dist/ — it's loaded via the test-harness page, not the production manifest).
Tier-1 inventory stays at 12.
Per D-P2-02 + D-P2-03 + REQ-archive-export-latency: A24+A25+A26+A27 collectively validate the
full Phase 2 contract empirically. After A27, Phase 2 functional contract is HARNESS-CLOSED
(analogous to Plan 01-13's role for Phase 1).
</action>
<verify>
<automated>HEADLESS=1 npm run test:uat 2>&1 | grep -E "(A26|A27|FAIL|PASS|28/28)" | tail -30</automated>
</verify>
<done>
A26 + A27 page-side + drivers wired. Orchestrator runs 28 assertions sequentially. ALL 28 GREEN.
FORBIDDEN_HOOK_STRINGS unchanged at 12. Atomic commit:
`feat(02-04): harness A26+A27 — empirical meta.json 8-field + multi-tab urls[] verification (D-P2-02/03)`.
</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<name>Task 4: Pre-checkpoint bundle gates + operator empirical checkpoint</name>
<files>(no files modified; verification-only — gate runs against existing dist/ + a real Chrome instance)</files>
<action>
**Step 1 — Pre-checkpoint bundle gates** (orchestrator-driven; per saved memory
`feedback-pre-checkpoint-bundle-gates.md`; MUST PASS before surfacing the empirical Step 2 to operator):
Run each in sequence; on first failure, surface a diagnostic to the operator + block the checkpoint:
1. `npm run build` → exit 0; dist/ populated.
2. SW CSP-safety grep: `grep -rE 'new Function\(|eval\(' dist/assets/index-*-bg.js` → 0 hits OR documented pre-existing exceptions only (e.g., setimmediate polyfill `new Function` per Plan 01-12 Wave 7 disclosure — exact-string match exception list documented in `.planning/phases/01-stabilize-video-pipeline/deferred-items.md`).
3. SW Node-globals grep: `grep -rE 'Buffer\.from|Buffer\.alloc|process\.|require\(' dist/assets/index-*-bg.js` → 0 hits.
4. DOM-globals grep: `grep -rE '(window\.|document\.)' dist/assets/index-*-bg.js | grep -vE '^//|globalThis|^$'` → 0 hits in SW chunk (DOM globals are forbidden in SW context — see DEC-006).
5. Tier-1 SW-bundle-import gate: `npx vitest run tests/background/sw-bundle-import.test.ts` → GREEN.
6. Tier-1 FORBIDDEN_HOOK_STRINGS gate: `npx vitest run tests/background/no-test-hooks-in-prod-bundle.test.ts` → 12 strings, 0 hits each → GREEN.
7. Manifest validation gate: `npx vitest run tests/i18n/manifest-i18n.test.ts tests/i18n/locale-parity.test.ts tests/build/` → GREEN.
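
The grep gates in items 2 to 4 amount to scanning the SW chunk line by line for forbidden patterns and then subtracting a documented allow-list. A rough TypeScript model of that logic may help when reasoning about what counts as a hit. `scanBundle`, `GateRule`, and `RULES` are hypothetical names; the patterns are copied from the grep commands above, and the real gates remain those commands.

```typescript
interface GateRule {
  name: string;
  pattern: RegExp;   // forbidden construct (no `g` flag, so test() stays stateless)
  allow?: string[];  // substrings that exempt a matching line
}

const RULES: GateRule[] = [
  { name: 'csp', pattern: /new Function\(|eval\(/ },
  { name: 'node-globals', pattern: /Buffer\.from|Buffer\.alloc|process\.|require\(/ },
  // Mirrors the `grep -vE 'globalThis'` exclusion in gate 4.
  { name: 'dom-globals', pattern: /window\.|document\./, allow: ['globalThis'] },
];

interface GateHit { rule: string; line: number; text: string; }

// Returns one hit per (line, rule) pair that matches and is not allow-listed.
function scanBundle(source: string, rules: GateRule[]): GateHit[] {
  const hits: GateHit[] = [];
  source.split('\n').forEach((text, i) => {
    for (const rule of rules) {
      if (!rule.pattern.test(text)) continue;
      if ((rule.allow ?? []).some((a) => text.includes(a))) continue;
      hits.push({ rule: rule.name, line: i + 1, text });
    }
  });
  return hits;
}

const sampleBundle = [
  'const a = 1;',
  'const b = Buffer.from("x");',
  'self.addEventListener("install", () => {});',
].join('\n');
const hits = scanBundle(sampleBundle, RULES);
// hits has one entry: rule 'node-globals' on line 2
```

An empty result corresponds to the "0 hits" outcome each gate expects. Any pre-existing exception (such as the setimmediate polyfill mentioned in gate 2) would be expressed as an `allow` entry rather than by loosening the pattern.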
**Step 2 — Operator-driven empirical UAT cycle 1** (manual, ~5 min):
Step 2.1 — Load unpacked extension from `dist/` into Chrome.
Expected: no warnings/errors in chrome://extensions/.
Step 2.2 — Open a tab with `https://example.com`. Open a second tab with `https://www.iana.org`.
Click the Mokosh toolbar icon → pick "Entire screen" in the picker.
Expected: REC badge appears; recording starts.
Step 2.3 — Switch between the two tabs a few times. Wait at least 15 seconds (one full segment lands).
Step 2.4 — Open the Mokosh popup. Click "Сохранить отчёт об ошибке" ("Save bug report"; or the i18n equivalent).
Expected within 5 seconds:
(a) A `session_report_*.zip` file lands in Downloads folder
(b) The popup transitions idle → "Сохраняю..." ("Saving...") → "Готово! ✓" ("Done! ✓") → idle (3s revert)
Step 2.5 — Open the zip with the OS archive manager.
Expected layout:
```
session_report_*.zip
├── video/last_30sec.webm
├── rrweb/session.json
├── logs/events.json
├── screenshot.png
└── meta.json
```
Step 2.6 — Open meta.json. Verify:
(a) `schemaVersion: "2"` present
(b) `urls` field is an ARRAY (not a string)
(c) `urls` contains at least `https://example.com/` AND `https://www.iana.org/`
(d) NO `url` field present (just `urls`)
(e) Exactly 8 keys total
Step 2.7 — Open video/last_30sec.webm in a browser (drag into a Chrome tab).
Expected: ~30 seconds of video plays end-to-end.
Step 2.8 — Verify the >2 MB archive case (the regression class P0-6 closes):
(a) In Chrome DevTools → Network panel of the Mokosh offscreen / extension context, observe
the download was initiated from a `blob:chrome-extension://<id>/<uuid>` URL (NOT a
`data:application/zip;base64,...`).
(b) If the archive is larger than 2 MB (typical for ≥15s of video), this proves the
D-P2-01 migration works for the canonical use case.
**Reply contract:** Type "approved" if Steps 1-2 all match expectations.
If any step deviates, describe the deviation (which step + what was observed + what was expected).
Deviations route to a follow-up plan (02-05 OR a debug session per saved memory
`feedback-gsd-ceremony-for-fixes.md` — NEVER hot-edit).
**Why human-verify (not auto):** Steps 2.1-2.8 require a real Chrome instance with screen-share
grant + a real OS Downloads folder + a real archive-manager tool. The harness in Tasks 1-3
covers the chrome.* + zip-shape contracts but cannot validate the OS-level archive integrity
(Step 2.5 archive-manager open) or the operator's empirical observation of the network panel
(Step 2.8a). Per saved memory `feedback-pre-checkpoint-bundle-gates.md`: "Operator time is for
things automation cannot verify" — Steps 2.5, 2.7, 2.8 are exactly those.
</action>
<verify>
<automated>npm run build 2>&1 | tail -5 ; npx vitest run tests/background/no-test-hooks-in-prod-bundle.test.ts tests/background/sw-bundle-import.test.ts tests/i18n/ tests/build/ 2>&1 | tail -10</automated>
</verify>
<done>
Step 1 gates ALL GREEN (7/7 sub-gates). Operator empirical reply received:
EITHER "approved" → Phase 2 closes, mark REQUIREMENTS.md REQ-archive-export-latency +
REQ-meta-json-schema + REQ-popup-ui + REQ-archive-layout + REQ-screenshot-on-export as Complete,
flip STATE.md Phase 2 → COMPLETE, update ROADMAP.md Phase 2 status;
OR deviations documented → route through `/gsd-debug` per saved memory + follow-up plan 02-05.
</done>
<what-built>
Phase 2 implementation complete:
- Plan 02-02: Offscreen-minted Blob URL pipeline (D-P2-01, closes P0-6).
- Plan 02-03: meta.json urls[] + schemaVersion + tab-url-tracker (D-P2-02, closes P1 #10).
- Plan 02-04 Tasks 1-3: UAT harness A24+A25+A26+A27 GREEN (28/28).
Acceptance baselines preserved: vitest GREEN, UAT GREEN, Tier-1 FORBIDDEN_HOOK_STRINGS = 12,
production bundle hook-free.
</what-built>
<how-to-verify>
See `<action>` block above — Step 1 (orchestrator) + Step 2 (operator) define the verification
contract verbatim. Reply contract codified in the `<action>` block.
</how-to-verify>
<resume-signal>Type "approved" or describe deviations (e.g., "Step 2.6c failed: meta.urls only had 1 entry, not 2")</resume-signal>
</task>
</tasks>
<threat_model>
## Trust Boundaries
| Boundary | Description |
|----------|-------------|
| harness page → chrome.downloads | privileged extension-internal page; spy via Proxy is intra-extension; not exposed to web pages |
| operator's manual zip inspection | filesystem boundary; archive may contain operator credentials per "log is internal" charter — operator decides if it's safe to share externally |
## STRIDE Threat Register
| Threat ID | Category | Component | Disposition | Mitigation Plan |
|-----------|----------|-----------|-------------|-----------------|
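| T-02-04-01 | Tampering | chrome.downloads.download spy left installed after A24 (failure to restore in finally) | mitigate | A24 uses try/finally to restore original chrome.downloads.download; orchestrator bails on first failure, so A25 would observe the spy if A24's finally fails. Add a sentinel test: A25 verifies that `chrome.downloads.download` is the original function (not the proxy) by comparing it against a reference captured before A24 installed the spy. A `chrome.downloads.download.toString()` check alone is not a reliable sentinel, because a callable `Proxy` also stringifies as `[native code]`. |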
| T-02-04-02 | Repudiation | A25 latency measurement skewed by harness startup overhead | accept | Measurement bookends bracket ONLY the SAVE_ARCHIVE dispatch → ack, NOT the broader test orchestration. setupFreshRecording + segment-settle happen BEFORE the t0 mark. |
| T-02-04-03 | Information Disclosure | A27 creates real tabs with example.com / iana.org which appear in real Downloads zips | accept | Per "log is internal" charter (CONTEXT.md re-phasing context); test tabs are public sites with no PII; downloadsDir is a per-run mkdtempSync, cleaned up by test runner. |
| T-02-04-04 | Denial of Service | A27 leaks tabs if cleanup fails (chrome.tabs.remove throws on already-closed tabs) | mitigate | A27's cleanup is wrapped in try/catch (silent-ignore — closing an already-closed tab is benign). |
| T-02-04-05 | Elevation of Privilege | New harness assertions reveal previously-uncovered chrome.* APIs to the test surface | accept | Approach B (Plan 01-13 SUMMARY): the harness page is a privileged extension-internal page that already has full chrome.* access. A24+A27 use chrome.downloads + chrome.tabs which are part of the existing extension capability set per manifest.json. No new manifest permissions in this plan. |
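
The spy lifecycle behind T-02-04-01 can be sketched as follows. This is an illustration under assumptions, not the harness code: `DownloadsLike` is a hypothetical stand-in for the part of `chrome.downloads` that the spy touches, and `installDownloadSpy`, `isRestored`, and `observeDownloadUrls` are hypothetical names. The sentinel compares function identity against a reference captured before the spy was installed, since a callable `Proxy` also stringifies as native code.

```typescript
// Hypothetical stand-in for the part of chrome.downloads the spy touches,
// so the sketch can run outside an extension context.
interface DownloadsLike {
  download(options: { url: string }): Promise<number>;
}

interface DownloadSpy {
  seenUrls: string[];
  restore: () => void;
}

// Wraps api.download in a Proxy that records each requested URL and then
// forwards the call unchanged to the original function.
function installDownloadSpy(api: DownloadsLike): DownloadSpy {
  const original = api.download;
  const seenUrls: string[] = [];
  api.download = new Proxy(original, {
    apply(target, thisArg, args: [{ url: string }]) {
      seenUrls.push(args[0].url);
      return Reflect.apply(target, thisArg, args);
    },
  });
  return {
    seenUrls,
    restore: () => {
      api.download = original;
    },
  };
}

// Sentinel for a later assertion: identity check against a reference that
// was captured before any spy was installed.
function isRestored(
  api: DownloadsLike,
  pristine: DownloadsLike["download"],
): boolean {
  return api.download === pristine;
}

// Usage pattern: the spy is removed in finally, even if body throws.
async function observeDownloadUrls(
  api: DownloadsLike,
  body: () => Promise<void>,
): Promise<string[]> {
  const spy = installDownloadSpy(api);
  try {
    await body();
  } finally {
    spy.restore();
  }
  return spy.seenUrls;
}
```

Capturing the pristine reference once at harness startup would let any later assertion run the sentinel without depending on the earlier assertion's bookkeeping.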
</threat_model>
<verification>
- `npm run build` → clean.
- `npm run build:test` → clean.
- `npx tsc --noEmit` → clean.
- Pre-checkpoint bundle gates (Task 4 Step 1):
  - `tests/background/no-test-hooks-in-prod-bundle.test.ts` → 12 strings, all 0 hits → GREEN.
  - `tests/background/sw-bundle-import.test.ts` → GREEN.
  - `tests/build/no-remote-fonts.test.ts` → GREEN.
  - `tests/build/icons-present.test.ts` → GREEN.
  - `tests/build/fonts-present.test.ts` → GREEN.
  - `tests/i18n/manifest-i18n.test.ts` + `tests/i18n/locale-parity.test.ts` → GREEN.
- `npm test` (full suite) → GREEN.
- `HEADLESS=1 npm run test:uat` → 28/28 GREEN.
- Operator empirical Task 4 Step 2 → "approved" or surfaces deviations to drive Plan 02-05 follow-up.
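
The "12 strings, all 0 hits" gate amounts to a substring scan over the built bundle. A minimal sketch of that idea, assuming the bundle is available as a string: `countForbiddenHits` is a hypothetical name, and the actual FORBIDDEN_HOOK_STRINGS inventory is not listed in this section, so callers supply their own list.

```typescript
// Hypothetical scanner: counts non-overlapping occurrences of each
// forbidden string in the bundle text.
function countForbiddenHits(
  bundleText: string,
  forbidden: readonly string[],
): Map<string, number> {
  const hits = new Map<string, number>();
  for (const needle of forbidden) {
    if (needle.length === 0) {
      // An empty needle matches everywhere and would never advance.
      throw new Error("forbidden strings must be non-empty");
    }
    let count = 0;
    let from = bundleText.indexOf(needle);
    while (from !== -1) {
      count += 1;
      from = bundleText.indexOf(needle, from + needle.length);
    }
    hits.set(needle, count);
  }
  return hits;
}

// The gate passes only when every forbidden string has zero hits.
function bundleIsHookFree(hits: Map<string, number>): boolean {
  let clean = true;
  hits.forEach((count) => {
    if (count !== 0) clean = false;
  });
  return clean;
}
```

Reporting per-string counts rather than a single boolean makes a failing gate easier to trace to the offending symbol.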
</verification>
<success_criteria>
1. UAT harness 28/28 GREEN (A0-A23 Phase 1 baseline + A24+A25+A26+A27 Phase 2 extension).
2. Vitest full suite GREEN (Phase 1 baseline + Plan 02-01 RED tests now all GREEN post-Plans 02-02 + 02-03).
3. Tier-1 FORBIDDEN_HOOK_STRINGS = 12 (unchanged — A24+A27 use chrome.* monkey-patch + production APIs).
4. Pre-checkpoint bundle gates PASS per `feedback-pre-checkpoint-bundle-gates.md`.
5. Operator empirical UAT cycle 1 ack "approved" OR documented deviations.
6. Phase 2 closure marker flippable: REQUIREMENTS.md + STATE.md + ROADMAP.md (the ROADMAP.md update is the orchestrator's responsibility post-checkpoint).
</success_criteria>
<output>
After completion, create `.planning/phases/02-stabilize-export-pipeline/02-04-SUMMARY.md`
documenting:
- 4 new harness assertions (A24+A25+A26+A27) with their check-counts and rationale.
- Tier-1 FORBIDDEN_HOOK_STRINGS unchanged at 12 (architectural rationale: chrome.* spy / production APIs vs new test-hook symbols).
- Pre-checkpoint bundle gate run record (each gate result inline).
- Operator empirical ack quote (verbatim) OR list of deviations + follow-up plan-pointer.
- Phase 2 closure summary: 4/4 plans landed; vitest + UAT GREEN; P0-6 + P1 #10 closed; meta.json 8-field schema shipped.
- Forward link: Phase 3 (SPEC §10 smoke + DOM/event-log verification) inherits the harness as its closure template (mirrors Plan 01-13's role for Phase 2).
</output>