wip: Phase 8 paused — code-complete, auto-prov research needed for Task 5 close

Phase 8 (Windows Platform) is CODE-COMPLETE — 16 commits across 3 plans landed atomically. AP-4 byte-frozen invariant intact (main.go = 209 lines). D-08-28 test pyramid GREEN on Linux CI. Plan 8-3 Task 5 (autonomous headless QEMU integration_test gate) is CHECKPOINT-blocked on auto-provisioning research: - D-08-28 locked 'automated test pyramid because operator lacks Windows' - but Plan 8-3 Task 5 required manual 2-4h VM provisioning (planning miss) - Operator surfaced: 'There are tools which make it automatically. ... That is the correct way.' - Research dispatch pre-flight-bailed at 65% context (would have produced under-cited research; correct call per prior overclaim-catching pattern) - Pausing for fresh-session research dispatch on Packer / dockurr/windows / Vagrant / virt-install comparison Next action: /gsd-resume-work + dispatch the auto-prov research with full context budget. Then lock D-08-29 + amend plans + run auto-prov + build. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 14:40:15 +02:00
parent 510632836d
commit 59694f66f7
2 changed files with 160 additions and 0 deletions
--- a/.planning/HANDOFF.json
+++ b/.planning/HANDOFF.json
@@ -0,0 +1,37 @@
+{
+  "version": "1.0",
+  "timestamp": "2026-05-19T12:37:46Z",
+  "phase": "8",
+  "phase_name": "Windows Platform",
+  "phase_dir": ".planning/phases/08-windows-platform",
+  "plan": "08-03",
+  "task": 5,
+  "total_tasks": 5,
+  "status": "code-complete-task5-checkpoint-blocked-needs-auto-prov-research",
+  "session_summary": "Phase 8 CODE-COMPLETE — all 3 plans (08-01/08-02/08-03) executed; 16 commits landed. AP-4 invariant preserved (main.go byte-identical 209 lines). D-08-27 surgical lift integrated (connection/manager.go + xray/runtime.go + proxy/engine.go + xray/dispatcher_counter.go build-tag splits + //go:build !windows stickers on platform/android/*). D-08-28 test pyramid GREEN on Linux CI (40 Dart tests + Go tests across platform/windows + connection + xray + proxy). Plan 8-3 Task 5 (autonomous headless QEMU integration_test gate) is BLOCKED on operator VM provisioning — but this was IDENTIFIED AS A PLANNING MISS: D-08-28's lock was 'no manual steps' yet Task 5 requires 2-4h manual VM setup. Operator: 'There are tools which make it automatically. Build the zip for me now, and then let's plan automatic provisioning for tests. That is the correct way.' Research dispatch attempted (auto-prov tool comparison: dockurr/windows vs Packer vs Vagrant vs virt-install) but executor correctly pre-flighted and bailed at 65% context — would have produced under-cited research. Right call per operator's prior pattern of catching overclaims. PAUSED HERE for clean restart with fresh context budget on the auto-prov research dispatch.",
+  "completed_tasks": [
+    {"id": "08-01", "name": "D-port Bridge Foundation (Pigeon C++ + hand-rolled EventChannel Variant B + Go main_windows.go)", "status": "done", "commits": ["a4ed7dc", "59b744f", "c1ec79f", "9296034", "5f1b5ec", "70ce128"]},
+    {"id": "08-02", "name": "VPN Plumbing + D-08-27 Surgical Lift (Wintun + WFP via vendored wireguard-windows/firewall + DNS + network-change + xray log sink + connection/platform_lifecycle.go interface + 11 Go test files)", "status": "done", "commits": ["8354a5b", "2e388f3", "47246ca", "7ba1498", "f358869"]},
+    {"id": "08-03 T1-T4", "name": "Packaging + UX (UAC manifest + crash handler + UAC-restart modal + CMake install + cat .ico + build-windows.sh + READMEs + Riverpod state test + integration_test/windows_smoke_test.dart + tools/windows/run-headless-vm-test.sh)", "status": "done", "commits": ["40ace95", "418b31a", "1708993", "24f08e8", "5106328"]},
+    {"id": "08-03 T5", "name": "VM smoke gate (autonomous headless QEMU integration_test)", "status": "checkpoint-blocked", "blocker": "Operator's Linux dev box lacks QEMU Win 11 VM provisioning; Task 5 requires VM image at ~/.prowler-windows-vm/win11-test.qcow2 + SSH key + Windows-side dev tools (VS 2022 BT + MSYS2/mingw-w64 + Go + Flutter). Build-windows.sh aborts at Step 1 with 'PowerShell not on PATH' defensive check fire."}
+  ],
+  "remaining_tasks": [
+    {"id": "auto-prov-research", "name": "Research auto-provisioning Win 11 VM (Packer vs dockurr/windows vs Vagrant vs virt-install) — dispatched but pre-flight-bailed at 65% context; restart in fresh session", "status": "not_started", "scope": "Decision-supporting document at .planning/phases/08-windows-platform/08-RESEARCH-vm-autoprov.md per the dispatch prompt body documented inline below"},
+    {"id": "auto-prov-amendment", "name": "Lock D-08-29 auto-provisioning strategy in CONTEXT.md based on research verdict", "status": "not_started"},
+    {"id": "auto-prov-impl", "name": "Plan amendment OR new Plan 8-4 to implement the auto-provisioning per locked D-08-29", "status": "not_started"},
+    {"id": "auto-prov-run", "name": "Run auto-prov ~1-2hr unattended → v0.8.0 ZIP + integration_test PASS", "status": "not_started"},
+    {"id": "phase-8-close", "name": "Verifier pass + 08-LEARNINGS extraction + ROADMAP Phase 8 [x] + STATE 8/9 increment", "status": "not_started"}
+  ],
+  "blockers": [
+    {"description": "Plan 8-3 Task 5 autonomous gate requires Win 11 QEMU VM not yet provisioned", "type": "infrastructure", "workaround": "Auto-provisioning research + implementation (this pause's primary next-step)"}
+  ],
+  "human_actions_pending": [],
+  "decisions": [
+    {"decision": "Pause here; restart fresh session for auto-prov research", "rationale": "Researcher pre-flighted at 65% context and correctly bailed — would have produced under-cited research violating quality_gate. Operator's prior pattern: 'caught real overclaims in prior research rounds.' Half-cited research that gets locked is worse than no research.", "phase": "8"},
+    {"decision": "Plan 8-3 Task 5 manual-prov design is acknowledged as a planning miss", "rationale": "D-08-28 locked 'automated test pyramid because operator does not possess a Windows machine' but Task 5 still required manual 2-4h VM provisioning. Operator surfaced this: 'That is the correct way' — auto-prov is the right shape.", "phase": "8"},
+    {"decision": "Auto-prov infrastructure will be used for BOTH the v0.8.0 build AND ongoing D-08-28 testing (same code path)", "rationale": "No separate 'build vs test' infrastructure. scripts/build-windows.sh Step 0 = if-VM-missing-provision-VM; Step 11 = run-integration-test-in-VM. One mechanism, two consumers.", "phase": "8"}
+  ],
+  "uncommitted_files": [],
+  "context_notes": "Phase 8 is at a clean stopping point: all 3 plans executed atomically, 16 commits in lineage, AP-4 byte-frozen paths verified, STATE.md progress block preserved at 9/7/40/40/100 (post-helper-regression-restore-discipline maintained throughout). The path forward is well-defined: focused auto-prov research → locked D-08-29 → amendment → run → ZIP. Total estimated time-to-ship: ~3-4 hours wall-clock once research is locked (15-20 min agent for amendment + 1-2 hr unattended for the actual provision-and-build run). Phase 8 LEARNINGS already accumulating: (1) stream-idle timeout on long-running executors at ~100+ tool uses (Wave 1 first dispatch died at 105; recovered via spot-check fallback + smaller-scope re-dispatch); (2) auto-provisioning must be part of the design from the start when operator lacks platform access (not a bolt-on after manual fallback); (3) plan-revision-loop max 3 was sufficient (used 2 of 3 for connect-path blocker + D-08-28 test pyramid).",
+  "next_action": "/gsd-resume-work to load this handoff. Then dispatch the auto-prov research with the prompt documented in 08-RESEARCH-vm-autoprov-prompt-deferred.md (TO BE WRITTEN during pause OR can be re-derived from this session's context). Expected research output: locked tool (likely dockurr/windows or Packer + qemu builder) + implementation sketch. Then plan amendment + run."
+}
--- a/.planning/phases/08-windows-platform/.continue-here.md
+++ b/.planning/phases/08-windows-platform/.continue-here.md
@@ -0,0 +1,123 @@
+---
+context: phase
+phase: 08-windows-platform
+task: 5
+total_tasks: 5
+status: code-complete-task5-checkpoint-needs-auto-prov-research
+last_updated: 2026-05-19T12:37:46Z
+---
+
+# 🟡 Phase 8 CODE-COMPLETE — auto-provisioning research needed before close
+
+Phase 8 (Windows Platform) is **CODE-COMPLETE** — all 3 plans executed atomically. 16 commits landed. AP-4 byte-frozen invariant intact (main.go = 209 lines). D-08-28 test pyramid GREEN on Linux CI. v0.8.0 ZIP will build the moment `scripts/build-windows.sh` can find a Win 11 VM to run inside.
+
+**Task 5 (autonomous headless QEMU integration_test gate) is CHECKPOINT-blocked** — but it's a *planning miss*, not a code defect. D-08-28 locked "automated test pyramid because operator lacks Windows machine" yet Task 5 still requires 2-4h of manual VM clicks. Operator flagged this: *"There are tools which make it automatically. ... That is the correct way."* The fix is locked: auto-provisioning via Packer / dockurr/windows / virt-install, decided by research dive.
+
+## Critical Anti-Patterns
+
+| Pattern | Description | Severity | Prevention Mechanism |
+|---------|-------------|----------|---------------------|
+| Manual-step bolt-on after locking "automated" | D-08-28 said "automated test pyramid because operator can't manually verify" — but Plan 8-3 Task 5 ended up requiring 2-4h manual VM setup. Lock-letter and lock-spirit diverged. | **blocking** | When a lock specifies "automated" or "no manual steps," every dependent task must trace back to that constraint. Auto-provision-on-first-run is the right shape, not "operator does it once." |
+| Stream-idle timeout on long-running executors at ~100+ tool uses | Plan 8-1 first dispatch died at 105 tool uses / ~20 min wall-clock (Anthropic API SSE layer timeout, NOT a code error). Recovered via spot-check fallback + smaller-scope re-dispatch. | advisory | For plans with 4+ tasks: instruct executor to commit per-task atomically + create partial SUMMARY if approaching 80 tool uses + return for continuation. Smaller scopes reduce SSE pressure. |
+| gsd-sdk state.* helpers regress STATE.md progress | RECURRING (Phase 7 + Phase 8). state.begin-phase rewrote progress to stale `6/3/43/39/91` from correct `9/7/40/40/100`. Caught by memory `project_gsd_sdk_state_helper_regression`. | **blocking** | Use direct Edit for STATE.md progress block; if a state.* helper fires, immediately verify + Edit-restore. Memory captures this verbatim. |
+| Researcher quality-gate pre-flight bail at 65% context | New finding this session. Auto-prov research dispatch correctly pre-flighted: 9 tasks + 400-600 line target + citation gates needed ~15-25 WebFetch calls but only 35% context budget remained → researcher refused to half-cite. | **blocking** | Dispatch deep-research only with FULL context budget available (≥75%). For mid-session ad-hoc research, scope must be ≤3 tasks AND ≤150 lines. Phase 8 example: restart fresh session for the auto-prov research dive. |
+
+## Required Reading (in order)
+
+1. `.planning/HANDOFF.json` — machine-readable session state (read this first)
+2. `.planning/STATE.md` — current `9/7/40/40/100` progress block; Session Continuity section
+3. `.planning/phases/08-windows-platform/08-CONTEXT.md` — 28 locked decisions (D-08-01..28)
+4. `.planning/phases/08-windows-platform/08-03-SUMMARY.md` — Plan 8-3 close-out with Task 5 CHECKPOINT documentation
+5. `.planning/phases/08-windows-platform/08-RESEARCH.md` — prescriptive implementation guide (Code Examples + File Inventory + Validation Gates)
+6. `.planning/phases/08-windows-platform/08-PATTERNS.md` — codebase analog map with line-range citations
+7. `prowler-client/windows/BUILD.md` — current manual provisioning instructions (will be REPLACED by auto-prov post-research)
+8. `prowler-client/windows/SMOKE-TEST-CHECKLIST.md` — what the integration test asserts (informs auto-prov VM-readiness checks)
+
+## Current State
+
+**Phase 8 commit lineage (16 commits, oldest → newest):**
+
+| Wave | Commit | Description |
+|------|--------|-------------|
+| 1 | `a4ed7dc` | 4 AP-4 EXCEPTION docs (D-08-23 ×2 + D-08-18 + D-08-27) |
+| 1 | `59b744f` | Task 2 — Flutter Windows scaffold (rescued from stream-idle) |
+| 1 | `c1ec79f` | Task 2 finish — Pigeon C++ regen + platform/windows/ anchor + firewall/COPYING |
+| 1 | `9296034` | Task 3 — D-port bridge impl (Pigeon HostApi C++ + hand-rolled EventChannel Variant B + main_windows.go) |
+| 1 | `5f1b5ec` | Task 4 — Dart unit tests for D-08-28 test pyramid |
+| 1 | `70ce128` | 08-01-SUMMARY.md |
+| 2 | `8354a5b` | Task 1 — vendor wireguard-windows/firewall/ + firewall_session.go glue |
+| 2 | `2e388f3` | Task 2 — platform/windows VPN plumbing (6 Go files) |
+| 2 | `47246ca` | Task 3 — AP-4 EXCEPTION D-08-27 surgical lift (Windows-Connect-path wiring) |
+| 2 | `7ba1498` | Task 4 — D-08-28 Go unit tests (os_interfaces.go production seam + 10 test files) |
+| 2 | `f358869` | 08-02-SUMMARY.md |
+| 3 | `40ace95` | Task 1 — UAC manifest + crash handler + UAC-restart modal + CMake install + .ico |
+| 3 | `418b31a` | Task 2 — scripts/build-windows.sh + BUILD.md + READMEs + SMOKE-TEST-CHECKLIST.md |
+| 3 | `1708993` | Task 3 — Riverpod wintun_admin_required_state unit test |
+| 3 | `24f08e8` | Task 4 — integration_test/windows_smoke_test.dart + run-headless-vm-test.sh |
+| 3 | `5106328` | Task 5 — VM smoke gate CHECKPOINT + 08-03-SUMMARY.md + STATE/ROADMAP close-out |
+
+**Invariants verified at pause:**
+
+- `wc -l prowler-server/core/main.go` → **209** (AP-4 byte-identical)
+- STATE.md `progress:` block → **9/7/40/40/100** preserved
+- `flutter analyze lib/ windows/runner/ pigeons/ test/` → clean
+- `flutter test test/unit/ test/state/ test/widgets/` → all green (40 tests pass)
+- `CGO_ENABLED=1 GOOS=linux go build ./...` → clean
+- `CGO_ENABLED=0 GOOS=windows GOARCH=amd64 go vet ./...` → clean
+- `bash -n scripts/build-windows.sh` → clean syntax
+- `bash -n tools/windows/run-headless-vm-test.sh` → clean syntax
+
+## Completed Work
+
+- All 28 D-08-XX decisions implemented (D-08-01..28 in CONTEXT.md; D-08-27 lift + D-08-28 test pyramid added during plan-revision passes 1 & 2)
+- 3 plans fully executed in sequential waves
+- D-port bridge (Pigeon @HostApi C++ + hand-rolled flutter::EventChannel<EncodableValue> Variant B with mandatory std::mutex sink_ + std::shared_ptr lifecycle + PostMessageW marshal) — landed and tested
+- D-08-27 surgical AP-4 lift complete (build-tag splits across connection/manager.go + xray/runtime.go + proxy/engine.go + xray/dispatcher_counter.go + //go:build !windows stickers on 3 platform/android/*.go production files + new connection/platform_lifecycle.go interface + lifecycle_windows.go implementation registered via main_windows.go::init())
+- D-08-28 test pyramid GREEN on Linux CI (40 Dart tests + Go unit tests with os_interfaces.go production seam)
+- WFP killswitch via vendored wireguard-windows/firewall/ MIT package (commit 8354a5b)
+- Wintun ephemeral adapter lifecycle (D-08-03 + D-08-05 "Prowler Network" friendly-name)
+- DNS push/restore via netsh with state-file at %LOCALAPPDATA%\Prowler\dns-restore.json (D-08-02)
+- Crash handler via MiniDumpWriteDump + SEH + 10 MB rotation (D-08-16)
+- UAC requireAdministrator manifest + cat .ico from Phase 7 PNG + UAC-restart-as-admin modal via Riverpod Consumer + reused ConnectErrorCode.consentDenied + 'wintun-admin-required' discriminator (D-08-01 + D-08-22 + CP-4 close)
+- README-en.md 7-step first-launch walkthrough (D-08-19 + D-08-26 NVIDIA known-issue section)
+- README-ru.md placeholder per D-08-20 translator-brief deferral
+- scripts/build-windows.sh single build entry point (Google bash style + defensive checks + SHA-256 wintun pin verification + Step 11 run_headless_vm_test invocation between verify_zip and BUILD SUCCESS marker)
+- tools/windows/run-headless-vm-test.sh (Google bash style + 5 distinct exit codes + trap cleanup + headless QEMU -display none + SSH-based test execution)
+
+## Remaining Work
+
+**Path to v0.8.0 ZIP + Phase 8 close:**
+
+1. **Fresh-session research dispatch** — auto-provisioning Win 11 VM tool comparison (Packer vs dockurr/windows vs Vagrant vs virt-install). Output: `08-RESEARCH-vm-autoprov.md` with HIGH-confidence recommendation per the quality_gate (every tool claim cited with version + last-release-date + URL; real-world examples cited with specific repo + file paths). The aborted dispatch documented its prompt body inline — restart will use the same prompt.
+2. **Lock D-08-29** in CONTEXT.md — auto-provisioning strategy based on research verdict.
+3. **Amend plans** — either add to Plan 8-3 (one new task: "auto-prov infrastructure") OR create Plan 8-4 dedicated to auto-prov. Planner's discretion based on research-recommended tool's footprint.
+4. **Plan-checker pass** on the amendment (1 of 1 remaining revision-budget pass).
+5. **Run auto-prov + build** — `bash scripts/build-windows.sh` end-to-end. Auto-provisions VM on first invocation (~1-2 hr unattended), then builds ZIP + runs integration test. Exit 0 = Phase 8 ship-ready.
+6. **Phase 8 close-out** — `/gsd-verify-work` Phase 8 → `/gsd-extract-learnings 8` → ROADMAP Phase 8 `[x]` → STATE 8/9 increment.
+
+## Decisions Made (this session)
+
+- Auto-provisioning was identified as a planning miss in D-08-28 (Plan 8-3 Task 5 violated lock-spirit by requiring manual VM setup); locked as the next architectural decision (D-08-29 placeholder).
+- Auto-prov infrastructure will be used for BOTH the v0.8.0 build AND ongoing D-08-28 testing (same code path; no "build vs test" infra split).
+- "Clean is better" — research-first before locking auto-prov tool (Packer vs dockurr/windows vs Vagrant vs virt-install).
+- Pause-and-restart over half-cited mid-session research; matches operator's pattern of catching overclaims in prior research rounds (NVIDIA HIGH-confidence reversal at Round 2b; wireguard_dart-as-Pigeon-twin reversal at Round 5).
+
+## Blockers
+
+- **Plan 8-3 Task 5 autonomous gate** is BLOCKED on auto-prov-research → amendment → run. No code work needed; just dispatch the research in fresh context and follow through.
+
+## Infrastructure State
+
+- Linux dev box (Gentoo) — `qemu-system-x86_64`, `qemu-img`, `edk2-ovmf` (UEFI firmware) available; NO Packer, NO Vagrant, NO Docker (would be added per research verdict)
+- Wine present but limited (no winetricks/proton) — not a usable Flutter desktop runner
+- No git remote — GH Actions path closed (also explicitly rejected by D-08-07)
+- AVD not running — irrelevant for Phase 8 (Android testing was Phase 7)
+- v0.7.0 Android APK ship-ready at `~/prowler-debug-v0.7.0-multiabi.apk` sha 972f26a6… (unchanged)
+
+## Next Action
+
+`/gsd-resume-work` to load this handoff. Then dispatch the auto-prov research in fresh context. The deferred research-dispatch prompt is essentially:
+
+> Research auto-provisioning a headless Win 11 VM for Flutter+Go+Wintun desktop builds. Compare Packer (qemu builder), dockurr/windows Docker, Vagrant + community Win 11 boxes, virt-install + libvirt + autounattend.xml, custom Bash + qemu-system-x86_64 + autounattend.xml. Cover ISO sourcing (Fido, UUP dump, dockurr auto-download, Vagrant Cloud), autounattend.xml authoring for Win 11 24H2 (skip OOBE + MSA bypass + TPM bypass), dev-tool install (VS 2022 BT + MSYS2/mingw-w64 + Go + Flutter + ImageMagick), idempotency + image caching. Output prescriptive HIGH-confidence recommendation to `.planning/phases/08-windows-platform/08-RESEARCH-vm-autoprov.md` (~400-600 lines).
+
+Full prompt body preserved in this session's transcript (Agent dispatch `ae9be921e4566fc87` body).