SDD v1: Server Infrastructure
Status: Draft | Date: 2026-04-14 | Owner: CTO / R&D | Ref: PRD v1
0. Protocol & transport selection
Layer 1: Direct proxy — DECIDED (ADR-001)
VLESS+Reality+XTLS-Vision via Xray-core. The client connects directly to the EU exit VPS on TCP 443. Reality presents the TLS handshake of a legitimate site; Vision removes the TLS-in-TLS signature and makes the connection eligible for kernel splice. Not whitelist-proof: it fails under mobile CIDR whitelisting, which is what Layer 3 is for.
Future additions (not v1): ShadowTLS v3+SS2022, AnyTLS, NaïveProxy, AmneziaWG 2.0.
Layer 3: Domestic relay chain — DECIDED (ADR-003)
Client → Yandex Cloud relay (Moscow) → Timeweb EU exit (Netherlands)
Transport shape differs per leg. Client→relay: VLESS+Reality+Vision (TCP). Relay→exit: VLESS over standard TLS+XHTTP packet-up (no Reality). Both legs on TCP 443. Relay runs Xray-core with transport-level outbound chaining (dialerProxy) to exit VPS. Survives TSPU mobile whitelisting because client connects to a domestic Russian IP. XHTTP packet-up on relay→exit fragments uploads across multiple HTTP requests, surviving the cross-border 16KB curtain. Cloudflare is not used.
Port choice: 443 on both legs, per upstream Xray-core v26.3.27 guidance. Non-443 Reality listeners are flagged as a high IP-burn risk by the project maintainer.
Transport rationale per leg: Vision is Splice-eligible (kernel zero-copy), fastest Xray transport, but maintainer has stated it is not designed for transit — so it is used only on the two client-facing legs, not on the relay-outbound. XHTTP packet-up is the current Russian-community canon for cross-border cloud-to-cloud legs because it fragments the TLS record stream into many HTTP request bodies, denying the curtain a coherent stream to clip. XHTTP parameters on relay→exit: mode: packet-up, xPaddingBytes: "100-1000" (default).
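The relay→exit parameters above can be sketched as a config fragment built in Python. This is a minimal illustration, not the deployed config: the address, UUID, path, and the `exit-nl-1` tag value are placeholders, and the exact placement of `xPaddingBytes` (shown here under `extra`) is an assumption to verify against the pinned Xray-core release.

```python
import json


def relay_exit_outbound(tag: str, address: str, user_id: str,
                        server_name: str, path: str) -> dict:
    """Build a VLESS outbound for the relay→exit leg: standard TLS + XHTTP packet-up."""
    return {
        "tag": tag,
        "protocol": "vless",
        "settings": {
            "vnext": [{
                "address": address,
                "port": 443,  # both legs stay on TCP 443
                # No "flow" key: Vision is not used on the relay-outbound leg.
                "users": [{"id": user_id, "encryption": "none"}],
            }]
        },
        "streamSettings": {
            "network": "xhttp",
            "security": "tls",  # standard TLS, no Reality on this leg
            "tlsSettings": {"serverName": server_name},
            "xhttpSettings": {
                "path": path,
                "mode": "packet-up",
                # Assumed placement; "100-1000" is the default named in the text.
                "extra": {"xPaddingBytes": "100-1000"},
            },
        },
    }


# All values below are hypothetical placeholders.
outbound = relay_exit_outbound(
    tag="exit-nl-1",
    address="exit.example.org",
    user_id="00000000-0000-0000-0000-000000000000",
    server_name="exit.example.org",
    path="/hypothetical-path",
)
print(json.dumps(outbound, indent=2))
```

Generating the fragment from a function keeps the per-leg differences (no Vision flow, no Reality) in one reviewable place, which may suit templating from Ansible.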
SNI on Reality (exit-side Layer 1 inbound and client→relay): self-steal on operator-owned domain resolving to the respective VPS, with valid Let's Encrypt cert, TLSv1.3+H2, non-trivial static content. Not google.com or third-party dest.
Relay VPS: Yandex Cloud, ~$5–10/mo, non-resident business account (USD via Visa/MC). Not all Yandex IPs are whitelisted — verify by testing HTTPS access from Russian mobile without VPN.
Exit VPS: Timeweb Cloud EU (Netherlands), ~$5–6/mo.
Relay routing: Russian domains (geosite:category-ru) go direct from relay, not through EU exit — avoids suspicious cross-border round-trip for domestic traffic.
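A sketch of the relay's routing rules under this policy, built in Python. It assumes a `freedom` outbound tagged `direct` and a balancer tagged `exits` (the tag names are illustrative). Rule order matters because Xray evaluates rules top to bottom and the first match wins.

```python
import json

# Assumed direct outbound on the relay; "freedom" sends traffic out unproxied.
DIRECT_OUTBOUND = {"tag": "direct", "protocol": "freedom"}


def relay_routing_rules(exit_balancer: str = "exits") -> list:
    """Return routing rules: Russian domains go direct, everything else to the exits."""
    return [
        # Domestic traffic leaves from the relay itself, avoiding a cross-border round-trip.
        {"type": "field", "domain": ["geosite:category-ru"], "outboundTag": "direct"},
        # Catch-all: remaining traffic is handed to the exit balancer.
        {"type": "field", "network": "tcp,udp", "balancerTag": exit_balancer},
    ]


rules = relay_routing_rules()
print(json.dumps(rules, indent=2))
```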
Server hardening on port 443: no fail2ban, no GeoIP filtering, no rate limiting. Reality's cover story must behave identically to the dest server for all unauthenticated observers. Hardening only on other ports (SSH on 22) or within authenticated proxy sessions via Xray routing rules.
1. Proxy engine — DECIDED (ADR-002)
Xray-core v26.3.27 for v1, pinned. Canonical Reality/XHTTP implementation. ~20–50 MB idle RAM. It has a known memory leak under XHTTP load (GitHub #5719), which is manageable with periodic restarts via Docker health checks or cron.
Version policy: pin to a specific release, never floating tags. Upgrade cadence is event-driven — changelog review triggered by TSPU-relevant fixes, detection-evasion patches, or CVEs. No scheduled upgrades. Pin location (compose tag vs. Ansible variable) is an implementation decision.
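The "never floating tags" rule could be enforced with a small pre-deploy check. The sketch below is one possible heuristic, not project tooling: the list of floating tag names and the image references in the usage lines are assumptions.

```python
import re

# Assumed set of tag names treated as floating; extend as needed.
FLOATING_TAGS = {"latest", "stable", "main", "edge"}


def is_pinned(image_ref: str) -> bool:
    """Return True if the image reference carries a digest or an explicit version tag."""
    if "@sha256:" in image_ref:
        return True  # digest pins are always specific
    _name, sep, tag = image_ref.rpartition(":")
    # No colon at all, or the colon belongs to a registry host:port, means no tag.
    if not sep or "/" in tag:
        return False
    # Heuristic: a pinned tag is not a known floating name and contains a digit.
    return tag not in FLOATING_TAGS and bool(re.search(r"\d", tag))


# Hypothetical image references for illustration.
print(is_pinned("registry.example.org/xray-core:26.3.27"))  # explicit version tag
print(is_pinned("registry.example.org/xray-core:latest"))   # floating tag
print(is_pinned("registry.example.org:5000/xray-core"))     # no tag at all
```

The check works the same whether the pin lives in a compose tag or an Ansible variable, since it only inspects the final image reference.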
Long-term: sing-box joins when ShadowTLS/AnyTLS are added (sing-box-exclusive protocols).
2. Web server — DECIDED (ADR-004)
Nginx. On the exit VPS it terminates TLS with a real Let's Encrypt cert for the self-steal domain on port 443 and demultiplexes by SNI/path: Reality-bound traffic goes to Xray's Reality inbound (Layer 1), XHTTP-bound traffic goes to Xray on loopback/Unix socket (Layer 3). It also serves the static site front for unauthenticated visitors. Most documented reverse proxy for Reality fallback (XTLS/Xray-examples, henrywithu.com guides). ~22 MB idle RAM; the nginx:alpine-slim Docker image is ~7 MB. Use the modern single-socket h2c approach (Nginx 1.25.1+) rather than the two-socket workaround found in legacy guides. ACME automation (certbot or Caddy-style) is an implementation decision.
3. Container orchestration — DECIDED (ADR-005)
Docker Compose + Ansible. Each VPS runs independently with its own docker-compose.yml. Ansible handles server-level provisioning (packages, Docker install, firewall, SSH hardening). Compose handles application deployment (Xray, Nginx). Fleet growth via Ansible inventory — add a node, add it to inventory, run the playbook. No cross-border control plane dependency.
4. Infrastructure-as-code — DECIDED (ADR-006)
Ansible only. VPS created manually via provider web panels. Ansible handles everything post-creation: packages, Docker, firewall, SSH hardening, Compose deployment. No Terraform/OpenTofu state to manage. Fits existing workflow (Ansible for server-level provisioning, Docker Compose for apps).
5. Base OS — DECIDED (ADR-007)
Debian 12 (Bookworm). 68 MB minimal. Natively available on both Timeweb and Yandex Cloud. 5-year support (through ~mid-2028). Zero Docker friction.
6–9. GitOps, packaging, container runtime, CNI — N/A
Not applicable with Docker Compose.
Decisions log
ADR-001: Layer 1 protocol
Decided 2026-04-10. VLESS+Reality+XTLS-Vision via Xray-core. Most field-hardened against TSPU. Proxy-style (per-app routing) preferred over VPN-style. Future expansion: ShadowTLS v3, AnyTLS, NaïveProxy, AmneziaWG (each as additional protocol in the Layer 1 slot).
ADR-002: Proxy engine
Decided 2026-04-10. Xray-core for v1. Canonical Reality/XHTTP implementation. sing-box joins long-term for its exclusive protocols. XHTTP availability was a deciding factor.
Amendment 2026-04-17. Pin to Xray-core v26.3.27 (latest release, published 2026-03-27). Track the newest release, but always through an explicit pin, never a floating tag. Upgrade cadence is event-driven: changelog review when a TSPU-relevant fix, detection-evasion patch, or CVE lands. No scheduled upgrades. Pin location (compose tag vs. Ansible variable) is an implementation decision.
ADR-003: Layer 3 transport & relay architecture
Decided 2026-04-14. Domestic relay chain: Yandex Cloud (Moscow) → Timeweb EU (Netherlands). Both legs: VLESS+Reality+XHTTP packet-up (per-leg transport superseded by the 2026-04-17 amendment). Cloudflare rejected (throttled at 16KB, ToS risk). Russian CDNs rejected (no WebSocket/gRPC support for proxy transport, or payment/signup barriers). VPS relay is cheaper ($7–15/mo total), better documented, community-validated, and whitelist-resilient.
Consequences: PRD gains a new component (relay node). Two VPS to provision and maintain. IaC must handle both Yandex Cloud and Timeweb providers. Relay is SORM-visible (metadata only, encrypted traffic) — accepted tradeoff for whitelist resilience.
Amendment 2026-04-17. Heterogeneous transport per leg, each specialized for its threat model. Client→relay: VLESS+Reality+Vision (TCP) — Splice throughput, not cross-border so 16KB curtain irrelevant, matches Russian-community canon. Relay→exit: VLESS+TLS+XHTTP packet-up (no Reality) — packet-up fragments uploads across HTTP requests to survive cross-border TSPU curtain (Selectel incident Mar 2025); Reality dropped due to Xray #5923 breaking local self-steal dest and because datacenter-to-datacenter leg is not subject to TSPU active probing per measurement evidence. Vision explicitly not used on relay-outbound (maintainer states Vision not designed for transit). Both legs remain on TCP 443 per v26.3.27 upstream guidance (non-443 Reality flagged as high IP-burn risk). Chaining: transport-level (dialerProxy), not protocol-level — protocol-level chaining discards source streamSettings, erasing Reality. SNI on Reality: self-steal on operator-owned domain with valid cert, not third-party dest.
ADR-004: Web server
Decided 2026-04-14. Nginx. Lowest RAM (22 MB), smallest Docker image (7 MB), most documented for Reality fallback. h2c+HTTP/1.1 on single socket via http2 on; (1.25.1+). Caddy considered but Nginx's Reality documentation ecosystem is larger.
ADR-005: Container orchestration
Decided 2026-04-14. Docker Compose + Ansible. Each VPS is independent — no shared control plane, no cross-border k8s cluster dependency. k3s rejected: ~500MB overhead unjustified for 2–3 containers per node, and cross-border API server join is itself subject to TSPU. Fleet scales through Ansible inventory. Fits existing infra compose monorepo pattern.
ADR-006: Infrastructure-as-code
Decided 2026-04-14. Ansible only. VPS created manually via provider panels. No Terraform/OpenTofu. Two providers (Yandex Cloud, Timeweb) each have web UIs for VPS creation — at F&F scale, declarative VPS lifecycle management adds complexity without proportional benefit. Ansible handles post-creation configuration.
ADR-007: Base OS
Decided 2026-04-14. Debian 12 on both VPS. Smallest footprint (68 MB), natively available on both providers, zero Docker friction, familiar.
ADR-008: Relay outbound config structure
Decided 2026-04-17. Ship tag-prefix balancer pattern on the relay in v1, even with a single exit outbound. Naming: exit-<country>-<n> (e.g., exit-nl-1). Routing rules reference balancerTag: "exits" with selector: ["exit-"]. Adding a second exit in v2 becomes purely additive — new outbound block only, no routing-rule edits.
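The additive property of the tag-prefix pattern can be illustrated with a short Python sketch. The `select_outbounds` helper is a simplified stand-in for balancer selection, not Xray's implementation, and `exit-de-1` is a hypothetical second exit.

```python
def select_outbounds(outbound_tags: list, selectors: list) -> list:
    """Return the tags matched by any selector prefix (simplified balancer selection)."""
    return [t for t in outbound_tags if any(t.startswith(s) for s in selectors)]


# Routing section as described in the ADR; it stays unchanged between v1 and v2.
routing = {
    "balancers": [{"tag": "exits", "selector": ["exit-"]}],
    "rules": [{"type": "field", "network": "tcp,udp", "balancerTag": "exits"}],
}

v1_tags = ["direct", "exit-nl-1"]   # single exit in v1
v2_tags = v1_tags + ["exit-de-1"]   # hypothetical second exit added in v2

selector = routing["balancers"][0]["selector"]
v1_exits = select_outbounds(v1_tags, selector)
v2_exits = select_outbounds(v2_tags, selector)
print(v1_exits)
print(v2_exits)
```

Because the selector matches on the `exit-` prefix, the new outbound is picked up by its tag alone, and the `direct` outbound is never swept into the balancer.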
Observatory deferred. Auto-failover is marginal at 1–2 exits and the probe pattern (rhythmic outbound TCP/TLS from Russian VPS to multiple foreign IPs) is a distinguishing traffic signature we don't need to emit. Failover in v1 is manual via Ansible re-provisioning from scratch — warm-standby considered and deferred (accepted downtime window during re-provisioning). Re-open at ADR-009's fleet-growth trigger.
ADR-009: Fleet management tooling
Decided 2026-04-17. No panel in v1. Ansible + Docker Compose is sufficient for 1 relay + 1 exit. Re-evaluate when fleet reaches node #3 or user #20, whichever first. Tool selection deferred to that re-evaluation — landscape may shift.
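The re-evaluation trigger is simple enough to encode as a check, for example in a periodic inventory report. The function name and where the counts come from are assumptions; the thresholds are the ones stated above.

```python
NODE_THRESHOLD = 3   # re-evaluate when the fleet reaches node #3
USER_THRESHOLD = 20  # or when the user count reaches user #20


def panel_reevaluation_due(node_count: int, user_count: int) -> bool:
    """Return True once either fleet-growth threshold is reached, whichever comes first."""
    return node_count >= NODE_THRESHOLD or user_count >= USER_THRESHOLD


# v1 baseline: 1 relay + 1 exit, with a hypothetical handful of users.
print(panel_reevaluation_due(node_count=2, user_count=5))
```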
Appendix: Infrastructure summary
| Component | Exit VPS | Relay VPS |
|---|---|---|
| Provider | Timeweb Cloud EU | Yandex Cloud |
| Location | Netherlands | Moscow |
| Pricing | ~$5–6/mo | ~$5–10/mo |
| Protocol | VLESS+Reality+Vision (L1 inbound) + TLS+XHTTP packet-up (L3 inbound from relay) | VLESS+Reality+Vision inbound + TLS+XHTTP packet-up outbound |
| Web server | Nginx | — |
| OS | Debian 12 | Debian 12 |
| Deployment | Ansible + Docker Compose | Ansible + Docker Compose |
| Foreign signup | Stripe (EUR/KZT→RUB) | Non-resident business (USD Visa/MC) |