Cloud infrastructure

The infrastructure that runs SoulMap sits on Google Cloud, owned by the Google Workspace organisation neumatics.eu, with workloads separated into dedicated projects under a platform/ folder and a neumatics-network-host Shared-VPC project under shared-services/. Every CMEK-supporting resource is encrypted with a key from a single keyring in europe-west4. AlloyDB is the operational store and is paused by default whenever the product does not need it awake; per-domain Cloud Run services replace the old functions/main.py monolith; Datastream streams change-data from AlloyDB into BigQuery; an aggregated org-level audit sink lands in a sealed neumatics-audit-logs project that no application service can reach. On top of all of this rides the calibration pipeline that produced the Phase-1 corpus and Probe 5 convergence.

This is the reviewer-grade view. The methodology that the calibration pipeline serves lives at /about/science. The full per-report deliverable surface, including the proprietary IP that does not belong in a public document, sits in a passkey-gated reports tier, available on request.

The pages below are written against what is actually deployed, with infra/, services/, src/lib/api-helpers.ts, and apphosting.yaml as the source of truth. Where Foundation v1 names a commitment that has not yet shipped (Knowledge Catalog tagging, the Firestore region cutover), the pages say so explicitly with a ⏳.

What you'll find here

Page	What it covers	Typical read
Architecture	The block diagram across two layers — the Foundation v1 platform (Workspace org, projects, AlloyDB, BigQuery datasets, KMS keyring, Shared VPC + VPC-SC, Datastream CDC, per-domain Cloud Run services, audit sink) and the calibration pipeline (Cloud Workflows + Vertex Batch + Custom Training) that runs on top. The service-choice rationales behind each non-default pick.	18 min
Operations	AlloyDB pause-by-default with the five-component cost-control plane. The `BACKEND_ROUTING` per-route cutover layer in `src/lib/api-helpers.ts`. The PAM-mediator service that injects per-grant delays. The audit-alerter Cloud Run job. EU residency. The cohort-keyed calibration profile registry. Idempotency under retry.	14 min
Cost engineering	The platform run-rate after pause-by-default (~€48–50/mo for AlloyDB; ~€350/mo saved vs always-on). The eight calibration-pipeline cost levers from Phase-1 (unchanged by the foundation). The local QA harness as the unconventional ninth lever.	17 min
Receipts	What landed in the Foundation v1 refactor and what is still in flight. The Phase-1 calibration shakedown narrative compressed to one page: 18 workflow smokes + 5 trainer probes uncovered 27 bugs across 8 categories; Probe 5 converged R-hat = 1.002, n_eff_min = 1,318, 0 divergences in 9.0 s.	9 min

What we shipped, summarised

Foundation v1 — workload projects live

neumatics-prod, neumatics-audit-logs, neumatics-network-host — Workspace-rooted, CMEK across the estate

Monolith decomposed into per-domain services

1,461 LOC

functions/main.py monolith replaced by per-domain Cloud Run services; per-route cutover via BACKEND_ROUTING in src/lib/api-helpers.ts

AlloyDB run-rate

~€48/mo

Pause-by-default; ~€350/mo saved vs always-on

Probe 5 trainer convergence

R-hat = 1.002

n_eff_min = 1,318; 0 divergences; 9.0 s wallclock on real PSL data — calibration pipeline still rides on the foundation

The calibration pipeline that converged Probe 5 was the first tenant of a much larger platform. Foundation v1 built the platform around it: a Workspace-rooted org with three live workload projects, CMEK on every store, AlloyDB pause-by-default, per-domain Cloud Run services replacing the monolith, and Datastream feeding BigQuery from AlloyDB.
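The Probe 5 figures quoted above can be read against a simple convergence gate. The sketch below is illustrative only: the type names, the function name, and the thresholds are assumptions, not the trainer's actual code. R-hat below 1.01 and an effective sample size of at least 400 are commonly cited rules of thumb, and nothing on this page confirms that the pipeline uses them.

```typescript
// Illustrative sketch only; names and thresholds are assumptions,
// not the calibration trainer's real gate.
interface SamplerDiagnostics {
  rHat: number;        // reported R-hat
  nEffMin: number;     // smallest effective sample size across parameters
  divergences: number; // count of divergent transitions
}

interface Thresholds {
  rHatMax: number;
  nEffMin: number;
  maxDivergences: number;
}

// Assumed rule-of-thumb thresholds (hypothetical for this pipeline).
const ASSUMED_THRESHOLDS: Thresholds = {
  rHatMax: 1.01,
  nEffMin: 400,
  maxDivergences: 0,
};

// Returns a list of human-readable failures; an empty list means the run passes.
function convergenceFailures(
  d: SamplerDiagnostics,
  t: Thresholds = ASSUMED_THRESHOLDS,
): string[] {
  const failures: string[] = [];
  // Written as a negated "<" so that NaN is also reported as a failure.
  if (!(d.rHat < t.rHatMax)) {
    failures.push(`R-hat ${d.rHat} is not below ${t.rHatMax}`);
  }
  if (!(d.nEffMin >= t.nEffMin)) {
    failures.push(`n_eff_min ${d.nEffMin} is below ${t.nEffMin}`);
  }
  if (d.divergences > t.maxDivergences) {
    failures.push(`${d.divergences} divergences exceed ${t.maxDivergences}`);
  }
  return failures;
}

// The Probe 5 numbers reported on this page.
const probe5: SamplerDiagnostics = { rHat: 1.002, nEffMin: 1318, divergences: 0 };
const probe5Failures = convergenceFailures(probe5);
```

Returning a list of failures rather than a single boolean keeps the gate's output readable when a run misses more than one criterion at once.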

Why this shape

Three forces converged on the foundation refactor in May 2026.

The calibration corpus is the next big lock-in. Phase-2 will commit several thousand euros of Vertex AI compute and tens of millions of rows into neumatics-prod's BigQuery nexus_calibration_corpus dataset. Once there, every later analytics surface, every reasoning audit trail, every Reading-shape downstream of it inherits the dataset's location, partitioning, schema, and CMEK posture. Reshaping after the fact means rewriting the writers, the readers, the workflows, and the manifests. Reshaping before Phase-2 fires is a few days of DDL and IAM. We chose now.

Migration must never break the live product. Per-route cutover via BACKEND_ROUTING in src/lib/api-helpers.ts moves SoulMap from the old monolith to per-domain Cloud Run services one route at a time, each behind its own graduation gate, so no cutover step ever puts the consumer surface at risk.
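A minimal sketch of how a per-route cutover table of this kind might look follows. Only the name BACKEND_ROUTING and its home in src/lib/api-helpers.ts come from this page; the map shape, the route paths, the base URLs, and the helper functions are hypothetical, and the real module may be structured quite differently.

```typescript
// Hypothetical sketch: the actual BACKEND_ROUTING in src/lib/api-helpers.ts
// may have a different shape. Routes and URLs below are placeholders.
type Backend = "monolith" | "cloud-run";

// Assumed shape: one entry per route that has an explicit decision.
const BACKEND_ROUTING: Map<string, Backend> = new Map<string, Backend>([
  ["/api/readings", "cloud-run"], // hypothetical route that has passed its gate
  ["/api/profile", "monolith"],   // hypothetical route not yet cut over
]);

// Placeholder base URLs, not real endpoints.
const BASE_URLS: Record<Backend, string> = {
  "monolith": "https://monolith.example.invalid",
  "cloud-run": "https://services.example.invalid",
};

// Routes without an entry fall back to the monolith, so adding a new
// Cloud Run service never changes live traffic until its route is flipped.
function resolveBackend(route: string): Backend {
  return BACKEND_ROUTING.get(route) ?? "monolith";
}

// Builds the full URL a client helper would call for the given route.
function resolveUrl(route: string): string {
  return BASE_URLS[resolveBackend(route)] + route;
}
```

Under these assumptions, cutting a route over (or rolling it back) is a one-entry change to the table, which is what lets each route graduate independently.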

Compliance retrofit is a step-function expense. CMEK on Firestore is destroy-and-recreate. VPC Service Controls require an organisation. ISO 27701 + 42001 audits require an immutable audit log from day one. Doing all of that now is a few days of org setup; doing it later is a quarter of remediation in 2027. The foundation lands ahead of the scale that would make the retrofit expensive.

The full execution spec the refactor is built against lives at nexus_foundation_v1.md in the repo root; it covers the architectural commitments F-3.1 through F-3.13 and the parallel security track S-4.1 through S-4.10. The pages below are the public-facing summary of what shipped against that spec; the gated reports tier (available on request) carries the full per-component receipts, including the IAM bindings, the BigQuery DDL excerpts, the per-cluster pause/resume audit trail, and the per-smoke calibration ledger.

We recommend reading these pages in the order they appear in the table above: architecture for the spine, operations for the discipline, cost engineering for the euro story, receipts for what landed and what is still ⏳. Each reads independently if you only have time for one.