Cloud infrastructure

The platform sits on Google Cloud, owned by the Google Workspace organisation neumatics.eu, with workloads separated into dedicated projects under a platform/ folder and a neumatics-network-host Shared-VPC project under shared-services/. Every CMEK-supporting resource is encrypted with a key from a single keyring in europe-west4. AlloyDB is the operational store and is paused by default at our pre-launch scale; per-domain Cloud Run services replace the old functions/main.py monolith; Datastream streams change data from AlloyDB into BigQuery; an aggregated org-level audit sink lands in a sealed neumatics-audit-logs project that no application service can reach. On top of all of this rides the calibration pipeline that produced the Phase-1 corpus and the Probe 5 convergence.

This is the reviewer-grade view. The methodology that the calibration pipeline serves lives at /about/science. The full per-report deliverable surface, including the proprietary IP that does not belong in a public document, is gated behind a passkey at /about/reports.

The pages below are written against what is actually deployed, with infra/, services/, src/lib/api-helpers.ts, and apphosting.yaml as the source of truth. Where Foundation v1 names a commitment that has not yet shipped (Knowledge Catalog tagging, the Firestore region cutover, Vertex Reasoning Engine redeploy to europe-west4), the pages say so explicitly with a ⏳.


What you'll find here

Page | What it covers | Typical read
Architecture | The block diagram across two layers: the Foundation v1 platform (Workspace org, projects, AlloyDB, BigQuery datasets, KMS keyring, Shared VPC + VPC-SC, Datastream CDC, per-domain Cloud Run services, Vertex AI Agent Engine, audit sink) and the calibration pipeline (Cloud Workflows + Vertex Batch + Custom Training) that runs on top. The service-choice rationales behind each non-default pick. | 18 min
Operations | AlloyDB pause-by-default with the five-component cost-control plane. The BACKEND_ROUTING per-route cutover layer in src/lib/api-helpers.ts. The PAM-mediator service that injects per-grant delays. The audit-alerter Cloud Run job. EU residency. The cohort-keyed calibration profile registry. Idempotency under retry. | 14 min
Cost engineering | The platform run-rate after pause-by-default (~€95–100/mo combined for both AlloyDB clusters at our pre-launch scale; ~€350/mo saved vs always-on). The eight calibration-pipeline cost levers from Phase-1 (unchanged by the foundation). The local QA harness as the unconventional ninth lever. | 17 min
Receipts | What landed in the Foundation v1 refactor and what is still in flight. The Phase-1 calibration shakedown narrative compressed to one page: 18 workflow smokes + 5 trainer probes uncovered 27 bugs across 8 categories; Probe 5 converged R-hat = 1.002, n_eff_min = 1,318, 0 divergences in 9.0 s. | 9 min
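The run-rate and saving quoted for AlloyDB pause-by-default imply an always-on figure that this page does not state directly. The following is a minimal TypeScript sketch of that arithmetic, using only the two approximate figures quoted above (~€95–100/mo and ~€350/mo). The type and function names are illustrative and are not taken from the repo.

```typescript
// Euros per month, expressed as a range because the quoted run-rate is a range.
interface MonthlyRange {
  low: number;
  high: number;
}

// Figures quoted on this page; both are approximate.
const pausedRunRate: MonthlyRange = { low: 95, high: 100 };
const savedVsAlwaysOn = 350;

// Implied always-on cost = paused run-rate + quoted saving.
const impliedAlwaysOn: MonthlyRange = {
  low: pausedRunRate.low + savedVsAlwaysOn,
  high: pausedRunRate.high + savedVsAlwaysOn,
};

// Fraction of the always-on cost that pausing avoids.
function savingFraction(saved: number, alwaysOn: number): number {
  return saved / alwaysOn;
}

console.log(impliedAlwaysOn.low, impliedAlwaysOn.high); // 445 450
console.log(savingFraction(savedVsAlwaysOn, impliedAlwaysOn.low).toFixed(2)); // 0.79
console.log(savingFraction(savedVsAlwaysOn, impliedAlwaysOn.high).toFixed(2)); // 0.78
```

Under these figures the implied always-on cost is roughly €445–450/mo, so pause-by-default avoids a little under four-fifths of it. Because both inputs are rounded, the result should be read as an order-of-magnitude check rather than a billing figure.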

What we shipped, summarised

Foundation v1 workload projects live: 3 / 4
neumatics-prod, neumatics-audit-logs, neumatics-network-host live; neumatics-staging deferred on billing-account project quota

Per-domain Cloud Run services deployed: 24
Replacing the 1,461-line functions/main.py monolith; per-route cutover via BACKEND_ROUTING in src/lib/api-helpers.ts

AlloyDB combined run-rate: ~€95–100/mo
Pause-by-default; ~€350/mo saved vs always-on at our pre-launch scale (10 test users)

Probe 5 trainer convergence: R-hat = 1.002
n_eff_min = 1,318; 0 divergences; 9.0 s wallclock on real PSL data; the calibration pipeline still rides on the foundation

The calibration pipeline that converged Probe 5 was the first tenant of a much larger platform. Foundation v1 built the platform around it: a Workspace-rooted org with three live workload projects, CMEK on every store, AlloyDB pause-by-default, per-domain Cloud Run services replacing the monolith, and Datastream feeding BigQuery from AlloyDB.
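To make the Probe 5 figures easier to read, here is a small TypeScript sketch that checks them against conventional MCMC rules of thumb: R-hat below 1.01, a minimum effective sample size of at least 400, and zero divergences. These thresholds are an assumption drawn from common practice, not the pipeline's documented acceptance gate, and the names below are hypothetical.

```typescript
interface ChainDiagnostics {
  rHat: number;        // potential scale reduction factor
  nEffMin: number;     // smallest effective sample size across parameters
  divergences: number; // count of divergent transitions
}

// Assumed thresholds (common rules of thumb, NOT taken from the repo).
const THRESHOLDS = { maxRHat: 1.01, minNEff: 400, maxDivergences: 0 };

// Return the names of the checks that fail; an empty list means all pass.
// Comparisons are written so that a NaN diagnostic counts as a failure.
function failedChecks(d: ChainDiagnostics): string[] {
  const failures: string[] = [];
  if (!(d.rHat < THRESHOLDS.maxRHat)) failures.push("rHat");
  if (!(d.nEffMin >= THRESHOLDS.minNEff)) failures.push("nEffMin");
  if (!(d.divergences <= THRESHOLDS.maxDivergences)) failures.push("divergences");
  return failures;
}

// Probe 5 values as quoted on this page.
const probe5: ChainDiagnostics = { rHat: 1.002, nEffMin: 1318, divergences: 0 };

console.log(failedChecks(probe5).length === 0); // true
```

Against these assumed thresholds the quoted Probe 5 values pass all three checks with margin; a stricter or looser gate would only change the constants.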

Why this shape

Three forces converged on the foundation refactor in May 2026.

The calibration corpus is the next big lock-in. Phase-2 will commit several thousand euros of Vertex AI compute and tens of millions of rows into neumatics-prod's BigQuery nexus_calibration_corpus dataset. Once there, every later analytics surface, every reasoning audit trail, every Reading-shape downstream of it inherits the dataset's location, partitioning, schema, and CMEK posture. Reshaping after the fact means rewriting the writers, the readers, the workflows, and the manifests. Reshaping before Phase-2 fires is a few days of DDL and IAM. We chose now.

There are no external API consumers yet. The Python SDK at sdks/python/nexus_soulmap/ calls a placeholder URL that has no implementation in the repo. Per-route cutover via BACKEND_ROUTING in src/lib/api-helpers.ts lets us switch backends without breaking any live consumer; legacy Cloud Functions stay reachable until each route flips to its new per-domain Cloud Run service.
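The per-route cutover described above can be pictured as a lookup table with a legacy default. The sketch below is a hypothetical shape only: the real BACKEND_ROUTING in src/lib/api-helpers.ts may be structured differently, and the route names, base URLs, and helper functions here are invented for illustration.

```typescript
type Backend = "legacy-functions" | "cloud-run";

interface RouteTarget {
  backend: Backend;
  baseUrl: string;
}

// Placeholder hosts (the .invalid TLD never resolves); not real endpoints.
const LEGACY_TARGET: RouteTarget = {
  backend: "legacy-functions",
  baseUrl: "https://legacy.example.invalid",
};

// Hypothetical routing table: one entry per route, flipped independently.
const BACKEND_ROUTING = new Map<string, RouteTarget>([
  // Already cut over to its per-domain Cloud Run service.
  ["/api/readings", { backend: "cloud-run", baseUrl: "https://readings.example.invalid" }],
  // Still served by the legacy Cloud Functions backend.
  ["/api/profiles", LEGACY_TARGET],
]);

// Routes without an entry fall back to the legacy backend, so a route
// only moves when its entry is explicitly flipped.
function resolveBackend(route: string): RouteTarget {
  return BACKEND_ROUTING.get(route) ?? LEGACY_TARGET;
}

// Build the full URL a caller would request for a given route.
function buildUrl(route: string): string {
  return resolveBackend(route).baseUrl + route;
}

console.log(buildUrl("/api/readings")); // https://readings.example.invalid/api/readings
console.log(buildUrl("/api/unlisted")); // https://legacy.example.invalid/api/unlisted
```

Defaulting to the legacy target is one way to keep legacy Cloud Functions reachable until each route flips: rolling a route back is a one-line change to its entry, and unlisted routes are unaffected.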

Compliance retrofit is a step-function expense. Enabling CMEK on an existing Firestore database means destroying and recreating it. VPC Service Controls require an organisation. ISO/IEC 27701 and ISO/IEC 42001 audits require an immutable audit log from day one. Doing all of that now is a few days of org setup; doing it after launch would be a quarter of remediation in 2027. The foundation lands before the launch surface attracts external traffic.

The full execution spec the refactor is built against lives at nexus_foundation_v1.md in the repo root. It comprises the architectural commitments F-3.1 through F-3.13 plus the parallel security track S-4.1 through S-4.10. The pages below are the public-facing summary of what shipped against that spec; the gated reports tier carries the full per-component receipts, including the IAM bindings, the BigQuery DDL excerpts, the per-cluster pause/resume audit trail, and the per-smoke calibration ledger.


We recommend reading these pages in the order they appear in the table above: architecture for the spine, operations for the discipline, cost engineering for the cost story, receipts for what landed and what is still ⏳. Each reads independently if you only have time for one.