Receipts

Two receipts on this page. The Foundation v1 refactor moved the platform from a single-project consumer-Gmail-owned setup to a Workspace-rooted, multi-project, CMEK-encrypted, pause-by-default platform with per-domain Cloud Run services replacing the functions/main.py monolith. It substantially landed; what is and isn't live as of 2026-07-09 is below. The calibration pipeline ran from infrastructure-deploy through end-to-end trainer convergence on real PSL-derived data: 18 workflow smokes plus 5 manual trainer probes; 27 bugs across 8 categories; final probe converged the NumPyro hierarchical NUTS GRM at R-hat = 1.002, n_eff_min = 1,318, zero divergences in 9.0 seconds wallclock.

The full per-component receipts (the gated reports tier, available on request) carry the IAM bindings, the BigQuery DDL excerpts, the per-cluster controller logs, and the per-smoke ledger. This page is the public-facing summary.

Foundation v1 — what landed

The execution spec the refactor is built against is nexus_foundation_v1.md in the repo root (the F-3.x architectural commitments + S-4.x parallel security track). The receipts:

Component	Status	Where to verify
Workspace org `neumatics.eu`	✅ live	Pre-existing; folder structure under `infra/bootstrap/`
Folder `platform/` + `shared-services/`	✅ live	`infra/bootstrap/`, `infra/projects/main.tf`
Project `neumatics-prod` (workload)	✅ live	`infra/projects/main.tf:34`
Project `neumatics-audit-logs` (sealed sink)	✅ live	`infra/projects/main.tf:48` + `infra/audit/`
Project `neumatics-network-host` (Shared VPC)	✅ live	`infra/projects/main.tf:53`
Org-policy bundle (resourceLocations, disableServiceAccountKeyCreation, etc.)	✅ live	`infra/org-policies/`
Essential contacts → `neumatics.eu` domain	✅ live	`infra/projects/main.tf:266`
KMS keyring `nexus-foundation` (eu-west4) + 7 CMEK keys + 3 HSM keys	✅ live	`infra/kms/`
Per-resource CMEK applied (AlloyDB, BigQuery, GCS, Pub/Sub, etc.)	✅ live	`infra/kms/` IAM bindings; resource-level `kms_key_name` references
Aggregated org log sink → `neumatics-audit-logs` BigQuery	✅ live	`infra/audit/`
GCS object-lock archive (10y retention)	✅ live	`infra/audit/`
Shared VPC + subnets (runtime / data / mgmt)	✅ live	`infra/network/`
VPC-SC perimeter (prod)	✅ live	`infra/network/main.tf` `google_access_context_manager_service_perimeter.nexus_prod`
Private Service Connect (AlloyDB, googleapis)	✅ live	`infra/network/`, `infra/alloydb/`
AlloyDB regional cluster `nexus-prod` (CMEK, pgvector, columnar)	✅ live	`infra/alloydb/cluster.tf`
AlloyDB cost-control plane (controller + auto-pause + warming UX + operator CLI)	✅ live	`services/nexus-alloydb-controller/`, `services/nexus-alloydb-auto-pause/`, `src/lib/alloydb-warming.ts`, `scripts/nexus-alloydb*.ps1`
BigQuery datasets (calibration_corpus, warehouse, substrate, alloydb_cdc, synth_substrate, etc.)	✅ live	`infra/bigquery/` — 7 datasets, all CMEK
Datastream AlloyDB → BigQuery CDC stream	✅ live	`infra/datastream/` (private connection + connection profiles + stream)
Firebase Stream-Firestore-to-BigQuery extension	⏳ pending	`docs/operations/firestore_extension.md` runbook
Firestore region cutover (legacy → eu-west4 under CMEK)	⏳ deferred	`soulmap-v4` database name preserved; region cutover gated on dual-write shim
Per-domain Cloud Run services	✅ live	`services/nexus-*/` — see catalogue at architecture
Per-route cutover (`BACKEND_ROUTING` in `src/lib/api-helpers.ts`)	✅ live	Every route carries an explicit backend entry; graduation gate per `docs/operations/apphosting_cutover.md`
Knowledge Catalog tagging baseline	⏳ pending	Catalog service deployed; tagging not yet applied
Workforce groups + WIF pool + custom roles	✅ live	`infra/iam/`
PAM mediator (per-grant delays per OD-19)	✅ live	`services/nexus-pam-mediator/`, `infra/pam-mediator/`
Audit alerter (Pub/Sub → email on curated events)	✅ live	`services/nexus-audit-alerter/`, `infra/audit-alerter/`
Workflows: `iteration_runner`, `bq_inspect`, `iam_probe`	✅ live	`workflows/`
Workflows: `erasure-cascade.yaml`, `calibration-promote.yaml`, `cohort-freeze.yaml`	⏳ reserved	Per F-3.9 / S-4.8; spec'd, not yet implemented
Operator playbook + access-pattern docs	✅ live	`docs/security/operator_playbook.md`, `docs/security/access_patterns.md`, `docs/security/claude_code_access.md`, `docs/security/incident_response.md`
LinkML schema extension (vocabulary → warehouse)	✅ partial	`vocabulary/schema.linkml.yaml` extended; `schemas/foundation_v1.linkml.yaml` added
Assured Workloads enrolment on `neumatics-prod`	⏳ pending	Per OD-18 (resolved-optimistic 2026-05-08); operator-side confirmation queued

The headline: Foundation Stages 0–2 (org / projects / network / KMS / audit) are substantially live. Stage 3a (AlloyDB + the cost-control plane) is live. Stage 4 (compute) is live with per-domain services deployed and per-route cutover complete. Stages 5–7 (CDC / frontend / decommission) are partially live; the Firestore region cutover is the remaining visible-to-the-user piece.

The pause-by-default cost-control plane is structural, not optional: Stage 3a does not exit until both an idle-pause cycle and a fail-fast-503-retry path are integration-tested. Both have been demonstrated against the live prod cluster.
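
To make the fail-fast-503-retry path concrete, here is a minimal client-side sketch. It is not the platform's implementation: the function name `call_with_warmup_retry`, the attempt count and the linear delay schedule are all assumptions chosen for illustration. The only behaviour taken from the text is that a paused cluster answers quickly with a 503 and the caller retries.

```python
import time
from typing import Callable, Tuple


def call_with_warmup_retry(
    request: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,            # hypothetical default
    base_delay_s: float = 2.0,        # hypothetical default
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Retry a request while the backend answers 503 (cluster still resuming)."""
    status, body = 503, ""
    for attempt in range(1, max_attempts + 1):
        status, body = request()
        if status != 503:
            return status, body               # success, or an error we do not retry
        if attempt < max_attempts:
            sleep(base_delay_s * attempt)     # linear backoff: 2 s, 4 s, 6 s, ...
    return status, body                       # still 503 after the final attempt
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a caller in production would leave the default.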

Calibration pipeline — what converged

Probe 5 R-hat

1.002

Convergence threshold R-hat < 1.01; we are well inside

Probe 5 n_eff_min

1,318

Healthy effective sample size across all parameters

Probe 5 divergences

0

No funnel-trap warnings from NumPyro NUTS

Probe 5 wallclock

9.0 s

On n2-highmem-16, 10 personas × 6 openness constructs
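
The Probe 5 figures can be read as a simple pass/fail gate. The sketch below assumes the reported R-hat is the worst value across parameters. The R-hat threshold of 1.01 comes from this page; the effective-sample-size floor of 400 is an assumption (a common rule of thumb), not a documented threshold, and the names `FitDiagnostics` and `is_converged` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitDiagnostics:
    rhat: float        # assumed to be the worst (largest) R-hat across parameters
    n_eff_min: float   # smallest effective sample size across parameters
    divergences: int   # number of divergent NUTS transitions


def is_converged(
    d: FitDiagnostics,
    rhat_threshold: float = 1.01,   # threshold stated on this page
    n_eff_floor: float = 400.0,     # assumption: rule-of-thumb floor, not from the source
) -> bool:
    """Return True only if all three diagnostics are inside their limits."""
    return d.rhat < rhat_threshold and d.n_eff_min >= n_eff_floor and d.divergences == 0


probe_5 = FitDiagnostics(rhat=1.002, n_eff_min=1318, divergences=0)
print(is_converged(probe_5))  # True
```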

Probe 5 was the final shakedown probe. Sampler: NumPyro NUTS, four chains × 1,000 warmup × 1,000 samples, target_accept = 0.85, non-centered parameterisation on per-item log-discrimination. Inputs: ten personas drawn from the Phase-1 smoke library, six openness-family constructs scored from real PSL evaluator output (not synthetic responses). Outputs: six calibration_metrics rows plus three invariance rows MERGE'd into BigQuery; HDI95 credible intervals on every parameter; JSON-roundtrippable grm.json artifact written to GCS.

Probe 5 is the single most informative receipt in the shakedown. It demonstrates that the fully implemented pipeline (workflow → shard worker → Vertex Batch → BQ MERGE → Custom Training → NumPyro NUTS → BQ MERGE) produces a converged psychometric calibration on data the model has never seen, in less time and at lower cost than any single shakedown smoke iteration.

The graded-response model itself is Samejima 1969; convergence at this scale is the small-N receipt that the pipeline runs end-to-end. Production-scale Phase-2 fits will be larger but no harder.
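
For readers unfamiliar with the graded-response model, the likelihood at its core is small enough to sketch in plain Python. In Samejima's formulation the cumulative probability of responding in category k or higher is a logistic function of a(θ − b_k), and each category probability is the difference of adjacent cumulative probabilities. The `discrimination` helper illustrates the non-centered idea mentioned above: the per-item log-discrimination is written as a group mean plus a scaled standard-normal offset. Function and parameter names are illustrative, and this is not the trainer's code.

```python
import math
from typing import List, Sequence


def discrimination(mu_log_a: float, sigma_log_a: float, z: float) -> float:
    """Non-centered form: log a = mu + sigma * z, with z ~ Normal(0, 1) sampled separately."""
    return math.exp(mu_log_a + sigma_log_a * z)


def grm_category_probs(theta: float, a: float, thresholds: Sequence[float]) -> List[float]:
    """Probabilities of the len(thresholds) + 1 ordered response categories."""
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in ascending order")
    # Cumulative P(X >= k): 1 for the lowest category, one logistic per threshold, then 0.
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
        + [0.0]
    )
    # Category probability = difference of adjacent cumulative probabilities.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]
```

With ascending thresholds the cumulative curve is non-increasing, so every category probability is non-negative and the vector sums to one.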

Bug taxonomy

Eight categories absorbed all twenty-seven bugs. Every fix landed either in the production code path or in the eight-check local QA harness that gates Phase-2 readiness.

Category	Count	Cost (est.)	Where the fix lives
Local toolchain	1	$0	Operator-side; switched to gcloud-bundled Python
GCP IAM bindings	4	$0	`scripts/deploy/deploy_all.sh`
Container build / requirements	5	~$0	Dockerfile contexts + `test_container_imports.py`
Vertex AI quirks	4	~$50	Worker config + workflow YAML + `test_workflow_safety.py`
Cloud Workflows YAML	4	~$3	Inline workflow YAML + `test_workflow_safety.py`
BigQuery schemas	5	~$70	Worker `flatten_*_for_bq` + `test_bq_row_shape.py`
Trainer logic	3	~$10	Trainer + `test_attribute_safety.py`
LLM output quality	1	~$25	`maxOutputTokens=8192` cap
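
As a quick cross-check, the table can be tallied directly. The figures below are copied from the rows above, with "~$0" treated as 0 and every "~" estimate taken at face value, so the cost total is approximate.

```python
# (category, bug count, estimated cost in USD), copied from the table above.
bug_taxonomy = [
    ("Local toolchain", 1, 0),
    ("GCP IAM bindings", 4, 0),
    ("Container build / requirements", 5, 0),
    ("Vertex AI quirks", 4, 50),
    ("Cloud Workflows YAML", 4, 3),
    ("BigQuery schemas", 5, 70),
    ("Trainer logic", 3, 10),
    ("LLM output quality", 1, 25),
]

total_bugs = sum(count for _, count, _ in bug_taxonomy)
total_cost = sum(cost for _, _, cost in bug_taxonomy)
by_cost = sorted(bug_taxonomy, key=lambda row: row[2], reverse=True)

print(total_bugs)      # 27
print(total_cost)      # 158 (approximate, in USD)
print(by_cost[0][0])   # BigQuery schemas
print(by_cost[1][0])   # Vertex AI quirks
```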

The largest cost was BigQuery schemas at ~$70. The iteration touched five sub-classes — STRING-vs-FLOAT64 autodetect drift, JSON dict-vs-string mismatch, JSON_EXTRACT ambiguity on JSON-typed columns, LIKE on JSON column, and namespace collision risk. All five are now caught statically in 60 seconds for $0 by test_bq_row_shape.py.
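
A static row-shape check of the kind described can be sketched without touching BigQuery at all. The example below is not the contents of test_bq_row_shape.py; the schema, column names and function name are hypothetical. It covers two of the sub-classes named above: numbers arriving as strings, and pre-serialised JSON strings being written where a dict is expected.

```python
from typing import Any, Dict, List

# Hypothetical canonical schema: column name -> BigQuery type.
SCHEMA = {"persona_id": "STRING", "score": "FLOAT64", "payload": "JSON"}


def row_shape_errors(row: Dict[str, Any], schema: Dict[str, str]) -> List[str]:
    """Return human-readable problems with a row before it is sent to BigQuery."""
    errors = []
    for column, bq_type in schema.items():
        if column not in row:
            errors.append(f"{column}: missing")
            continue
        value = row[column]
        kind = type(value).__name__
        if bq_type == "FLOAT64":
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                errors.append(f"{column}: expected a number, got {kind}")
        elif bq_type == "JSON":
            # A str here usually means json.dumps() was applied too early.
            if not isinstance(value, (dict, list)):
                errors.append(f"{column}: expected dict or list, got {kind}")
        elif bq_type == "STRING":
            if not isinstance(value, str):
                errors.append(f"{column}: expected str, got {kind}")
    for column in row:
        if column not in schema:
            errors.append(f"{column}: not in schema")
    return errors
```

For example, `row_shape_errors({"persona_id": "p1", "score": "0.7", "payload": "{}"}, SCHEMA)` reports both the stringified score and the stringified payload.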

The second-largest was Vertex AI quirks at ~$50. The iteration discovered that gemini-3-flash-preview is /global/-only (smoke #8), that Vertex Batch enforces same-location for job + model (smoke #9), that the metadata field is rejected by the validator, and that Cloud Run Job auto-retries can spawn duplicate batches if upstream MERGE fails. All four are now baked into the worker config and workflow YAML.
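
The two location findings lend themselves to a pre-submit check. The sketch below is illustrative only: the function name and the `GLOBAL_ONLY_MODELS` set are hypothetical, and the rules encode just what smokes #8 and #9 reported rather than any published Vertex AI contract.

```python
from typing import List

# Models observed to be served only from the "global" location (smoke #8).
GLOBAL_ONLY_MODELS = {"gemini-3-flash-preview"}


def batch_location_errors(model: str, model_location: str, job_location: str) -> List[str]:
    """Return the location problems that would have failed smokes #8 and #9."""
    errors = []
    if model in GLOBAL_ONLY_MODELS and model_location != "global":
        errors.append(f"{model} is only served from 'global', not {model_location!r}")
    if job_location != model_location:
        errors.append(
            f"job location {job_location!r} must match model location {model_location!r}"
        )
    return errors
```

Running a check like this before submission costs nothing, whereas each location mistake found in the shakedown cost a full smoke iteration.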

What every smoke caught

The full per-smoke ledger lives at docs/shakedown_ledger.md and in the reviewer-grade R3 infrastructure report (available on request). Compressed:

#	Layer	Fix
1–4	Local + IAM	gcloud Python; `roles/logging.logWriter`; IAM-propagation lag wait; `roles/run.developer`
5–7	Container	Dockerfile `COPY` path; image-digest refresh; first successful Vertex submit
8–9	Vertex	`/global/`-only; same-location for job + model
10	Workflow	LRO connector default 1,800 s timeout → 7,200 s
11	BQ schema	`STRING` vs `FLOAT64` autodetect drift; canonical schema explicit
12–13	Workflow expr	YAML colon-in-string; BQ connector body-wrapper drift
14	Container	`firebase_admin` transitive import via `synthesis.profile`; extracted dep-free `synthesis/construct_mappings.py`
15	BQ JSON	`LIKE` on JSON column; switched to `TO_JSON_STRING(...)`
16	BQ JSON write	`flatten_session_for_bq` doing `json.dumps()` into JSON column; pass dict directly
17	BQ JSON read	`JSON_EXTRACT(col, '$')` ambiguous; switched to `TO_JSON_STRING(col)`
18	Trainer container	Missing `pydantic` + `networkx` requirements
Probe 1	Trainer SQL	Family-filter tuple-unpacking; same `firebase_admin` chain via `synthesis.profile`
Probe 2	Trainer attr	`conformal_report.per_trait` doesn't exist (real attribute is `quantiles`)
Probe 3	Trainer attr (other)	Same `firebase_admin` chain via five additional callers
Probe 4	Trainer logic	Final attribute drift fixed; conformal + BBN + Gate 10 + BQ writes reached
Probe 5	—	All 3 succeeded. R-hat = 1.002, n_eff = 1,318, 0 divergences

Phase-2 readiness signals

Six green, two open:

Signal	Status
End-to-end calibration trainer success on real PSL data (Probe 5)	✅
8-check local QA harness covers every bug class hit in shakedown	✅
Cohort-keyed calibration profile registry prevents smoke → prod config leak	✅
Iteration namespace convention (0–9 smoke; ≥ 10 production) prevents BQ MERGE collisions	✅
Cost engineering choices verified at small scale; forecasts re-derived against actual smoke spend	✅
Foundation v1 substantially landed (workload projects, KMS, AlloyDB pause-by-default, services, VPC-SC, Datastream live)	✅
Cloud Billing budget alerts (auto-pause drill)	⏳ deferred — requires `roles/billing.user`
YourMorals.org DUA (parallel procurement; weeks lead time)	⏳

The two open items: the first resolves with a roles/billing.user grant plus a half-day drill; the second proceeds on its own external schedule. Neither blocks Phase-2.
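
The iteration namespace convention in the table above (0–9 smoke; ≥ 10 production) is simple enough to enforce in code. A minimal sketch, with a hypothetical function name and guard:

```python
SMOKE_MAX = 9  # iterations 0-9 are reserved for smokes, per the convention above


def iteration_namespace(iteration: int) -> str:
    """Classify an iteration number as 'smoke' or 'production'."""
    if iteration < 0:
        raise ValueError("iteration must be non-negative")
    return "smoke" if iteration <= SMOKE_MAX else "production"


def assert_production(iteration: int) -> None:
    """Guard for production entry points: refuse to run under a smoke iteration."""
    if iteration_namespace(iteration) != "production":
        raise ValueError(f"iteration {iteration} is in the smoke namespace")
```

Because smoke and production iterations never share a number, rows keyed on the iteration cannot collide between the two in a MERGE.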

What is not yet drilled

Honest framing of the four R5 robustness drills, per the reviewer-grade R5 robustness report (available on request):

#1 Auto-pause end-to-end (Cloud Billing pathway): ❌ NOT EXECUTED. Infrastructure built (functions/budget_check/main.py), gated on roles/billing.user.
#2 Mid-shard worker crash + resume: ❌ NOT EXECUTED. No CHAOS_FAULT branch in worker; max-retries-3 untested under deliberate failure.
#3 Vertex 429 backoff under live quota pressure: ◐ PARTIAL. _RateLimiter fully built; shakedown never drove quota saturation.
#4 Idempotency stress: ◐ PARTIAL. Smokes #11 and #16 incidentally exercised the BQ MERGE dedupe path; no deliberate 6-parallel-shards stress drill.

All four are on the Phase-2 hardening backlog. Two are blocked on external dependencies (roles/billing.user for #1; chaos-image build for #2); two can be drilled tomorrow if needed (#3, #4). Phase-2 corpus generation may incidentally exercise #3 even without the deliberate drill.
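
For context on what the 429 drill would exercise, here is a generic sketch of exponential backoff with full jitter. It is not the project's `_RateLimiter`, whose internals are not described on this page; `RateLimitedError`, the delay parameters and the function names are all assumptions made for illustration.

```python
import random
import time
from typing import Callable, Iterator, Optional, TypeVar

T = TypeVar("T")


class RateLimitedError(Exception):
    """Hypothetical stand-in for an HTTP 429 from the upstream API."""


def backoff_delays(
    max_retries: int,
    base_s: float = 1.0,
    cap_s: float = 60.0,
    rng: Optional[random.Random] = None,
) -> Iterator[float]:
    """Yield one delay per retry: uniform in [0, min(cap, base * 2**attempt)]."""
    rng = rng or random.Random()
    for attempt in range(max_retries):
        yield rng.uniform(0.0, min(cap_s, base_s * 2 ** attempt))


def call_with_backoff(
    call: Callable[[], T],
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> T:
    """Invoke call(), sleeping and retrying on RateLimitedError up to max_retries times."""
    delays = backoff_delays(max_retries, rng=rng)
    while True:
        try:
            return call()
        except RateLimitedError:
            delay = next(delays, None)
            if delay is None:
                raise          # retries exhausted: surface the 429 to the caller
            sleep(delay)
```

Jitter matters most when several shards hit the quota at once: without it, they would all retry at the same instant and saturate the quota again.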

Foundation-side gaps from the table at the top of this page — Firestore region cutover, Knowledge Catalog tagging baseline, the three reserved workflows, Assured Workloads enrolment, the Firebase Stream-Firestore-to-BQ extension — are scheduled as the foundation refactor's later stages land. None blocks Phase-2; all are in the public roadmap at /about/science/roadmap.

What ships next

Grant roles/billing.user to an operator account → configure Cloud Billing budget → run the auto-pause drill.
Run idempotency stress drill (6 concurrent re-fires of the same (iteration, shard_index)).
Run mid-shard crash drill once a CHAOS_FAULT-flagged worker image is built.
Resolve the three BigQuery write-path divergences flagged in the R7 implementation audit (available on request) — legacy tabledata.insertAll calls in pre-shakedown scaffolding.
Cut iteration 10 — first Phase-2 production iteration; full corpus (2,000 personas × 50 sessions × 12 turns); all ten calibration families parallel.
Install the Firebase Stream-Firestore-to-BQ extension; cut over from the legacy nightly export.
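
The idempotency stress drill in the list above can be rehearsed locally before it is run against BigQuery. The sketch below uses an in-memory stand-in for a table written with MERGE keyed on (iteration, shard_index); the class and function names are hypothetical, and a real drill would target the actual MERGE statement rather than a dict.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, Tuple


class MergeTable:
    """In-memory stand-in for a table upserted by (iteration, shard_index)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[int, int], Any] = {}
        self._lock = threading.Lock()

    def merge(self, iteration: int, shard_index: int, payload: Any) -> None:
        with self._lock:
            # Upsert: insert when the key is new, overwrite when it already exists.
            self._rows[(iteration, shard_index)] = payload

    def row_count(self) -> int:
        with self._lock:
            return len(self._rows)


def refire(table: MergeTable, iteration: int, shard_index: int, times: int = 6) -> int:
    """Fire the same shard `times` times concurrently; return the resulting row count."""
    with ThreadPoolExecutor(max_workers=times) as pool:
        futures = [
            pool.submit(table.merge, iteration, shard_index, {"attempt": i})
            for i in range(times)
        ]
        for future in futures:
            future.result()   # re-raise any exception from a worker thread
    return table.row_count()


print(refire(MergeTable(), iteration=3, shard_index=0))  # 1
```

The pass condition is the same one the real drill would check: six re-fires of one key leave exactly one row.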

The pipeline is shaken down. The foundation is substantially landed. The receipts are filed. Phase-2 is unblocked; only the auto-pause drill still waits on the roles/billing.user grant.