Phase 0: Backend API & UI Integration - Design Spec
Date: 2026-03-25
Status: Implemented (2026-03-26)
Scope: Wire the existing React UI to real FastAPI endpoints, replace mock JSON data with PostgreSQL + Neo4j backed responses, complete the auth story end-to-end.
Implementation Status
Merged to main: 2026-03-26, commit 2bcf0fa (96 files, 3,210 lines added)
What's done
| Slice | Status | Notes |
|---|---|---|
| Slice 0: Foundation | Done | Core infra, DB clients, JWT security, base classes, app factory, migrations, seed data, nginx proxy, UI API client rewrite |
| Slice 1: Communities | Done | GET /communities, GET /communities/{slug}, GET /communities/create-data |
| Slice 2: Graph | Done | GET /communities/{slug}/graph — Neo4j Cypher queries, layout positions |
| Slice 3: Policies | Done | GET /policies, GET /policies/create-data |
| Slice 4: Memory | Done | GET /memory, GET /memory/meta, GET /memory/contribute, POST /memory |
| Slice 5: Queue | Done | GET /queue, PATCH /queue/{id} — includes rejection-requires-note validation |
| Slice 6: Pull Requests | Done | GET /pull-requests, GET /pull-requests/history |
| Slice 7: Simulation | Done | GET /simulation, POST /simulation/run — returns seeded results |
| Slice 8: Search | Done | GET /search/data — config + canned results (GraphRAG is Phase 3) |
| Slice 9: Notifications | Done | GET /notifications |
| Slice 10: Dashboard | Done | GET /dashboard — aggregation queries across all modules |
Follow-up items completed (2026-03-26)
- Neo4j seed data loading: Automated via entrypoint.sh. A Python script checks whether data exists and loads schema_setup.cypher + seed_data.cypher if Neo4j is empty. Docker Compose env vars (NEO4J_USER, NEO4J_PASSWORD) are wired through.
- Backend tests: 31 tests across 12 files, all passing (0.64s). Route-level tests for all 10 modules + auth tests. Services mocked via dependency_overrides. Test deps added to pyproject.toml [project.optional-dependencies] test.
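The load-if-empty behaviour described for the Neo4j seed step can be sketched as below. This is a minimal illustration under stated assumptions, not the actual entrypoint script: `count_nodes` and `run_cypher_file` are hypothetical callables standing in for real Neo4j driver calls, and only the two file names come from this spec.

```python
from typing import Callable

# File names taken from the spec; load order is schema first, then data.
SEED_FILES = ["schema_setup.cypher", "seed_data.cypher"]


def seed_if_empty(
    count_nodes: Callable[[], int],          # hypothetical: returns node count
    run_cypher_file: Callable[[str], None],  # hypothetical: executes one file
) -> bool:
    """Load the seed files only when the graph has no nodes.

    Returns True if seeding ran, False if existing data was found.
    """
    if count_nodes() > 0:
        return False  # data already present, so the step stays idempotent
    for path in SEED_FILES:
        run_cypher_file(path)
    return True


# Usage with in-memory stand-ins for the database:
loaded = []
ran_first = seed_if_empty(lambda: 0, loaded.append)    # empty graph: seeds
ran_second = seed_if_empty(lambda: 12, loaded.append)  # populated: skips
```

Injecting the two callables keeps the decision logic testable without a running Neo4j container.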
What's NOT done (requires manual verification / follow-up)
- End-to-end smoke test: docker compose up has not been run to verify the full stack boots and serves real data. This is the critical next step.
- JWKS caching in Redis: Currently using in-memory cache with 1h TTL. Redis-backed caching is deferred to Phase 1.
- POST /simulation/run: Currently returns the seeded result. Real simulation engine is Phase 4.
- Search: Returns canned results. GraphRAG integration is Phase 3.
- Repository-layer tests: Current tests mock at the service level. Integration tests with real DB containers (testcontainers) are deferred.
- RBAC enforcement: All endpoints require auth but don't enforce role-based access (e.g., Architect-only actions). Deferred to Phase 1.
1. Decisions Summary
| Decision | Choice | Rationale |
|---|---|---|
| Seed data strategy | Migration-based (Flyway SQL + Cypher) | Deterministic, version-controlled, runs on every fresh docker compose up |
| UI ↔ Backend communication | Nginx reverse proxy (Docker) + Vite proxy (dev) | Single origin, no CORS issues, one codebase for both environments |
| Authentication | Enforce JWT validation on all endpoints | OIDC infra already running; completes auth story now, avoids retrofitting |
| Neo4j exposure | Thin REST endpoints per UI need | Build exactly what the UI needs; generic query endpoint is Phase 3 |
| Backend structure | Modular monolith (domain modules) | Each module is self-contained, extractable to microservice later |
| Database drivers | SQLAlchemy async + neo4j async + redis async | ORM benefits for PostgreSQL as schema grows |
| Implementation approach | Vertical slices | Incremental demoable progress; one feature end-to-end at a time |
2. Backend Project Structure
Modular monolith where each domain module owns its own models, schemas, repository, service, and router. Modules communicate via Python Protocols (dependency inversion). To extract a module into a standalone microservice: move the folder, add a main.py, pip-install substrate-core.
backend/
  app/
    __init__.py
    main.py  # App factory: registers modules, middleware
    core/  # Shared kernel: the ONLY cross-module dependency
      __init__.py
      config.py  # Pydantic Settings (single source of config truth)
      security.py  # JWT decode, JWKS fetching, token models
      dependencies.py  # Dependency injection wiring (protocols → implementations)
      exceptions.py  # Domain exceptions + FastAPI exception handlers
      middleware.py  # CORS, request logging, error envelope
      database/
        __init__.py
        postgres.py  # async engine, sessionmaker, get_session()
        neo4j.py  # async driver, get_driver()
        redis.py  # async client, get_client()
      models/
        __init__.py
        base.py  # DeclarativeBase + UUIDMixin + TimestampMixin
      schemas/
        __init__.py
        base.py  # ResponseEnvelope[T], PaginatedResponse[T], ErrorResponse
      repository.py  # GenericRepository[M, S]: async CRUD
      service.py  # BaseService[R]: common get/list patterns
      protocols.py  # Shared Protocol definitions for cross-module deps
    modules/  # Each module = future microservice boundary
      __init__.py
      dashboard/
        __init__.py
        router.py  # GET /api/v1/dashboard
        schemas.py  # Metric, DomainEntity, DashboardAlert, DashboardResponse
        service.py  # Aggregates data from other modules via protocols
        dependencies.py  # Module-level DI wiring
      community/
        __init__.py
        router.py  # GET /communities, GET /communities/{slug}, GET /communities/create-data
        models.py  # Community, CommunityServices ORM
        schemas.py  # CommunityOut, CommunityDetail
        repository.py  # CommunityRepository
        service.py  # CommunityService
        dependencies.py
      graph/
        __init__.py
        router.py  # GET /communities/{slug}/graph
        schemas.py  # GraphNode, GraphEdge
        repository.py  # GraphRepository (Neo4j Cypher queries)
        service.py  # GraphService
        dependencies.py
      policy/
        __init__.py
        router.py  # GET /policies, GET /policies/create-data
        models.py  # Policy, PolicyPack, PolicyViolation ORM
        schemas.py  # PolicyOut, PolicyCreate, ViolationOut
        repository.py  # PolicyRepository
        service.py  # PolicyService
        dependencies.py
      memory/
        __init__.py
        router.py  # GET /memory, GET /memory/meta, GET /memory/contribute, POST /memory
        models.py  # MemoryEntry ORM
        schemas.py  # MemoryEntryOut, MemoryMetaOut, ContributeMemoryIn
        repository.py  # MemoryRepository
        service.py  # MemoryService
        dependencies.py
      queue/
        __init__.py
        router.py  # GET /queue, PATCH /queue/{id}
        models.py  # QueueItem ORM
        schemas.py  # QueueItemOut, QueueActionIn
        repository.py  # QueueRepository
        service.py  # QueueService
        dependencies.py
      pull_request/
        __init__.py
        router.py  # GET /pull-requests, GET /pull-requests/history
        models.py  # PullRequest, PullRequestHistory ORM
        schemas.py  # PullRequestOut, PrHistoryOut
        repository.py  # PullRequestRepository
        service.py  # PullRequestService
        dependencies.py
      simulation/
        __init__.py
        router.py  # GET /simulation, POST /simulation/run
        models.py  # SimulationResult ORM
        schemas.py  # SimulationIn, SimulationOut
        repository.py  # SimulationRepository
        service.py  # SimulationService
        dependencies.py
      search/
        __init__.py
        router.py  # GET /search/data
        schemas.py  # SearchData, SearchIntent, SearchResult
        service.py  # SearchService (config + canned results; GraphRAG is Phase 3)
        dependencies.py
      notification/
        __init__.py
        router.py  # GET /notifications
        models.py  # Notification ORM
        schemas.py  # NotificationOut
        repository.py  # NotificationRepository
        service.py  # NotificationService
        dependencies.py
  db/
    postgres/
      V1__initial_schema.sql  # (exists)
      V2__phase0_schema.sql  # Extended tables for all entities
      V3__seed_data.sql  # Representative seed data
    neo4j/
      schema_setup.cypher  # (exists, extended)
      seed_data.cypher  # Graph nodes, edges, communities
  tests/
    conftest.py  # Testcontainers setup, session fixtures, test JWT factory
    test_auth.py  # JWT validation tests
    modules/
      test_dashboard.py
      test_communities.py
      test_graph.py
      test_policies.py
      test_memory.py
      test_queue.py
      test_pull_requests.py
      test_simulation.py
      test_search.py
      test_notifications.py
SOLID & DRY Base Classes
core/models/base.py gives every ORM model a UUID primary key and timestamps:
import uuid
from datetime import datetime

from sqlalchemy import func
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class UUIDMixin:
    id: Mapped[uuid.UUID] = mapped_column(primary_key=True, default=uuid.uuid4)

class TimestampMixin:
    created_at: Mapped[datetime] = mapped_column(default=func.now())
    updated_at: Mapped[datetime] = mapped_column(default=func.now(), onupdate=func.now())

class Base(DeclarativeBase, UUIDMixin, TimestampMixin): ...
core/repository.py provides generic CRUD, so each module repository only adds domain-specific queries:
from typing import Generic, TypeVar
from uuid import UUID

from sqlalchemy.ext.asyncio import AsyncSession

ModelT = TypeVar("ModelT")    # ORM model type
SchemaT = TypeVar("SchemaT")  # input schema type

class GenericRepository(Generic[ModelT, SchemaT]):
    def __init__(self, session: AsyncSession, model: type[ModelT]):
        self.session = session
        self.model = model

    async def get_by_id(self, id: UUID) -> ModelT | None: ...
    async def list_all(self, offset: int = 0, limit: int = 50) -> list[ModelT]: ...
    async def create(self, data: SchemaT) -> ModelT: ...
    async def update(self, id: UUID, data: SchemaT) -> ModelT: ...
    async def delete(self, id: UUID) -> None: ...
core/protocols.py implements dependency inversion via Python Protocols:
from typing import Protocol
from uuid import UUID

# ServiceOut, GraphNode, and GraphEdge are the modules' Pydantic schemas.
class ServiceReader(Protocol):
    async def get_by_id(self, id: UUID) -> ServiceOut | None: ...
    async def list_all(self, offset: int, limit: int) -> list[ServiceOut]: ...

class GraphReader(Protocol):
    async def get_nodes(self, community_id: str) -> list[GraphNode]: ...
    async def get_edges(self, community_id: str) -> list[GraphEdge]: ...
core/schemas/base.py provides consistent API response shapes:
from typing import Any, Generic, TypeVar

from pydantic import BaseModel

T = TypeVar("T")

class ResponseEnvelope(BaseModel, Generic[T]):
    data: T
    meta: dict | None = None

class PaginatedResponse(ResponseEnvelope[list[T]], Generic[T]):
    total: int
    offset: int
    limit: int

# ErrorDetail is defined first so ErrorResponse can reference it by name.
class ErrorDetail(BaseModel):
    code: str
    message: str
    detail: Any | None = None

class ErrorResponse(BaseModel):
    error: ErrorDetail
core/service.py holds common service patterns:
from typing import Generic, TypeVar
from uuid import UUID

RepoT = TypeVar("RepoT")

class BaseService(Generic[RepoT]):
    def __init__(self, repo: RepoT):
        self.repo = repo

    async def get(self, id: UUID):
        return await self.repo.get_by_id(id)

    async def list(self, offset: int = 0, limit: int = 50):
        return await self.repo.list_all(offset, limit)
Request flow
Route (HTTP) → Service (logic) → Repository (data) → DB
↑ Depends(get_current_user) ↑ Depends(get_session)
↑ Depends(get_service) ↑ Protocol contract
Routes never touch SQLAlchemy or Neo4j. Services never parse HTTP. Repositories never know about business rules.
Cross-module communication
While the backend is a monolith, module services call each other through direct Python imports, but only via Protocol interfaces. After a microservice split, those calls are replaced with NATS messages or HTTP calls, and the Protocol interface stays identical.
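A minimal sketch of that swap, using only the standard library. The `ViolationCounter` protocol, both implementations, and the hard-coded count are hypothetical names and values for illustration; the spec's real protocols live in core/protocols.py.

```python
import asyncio
from typing import Protocol


class ViolationCounter(Protocol):
    """Hypothetical contract the dashboard needs from the policy module."""

    async def count_open_violations(self) -> int: ...


class InProcessPolicyService:
    """Monolith case: a plain in-process call."""

    def __init__(self, open_violations: int) -> None:
        self._open = open_violations

    async def count_open_violations(self) -> int:
        return self._open


class RemotePolicyClient:
    """Post-split stand-in: would wrap an HTTP or NATS request (stubbed here)."""

    async def count_open_violations(self) -> int:
        await asyncio.sleep(0)  # placeholder for network I/O
        return 2


class DashboardService:
    """Depends only on the protocol, never on a concrete implementation."""

    def __init__(self, policies: ViolationCounter) -> None:
        self._policies = policies

    async def summary(self) -> dict:
        return {"open_violations": await self._policies.count_open_violations()}


local = asyncio.run(DashboardService(InProcessPolicyService(2)).summary())
remote = asyncio.run(DashboardService(RemotePolicyClient()).summary())
```

`DashboardService` is unchanged between the two runs; only the object passed to its constructor differs, which is the property the extraction story relies on.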
Module registration
# main.py
MODULES = [dashboard, community, graph, policy, memory,
           queue, pull_request, simulation, search, notification]

def create_app() -> FastAPI:
    app = FastAPI(title="Substrate API")
    register_middleware(app)
    for module in MODULES:
        app.include_router(module.router.router, prefix="/api/v1")
    return app
To disable a module (e.g., when extracted to its own service): remove it from the list.
3. Database Schema
PostgreSQL: V2__phase0_schema.sql
Schema aligned with the Unified Multimodal Knowledge Base (UMKB) entity model from knowledge-base.md. The API layer transforms DB models into UI-friendly response shapes.
-- ============================================================
-- DOMAIN: Services & Architecture
-- ============================================================
ALTER TABLE services ADD COLUMN IF NOT EXISTS slug TEXT UNIQUE;
ALTER TABLE services ADD COLUMN IF NOT EXISTS api_type TEXT;
ALTER TABLE services ADD COLUMN IF NOT EXISTS page_rank FLOAT DEFAULT 0;
ALTER TABLE services ADD COLUMN IF NOT EXISTS betweenness FLOAT DEFAULT 0;
ALTER TABLE services ADD COLUMN IF NOT EXISTS tension_score FLOAT DEFAULT 0;
ALTER TABLE services ADD COLUMN IF NOT EXISTS confidence FLOAT DEFAULT 1.0;
ALTER TABLE services ADD COLUMN IF NOT EXISTS verification_status TEXT DEFAULT 'unverified';
ALTER TABLE services ADD COLUMN IF NOT EXISTS owner_handle TEXT;
CREATE TABLE communities (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL UNIQUE,
slug TEXT NOT NULL UNIQUE,
color TEXT NOT NULL,
description TEXT,
owner_team TEXT,
tension FLOAT DEFAULT 0,
violation_count INT DEFAULT 0,
trend TEXT DEFAULT 'stable',
trend_delta INT DEFAULT 0,
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE community_services (
community_id UUID REFERENCES communities(id) ON DELETE CASCADE,
service_id UUID REFERENCES services(id) ON DELETE CASCADE,
PRIMARY KEY (community_id, service_id)
);
-- ============================================================
-- DOMAIN: Teams & Ownership
-- ============================================================
CREATE TABLE teams (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL UNIQUE,
parent_team_id UUID REFERENCES teams(id),
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE developers (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
github_handle TEXT UNIQUE NOT NULL,
display_name TEXT NOT NULL,
email TEXT,
keycloak_id TEXT,
active BOOLEAN DEFAULT true,
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE team_members (
team_id UUID REFERENCES teams(id) ON DELETE CASCADE,
developer_id UUID REFERENCES developers(id) ON DELETE CASCADE,
role TEXT DEFAULT 'member',
since DATE NOT NULL,
PRIMARY KEY (team_id, developer_id)
);
CREATE TABLE service_ownership (
service_id UUID REFERENCES services(id) ON DELETE CASCADE,
owner_id UUID NOT NULL,
owner_type TEXT NOT NULL,
is_primary BOOLEAN DEFAULT false,
confidence FLOAT DEFAULT 1.0,
last_verified TIMESTAMPTZ DEFAULT now(),
since DATE NOT NULL,
PRIMARY KEY (service_id, owner_id)
);
-- ============================================================
-- DOMAIN: Governance
-- ============================================================
CREATE TABLE policy_packs (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
slug TEXT NOT NULL UNIQUE,
name TEXT NOT NULL,
description TEXT,
official BOOLEAN DEFAULT true,
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE policies (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
policy_id TEXT NOT NULL UNIQUE,
name TEXT NOT NULL,
pack_id UUID REFERENCES policy_packs(id),
level TEXT NOT NULL,
description TEXT,
rego_source TEXT,
domain_scope TEXT,
active BOOLEAN DEFAULT true,
violation_count INT DEFAULT 0,
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now()
);
ALTER TABLE policy_violations ADD COLUMN IF NOT EXISTS pr_number INT;
ALTER TABLE policy_violations ADD COLUMN IF NOT EXISTS repo_full_name TEXT;
ALTER TABLE policy_violations ADD COLUMN IF NOT EXISTS affected_nodes JSONB;
ALTER TABLE policy_violations ADD COLUMN IF NOT EXISTS blast_radius_count INT;
ALTER TABLE policy_violations ADD COLUMN IF NOT EXISTS resolution_status TEXT DEFAULT 'open';
ALTER TABLE policy_violations ADD COLUMN IF NOT EXISTS resolved_at TIMESTAMPTZ;
ALTER TABLE policy_violations ADD COLUMN IF NOT EXISTS exception_node_id UUID;
-- ============================================================
-- DOMAIN: Institutional Memory
-- ============================================================
CREATE TABLE memory_entries (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
entry_type TEXT NOT NULL,
title TEXT NOT NULL,
body TEXT,
author_id UUID REFERENCES developers(id),
source_url TEXT,
service_id UUID REFERENCES services(id),
confidence FLOAT DEFAULT 1.0,
entry_date DATE NOT NULL,
expires_at DATE,
approved_by UUID REFERENCES developers(id),
linked_policy_id UUID REFERENCES policies(id),
reviewed_at TIMESTAMPTZ,
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now()
);
-- ============================================================
-- DOMAIN: Verification Queue
-- ============================================================
CREATE TABLE queue_items (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
entity_name TEXT NOT NULL,
entity_type TEXT NOT NULL,
reason TEXT NOT NULL,
confidence FLOAT NOT NULL,
flag TEXT,
severity TEXT NOT NULL,
status TEXT DEFAULT 'pending',
assigned_to UUID REFERENCES developers(id),
resolution_note TEXT,
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now()
);
-- ============================================================
-- DOMAIN: Pull Requests
-- ============================================================
CREATE TABLE pull_requests (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
pr_number INT NOT NULL,
repo_full_name TEXT,
author_handle TEXT NOT NULL,
title TEXT NOT NULL,
status TEXT NOT NULL,
violation_count INT DEFAULT 0,
impact_summary TEXT,
blast_radius JSONB,
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE pull_request_history (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
pr_number INT NOT NULL,
author_handle TEXT NOT NULL,
title TEXT NOT NULL,
policies_evaluated INT DEFAULT 0,
result TEXT NOT NULL,
resolution_status TEXT,
resolved_time TEXT,
created_at TIMESTAMPTZ DEFAULT now()
);
-- ============================================================
-- DOMAIN: Simulation
-- ============================================================
CREATE TABLE simulation_results (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
scenario TEXT NOT NULL,
prompt TEXT NOT NULL,
status TEXT,
duration_ms INT,
before_state JSONB,
after_state JSONB,
blast_radius_delta JSONB,
policy_delta JSONB,
requested_by UUID REFERENCES developers(id),
created_at TIMESTAMPTZ DEFAULT now()
);
-- ============================================================
-- DOMAIN: Notifications & Alerts
-- ============================================================
CREATE TABLE notifications (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
notification_type TEXT NOT NULL,
title TEXT NOT NULL,
detail TEXT,
recipient_id UUID REFERENCES developers(id),
read BOOLEAN DEFAULT false,
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE dashboard_alerts (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
alert_type TEXT NOT NULL,
severity TEXT NOT NULL,
title TEXT NOT NULL,
detail TEXT,
domain TEXT,
acknowledged BOOLEAN DEFAULT false,
created_at TIMESTAMPTZ DEFAULT now()
);
-- ============================================================
-- DOMAIN: Sprint Tracking
-- ============================================================
CREATE TABLE sprints (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
sprint_id TEXT NOT NULL UNIQUE,
name TEXT NOT NULL,
health TEXT DEFAULT 'healthy',
velocity INT,
target_velocity INT,
violation_delta INT DEFAULT 0,
debt_score FLOAT DEFAULT 0,
started_at DATE,
ended_at DATE,
created_at TIMESTAMPTZ DEFAULT now()
);
Neo4j: Extended schema + seed data
Schema constraints:
CREATE CONSTRAINT service_id IF NOT EXISTS FOR (s:Service) REQUIRE s.id IS UNIQUE;
CREATE CONSTRAINT service_name IF NOT EXISTS FOR (s:Service) REQUIRE s.name IS UNIQUE;
CREATE INDEX service_domain IF NOT EXISTS FOR (s:Service) ON (s.domain);
CREATE CONSTRAINT community_id IF NOT EXISTS FOR (c:Community) REQUIRE c.id IS UNIQUE;
CREATE CONSTRAINT developer_handle IF NOT EXISTS FOR (d:Developer) REQUIRE d.github_handle IS UNIQUE;
CREATE CONSTRAINT team_name IF NOT EXISTS FOR (t:Team) REQUIRE t.name IS UNIQUE;
CREATE CONSTRAINT decision_id IF NOT EXISTS FOR (d:DecisionNode) REQUIRE d.id IS UNIQUE;
CREATE CONSTRAINT failure_id IF NOT EXISTS FOR (f:FailurePattern) REQUIRE f.id IS UNIQUE;
CREATE CONSTRAINT policy_id IF NOT EXISTS FOR (p:Policy) REQUIRE p.policy_id IS UNIQUE;
Seed data entities:
12 :Service nodes — with domain, page_rank, tension_score, status, confidence
5 :Developer nodes — alice, bob, carol, dave, frank
3 :Team nodes — payments-team, orders-team, platform-team
4 :Community nodes — Payment, Auth, Orders, Infrastructure
3 :DecisionNode nodes — ADR-047 (gateway-first), ADR-051 (mTLS), ADR-032 (stale)
1 :FailurePattern node — PM-019 (double-charge incident)
1 :ExceptionNode — FulfillmentSvc legacy direct call
8 :Policy nodes — the 8 governance policies
Edges:
~15 :DEPENDS_ON — intended dependencies
~12 :ACTUALLY_CALLS — observed runtime calls (including the violating one)
~12 :OWNS — developer/team → service ownership
4 :CONTAINS — community → service membership
3 :WHY — decision → service/policy rationale
1 :CAUSED_BY — failure → service
1 :PREVENTED_BY — service → policy
5 :MEMBER_OF — developer → team
Seed data narrative
The seed data tells a coherent story that exercises every persona's core workflow:
Setting: Mid-size e-commerce team (5-9 developers) running ~12 microservices across 4 domains. Two years of operation with accumulated decisions, one survived incident, and active structural decay.
The story a user discovers:
- Dashboard (Executive, Architect) — Payment domain on fire: tension spiked 40% this sprint, 2 hard violations, key person (@alice) departing Sprint 25
- Graph (Developer, Architect) — OrderService bypassing API gateway to call PaymentService directly. Visible as red ACTUALLY_CALLS edge violating the intended DEPENDS_ON path
- PR Details (Developer) — PR #2847 blocked. Violation explanation links to ADR-047 (gateway-first) created after PM-019 (double-charge incident). Developer sees WHY the rule exists
- Memory Timeline (Tech Lead) — Traceable chain: PM-019 → ADR-047 → GOV-012 → PR #2847 blocked
- Queue (Architect) — 6 items across all three confidence bands (61%-89%): stale ownership, missing ADRs, duplicate docs
- Simulation (Architect, DevOps) — "Split OrderService?" shows blast radius shrinks, SRP violation resolves, but creates new ownership gap
- Search (any persona) — "Why does PaymentService route through the gateway?" returns sourced answer citing ADR-047 and PM-019
- Policies (Security Engineer) — 8 active policies from 6 packs with violation counts matching graph state
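The gateway bypass in the Graph item is, in essence, an observed call with no matching intended dependency. A minimal sketch of that comparison follows; the edge tuples are illustrative and "ApiGateway" is an assumed node name, since the spec does not list the seeded service names in full.

```python
Edge = tuple  # (caller, callee) pair of service names


def find_drift(depends_on: set, actually_calls: set) -> set:
    """Return observed calls that have no matching intended dependency."""
    return actually_calls - depends_on


# Intended path: OrderService -> ApiGateway -> PaymentService.
depends_on = {
    ("OrderService", "ApiGateway"),
    ("ApiGateway", "PaymentService"),
}

# Observed runtime calls, including the direct call that skips the gateway.
actually_calls = {
    ("OrderService", "ApiGateway"),
    ("ApiGateway", "PaymentService"),
    ("OrderService", "PaymentService"),
}

violations = find_drift(depends_on, actually_calls)
```

In the real system this comparison would presumably be a Cypher query over :ACTUALLY_CALLS and :DEPENDS_ON edges; the set difference only shows the rule being applied.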
Computed vs stored
| Data | Strategy | Why |
|---|---|---|
| Dashboard metrics | Aggregation queries across policies, communities | Always fresh |
| Memory meta (total, gaps, coverage) | COUNT/AVG on memory_entries + services | Single source of truth |
| Community violation counts | JOIN policies → violations → community_services | No stale counts |
| Policy violation counts | COUNT on policy_violations WHERE open | Always accurate |
| Node status (ok/warn/bad) | Derived from violation count + tension | Business rule |
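As an illustration of the computed strategy, the sketch below counts open violations per policy at query time. It uses an in-memory SQLite database as a stand-in for PostgreSQL. The `policy_id` linking column is an assumption (the policy_violations table comes from V1 and its full shape is not shown here), and the GOV-007 row is made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE policy_violations (
        id INTEGER PRIMARY KEY,
        policy_id TEXT NOT NULL,               -- assumed linking column
        resolution_status TEXT DEFAULT 'open'  -- column from V2 above
    );
    INSERT INTO policy_violations (policy_id, resolution_status) VALUES
        ('GOV-012', 'open'),
        ('GOV-012', 'open'),
        ('GOV-012', 'resolved'),
        ('GOV-007', 'resolved');
    """
)


def open_violation_counts(db: sqlite3.Connection) -> dict:
    """Count open violations per policy instead of reading a stored counter."""
    rows = db.execute(
        "SELECT policy_id, COUNT(*) FROM policy_violations "
        "WHERE resolution_status = 'open' GROUP BY policy_id"
    ).fetchall()
    return dict(rows)


counts = open_violation_counts(conn)
```

Policies with no open violations are simply absent from the result, so callers would default missing keys to zero.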
4. Authentication & Security
JWT validation flow
UI (Keycloak login) → Bearer token in Authorization header
↓
Nginx reverse proxy → /api/v1/* → Backend
↓
FastAPI dependency: get_current_user()
↓
1. Extract token from header
2. Fetch Keycloak JWKS (cached, TTL 1h; in-memory as implemented, Redis-backed from Phase 1)
3. Decode + verify signature, expiry, audience, issuer
4. Return UserInfo (sub, email, roles, groups)
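The Implementation Status section notes that the JWKS cache is currently in-memory with a 1h TTL. A minimal sketch of such a cache is shown below. The `TTLCache` class and `fetch_jwks` function are hypothetical; a real fetcher would request the key set from Keycloak rather than return a literal.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Single-value cache that refetches once ttl_seconds have elapsed."""

    def __init__(
        self,
        fetch: Callable[[], Any],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._expires_at = float("-inf")  # forces a fetch on first use

    def get(self) -> Any:
        now = self._clock()
        if now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value


# Usage with a fake fetcher and a controllable clock:
calls = []
fake_now = [0.0]


def fetch_jwks() -> dict:
    calls.append(1)
    return {"keys": [{"kid": "demo"}]}


cache = TTLCache(fetch_jwks, ttl_seconds=3600, clock=lambda: fake_now[0])
cache.get()           # first call fetches
cache.get()           # served from cache
fake_now[0] = 3601.0
cache.get()           # TTL elapsed, so this refetches
fetch_count = len(calls)
```

Passing the clock in as a parameter makes expiry testable without sleeping.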
Implementation
- JWKS caching: Fetch Keycloak's /.well-known/openid-configuration → jwks_uri on first request. Cache public keys with a 1h TTL (Redis in the target design; the current implementation uses an in-memory cache, with Redis deferred to Phase 1).
- Token validation: python-jose decodes the JWT, verifies the signature against the cached JWKS, and checks that iss = OIDC_ISSUER_URL, aud contains substrate-backend, and exp is not past.
- UserInfo model: Pydantic schema with sub (UUID), email, preferred_username, realm_roles, groups. Extracted from JWT claims.
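The claim checks and the UserInfo mapping can be sketched with the standard library alone. This operates on already-decoded claims and performs no signature verification, which the spec assigns to python-jose. The `InvalidToken` exception, the issuer URL, and the sample claims are hypothetical. Reading roles from `realm_access.roles` follows Keycloak's default token layout, and a `groups` claim is assumed to be added by a mapper.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


class InvalidToken(Exception):
    """Raised when a claim check fails; a route layer would map it to 401/403."""


@dataclass(frozen=True)
class UserInfo:
    sub: str
    email: Optional[str] = None
    preferred_username: Optional[str] = None
    realm_roles: list = field(default_factory=list)
    groups: list = field(default_factory=list)


def validate_claims(claims: dict, issuer: str, audience: str,
                    now: Optional[float] = None) -> UserInfo:
    """Check iss, aud, and exp on decoded claims, then build a UserInfo."""
    current = time.time() if now is None else now
    if claims.get("iss") != issuer:
        raise InvalidToken("issuer mismatch")
    aud = claims.get("aud", [])
    audiences = [aud] if isinstance(aud, str) else list(aud)
    if audience not in audiences:
        raise InvalidToken("audience mismatch")
    if claims.get("exp", 0) <= current:
        raise InvalidToken("token expired")
    return UserInfo(
        sub=claims["sub"],
        email=claims.get("email"),
        preferred_username=claims.get("preferred_username"),
        realm_roles=list(claims.get("realm_access", {}).get("roles", [])),
        groups=list(claims.get("groups", [])),
    )


ISSUER = "https://keycloak.example/realms/demo"  # hypothetical issuer URL
claims = {
    "iss": ISSUER,
    "aud": ["substrate-backend", "account"],
    "exp": 2000,
    "sub": "00000000-0000-0000-0000-000000000001",
    "preferred_username": "alice",
    "realm_access": {"roles": ["architect"]},
}
user = validate_claims(claims, ISSUER, "substrate-backend", now=1000)
```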
Route protection
All /api/v1/* routes require auth except /api/v1/config (UI needs it pre-login to discover OIDC issuer).
UI token plumbing
A thin wrapper reads the token from react-oidc-context and makes it available outside React components for the api() fetch function.
5. Nginx Reverse Proxy & UI Wiring
Nginx config
server {
  listen 8080;
  listen 8443 ssl;
  ssl_certificate /etc/nginx/certs/tls.crt;
  ssl_certificate_key /etc/nginx/certs/tls.key;
  root /usr/share/nginx/html;
  index index.html;

  # Reverse proxy: /api/* → backend
  location /api/ {
    proxy_pass http://substrate-backend:8000;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header Authorization $http_authorization;
    proxy_http_version 1.1;
  }

  # SPA fallback
  location / {
    try_files $uri $uri/ /index.html;
  }

  # Health check (default_type sets the Content-Type of the returned body)
  location = /healthz {
    default_type text/plain;
    return 200 'ok';
  }
}
UI API client rewrite
Replace static JSON fetcher:
// api/client.ts
const BASE = import.meta.env.VITE_BACKEND_URL || '/api/v1';

export async function api<T>(path: string, options?: RequestInit): Promise<T> {
  const token = getAccessToken();
  const res = await fetch(`${BASE}${path}`, {
    ...options,
    headers: {
      'Content-Type': 'application/json',
      ...(token ? { Authorization: `Bearer ${token}` } : {}),
      ...options?.headers,
    },
  });
  if (!res.ok) throw new ApiError(res.status, await res.text());
  return res.json();
}
substrateApi rewired to call /api/v1/* paths instead of /data/*.json.
Vite dev proxy
// vite.config.ts
server: {
  proxy: {
    '/api/v1': { target: 'http://localhost:8000', changeOrigin: true },
  },
},
Two environments, one codebase
| Environment | UI served by | API calls | Auth |
|---|---|---|---|
| docker compose up | Nginx (:8080/:8443) | Nginx proxies /api/* → backend:8000 | Keycloak container |
| npm run dev | Vite (:5173) | Vite proxies /api/v1/* → localhost:8000 | Keycloak container |
6. API Endpoints
All prefixed with /api/v1. All require JWT auth except /config.
Dashboard module
| Method | Path | Response | Source |
|---|---|---|---|
| GET | /dashboard | DashboardData | PG + Neo4j (aggregation) |
Community module
| Method | Path | Response | Source |
|---|---|---|---|
| GET | /communities | Community[] | Neo4j + PG |
| GET | /communities/{slug} | CommunityDetail | Neo4j + PG |
| GET | /communities/create-data | CreateCommunityData | PG |
Graph module
| Method | Path | Response | Source |
|---|---|---|---|
| GET | /communities/{slug}/graph | {nodes, edges} | Neo4j |
Policy module
| Method | Path | Response | Source |
|---|---|---|---|
| GET | /policies | Policy[] | PG |
| GET | /policies/create-data | CreatePolicyData | PG |
Memory module
| Method | Path | Response | Source |
|---|---|---|---|
| GET | /memory | MemoryEntry[] | PG |
| GET | /memory/meta | MemoryMetaData | PG (computed) |
| GET | /memory/contribute | ContributeMemoryData | PG |
| POST | /memory | MemoryEntry | PG |
Queue module
| Method | Path | Response | Source |
|---|---|---|---|
| GET | /queue | QueueItem[] | PG |
| PATCH | /queue/{id} | QueueItem | PG |
Pull Request module
| Method | Path | Response | Source |
|---|---|---|---|
| GET | /pull-requests | PullRequest[] | PG |
| GET | /pull-requests/history | PrHistoryItem[] | PG |
Simulation module
| Method | Path | Response | Source |
|---|---|---|---|
| GET | /simulation | SimulationData | PG |
| POST | /simulation/run | SimulationResult | PG (seeded result; real engine Phase 4) |
Search module
| Method | Path | Response | Source |
|---|---|---|---|
| GET | /search/data | SearchData | PG (config + canned results; GraphRAG Phase 3) |
Notification module
| Method | Path | Response | Source |
|---|---|---|---|
| GET | /notifications | Notification[] | PG |
Response conventions
Collections:
{ "data": [...], "meta": { "total": 42, "offset": 0, "limit": 50 } }
Single entities: object directly (no envelope).
Errors:
{ "error": { "code": "POLICY_NOT_FOUND", "message": "...", "detail": null } }
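The two envelope shapes above can be produced by small helpers such as the following. This is a sketch using plain dicts; the function names are hypothetical, and the real backend would presumably use the Pydantic models from core/schemas/base.py instead.

```python
import json
from typing import Any, Optional


def collection_response(items: list, total: int,
                        offset: int = 0, limit: int = 50) -> dict:
    """Wrap a list result in the data/meta envelope used for collections."""
    return {
        "data": items,
        "meta": {"total": total, "offset": offset, "limit": limit},
    }


def error_response(code: str, message: str,
                   detail: Optional[Any] = None) -> dict:
    """Build the error envelope; detail stays null unless provided."""
    return {"error": {"code": code, "message": message, "detail": detail}}


page = collection_response([{"id": 1}], total=42)
err = error_response("POLICY_NOT_FOUND", "Policy does not exist")
body = json.dumps(err)  # detail serializes as JSON null
```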
7. Implementation Order: Vertical Slices
Each slice is independently demoable. Dependencies flow top-down.
Slice 0: Foundation
Core infrastructure all modules depend on:
- core/ — config, security, exceptions, middleware, database clients
- core/models/base.py — ORM base classes
- core/schemas/base.py — response envelope, error response
- core/repository.py — GenericRepository
- core/service.py — BaseService
- core/dependencies.py — DI wiring
- main.py — app factory, module registration
- V2__phase0_schema.sql + V3__seed_data.sql + seed_data.cypher
- Nginx config + UI api/client.ts rewrite + Vite proxy + token plumbing
Demoable: docker compose up → Keycloak login → authenticated API calls return 200.
Slice 1: Communities
modules/community/: full vertical slice.
Demoable: Communities page lists real domains with tension/violations.
Slice 2: Graph
modules/graph/: Neo4j Cypher queries.
Demoable: Graph page renders real architecture topology. Red edges where observed diverges from intended.
Slice 3: Policies
modules/policy/: full vertical slice.
Demoable: Policies page shows real enforcement rules. Violation counts match graph.
Slice 4: Memory
modules/memory/: full vertical slice including POST.
Demoable: Memory timeline shows institutional history. Contributing entries works.
Slice 5: Queue
modules/queue/: full vertical slice including PATCH.
Demoable: Verification queue with confidence bands. Actions persist.
Slice 6: Pull Requests
modules/pull_request/: full vertical slice.
Demoable: PR #2847 blocked with blast radius. History table populated.
Slice 7: Simulation
modules/simulation/: seeded results.
Demoable: Before/after diff. "Run Simulation" returns seeded OrderService split result.
Slice 8: Search
modules/search/: config + canned results.
Demoable: Search page with sample queries returning sourced answers.
Slice 9: Notifications
modules/notification/: full vertical slice.
Demoable: Bell icon shows real notifications.
Slice 10: Dashboard
modules/dashboard/: aggregation across all other modules.
Demoable: Dashboard shows real computed metrics, domain cards, live alerts.
Build order
Slice 0 (Foundation)
→ Slice 1 (Communities) → Slice 2 (Graph)
→ Slice 3 (Policies)
→ Slice 4 (Memory)
→ Slice 5 (Queue)
→ Slice 6 (Pull Requests)
→ Slice 7 (Simulation)
→ Slice 8 (Search)
→ Slice 9 (Notifications)
→ Slice 10 (Dashboard) ← last, aggregates everything
8. Testing Strategy
Infrastructure
- pytest + pytest-asyncio
- testcontainers-python: disposable PostgreSQL, Neo4j, Redis per test session
- httpx.AsyncClient: tests the FastAPI app directly
Test layers
| Layer | What | How |
|---|---|---|
| Repository | Queries return correct data from seeded DB | Real DB containers, seed fixtures |
| Service | Business logic, computed metrics, aggregation | Real repos with test DB |
| Route | HTTP status codes, response shapes, auth | httpx.AsyncClient(transport=httpx.ASGITransport(app=app)), JWT fixtures |
| Auth | JWT validation edge cases | Crafted test JWTs, assert 401/403 |
Key test scenarios
- Auth: valid token → 200, missing → 401, expired → 401, wrong audience → 403
- Dashboard: metrics match actual DB counts; empty DB → zero metrics
- Graph: correct node/edge count per community; nonexistent community → 404
- Queue: PATCH approve persists; reject without note → 422
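The rejection-requires-note rule from the Queue scenario could be expressed as in the sketch below. The `QueueAction` dataclass, the action names, and the `ValidationError` class are hypothetical stand-ins; the spec's QueueActionIn schema and its HTTP 422 mapping are not shown in this document.

```python
from dataclasses import dataclass
from typing import Optional

ALLOWED_ACTIONS = {"approve", "reject"}  # assumed action names


class ValidationError(ValueError):
    """A route handler would translate this into an HTTP 422 response."""


@dataclass
class QueueAction:
    action: str
    note: Optional[str] = None

    def __post_init__(self) -> None:
        if self.action not in ALLOWED_ACTIONS:
            raise ValidationError(f"unknown action: {self.action!r}")
        # A blank or whitespace-only note does not count as a note.
        if self.action == "reject" and not (self.note or "").strip():
            raise ValidationError("a rejection requires a note")


approved = QueueAction("approve")  # no note needed
rejected = QueueAction("reject", note="owner left the team")

try:
    QueueAction("reject")
    rejected_without_note = False
except ValidationError:
    rejected_without_note = True
```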
Not tested in Phase 0
- Trivial Pydantic schema serialization
- Load testing (Phase 2+)
- E2E browser tests
9. Dependencies
Backend: new in pyproject.toml
sqlalchemy[asyncio] >= 2.0
asyncpg
neo4j >= 5.0
redis[hiredis] >= 5.0
python-jose[cryptography]
httpx
Backend: test dependencies
pytest >= 8.0
pytest-asyncio
testcontainers[postgres,neo4j,redis]
httpx
Frontend: no new dependencies
Existing react-oidc-context and oidc-client-ts already handle auth. API client rewrite uses native fetch.
10. Future phases (out of scope)
These are explicitly NOT part of Phase 0 but the modular structure accommodates them:
- Phase 0.5: Marketplace module + Connector plugin framework
- Phase 1: First data connectors (Git, Terraform, K8s) built on that framework
- Phase 1.5: Chat-based ingestion + LLM integration (vLLM endpoints)
- Phase 2: Governance engine (OPA evaluation, PR blocking)
- Phase 3: Reasoning & Search (GraphRAG, NL-to-Cypher)
- Phase 4: Simulation engine + Agent orchestration