
Phase 0: Backend API & UI Integration — Design Spec

Date: 2026-03-25
Status: Implemented (2026-03-26)
Scope: Wire the existing React UI to real FastAPI endpoints, replace mock JSON data with PostgreSQL- and Neo4j-backed responses, and complete the auth story end-to-end.


Implementation Status

Merged to main: 2026-03-26 — commit 2bcf0fa (96 files, 3,210 lines added)

What's done

| Slice | Status | Notes |
| --- | --- | --- |
| Slice 0: Foundation | Done | Core infra, DB clients, JWT security, base classes, app factory, migrations, seed data, nginx proxy, UI API client rewrite |
| Slice 1: Communities | Done | GET /communities, GET /communities/{slug}, GET /communities/create-data |
| Slice 2: Graph | Done | GET /communities/{slug}/graph (Neo4j Cypher queries, layout positions) |
| Slice 3: Policies | Done | GET /policies, GET /policies/create-data |
| Slice 4: Memory | Done | GET /memory, GET /memory/meta, GET /memory/contribute, POST /memory |
| Slice 5: Queue | Done | GET /queue, PATCH /queue/{id} (includes rejection-requires-note validation) |
| Slice 6: Pull Requests | Done | GET /pull-requests, GET /pull-requests/history |
| Slice 7: Simulation | Done | GET /simulation, POST /simulation/run (returns seeded results) |
| Slice 8: Search | Done | GET /search/data (config + canned results; GraphRAG is Phase 3) |
| Slice 9: Notifications | Done | GET /notifications |
| Slice 10: Dashboard | Done | GET /dashboard (aggregation queries across all modules) |

Follow-up items completed (2026-03-26)

  • Neo4j seed data loading: Automated via entrypoint.sh — Python script checks if data exists, loads schema_setup.cypher + seed_data.cypher if Neo4j is empty. Docker Compose env vars (NEO4J_USER, NEO4J_PASSWORD) wired through.
  • Backend tests: 31 tests across 12 files, all passing (0.64s). Route-level tests for all 10 modules + auth tests. Services mocked via dependency_overrides. Test deps added to pyproject.toml [project.optional-dependencies] test.
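
The seed-if-empty check described for the Neo4j loader can be sketched as follows. This is a minimal illustration under stated assumptions, not the script that entrypoint.sh actually runs: `seed_if_empty`, `split_statements`, and the `run` callable are hypothetical names, and the node-count query is one assumed way of testing whether the graph is empty. In a real script, `run` would wrap a Neo4j session.

```python
from typing import Callable, Iterable

# Hypothetical: a callable that executes one Cypher statement and returns rows.
RunCypher = Callable[[str], list[dict]]


def split_statements(script: str) -> list[str]:
    """Split a Cypher script on ';' (naive: assumes no ';' inside string literals)."""
    return [stmt.strip() for stmt in script.split(";") if stmt.strip()]


def seed_if_empty(run: RunCypher, scripts: Iterable[str]) -> bool:
    """Run the scripts only when the graph has no nodes. Returns True if seeding ran."""
    rows = run("MATCH (n) RETURN count(n) AS total")
    if rows and rows[0]["total"] > 0:
        return False  # data already present, leave it untouched
    for script in scripts:
        for statement in split_statements(script):
            run(statement)
    return True
```

Passing the schema script before the seed script preserves the order given above (schema_setup.cypher, then seed_data.cypher), and a second container start becomes a no-op because the count is no longer zero.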

What's NOT done (requires manual verification / follow-up)

  • End-to-end smoke test: docker compose up has not been run to verify the full stack boots and serves real data. This is the critical next step.
  • JWKS caching in Redis: Currently using in-memory cache with 1h TTL. Redis-backed caching is deferred to Phase 1.
  • POST /simulation/run: Currently returns the seeded result. Real simulation engine is Phase 4.
  • Search: Returns canned results. GraphRAG integration is Phase 3.
  • Repository-layer tests: Current tests mock at the service level. Integration tests with real DB containers (testcontainers) are deferred.
  • RBAC enforcement: All endpoints require auth but don't enforce role-based access (e.g., Architect-only actions). Deferred to Phase 1.

1. Decisions Summary

| Decision | Choice | Rationale |
| --- | --- | --- |
| Seed data strategy | Migration-based (Flyway SQL + Cypher) | Deterministic, version-controlled, runs on every fresh docker compose up |
| UI ↔ Backend communication | Nginx reverse proxy (Docker) + Vite proxy (dev) | Single origin, no CORS issues, one codebase for both environments |
| Authentication | Enforce JWT validation on all endpoints | OIDC infra already running; completes auth story now, avoids retrofitting |
| Neo4j exposure | Thin REST endpoints per UI need | Build exactly what the UI needs; generic query endpoint is Phase 3 |
| Backend structure | Modular monolith (domain modules) | Each module is self-contained, extractable to microservice later |
| Database drivers | SQLAlchemy async + neo4j async + redis async | ORM benefits for PostgreSQL as schema grows |
| Implementation approach | Vertical slices | Incremental demoable progress; one feature end-to-end at a time |

2. Backend Project Structure

Modular monolith where each domain module owns its own models, schemas, repository, service, and router. Modules communicate via Python Protocols (dependency inversion). To extract a module into a standalone microservice: move the folder, add a main.py, pip-install substrate-core.

backend/
  app/
    __init__.py
    main.py                        # App factory: registers modules, middleware

    core/                          # Shared kernel — the ONLY cross-module dependency
      __init__.py
      config.py                    # Pydantic Settings (single source of config truth)
      security.py                  # JWT decode, JWKS fetching, token models
      dependencies.py              # Dependency injection wiring (protocols → implementations)
      exceptions.py                # Domain exceptions + FastAPI exception handlers
      middleware.py                # CORS, request logging, error envelope
      database/
        __init__.py
        postgres.py                # async engine, sessionmaker, get_session()
        neo4j.py                   # async driver, get_driver()
        redis.py                   # async client, get_client()
      models/
        __init__.py
        base.py                    # DeclarativeBase + UUIDMixin + TimestampMixin
      schemas/
        __init__.py
        base.py                    # ResponseEnvelope[T], PaginatedResponse[T], ErrorResponse
      repository.py                # GenericRepository[M, S] — async CRUD
      service.py                   # BaseService[R] — common get/list patterns
      protocols.py                 # Shared Protocol definitions for cross-module deps

    modules/                       # Each module = future microservice boundary
      __init__.py

      dashboard/
        __init__.py
        router.py                  # GET /api/v1/dashboard
        schemas.py                 # Metric, DomainEntity, DashboardAlert, DashboardResponse
        service.py                 # Aggregates data from other modules via protocols
        dependencies.py            # Module-level DI wiring

      community/
        __init__.py
        router.py                  # GET /communities, GET /communities/{slug}, GET /communities/create-data
        models.py                  # Community, CommunityServices ORM
        schemas.py                 # CommunityOut, CommunityDetail
        repository.py              # CommunityRepository
        service.py                 # CommunityService
        dependencies.py

      graph/
        __init__.py
        router.py                  # GET /communities/{slug}/graph
        schemas.py                 # GraphNode, GraphEdge
        repository.py              # GraphRepository (Neo4j Cypher queries)
        service.py                 # GraphService
        dependencies.py

      policy/
        __init__.py
        router.py                  # GET /policies, GET /policies/create-data
        models.py                  # Policy, PolicyPack, PolicyViolation ORM
        schemas.py                 # PolicyOut, PolicyCreate, ViolationOut
        repository.py              # PolicyRepository
        service.py                 # PolicyService
        dependencies.py

      memory/
        __init__.py
        router.py                  # GET /memory, GET /memory/meta, GET /memory/contribute, POST /memory
        models.py                  # MemoryEntry ORM
        schemas.py                 # MemoryEntryOut, MemoryMetaOut, ContributeMemoryIn
        repository.py              # MemoryRepository
        service.py                 # MemoryService
        dependencies.py

      queue/
        __init__.py
        router.py                  # GET /queue, PATCH /queue/{id}
        models.py                  # QueueItem ORM
        schemas.py                 # QueueItemOut, QueueActionIn
        repository.py              # QueueRepository
        service.py                 # QueueService
        dependencies.py

      pull_request/
        __init__.py
        router.py                  # GET /pull-requests, GET /pull-requests/history
        models.py                  # PullRequest, PullRequestHistory ORM
        schemas.py                 # PullRequestOut, PrHistoryOut
        repository.py              # PullRequestRepository
        service.py                 # PullRequestService
        dependencies.py

      simulation/
        __init__.py
        router.py                  # GET /simulation, POST /simulation/run
        models.py                  # SimulationResult ORM
        schemas.py                 # SimulationIn, SimulationOut
        repository.py              # SimulationRepository
        service.py                 # SimulationService
        dependencies.py

      search/
        __init__.py
        router.py                  # GET /search/data
        schemas.py                 # SearchData, SearchIntent, SearchResult
        service.py                 # SearchService (config + canned results; GraphRAG is Phase 3)
        dependencies.py

      notification/
        __init__.py
        router.py                  # GET /notifications
        models.py                  # Notification ORM
        schemas.py                 # NotificationOut
        repository.py              # NotificationRepository
        service.py                 # NotificationService
        dependencies.py

  db/
    postgres/
      V1__initial_schema.sql       # (exists)
      V2__phase0_schema.sql        # Extended tables for all entities
      V3__seed_data.sql            # Representative seed data
    neo4j/
      schema_setup.cypher          # (exists, extended)
      seed_data.cypher             # Graph nodes, edges, communities

  tests/
    conftest.py                    # Testcontainers setup, session fixtures, test JWT factory
    test_auth.py                   # JWT validation tests
    modules/
      test_dashboard.py
      test_communities.py
      test_graph.py
      test_policies.py
      test_memory.py
      test_queue.py
      test_pull_requests.py
      test_simulation.py
      test_search.py
      test_notifications.py

SOLID & DRY Base Classes

core/models/base.py — Every ORM model inherits UUID PK + timestamps:

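import uuid
from datetime import datetime

from sqlalchemy import func
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class UUIDMixin:
    # UUID primary key generated on the Python side
    id: Mapped[uuid.UUID] = mapped_column(primary_key=True, default=uuid.uuid4)

class TimestampMixin:
    # Timestamps set by the database; updated_at is refreshed on every UPDATE
    created_at: Mapped[datetime] = mapped_column(default=func.now())
    updated_at: Mapped[datetime] = mapped_column(default=func.now(), onupdate=func.now())

class Base(DeclarativeBase, UUIDMixin, TimestampMixin): ...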

core/repository.py — Generic CRUD so each module repo only adds domain-specific queries:

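from typing import Generic, TypeVar
from uuid import UUID

from sqlalchemy.ext.asyncio import AsyncSession

ModelT = TypeVar("ModelT")    # ORM model type
SchemaT = TypeVar("SchemaT")  # Pydantic input schema type

class GenericRepository(Generic[ModelT, SchemaT]):
    def __init__(self, session: AsyncSession, model: type[ModelT]):
        self.session = session
        self.model = model

    async def get_by_id(self, id: UUID) -> ModelT | None: ...
    async def list_all(self, offset: int = 0, limit: int = 50) -> list[ModelT]: ...
    async def create(self, data: SchemaT) -> ModelT: ...
    async def update(self, id: UUID, data: SchemaT) -> ModelT: ...
    async def delete(self, id: UUID) -> None: ...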

core/protocols.py — Dependency inversion via Python Protocols:

class ServiceReader(Protocol):
    async def get_by_id(self, id: UUID) -> ServiceOut | None: ...
    async def list_all(self, offset: int, limit: int) -> list[ServiceOut]: ...

class GraphReader(Protocol):
    async def get_nodes(self, community_id: str) -> list[GraphNode]: ...
    async def get_edges(self, community_id: str) -> list[GraphEdge]: ...

core/schemas/base.py — Consistent API responses:

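from typing import Any, Generic, TypeVar

from pydantic import BaseModel

T = TypeVar("T")

class ResponseEnvelope(BaseModel, Generic[T]):
    data: T
    meta: dict | None = None

class PaginatedResponse(ResponseEnvelope[list[T]], Generic[T]):
    total: int
    offset: int
    limit: int

# Defined before ErrorResponse so the annotation below resolves at class creation
class ErrorDetail(BaseModel):
    code: str
    message: str
    detail: Any | None = None

class ErrorResponse(BaseModel):
    error: ErrorDetail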

core/service.py — Common service patterns:

class BaseService(Generic[RepoT]):
    def __init__(self, repo: RepoT):
        self.repo = repo

    async def get(self, id: UUID):
        return await self.repo.get_by_id(id)

    async def list(self, offset=0, limit=50):
        return await self.repo.list_all(offset, limit)

Request flow

Route (HTTP) → Service (logic) → Repository (data) → DB
     ↑ Depends(get_current_user)      ↑ Depends(get_session)
     ↑ Depends(get_service)           ↑ Protocol contract

Routes never touch SQLAlchemy or Neo4j. Services never parse HTTP. Repositories never know about business rules.

Cross-module communication

While the system is a monolith, module services call each other through direct Python imports, typed against protocols. After a microservice split, those calls are replaced with NATS messages or HTTP calls, and the Protocol interface stays identical.
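
A minimal sketch of this dependency inversion, using only the standard library. The `GraphReader` name follows core/protocols.py; everything else is a hypothetical illustration: the simplified `GraphNode` fields, the `count_bad_nodes` method, and the `InMemoryGraphReader` stand-in are assumptions, not the actual module code.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


@dataclass
class GraphNode:
    id: str
    status: str  # assumed values: "ok", "warn", or "bad"


class GraphReader(Protocol):
    async def get_nodes(self, community_id: str) -> list[GraphNode]: ...


class DashboardService:
    """Depends only on the GraphReader contract, not on the graph module itself."""

    def __init__(self, graph: GraphReader) -> None:
        self.graph = graph

    async def count_bad_nodes(self, community_id: str) -> int:
        nodes = await self.graph.get_nodes(community_id)
        return sum(1 for node in nodes if node.status == "bad")


class InMemoryGraphReader:
    """Stand-in implementation; an HTTP or NATS client would expose the same method."""

    def __init__(self, nodes: dict[str, list[GraphNode]]) -> None:
        self._nodes = nodes

    async def get_nodes(self, community_id: str) -> list[GraphNode]:
        return self._nodes.get(community_id, [])


reader = InMemoryGraphReader(
    {"payment": [GraphNode("order-svc", "bad"), GraphNode("gateway", "ok")]}
)
bad_count = asyncio.run(DashboardService(reader).count_bad_nodes("payment"))
```

Because `DashboardService` only sees the protocol, swapping the in-process implementation for a remote client would not require changes to the service, and the same stand-in doubles as a test fake.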

Module registration

# main.py
MODULES = [dashboard, community, graph, policy, memory,
           queue, pull_request, simulation, search, notification]

def create_app() -> FastAPI:
    app = FastAPI(title="Substrate API")
    register_middleware(app)
    for module in MODULES:
        app.include_router(module.router.router, prefix="/api/v1")
    return app

To disable a module (e.g., when extracted to its own service): remove it from the list.


3. Database Schema

PostgreSQL — V2__phase0_schema.sql

Schema aligned with the Unified Multimodal Knowledge Base (UMKB) entity model from knowledge-base.md. The API layer transforms DB models into UI-friendly response shapes.

-- ============================================================
-- DOMAIN: Services & Architecture
-- ============================================================

ALTER TABLE services ADD COLUMN IF NOT EXISTS slug TEXT UNIQUE;
ALTER TABLE services ADD COLUMN IF NOT EXISTS api_type TEXT;
ALTER TABLE services ADD COLUMN IF NOT EXISTS page_rank FLOAT DEFAULT 0;
ALTER TABLE services ADD COLUMN IF NOT EXISTS betweenness FLOAT DEFAULT 0;
ALTER TABLE services ADD COLUMN IF NOT EXISTS tension_score FLOAT DEFAULT 0;
ALTER TABLE services ADD COLUMN IF NOT EXISTS confidence FLOAT DEFAULT 1.0;
ALTER TABLE services ADD COLUMN IF NOT EXISTS verification_status TEXT DEFAULT 'unverified';
ALTER TABLE services ADD COLUMN IF NOT EXISTS owner_handle TEXT;

CREATE TABLE communities (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT NOT NULL UNIQUE,
    slug TEXT NOT NULL UNIQUE,
    color TEXT NOT NULL,
    description TEXT,
    owner_team TEXT,
    tension FLOAT DEFAULT 0,
    violation_count INT DEFAULT 0,
    trend TEXT DEFAULT 'stable',
    trend_delta INT DEFAULT 0,
    created_at TIMESTAMPTZ DEFAULT now(),
    updated_at TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE community_services (
    community_id UUID REFERENCES communities(id) ON DELETE CASCADE,
    service_id UUID REFERENCES services(id) ON DELETE CASCADE,
    PRIMARY KEY (community_id, service_id)
);

-- ============================================================
-- DOMAIN: Teams & Ownership
-- ============================================================

CREATE TABLE teams (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT NOT NULL UNIQUE,
    parent_team_id UUID REFERENCES teams(id),
    created_at TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE developers (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    github_handle TEXT UNIQUE NOT NULL,
    display_name TEXT NOT NULL,
    email TEXT,
    keycloak_id TEXT,
    active BOOLEAN DEFAULT true,
    created_at TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE team_members (
    team_id UUID REFERENCES teams(id) ON DELETE CASCADE,
    developer_id UUID REFERENCES developers(id) ON DELETE CASCADE,
    role TEXT DEFAULT 'member',
    since DATE NOT NULL,
    PRIMARY KEY (team_id, developer_id)
);

CREATE TABLE service_ownership (
    service_id UUID REFERENCES services(id) ON DELETE CASCADE,
    owner_id UUID NOT NULL,
    owner_type TEXT NOT NULL,
    is_primary BOOLEAN DEFAULT false,
    confidence FLOAT DEFAULT 1.0,
    last_verified TIMESTAMPTZ DEFAULT now(),
    since DATE NOT NULL,
    PRIMARY KEY (service_id, owner_id)
);

-- ============================================================
-- DOMAIN: Governance
-- ============================================================

CREATE TABLE policy_packs (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    slug TEXT NOT NULL UNIQUE,
    name TEXT NOT NULL,
    description TEXT,
    official BOOLEAN DEFAULT true,
    created_at TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE policies (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    policy_id TEXT NOT NULL UNIQUE,
    name TEXT NOT NULL,
    pack_id UUID REFERENCES policy_packs(id),
    level TEXT NOT NULL,
    description TEXT,
    rego_source TEXT,
    domain_scope TEXT,
    active BOOLEAN DEFAULT true,
    violation_count INT DEFAULT 0,
    created_at TIMESTAMPTZ DEFAULT now(),
    updated_at TIMESTAMPTZ DEFAULT now()
);

ALTER TABLE policy_violations ADD COLUMN IF NOT EXISTS pr_number INT;
ALTER TABLE policy_violations ADD COLUMN IF NOT EXISTS repo_full_name TEXT;
ALTER TABLE policy_violations ADD COLUMN IF NOT EXISTS affected_nodes JSONB;
ALTER TABLE policy_violations ADD COLUMN IF NOT EXISTS blast_radius_count INT;
ALTER TABLE policy_violations ADD COLUMN IF NOT EXISTS resolution_status TEXT DEFAULT 'open';
ALTER TABLE policy_violations ADD COLUMN IF NOT EXISTS resolved_at TIMESTAMPTZ;
ALTER TABLE policy_violations ADD COLUMN IF NOT EXISTS exception_node_id UUID;

-- ============================================================
-- DOMAIN: Institutional Memory
-- ============================================================

CREATE TABLE memory_entries (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    entry_type TEXT NOT NULL,
    title TEXT NOT NULL,
    body TEXT,
    author_id UUID REFERENCES developers(id),
    source_url TEXT,
    service_id UUID REFERENCES services(id),
    confidence FLOAT DEFAULT 1.0,
    entry_date DATE NOT NULL,
    expires_at DATE,
    approved_by UUID REFERENCES developers(id),
    linked_policy_id UUID REFERENCES policies(id),
    reviewed_at TIMESTAMPTZ,
    created_at TIMESTAMPTZ DEFAULT now(),
    updated_at TIMESTAMPTZ DEFAULT now()
);

-- ============================================================
-- DOMAIN: Verification Queue
-- ============================================================

CREATE TABLE queue_items (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    entity_name TEXT NOT NULL,
    entity_type TEXT NOT NULL,
    reason TEXT NOT NULL,
    confidence FLOAT NOT NULL,
    flag TEXT,
    severity TEXT NOT NULL,
    status TEXT DEFAULT 'pending',
    assigned_to UUID REFERENCES developers(id),
    resolution_note TEXT,
    created_at TIMESTAMPTZ DEFAULT now(),
    updated_at TIMESTAMPTZ DEFAULT now()
);

-- ============================================================
-- DOMAIN: Pull Requests
-- ============================================================

CREATE TABLE pull_requests (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    pr_number INT NOT NULL,
    repo_full_name TEXT,
    author_handle TEXT NOT NULL,
    title TEXT NOT NULL,
    status TEXT NOT NULL,
    violation_count INT DEFAULT 0,
    impact_summary TEXT,
    blast_radius JSONB,
    created_at TIMESTAMPTZ DEFAULT now(),
    updated_at TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE pull_request_history (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    pr_number INT NOT NULL,
    author_handle TEXT NOT NULL,
    title TEXT NOT NULL,
    policies_evaluated INT DEFAULT 0,
    result TEXT NOT NULL,
    resolution_status TEXT,
    resolved_time TEXT,
    created_at TIMESTAMPTZ DEFAULT now()
);

-- ============================================================
-- DOMAIN: Simulation
-- ============================================================

CREATE TABLE simulation_results (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    scenario TEXT NOT NULL,
    prompt TEXT NOT NULL,
    status TEXT,
    duration_ms INT,
    before_state JSONB,
    after_state JSONB,
    blast_radius_delta JSONB,
    policy_delta JSONB,
    requested_by UUID REFERENCES developers(id),
    created_at TIMESTAMPTZ DEFAULT now()
);

-- ============================================================
-- DOMAIN: Notifications & Alerts
-- ============================================================

CREATE TABLE notifications (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    notification_type TEXT NOT NULL,
    title TEXT NOT NULL,
    detail TEXT,
    recipient_id UUID REFERENCES developers(id),
    read BOOLEAN DEFAULT false,
    created_at TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE dashboard_alerts (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    alert_type TEXT NOT NULL,
    severity TEXT NOT NULL,
    title TEXT NOT NULL,
    detail TEXT,
    domain TEXT,
    acknowledged BOOLEAN DEFAULT false,
    created_at TIMESTAMPTZ DEFAULT now()
);

-- ============================================================
-- DOMAIN: Sprint Tracking
-- ============================================================

CREATE TABLE sprints (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    sprint_id TEXT NOT NULL UNIQUE,
    name TEXT NOT NULL,
    health TEXT DEFAULT 'healthy',
    velocity INT,
    target_velocity INT,
    violation_delta INT DEFAULT 0,
    debt_score FLOAT DEFAULT 0,
    started_at DATE,
    ended_at DATE,
    created_at TIMESTAMPTZ DEFAULT now()
);

Neo4j — Extended schema + seed data

Schema constraints:

CREATE CONSTRAINT service_id IF NOT EXISTS FOR (s:Service) REQUIRE s.id IS UNIQUE;
CREATE CONSTRAINT service_name IF NOT EXISTS FOR (s:Service) REQUIRE s.name IS UNIQUE;
CREATE INDEX service_domain IF NOT EXISTS FOR (s:Service) ON (s.domain);
CREATE CONSTRAINT community_id IF NOT EXISTS FOR (c:Community) REQUIRE c.id IS UNIQUE;
CREATE CONSTRAINT developer_handle IF NOT EXISTS FOR (d:Developer) REQUIRE d.github_handle IS UNIQUE;
CREATE CONSTRAINT team_name IF NOT EXISTS FOR (t:Team) REQUIRE t.name IS UNIQUE;
CREATE CONSTRAINT decision_id IF NOT EXISTS FOR (d:DecisionNode) REQUIRE d.id IS UNIQUE;
CREATE CONSTRAINT failure_id IF NOT EXISTS FOR (f:FailurePattern) REQUIRE f.id IS UNIQUE;
CREATE CONSTRAINT policy_id IF NOT EXISTS FOR (p:Policy) REQUIRE p.policy_id IS UNIQUE;

Seed data entities:

12 :Service nodes       — with domain, page_rank, tension_score, status, confidence
 5 :Developer nodes     — alice, bob, carol, dave, frank
 3 :Team nodes          — payments-team, orders-team, platform-team
 4 :Community nodes     — Payment, Auth, Orders, Infrastructure
 3 :DecisionNode nodes  — ADR-047 (gateway-first), ADR-051 (mTLS), ADR-032 (stale)
 1 :FailurePattern node — PM-019 (double-charge incident)
 1 :ExceptionNode       — FulfillmentSvc legacy direct call
 8 :Policy nodes        — the 8 governance policies

Edges:
~15 :DEPENDS_ON         — intended dependencies
~12 :ACTUALLY_CALLS     — observed runtime calls (including the violating one)
~12 :OWNS               — developer/team → service ownership
 4  :CONTAINS           — community → service membership
 3  :WHY                — decision → service/policy rationale
 1  :CAUSED_BY          — failure → service
 1  :PREVENTED_BY       — service → policy
 5  :MEMBER_OF          — developer → team

Seed data narrative

The seed data tells a coherent story that exercises every persona's core workflow:

Setting: Mid-size e-commerce team (5-9 developers) running ~12 microservices across 4 domains. Two years of operation with accumulated decisions, one survived incident, and active structural decay.

The story a user discovers:

  1. Dashboard (Executive, Architect) — Payment domain on fire: tension spiked 40% this sprint, 2 hard violations, key person (@alice) departing Sprint 25
  2. Graph (Developer, Architect) — OrderService bypassing API gateway to call PaymentService directly. Visible as red ACTUALLY_CALLS edge violating the intended DEPENDS_ON path
  3. PR Details (Developer) — PR #2847 blocked. Violation explanation links to ADR-047 (gateway-first) created after PM-019 (double-charge incident). Developer sees WHY the rule exists
  4. Memory Timeline (Tech Lead) — Traceable chain: PM-019 → ADR-047 → GOV-012 → PR #2847 blocked
  5. Queue (Architect) — 6 items across all three confidence bands (61%-89%): stale ownership, missing ADRs, duplicate docs
  6. Simulation (Architect, DevOps) — "Split OrderService?" shows blast radius shrinks, SRP violation resolves, but creates new ownership gap
  7. Search (any persona) — "Why does PaymentService route through the gateway?" returns sourced answer citing ADR-047 and PM-019
  8. Policies (Security Engineer) — 8 active policies from 6 packs with violation counts matching graph state

Computed vs stored

| Data | Strategy | Why |
| --- | --- | --- |
| Dashboard metrics | Aggregation queries across policies, communities | Always fresh |
| Memory meta (total, gaps, coverage) | COUNT/AVG on memory_entries + services | Single source of truth |
| Community violation counts | JOIN policies → violations → community_services | No stale counts |
| Policy violation counts | COUNT on policy_violations WHERE open | Always accurate |
| Node status (ok/warn/bad) | Derived from violation count + tension | Business rule |
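
As an illustration of a derived value, node status could be computed at read time along these lines. The spec only says status is derived from violation count and tension. The function name, the threshold values, and the assumption that tension is normalized to the 0 to 1 range are all hypothetical.

```python
def derive_node_status(
    open_violations: int,
    tension: float,
    warn_tension: float = 0.5,  # hypothetical threshold
    bad_tension: float = 0.8,   # hypothetical threshold
) -> str:
    """Map violation count and tension to ok/warn/bad (illustrative rule only)."""
    if open_violations > 0 or tension >= bad_tension:
        return "bad"
    if tension >= warn_tension:
        return "warn"
    return "ok"
```

Keeping the rule in one pure function means the status is never stored, so it cannot drift from the violation and tension data it depends on.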

4. Authentication & Security

JWT validation flow

UI (Keycloak login) → Bearer token in Authorization header
                          ↓
Nginx reverse proxy → /api/v1/* → Backend
                          ↓
                    FastAPI dependency: get_current_user()
                          ↓
                    1. Extract token from header
                    2. Fetch Keycloak JWKS (cached in memory, TTL 1h; Redis in Phase 1)
                    3. Decode + verify signature, expiry, audience, issuer
                    4. Return UserInfo (sub, email, roles, groups)

Implementation

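  • JWKS caching: Read jwks_uri from Keycloak's /.well-known/openid-configuration document on first request, then fetch the public keys from that URI. Keys are currently cached in memory with a 1h TTL; the Redis-backed cache from the original design is deferred to Phase 1.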
  • Token validation: python-jose decodes JWT, verifies signature against cached JWKS, checks iss = OIDC_ISSUER_URL, aud contains substrate-backend, exp not past.
  • UserInfo model: Pydantic schema with sub (UUID), email, preferred_username, realm_roles, groups. Extracted from JWT claims.
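
The implementation status notes state that JWKS keys are currently held in an in-memory cache with a 1h TTL. A minimal sketch of such a cache follows. The `TTLCache` class and its injectable `clock` are hypothetical, and the `fetch` callable stands in for the HTTP call to the JWKS URI.

```python
import time
from typing import Callable


class TTLCache:
    """Holds one fetched value and refetches it once ttl_seconds have elapsed."""

    def __init__(
        self,
        fetch: Callable[[], dict],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: dict | None = None
        self._expires_at = 0.0

    def get(self) -> dict:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._fetch()  # first call, or the entry has expired
            self._expires_at = now + self._ttl
        return self._value
```

A per-process cache like this is refetched independently by every worker, which is one reason a shared Redis cache may be preferable once the backend runs more than one instance.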

Route protection

All /api/v1/* routes require auth except /api/v1/config (UI needs it pre-login to discover OIDC issuer).

UI token plumbing

Thin wrapper reads token from react-oidc-context and makes it available outside React components for the api() fetch function.


5. Nginx Reverse Proxy & UI Wiring

Nginx config

server {
    listen 8080;
    listen 8443 ssl;

    ssl_certificate     /etc/nginx/certs/tls.crt;
    ssl_certificate_key /etc/nginx/certs/tls.key;

    root /usr/share/nginx/html;
    index index.html;

    # Reverse proxy: /api/* → backend
    location /api/ {
        proxy_pass         http://substrate-backend:8000;
        proxy_set_header   Host $host;
        proxy_set_header   X-Real-IP $remote_addr;
        proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header   X-Forwarded-Proto $scheme;
        proxy_set_header   Authorization $http_authorization;
        proxy_http_version 1.1;
    }

    # SPA fallback
    location / {
        try_files $uri $uri/ /index.html;
    }

    # Health check
    location = /healthz {
        default_type text/plain;
        return 200 'ok';
    }
}

UI API client rewrite

Replace static JSON fetcher:

// api/client.ts
const BASE = import.meta.env.VITE_BACKEND_URL || '/api/v1';

export async function api<T>(path: string, options?: RequestInit): Promise<T> {
    const token = getAccessToken();
    const res = await fetch(`${BASE}${path}`, {
        ...options,
        headers: {
            'Content-Type': 'application/json',
            ...(token ? { Authorization: `Bearer ${token}` } : {}),
            ...options?.headers,
        },
    });
    if (!res.ok) throw new ApiError(res.status, await res.text());
    return res.json();
}

substrateApi rewired to call /api/v1/* paths instead of /data/*.json.

Vite dev proxy

// vite.config.ts
server: {
    proxy: {
        '/api/v1': { target: 'http://localhost:8000', changeOrigin: true },
    },
},

Two environments, one codebase

| Environment | UI served by | API calls | Auth |
| --- | --- | --- | --- |
| docker compose up | Nginx (:8080/:8443) | Nginx proxies /api/* → backend:8000 | Keycloak container |
| npm run dev | Vite (:5173) | Vite proxies /api/v1/* → localhost:8000 | Keycloak container |

6. API Endpoints

All prefixed with /api/v1. All require JWT auth except /config.

Dashboard module

| Method | Path | Response | Source |
| --- | --- | --- | --- |
| GET | /dashboard | DashboardData | PG + Neo4j (aggregation) |

Community module

| Method | Path | Response | Source |
| --- | --- | --- | --- |
| GET | /communities | Community[] | Neo4j + PG |
| GET | /communities/{slug} | CommunityDetail | Neo4j + PG |
| GET | /communities/create-data | CreateCommunityData | PG |

Graph module

| Method | Path | Response | Source |
| --- | --- | --- | --- |
| GET | /communities/{slug}/graph | {nodes, edges} | Neo4j |

Policy module

| Method | Path | Response | Source |
| --- | --- | --- | --- |
| GET | /policies | Policy[] | PG |
| GET | /policies/create-data | CreatePolicyData | PG |

Memory module

| Method | Path | Response | Source |
| --- | --- | --- | --- |
| GET | /memory | MemoryEntry[] | PG |
| GET | /memory/meta | MemoryMetaData | PG (computed) |
| GET | /memory/contribute | ContributeMemoryData | PG |
| POST | /memory | MemoryEntry | PG |

Queue module

| Method | Path | Response | Source |
| --- | --- | --- | --- |
| GET | /queue | QueueItem[] | PG |
| PATCH | /queue/{id} | QueueItem | PG |

Pull Request module

| Method | Path | Response | Source |
| --- | --- | --- | --- |
| GET | /pull-requests | PullRequest[] | PG |
| GET | /pull-requests/history | PrHistoryItem[] | PG |

Simulation module

| Method | Path | Response | Source |
| --- | --- | --- | --- |
| GET | /simulation | SimulationData | PG |
| POST | /simulation/run | SimulationResult | PG (seeded result; real engine Phase 4) |

Search module

| Method | Path | Response | Source |
| --- | --- | --- | --- |
| GET | /search/data | SearchData | PG (config + canned results; GraphRAG Phase 3) |

Notification module

| Method | Path | Response | Source |
| --- | --- | --- | --- |
| GET | /notifications | Notification[] | PG |

Response conventions

Collections:

{ "data": [...], "meta": { "total": 42, "offset": 0, "limit": 50 } }

Single entities: object directly (no envelope).

Errors:

{ "error": { "code": "POLICY_NOT_FOUND", "message": "...", "detail": null } }
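
The two envelope shapes above can be produced by small helpers like the following. This is a sketch with hypothetical function names; it builds plain dictionaries rather than using the Pydantic models from core/schemas/base.py.

```python
from typing import Any


def collection_response(
    items: list[Any], total: int, offset: int = 0, limit: int = 50
) -> dict:
    """Wrap one page of items in the collection envelope."""
    return {"data": items, "meta": {"total": total, "offset": offset, "limit": limit}}


def error_response(code: str, message: str, detail: Any = None) -> dict:
    """Build the error envelope."""
    return {"error": {"code": code, "message": message, "detail": detail}}
```

For example, `error_response("POLICY_NOT_FOUND", "...")` yields the error shape shown above with `detail` set to null once serialized as JSON.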


7. Implementation Order — Vertical Slices

Each slice is independently demoable. Dependencies flow top-down.

Slice 0: Foundation

Core infrastructure that all modules depend on:

  • core/: config, security, exceptions, middleware, database clients
  • core/models/base.py: ORM base classes
  • core/schemas/base.py: response envelope, error response
  • core/repository.py: GenericRepository
  • core/service.py: BaseService
  • core/dependencies.py: DI wiring
  • main.py: app factory, module registration
  • V2__phase0_schema.sql + V3__seed_data.sql + seed_data.cypher
  • Nginx config + UI api/client.ts rewrite + Vite proxy + token plumbing

Demoable: docker compose up → Keycloak login → authenticated API calls return 200.

Slice 1: Communities

modules/community/ — full vertical slice.

Demoable: Communities page lists real domains with tension/violations.

Slice 2: Graph

modules/graph/ — Neo4j Cypher queries.

Demoable: Graph page renders real architecture topology. Red edges where observed diverges from intended.

Slice 3: Policies

modules/policy/ — full vertical slice.

Demoable: Policies page shows real enforcement rules. Violation counts match graph.

Slice 4: Memory

modules/memory/ — full vertical slice including POST.

Demoable: Memory timeline shows institutional history. Contributing entries works.

Slice 5: Queue

modules/queue/ — full vertical slice including PATCH.

Demoable: Verification queue with confidence bands. Actions persist.

Slice 6: Pull Requests

modules/pull_request/ — full vertical slice.

Demoable: PR #2847 blocked with blast radius. History table populated.

Slice 7: Simulation

modules/simulation/ — seeded results.

Demoable: Before/after diff. "Run Simulation" returns seeded OrderService split result.

Slice 8: Search

modules/search/ — config + canned results.

Demoable: Search page with sample queries returning sourced answers.

Slice 9: Notifications

modules/notification/ — full vertical slice.

Demoable: Bell icon shows real notifications.

Slice 10: Dashboard

modules/dashboard/ — aggregation across all other modules.

Demoable: Dashboard shows real computed metrics, domain cards, live alerts.

Build order

Slice 0 (Foundation)
  → Slice 1 (Communities) → Slice 2 (Graph)
  → Slice 3 (Policies)
  → Slice 4 (Memory)
  → Slice 5 (Queue)
  → Slice 6 (Pull Requests)
  → Slice 7 (Simulation)
  → Slice 8 (Search)
  → Slice 9 (Notifications)
  → Slice 10 (Dashboard) ← last, aggregates everything

8. Testing Strategy

Infrastructure

  • pytest + pytest-asyncio
  • testcontainers-python — disposable PostgreSQL, Neo4j, Redis per test session
  • httpx.AsyncClient — tests FastAPI app directly

Test layers

| Layer | What | How |
| --- | --- | --- |
| Repository | Queries return correct data from seeded DB | Real DB containers, seed fixtures |
| Service | Business logic, computed metrics, aggregation | Real repos with test DB |
| Route | HTTP status codes, response shapes, auth | httpx.AsyncClient(transport=httpx.ASGITransport(app=app)), JWT fixtures |
| Auth | JWT validation edge cases | Crafted test JWTs, assert 401/403 |

Key test scenarios

  • Auth: valid token → 200, missing → 401, expired → 401, wrong audience → 403
  • Dashboard: metrics match actual DB counts; empty DB → zero metrics
  • Graph: correct node/edge count per community; nonexistent community → 404
  • Queue: PATCH approve persists; reject without note → 422

Not tested in Phase 0

  • Trivial Pydantic schema serialization
  • Load testing (Phase 2+)
  • E2E browser tests

9. Dependencies

Backend — new in pyproject.toml

sqlalchemy[asyncio] >= 2.0
asyncpg
neo4j >= 5.0
redis[hiredis] >= 5.0
python-jose[cryptography]
httpx

Backend — test dependencies

pytest >= 8.0
pytest-asyncio
testcontainers[postgres,neo4j,redis]
httpx

Frontend — no new dependencies

Existing react-oidc-context and oidc-client-ts already handle auth. API client rewrite uses native fetch.


10. Future phases (out of scope)

These are explicitly NOT part of Phase 0 but the modular structure accommodates them:

  • Phase 0.5: Marketplace module + Connector plugin framework
  • Phase 1: First data connectors (Git, Terraform, K8s) built on that framework
  • Phase 1.5: Chat-based ingestion + LLM integration (vLLM endpoints)
  • Phase 2: Governance engine (OPA evaluation, PR blocking)
  • Phase 3: Reasoning & Search (GraphRAG, NL-to-Cypher)
  • Phase 4: Simulation engine + Agent orchestration