Complete Use Case Matrix: Roadmap Reference

Overview

Every Substrate capability is expressed as a discrete use case with a defined trigger, model involvement, delivery mode, and latency target. Use cases are the unit of implementation planning, acceptance testing, and performance verification.

Use cases are grouped into seven domains: Ingestion (ING), Governance (GOV), Reasoning and Search (RSN), Proactive Maintenance (PRO), Simulation (SIM), Agent Orchestration (AOC), and Active Governance and Lifecycle (AGV).

The Model column indicates which AI model, if any, is involved. "None (deterministic)" means the function is entirely deterministic, with no LLM inference. "None (OPA/Rego)" means the decision is made by OPA policy evaluation. These distinctions matter: deterministic functions have predictable, bounded latency; LLM-involved functions have probabilistic latency and are held to p95 targets.
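
To make the p95 wording concrete, here is a minimal sketch of checking a latency target against measured samples. It assumes the nearest-rank percentile definition; the function names and the sample values are hypothetical and not part of Substrate.

```python
def p95(samples_ms):
    """Return the 95th-percentile latency using the nearest-rank method."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    # Integer form of ceil(0.95 * n): the 1-based nearest rank.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_target(samples_ms, target_ms):
    """True when the p95 of the samples is strictly below the target."""
    return p95(samples_ms) < target_ms


# Hypothetical end-to-end measurements (milliseconds) for a use case
# with a "< 2 seconds" target.
samples = [850, 900, 1200, 950, 1100, 1000, 990, 1800, 1050, 2400]
print(p95(samples), meets_target(samples, 2000))
```

A deterministic use case could instead be checked against its worst observed latency, since no sampling variance from LLM inference is expected.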

Ingestion Use Cases (ING)

These use cases populate the Observed Graph and Institutional Memory layer. The Ingestion Service processes all inbound events from source connectors.

UC ID	Use Case	Model	Trigger	Mode
ING-UC-01	GitHub PR opened → AST parsed via Rust CLI → code graph delta (Function and Module node changes, new CALLS and IMPORTS edges) → NATS emit to Governance Service	Qwen2.5-Coder (for complex AST enrichment); Rust CLI (deterministic for basic parse)	PR open webhook (`pull_request.opened`, `pull_request.synchronize`)	Event-driven, async
ING-UC-02	GitHub Projects v2 item status updated → SprintNode health field updated in Neo4j; NATS event emitted	Dense extract-lora	`projects_v2_item.edited` webhook	Event-driven, async
ING-UC-03	GitHub Pages build completed → documentation staleness delta computed; doc coverage score updated on linked Service node	Dense extract-lora	6-hour scheduled poll of Pages build status API	Scheduled
ING-UC-04	SSH host inspection → running process list, open ports, and network connections captured as JSON → diff against declared InfraResource topology in Neo4j → discrepancies written as observations	None (fully deterministic)	15-minute Celery beat schedule + on-demand via FastAPI gateway endpoint	Scheduled + on-demand
ING-UC-05	Terraform apply completes → Terraform state file parsed → InfraResource nodes updated → drift from prior Terraform state computed → NATS event emitted	Dense extract-lora	Post-apply webhook or state file poll	Event-driven, async
ING-UC-06	ADR file committed to GitHub repository → full ADR text extracted → DecisionNode created with rationale, source_url, decision_date → WHY edges created to affected Service and Policy nodes	MoE Scout (Llama 4 Scout)	Git push webhook on paths matching `docs/adr/`, `adr/`, `architecture/**`	Event-driven, async
ING-UC-07	Post-mortem published to Confluence or GitHub Pages → FailurePattern node created with root_cause, lessons, incident_date → CAUSES edges created to affected service nodes → policy gap check triggered (PRO-UC-08)	MoE Scout	Confluence webhook (page_created, page_updated) / GitHub Pages push webhook	Event-driven, async
ING-UC-08	CVE feed polled → new vulnerabilities classified for relevance against current dependency graph → affected DEPENDS_ON edges and Module nodes identified and flagged with severity badge	Dense extract-lora	15-minute Celery beat schedule (OSV.dev / NVD feed)	Scheduled
ING-UC-09	Nightly entity resolution pass: all service names from all sources canonicalized using Dense resolve-lora; duplicate node candidates identified; auto-merged above threshold; human-queued below threshold	Dense resolve-lora	Celery beat 2:15 AM daily (after nightly ingestion completes)	Scheduled
ING-UC-10	Jira ticket created → IntentAssertion node created with linked_ticket, source_text, intent_embedding → linked to relevant Service nodes via entity resolution on ticket component/label fields	Dense extract-lora	Jira webhook (`jira:issue_created`)	Event-driven, async (Nice to Have for MVP; basic Jira connector covers sprint close)
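
As an illustration of the deterministic diff in ING-UC-04, the sketch below compares observed listening ports with a declared topology and emits observations. The JSON shape (`open_ports`, `port`, `process`) and the observation kinds are assumptions made for the example; the actual inspection payload is not specified in this document.

```python
import json


def diff_ports(observed_json, declared_ports):
    """Compare observed listening ports against the declared topology.

    observed_json: JSON string with a hypothetical "open_ports" list.
    declared_ports: mapping of port -> declared service name.
    Returns a list of observation dicts, sorted by port within each kind.
    """
    observed = {
        entry["port"]: entry["process"]
        for entry in json.loads(observed_json)["open_ports"]
    }
    declared = dict(declared_ports)
    observations = []
    # Ports that are open on the host but absent from the declared topology.
    for port in sorted(observed.keys() - declared.keys()):
        observations.append(
            {"kind": "undeclared_listener", "port": port, "process": observed[port]}
        )
    # Ports that are declared but were not observed on the host.
    for port in sorted(declared.keys() - observed.keys()):
        observations.append(
            {"kind": "declared_not_running", "port": port, "service": declared[port]}
        )
    return observations


snapshot = json.dumps(
    {"open_ports": [{"port": 22, "process": "sshd"}, {"port": 8080, "process": "java"}]}
)
for observation in diff_ports(snapshot, {22: "ssh", 5432: "postgres"}):
    print(observation)
```

Because the comparison is plain set arithmetic, it needs no model, which matches the "None (fully deterministic)" entry for this use case.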

Governance Use Cases (GOV)

These use cases enforce architectural policies and surface violations. The Governance Service processes ING outputs and applies OPA/Rego evaluation.

UC ID	Use Case	Model	Mode	Latency Target
GOV-UC-01	PR graph delta evaluated against all active OPA policy packs → pass/fail result → if fail, plain English violation explanation generated → GitHub Checks API result posted → violation comment with ADR link posted to PR	OPA/Rego (decision); Dense explain-lora (explanation only)	Concurrent, blocking (PR cannot merge until check completes)	< 2 seconds end-to-end from PR webhook to Checks API result
GOV-UC-02	New npm, pip, Go, or Maven dependency added in PR → license of new dependency checked against approved license list in PostgreSQL policy store → if conflict detected, warning comment posted to PR	Dense explain-lora (for conflict explanation)	Concurrent, non-blocking (warning comment, not a hard block unless configured)	< 1 second OPA evaluation; < 2 seconds with explanation
GOV-UC-03	Terraform plan result contains infrastructure deviation from declared topology → deviation details extracted → plain English alert generated with diff → alert posted to Drift dashboard and optionally to PR	Dense explain-lora	Concurrent; triggered by Terraform plan webhook	< 2 seconds from Terraform plan event to alert
GOV-UC-04	Node selected in Architecture Graph UI → blast radius computed via PageRank-weighted DEPENDS_ON/ACTUALLY_CALLS traversal (up to 3 hops) → affected services ranked by impact → visualization rendered in UI	MoE Scout (for narrative blast radius summary); Cypher (for traversal, deterministic)	On demand (UI interaction)	< 3 seconds from selection to visualization
GOV-UC-05	PR violation explanation expanded with institutional memory context → linked ADR content and linked post-mortem lesson surfaced inline in PR comment → source URLs provided	MoE Scout	Part of GOV-UC-01 pipeline; runs after OPA evaluation	< 2 seconds (part of GOV-UC-01 budget)
GOV-UC-06	Architect approves policy exception via UI → ExceptionNode created with rationale, approved_by, expires_at, policy_id → ExceptionNode linked to violating Service and Policy nodes → audit log entry written	Dense extract-lora (for exception metadata extraction)	On Architect approval action (UI button)	Async; confirmation within 5 seconds
GOV-UC-07	SSH runtime inspection diff exceeds configured drift threshold (e.g., undeclared service on port, declared service not running) → runtime violation raised in Governance Service → alert posted to Drift dashboard	None (fully deterministic)	Scheduled comparison (runs after every ING-UC-04 completion)	< 1 minute from SSH inspection completion to alert visible in UI
GOV-UC-08	SOLID violation detected: service efferent coupling count exceeds configured threshold (default: 5 outgoing DEPENDS_ON edges to external services outside the service's domain) → policy violation raised on PR	None (OPA/Rego evaluation)	On PR graph delta (synchronous with GOV-UC-01 pipeline)	< 500ms OPA evaluation
GOV-UC-09	TDD coverage violation: test coverage percentage on a Service node would fall below the configured threshold (default: 80%) once the PR is merged → soft-mandatory violation posted to PR (blocks merge if configured as hard)	None (OPA/Rego evaluation)	On PR graph delta	< 500ms OPA evaluation
GOV-UC-10	API-first violation: REST service (detected by api_type property) has no OpenAPI spec node in the graph (no linked documentation node with spec type) → policy violation raised on PR	None (OPA/Rego evaluation)	On PR graph delta	< 500ms OPA evaluation
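
The efferent coupling rule in GOV-UC-08 is evaluated by OPA/Rego in the platform. The sketch below restates the same logic in Python purely for illustration; the edge-list and domain-map inputs are hypothetical stand-ins for the PR graph delta.

```python
from collections import defaultdict

DEFAULT_THRESHOLD = 5  # default stated in GOV-UC-08


def efferent_coupling_violations(depends_on, domain_of, threshold=DEFAULT_THRESHOLD):
    """Return {service: count} for services over the coupling threshold.

    depends_on: iterable of (source_service, target_service) DEPENDS_ON edges.
    domain_of: mapping of service name -> domain name.
    Only edges that leave the source service's own domain are counted.
    """
    external_targets = defaultdict(set)
    for source, target in depends_on:
        if domain_of[source] != domain_of[target]:
            external_targets[source].add(target)
    return {
        service: len(targets)
        for service, targets in external_targets.items()
        if len(targets) > threshold
    }


edges = [("checkout", f"svc{i}") for i in range(6)] + [("cart", "svc0")]
domains = {"checkout": "commerce", "cart": "commerce"}
domains.update({f"svc{i}": "platform" for i in range(6)})
print(efferent_coupling_violations(edges, domains))
```

The check uses a strict comparison, so a service with exactly five outgoing cross-domain edges passes under the default threshold.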

Reasoning and Search Use Cases (RSN)

These use cases answer natural language and structured questions about the graph. The Reasoning Service orchestrates retrieval strategy selection and LLM invocation.

UC ID	Use Case	Retrieval Strategy	Latency Target
RSN-UC-01	"What does PaymentService depend on?" — returns full direct and transitive dependency tree with confidence scores and domain classification	Local GraphRAG: direct entity recognition (Dense cypher-lora) → Cypher DEPENDS_ON traversal (1–5 hops) → structured result formatting	< 1 second (pure graph traversal; no LLM generation for basic result)
RSN-UC-02	"Who owns the checkout service?" — returns primary and secondary owners (Developer + Team) with ownership confidence scores and CODEOWNERS data	Local GraphRAG: OWNS edge traversal + CODEOWNERS document retrieval + MEMBER_OF team traversal	< 500ms (deterministic graph traversal after entity resolution)
RSN-UC-03	Intent mismatch detection: code changes in a PR do not match the stated intent in the linked GitHub Projects item or Jira ticket → mismatch alert posted to PR as a warning	Hybrid embedding similarity: bge-m3 embedding of PR description + diff summary vs bge-m3 embedding of linked IntentAssertion → cosine similarity → bge-reranker-v2-m3 RRF fusion	< 1 second (embedding comparison; no LLM generation)
RSN-UC-04	"What are our top 3 architectural risks?" — system-wide analysis returning the most critical structural risks with reasoning	Global GraphRAG: Leiden community detection → per-community MoE Scout summarization → RAPTOR map-reduce aggregation across communities → final answer generation	< 8 seconds end-to-end (multiple LLM calls with community summaries)
RSN-UC-05	"Why was this architectural decision made?" — returns the DecisionNode(s) explaining the decision with source URLs, linked failure patterns, and institutional memory context	Memory retrieval: NL entity extraction → DecisionNode traversal via WHY edges → FailurePattern traversal via CAUSES edges → MoE Scout narrative generation with source citations	< 5 seconds end-to-end
RSN-UC-06	"What changed before last Friday's incident?" — returns structural graph changes in the specified time window with causal relevance ranking	Temporal graph snapshot diff: NL time reference parsing → PostgreSQL snapshot retrieval → set-difference graph diff → HyDE query expansion for relevance ranking → MoE Scout narrative summary	< 5 seconds end-to-end
RSN-UC-07	"Find all services calling auth directly" — translates NL to Cypher, executes, returns structured result with confidence	Cypher translation: Dense cypher-lora NL→Cypher → Cypher execution against Neo4j → result formatting	< 1 second (Cypher translation < 500ms; graph execution < 500ms)
RSN-UC-08	"Which services does this epic affect?" — maps a GitHub Projects epic or Jira epic to service nodes via IntentAssertion links and graph traversal	Project item traversal: IntentAssertion node lookup → linked Service node traversal → DEPENDS_ON expansion for transitive impact → MoE Scout summary if transitive impact is non-trivial	< 5 seconds end-to-end
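
The embedding comparison at the core of RSN-UC-03 can be sketched as follows. This covers only the cosine similarity step, not the bge-m3 embedding or the reranking stage; the toy vectors and the 0.5 mismatch threshold are assumptions, since the document does not state a threshold.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def intent_mismatch(pr_vector, intent_vector, threshold=0.5):
    """Flag a mismatch when similarity falls below the (hypothetical) threshold."""
    return cosine_similarity(pr_vector, intent_vector) < threshold


# Toy 3-dimensional vectors standing in for real embeddings.
pr_embedding = [0.9, 0.1, 0.0]
intent_embedding = [0.1, 0.9, 0.2]
print(intent_mismatch(pr_embedding, intent_embedding))
```

No LLM generation is involved in this step, which is why the use case carries a sub-second target.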

Proactive Maintenance Use Cases (PRO)

These use cases surface insight without the user asking. The Proactive Maintenance Service runs on scheduled triggers and real-time graph change events.

UC ID	Use Case	Model	Mode	Latency / Frequency
PRO-UC-01	Structural tension threshold breached on a domain or service → plain English drift alert generated with the specific edges and nodes contributing to the tension → alert posted to Drift dashboard feed	Dense explain-lora	Real-time (triggered by Ingestion Service on tension score update)	< 1 minute from tension threshold breach to alert visible in UI
PRO-UC-02	Sprint close event received → structural debt report generated: violation delta (introduced vs resolved), trending violations, top 3 structural action items, velocity vs debt correlation → report delivered to Scrum Master role	MoE Scout	Sprint close webhook (GitHub Projects v2 iteration_close / Jira sprint_closed)	< 30 seconds from sprint close event to report visible in UI
PRO-UC-03	Undocumented services detected (no linked documentation nodes, no README coverage) + orphaned documentation detected (documentation nodes whose only links to Service nodes are stale WHY edges) → both lists surfaced in Verification Queue	bge-m3 embedding similarity (documentation coverage check)	Nightly 2:00 AM Celery beat	Batch; results available at 3:00 AM
PRO-UC-04	Tribal knowledge extraction: pass over all newly ingested PR comments, ADR text, and Confluence documents since last run → extract implicit decisions and rationale → create MemoryNode candidates with confidence scores → route to Verification Queue	Dense extract-lora	Nightly 2:00 AM Celery beat	Batch; runs concurrently with PRO-UC-03
PRO-UC-05	Domain drift trend alert: compare domain tension scores over 7-day rolling window → detect accelerating drift trend → alert Engineering Manager and Architect roles with trend chart data	Dense explain-lora (for narrative trend summary)	Daily Celery beat (6:00 AM)	Async daily
PRO-UC-06	ADR gap detection: identify Service nodes with no WHY edges (no architectural decisions explain why this service exists or how it was designed) → gap list surfaced in Verification Queue and sprint retro report	bge-m3 embedding (for prioritizing gaps by architectural centrality)	Nightly Celery beat (4:30 AM)	Batch; nightly
PRO-UC-07	Graft pattern rewrite suggestion: PR introduces a code pattern that matches a known anti-pattern (detected by AST analysis or high structural tension) → Qwen2.5-Coder generates a refactoring suggestion as a PR comment	Qwen2.5-Coder	On PR open (triggered by ING-UC-01 completion)	< 5 seconds (socket-activated Qwen2.5-Coder; suggestion is a comment, not a blocking check)
PRO-UC-08	Post-mortem ingested (ING-UC-07 completed) → policy gap check triggered: does the root cause described in the post-mortem correspond to a violation class that is not currently covered by an active OPA policy? → if gap detected, Architect alerted with suggested policy category	MoE Scout	On post-mortem ingest completion (triggered by ING-UC-07)	Async; alert within 2 minutes of post-mortem ingest
PRO-UC-09	Key-person risk detection: identify Service nodes where a single Developer holds the sole OWNS edge (no team ownership, no secondary owner) → ranked by service PageRank (most critical single-owner services first) → weekly report to Engineering Manager	bge-m3 (for service criticality ranking supplementation)	Weekly Celery beat (Monday 5:00 AM)	Batch; weekly report
PRO-UC-10	Duplicate documentation detection: compute pairwise cosine similarity between bge-m3 embeddings of all documentation nodes → flag pairs with similarity > 0.85 as probable duplicates → surfaced in Verification Queue for consolidation	bge-m3	Nightly Celery beat (3:00 AM)	Batch; nightly
PRO-UC-11	Daily structural digest for Team Lead: aggregate violation delta from prior day, top 3 newly introduced risks, memory gap count, and one recommended action item → delivered by 9:00 AM	Dense explain-lora	Daily Celery beat (8:30 AM generation; 9:00 AM delivery target)	Daily; async generation
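
The selection and ranking logic of PRO-UC-09 can be sketched as below, assuming OWNS edges arrive as `(owner, owner_kind, service)` tuples and PageRank scores are precomputed. Both input shapes are hypothetical; in the platform these values would come from the graph.

```python
def key_person_risks(owns_edges, pagerank):
    """Return (service, developer) pairs with a single Developer owner.

    owns_edges: iterable of (owner_id, owner_kind, service), where
        owner_kind is "Developer" or "Team".
    pagerank: mapping of service -> PageRank score.
    Results are ordered from most to least critical service.
    """
    owners_by_service = {}
    for owner, kind, service in owns_edges:
        owners_by_service.setdefault(service, []).append((owner, kind))
    risky = [
        (service, owners[0][0])
        for service, owners in owners_by_service.items()
        if len(owners) == 1 and owners[0][1] == "Developer"
    ]
    return sorted(risky, key=lambda pair: pagerank.get(pair[0], 0.0), reverse=True)


owns = [
    ("alice", "Developer", "billing"),
    ("bob", "Developer", "search"),
    ("platform-team", "Team", "search"),
    ("carol", "Developer", "auth"),
]
scores = {"billing": 0.12, "search": 0.30, "auth": 0.41}
print(key_person_risks(owns, scores))
```

Here `search` is excluded because a team also owns it, and `auth` ranks ahead of `billing` because of its higher PageRank score.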

Simulation Use Cases (SIM)

These use cases answer "what happens if?" questions using a Neo4j named database sandbox that does not modify the production graph.

Simulations that mutate the graph follow the same pipeline: create sandbox database (`CREATE DATABASE substrate_sim_{id} IF NOT EXISTS`) → copy relevant subgraph to sandbox → apply NL-translated Cypher mutations → run OPA evaluation on sandbox → compute before/after policy delta → render result in Simulation Panel → drop sandbox after 1 hour. SIM-UC-04 is the exception: it changes only the OPA evaluation scope and needs no graph sandbox.

UC ID	Use Case	Description	Latency Target
SIM-UC-01	"What happens if I split OrderService into OrderService and FulfillmentService?"	Split mutation: create two new Service nodes, redistribute DEPENDS_ON and ACTUALLY_CALLS edges proportionally, apply ownership inheritance → run OPA on modified sandbox → compute new violation set → diff against current violations → render blast radius of the split	< 15 seconds end-to-end
SIM-UC-02	"What breaks if I upgrade axios to version 1.x?"	Dependency version mutation: update DEPENDS_ON edge version property for axios → identify all modules using axios across all services (transitive DEPENDS_ON traversal) → run OPA license and compatibility checks → flag breaking changes based on known CVE/breaking change data	< 10 seconds end-to-end
SIM-UC-03	"What is the blast radius of removing the API gateway?"	Removal mutation: delete APIGateway Service node and all its edges from sandbox → compute which ACTUALLY_CALLS edges become direct violations of domain boundary policies → rank affected services by PageRank-weighted impact score → render blast radius visualization	< 10 seconds end-to-end
SIM-UC-04	"If we add this policy, what currently passes that would now break?"	Policy addition simulation: load new Rego policy pack into OPA server sandbox scope → run OPA evaluation against current Observed Graph (no graph mutation needed) → return list of services and edges that would newly violate → rank by severity and PageRank	< 10 seconds (no graph sandbox needed; only OPA evaluation scope changes)
SIM-UC-05	"We're planning to add a new NotificationService and a MessageQueueService in the next sprint — what policies would they violate and who would own them?"	Sprint planning pre-simulation: create placeholder Service nodes with proposed attributes → connect them to existing graph based on team ownership and domain assignment → run full OPA evaluation → return violation forecast and suggested ownership assignment	< 20 seconds end-to-end (requires creating multiple nodes and edges in sandbox)
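
The sandbox pipeline described before the table can be sketched in memory as follows. This is a simplified stand-in, not the Neo4j and OPA implementation: the graph is a plain dict, a deep copy plays the role of the sandbox database, and `no_inbound_policy` is a made-up policy used only to produce a before/after delta.

```python
import copy


def simulate(production, mutation, policy):
    """Apply a mutation to a sandbox copy and return the policy delta."""
    sandbox = copy.deepcopy(production)  # stands in for the sandbox database
    before = set(policy(production))
    mutation(sandbox)                    # production graph is never touched
    after = set(policy(sandbox))
    return {"introduced": after - before, "resolved": before - after}


def remove_node(name):
    """Build a removal mutation that deletes a node and all of its edges."""
    def apply(graph):
        graph["nodes"].discard(name)
        graph["edges"] = {(s, d) for s, d in graph["edges"] if name not in (s, d)}
    return apply


def no_inbound_policy(graph):
    """Hypothetical policy: every service except the entry point needs a caller."""
    called = {dst for _, dst in graph["edges"]}
    return {
        (node, "no_inbound_caller")
        for node in graph["nodes"]
        if node != "web" and node not in called
    }


graph = {
    "nodes": {"web", "gateway", "orders"},
    "edges": {("web", "gateway"), ("gateway", "orders")},
}
delta = simulate(graph, remove_node("gateway"), no_inbound_policy)
print(delta)
```

Keeping the mutation and the policy as separate callables mirrors the pipeline stages: the same `simulate` function could be reused for a split or a version mutation by swapping the mutation argument.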

Agent Orchestration Use Cases (AOC)

These use cases involve multi-step automated workflows with human-in-the-loop gates. The Agent Orchestration Service manages workflow state; all automated actions are logged to the immutable audit table.

UC ID	Use Case	Description
AOC-UC-01	Fix PR generation workflow	A 10-step state machine triggered when a violation is detected and an Architect or Developer initiates a Fix PR request: (1) Ingest violation context from Governance Service; (2) Dense explain-lora generates violation explanation and proposed fix strategy; (3) HITL gate — present to Developer for approval to proceed; (4) Qwen2.5-Coder generates implementation diff based on fix strategy; (5) Simulation — apply diff to sandbox, run OPA evaluation, confirm violations are resolved and no new violations are introduced; (6) HITL gate — present diff and simulation results to Developer/Architect for final approval; (7) Open Fix PR on GitHub via API with generated diff; (8) Monitor Fix PR CI results (poll GitHub Checks API every 60 seconds); (9) On PR merge, re-ingest via ING-UC-01, confirm violation is resolved in Neo4j; (10) Write immutable audit log entry with full action trace
AOC-UC-02	HITL approval gate with 24-hour timeout	At each HITL gate, the workflow suspends. The approver is notified via in-app notification and Slack (if configured). If no response is received within 24 hours, the workflow escalates to the Architect role. If no response after an additional 4 hours (28 hours total), the workflow is automatically cancelled and the violation remains unresolved in the queue. All gate states (pending, approved, rejected, escalated, timed_out) are recorded in the audit log with timestamps.
AOC-UC-03	Rollback on rejection	If any HITL gate is rejected, or if any automated step fails (e.g., Qwen2.5-Coder generates a diff that does not pass the simulation validation), all preceding automated actions are reverted: the Fix PR is closed if it was opened, the sandbox is dropped, and the workflow state is reset to "requires_manual_resolution". The audit log retains the full trace of the attempted fix for future reference.
AOC-UC-04	Confidence-based auto-proceed	HITL gates can be bypassed when all steps in the workflow to that point exceed a configurable confidence threshold (default: 90%). The threshold is per-organization, per-violation-type, and configurable by Admin role. When auto-proceed is triggered, the audit log records the confidence scores for each step and the fact that no human approved the action. This is an explicit opt-in feature, not a default behavior, to prevent silent automated changes.
AOC-UC-05	Immutable audit log for all agent actions	Every action taken by the Agent Orchestration Service is recorded as an append-only entry in the PostgreSQL audit table. Each entry contains: `workflow_id` (UUID), `step_number` (1–10), `timestamp`, `actor` (user ID for HITL steps; service account ID for automated steps), `action` (string description), `input_hash` (SHA-256 of the step's input), `output_hash` (SHA-256 of the step's output), `confidence` (float), `reasoning` (LLM-generated rationale if applicable), `approved_by` (user ID for HITL steps), `auto_proceeded` (boolean). This log is the primary evidence for compliance and retrospective audits.
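
A minimal sketch of two pieces from the table above: the gate timing rules from AOC-UC-02 and the audit entry fields from AOC-UC-05. The field names and state names come from the table; the function names and the choice of canonical JSON as the hashing input are assumptions.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone

ESCALATE_AFTER = timedelta(hours=24)  # escalate to the Architect role
CANCEL_AFTER = timedelta(hours=28)    # 24 hours plus the additional 4 hours


def gate_state(opened_at, now, decision=None):
    """Return the HITL gate state given its age and any recorded decision."""
    if decision in ("approved", "rejected"):
        return decision
    waited = now - opened_at
    if waited >= CANCEL_AFTER:
        return "timed_out"
    if waited >= ESCALATE_AFTER:
        return "escalated"
    return "pending"


def sha256_hex(payload):
    """SHA-256 of a JSON-serializable payload (canonical JSON is an assumption)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit_entry(workflow_id, step_number, actor, action, step_input, step_output,
                confidence, reasoning=None, approved_by=None, auto_proceeded=False):
    """Build one append-only audit record with the fields listed in AOC-UC-05."""
    return {
        "workflow_id": workflow_id,
        "step_number": step_number,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "action": action,
        "input_hash": sha256_hex(step_input),
        "output_hash": sha256_hex(step_output),
        "confidence": confidence,
        "reasoning": reasoning,
        "approved_by": approved_by,
        "auto_proceeded": auto_proceeded,
    }


opened = datetime(2025, 1, 6, 9, 0, tzinfo=timezone.utc)
print(gate_state(opened, opened + timedelta(hours=25)))
```

Storing hashes rather than raw inputs keeps each entry small while still allowing a later check that a step's recorded input and output were not altered.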

Active Governance and Lifecycle Use Cases (AGV)

These use cases address the third core problem solved by the platform: active governance of living knowledge artifacts.

UC ID	Use Case	Model	Mode	Latency / Frequency
AGV-UC-01	Lifecycle labeling on artifact update: every incoming ADR/doc/ticket/comment/file is labeled (`latest`, `active`, `stale`, `outdated`, `incomplete`, `archived`, `oldest_snapshot`) after LLM + deterministic checks	Dense extract-lora + deterministic freshness rules	Event-driven per artifact update	< 10 seconds per artifact
AGV-UC-02	Chunk profile selection and vectorization: choose profile by source type, chunk content, embed in pgvector, and attach to graph entities for semantic search	None for profile routing; bge-m3 for embeddings	Event-driven + nightly backfill	< 5 seconds per artifact; nightly backfill batch
AGV-UC-03	Automated Leiden community refresh: update only impacted communities after graph deltas and regenerate community summaries	Neo4j GDS + MoE Scout for summaries	Event-driven incremental + nightly full pass	< 2 minutes incremental; nightly full recompute
AGV-UC-04	Policy gap detection from incidents: when post-mortem or repeated violations are ingested, propose draft policy packs for architect approval	MoE Scout	Event-driven	Alert within 2 minutes
AGV-UC-05	Delegated remediation assignment: stale/outdated/incomplete artifacts are routed to accountable user groups with SLA-based escalation	None (deterministic ownership routing)	Event-driven + scheduled escalation	Assignment < 1 minute from detection
AGV-UC-06	Curated response update: owner response is normalized and formatted, then applied to affected artifacts and graph links	Dense explain-lora (curation)	On user response	< 30 seconds after response
AGV-UC-07	Post-remediation revalidation: rerun impacted governance checks and update violation status automatically	OPA/Rego + deterministic graph diff	Event-driven on update completion	< 2 seconds for scoped re-check
AGV-UC-08	Executive governance digest: summarize risk trends, stale knowledge backlog, and policy-compliance posture by quarter	MoE Scout + deterministic metrics	Weekly and monthly scheduled jobs	Weekly digest by Monday 9:00 AM
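
The deterministic part of AGV-UC-02 (profile routing and chunking) might look like the sketch below. The profile names, sizes, and overlaps are hypothetical, as is character-based chunking; the document specifies only that a profile is chosen by source type. Embedding with bge-m3 and storage in pgvector are out of scope here.

```python
# Hypothetical chunk profiles keyed by source type (sizes in characters).
PROFILES = {
    "adr": {"size": 800, "overlap": 100},
    "comment": {"size": 300, "overlap": 0},
}
DEFAULT_PROFILE = {"size": 500, "overlap": 50}


def select_profile(source_type):
    """Pick a chunk profile by source type, falling back to the default."""
    return PROFILES.get(source_type, DEFAULT_PROFILE)


def chunk_text(text, size, overlap):
    """Split text into windows of `size` characters sharing `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the final window already reaches the end of the text
        start += step
    return chunks


profile = select_profile("adr")
print(len(chunk_text("x" * 2000, **profile)))
```

Stopping once a window reaches the end of the text avoids emitting a trailing chunk that is fully contained in the previous one.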

Role-to-Use-Case Coverage (Execution Focus)

Role	Primary Use Cases	Typical Simulation Style
Developer	GOV-UC-01, RSN-UC-01, RSN-UC-03, AGV-UC-06	PR-scope impact and dependency checks
Business Analyst	RSN-UC-05, RSN-UC-08, AGV-UC-01, AGV-UC-04	Business-rule impact simulations
Solutions Architect	SIM-UC-01, SIM-UC-04, GOV-UC-05, AGV-UC-03	Future-state topology simulations
Enterprise Architect	RSN-UC-04, SIM-UC-04, PRO-UC-05, AGV-UC-08	Cross-domain portfolio simulations
Product Owner	RSN-UC-08, SIM-UC-05, PRO-UC-02, AGV-UC-05	Epic impact and dependency simulations
DevOps Engineer	ING-UC-04, GOV-UC-07, SIM-UC-03, AGV-UC-05	Infra-change blast radius simulations
SRE	ING-UC-04, GOV-UC-07, RSN-UC-06, PRO-UC-05	Incident time-window and drift simulations
Product Manager	RSN-UC-04, SIM-UC-05, PRO-UC-11, AGV-UC-08	Roadmap feasibility simulations
Platform Engineer	ING-UC-08, GOV-UC-02, SIM-UC-02, AGV-UC-02	Upgrade/platform dependency simulations
Security Engineer	GOV-UC-02, GOV-UC-07, RSN-UC-07, AGV-UC-04	Threat-path and CVE propagation simulations
Scrum Master	PRO-UC-02, PRO-UC-11, AGV-UC-05, AGV-UC-08	Sprint health and structural debt simulations
Project Manager	RSN-UC-08, PRO-UC-11, AGV-UC-05, AGV-UC-08	Timeline-risk and dependency bottleneck simulations
Executive	RSN-UC-04, PRO-UC-05, PRO-UC-11, AGV-UC-08	Portfolio risk and investment-option simulations