System overview¶
Scribe IQ is a governed clinical documentation AI prototype that turns an offline synthetic clinical corpus into a clinical product surface with auditability built in from the schema up: Postgres/pgvector serving, FastAPI APIs, provider-agnostic LLM workflows, a Next.js clinical UI, and Responsible AI review. LLM and embedding providers are configurable per deployment: the demo defaults to Groq for completions and OpenAI for embeddings, but the same code paths run against Azure OpenAI or Amazon Bedrock for institutional postures. AI interaction audit is modeled as first-class Postgres data in ai_interactions; admin routes and UI only control whether those rows are exposed for inspection.
This document is the architecture story: diagrams, capability flags, extension seams. For rationale and alternatives considered, see DESIGN_NOTES.md. For exact routes, schema, and flags, see docs/architecture/IMPLEMENTED_BASELINE.md. For the provider configuration matrix, see docs/guides/LLM_AND_EMBEDDING_PROVIDERS.md.
Runtime flow¶
flowchart TD
Browser["Browser (Next.js)"]
API["FastAPI (asyncpg pool)"]
DB["Postgres 16 + pgvector"]
LLM["LLM provider<br/>(Groq / Azure OpenAI / Bedrock)"]
EMB["Embedding provider<br/>(OpenAI / Azure OpenAI / Bedrock)"]
AuditTbl["ai_interactions append"]
Browser -->|"REST + X-Request-ID"| API
API --> DB
API -->|"chat / note gen / meeting prep"| LLM
API -->|"embed query / note"| EMB
API -->|"append-only audit"| AuditTbl
AuditTbl --> DB
The request path is intentionally short: the browser calls FastAPI, FastAPI queries Postgres directly, and only the AI-touching routes (chat, note generation, meeting prep) reach external LLM/embedding services. Every request carries an X-Request-ID that propagates through structured logs, making user-visible actions traceable end-to-end without logging PHI in bodies.
Corpus lifecycle¶
The corpus is built offline, not on demand. The application reads a stable, validated artifact; the data pipeline is a separate concern that can be re-run independently.
flowchart LR
Synthea["Synthea JAR (seed 42)"]
NotePool["Note pool (ACI-Bench + MTSamples + MedSynth)"]
Pipeline["data_prep scripts 01-09"]
JSONL["clinical_corpus_v2/ JSONL"]
Loader["scribe-load-corpus"]
Tables["patients + notes + embeddings (Postgres)"]
Synthea --> Pipeline
NotePool --> Pipeline
Pipeline --> JSONL
JSONL --> Loader
Loader --> Tables
Synthea produces a deterministic synthetic patient population. The note pool contributes realistic clinical narrative from public datasets. The nine-step data_prep pipeline matches notes to synthetic patients, scores quality, selects a cohort, adapts notes via the configured LLM provider for consistency, and emits validated JSONL with a dataset card and audit report. The backend loader (scribe-load-corpus) upserts that JSONL into Postgres; with --embed, it generates embeddings via the configured embedding provider (OpenAI, Azure OpenAI, or Amazon Bedrock — see docs/guides/LLM_AND_EMBEDDING_PROVIDERS.md) into the notes.embedding vector column.
For pipeline detail, see data_prep/README.md and docs/reference/corpus_offline_pipeline_v2_brief.md.
Capability flags¶
Health (GET /health) reports which capabilities are configured; the UI surfaces degraded states instead of failing silently.
| Flag | What it unlocks | Default |
|---|---|---|
LLM_PROVIDER |
Selects LLM provider (groq, azure_openai, bedrock) for chat, pre-meeting summaries, and note generation |
groq |
GROQ_API_KEY |
Groq completions when LLM_PROVIDER=groq |
unset |
AZURE_OPENAI_API_KEY + AZURE_OPENAI_ENDPOINT + AZURE_OPENAI_CHAT_DEPLOYMENT |
Azure OpenAI completions when LLM_PROVIDER=azure_openai |
unset |
AWS_REGION + AWS_BEDROCK_CHAT_MODEL_ID (+ AWS credentials or role/profile) |
Bedrock completions when LLM_PROVIDER=bedrock |
unset |
EMBEDDING_PROVIDER + provider credentials + scribe-load-corpus --embed |
RAG chat over note embeddings; chat returns 503 until embeddings exist | openai (unset credentials) |
NOTE_GENERATION_ENABLED |
POST /notes/generate accepts writes |
false |
MEETING_PREP_ENABLED |
GET /patients/{id}/meeting-prep produces LLM summaries |
true |
RESPONSIBLE_AI_ADMIN_ENABLED (backend) |
/admin/responsible-ai/* admin routes mounted |
false |
NEXT_PUBLIC_SCRIBE_ADMIN_UI (frontend) |
Responsible AI Control Center nav and pages | false |
BACKEND_API_KEY |
API key gate on all non-public routes | unset |
CORS_RELAX_LOCAL |
Local/LAN demo CORS regex | false |
Provider boundary¶
The LLM and embedding providers are the only external network boundaries on the AI request path. Everything else — patient data, notes, encounter context, audit rows — stays inside the Postgres instance for the duration of a request.
- What leaves the deployment: the selected prompt context (system instructions, retrieved note excerpts, user messages, transcripts) is sent to the configured LLM provider; the query and note text required for embeddings is sent to the configured embedding provider.
- What does not leave: identifiers, raw chart rows, audit table contents, structured codes, request bodies. The provider sees only what the route handler explicitly serializes into the prompt or embedding payload.
- Per-deployment, not per-request:
LLM_PROVIDERandEMBEDDING_PROVIDERare settings, not knobs that change at runtime. Switching providers in production would be a deployment-level decision with an embedding rebuild step where applicable.
For the full provider matrix, environment variables, and rebuild workflow, see docs/guides/LLM_AND_EMBEDDING_PROVIDERS.md. For PHI posture and the explicit caveat that enterprise providers do not by themselves create compliance, see PRIVACY_AND_PROVIDER_BOUNDARIES.md.
Architecture themes (pointers)¶
These themes are intentional; alternatives considered, depth, and production deltas live in DESIGN_NOTES.md.
- Colocation: vectors live in Postgres with relational rows; corpus is produced offline and loaded, not generated per request.
- Grounding: chat is retrieval-first with a citation-shaped prompt contract (
[note:uuid]), not a post-hoc verifier loop. - Governance:
ai_interactionsis a first-class table for completed and handled degraded AI paths; admin UI is optional exposure.
Extension points¶
| Extension | Where it plugs in |
|---|---|
| Alternative LLM provider | app/llm/ — Settings.llm_provider |
| Alternative embedding provider | app/embeddings/ — Settings.embedding_provider (openai / azure_openai / bedrock / none) |
| Agentic tool loop for chat | app/api/chat.py — single-shot today; tool calls can extend without changing audit shape |
| Production authentication | OptionalApiKeyMiddleware — replace with SSO/RBAC at the middleware layer |
| Audio transcription | docs/roadmap/SCRIBE_IQ_UI_ROADMAP.md §12 — POST /transcribe before POST /notes/generate |
| Multi-tenant isolation | domain on patients / notes — row-level or pool work, not a new data model |
Repository anchors¶
| Concern | Location |
|---|---|
| Local Postgres + pgvector | docker-compose.yml (host port 5433) |
| Backend | backend/ |
| Frontend | frontend/ |
| Corpus pipeline | data_prep/ |
| Generated corpus artifact | data/clinical_corpus_v2/ |
| As-built API and schema | docs/architecture/IMPLEMENTED_BASELINE.md |
| Run instructions | docs/guides/QUICKSTART.md |
| Provider configuration | docs/guides/LLM_AND_EMBEDDING_PROVIDERS.md |
| Product framing | docs/overview/PRODUCT_CONTEXT.md |
| Design rationale | docs/overview/DESIGN_NOTES.md |