About & Notice

What this is

scribe-iq-lakehouse is a portfolio engineering artifact — a production-pattern healthcare data lakehouse built to be reviewed, run, and reasoned about. It is the data-platform layer beneath a small family of clinical-AI projects (see Downstream & Portfolio).

Data & privacy notice

The lakehouse is built entirely on the Synthea Coherent synthetic dataset (AWS Open Data, no credentials required). It contains no real patient information and no PHI. Genomic content is simulated inheritance, not clinical variants (see Responsible Data). It is not intended for clinical decision-making.

License

MIT. The synthetic source dataset is published by the Synthea project under its own open terms.

Source & companion work

How the docs are maintained

  • The Data Dictionary and the Gold JSON Schema are generated from code and checked by a CI gate, so they cannot silently drift from the code (ADR-011).
  • Decisions are recorded as ADRs; the Architecture page tracks the as-built system; the Changelog records what changed.
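
A drift gate of this kind is commonly built by regenerating the artifact in CI and comparing it with the committed copy. The sketch below illustrates that general pattern only; it is not taken from this repository, and ADR-011 remains the authority on how the real gate works. `GOLD_MODEL`, `render_schema`, `is_in_sync`, and `main` are hypothetical names introduced for illustration.

```python
import json
import sys
from pathlib import Path

# Hypothetical stand-in for the in-code model that defines a Gold schema.
GOLD_MODEL = {
    "title": "gold_example",
    "type": "object",
    "properties": {"patient_id": {"type": "string"}},
    "required": ["patient_id"],
}


def render_schema(model: dict) -> str:
    """Render the schema deterministically so byte comparison is meaningful."""
    return json.dumps(model, indent=2, sort_keys=True) + "\n"


def is_in_sync(committed: Path, model: dict) -> bool:
    """Return True only if the committed file equals the freshly generated text."""
    if not committed.exists():
        return False
    return committed.read_text(encoding="utf-8") == render_schema(model)


def main(path: str) -> int:
    """Return a process exit code: 0 when in sync, 1 when the file has drifted."""
    if is_in_sync(Path(path), GOLD_MODEL):
        return 0
    print(f"{path} is out of date; regenerate it from code", file=sys.stderr)
    return 1
```

A CI step would call something like `sys.exit(main("path/to/schema.json"))`, with the path being whatever the project actually commits, so that a stale generated file fails the build. Sorting keys and fixing the indent keeps the output stable across runs, which is what makes a plain string comparison sufficient.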

Contact

Sandeep Jayaprakash — github.com/sandeep-jay.