DATAP.AI HEALTH
AI Governance & Compliance
Privacy-aware AI routing, full audit trail, and immutable session archival. No black boxes.
Total Requests
12,847
+8.2% vs last month
PHI Detected
342
+2.1% vs last month
Models Active
7
Compliance Score
94%
+1.5% vs last month
DATAP.AI Document Processing Pipeline
DATAP.AI Health processes clinical documents through 4 layers of privacy protection. Raw text containing patient identifiers (Medicare, IHI, MRN) is processed exclusively by HIPAA-compliant AI providers. De-identified text is sent to frontier models for the best clinical reasoning quality.
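The identifier-detection step described above could look roughly like the following Python sketch. It is an illustration only, not DATAP.AI's implementation: the names `PHI_PATTERNS`, `detect_phi`, and `redact_phi` are hypothetical, and the regular expressions are simplified assumptions (a 10-digit Medicare number starting with 2 to 6, a 16-digit IHI starting with 800360, and an MRN that carries an explicit "MRN" label, since MRNs have no national format). A production detector would also validate check digits.

```python
import re

# Simplified, illustrative patterns (assumptions, not the platform's rules).
PHI_PATTERNS = {
    # IHI: 16 digits beginning 800360, optionally grouped in fours.
    "IHI": re.compile(r"\b8003[ -]?60\d{2}[ -]?\d{4}[ -]?\d{4}\b"),
    # Medicare card number: 10 digits, often written as 4-5-1 groups.
    "MEDICARE": re.compile(r"\b[2-6]\d{3}[ -]?\d{5}[ -]?\d\b"),
    # MRN: site-specific, so this assumes an explicit "MRN" label.
    "MRN": re.compile(r"\bMRN[:#]?\s*\d{6,10}\b", re.IGNORECASE),
}


def detect_phi(text):
    """Return (label, matched_text) pairs for every identifier found."""
    hits = []
    for label, pattern in PHI_PATTERNS.items():
        hits.extend((label, m.group(0)) for m in pattern.finditer(text))
    return hits


def redact_phi(text):
    """Replace each detected identifier with a bracketed label."""
    for label, pattern in PHI_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


sample = "Patient Medicare 2123 45670 1, IHI 8003 6012 3456 7890, MRN: 00123456."
print(detect_phi(sample))
print(redact_phi(sample))
```

Text for which `detect_phi` returns any hits would be treated as raw PHI and kept on the HIPAA-compliant path; only the output of `redact_phi` would be eligible for other models.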
DATAP.AI Privacy-Aware LLM Router
DATAP.AI classifies every healthcare AI task by PHI risk level, then routes to the appropriate provider. Fireworks AI (HIPAA-compliant, BAA signed) handles 8 of 11 tasks. Google Gemini handles 3 patient-facing tasks where reasoning quality is paramount.
Healthcare AI — LLM Routing Table
Each healthcare AI task is routed to a specific provider and model based on PHI risk, clinical reasoning requirements, and cost.
| Task | Provider | Model | Why |
|---|---|---|---|
| Customer chat (copilot widget) | Gemini | 2.5 Flash | Fast, cheap, good UX |
| Clinical copilot (backend) | Bedrock | Claude Sonnet 4.6 | Best clinical reasoning, AU data residency |
| Document Q&A | Bedrock | Claude Sonnet 4.6 | Long-document comprehension |
| Compliance reports | Bedrock | Claude Opus 4.6 | Highest writing + regulatory accuracy |
| PHI scanning | Fireworks | DeepSeek V3 | HIPAA BAA, sees raw patient data |
| Document NER | Fireworks | DeepSeek V3 | HIPAA BAA, raw clinical text |
| Regulatory signals | Fireworks | DeepSeek V3 | Public data, high volume, cheap |
| Embeddings | Fireworks | Nomic Embed | HIPAA, residual PHI risk |
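A routing table like the one above can be expressed as a simple lookup with a PHI guard. The sketch below is a minimal illustration under stated assumptions: the task keys, the `Route` type, the `route_task` function, and the `RAW_PHI_PROVIDERS` set are all hypothetical names, and treating Fireworks as the only provider cleared for raw PHI is an assumption drawn from the "Why" column, not a confirmed platform rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    provider: str
    model: str


# Mirrors the rows of the routing table; the task keys are hypothetical.
ROUTING_TABLE = {
    "customer_chat": Route("Gemini", "2.5 Flash"),
    "clinical_copilot": Route("Bedrock", "Claude Sonnet 4.6"),
    "document_qa": Route("Bedrock", "Claude Sonnet 4.6"),
    "compliance_reports": Route("Bedrock", "Claude Opus 4.6"),
    "phi_scanning": Route("Fireworks", "DeepSeek V3"),
    "document_ner": Route("Fireworks", "DeepSeek V3"),
    "regulatory_signals": Route("Fireworks", "DeepSeek V3"),
    "embeddings": Route("Fireworks", "Nomic Embed"),
}

# Assumption: only these providers may receive raw (not de-identified) text.
RAW_PHI_PROVIDERS = {"Fireworks"}


def route_task(task, contains_raw_phi):
    """Look up the route for a task, refusing to send raw PHI elsewhere."""
    route = ROUTING_TABLE[task]
    if contains_raw_phi and route.provider not in RAW_PHI_PROVIDERS:
        raise ValueError(
            f"{task!r} routes to {route.provider}; de-identify the text first"
        )
    return route


print(route_task("document_ner", contains_raw_phi=True))
print(route_task("customer_chat", contains_raw_phi=False))
```

Raising an error instead of silently rerouting is a design choice in this sketch: it makes a misclassified request visible to the caller rather than changing the model behind a task.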
Healthcare AI Architecture — Four Intelligences
Every healthcare AI interaction at DATAP.AI runs on four intelligence layers. Each layer is auditable, configurable, and compliant by default.
| Pillar | Intelligence | Components | Tools & Services |
|---|---|---|---|
| AI | Artificial Intelligence | Multi-agent LLM orchestration, privacy-aware routing, clinical NER, PHI detection | Gemini, Fireworks AI, ag2, LanceDB |
| BI | Business Intelligence | Dashboards, Text-to-SQL, AI-generated clinical insights | Lightdash, Vanna, Recharts |
| CI | Customer Intelligence | CRM-backed AI chatbot, patient & practitioner management, interaction audit | ERPNext Healthcare, REST API |
| DI | Data Intelligence | Chat archival to S3 Parquet, Glue catalog, Athena queries, immutable audit trail | pyarrow, boto3, Athena, Glue |
AI Audit Trail — Three-Tier Traceability
Every AI conversation is persisted to three independent stores. The S3 cold tier is immutable — once a Parquet file is written, it is never modified or deleted. This guarantees a tamper-proof audit trail for regulatory review.
Hot Tier
PostgreSQL (framework_db) · 90 days
Every AI chat message (user question, AI response, model, and tokens) is persisted in real time for the active UI and LLM context recall.
Access: Live queries via clinical chat endpoint
CRM Tier
ERPNext (crm-health.datap.ai) · Forever
Each chat interaction logged as a Note on the practitioner timeline. Human reviewers see the full conversation history in the CRM admin.
Access: ERPNext admin UI + REST API
Cold Tier
S3 Parquet (codepais3 bucket) · Forever (immutable)
Weekly archival exports chat history as Hive-partitioned, Snappy-compressed Parquet files, each verified with a read-after-write check. Once written, files are never modified, which keeps the audit trail immutable.
Access: AWS Athena / DuckDB / Glue Crawler
S3 Immutable Session Archive — Hive Partitioning
s3://codepais3/stock/raw/chat_history/year=2026/month=04/day=12/part-00000-*.parquet
s3://codepais3/health/raw/chat_history/year=2026/month=04/day=12/part-00000-*.parquet
Each file contains: message_id, session_id, user_id, role, content, model_used, tokens_used, context_sources, created_at. Queryable via AWS Athena or DuckDB. Glue Crawler auto-discovers new partitions.
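The Hive-style layout shown in these paths can be derived from the archive date. The following Python sketch builds the partition prefix and checks a record against the listed columns; the helper names `partition_prefix` and `missing_fields` and the `domain` parameter are hypothetical, introduced only for illustration.

```python
from datetime import date

# Column names as listed for each archived Parquet file.
ARCHIVE_FIELDS = (
    "message_id", "session_id", "user_id", "role", "content",
    "model_used", "tokens_used", "context_sources", "created_at",
)


def partition_prefix(domain, day, bucket="codepais3"):
    """Build the Hive-partitioned S3 prefix for one day of chat history."""
    return (
        f"s3://{bucket}/{domain}/raw/chat_history/"
        f"year={day.year:04d}/month={day.month:02d}/day={day.day:02d}/"
    )


def missing_fields(record):
    """Return the archive columns that a record dict does not provide."""
    return [name for name in ARCHIVE_FIELDS if name not in record]


print(partition_prefix("health", date(2026, 4, 12)))
print(missing_fields({"message_id": "m-1", "session_id": "s-1"}))
```

Zero-padding the month and day keeps the prefixes lexically sortable, so listing the bucket returns partitions in date order.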
Healthcare Data Standards
DATAP.AI processes clinical data using international and Australian healthcare standards.
| Standard | Full Name | What It Does | Australian Equivalent |
|---|---|---|---|
| FHIR R4 | Fast Healthcare Interoperability Resources (HL7) | Standard format for exchanging clinical data between healthcare systems | Australian Digital Health Agency adopted FHIR as national standard. My Health Record uses FHIR R4. |
| HL7 v2 | Health Level Seven (messaging protocol) | Legacy messaging format used between hospital systems | Still widely used in Australian hospitals and pathology labs |
| HIPAA | US Health Insurance Portability and Accountability Act | US law governing protection of patient health data. Requires BAA with vendors who handle PHI. | Australian Privacy Act 1988 + Health Records Act. Australian Privacy Principles (APPs) govern health data. |
| PHI | Protected Health Information | Any data that can identify a patient — names, Medicare numbers, medical record numbers, dates of birth | In Australia: Medicare number, IHI (Individual Healthcare Identifier), MRN (Medical Record Number), DVA numbers |
| SOC2 | System and Organization Controls 2 (formerly Service Organization Control) | Independent security audit verifying data protection controls | IRAP or ISO 27001 are the Australian equivalents for government/healthcare |
| TGA | Therapeutic Goods Administration | Australia's regulatory body for medical devices, medicines, and biologicals. | Equivalent to US FDA, EU EMA. DATAP.AI monitors TGA but does NOT require TGA approval. |
| BAA | Business Associate Agreement | Legal contract with AI/cloud vendors ensuring they protect patient data. | No direct AU equivalent, but APP 8 and contractual privacy clauses serve similar purpose under the Privacy Act. |
DATAP.AI Technology Partners
Fireworks AI
$4B valuation | Sequoia Capital-backed
- HIPAA + SOC2 compliant with signed BAA
- Zero data retention — patient data never stored
- 140B+ tokens/day, 99.99% uptime
- 5-10x cheaper than proprietary models
- Handles 8 of 11 healthcare AI tasks
Google Gemini
Frontier reasoning model
- Highest quality clinical reasoning
- 2M token context window
- Google Search grounding for real-time data
- Used for patient-facing responses only
- Handles 3 of 11 healthcare AI tasks
How DATAP.AI Addresses Healthcare AI Governance
Patient Privacy (Australian Privacy Act)
DATAP.AI detects Australian healthcare identifiers (Medicare, IHI, MRN), de-identifies them using the HIPAA Safe Harbor method, and routes high-risk tasks to HIPAA-compliant providers. The 4-layer defence-in-depth design means no single layer is a single point of failure.
AI Transparency (TGA Feb 2026 Guidance)
Every AI decision is logged with model name, provider, data classification level, and full audit trail. The LLM routing table is exposed via API for governance review. DATAP.AI builds governance into the platform from day one.
Bias Detection (AI Ethics)
DATAP.AI monitors statistical parity and equalised odds across demographic dimensions and 8 Asia-Pacific languages. Bias reports are generated automatically and available via the governance dashboard.
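For readers unfamiliar with the two metrics, here is a minimal Python sketch of how they are commonly computed for two groups with binary labels and predictions. The function names are hypothetical and this is the textbook definition, not necessarily how DATAP.AI's bias reports are generated. Statistical parity compares the rate of positive predictions between groups; equalised odds compares true-positive and false-positive rates.

```python
def positive_rate(preds):
    """Share of predictions that are positive (1)."""
    return sum(preds) / len(preds)


def statistical_parity_difference(preds_a, preds_b):
    """P(pred=1 | group A) minus P(pred=1 | group B); 0 means parity."""
    return positive_rate(preds_a) - positive_rate(preds_b)


def _tpr_fpr(y_true, y_pred):
    # Assumes each group has at least one positive and one negative label.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    positives = sum(y_true)
    negatives = len(y_true) - positives
    return tp / positives, fp / negatives


def equalised_odds_gaps(true_a, pred_a, true_b, pred_b):
    """Absolute TPR and FPR gaps between groups; (0, 0) means equal odds."""
    tpr_a, fpr_a = _tpr_fpr(true_a, pred_a)
    tpr_b, fpr_b = _tpr_fpr(true_b, pred_b)
    return abs(tpr_a - tpr_b), abs(fpr_a - fpr_b)


# Toy data: the model is more permissive for group B than for group A.
true_a, pred_a = [1, 1, 0, 0], [1, 0, 0, 0]
true_b, pred_b = [1, 1, 0, 0], [1, 1, 1, 0]
print(statistical_parity_difference(pred_a, pred_b))
print(equalised_odds_gaps(true_a, pred_a, true_b, pred_b))
```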
Cost Control (Operational Governance)
CostGuard enforces daily LLM spend limits per provider. Multi-provider routing optimises cost-per-task — $0.56/1M tokens for bulk work, frontier models only where clinical reasoning demands it. Critical for B2B pricing in APAC markets.
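A per-provider daily spend limit of this kind can be sketched in a few lines of Python. The class below borrows the name CostGuard for readability, but its interface (`charge`, `spent`, the `daily_limits_usd` mapping) and the $1.00 example limit are assumptions made for illustration, not the platform's actual API or configuration.

```python
from collections import defaultdict
from datetime import date


class CostGuard:
    """Track LLM spend per provider per day and reject over-limit calls."""

    def __init__(self, daily_limits_usd):
        self.daily_limits_usd = dict(daily_limits_usd)
        self._spend = defaultdict(float)  # (provider, day) -> USD spent

    def spent(self, provider, day=None):
        return self._spend[(provider, day or date.today())]

    def charge(self, provider, tokens, usd_per_million, day=None):
        """Record the cost of a call, or raise if it would exceed the limit."""
        day = day or date.today()
        cost = tokens / 1_000_000 * usd_per_million
        if self._spend[(provider, day)] + cost > self.daily_limits_usd[provider]:
            raise RuntimeError(f"daily spend limit reached for {provider}")
        self._spend[(provider, day)] += cost
        return cost


guard = CostGuard({"Fireworks": 1.00})  # hypothetical limit in USD
guard.charge("Fireworks", tokens=1_000_000, usd_per_million=0.56)
print(guard.spent("Fireworks"))
```

Checking the limit before recording the cost means a rejected call leaves the running total unchanged, so the caller can retry against a cheaper route.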
Governance Modules
DATAP.AI Live Routing Table (API)
DATAP.AI exposes the full LLM routing table as a live API endpoint for governance audit and compliance review:
GET https://healthapi.datap.ai/agent/llm-routing
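A governance reviewer could call this endpoint with nothing more than the Python standard library, as in the sketch below. The page does not specify authentication or the response schema, so the code assumes an unauthenticated JSON response and returns whatever is decoded; the function names are hypothetical.

```python
import json
import urllib.request

ROUTING_URL = "https://healthapi.datap.ai/agent/llm-routing"


def build_routing_request(url=ROUTING_URL):
    """Build the GET request; add auth headers here if the API needs them."""
    return urllib.request.Request(
        url, headers={"Accept": "application/json"}, method="GET"
    )


def fetch_routing_table(url=ROUTING_URL, timeout=10):
    """Fetch and decode the routing table (assumes a JSON response body)."""
    request = build_routing_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_routing_table()` performs the network request; saving each response alongside a timestamp would give reviewers a record of how the routing table changed over time.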