MAPLE: Model-Aware Parameterization from Literature Evidence

quantitative systems pharmacology, Bayesian inference, prior elicitation, literature extraction, simulation-based inference, pancreatic cancer, LLM-assisted extraction

Summary

Quantitative systems pharmacology (QSP) models have many biological parameters, and most cannot be measured directly in the clinical context being modeled. The relevant measurements are usually scattered across papers, often from different species or indications. MAPLE provides a structured pipeline for turning those measurements into informative priors that account for the gap between the experimental and model contexts.

Extraction

MAPLE provides two structured YAML schemas for extracting calibration data from papers.

SubmodelTargets — in vitro and preclinical data with self-contained forward models (algebraic, dose-response, power law, ODE). Each target pairs extracted data with a forward model, bootstrap observation code, and a source relevance assessment.
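As a rough illustration of how a target can pair extracted data with an algebraic forward model and bootstrap code, the sketch below inverts a doubling-time measurement into a growth-rate prior. The function names, the doubling-time example, the lognormal choice, and the data values are all assumptions made for this sketch; they are not taken from the MAPLE schema, and the translation sigma is left out.

```python
import math
import random
import statistics

def forward_model(growth_rate):
    """Algebraic forward model: doubling time (h) implied by a growth rate (1/h)."""
    return math.log(2) / growth_rate

def invert_forward_model(doubling_time):
    """Closed-form inverse: growth rate (1/h) implied by a doubling time (h)."""
    return math.log(2) / doubling_time

def bootstrap_lognormal_prior(observations, n_boot=2000, seed=0):
    """Resample the data, invert the forward model, and fit a lognormal.

    Returns (mu, sigma) of log(growth_rate).
    """
    rng = random.Random(seed)
    log_rates = []
    for _ in range(n_boot):
        # Nonparametric bootstrap: resample the observations with replacement.
        resample = rng.choices(observations, k=len(observations))
        rate = invert_forward_model(statistics.fmean(resample))
        log_rates.append(math.log(rate))
    return statistics.fmean(log_rates), statistics.stdev(log_rates)

# Made-up doubling times in hours, for illustration only.
doubling_times_h = [30.0, 34.0, 28.0, 36.0, 32.0]
mu, sigma = bootstrap_lognormal_prior(doubling_times_h)
print(f"log-growth-rate prior: mu={mu:.3f}, sigma={sigma:.3f}")
```

Fitting on the log scale keeps the prior strictly positive, which suits rate parameters. Forward models without a closed-form inverse (for example ODE-based ones) would need a numerical inversion step in place of invert_forward_model.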

CalibrationTargets — clinical and in vivo observables (tumor volumes, immune cell densities, treatment response rates) requiring full QSP model simulation. These capture the observable, experimental context, intervention scenarios, and Monte Carlo derivation code.

Both schemas are filled out interactively using an MCP server with Claude Code. The server exposes the extraction prompt, valid enum values, a multi-step workflow guide, and hard rules that LLMs commonly violate during extraction (e.g., inventing uncertainties, using wrong input types).

A validate_target tool runs three levels of checks:

Schema validation — Pydantic model with 30+ validators
Prior derivation — bootstrap + forward model inversion + distribution fitting + translation sigma
Snippet verification — checks that every extracted value appears in the source paper text (Europe PMC full text or source PDFs), catching hallucinated numbers before they enter the pipeline
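The snippet verification idea can be sketched in a few lines. This is a minimal version under stated assumptions: the field names (name, value, snippet), the normalization rules, and the example text are hypothetical, and the actual validate_target tool may work differently.

```python
import re

def normalize(text):
    """Lowercase, unify the Unicode minus sign, and collapse whitespace."""
    text = text.replace("\u2212", "-").replace("\u00a0", " ")
    return re.sub(r"\s+", " ", text).strip().lower()

def verify_snippets(extracted, paper_text):
    """Return (name, reason) for every extracted value that cannot be traced.

    Each item is a dict with hypothetical keys:
      name    - label for the extracted quantity
      value   - the number exactly as printed in the paper (a string)
      snippet - the quoted passage the value was taken from
    """
    haystack = normalize(paper_text)
    failures = []
    for item in extracted:
        snippet = normalize(item["snippet"])
        # Match the value as a whole number, not as part of a longer one.
        pattern = rf"(?<![\d.]){re.escape(item['value'])}(?!\.?\d)"
        if snippet not in haystack:
            failures.append((item["name"], "snippet not found in source text"))
        elif not re.search(pattern, snippet):
            failures.append((item["name"], "value not present in snippet"))
    return failures

# Made-up source text and extractions, for illustration only.
paper_text = "Cells had a doubling time of 32.5 h (n = 3). Tumor volume was 412 mm3."
extracted = [
    {"name": "doubling_time", "value": "32.5", "snippet": "doubling time of 32.5 h"},
    {"name": "tumor_volume", "value": "421", "snippet": "Tumor volume was 421 mm3"},
    {"name": "replicates", "value": "5", "snippet": "(n = 3)"},
]
failures = verify_snippets(extracted, paper_text)
for name, reason in failures:
    print(f"{name}: {reason}")
```

Storing the value as the string printed in the paper avoids false mismatches from float formatting (for example 30 versus 30.0).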

Inference

All SubmodelTargets are combined into a joint NumPyro model for MCMC inference. A source relevance rubric scores each target across eight axes, including species, indication, tumor microenvironment (TME) compatibility, and measurement directness, and maps the scores to a translation sigma that widens the likelihood for that target. As a result, mouse in vitro data contributes less than human clinical data constraining the same parameter.
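One way to picture how a translation sigma widens a likelihood is shown below. The scoring scale, the per-level penalty, and the rule of adding penalties and observation noise in quadrature are assumptions for this sketch only; the text does not specify the actual mapping, and only the four axes it names are used here.

```python
import math

# Assumed log-scale standard deviation added per mismatch level on one axis.
PENALTY_PER_LEVEL = 0.25

def translation_sigma(scores):
    """Combine per-axis mismatch levels (0 = matches the model context) in quadrature."""
    return math.sqrt(sum((PENALTY_PER_LEVEL * level) ** 2 for level in scores.values()))

def total_sigma(sigma_obs, sigma_trans):
    """Observation noise and translation uncertainty, treated as independent."""
    return math.hypot(sigma_obs, sigma_trans)

def log_likelihood(log_obs, log_pred, sigma_obs, scores):
    """Gaussian log-likelihood on the log scale with the widened sigma."""
    sigma = total_sigma(sigma_obs, translation_sigma(scores))
    z = (log_obs - log_pred) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)

# Hypothetical rubric scores for two sources constraining the same parameter.
human_clinical = {"species": 0, "indication": 0, "tme_compatibility": 0, "measurement_directness": 0}
mouse_in_vitro = {"species": 2, "indication": 1, "tme_compatibility": 2, "measurement_directness": 1}

for label, scores in [("human clinical", human_clinical), ("mouse in vitro", mouse_in_vitro)]:
    sigma = total_sigma(0.2, translation_sigma(scores))
    print(f"{label}: total sigma = {sigma:.3f}")
```

With a wider sigma, a prediction that disagrees with the mouse data is penalized far less than the same disagreement with the human data, which is how a less relevant source ends up constraining the parameter more weakly.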

The joint posterior is parameterized as marginal distributions + a Gaussian copula that preserves posterior correlations.

Two-Stage Calibration

Stage 1: MAPLE
Inputs: SubmodelTargets (in vitro / preclinical data)
Method: joint MCMC (NumPyro/NUTS)
Output: marginals + Gaussian copula

→

Stage 2: qsp-sbi
Inputs: CalibrationTargets (clinical data + full QSP simulator)
Method: copula prior from Stage 1 + SBI (SNPE-C)
Output: final posterior

Joel Eliason

Postdoctoral Researcher | Popel Lab | Johns Hopkins University
