Fit model with Deep Mutational Scanning priors

A penalised multinomial logistic regression engine that incorporates Deep Mutational Scanning (DMS) escape scores as informative priors on variant fitness. This is useful in early-emergence scenarios, where a new lineage has few observed sequences but laboratory-measured phenotypic data (e.g., ACE2 binding affinity, antibody escape) are available.

Usage

fit_dms_prior(data, dms_scores, lambda = 1, pivot = NULL, ci_level = 0.95)

Arguments

data: An lfq_data object.
dms_scores: Named numeric vector of DMS-derived fitness priors. Names correspond to lineage identifiers. Values are on the log growth rate scale (positive = fitter than average). Lineages not in the vector receive a prior of 0.
lambda: Regularisation strength (penalty weight). Default 1. Larger values pull estimates more strongly toward the DMS prior. At lambda = 0, the penalty vanishes and the result is identical to the standard MLR engine.
pivot: Reference lineage name. Default NULL (automatic selection).
ci_level: Confidence level. Default 0.95.

Value

An lfq_fit object compatible with all downstream functions (forecast, growth_advantage, etc.).

Details

The approach uses penalised maximum likelihood, with a penalty proportional to the squared difference between each estimated growth rate and its DMS-derived prior. The effect is an empirical Bayes style of shrinkage: with abundant data the likelihood dominates and the penalty has little effect; with sparse data, estimates are pulled toward the DMS prior.
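The shrinkage behaviour can be illustrated with a one-parameter simplification. This is a sketch under stated assumptions, not the package's derivation: suppose the log-likelihood for a single growth rate $\delta$ is approximately quadratic around its unpenalised estimate $\hat\delta$ with observed information $I$, and let $\mu$ be the DMS prior for that lineage and $\lambda$ the penalty weight. The symbols $\hat\delta$, $\tilde\delta$ and $I$ are introduced here for illustration only.

$$\ell_{\text{pen}}(\delta) \approx -\frac{I}{2}(\delta - \hat\delta)^2 - \frac{\lambda}{2}(\delta - \mu)^2 \quad\Longrightarrow\quad \tilde\delta = \frac{I\,\hat\delta + \lambda\,\mu}{I + \lambda}, \qquad -\frac{\partial^2 \ell_{\text{pen}}}{\partial \delta^2} \approx I + \lambda$$

Under this approximation the penalised estimate $\tilde\delta$ is a weighted average of the data-driven estimate and the prior. When $I \gg \lambda$ (abundant data) it stays close to $\hat\delta$; when $I \ll \lambda$ (sparse data) it approaches $\mu$. The curvature $I + \lambda$ is also larger than $I$, which is why intervals based on the penalised Hessian are narrower than unpenalised ones.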

The penalised log-likelihood is:

$$\ell_{\text{pen}}(\alpha, \delta) = \ell(\alpha, \delta) - \frac{\lambda}{2} \sum_v (\delta_v - \mu_v)^2$$

where $\ell$ is the standard multinomial log-likelihood, $\delta_v$ is the growth rate for lineage $v$, and $\mu_v$ is the DMS prior for that lineage. Standard errors are derived from the Hessian of the penalised log-likelihood rather than of $\ell$ alone, so confidence interval widths account for the penalty.
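The objective above can be sketched in base R. This is a minimal illustration, not the package's implementation. The helpers `pen_nll` and `fit_sketch`, the simulated counts, and the modelling choices (a linear predictor `alpha_v + delta_v * t`, the pivot in the first column fixed at zero and excluded from the penalty, Wald standard errors from the numerical Hessian) are all assumptions made for this example.

```r
# Illustrative sketch only: hypothetical helpers, not the package's code.
# Assumed model: lineage v at time t has linear predictor alpha_v + delta_v * t,
# with the pivot lineage (column 1 of `counts`) fixed at alpha = delta = 0.
pen_nll <- function(par, counts, times, mu, lambda) {
  k <- ncol(counts) - 1                       # number of non-pivot lineages
  alpha <- par[seq_len(k)]
  delta <- par[k + seq_len(k)]
  eta <- cbind(0, matrix(alpha, length(times), k, byrow = TRUE) +
                  outer(times, delta))
  # Row-wise log-softmax, stabilised by subtracting each row's maximum
  row_max <- apply(eta, 1, max)
  log_p <- eta - row_max - log(rowSums(exp(eta - row_max)))
  loglik <- sum(counts * log_p)               # multinomial log-likelihood (up to a constant)
  penalty <- lambda / 2 * sum((delta - mu)^2) # quadratic pull toward the DMS prior
  -(loglik - penalty)                         # optim() minimises, so negate
}

fit_sketch <- function(counts, times, mu, lambda) {
  k <- ncol(counts) - 1
  opt <- optim(rep(0, 2 * k), pen_nll, counts = counts, times = times,
               mu = mu, lambda = lambda, method = "BFGS", hessian = TRUE,
               control = list(maxit = 500))
  # opt$hessian is the Hessian of the penalised objective, so the penalty's
  # curvature is already included in the standard errors.
  se <- sqrt(diag(solve(opt$hessian)))
  idx <- k + seq_len(k)
  list(delta = setNames(opt$par[idx], colnames(counts)[-1]),
       se    = setNames(se[idx], colnames(counts)[-1]))
}

# Made-up counts: 200 sequences per time point, lineage B rare but growing.
times <- 0:9
true_alpha <- c(A = -1, B = -4)
true_delta <- c(A = 0.10, B = 0.30)
eta <- cbind(pivot = 0, outer(times, true_delta) +
               matrix(true_alpha, length(times), 2, byrow = TRUE))
counts <- round(200 * exp(eta) / rowSums(exp(eta)))

mu <- c(A = 0.10, B = 0.10)                   # hypothetical DMS priors
fit_mle <- fit_sketch(counts, times, mu, lambda = 0)    # no penalty
fit_pen <- fit_sketch(counts, times, mu, lambda = 1e4)  # strong penalty
```

Comparing `fit_mle$delta` with `fit_pen$delta` should show the strongly penalised growth rate for lineage B sitting closer to its prior, and `fit_pen$se` should be smaller than `fit_mle$se`, since the penalty adds curvature to the Hessian.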

References

Dadonaite B, Crawford KHD, Radford CE, et al. (2023). A pseudovirus system enables deep mutational scanning of the full SARS-CoV-2 spike. Cell, 186(6), 1263–1278. doi:10.1016/j.cell.2023.02.001

Bloom JD, Neher RA (2023). Fitness effects of mutations to SARS-CoV-2 proteins. Virus Evolution, 9(2), vead055. doi:10.1093/ve/vead055

Examples