Unified interface for modeling lineage frequency dynamics. Supports multiple engines that share the same input/output contract.
Usage
fit_model(
data,
engine = c("mlr", "hier_mlr", "piantham", "fga", "garw"),
pivot = NULL,
horizon = 28L,
ci_level = 0.95,
...
)Arguments
- data
An lfq_data object.
- engine
Model engine to use:
"mlr"(default): Multinomial logistic regression. Fast, frequentist, no external dependencies."hier_mlr": Hierarchical MLR with partial pooling across locations. Requires.locationcolumn in data."piantham": Piantham approximation converting MLR growth rates to relative reproduction numbers. Requiresgeneration_timeargument."fga": Fixed growth advantage model (Bayesian via CmdStan). Requires 'CmdStan'; check withlfq_stan_available()."garw": Growth advantage random walk model (Bayesian via CmdStan). Allows fitness to change over time.
- pivot
Reference lineage name. Growth rates are reported relative to this lineage (fixed at 0). Default: the lineage with the highest count at the earliest time point.
- horizon
Forecast horizon in days (stored for later use by
forecast()). Default 28.- ci_level
Confidence level for intervals. Default 0.95.
- ...
Engine-specific arguments passed to the internal engine function. For
engine = "mlr":window,ci_method,laplace_smooth. Forengine = "piantham":generation_time(required). Forengine = "hier_mlr":shrinkage_method.
Value
An lfq_fit object (S3 class), a list containing:
- engine
Engine name (character).
- growth_rates
Named numeric vector of growth rates per
time_scaledays (pivot = 0).- intercepts
Named numeric vector of intercepts.
- pivot
Name of pivot lineage.
- lineages
Character vector of all lineage names.
- fitted_values
Tibble of fitted frequencies.
- residuals
Tibble with observed, fitted, Pearson residuals.
- vcov_matrix
Variance-covariance matrix.
- loglik, aic, bic
Model fit statistics.
- nobs, n_timepoints, df
Sample and model sizes.
- ci_level, horizon
As specified.
- call
The matched call.
Details
The MLR engine models the frequency of lineage \(v\) at time
\(t\) as:
$$p_v(t) = \frac{\exp(\alpha_v + \delta_v t)}{\sum_k \exp(\alpha_k + \delta_k t)}$$
where \(\alpha_v\) is the intercept, \(\delta_v\) is the
growth rate per time_scale days (default 7), and the pivot
lineage has \(\alpha = \delta = 0\). Parameters are estimated
by maximum likelihood via optim(method = "BFGS") with the
Hessian used for asymptotic Wald confidence intervals.
The constant growth rate assumption is appropriate for monotonic
variant replacement periods (typically 2–4 months). For longer
periods or non-monotonic dynamics, use the window argument to
restrict the estimation window, or consider the "garw" engine
which allows time-varying growth advantages.
References
Abousamra E, Figgins M, Bedford T (2024). Fitness models provide accurate short-term forecasts of SARS-CoV-2 variant frequency. PLoS Computational Biology, 20(9):e1012443. doi:10.1371/journal.pcbi.1012443
Piantham C, Linton NM, Nishiura H (2022). Predicting the trajectory of replacements of SARS-CoV-2 variants using relative reproduction numbers. Viruses, 14(11):2556. doi:10.3390/v14112556
See also
growth_advantage() to extract fitness estimates,
forecast() for frequency prediction, backtest() for
rolling-origin evaluation.
Examples
sim <- simulate_dynamics(
n_lineages = 3,
advantages = c("JN.1" = 1.3, "KP.3" = 0.9),
n_timepoints = 15, seed = 42
)
fit <- fit_model(sim, engine = "mlr")
fit
#> Lineage frequency model (mlr)
#> 3 lineages, 15 time points
#> Date range: 2026-04-13 to 2026-07-20
#> Pivot: "KP.3"
#>
#> Growth rates (per 7-day unit):
#> ↑ JN.1: 0.3638
#> ↑ ref: 0.1122
#>
#> AIC: 9450; BIC: 9453