HAL Conditional Density Estimation in a Cross-validation Fold

cv_haldensify(
  fold,
  long_data,
  wts = rep(1, nrow(long_data)),
  lambda_seq = exp(seq(-1, -13, length = 1000L)),
  smoothness_orders = 0L,
  ...
)

Arguments

fold

Object specifying cross-validation folds as generated by a call to make_folds.
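To make concrete what a single fold represents, here is a minimal base R sketch of a V-fold split. It is hand-rolled for illustration and is not the output of make_folds; the field names v, training_set, and validation_set are assumptions about how a fold's indices might be stored.

```r
# Illustrative only: a hand-rolled V-fold split, not the make_folds implementation.
set.seed(42)
n <- 20  # number of observations
V <- 5   # number of folds

# Randomly assign each observation to one of V (near-)equal-sized folds
fold_id <- sample(rep(seq_len(V), length.out = n))

# Each fold records its index plus training and validation row indices
# (field names are assumptions made for this sketch)
folds <- lapply(seq_len(V), function(v) {
  list(
    v = v,
    training_set = which(fold_id != v),
    validation_set = which(fold_id == v)
  )
})

# A single element of this list plays the role of the `fold` argument
fold <- folds[[1]]
```

Each fold partitions the row indices, so the model is fit on the training set and predictions are evaluated on the held-out validation set.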

long_data

A data.table or data.frame object containing the data in long format, as produced by format_long_hazards. The long format follows Díaz I, van der Laan MJ (2011). “Super learner based conditional density estimation with application to marginal structural models.” International Journal of Biostatistics, 7(1), 1-20. doi:10.2202/1557-4679.1356.
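The general idea of the long (discrete hazard) format can be sketched in base R as follows. This is an illustration of the repeated-measures layout only, not the format_long_hazards implementation; the equal-width binning and the column names obs_id, bin_id, and in_bin are assumptions made for the example.

```r
# Illustrative only: hand-rolled long-format layout for discrete hazards.
# Column names (obs_id, bin_id, in_bin) are hypothetical.
set.seed(1)
n <- 5
W <- rnorm(n)            # baseline covariate
A <- rnorm(n, mean = W)  # continuous variable whose density is of interest

n_bins <- 4
# Equal-width bin breakpoints spanning the observed range of A
breaks <- seq(min(A), max(A), length.out = n_bins + 1)
# Bin containing each observation, constrained to 1, ..., n_bins
bin_of_A <- findInterval(A, breaks, rightmost.closed = TRUE, all.inside = TRUE)

# One row per (observation, bin), up to and including the bin containing A;
# in_bin is 1 only on the row for the bin in which A falls
long_data <- do.call(rbind, lapply(seq_len(n), function(i) {
  k <- bin_of_A[i]
  data.frame(
    obs_id = i,
    bin_id = seq_len(k),
    in_bin = as.integer(seq_len(k) == k),
    W = W[i]
  )
}))
```

In this layout, regressing in_bin on the bin and covariates estimates the hazard of A falling in a given bin, conditional on not having fallen in an earlier one.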

wts

A numeric vector of observation-level weights, matching in its length the number of records present in the long format data. Default is to weight all observations equally.
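Because the weights must have one entry per row of the long-format data, observation-level weights need to be repeated across each observation's rows. A minimal sketch, assuming a hypothetical obs_id column that identifies the observation behind each row:

```r
# Toy long-format data: observation 1 spans 2 rows, observation 2 spans 3 rows.
# The column names are hypothetical and used only for this sketch.
long_data <- data.frame(
  obs_id = c(1, 1, 2, 2, 2),
  in_bin = c(0, 1, 0, 0, 1)
)

# Default behaviour: equal weight on every long-format row
wts_default <- rep(1, nrow(long_data))

# Unequal weights: one weight per observation, expanded to one per row
obs_wts <- c(0.5, 2)
wts_long <- obs_wts[long_data$obs_id]
```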

lambda_seq

A numeric sequence of values of the regularization parameter of Lasso regression; passed to fit_hal.
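The default grid can be inspected directly in base R. The sketch below evaluates the default expression from the usage section and builds a coarser grid over the same range; the coarser grid is only an illustration, not a recommended setting.

```r
# Default grid: 1000 values, evenly spaced on the log scale
lambda_seq <- exp(seq(-1, -13, length = 1000L))

length(lambda_seq)  # number of candidate penalty values
range(lambda_seq)   # smallest and largest penalty, exp(-13) and exp(-1)

# The sequence is decreasing: it runs from strong to weak regularization
is_decreasing <- all(diff(lambda_seq) < 0)

# A coarser grid over the same range (illustrative choice only)
lambda_coarse <- exp(seq(-1, -13, length.out = 100L))
```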

smoothness_orders

An integer indicating the smoothness of the HAL basis functions; passed to fit_hal. The default is zero, which corresponds to indicator basis functions.

...

Additional (optional) arguments of fit_hal that may be used to control fitting of the HAL regression model. Possible choices include use_min, reduce_basis, return_lasso, and return_x_basis, but this list is not exhaustive. Consult the documentation of fit_hal for complete details.

Value

A list containing density predictions, observation IDs, observation-level weights, and cross-validation indices for conditional density estimation on a single fold of the overall data.

Details

Estimates the conditional density of A|W on a subset of the full set of observations, as determined by the supplied cross-validation fold. This helper function is intended for selecting the optimal value of the penalization parameter for the highly adaptive lasso estimates of the conditional hazard (via cross_validate). The