Fit Conditional Density Estimation over a Sequence of HAL Models
Source: R/haldensify.R
fit_haldensify.Rd
Arguments
- A
  The numeric vector of observed values.
- W
  A data.frame, matrix, or similar giving the values of baseline covariates (potential confounders) for the observed units. These make up the conditioning set for the conditional density estimate.
- wts
  A numeric vector of observation-level weights. The default is to weight all observations equally.
- grid_type
  A character indicating the strategy to be used in creating bins along the observed support of A. For bins of equal range, use "equal_range"; consult the documentation of cut_interval for more information. To ensure each bin has the same number of observations, use "equal_mass"; consult the documentation of cut_number for details.
- n_bins
  This numeric value indicates the number(s) of bins into which the support of A is to be divided. As with grid_type, multiple values may be specified, in which case cross-validation will be used to choose the optimal number of bins. The default sets the candidate choices of the number of bins based on heuristics tested in simulation.
- cv_folds
  A numeric indicating the number of cross-validation folds to be used in fitting the sequence of HAL conditional density models.
- lambda_seq
  A numeric sequence of values of the regularization parameter of Lasso regression; passed to fit_hal.
- smoothness_orders
  An integer indicating the smoothness of the HAL basis functions; passed to fit_hal. The default is set to zero, for indicator basis functions.
- ...
  Additional (optional) arguments of fit_hal that may be used to control fitting of the HAL regression model. Possible choices include use_min, reduce_basis, return_lasso, and return_x_basis, but this list is not exhaustive. Consult the documentation of fit_hal for complete details.
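The two grid_type strategies can be illustrated with a small base-R sketch. This is not the package's internal code (the documentation above points to cut_interval and cut_number for that); the objects breaks_range, breaks_mass, bins_range, and bins_mass are illustrative names, and base R's cut() and quantile() stand in as rough analogues.

```r
# Sketch only: base-R analogues of the two binning strategies.
set.seed(11249)
a <- rnorm(200)
n_bins <- 10L

# "equal_range": breakpoints evenly spaced between min(a) and max(a),
# so every bin has the same width but counts differ
breaks_range <- seq(min(a), max(a), length.out = n_bins + 1L)
bins_range <- cut(a, breaks = breaks_range, include.lowest = TRUE)

# "equal_mass": breakpoints at empirical quantiles of a,
# so every bin holds about the same number of observations
breaks_mass <- quantile(a, probs = seq(0, 1, length.out = n_bins + 1L),
                        names = FALSE)
bins_mass <- cut(a, breaks = breaks_mass, include.lowest = TRUE)

table(bins_range)
table(bins_mass)
```

Equal-range bins keep a fixed resolution over the support of A, while equal-mass bins adapt their width to where the data concentrate, which is why the choice can matter when A has heavy tails.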
Value
A list, containing density predictions for the sequence of
fitted HAL models; the index and value of the L1 regularization parameter
minimizing the density loss; and the sequence of empirical risks for the
sequence of fitted HAL models.
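As a rough illustration of how the minimizing regularization parameter relates to the sequence of empirical risks, the sketch below uses placeholder predictions rather than output from fit_haldensify. It assumes the density loss is the negative log-density; the names dens_preds, emp_risks, lambda_min_idx, and lambda_min are hypothetical and need not match the element names of the returned list.

```r
# Sketch only: placeholder held-out density predictions, one column per
# candidate lambda (simulated here; not produced by the package).
set.seed(11249)
lambda_seq <- exp(seq(-1, -10, length.out = 100))
n_obs <- 50
dens_preds <- matrix(runif(n_obs * length(lambda_seq), 0.05, 1),
                     nrow = n_obs)
wts <- rep(1, n_obs)  # equal observation-level weights

# empirical risk per lambda, assuming a negative log-density loss
emp_risks <- apply(dens_preds, 2, function(f) weighted.mean(-log(f), wts))

# index and value of the regularization parameter minimizing the risk
lambda_min_idx <- which.min(emp_risks)
lambda_min <- lambda_seq[lambda_min_idx]
```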
Details
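Estimates the conditional density of A given W via a cross-validated highly adaptive lasso (HAL). The support of A is divided into bins, and HAL is used to estimate the conditional hazard of failure in a given bin, that is, the probability that A falls in that bin given W and given that it did not fall in an earlier bin.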
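The link between bin-level hazards and a density can be sketched in a few lines of base R. This assumes the usual discrete-hazard construction, in which the probability of a bin is its hazard times the probability of not having fallen in any earlier bin, rescaled by the bin width. The hazard values and breakpoints below are made up for illustration and do not come from a fitted model.

```r
# Sketch only: hypothetical hazards for a single unit with covariates w.
breaks <- c(-2, -1, 0, 1, 2)    # bin endpoints over the support of A
hazards <- c(0.1, 0.4, 0.5, 1)  # P(A in bin k | A not in an earlier bin, W = w)

# probability of not having fallen in any bin before k
surv_before <- c(1, cumprod(1 - hazards)[-length(hazards)])

# P(A in bin k | W = w)
bin_probs <- hazards * surv_before

# rescale by bin width to obtain a piecewise-constant density estimate
bin_widths <- diff(breaks)
density_est <- bin_probs / bin_widths
```

Because the final hazard is set to 1 here, the bin probabilities sum to one and the piecewise-constant density integrates to one over the binned support.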
Examples
library(haldensify)

# simulate data: W ~ U[-4, 4] and A|W ~ N(mu = W, sd = 0.5)
set.seed(11249)
n_train <- 50
w <- runif(n_train, -4, 4)
a <- rnorm(n_train, w, 0.5)

# fit cross-validated HAL-based density estimator of A|W
haldensify_cvfit <- fit_haldensify(
  A = a, W = w, n_bins = 10L,
  # decreasing sequence of 100 candidate regularization parameters
  lambda_seq = exp(seq(-1, -10, length.out = 100)),
  # the following arguments are passed to hal9001::fit_hal()
  max_degree = 3, reduce_basis = 1 / sqrt(length(a))
)