This function is a wrapper around FCnet::FCnetLOO() that runs the LOO robust regression pipeline nperm times, each time on randomly permuted y scores. It returns a vector of R2 values obtained from the permuted (null) models. If requested (this is the default), a possibly very large data.frame with the coefficients of the null models is returned as well. A summary data.frame describes the distribution of the permuted models.

permutateLOO(
  y,
  x,
  alpha = seq(0, 1, by = 0.1),
  lambda = rev(10^seq(-5, 5, length.out = 200)),
  cv_Ncomp = NULL,
  cv_Ncomp_method = c("order", "R"),
  parallelLOO = F,
  scale_y = T,
  scale_x = T,
  nperm = 100,
  model_R2 = NULL,
  model_Accuracy = NULL,
  return_coeffs = T,
  family = optionsFCnet("family"),
  type.measure = optionsFCnet("cv.type.measure"),
  intercept = optionsFCnet("intercept"),
  standardize = optionsFCnet("standardize"),
  thresh = optionsFCnet("thresh"),
  ...
)
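
The permutation scheme described above can be sketched in base R. This is only an illustration under stated assumptions: ordinary least squares (lm()) stands in for the elastic net pipeline of FCnet::FCnetLOO(), R2 is taken as the squared correlation between leave-one-out predictions and observed scores, and all object names (loo_r2, observed_r2, null_r2) are hypothetical.

```r
# Illustrative sketch only: lm() replaces the elastic net fit, and the
# definition of R2 used here is an assumption, not FCnet's documented one.
set.seed(1)
n <- 30
x <- matrix(rnorm(n * 3), nrow = n)               # toy predictors
y <- as.numeric(x %*% c(1, 0.5, 0) + rnorm(n))    # toy behavioral scores

# Leave-one-out: each observation is predicted by a model fitted on
# the remaining n - 1 observations.
loo_r2 <- function(y, x) {
  dat <- data.frame(y = y, x)
  pred <- vapply(seq_along(y), function(i) {
    fit <- lm(y ~ ., data = dat[-i, , drop = FALSE])
    unname(predict(fit, newdata = dat[i, , drop = FALSE]))
  }, numeric(1))
  cor(pred, y)^2   # squared correlation of predicted and observed scores
}

observed_r2 <- loo_r2(y, x)

# Null models: same pipeline, but y is shuffled so that any link
# between x and y is broken.
nperm <- 100
null_r2 <- replicate(nperm, loo_r2(sample(y), x))
```

Shuffling y while leaving x untouched keeps the distribution of both variables intact, so the null R2 values reflect what the pipeline achieves by chance alone.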

Arguments

y

The dependent variable, typically behavioral scores to predict. This can be a vector or a single data.frame column.

x

The independent variables, typically neural measures that have already been summarised through data reduction techniques (e.g. ICA, PCA). An object created by reduce_featuresFC() can be passed directly, in which case its "Weights" slot is taken as x. Any other list can also be passed, provided that it has an entry named "Weights". Otherwise, x can be a data.frame.

alpha

Value(s) that bias the elastic net toward ridge regression (alpha = 0) or LASSO regression (alpha = 1). If a vector of alpha values is supplied, the value is optimized through crossvalidation. It defaults to a vector ranging from 0 to 1 in steps of 0.1.

lambda

Regularization parameter for the regression, see glmnet::glmnet(). Lambda must be a vector with length>1. When a vector of lambda values is supplied, the value of lambda is optimized through internal crossvalidation. It defaults to a vector ranging from 10^-5 to 10^5 with 200 values in logarithmic steps.
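
The default grids for alpha and lambda can be reproduced with the expressions from the usage section, which makes it easy to check their size and range before launching a long run. The object names below are only for illustration.

```r
# Default tuning grids, copied from the function signature.
alpha  <- seq(0, 1, by = 0.1)
lambda <- rev(10^seq(-5, 5, length.out = 200))

length(alpha)            # number of candidate alpha values
length(lambda)           # number of candidate lambda values
range(lambda)            # smallest and largest lambda
all(diff(lambda) < 0)    # TRUE when the sequence is decreasing

# Values are evenly spaced on the log10 scale.
log_steps <- diff(log10(lambda))
```

The rev() call matters: glmnet expects a user-supplied lambda sequence in decreasing order.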

cv_Ncomp

Whether to crossvalidate the number of components or not. It defaults to NULL, but a vector can be supplied specifying the number (range) of components to test in the inner loops.

cv_Ncomp_method

How the candidate components are selected when their number is optimized. With "order", the first N components are used in the order in which they are stored (e.g. according to the explained variance of neuroimaging data). With "R", which is somewhat experimental, the N best components are used, ranked according to their relationship (Pearson's r) with y.
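
The difference between the two selection methods can be sketched in base R as follows. This is an illustration, not FCnet's code: the toy data, the object names, and the use of the absolute correlation for ranking are assumptions.

```r
# Toy data: six components, of which only comp4 is related to y.
set.seed(2)
n <- 40
x <- matrix(rnorm(n * 6), nrow = n,
            dimnames = list(NULL, paste0("comp", 1:6)))
y <- 2 * x[, "comp4"] + rnorm(n, sd = 0.5)

n_best <- 3   # hypothetical number of components to keep

# "order": keep the first N components as they are stored
# (e.g. sorted by explained variance).
by_order <- colnames(x)[seq_len(n_best)]

# "R": keep the N components with the largest absolute Pearson
# correlation with y (taking the absolute value is an assumption).
r <- cor(x, y)[, 1]
by_r <- names(sort(abs(r), decreasing = TRUE))[seq_len(n_best)]
```

Here by_order never reaches comp4, whereas by_r ranks it first, which shows why the two methods can select very different predictors for the same N.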

parallelLOO

If TRUE (recommended, but not the default), future.apply::future_lapply() is used for the inner loops. This requires that future.apply is installed, that the machine has multiple cores available for use, and that threads are defined explicitly beforehand by the user (e.g. by calling plan(multisession)). Outer loops have been changed in the most recent versions of FCnet to avoid excessive parallelization and resource consumption, which often made this function slower than desirable.

scale_y

Whether y should be scaled prior to fit. The default, TRUE, scales and centers y with scale().

scale_x

Whether x should be scaled prior to fit. The default, TRUE, subtracts the mean matrix value and divides each entry by the matrix variance. Beware that this is applied in addition to any standardization requested through optionsFCnet("standardize").
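
The two kinds of scaling described for scale_y and scale_x can be sketched as follows. The matrix-wise version takes the description of scale_x literally (one mean and one variance for the whole matrix) and is an assumption about the behavior, not a copy of FCnet's code. The object names are hypothetical.

```r
# scale_y: center and scale a vector with scale().
y <- c(10, 12, 15, 19, 24)
y_scaled <- as.numeric(scale(y))     # mean 0, standard deviation 1

# scale_x: a single mean and a single variance for the whole matrix,
# as opposed to the column-wise behavior of scale().
x <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)
x_vec <- as.vector(x)
x_scaled <- (x - mean(x_vec)) / var(x_vec)

# For comparison: column-wise standardization.
x_colwise <- scale(x)
```

With matrix-wise scaling the relative differences between columns are preserved, whereas column-wise standardization puts every column on the same footing.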

nperm

The number of permutations for the null models. Default is 100.

model_R2

Optional. If this entry is left NULL, the original model is fitted again. Either an object created by FCnet::FCnetLOO() or a precise value of R2 can be supplied, so as to avoid unnecessary computations.

model_Accuracy

Optional. If this entry is left NULL, the original model is fitted again. Either an object created by FCnet::FCnetLOO() or a precise value of Accuracy can be supplied, so as to avoid unnecessary computations.

return_coeffs

Optional: whether coefficients for the null models should be returned as well. This may be of interest if inferential statistics are envisaged for single coefficients. The returned data.frame, however, may be quite large.

family

Defaults to "gaussian". Experimental support for "binomial" is on the way.

intercept

Whether to fit (TRUE) or not (FALSE) an intercept to the model.

standardize

Whether x should be standardized internally by glmnet.

thresh

Convergence threshold for glmnet: iterations stop once the solution changes by less than this value.

...

Other parameters passed to glmnetUtils::cva.glmnet() or glmnet::glmnet().

cv.type.measure

The measure to minimize in the crossvalidation inner loops. Unlike glmnetUtils::cva.glmnet(), the default is the mean absolute error.

Value

A list including the R2 for the permuted models and a summary data.frame. Optionally, a data.frame including the permuted coefficients.
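
Once a vector of null R2 values is available, it can be compared with the R2 of the real model. The sketch below is a generic permutation test in base R, not the summary that this function computes: the observed value and the null vector are simulated stand-ins, and all names are hypothetical.

```r
set.seed(3)
observed_r2 <- 0.30                              # hypothetical R2 of the real model
null_r2 <- rbeta(100, shape1 = 1, shape2 = 12)   # stand-in for the permuted R2 vector

# Proportion of null models performing at least as well as the real one.
# The +1 terms count the observed model among the permutations, so the
# p-value can never be exactly zero.
p_value <- (sum(null_r2 >= observed_r2) + 1) / (length(null_r2) + 1)

# Landmarks of the null distribution.
null_summary <- quantile(null_r2, probs = c(0.5, 0.95))
```

With nperm = 100 the smallest attainable p-value is 1/101, so a larger nperm is needed whenever a finer resolution is required.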