Highlight and test the time course of effects through cross-validation
Source: R/decode_signal.R
This function is meant to implement, roughly and with no assurance of full compatibility, the procedure proposed by Mathôt and Vilotijević (2022, Behavior Research Methods). First, each trial (i.e., one complete time series) is assigned to one fold in a deterministic fashion (the first trial to the first fold, the second trial to the second fold, and so on). This assignment takes no account of how conditions are distributed across folds, so the data may be slightly unbalanced; you should therefore think carefully about whether this strategy suits your design (e.g., blocked conditions). Then, the data are split by time point, and an LMEM, as specified by the 'formula' parameter (which is passed to 'lmerTest::lmer'), is fitted while iteratively leaving one fold out. This results in a table with as many rows as the number of effects implied by the formula times 'nfolds', summarising which time point had the peak t-value (in absolute value) in the training folds. In a separate table these peak values are tested: the dependent variable becomes, for each fold, the variable provided by 'dv' at that specific peak, and another LMEM is then fitted to this newly created variable. One problem with this approach is that peak time points can be all over the place, depending on your data. Also, choosing time points based on the maximum value in the training dataset can occasionally decrease the precision of the estimate or lead to overfitting. You may use this approach if you are confident that a given effect is confined to a single time window; effects spanning multiple windows (e.g., an early and a late impact on pupil size) may not be properly captured.
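The fold assignment and peak selection described above can be sketched in base R. This is only an illustration under stated assumptions, not the function's actual code: the toy data, the column names ('pupil', 'cond', 'fold') and the helper 't_at' are hypothetical, and a plain 'lm' stands in for 'lmerTest::lmer' so that the sketch has no dependencies.

```r
# Hypothetical toy data: 12 trials x 5 time points, one two-level condition.
set.seed(1)
toy <- expand.grid(time = 1:5, trial = 1:12)
toy$cond <- ifelse(toy$trial %% 2 == 0, "a", "b")
# Inject a condition effect at time point 4 only.
toy$pupil <- rnorm(nrow(toy)) + ifelse(toy$cond == "a" & toy$time == 4, 3, 0)

nfolds <- 3

# Round-robin assignment: trial 1 -> fold 1, trial 2 -> fold 2, and so on.
trials <- unique(toy$trial)
fold_of_trial <- ((seq_along(trials) - 1) %% nfolds) + 1
toy$fold <- fold_of_trial[match(toy$trial, trials)]

# Inspect how conditions end up distributed across folds.
print(table(toy$cond, toy$fold))

# t-value of the condition effect at one time point
# (simplification: lm instead of a mixed model).
t_at <- function(d, tp) {
  fit <- lm(pupil ~ cond, data = d[d$time == tp, ])
  summary(fit)$coefficients["condb", "t value"]
}

# For each left-out fold, find the time point with the largest |t|
# in the remaining (training) folds.
times <- sort(unique(toy$time))
peaks <- sapply(seq_len(nfolds), function(k) {
  train <- toy[toy$fold != k, ]
  tvals <- sapply(times, function(tp) t_at(train, tp))
  times[which.max(abs(tvals))]
})
print(peaks)
```

Note that with these alternating conditions, setting 'nfolds' to 2 would place every "b" trial in fold 1 and every "a" trial in fold 2, which is the kind of imbalance the warning about design refers to.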
Therefore, in addition to this procedure, a very coarse consensus is sought by assessing, across all folds and effects, which time points resulted in t-values above a certain threshold ('t_thresh'); if the same time points emerge consistently across folds (e.g., at least 'consensus_thresh' of the time, expressed as a proportion), they are retained. All time points retained in the consensus are then collapsed (averaged), and a final LMEM is fitted to these time points. This can be interpreted along the lines of a cluster-based permutation test (although it is not the same).
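The consensus step can be illustrated with a small base R sketch. The t-values and the trial-by-time matrix below are made up, and the exact comparison rules (absolute t-values, strict '>' for 't_thresh', '>=' for 'consensus_thresh') are assumptions rather than confirmed details of the function.

```r
# Hypothetical t-values: one row per left-out fold, one column per time point.
tvals <- rbind(
  fold1 = c(0.5, 2.4, 3.1, 2.2, 0.9),
  fold2 = c(1.1, 1.8, 2.9, 2.6, 0.3),
  fold3 = c(0.2, 2.1, 3.4, 2.5, 1.2),
  fold4 = c(0.7, 2.3, 2.8, 1.9, 0.4)
)
colnames(tvals) <- paste0("t", 1:5)

t_thresh <- 2
consensus_thresh <- 0.75

# Proportion of folds in which each time point exceeds the threshold.
prop_above <- colMeans(abs(tvals) > t_thresh)

# Time points retained in the consensus.
retained <- names(prop_above)[prop_above >= consensus_thresh]
print(retained)

# Hypothetical trial-by-time matrix of the dependent variable;
# retained time points are collapsed (averaged) within each trial.
dv <- matrix(1:10, nrow = 2,
             dimnames = list(c("trial1", "trial2"), paste0("t", 1:5)))
collapsed <- rowMeans(dv[, retained, drop = FALSE])
print(collapsed)
```

Here time points t2, t3 and t4 exceed the threshold in at least three of four folds and are kept; the averaged values are what a final model would then be fitted to.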
Usage
decode_signal(
data,
formula,
dv,
time,
id,
trial,
nfolds = 3,
t_thresh = 2,
consensus_thresh = 0.75,
formula_max = NULL
)
Arguments
- data
A data.frame containing all the necessary variables.
- formula
An 'lme4'-style formula, passed as a string.
- dv
A string indicating the name of the dependent variable.
- time
A string indicating the name of the time variable.
- id
A string indicating the name of the id (participant) variable.
- trial
A string indicating the name of the trial variable.
- nfolds
Number of folds to split trials in. Defaults to 3.
- t_thresh
Used to seek consensus: the minimum t-value a time point must reach to be carried forward as a candidate. Defaults to 2.
- consensus_thresh
The minimum proportion of times, across folds, that a time point must be above 't_thresh' in order to be kept in the consensus. Defaults to 0.75.
- formula_max
An 'lme4'-style formula, passed as a string. This formula is used only to test the final models, which could be useful to save computational resources. I do not necessarily recommend it, because it could be overly conservative and prone to convergence issues. Defaults to NULL.
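As a rough illustration of how the arguments fit together, the sketch below builds a small simulated dataset and prepares a call. The data, the column names ('pupil', 'condition') and the formula are hypothetical, and the call is only attempted if 'decode_signal' is available in the session.

```r
# Hypothetical long-format data: 4 participants x 6 trials x 10 time points.
set.seed(42)
dat <- expand.grid(time = 1:10, trial = 1:6, id = paste0("s", 1:4))
dat$condition <- ifelse(dat$trial %% 2 == 0, "easy", "hard")
dat$pupil <- rnorm(nrow(dat))  # pure noise, for illustration only

# Arguments as they would be passed; variable names are given as strings.
args <- list(
  data = dat,
  formula = "pupil ~ condition + (1 | id)",
  dv = "pupil",
  time = "time",
  id = "id",
  trial = "trial",
  nfolds = 3,
  t_thresh = 2,
  consensus_thresh = 0.75
)

# Only run the call if decode_signal() is available in the session.
if (exists("decode_signal", mode = "function")) {
  res <- do.call(decode_signal, args)
  str(res, max.level = 1)
}
```

Because the simulated signal is pure noise, no consistent peak or consensus window should be expected from such a run; real data would replace 'dat'.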