Back to the main page.

# Bug 2992 - ft_statfun_correlationT is prone to cross-condition bias when creating the permutation distribution

Status | CLOSED FIXED |

Reported | 2015-10-22 00:49:00 +0200 |

Modified | 2019-08-10 12:31:35 +0200 |

Product: | FieldTrip |

Component: | core |

Version: | unspecified |

Hardware: | PC |

Operating System: | Mac OS |

Importance: | P5 normal |

Assigned to: | Arjen Stolk |

URL: | |

Tags: | |

Depends on: | |

Blocks: | |

See also: |

## Arjen Stolk - 2015-10-22 00:49:39 +0200

It seems that permuting labels of conditions for calculating a permutation distribution leads to biased estimates in case there is an offset across conditions. This may be best depicted as follows: normal: ~1000 ~1000 ~1000 ~1000 ~1 ~1 ~1 ~1 corr = ~0 post shuffling: ~1000 ~1 ~1000 ~1000 ~1 ~1000 ~1 ~1 corr = anti-correlation This systematic bias will lead to statistically significant correlations, despite the correlations themselves (i.e. without permuting) are perfectly fine. % simulate simple single-subject timelock structures data_brain = []; data_brain.trial = rand(10,1); % increasing data_brain.dimord = 'rpt_chan_time'; data_brain.time = 1; data_brain.label = {'1'}; data_behav = data_brain; data_behav.trial = rand(10,1)+1000; % add scaling difference % compute statistics with correlationT cfg = []; cfg.method = 'montecarlo'; cfg.statistic = 'ft_statfun_correlationT'; cfg.numrandomization = 100; n1 = 10; % n1 is the number of trials design = zeros(2, n1 * 2); design(1,1:n1) = 1; design(1,(n1 + 1):(n1 * 2)) = 2; design(2, :) = [1:n1 1:n1]; cfg.design = design; cfg.ivar = 1; cfg.uvar = 2; stat = ft_timelockstatistics(cfg, data_brain, data_behav); stat = prob: 0.0099 cirange: 0.0193 mask: 1 stat: -0.8405 ref: -3.2827 rho: -0.2848 dimord: 'chan_time' label: {'1'} time: 1 cfg: [1x1 struct]

## Arjen Stolk - 2015-10-22 00:54:43 +0200

As a (pre-liminary?) prevention, I have added the following check to ft_statfun_correlationT.m (line 77): if strcmp(cfg.resampling, 'permutation') error('shuffling the design matrix may bias the permutation distribution, see bug #2992 for details'); end

## Arjen Stolk - 2015-10-22 00:57:51 +0200

(In reply to Arjen Stolk from comment #1) svn commit ft_statfun_correlationT.m -m 'added prevention: check whether ft_statfun_correlationT is not used to create a permutation distribution, where it may be prone to biases between conditions' arjsto@svn.fcdonders.nl's password: Sending ft_statfun_correlationT.m Transmitting file data . Committed revision 10810.

## Arjen Stolk - 2015-10-22 01:29:26 +0200

One solution would be to normalize the sample populations, separately per condition, prior to giving to ft_xxxstatistics. This way any systematic bias between conditions is taken care of, as much as possible. See example below (note that P > 0.05). However, this normalization procedure cannot be done within ft_statfun_correlationT itself since it receives already shuffled data (during permuting), meaning the bias is already incorporated (too late for normalizing). % simulate simple single-subject timelock structures with random data data_brain = []; data_brain.trial = rand(10,1); % increasing data_brain.dimord = 'rpt_chan_time'; data_brain.time = 1; data_brain.label = {'1'}; data_behav = data_brain; data_behav.trial = rand(10,1)+1000; % add scaling difference % normalize data_brain.trial = (data_brain.trial - nanmean(data_brain.trial)) ./ nanstd(data_brain.trial); data_behav.trial = (data_behav.trial - nanmean(data_behav.trial)) ./ nanstd(data_behav.trial); % compute statistics with correlationT cfg = []; cfg.method = 'montecarlo'; cfg.statistic = 'ft_statfun_correlationT'; cfg.numrandomization = 100; n1 = 10; % n1 is the number of trials design = zeros(2, n1 * 2); design(1,1:n1) = 1; design(1,(n1 + 1):(n1 * 2)) = 2; design(2, :) = [1:n1 1:n1]; cfg.design = design; cfg.ivar = 1; cfg.uvar = 2; stat = ft_timelockstatistics(cfg, data_brain, data_behav); stat = prob: 0.1188 cirange: 0.0631 mask: 0 stat: 0.9193 ref: 2.1213 rho: 0.3091 dimord: 'chan_time' label: {'1'} time: 1 cfg: [1x1 struct]

## Arjen Stolk - 2015-10-22 05:44:35 +0200

Moving forward a bit more, I have adjusted the documentation of ft_statfun_correlationT into: % When using this function to calculate a randomization distribution, for % statistical inference purposes, it is recommended to use the bootstrapping % approach (cfg.resampling = 'bootstrap') when calling any of the % high-level statistics functions above. The default (cfg.resampling = 'permutation') % may be prone to systematic across-condition bias, see bugreport #2992: % http://bugzilla.fieldtriptoolbox.org/show_bug.cgi?id=2992 And since the permutation option should in theory be fine when there is no systematic across-condition bias (either resolved by normalization, or simply not present in the first place), I've changed the error throwing into a warning linking to this bug report and suggesting the bootstrapping method: if strcmp(cfg.resampling, 'permutation') warning('permutation may be prone to systematic across-condition bias, see http://bugzilla.fieldtriptoolbox.org/show_bug.cgi?id=2992, consider using the bootstrap resampling method instead (cfg.resampling = ''bootstrap'')'); end Using cfg.resampling = 'bootstrap': % simulate simple single-subject timelock structures data_brain = []; data_brain.trial = rand(10,1); % increasing data_brain.dimord = 'rpt_chan_time'; data_brain.time = 1; data_brain.label = {'1'}; data_behav = data_brain; data_behav.trial = rand(10,1)+1000; % add offset (which should be not allowed to bias the results) % compute statistics with correlationT cfg = []; cfg.statistic = 'ft_statfun_correlationT'; cfg.method = 'montecarlo'; cfg.resampling = 'bootstrap'; % this parameter is crucial, the default (permutation) is prone to systematic across-condition bias cfg.numrandomization = 1000; n1 = 10; % n1 is the number of trials design = zeros(2, n1 * 2); design(1,1:n1) = 1; design(1,(n1 + 1):(n1 * 2)) = 2; design(2, :) = [1:n1 1:n1]; cfg.design = design; cfg.ivar = 1; cfg.uvar = 2; stat = ft_timelockstatistics(cfg, data_brain, data_behav); stat = prob: 0.4535 cirange: 0.0308 mask: 0 stat: 1.7552 ref: 3.9421 rho: 0.5273 dimord: 'chan_time' label: {'1'} time: 1 cfg: [1x1 struct]

## Eric Maris - 2015-10-22 09:20:26 +0200

Hoi Arjen, Ik stel voor dat we even wachten met aanpassingen van de documentatie. Ik heb een grote voorkeur voor principe-gebaseerde oplossingen ipv patches, maar op dit moment weet ik nog niet wat deze statfun precies doet. We moeten ook de vergelijking maken met statfun_depsamplesRegrT en statfun_indepsamplesRegrT, die een zeer vergelijkbare functionaliteit hebben (als de data normaal verdeeld zijn, zijn de correlatie en de regressiecoeffient toets equivalent in het Neyman-Pearson framework). Ik dat sommige FT-gebruikers beiden gebruikt hebben, en wij moeten daarom aan kunnen geven wanneer er verschillen te verwachten. Ik heb sowieso een probleem met het adviseren van de Bootstrap als alternatief voor de permutatietoets: er is geen algemeen-bruikbare statistische theorie van de Bootstrap, dit in tegenstelling tot de permutatietoets. groet, Eric

## Egbert Hartstra - 2015-10-23 17:35:44 +0200

Hi Arjen and Eric, I have followed the discussion on the mailing list and on the bug report and I was wondering if the following approach could be a solution? When shuffling the data during the permutation steps one could shuffle only the first part of the uvar variable such that in each permutation the brain data gets randomly linked with the behavioral data. In an example of a data set with 10 subjects: before permutation: ivar: 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 uvar: 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 after permutation: ivar: 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 uvar: 2 5 10 3 1 6 7 4 8 9 1 2 3 4 5 6 7 8 9 10 However I am not sure if: 1. This type of shuffling is allowed during the permutation tests? 2. What would happen in cases in which during permutation there are multiple instances in which for example the brain data from the same subject becomes linked with the behavioral data of that same subject? Would this not increase the likelihood of false negatives? 3. Other consequences? I am curious what you think. Best, Egbert

## Arjen Stolk - 2015-10-23 19:12:17 +0200

Created attachment 746 permutation simulation

## Arjen Stolk - 2015-10-23 19:17:45 +0200

Thanks for joining the discussion, Egbert. To better illustrate the behavior of the permutation under statfun_correlationT, see attached plot constituting iterative simulations of brain-behavior correlations involving 20 subjects. The brain and behavior values are random (between 0 and 1). What is different across iterations, is an offset (a shift in amplitude across conditions, the number in the figure's legend). As can be observed, permuting with any offset (1, 10, or 100) will bias the randomization distribution towards the negative. Without bias, the distribution seems to lie around the real correlationT (1.68, see below). SIMULATION: % fake data nsubj = 20; data1 = rand(nsubj,1); data2 = rand(nsubj,1); corr(data1, data2,'type','Spearman') = 0.3684 with nunits = nsubj, that is tstat = rho*(sqrt(max(nunits)-2))/sqrt((1-rho^2)) = 1.6812 % from here onwards re-run iteratively (to keep rand numbers identical across sims) tempdata = data2+100; % simulate simple multiple subjects timelock structures data_brain = []; data_behav = []; for j = 1:nsubj data_brain{j}.avg = data1(j); data_brain{j}.dimord = 'chan_time'; data_brain{j}.time = 1; data_brain{j}.label = {'1'}; data_behav{j} = data_brain{j}; data_behav{j}.avg = tempdata(j); end % compute statistics with correlationT cfg = []; cfg.statistic = 'ft_statfun_correlationT'; cfg.method = 'montecarlo'; cfg.resampling = 'bootstrap'; % this parameter is crucial, the default (permutation) is prone to systematic across-condition bias cfg.numrandomization = 10000; n1 = nsubj; % n1 is the number of subjects design = zeros(2, n1 * 2); design(1,1:n1) = 1; design(1,(n1 + 1):(n1 * 2)) = 2; design(2, :) = [1:n1 1:n1]; cfg.design = design; cfg.ivar = 1; cfg.uvar = 2; stat = ft_timelockstatistics(cfg, data_brain{:}, data_behav{:}); % debug in ft_statistics_montecarlo @ ft_progress('close') - line 352, and % uncomment stat.statkeep(:,i) = statrand; then save: save('permutation_XXXoffset.mat','stat') offsets = {'0','1','10','100'}; figure; for i = 1:numel(offsets) load(['permutation_' offsets{i} 'offset.mat']); hold on; histogram(stat.statkeep) end load(['bootstrap_' offsets{4} 'offset.mat']); hold on; histogram(stat.statkeep) legend(offsets,'100 (bootstrap)') xlabel('t vals') ylabel('counts') print('permutationsim_differentoffsets', '-dpdf'); close

## Arjen Stolk - 2015-10-23 19:30:30 +0200

(In reply to Arjen Stolk from comment #8) Note that the above code was for the 100 offset bootstrapping condition, hence 1): tempdata = data2+100; and 2) this was uncommented: cfg.resampling = 'bootstrap'; % this parameter is crucial, the default (permutation) is prone to systematic across-condition bias Note that the bootstrapping randomization distribution lies around the same value as the zero offset permutation randomization distribution, but with larger side lobes.

## Arjen Stolk - 2015-10-23 19:35:05 +0200

Created attachment 747 permutation_0offset.mat

## Arjen Stolk - 2015-10-23 19:35:55 +0200

Created attachment 748 permutation distributions

## Arjen Stolk - 2015-10-24 00:46:29 +0200

Given that the EU side of this seemingly growing project will wake up later, here's a summary of the status quo: 1) Shuffling units across conditions ('permutation') is sensitive to across-condition bias if there's any (see pdf above). To grasp this point consider this example: A = [1 0 -1 3 2] B = A+100; corr(A',B','type','spearman') = 1; now shuffle one digit/observation unit: A = [1 100 -1 3 2] B = [101 0 99 103 102] corr(A',B','type','spearman') = 0; .. or more units (e.g. 2): A = [1 100 99 3 2] B = [101 0 -1 103 102] corr(A',B','type','spearman') = -0.5 In sum, the bias drives the correlation towards the negative. 2) Egbert proposed a solution, in terms of shuffling strategy. This solution seems closely related to the bootstrap approach, cf example shuffles: existing randomization strategies: BOOTSTRAP (separately per condition, but with identical order within each condition): 1 2 2 2 4 5 5 6 7 9 11 12 12 12 14 15 15 16 17 19 PERMUTATION (across conditions, e.g. 1 and 11 swapped between condition 1 and 2): 11 12 13 4 15 6 17 18 19 10 1 2 3 14 5 16 7 8 9 20 PROPOSED (equivalent to bootstrapping one condition only?): 11 12 13 4 15 6 17 18 19 10 1 2 3 4 5 6 7 8 9 10 Briefly trying this approach*, it seems that it produces similar results as the bootstrap method (attaching another pdf for illustration). Yet, I recall Eric saying that there might be downsides of the bootstrapping approach. I don't remember what they were, and whether they also hold for shuffling one condition only. * in ft_statistics_montecarlo.m, insert resample = [resample(:,1:size(cfg.design,2)/2) repmat(1:size(cfg.design,2)/2,size(resample,1),1)]; after line 223: resample = resampledesign(cfg, design); to overwrite the second half of the randomization distribution (ensure cfg.resampling = 'bootstrap' when calling ft_xxxstatistics, to achieve the proposed shuffling strategy).

## Arjen Stolk - 2015-10-24 00:50:58 +0200

Created attachment 749 randomization using bootstrapping The actual correlation between the random vectors is 0.1053, which translates to a 0.45 tval. Half = randomization of the first condition only, see comment 12.

## Eric Maris - 2015-10-25 00:09:29 +0200

Hoi Arjen, 1. In de help van de functie schrijf je "In case of calculating brain-behavior correlations, ensure the brain data is matched in terms of size and dimensions to the behavioral data, or vice versa.” Dat snap ik niet. De brein-data zijn in de regel hoogdimensioneel (kanalen-bij-frekwenties-bij-tijdspunten) en de gedragsdata bestaan in de regel uit 1 getal per trial of persoon (afhankelijk van of je een een correlatie over trials of over personen berekent). 2. In cfg.design wordt geen gedragsvariabele gespecificeerd. Waar dan wel? 3. In cfg.design moeten wel condities gespecificeerd worden. Ben je dan geïnteresseerd in de correlatie tussen de conditie-indicator en de brein-data? Zo ja, indien je voor de Pearson correlatie kiest, dan is de T-statistiek die je berekent identiek aan de independent-samples T-statistiek die door een andere statfun (ft_statfun_indepsamplesT) berekend wordt. Voor andere correlatie types (e.g., Spearman) is de formule die jij gebruikt om een T-statistiek en bijbehorende p-waarde te berekenen niet valide. 4. Is het zo dat jij statfun_correlationT bedoeld is voor (1) correlaties over trials binnen 1 proefpersoon, en (2) correlaties over proefpersonen (en dus voor het identificeren van individuele verschillen in gedrag die door een neurobiologische variabele verklaard kunnen worden)? groet, Eric

## Arjen Stolk - 2015-10-25 02:12:57 +0100

(In reply to Eric Maris from comment #14) Dear Eric, Thanks for pointing out the need for these clarifications. I'll try to provide my answers to each point as concise as possible: "1. In de help van de functie schrijf je "In case of calculating brain-behavior correlations, ensure the brain data is matched in terms of size and dimensions to the behavioral data, or vice versa.” Dat snap ik niet. De brein-data zijn in de regel hoogdimensioneel (kanalen-bij-frekwenties-bij-tijdspunten) en de gedragsdata bestaan in de regel uit 1 getal per trial of persoon (afhankelijk van of je een een correlatie over trials of over personen berekent)." >> At some point in the code, a correlation is calculated between two vectors or matrices: rho = corr(dat1, dat2, 'type', cfg.type); This requires dat1 and dat2 to be identically sized. It's true that this is not always the case, as you sketch above. The user therefore is required to ensure that they are, for the sake of calculating correlations, for instance by repmat'ing the behavioral data: 1000 1100 800 900 (ms) becomes: 1000 1100 800 900 1000 1100 800 900 1000 1100 800 900 ... as long as is needed to match the matrix size (e.g. in the chan or freq dimension) of the brain data. Alternatively, the user would have to calculate correlations iteratively per chan/freq. "2. In cfg.design wordt geen gedragsvariabele gespecificeerd. Waar dan wel?" >> The behavioral variable is dat1 or dat2. A different approach - but not how it's currently implemented - would be indeed to have the behavioral variable encoded in the design matrix. "3. In cfg.design moeten wel condities gespecificeerd worden. Ben je dan geïnteresseerd in de correlatie tussen de conditie-indicator en de brein-data? Zo ja, indien je voor de Pearson correlatie kiest, dan is de T-statistiek die je berekent identiek aan de independent-samples T-statistiek die door een andere statfun (ft_statfun_indepsamplesT) berekend wordt. Voor andere correlatie types (e.g., Spearman) is de formule die jij gebruikt om een T-statistiek en bijbehorende p-waarde te berekenen niet valide." >> The conditions are still specified in the design matrix, as one would also do when calling the depsamples statfun. No correlation is calculated between the condition indicators/labels and the data. The condition labels in the design matrix are merely used to determine which observation units belong to which variable, and thus how and on which units the correlation is calculated. "4. Is het zo dat jij statfun_correlationT bedoeld is voor (1) correlaties over trials binnen 1 proefpersoon, en (2) correlaties over proefpersonen (en dus voor het identificeren van individuele verschillen in gedrag die door een neurobiologische variabele verklaard kunnen worden)?" >> statfun_correlationT is meant to support both 1 and 2. Hope this helps painting a picture of the functionality of ft_statfun_correlationT. The challenge now is to define a procedure for optimally calculating a randomization distribution (see my one but last comment above). Yours sincerely, Arjen

## Arjen Stolk - 2015-10-25 02:14:49 +0100

p.s. comment #12 summarizes and gives an overview of the current issue, Eric. You could safely ignore the other comments.

## Eric Maris - 2015-10-25 08:34:40 +0100

Hi Arjen, >> At some point in the code, a correlation is calculated between two vectors or matrices: rho = corr(dat1, dat2, 'type', cfg.type); This requires dat1 and dat2 to be identically sized. It's true that this is not always the case, as you sketch above. The user therefore is required to ensure that they are, for the sake of calculating correlations, for instance by repmat'ing the behavioral data: 1000 1100 800 900 (ms) becomes: 1000 1100 800 900 1000 1100 800 900 1000 1100 800 900 ... as long as is needed to match the matrix size (e.g. in the chan or freq dimension) of the brain data. Alternatively, the user would have to calculate correlations iteratively per chan/freq. The same functionality is provided by ft_statfun_indepsamplesRegrT. However, the user has to provide the behavioural variable as a row of the design matrix, who's number has to be specified in cfg.ivar. Thus, the behavioural variable is the independent variable, and it plays the same role as the condition-membership-variable (taking values 1 or 2) in ft_statfun_indepsamplesT. The brain data is the dependent variable. It seems that in statfun_correlationT, 3 instead of 2 variables are specified: brain data (dependent variable), behavioural data (e.g., response time or accuracy), and condition-membership (typically, a binary variable). Unless one has an interest in interaction effects, the researcher has an interest in the relation between the dependent variable and one of the two independent variables (behavioural or condition-membership). If the interest is in a binary variable (which can be behavioural, as with single trial accuracy, or condition-membership), one can use ft_statfun_indepsamplesT; if the interest is in a continuous variable (response time or proportion correct), one can use ft_statfun_indepsamplesRegrT. It also seems that, in statfun_correlationT, the condition-membership variable in cfg.design (row indicated by cfg.ivar) is used to identify which parts of the data must be correlated with each other. So, it seems you use the term "condition" to distinguish between your brain and your behavioural data. This conflicts with standard terminology, in which "condition" is used to denote one of the levels of a categorical independent variable. (Unrelated note to the side, in the help of ft_indepsamplesRegrT, rows and columns are erroneously interchanged: the design must be Nvar X Nreplications instead of Replication X Nvar, as is assumed by the code.) >> The behavioral variable is dat1 or dat2. A different approach - but not how it's currently implemented - would be indeed to have the behavioral variable encoded in the design matrix. That would be more parsimonious. "3. In cfg.design moeten wel condities gespecificeerd worden. Ben je dan geïnteresseerd in de correlatie tussen de conditie-indicator en de brein-data? Zo ja, indien je voor de Pearson correlatie kiest, dan is de T-statistiek die je berekent identiek aan de independent-samples T-statistiek die door een andere statfun (ft_statfun_indepsamplesT) berekend wordt. Voor andere correlatie types (e.g., Spearman) is de formule die jij gebruikt om een T-statistiek en bijbehorende p-waarde te berekenen niet valide." >> The conditions are still specified in the design matrix, as one would also do when calling the depsamples statfun. No correlation is calculated between the condition indicators/labels and the data. The condition labels in the design matrix are merely used to determine which observation units belong to which variable, and thus how and on which units the correlation is calculated. I think I now see where the problems come from. Random permutation involves ramdom re-assignment of replications to conditions (which are specified in the cfg.ivar-th row of cfg.design). However, you consider the conditions to correspond to your 2 data types (brain data and behavioural data) and so you are effectively mixing brain and behavioural data as a part of the random permutation scheme that generates your reference distribution. I can think of no sensible null hypothesis that can be tested with this approach. For testing whether some behavioural variable is correlated with brain data, one should break the association between these 2 variables, and this is achieved by randomly permuting the indices of the behavioural variable (which might also be a condition-membership indicator; note, with the term "condition" used in its standard meaning, not the meaning you use). These problems can be solved by being clear about your independent variable, which is a behavioural one in your case. That variable goes in cfg.design, and therefore you do need this row with 1s and 2s (which you call "conditions") to determine what is brain and what is behavioural data. This would be my advice: 1. For a between-UO (UO=unit-of-observation) design (one subject with multiple trials for which also behavioural responses were collected, or multiple subjects characterised by trial-averaged brain data and a behavioural measure such as trial-averaged accuracy or RT) : use ft_statfun_indepsamplesT (for a dichotomous behavioural variable) of ft_statfun_indepsamplesregrT (for a continuous behavioural variable) 2. For a within-UO design (multiple subjects characterised by trial-averaged brain data in multiple conditions for which you also have a behavioural measure such as trial-averaged accuracy or RT) : use ft_statfun_depsamplesregrT. Note that the latter statfun requires cfg.uvar to be specified, with which the subjects are identified, allowing for a restricted permutation in which the brain and behavioural data are randomly re-assigned WITHIN each of the subjects (which are the UOs here). The FT randomisation/permutation code takes care of that. 3. Improve the documentation, starting with the help of the different functions. We could also collaborate on a tutorial. best, Eric

## Arjen Stolk - 2015-10-25 20:25:58 +0100

Thanks, Eric. "Random permutation involves ramdom re-assignment of replications to conditions (which are specified in the cfg.ivar-th row of cfg.design). However, you consider the conditions to correspond to your 2 data types (brain data and behavioural data) and so you are effectively mixing brain and behavioural data as a part of the random permutation scheme that generates your reference distribution." > This is indeed what is going on here. "I can think of no sensible null hypothesis that can be tested with this approach. For testing whether some behavioural variable is correlated with brain data, one should break the association between these 2 variables, and this is achieved by randomly permuting the indices of the behavioural variable." > Could you elaborate a bit, for my own understanding, on why breaking the association between two variables is crucial. Essentially, one does exactly this - mixing two different data distributions - with an independent t-test, assessing 'whether replications under two conditions are interchangeable or not'. In sum, my take of your proposal is to have the behavioral variable encoded in cfg.design, similarly to the approached used for the regr statfun family. I see this works when one wants to indeed perform correlations between a behavioral variable and different aspects of neural observations (e.g with dimord rpt_chan, or subj_chan). But what would you recommend for situations where one wants to correlate neural data with other neural data, e.g. rpt_chan with rpt_chan, without having to do this iteratively - admittedly I haven't encountered these types of situations myself yet? To briefly address another point: the regr statfun family does not support calculating spearman, or other non-Pearson correlation types, justifying the existence of ft_statfun_correlationT? It would be great to have all your above considerations, and hopefully more to come, documented on a dedicated page of the wiki (e.g, a FAQ, or more). Yours, Arjen

## Eric Maris - 2015-10-25 21:19:16 +0100

Hi Arjen, > Could you elaborate a bit, for my own understanding, on why breaking the association between two variables is crucial. Essentially, one does exactly this - mixing two different data distributions - with an independent t-test, assessing 'whether replications under two conditions are interchangeable or not'. >>It's not the t-test that test for "interchangeable" (the proper term is "exchangeable") observation/replication, it's the permutation test that does this, and it does so using every test statistic. In sum, my take of your proposal is to have the behavioral variable encoded in cfg.design, similarly to the approached used for the regr statfun family. I see this works when one wants to indeed perform correlations between a behavioral variable and different aspects of neural observations (e.g with dimord rpt_chan, or subj_chan). But what would you recommend for situations where one wants to correlate neural data with other neural data, e.g. rpt_chan with rpt_chan, without having to do this iteratively - admittedly I haven't encountered these types of situations myself yet? >>In my experience, behavioural data come in the form of one scalar per trial (rpt) in single-subject studies, and one scalar (proportion correct, average RT) per subject in multiple-subject studies. I find it hard to conceive situation where the behavioural data have a spatal, spectral or temporal dimension, as in brain data. To briefly address another point: the regr statfun family does not support calculating spearman, or other non-Pearson correlation types, justifying the existence of ft_statfun_correlationT? It would be great to have all your above considerations, and hopefully more to come, documented on a dedicated page of the wiki (e.g, a FAQ, or more). >>It's no problem to add rank correlations to the existing statfuns, the problem is how to calculate their p-values and critical values. For the Pearson correlation, we have an exact solution (it's use in station_correlationT), but this solution is not valid for the non-Pearson correlations. >>I think we should also write a contribution to the FT discussion list, clarifying the issues that were reported by some FT users. >>Maybe you also want to contribute to wiki page? best, Eric

## Arjen Stolk - 2015-10-26 00:32:43 +0100

Dear all, Here's s point-wise update: 1) Implementation 1.1 - Following the discussion above, ft_statfun_correlation.m now requires the predictor variable to be encoded in the design matrix, similarly to the approach in the reg statfun family. Consequently, using cfg.resampling = 'permutation' (the default) produces a randomization distribution based on shuffling replications of the predictor variable and that is no longer prone to systematic bias (see point 2). 1.2 - Also following the above, the default correlation is now set to 'Pearson'. Consequently, the default functionality is identical to ft_statfun_indepsamplesregrT. I have actually tested this: ft_statfun_correlationT with cfg.correlation = 'Pearson' produces the same T value as ft_statfun_indepsamplesregrT. It remains to be documented/investigated how the other correlation types may affect the randomization distribution (see point 3). 1.3 - According to the above changes, I have adjusted the test function of ft_statfun_correlationT. 1.4 - Updated a sentence in ft_statfun_indepsamplesregrtT's function help: "design contains the independent variable, Nvar X Nreplications". 1.5 - Fixed a matlab '&' statement in the same function. svn commit ft_statfun_correlationT.m -m 'enhancement: overhaul of ft_statfun_correlationT according to bug report 2992' Sending ft_statfun_correlationT.m svn commit test_ft_statfun_correlationT.m -m 'test function adjustment according to bug report 2992' Sending test_ft_statfun_correlationT.m svn commit ft_statfun_indepsamplesregrT.m -m 'documentation change of ft_statfun_indepsamplesregrT according to bug report 2992' Sending ft_statfun_indepsamplesregrT.m 2) Validation See pdf and code in the up-following comment: shuffling replications of the predictor variable produces a randomization distribution that is no longer prone to systematic bias. This is expected behavior following the above comments in this bug report. 3) Documentation I've opened a wiki page where we can collect the considerations made here: http://www.fieldtriptoolbox.org/faq/what_statfun_should_i_use_for_my_design Hopefully this page should be able to guide the user to which statfun to select under which circumstances, with some example high-level ft code. It would be nice if this page would address the following issues: 3.1 - Which statfun should I choose? 3.2 - How to perform regression statistics with fieldtrip? 3.3 - Why is it important to break the association between two variables when creating a randomization distribution? 3.4 - Why is it problematic to calculate p values for rank-band correlations? Upon completion, we could send out an update to the mail list, stating the ft version containing the updated versions of the statfuns (incl. corrT and regrT). Yours, Arjen

## Arjen Stolk - 2015-10-26 00:34:06 +0100

Created attachment 750 randomization distributions %% simulate multiple single-subject rand timelock structures n1 = 20; % n1 is the number of subject data1 = rand(n1,1); data2 = rand(n1,1); % offsets offsets = [0, 1, 10]; for o=1:numel(offsets) offset = offsets(o); data_brain = []; for j=1:n1 data_brain{j}.avg = data1(j); data_brain{j}.dimord = 'chan_time'; data_brain{j}.label = {'1'}; data_brain{j}.time = [1]; end data_behav = data2+offset; % inserted in the cfg.ivar-th row of cfg.design % compute statistics with correlationT cfg = []; cfg.statistic = 'ft_statfun_correlationT'; % ft_statfun_correlationT vs. ft_statfun_indepsamplesregrT (produce same t value) cfg.method = 'montecarlo'; cfg.numrandomization = 1000; design(1,1:n1) = data_behav; cfg.design = design; cfg.ivar = 1; stat(o) = ft_timelockstatistics(cfg, data_brain{:}); % ensure line 338 'stat.statkeep(:,i) = statrand;' is uncommented in ft_statistics_montecarlo end figure; for o = 1:numel(offsets) hold on; histogram(stat(o).statkeep) end legend(num2str(offsets(:))) xlabel('t vals') ylabel('counts')

## Eric Maris - 2015-11-02 07:49:59 +0100

I changed the title of the FAQ page because I found the previous title too general. I would like to focus on how to deal with quantitative stimulus and behavioural variables. Stuff like the use of cfg.var should go in a different FAQ. After changing the title the link with the associated page was broken. I don't know how to fix it. Arjen? best, Eric

## Arjen Stolk - 2015-11-02 08:06:04 +0100

(In reply to Eric Maris from comment #22) No problem, we can simply copy the content from that previous page onto the page you created: http://www.fieldtriptoolbox.org/faq/how_can_i_test_for_correlations_between_neuronal_data_and_quantitative_stimulus_and_behavioural_variables

## Eric Maris - 2015-11-02 20:44:02 +0100

I don't see the button that is needed to create the new page of which I have specified the title. Why?

## Arjen Stolk - 2015-11-02 20:46:47 +0100

(In reply to Eric Maris from comment #24) Yeah, it's hard to find. Top left screen: Page tools > create this page

## Eric Maris - 2015-11-03 13:04:41 +0100

I have contributed four sections to the FAQ on statistical testing of relation between neurobiological signal and quantitative independent variables. These section mainly focus on conceptual aspects. I propose that someone else completes this FAQ by contributing concrete advise on what to do in a number of typical cases. best, Eric

## Egbert Hartstra - 2015-11-07 21:18:55 +0100

Dear Eric, Thank you for creating the FAQ. Arjen and I have worked out some concrete examples including example code of using different statfuns in these cases. I have added them to the FAQ page under the section Examples. Eric would you mind to look at them to double check if everything is correct? Best, Egbert

## Eric Maris - 2015-11-07 22:21:13 +0100

Hi Egbert & Arjen, For me, the FAQ looks good. Does one of you want to write a contribution for the Fieldtrip discussion list, referring to the earlier discussions, and referring to the new FAQ? best, Eric

## Arjen Stolk - 2015-11-08 03:17:38 +0100

Hi both, This has become a great wiki page. I think the combination of the overview plus the examples make it very strong, and certainly a page worth forwarding the user to for statistics with fieldtrip. It's true that we still need to update the users on the mailinglist on how we have identified and dealt with the issue of correlationT, and more generally on how we have improved the documentation accordingly. I'll make a draft. Arjen

## Arjen Stolk - 2015-11-08 05:23:50 +0100

A copy of the email: Dear participants in the discussion on behavioural-power correlation, and interested folks, Following recent discussion on this mailing list (thanks to Xiaoming Du and Martin Krebber), we have updated ft_statfun_correlationT, a function that can be used for correlating neural and behavioral variables. Following the update, the correlation values calculated on genuine data have not changed. However, the permutation procedure for calculating the randomization distribution has. Namely, prior to the update the permutation procedure would randomly permute across both the independent (e.g., behavior) and dependent variables (e.g., neural data). This procedure is prone to systematic bias across the data belonging to these variables. And conceptually, as outlined in a new wiki page (see below), the independent and dependent variables should be statistically independent, meaning that any association between these variables should be broken by randomly permuting the values of the independent variable. http://www.fieldtriptoolbox.org/faq/how_can_i_test_for_correlations_between_neuronal_data_and_quantitative_stimulus_and_behavioural_variables? Those that have been using ft_statfun_correlationT for calculating a randomization distribution using the permutation procedure are advised to update to the latest fieldtrip version and re-calculate those distributions. We are sorry for any inconvenience this may cause. On a related note, the functionality of ft_statfun_correlationT (under Pearson) is highly similar to that of ft_statfun_indepsamplesregrT. To make the latter function, and others in the statfun suite, more accessible, we would like to forward those interested to the above wiki page where an overview is provided of the different approaches to correlating neural and behavioral variables, with some example fieldtrip code. Yours, Arjen on behalf of Eric Maris and Egbert Hartstra