Back to the main page.
Bug 1735 - ft_appenddata doesn't append data correctly --> ft_databrowser and ft_fetch_data failure
Status | CLOSED FIXED |
Reported | 2012-09-21 16:42:00 +0200 |
Modified | 2014-02-24 10:56:37 +0100 |
Product: | FieldTrip |
Component: | core |
Version: | unspecified |
Hardware: | PC |
Operating System: | Linux |
Importance: | P3 major |
Assigned to: | Robert Oostenveld |
URL: | |
Tags: | |
Depends on: | |
Blocks: | |
See also: |
Ayelet N. Landau - 2012-09-21 16:42:25 +0200
When appending data1 (trials count 33) and data2 (trial count 46) and get data3 with 79 trials. trials 1-33 are data1 trials 34-66 are data1 (!) and the remaining trials are the last 32 trials from data2. This is very strange, and I look forward to discovering that it is some utterly silly user error on my behalf. the way I assessed this is by going trial by trial in ft_databrowser. attached is a mat file that has two data structures - and another - called test that has the appended version. in data browser you can go to trial 1 and then trial 34 and see that they are identical. you can of course browse the parent structures. last - it would be nice if trialinfo were concatinated too when doig ft_appenddata.
Ayelet N. Landau - 2012-09-21 16:43:59 +0200
Created attachment 317 a file with two parent structurse and 'test' a ft_appenddata output of those variable.
Robert Oostenveld - 2012-09-22 14:22:29 +0200
Thanks for the test data. What is weird is that I cannot reproduce the exact problem: >> test1 = ft_appenddata([], data_trgtOnstimOnBTrgtLDetected_sourceL, data_trgtOnstimOnBTrgtLNotDetected_sourceL) input dataset 1, 1 channels, 33 trials input dataset 2, 1 channels, 46 trials concatenating the trials over all datasets test1 = hdr: [1x1 struct] fsample: 100 sampleinfo: [79x2 double] trial: {1x79 cell} time: {1x79 cell} label: {'sourceECCgammaL_x'} cfg: [1x1 struct] which looks fine, whereas the one in the mat file is >> test test = hdr: [1x1 struct] fsample: 100 sampleinfo: [92x2 double] trial: {1x92 cell} time: {1x92 cell} label: {'sourceECCgammaL_x'} cfg: [1x1 struct] The main function used appears to be the same >> test1.cfg.version ans = name: '/Users/robert/matlab/fieldtrip/ft_appenddata.m' id: '$Id: ft_appenddata.m 5554 2012-03-28 14:00:12Z roboos $' >> test.cfg.version ans = name: '/opt/fieldtrip/ft_appenddata.m' id: '$Id: ft_appenddata.m 5554 2012-03-28 14:00:12Z roboos $'
Robert Oostenveld - 2012-09-22 14:29:35 +0200
Using "isequal" I found something suspicious. Note that test is yours, test1 is mine. >> test.cfg.previous{1}.callinfo ans = matlab: '7.12.0.635 (R2011a)' proctime: 0.2545 procmem: 0 calltime: [2012 9 21 15 46 33.7852] user: 'landaua' hostname: 'esi-svhpc10' pwd: '/mnt/hps/home/landaua/megData/twoG_static/S01_to_S05_trgtanalysis/Backup_24.8/S01_AB' >> test.cfg.previous{2}.callinfo ans = matlab: '7.12.0.635 (R2011a)' proctime: 0.2545 procmem: 0 calltime: [2012 9 21 15 46 33.7852] user: 'landaua' hostname: 'esi-svhpc10' pwd: '/mnt/hps/home/landaua/megData/twoG_static/S01_to_S05_trgtanalysis/Backup_24.8/S01_AB' >> test1.cfg.previous{1}.callinfo ans = matlab: '7.12.0.635 (R2011a)' proctime: 0.2006 procmem: 0 calltime: [2012 9 21 15 46 30.2574] user: 'landaua' hostname: 'esi-svhpc10' pwd: '/mnt/hps/home/landaua/megData/twoG_static/S01_to_S05_trgtanalysis/Backup_24.8/S01_AB' >> test1.cfg.previous{2}.callinfo ans = matlab: '7.12.0.635 (R2011a)' proctime: 0.2545 procmem: 0 calltime: [2012 9 21 15 46 33.7852] user: 'landaua' hostname: 'esi-svhpc10' pwd: '/mnt/hps/home/landaua/megData/twoG_static/S01_to_S05_trgtanalysis/Backup_24.8/S01_AB' Looking at these, I'd say that you have appended two copies of the same dataset. To zoom in, have a look at the actual data structures >> data_trgtOnstimOnBTrgtLNotDetected_sourceL.cfg.callinfo.proctime ans = 0.2545 >> data_trgtOnstimOnBTrgtLDetected_sourceL.cfg.callinfo.proctime ans = 0.2006 Your previous{1} and previous{2} are both equal to data_trgtOnstimOnBTrgtLDetected_sourceL, which has 46 trials, which means that 2*46=92 makes sense. >> isequal(test.trial{1}, test.trial{47}) ans = 1 I'll close it as an invalid bug, if you disagree, please reopen.
Robert Oostenveld - 2012-09-22 14:32:44 +0200
I added it as test script for future reference. mbp> svn commit Adding test/test_bug1735.m Adding (bin) test/test_bug1735.mat Transmitting file data .. Committed revision 6513.
Ayelet N. Landau - 2012-09-28 11:31:03 +0200
I think I made some progress on this bug diagnosis: I am not sure anymore that it is appenddata but rather ft_databrowser. to witness it, I propose you run ft_appenddata on the two variables, and then open the data browser. go to the trial number that should have the first in the second appended variable and compare it to another data browser where you display the first trial. those are displayed as identical. When I plot the data directly from the structure (appended) - it actually does have the new data. I really don't know what indices ft_databrowser uses. The reality is that in an appended dataset, it fails to count up to the correct trial to display best, Ayelet
Robert Oostenveld - 2012-10-01 16:31:52 +0200
(In reply to comment #5) I have extended the test script to --------------- part1 = data_trgtOnstimOnBTrgtLDetected_sourceL; part2 = data_trgtOnstimOnBTrgtLNotDetected_sourceL; cfg = []; combined = ft_appenddata(cfg, part1, part2); assert(length(combined.trial)==79); N1 = size(part1.sampleinfo,1); N2 = size(part2.sampleinfo,1); for i=1:N1 disp(i) assert(isequal(part1.trial{i}, combined.trial{i})); end for i=1:N2 disp(i) assert(isequal(part2.trial{i}, combined.trial{i+N1})); end % the data in the two conditions should be different for i=1:N1 for j=1:N2 assert(~isequal(part1.trial{i}, part2.trial{j})); end end --------------- With this test on the (non) equality of the combined trials to the respective part1 and part2, no errors were found. manzana> svn commit test_bug1735.m Sending test_bug1735.m Transmitting file data . Committed revision 6611. I will try with the databrowser.
Robert Oostenveld - 2012-10-01 16:35:20 +0200
Created attachment 333 screenshot of databrowser that shows the problem yup, this indeed reveals a problem. See attached screenshot with part1, part2 and in the figure with the combined data the 33rd trial, which is expected to be the 1st from part2 but seems to be the 1st of the part1. I guess it is a problem with sampleinfo...
Robert Oostenveld - 2012-10-01 17:09:22 +0200
(I added Roemer to the CC, as I expect he might know more about this) >> combined.sampleinfo ans = 1 51 52 102 103 153 154 204 205 255 256 306 307 357 358 408 409 459 ... 1531 1581 1582 1632 1633 1683 1 51 % this is where part2 starts 52 102 103 153 ... I have added the following to the test script try ft_fetch_data(combined, 'begsample', 1, 'endsample', 50, 'allowoverlap', false); status = false; % an error should be given by ft_fetch_data catch status = true; end assert(status, 'ft_fetch_data did not detect the error properly'); I detected two probems in ft_fetch_data: 1) allowoverlap did not get a default value, causing it to be empty, which is neither true nor false. 2) the short-cut in detecting that it falls inside one trial cause the overlap detection to fail manzana> svn commit utilities/ft_fetch_data.m test/ Sending test/test_bug1735.m Sending utilities/ft_fetch_data.m Transmitting file data .. Committed revision 6612.
Robert Oostenveld - 2012-10-01 17:12:25 +0200
(In reply to comment #8) With the combined dataset, ft_databrowser now prints in the command window Warning: samples present in multiple trials, using only the last occurence of each sample > In ft_fetch_data at 125 In ft_databrowser>redraw_cb at 1383 In ft_databrowser>keyboard_cb at 1175 That is some information to Ayelet that there is a problem. However, in my impression this data should not have been allowed to be plot-able with ft_databrowser in any case. Why does it allow for 100% overlap? This "feature" of ft_databrowser requires discusison. I suggest to do it in the next FT meeting on Wednesday.
Robert Oostenveld - 2012-10-01 17:15:02 +0200
added Jorn and Eelke to CC, removed Johanna.
Robert Oostenveld - 2012-10-01 17:27:46 +0200
(In reply to comment #10) Oh, forgot to mention that that is to your apparent involvement to this function according to http://code.google.com/p/fieldtrip/source/list?path=/trunk/utilities/ft_fetch_data.m
Roemer van der Meij - 2012-10-01 17:46:44 +0200
Sorry that this caused elaborate investigation. The overlap possibillity was added in order to browse segmented data where the last bit of a trial can correspond the first bit of the next trial, i.e. overlapping trials. The only ambiguity here would have been due to data at the edges being different because of filtering artifacts, or other trial specific preprocessing (e.g. demeaning). I added the key-value as exception management. It never occurred to me that there would be a situation where the same number in sampleinfo refers to data of different origin. Now that I think of it, is this intended behaviour? Doesn't it make sense to create a 'non-decreasing' sampleinfo when appending? The true meaning of sampleinfo is lost anways. Let's discuss.
Robert Oostenveld - 2014-02-24 09:08:34 +0100
(I am cleaning up old bugs) This seems to have been resolved, the test script runs fine. If you disagree, please reopen this bug and specify what is still problematic.