Back to the main page.
Bug 1315 - *something* in convert2raw or raw2timelock or timelock2raw ruins time-axis
Status | CLOSED FIXED |
Reported | 2012-02-08 10:43:00 +0100 |
Modified | 2014-01-29 13:28:44 +0100 |
Product: | FieldTrip |
Component: | core |
Version: | unspecified |
Hardware: | PC |
Operating System: | Windows |
Importance: | P3 normal |
Assigned to: | Eelke Spaak |
URL: | |
Tags: | |
Depends on: | |
Blocks: | |
See also: |
Jörn M. Horschig - 2012-02-08 10:43:17 +0100
Found by Ana: >> erf_planar erf_planar = time: [1x1800 double] label: {548x1 cell} grad: [1x1 struct] trialinfo: [148x5 double] cfg: [1x1 struct] trial: [148x548x1801 double] sampleinfo: [148x2 double] dimord: 'rpt_chan_time' I got her data and will try to figure out why .time has different dimension that the last dim of .trial
Johanna - 2012-02-08 10:51:09 +0100
is it related to bug 498?
Jörn M. Horschig - 2012-02-08 12:49:53 +0100
A little bit, yes. The problem is caused by ft_checkdata, line 1711, where the sampling frequency is estimated using: data.fsample = 1/(data.time{seln}(2)-data.time{seln}(1)); As Roemer earlier said, the sampling frequency could be better approximated by mean(diff(data.time{seln})), which seems to be numerical more precise. However, due to the double conversion when calling ft_megplanar and ft_combineplanar with timelock data (actually it's two times timnelock2raw and raw2timelock), the numerical accuracy get's reduced even more, and after ft_combineplanar, fsample is not 1200 but: K>> fprintf('%.16f\n', data.fsample) 1199.9999999999745000 - hacky solution: round fsample to using the last, say, 8 digits - work intense solution: re-add fsample more to be discussed in 11 minutes
Jörn M. Horschig - 2012-02-08 12:53:18 +0100
oh, just to add to this: ft_checkdata is calling raw2timelock and reconstructs the time-axis in line 1658: tmptime = min(begtime):(1/data.fsample):max(endtime); try yourself at home: -0.5:1/1200:1 gives 1801 points -0.5:1/1199.9999999999745000:1 gives 1800 points adding to more nines solves the problem, btw ^^
Jörn M. Horschig - 2012-02-08 15:58:41 +0100
spike2raw actually uses fsample, but requires fsample as an extra input: % SPIKE2RAW converts a point representation SPIKE % structure to a continuous representation DATA structure. % % Use as % [data] = ft_spike2raw(spike,fsample) % % Inputs: % Data is a standard RAW structure. % % fsample is the sampling rate for the continuous representation in RAW I think it should be made consistent, or should I not care about this?
Jörn M. Horschig - 2012-02-08 15:59:18 +0100
Hi JM, please see my previous question
Jan-Mathijs Schoffelen - 2012-02-08 16:38:05 +0100
hmmm, perhaps leave this as is for the time being. I am not familiar with the spike datatype, and I always thought that this one is a tricky one and may not be forced into a fieldtrip straightjacket.
Jörn M. Horschig - 2012-02-08 16:59:04 +0100
I try to be short and concise, but this is a lot of info: The old way of defining time-axis was by computing begin- and endsample: begsmp = round((begtime - min(begtime)) * data.fsample) + 1; endsmp = round((endtime - min(begtime)) * data.fsample) + 1; Without .fsample, this is not possible. Instead, we need to rely on the time-axis of the data, solely. Due to possible rounding error on the time-axis, I came up with this, for computing a common time-axis: tmptime = unique(round(10^9*cell2mat(data.time))); This rounds to up to 9 digits. The individual time-axis can then be recreated by doing similar: % concatenate all trials tmptrial = zeros(ntrial, nchan, length(tmptime)) + nan; for i=1:ntrial begsmp = find(round(10^9*data.time{i}(1)) == tmptime); endsmp = find(round(10^9*data.time{i}(end)) == tmptime); tmptrial(i,:,begsmp:endsmp) = data.trial{i}; end Good? All isequal matching suffer from numerical inaccuracies, so I think we need some kind of rounding as proposed here.
Jan-Mathijs Schoffelen - 2012-02-09 09:18:49 +0100
In the whole numerical accuracy discussion that pops up now and then, a function which is always mentioned in the end is nearest (which is a private function). Although the rounding may solve it for the time being, there is no guarantee that in the future someone will sample at the < nanosecond scale, and then using 10^e breaks down. Regarding nearest: it returns the index to the entry in an array, that is closest to the specified value. I think a principled solution would be to use nearest, to determine the offset, i.e. for i = 1:numtrl ix(i) = nearest(data.time{i}, 0); % number of samples pre-zero iy(i) = numel(data.time{i}) - ix(i); % number of samples post-zero (FIXME give or take 1 sample, not thought through yet) end % now we need to sort out what is to be done when all(ix==1) or all(iy==1), meaning that the 0 time point occurs outside all time axes % but if the above issue is ironed out, we can determine the required length of the time axis as: [mx,ix2] = max(ix); [my,iy2] = max(iy); nsmp = mx+my; %FIXME again give or take 1 sample, and tlck.time = nsmp.*(-mx:my)./(data.time{iy2}(my)-data.time{ix2}(mx)); Bottom line: could we use the nearest function to determine begsmp and endsmp?
Jörn M. Horschig - 2012-02-09 10:12:15 +0100
yep, I also thought about this - I'm gonna this and compare the two different approaches (and prefer the latter if both work out equally well)
Jörn M. Horschig - 2012-02-09 15:35:39 +0100
I am not sure how nearest is implemented, but I guess it substracts the scalar from the vector and takes the minimum, right? I wondered a bit that it works with inf, but since it does, that should do it: for i = 1:ntrial ix(i) = nearest(data.time{i}, -inf); % most negative sample iy(i) = nearest(data.time{i}, inf); % most positive end [mx,ix2] = min(ix); %most negative sample [my,iy2] = max(iy); %most positive sample nsmp = mx+my-1; % prob: tmptime = linspace(data.time{ix2}(mx), data.time{iy2}(my), nsmp); Then we don't have to think about how to account for zeros outside the interval. Using 'nearest' we can also solve finding the corresponding beg- and endsamples: % concatenate all trials tmptrial = zeros(ntrial, nchan, length(tmptime)) + nan; for i=1:ntrial begsmp(i) = nearest(tmptime, data.time{i}(1)); endsmp(i) = nearest(tmptime, data.time{i}(end)); tmptrial(i,:,begsmp(i):endsmp(i)) = data.trial{i}; end Additionally, I'm gonna make a testscript.
Jörn M. Horschig - 2012-02-10 09:05:44 +0100
190 $ svn ci utilities/ft_checkdata.m -m "bugfix-#1315- fsample is deprecated and not needed anymore when transforming from and to raw data" Sending utilities/ft_checkdata.m Transmitting file data . Committed revision 5265. 192 $ svn add test_bug1315.mat test_bug1315.m test_ft_checkdata.m A (bin) test_bug1315.mat A test_bug1315.m A test_ft_checkdata.m jorhor@mentat312:~/FieldTrip/trunk/test 193 $ svn ci -m "added testscript and -data for bug1315 and a more general testscript for ft_checkdata - over&out" Adding test/test_bug1315.m Adding (bin) test/test_bug1315.mat Adding test/test_ft_checkdata.m Transmitting file data ... Committed revision 5267.
Jörn M. Horschig - 2012-02-10 09:55:58 +0100
ah, damn it... the whole thing breaks with pretty strange time-axis like: data.time{1} = [-2, -1] data.time{2} = [3, 4] Unfortunately, I see no way to solve this without computing some kind of fsample...
Jan-Mathijs Schoffelen - 2012-02-10 09:59:49 +0100
I was afraid indeed that this would happen. Should we consider enforcing the time axes to have some minimal overlap?
Jörn M. Horschig - 2012-02-10 11:33:08 +0100
How to? And does it make sense for all analysis?
Jan-Mathijs Schoffelen - 2012-02-10 11:43:10 +0100
I guess this is not really a good option. Could we think of enhancing the functionality of nearest() so that it optionally outputs index values that are outside the scope of the input vector? (i.e. negative numbers, or numbers>numel(inputvector)?
Jörn M. Horschig - 2012-02-10 12:04:17 +0100
you mean some function that 'interpolates' up/down to 0 (or whatever scalar) and returning the theoretical distance in samples? I think this should be different from 'nearest', but could be done. In essence, it would also need to estimate sampling frequency somehow
Jörn M. Horschig - 2012-02-10 12:04:36 +0100
uh, extrapolate I mean
Jörn M. Horschig - 2012-02-10 13:59:02 +0100
okay, this is a bit hacky, but on the one hand the best I could come up with the last two hours and on the other hand I couldn't find a way to break it: begtime = cellfun(@min,data.time); endtime = cellfun(@max,data.time); for i = 1:ntrial mint = min(eps, begtime(i)); maxt = max(eps, endtime(i)); time = data.time{i}; % extrapolate so that we get near 0 time = interp1(time, time, mint:mean(diff(time)):maxt, 'linear', 'extrap'); ix(i) = sum(time<0); % number of samples pre-zero iy(i) = sum(time>=0); % number of samples post-zer end rest see above. Further, I extended the test-scrpt is that FieldTrip appropriate? (I'm comitting this anyway for now, since it is more stable than the current version) While working on this, I also found something other odd, so there might (again) be slight numerical issues: x = -1.5:0.22:3; y = linspace(min(x), max(x), numel(x)); sum(x==y) / numel(x) ans = 0.5714 Matlab rocks ;)
Jan-Mathijs Schoffelen - 2012-02-10 15:46:17 +0100
funny things indeed. I guess this is a good fix, albeit hacky ;-). Does the interpolation affect the outcome if 0 is contained within the time axis? (I guess not). Which test script are you referring to?
Jörn M. Horschig - 2012-02-10 16:15:39 +0100
I made test_ft_checkdata. Things that are tested: % make some raw data with unequal time-axis, including 0 implicitly data.time{1} = [-2 -1]; data.time{2} = [3 4]; tmp = ft_checkdata(data, 'datatype', 'timelock'); % make some raw data with unequal time-axis, excluding 0 data.time{1} = [-1.5 -1.28]; data.time{2} = [2.68 2.9]; tmp = ft_checkdata(data, 'datatype', 'timelock'); % make some raw data with unequal time-axis, including 0 data.time{1} = -.5:.25:1; data.time{2} = -.25:.25:.25; data.time{3} = .25:.25:1; data.time{4} = -.5:.25:-.25; tmp = ft_checkdata(data, 'datatype', 'timelock'); %% shift time axis to be strictly positive for i=1:numel(data.time) data.time{i} = data.time{i} + 0.6; end tmp = ft_checkdata(data, 'datatype', 'timelock'); %% shift time axis to be strictly negative for i=1:numel(data.time) data.time{i} = data.time{i} - .6 - 1.1; end tmp = ft_checkdata(data, 'datatype', 'timelock'); %% make time-axis incredibly small for i=1:numel(data.time) data.time{i} = (data.time{i}) ./ (10^-12); end tmp = ft_checkdata(data, 'datatype', 'timelock'); %% make time-axis awesomly huge for i=1:numel(data.time) data.time{i} = data.time{i} .* (10^12); end tmp = ft_checkdata(data, 'datatype', 'timelock'); I think this should cover all basic things that might go wrong
Jörn M. Horschig - 2012-02-10 16:16:51 +0100
oh, and there is test_bug1315 with Ana's data now
Jörn M. Horschig - 2012-02-13 14:08:18 +0100
no complaints so far, so I assume it's running smoothly
Jan-Mathijs Schoffelen - 2012-02-13 14:08:46 +0100
hurrah
Jörn M. Horschig - 2012-08-23 14:02:12 +0200
bug closing time (http://www.youtube.com/watch?v=xGytDsqkQY8)
Johanna - 2012-10-19 13:21:54 +0200
test_bug1315 crashes. error occurs in call to ft_checkdata.
Jörn M. Horschig - 2012-10-19 14:44:23 +0200
Afair, JM encountered this one in the bug bings. The reason is that megplanar adds channellabels not present in the data but in the grad. ft_senstype then determines the sensor type based on the relative number of the channel labels (>80%). If combineplanar is called with a subset <20%, then >80% of 'wrong' channel labels are added. So, the first question to ask is whether megplanar should add channel labels of not present channels? that's line 284 in ft_megplanar: interp = ft_apply_montage(data, planarmontage, 'keepunused', 'yes', 'feedback', cfg.feedback);
Eelke Spaak - 2012-11-28 14:24:31 +0100
see bug 1858 for the 'new' issue of incorrectly identifying planar gradient data