Bug 1169 - can low-level fileio know how to read only requested channels?
Status | CLOSED FIXED |
Reported | 2011-11-18 10:31:00 +0100 |
Modified | 2012-04-11 16:48:27 +0200 |
Product: | FieldTrip |
Component: | fileio |
Version: | unspecified |
Hardware: | PC |
Operating System: | Windows |
Importance: | P4 enhancement |
Assigned to: | Robert Oostenveld |
URL: | |
Tags: | |
Depends on: | |
Blocks: | |
See also: |
Johanna - 2011-11-18 10:31:28 +0100
In ft_read_data, some data types pass 'chanindx' to the low-level function (e.g. read_biosig_data), whereas others do not (e.g. read_brainvision_eeg). However, even within read_biosig_data it seems that all the data is read in first and the subset of channels is selected afterwards, so in practice there is no difference between the two cases. Sarang Dalal reported this, as he wants to read only 1-2 channels of overnight sleep data (8 hours of recording) at once. For memory reasons this does not work with the current code, which first reads all channels and then immediately subselects. Thus: can low-level fileio be modified to read only the requested channels?
Robert Oostenveld - 2011-11-18 11:03:39 +0100
Whether selecting channels is feasible depends on the low-level format. E.g. in a multiplexed file representation the overall reading speed does not benefit from reading one sample, doing an fseek, reading one sample, doing an fseek... Another reason is that some low-level readers don't support it and I did not bother implementing it in the low-level code. The point about the memory is well taken and has been previously identified. For some formats you should be able to recognize a code structure like

  for i=1:nblocks
    tmp = read_block_with_all_channels
    dat(:,sampleselection) = tmp(chanselection,:);
  end

which reads short snippets with all channels and only accumulates the selected channels. This particular issue is general for multiple file formats, therefore preferably we would make a general solution (just like the general solution that is in place for those low-level readers returning the data with a dimord that is not chan_time).
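The block-wise pattern described above can be sketched outside MATLAB as well. The following is a minimal Python illustration, not FieldTrip code: the function name `read_selected_channels`, the headerless little-endian int16 layout and the `block_samples` parameter are all assumptions made for the example. It reads short blocks containing all channels and keeps only the requested ones, so memory use scales with the number of selected channels rather than with the full recording.

```python
import struct

def read_selected_channels(path, nchans, chanindx, block_samples=1024):
    """Read a multiplexed file block by block, keeping only chanindx.

    Assumed layout (hypothetical, for illustration only): little-endian
    int16, sample-major order (all channels of sample 0, then all channels
    of sample 1, ...), no file header. Channel indices are 0-based.
    """
    frame_size = 2 * nchans                    # bytes per multiplexed sample
    frame = struct.Struct("<%dh" % nchans)     # one sample across all channels
    selected = [[] for _ in chanindx]          # one growing list per kept channel
    with open(path, "rb") as f:
        while True:
            block = f.read(frame_size * block_samples)
            nframes = len(block) // frame_size  # ignore a trailing partial frame
            if nframes == 0:
                break
            for k in range(nframes):
                values = frame.unpack_from(block, k * frame_size)
                for row, ch in enumerate(chanindx):
                    selected[row].append(values[ch])
    return selected
```

Only one block of all-channel data is held in memory at a time; everything else that accumulates belongs to the selected channels.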
Johanna - 2011-11-18 11:11:10 +0100
I have asked Sarang to get a bugzilla account and mentioned your suggestion to him. I also suggested something similar to him, to be done at the higher level: call ft_preprocessing for smaller time windows (e.g. 30 mins), then use ft_appenddata.
Boris Reuderink - 2011-11-18 15:17:14 +0100
Changed bug to UNCONFIRMED since it has not been assigned to someone yet. Then, due to my nasty new bug lifecycle, I'll change it to NEW.
Boris Reuderink - 2011-11-18 15:17:35 +0100
And NEW it is (confirmed by Robert).
Robert Oostenveld - 2011-12-08 17:10:09 +0100
I have merged the sections for the int16, int32 and float32 using something like this

  if strcmpi(hdr.DataFormat, 'binary') && strcmpi(hdr.DataOrientation, 'multiplexed') && any(strcmpi(hdr.BinaryFormat, {'int_16', 'int_32', 'ieee_float_32'}))
    switch lower(hdr.BinaryFormat)
      case 'int_16'
        sampletype = 'int16';
        samplesize = 2;
      case 'int_32'
        sampletype = 'int32';
        samplesize = 4;
      case 'ieee_float_32'
        sampletype = 'float32';
        samplesize = 4;
    end % case ...

and used a variant of the code Sarang sent me to read only the selected channels.

--------------------------------------------------------
manzana> svn commit
Sending fileio/ft_read_data.m
Sending fileio/private/read_brainvision_eeg.m
Transmitting file data ..
Committed revision 4958.
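For illustration only, here is one way a single channel could be pulled out of a multiplexed binary file once the sample type and size are known. This is a Python sketch and not the code committed in revision 4958 (which is not shown in this thread). The `BinaryFormat` names are the ones quoted above; the function name `read_channel`, the little-endian byte order, the absence of a file header and the 0-based indexing are assumptions.

```python
import struct

# BinaryFormat names as quoted in the comment above, mapped to
# (struct type code, bytes per sample). Little-endian is an assumption.
SAMPLE_TYPES = {
    "int_16": ("h", 2),
    "int_32": ("i", 4),
    "ieee_float_32": ("f", 4),
}

def read_channel(path, binary_format, nchans, chan, begsample, endsample):
    """Read one channel for samples begsample..endsample (inclusive, 0-based).

    Assumes a headerless multiplexed file: all channels of sample 0,
    then all channels of sample 1, and so on.
    """
    code, samplesize = SAMPLE_TYPES[binary_format.lower()]
    stride = nchans * samplesize               # bytes from one sample to the next
    values = []
    with open(path, "rb") as f:
        for s in range(begsample, endsample + 1):
            f.seek(s * stride + chan * samplesize)
            raw = f.read(samplesize)
            if len(raw) < samplesize:          # ran past the end of the file
                break
            values.append(struct.unpack("<" + code, raw)[0])
    return values
```

This keeps memory use minimal but issues one seek per sample, which is exactly the access pattern described earlier in the thread as giving no speed benefit for multiplexed files; reading in blocks and discarding unwanted channels is the usual compromise.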
Robert Oostenveld - 2011-12-08 17:16:17 +0100
@Sarang: in the commit I just made I have not changed the sparse. It works like this for me (matlab2010b on OS X) and I have never heard of anyone else having problems with it. The reason for using "sparse" is that it makes the multiplication with the diagonal matrix much faster, because MATLAB knows that each row only needs to be multiplied by the corresponding diagonal element. The normal non-sparse multiplication would also involve many multiplications with zero (which don't have an effect, but do take time).

  >> dat = randn(1000,100000);
  >> a = eye(1000);
  >> s = sparse(a);
  >> tic; a*dat; toc
  Elapsed time is 6.023062 seconds.
  >> tic; s*dat; toc
  Elapsed time is 1.354644 seconds.

In either case the result of the multiplication is a non-sparse double matrix. Let me know if you have a problem with it.
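The reasoning behind the timing difference can be made concrete with a small sketch. The Python below is an illustration only (the function names `dense_diag_multiply` and `row_scale` are hypothetical, and this is not how MATLAB implements sparse multiplication): multiplying by a diagonal calibration matrix gives the same result as scaling each row by its own gain, but the dense version also performs all the multiplications by zero.

```python
def dense_diag_multiply(gains, dat):
    """Multiply diag(gains) by dat the dense way, zeros included.

    For n channels and m samples this performs n*n*m multiplications.
    """
    n = len(gains)
    diag = [[gains[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    ncols = len(dat[0])
    return [[sum(diag[i][k] * dat[k][j] for k in range(n)) for j in range(ncols)]
            for i in range(n)]

def row_scale(gains, dat):
    """Same result using only the diagonal entries: n*m multiplications."""
    return [[g * x for x in row] for g, row in zip(gains, dat)]
```

With 1000 channels the dense product does roughly a thousand times more multiplications than the row scaling, almost all of them by zero.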
Sarang Dalal - 2011-12-12 16:38:37 +0100
In fact, the problem with the "sparse" call occurs when exactly one channel is selected, e.g.: cfg.channel = {'Cz'}; This inserts a sparse matrix into the trial{1} field, which is incompatible with many common MATLAB functions. (I have tried on MATLAB R2007b and R2011b on Mac.) When 2 or more channels are selected, the resulting data appears to be a regular "full" matrix.
Robert Oostenveld - 2011-12-12 18:14:41 +0100
(In reply to comment #7) Good to know, this should indeed be fixed. In case of nchan=1, a sparse multiplication would also not have any computational advantage (and memory-wise it is disadvantageous).
Robert Oostenveld - 2011-12-14 10:58:54 +0100
(In reply to comment #8) All occurrences should be fixed, not only the one for brainvision. Sparse multiplications for data calibration are done here:

manzana> grep -l sparse *.m private/*.m
ft_read_data.m
ft_read_sens.m
private/ama2vol.m
private/bti2grad.m
private/ft_apply_montage.m
private/ft_checkdata.m
private/ft_datatype_spike.m
private/ft_datatype_spikeraw.m
private/ft_datatype_sts.m
private/openbdf.m
private/read_biosemi_bdf.m
private/read_brainvision_eeg.m
private/read_edf.m
private/read_shm_data.m
private/undobalancing.m
private/yokogawa2grad.m
private/yokogawa2grad_new.m
Robert Oostenveld - 2011-12-14 11:48:09 +0100
manzana> svn commit
Sending fileio/ft_read_data.m
Sending fileio/ft_read_sens.m
Sending fileio/private/ama2vol.m
Sending fileio/private/bti2grad.m
Sending fileio/private/ft_apply_montage.m
Sending fileio/private/read_biosemi_bdf.m
Sending fileio/private/read_brainvision_eeg.m
Sending fileio/private/read_edf.m
Sending fileio/private/read_shm_data.m
Sending fileio/private/undobalancing.m
Sending fileio/private/yokogawa2grad.m
Sending fileio/private/yokogawa2grad_new.m
Sending forward/ft_prepare_vol_sens.m
Transmitting file data .............
Committed revision 5035.