Bug 1038 - the reading functions can support tgz and zip datasets, unzip on the fly
Status: CLOSED FIXED
Reported: 2011-10-12 16:46:00 +0200
Modified: 2019-08-10 12:29:19 +0200
Product: FieldTrip
Component: fileio
Version: unspecified
Hardware: PC
Operating System: Mac OS
Importance: P3 enhancement
Assigned to: Eelke Spaak
URL:
Tags:
Depends on:
Blocks:
See also: http://bugzilla.fcdonders.nl/show_bug.cgi?id=1747
Robert Oostenveld - 2011-10-12 16:46:41 +0200
this requires unpacking in a temporary directory and a unique way of assigning a tmp-identifier. TODO
Boris Reuderink - 2011-11-17 10:46:39 +0100
Changed the status of bugs without a specific owner to UNCONFIRMED. I'll try to replicate these bugs (potentially involving the submitter), and change confirmed bugs to NEW. Boris
Boris Reuderink - 2012-01-03 14:38:29 +0100
Confirmed (enhancement by Robert). Changed status to NEW.
Boris Reuderink - 2012-02-03 12:04:16 +0100
Maybe some ideas can be borrowed from Python's DataSource: http://docs.scipy.org/doc/numpy/reference/generated/numpy.DataSource.html It implements caching and opening from URLs on top of Robert's proposal.
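The caching idea behind DataSource can be illustrated with a small stdlib-only Python sketch (hypothetical names, not NumPy or FieldTrip code): open a file transparently, decompressing .gz inputs into a cache directory so that repeated opens of the same file reuse the earlier decompression.

```python
import gzip
import os
import shutil
import tempfile

def cached_open(path, cache_dir=None):
    """Open a file for reading, transparently decompressing .gz files.

    Decompressed copies are kept in cache_dir, so opening the same
    .gz file twice only decompresses it once -- similar in spirit to
    numpy.DataSource's caching behaviour.
    """
    if not path.endswith('.gz'):
        return open(path, 'rb')
    cache_dir = cache_dir or tempfile.mkdtemp(prefix='ftcache_')
    cached = os.path.join(cache_dir, os.path.basename(path)[:-3])
    if not os.path.exists(cached):
        with gzip.open(path, 'rb') as src, open(cached, 'wb') as dst:
            shutil.copyfileobj(src, dst)
    return open(cached, 'rb')
```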
Robert Oostenveld - 2012-02-03 15:33:02 +0100
Jan-Mathijs, could this download-and-cache-on-the-fly be of relevance for the HCP? Do you know how the mgz format is dealt with? I suspect it is compressed.
Jan-Mathijs Schoffelen - 2012-02-03 16:34:28 +0100
Yes, the mgz format is compressed. It is compressed/decompressed on the fly, using unix('....gzip etc') for compression, and platform-dependent unix('...gunzip etc') or unix('...zcat ...') for decompression. This does not seem to work on Windows.
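The Windows problem comes from shelling out to unix tools. Doing the (de)compression in-process avoids it entirely; MATLAB has built-in gzip/gunzip functions for this, and the same idea is sketched below in Python with the stdlib gzip module (illustrative only, not FieldTrip code):

```python
import gzip

def gz_compress(data: bytes) -> bytes:
    """Compress in-process, with no unix() shell-out, so the same
    code runs unchanged on Windows, macOS, and Linux."""
    return gzip.compress(data)

def gz_decompress(blob: bytes) -> bytes:
    """Inverse of gz_compress; also reads files written by gzip(1)."""
    return gzip.decompress(blob)
```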
Eelke Spaak - 2012-02-29 14:47:01 +0100
Proposal for implementing:
- Change ft_filetype to detect .zip, .tgz, .tar.gz, and .gz file extensions, and return that as the filetype.
- Change ft_read_header and ft_read_data to check whether the filetype equals one of the compressed types and, if so, extract the file to a temporary directory, then recursively call ft_read_header/ft_read_data on the extracted file set.
Am I missing something that would make this not work? (I don't have any experience with these reading functions, or with data formats other than CTF.)
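The two steps of this proposal can be sketched as follows. This is a stdlib Python analogue with hypothetical names; the actual FieldTrip implementation is MATLAB (ft_filetype plus the later inflate_file helper):

```python
import gzip
import os
import shutil
import tarfile
import tempfile
import zipfile

# extensions the hypothetical filetype detector would recognise
COMPRESSED_EXTS = ('.zip', '.tgz', '.tar.gz', '.gz')

def compressed_filetype(filename):
    """Return the matching compressed extension, or None (cf. the
    proposed ft_filetype change)."""
    for ext in COMPRESSED_EXTS:
        if filename.endswith(ext):
            return ext
    return None

def inflate_to_tempdir(filename):
    """Extract an archive into a fresh temporary directory and return
    that directory, so a reader can recurse into the extracted files."""
    outdir = tempfile.mkdtemp(prefix='ft_inflate_')
    if filename.endswith('.zip'):
        with zipfile.ZipFile(filename) as zf:
            zf.extractall(outdir)
    elif filename.endswith(('.tgz', '.tar.gz')):
        with tarfile.open(filename) as tf:
            tf.extractall(outdir)
    else:  # plain .gz: a single compressed file, not an archive
        out = os.path.join(outdir, os.path.basename(filename)[:-3])
        with gzip.open(filename, 'rb') as src, open(out, 'wb') as dst:
            shutil.copyfileobj(src, dst)
    return outdir
```

A reader would then call itself on the contents of the returned directory, which is exactly the recursion the proposal describes.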
Eelke Spaak - 2012-02-29 15:07:52 +0100
JM thinks it's a good idea, working on it.
Robert Oostenveld - 2012-03-02 17:25:52 +0100
(In reply to comment #7) Indeed a good idea. I suggest the test scenario include at least these cases:
1) ctf_ds, i.e. a *.ds directory, where the directory would be zipped.
2) a simple one-file EEG format, e.g. biosemi_bdf or ns_cnt, where the file is zipped.
3) the brainvision triplet of vhdr+vmrk+eeg files. These are normally not in a directory.
Case 1 also applies to neuralynx_sdma; egi_mff would also be a good test case.
Case 2 applies to a lot of formats and is simple; no additional testing needs to be done. But it raises the question: how do ft_filetype and ft_read_xxx interact with each other? Should ft_filetype already unzip? How to avoid multiple unzip actions? Should there be a "zipcache" function in fileio/private with a persistent list/struct-array holding the original zipped filename and the alternative unzipped filename?
Case 3 also applies to a set of dicom images (ft_read_mri), or to an analyze hdr+img anatomical file. In case 3, should it support both a "flat" zipfile and one with a subdirectory in it?
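The suggested "zipcache" could look something like the following sketch (Python stdlib, hypothetical names; the real helper would be a MATLAB function in fileio/private using a persistent variable instead of a module-level dict). The point is the persistent mapping from zipped filename to unzipped location, so ft_filetype and ft_read_xxx do not each unzip the same archive again:

```python
import tempfile
import zipfile

# persistent mapping: original zipped filename -> unzipped directory
# (stands in for a MATLAB persistent struct-array)
_ZIPCACHE = {}

def zipcache(zipname):
    """Return the directory holding the unzipped contents of zipname,
    extracting the archive only on the first call."""
    if zipname not in _ZIPCACHE:
        outdir = tempfile.mkdtemp(prefix='zipcache_')
        with zipfile.ZipFile(zipname) as zf:
            zf.extractall(outdir)
        _ZIPCACHE[zipname] = outdir
    return _ZIPCACHE[zipname]
```

With this in place, ft_filetype could safely "already unzip": a later ft_read_xxx call on the same archive just gets the cached directory back.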
Robert Oostenveld - 2012-09-26 22:34:45 +0200
(In reply to comment #3) I moved the URL idea to the separate bug 1747
Eelke Spaak - 2014-01-29 14:59:52 +0100
bash-4.1$ svn commit test/test_readcompresseddata.m fileio
Sending fileio/ft_read_data.m
Sending fileio/ft_read_header.m
Sending fileio/ft_read_mri.m
Adding fileio/private/inflate_file.m
Adding test/test_readcompresseddata.m
Transmitting file data .....
Committed revision 9149.