Bug 3066 - reimplement the dashboard for github
Status | CLOSED FIXED |
Reported | 2016-02-05 13:18:00 +0100 |
Modified | 2019-08-10 12:37:22 +0200 |
Product: | FieldTrip |
Component: | test |
Version: | unspecified |
Hardware: | PC |
Operating System: | Mac OS |
Importance: | P5 normal |
Assigned to: | Robert Oostenveld |
URL: | |
Tags: | |
Depends on: | 3049 |
Blocks: | |
See also: |
Robert Oostenveld - 2016-02-05 13:18:43 +0100
See http://www.fieldtriptoolbox.org/development/dashboard and of course the emails that are sent twice a week. The dashboard was designed for svn and relied on svn revisions. It needs to be revamped. Since not many people are following it, I also don't think we should run it as frequently as we have so far. I discussed this with JM and have some ideas. The main idea is to have a simpler link between commits and columns in the test result display.
Robert Oostenveld - 2016-02-05 16:23:25 +0100
"NNO" wrote: On the testing front: I now use MOxUnit [4] in combination with MOcov [5], not only for continuous integration testing with travis (or shippable.com), but also to get a test coverage report. I have found this very useful for determining which parts of the code are not sufficiently tested. (The converse unfortunately does not hold: even if a function has 100% code coverage, that does not mean that the code is tested well.) It is also convenient with PRs in CoSMoMVPA, because for each PR the test suite and coverage are now automatically determined and reported on the PR page. I understand that FieldTrip does not yet run fully on Octave, so for testing FieldTrip, Jenkins - as it is used now - also seems the best solution for the future. However, MOcov might be useful for automatically generating coverage reports for FieldTrip in HTML, XML or JSON with the help of the Matlab profiler. See for example the third use case in MOcov's README file [6].
[4] https://github.com/MOxUnit/MOxUnit
[5] https://github.com/MOcov/MOcov
[6] https://github.com/MOxUnit/MOxUnit/blob/master/README.md
Robert Oostenveld - 2016-06-29 13:59:09 +0200
I have a plan for a new dashboard, which consists of bash test scripts
1) to run a batch for all MATLAB versions (given FT dir)
2) to run a batch (given FT dir and MATLAB version)
3) to run a single test (given FT dir and MATLAB version)
Furthermore, there will be a mongoDB server with a node.js web interface. Step 3 would write a record into the DB. Besides an interface for ingestion of records, there will be a "dashboard" web interface to browse the results (given test function, MATLAB version and FT version). It will run on a separate server as http://dashboard.fieldtriptoolbox.org
The actual tests are executed (as before) on the Donders compute cluster.
To realize the new implementation, I have started by putting the historical and existing dashboard code in https://github.com/fieldtrip/dashboard
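The three levels of scripts in this plan could be nested roughly as follows. This is only a sketch and not the code in the dashboard repository: the function names, the `matlab-<version>` launcher name, the `RUNNER` variable and the exact MATLAB command line are all assumptions made for illustration. By default it only prints the commands it would run; writing the result record to the DB is left out.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the three runner levels; all names are assumptions.

# Level 3: run a single test (given FT dir and MATLAB version).
# RUNNER defaults to "echo", so the command is printed instead of executed.
# "matlab-$version" is an assumed per-version launcher on the PATH.
run_single_test() {
  local ftdir=$1 version=$2 testname=$3
  ${RUNNER:-echo} "matlab-$version" -nodisplay -r \
    "addpath('$ftdir'); addpath('$ftdir/test'); ft_defaults; $testname; exit"
}

# Level 2: run a batch of all test scripts (given FT dir and MATLAB version).
run_batch() {
  local ftdir=$1 version=$2 script
  for script in "$ftdir"/test/test_*.m; do
    [ -e "$script" ] || continue   # no test scripts found
    run_single_test "$ftdir" "$version" "$(basename "$script" .m)"
  done
}

# Level 1: run the batch for all MATLAB versions (given FT dir).
run_all_versions() {
  local ftdir=$1 version
  shift
  for version in "$@"; do
    run_batch "$ftdir" "$version"
  done
}
```

For example, `run_all_versions /path/to/fieldtrip R2012b R2016b` prints one command line per test script and version; setting `RUNNER=env` would execute them instead.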
Robert Oostenveld - 2016-07-13 12:53:28 +0200
mac011> git commit -a
[master ea3c2b9] ENH - renamed the function to run test scripts, made 1st implementation for function to get results. See http://bugzilla.fieldtriptoolbox.org/show_bug.cgi?id=3066
 2 files changed, 69 insertions(+), 13 deletions(-)
 create mode 100644 utilities/ft_test_result.m
 rename utilities/{ft_test.m => ft_test_run.m} (89%)
Robert Oostenveld - 2016-10-20 11:05:58 +0200
Note to self (from MEAB): webread exists since 2014b, webwrite since 2015b. These are used in ft_test_xx and ft_trackusage.
Robert Oostenveld - 2017-01-17 13:02:53 +0100
I discussed this with Jan-Mathijs. The design of the user interface will be like this, with a single test function that does the insertion (into the DB) and retrieval (from the DB):
ft_test run test_bug1234
ft_test run maxwalltime 300
ft_test
Robert Oostenveld - 2017-01-17 14:53:38 +0100
[bug3066-dashboard 128c693] ENH - implemented new user interface, all using single function call. Supported (but not 100% tested) are run, report, compare. See http://bugzilla.fieldtriptoolbox.org/show_bug.cgi?id=3066
 6 files changed, 275 insertions(+), 443 deletions(-)
 rewrite utilities/ft_test.m (86%)
 delete mode 100644 utilities/ft_test_result.m
 create mode 100644 utilities/private/ft_test_compare.m
 create mode 100644 utilities/private/ft_test_report.m
 copy utilities/{ft_test.m => private/ft_test_run.m} (78%)
 create mode 100644 utilities/private/struct2table.m
nno - 2017-01-17 15:09:18 +0100
(In reply to Robert Oostenveld from comment #6)
> I made a new branch for this:
>
> Switched to a new branch 'bug3066-dashboard'

Is this branch publicly available? I looked at https://github.com/fieldtrip/fieldtrip but this shows only the master branch, not this new 'bug3066-dashboard' branch. Thanks.
Robert Oostenveld - 2017-01-17 23:05:25 +0100
(In reply to nno from comment #7)
No, it was not available on github yet. I have just done another few commits, the last one being this:
mac011> git commit
[bug3066-dashboard bb26763] ENH - improved the FT_TEST function, made it overall more consistent, tested many cases and improved documentation. See http://bugzilla.fieldtriptoolbox.org/show_bug.cgi?id=3066
 6 files changed, 143 insertions(+), 36 deletions(-)
 create mode 100644 utilities/private/mergecellstruct.m
I have just merged my own bug3066-dashboard branch into the master branch of fieldtrip/fieldtrip. See https://github.com/fieldtrip/fieldtrip/pull/301.
Robert Oostenveld - 2017-01-17 23:11:30 +0100
Still to do:
1) I have to test it on various platforms (so far mainly on my laptop) and see how the raspberry pi mongoDB server keeps up if the DB gets larger.
2) The regular batching of all test scripts in "bulk" on our compute cluster needs to be reimplemented (built on top of "ft_test run").
3) There is still documentation to be written, especially for the non-matlab section. This is all contained in https://github.com/fieldtrip/dashboard, which now needs to be cleaned up (i.e. old stuff needs to be deleted).
---------
I would appreciate it if you could already give it a try on your platforms. Simply do
ft_test run xxx
where xxx is your favorite test function. You can also do
ft_test run
to run all tests. It will sort them by execution time and start with the shortest. You can always abort by pressing ctrl-C. Oh, there are of course tests that don't run because you don't have the required input data. I have not yet explicitly dealt with them, so those will error. And please try out the report and compare modes.
Robert Oostenveld - 2017-01-19 15:57:57 +0100
The first attempt by JM failed, due to the function "pad" missing. That was on 2014b. I wrote a drop-in replacement function for it. But then the next problem surfaced, which is that in 2014b and 2015a some new webwrite/weboptions functionality was introduced, which the code now uses. We should make it compatible with matlab back to 2012a at least (5 years back) and preferably further back. We should also try making it compatible with octave. We should be able to test and compare over OSes, versions, etc. I am now handling the error more explicitly, but a solution is not yet implemented. I did already make a struct2json helper function.
mac011> git commit -am "ENH - deal with older matlab versions for which uploading is not yet supported, see http://bugzilla.fieldtriptoolbox.org/show_bug.cgi?id=3066"
[master 178dee7] ENH - deal with older matlab versions for which uploading is not yet supported, see http://bugzilla.fieldtriptoolbox.org/show_bug.cgi?id=3066
 3 files changed, 79 insertions(+), 6 deletions(-)
 create mode 100644 utilities/private/struct2json.m
Robert Oostenveld - 2017-01-21 10:32:58 +0100
I made a compat/matlablt2014b directory with drop-in replacement functions for webread/webwrite (new) that internally call urlread (old). These should allow us to keep the new webread calls in ft_test and not to worry about different MATLAB versions in that code. ft_test report (and compare) now work on 2012b. I can also see a test committed by Nick. ft_test run does not work yet; writing to the web api still needs to be implemented.
Robert Oostenveld - 2017-01-21 16:48:21 +0100
The matlab function urlread in 2012b supports post, but in a very limited way. I could not get it to work with the dashboard server and reverted to a system call to curl on the command line. The disadvantage is that it won't work on old MATLABs on MS Windows, but it might work on Octave on linux.
mac011> git commit -a
[master 747747c] ENH - implemented webwrite using a system call to curl, ft_test run now works on 2012b (using curl), making run/report/compare all comaptible with older matlab versions on linux/osx. See http://bugzilla.fieldtriptoolbox.org/show_bug.cgi?id=3066
 6 files changed, 66 insertions(+), 13 deletions(-)
Another feature to implement is filtering on the dependency on external files (most of which are on our network storage system). Test scripts that need those files cannot run everywhere (or more accurately: they can run but will fail), whereas many stand-alone scripts are actually interesting to run on a wide range of computers.
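As an illustration of the curl fallback, the kind of command that such a system call could issue is sketched below. The endpoint URL and the JSON fields are placeholders, not the actual dashboard API, and the `post_result` function and `CURL` variable are hypothetical names used only in this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: POST a JSON test record to a web API using curl.
# The URL and the JSON content are supplied by the caller; neither is the
# real dashboard endpoint or record format.
post_result() {
  local url=$1 json=$2
  # CURL can be overridden (e.g. CURL=echo) to inspect the command
  # without contacting any server.
  ${CURL:-curl} --silent --show-error --request POST \
    --header 'Content-Type: application/json' \
    --data "$json" "$url"
}
```

Calling `CURL=echo post_result 'http://localhost:3000/api' '{"result":true}'` prints the arguments that would be passed to curl, which is a convenient way to check the call on a machine without network access.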
Robert Oostenveld - 2017-01-22 10:24:20 +0100
I forgot to add webwrite to the previous commits; it is present now. I also added an option to filter for test scripts that load data from files on our network storage. It works by searching for the string "dccnpath". The default is smart, i.e. if central storage is not available, it will exclude the test scripts that would try to read from it. There are quite a few test scripts that do not use dccnpath correctly and only read from the linux hardcoded /home/common/matlab/fieldtrip/data (or the windows hardcoded) location. I will update those scripts so that they can also be detected as non-runnable on external computers.
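The filtering idea can be sketched outside MATLAB as well. The following is a minimal illustration, not the ft_test implementation: the function name `select_tests` and its arguments are hypothetical, and it simply treats any script containing the string "dccnpath" as depending on central storage.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the "smart" default: list the test scripts that can
# run on this computer. If the storage directory is not available, scripts
# that mention "dccnpath" are excluded.
select_tests() {
  local testdir=$1 storage=$2 script
  for script in "$testdir"/test_*.m; do
    [ -e "$script" ] || continue   # no test scripts found
    if [ ! -d "$storage" ] && grep -q 'dccnpath' "$script"; then
      continue                     # needs data that cannot be reached here
    fi
    basename "$script" .m
  done
}
```

For example, `select_tests /path/to/fieldtrip/test /home/common/matlab/fieldtrip/data` lists all test scripts on a computer where that directory exists, and only the stand-alone ones elsewhere. Scripts that hardcode the data location without using dccnpath would slip through, which is why those scripts need to be updated.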
Robert Oostenveld - 2017-01-22 11:31:34 +0100
mac011> git commit -a
[master c65124e] use dccnpath helper fucntion for all test data that is read from the network-attached storage, see http://bugzilla.fieldtriptoolbox.org/show_bug.cgi?id=3066
 144 files changed, 359 insertions(+), 408 deletions(-)
 create mode 100644 test/inspect_bug1230.m
 rename test/{test_bug645.m => inspect_bug645.m} (81%)
 delete mode 100644 test/test_bug1230.m
All test files now use dccnpath, which means that "ft_test run" will detect test file reading correctly.
nno - 2017-01-23 11:58:10 +0100
Thanks Robert for your work on this, this is very helpful. I tried running this on Octave and came across a few minor errors in the code, which are fixed in this PR: https://github.com/fieldtrip/fieldtrip/pull/303. After these fixes, using Octave (4.0.3) with
ft_test run upload no loadfile no assertclean no
did run, although quite a few tests did not pass. I will try to see whether these tests can be run using travis and MOxUnit.
Robert Oostenveld - 2017-01-30 10:18:14 +0100
There are about 600 test scripts. My recent runs of the whole batch indicate that about 500 of them work, and that about 100 have issues (some minor, such as file relocation problems, some unclear). We could wait until those 100 are all fixed, but that is not likely to happen soon. Following discussion between JM and me, we agreed that:
- the dashboard should among others serve to detect scripts that start(!) breaking
- this requires that there are initially not too many scripts broken
Hence I will rename all currently broken test scripts from test_xxx to failed_xxx. Those are not to be ignored; having them under that name actually makes it easier to work on them (when time allows). Once the underlying cause has been fixed, they should of course be renamed back to test_xxx. This also allows testing that scripts that work on linux (our reference platform) also work on osx (my development platform) and on windows. The same goes for testing scripts over different matlab versions. I am presently running the whole batch. I will use the outcome of that batch to rename the scripts.
nno - 2017-01-30 13:18:53 +0100
I have sent a Pull Request adding support using MOxUnit and continuous integration testing: https://github.com/fieldtrip/fieldtrip/pull/310
Example output:
- https://travis-ci.org/nno/fieldtrip/builds/196540885 (scroll to the bottom to see a summary of the test results)
- https://app.shippable.com/runs/588f1a6a6076a90f00241863/1/tests (click on 'tests' on the left hand side to see results from non-passing tests)
Using LIMITS="'maxwalltime',600,'maxmem','1gb'", the statistics currently are: 42 passing, 29 failing, 1 error, 522 skipped. Most tests are skipped due to memory limitations or the absence of 'dccnpath'. This PR could be useful as an additional method to track test results. I would be curious to hear comments about this approach.
Robert Oostenveld - 2017-01-31 08:51:25 +0100
(In reply to Robert Oostenveld from comment #16)
I have renamed the failing test scripts. I deleted one, which was present twice (test_ft_plot_mesh, also in plotting/test). In the list I renamed, there was one that did not report to the dashboard database correctly (test_old_ft_write_volume) due to messing up the path.
mac011> git push upstream master
Counting objects: 3, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 987 bytes | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To github.com:fieldtrip/fieldtrip.git
   1527905..7c2e2b0  master -> master
I will start the whole batch once more, to confirm that we have a clean baseline.
nno - 2017-01-31 10:31:33 +0100
(In reply to Robert Oostenveld from comment #18)
> I have renamed the failing test scripts

Function names were not changed. This is addressed in this PR: https://github.com/fieldtrip/fieldtrip/pull/312
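Since a MATLAB function file is expected to declare a function with the same name as the file, renaming test_xxx.m to failed_xxx.m also means updating the function line. A minimal sketch of doing both in one step is given below; the `mark_as_failed` helper is hypothetical and is not what the PR does. In a git checkout one would probably use "git mv" followed by an edit, so that the rename is tracked.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: rename test_xxx.m to failed_xxx.m and update the
# function declaration inside the file accordingly.
mark_as_failed() {
  local script=$1 dir name newname
  dir=$(dirname "$script")
  name=$(basename "$script" .m)      # e.g. test_bug1234
  newname="failed_${name#test_}"     # e.g. failed_bug1234
  # Only lines starting with "function" are rewritten; the original file is
  # removed once the new file has been written successfully.
  sed "/^function/s/$name/$newname/" "$script" > "$dir/$newname.m" \
    && rm "$script"
}
```

For example, `mark_as_failed test/test_bug1234.m` would leave test/failed_bug1234.m with `function failed_bug1234` as its declaration.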
Robert Oostenveld - 2018-05-31 10:51:26 +0200
I have updated the documentation on http://www.fieldtriptoolbox.org/development/dashboard. Combined with the documentation on https://github.com/fieldtrip/dashboard, I don't think there is anything else that needs to be done.