Re: [gmx-users] gromp error for gmxtest-4.0.4 and 3.3.3 on new gromacs-4.0.5 install

Justin A. Lemkul Tue, 01 Dec 2009 15:45:47 -0800


Mark Abraham wrote:

mrshi...@gmail.com wrote:
What other input might you need for a test set? As a minor developer and a stickler for accuracy, I would be very much interested in the sorts of inputs you're looking for, and have some ideas as well.


I'd be willing to help, as well.

There are a number of issues listed in the posts in the URLs below, chief among them the absence of a bug-free reference version of GROMACS. The other issues mostly arise because if there's no documentation of what is being tested *for*, it's hard to do maintenance on the test.

Can we guarantee any version will ever be bug-free? Or should there just be a test set for every version such that the tests succeed given the inherent limitations or potential bugs in the software? This could get a bit laborious, re-creating reference data for every version, but might be the most thorough way to proceed.

There are currently no tests designed for GROMACS in parallel. It's far from clear that there's a suitable reference GROMACS version anyway.

Would it even be possible to design a meaningful parallel set, given the inherent potential for deviations due to, e.g., dynamic load balancing? Even mdrun -reprod doesn't completely guarantee reproducibility, does it? Might some of the more advanced features also depend on the FFT implementation?
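
If bitwise reproducibility can't be assumed, one option might be to compare parallel results against the reference within a tolerance rather than exactly. A rough sketch in Python, where the function name, the energy-term dictionaries, and the tolerance values are all hypothetical choices of mine and not anything gmxtest actually does:

```python
import math

def energies_match(reference, test, rel_tol=1e-5, abs_tol=1e-8):
    """Return True if both runs report the same terms with close values.

    reference and test map an energy term name to a float. The tolerances
    are illustrative placeholders, not validated thresholds.
    """
    if reference.keys() != test.keys():
        return False
    return all(
        math.isclose(reference[name], test[name],
                     rel_tol=rel_tol, abs_tol=abs_tol)
        for name in reference
    )
```

Picking tolerances that catch real bugs without flagging ordinary run-to-run noise would be the hard part, and would probably have to be decided per quantity.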

One clear need is a mechanism to permit features to be tested in combination in an automated manner. The set of "complex" tests that already exist are a good start, but they're far from complete. It should be possible to ask a script to test thermostats in (X,Y) with barostats in (W,Z), using -sum/-nosum with the constraint of -npme 0. (This is not at all silly - I spent several weeks this year proving that I'd found a GROMACS bug. It transpired that the problem was with the V-rescale thermostat under -nosum, and I only noticed that because I was using -rerun!) To avoid combinatorial explosion of the reference data, that data would have to be generated at the same time as the test data. Thus the user would need to have installed some known good GROMACS version, and done some "bootstrap" correctness tests of that against supplied reference data, before moving on to more complex cases with user-generated reference data. This requires that the script "know" how to test each feature, so that it can correctly construct reference and test runs. The above example is easy - the script knows that to test a thermostat or barostat, the reference and test .mdp files need to have a certain form, and testing a command line flag is easier still. The script would also need to know how to reject tests of mutually-exclusive features.
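
To make the idea concrete, here is a minimal sketch of how such a script might enumerate the combinations. The function names, the dictionary layout, and the exclusion rule are hypothetical; only the tcoupl/pcoupl .mdp options and the -sum, -nosum and -npme flags come from the discussion above.

```python
from itertools import product

def build_test_matrix(thermostats, barostats, flags, is_excluded):
    """Enumerate thermostat x barostat x flag cases, skipping excluded ones.

    is_excluded is a caller-supplied predicate encoding which features are
    mutually exclusive; the script would need one such rule set per feature.
    """
    cases = []
    for tcoupl, pcoupl, flag in product(thermostats, barostats, flags):
        if is_excluded(tcoupl, pcoupl, flag):
            continue
        cases.append({
            "mdp": {"tcoupl": tcoupl, "pcoupl": pcoupl},
            # -npme 0 is the fixed constraint from the example above
            "mdrun_args": [flag, "-npme", "0"],
        })
    return cases

def example_exclusion(tcoupl, pcoupl, flag):
    # Placeholder rule for illustration only; it is not a statement about
    # which GROMACS options actually conflict.
    return tcoupl == "no" and pcoupl != "no"

if __name__ == "__main__":
    matrix = build_test_matrix(
        ["berendsen", "v-rescale"],
        ["berendsen", "parrinello-rahman"],
        ["-sum", "-nosum"],
        example_exclusion,
    )
    for case in matrix:
        print(case)
```

Each case would then be run twice, once with the known good version and once with the version under test, so that no reference data has to be shipped for the combinations.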

That does sound relatively simple to do, but would probably also require a bit of re-organization in the test set. For example, instead of the four or so directories now, we'd probably have to expand to substantially more depending on the features being tested (which also helps in determining what the tests do). README files are also a must in each directory, similar to the AMBER test set.

In principle, each new feature implemented should be regarded as incomplete until there's a test that functions correctly. This means that the author of the feature needs to designate a GROMACS version that is a suitable reference case (e.g. you can't test V-rescale against a 3.x reference version because it wasn't implemented back then!) That rapidly becomes untenable for a user of the test suite, since they would have to have access to multiple different versions - there'd have to be a web server for providing reference data. There are further complications if testing feature A (whose reference version is 4.0.2) in combination with feature B (whose reference version is 4.0.4). Clearly you'd have to use at least 4.0.4 to generate a reference case for A & B together, and then have to test that A alone in 4.0.4 is correct with respect to 4.0.2.
I don't know how to bring order to this chaos! I do know that the lack of a solution will continue to cost everyone time and money doing broken simulations and chasing bugs.
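
The A & B bookkeeping at least can be written down mechanically. A small sketch, assuming each feature records its reference version as a dotted string; the function names and data layout are hypothetical:

```python
def parse_version(text):
    """Turn a dotted version string such as '4.0.4' into (4, 0, 4)."""
    return tuple(int(part) for part in text.split("."))

def combined_reference(feature_versions):
    """Lowest version usable as a reference for all features together.

    This is the newest of the individual reference versions, since older
    releases lack at least one of the features.
    """
    return max(feature_versions.values(), key=parse_version)

def regression_checks(feature_versions):
    """List (feature, own_reference, combined_reference) triples.

    Each triple is a feature that must also be verified alone in the
    combined reference version against its own, older reference version.
    """
    ref = combined_reference(feature_versions)
    return [
        (name, version, ref)
        for name, version in sorted(feature_versions.items())
        if parse_version(version) < parse_version(ref)
    ]

if __name__ == "__main__":
    features = {"A": "4.0.2", "B": "4.0.4"}
    print(combined_reference(features))
    print(regression_checks(features))
```

This only captures the dependency logic; it does nothing about where the reference data for each version would come from.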

Would it be useful to start a wiki page on the topic, perhaps somewhere within the development section, sort of like what was once done for features to be implemented in the main software? That way, there's a central site for listing ideas, comments, and progress.


-Justin

--
========================================

Justin A. Lemkul
Ph.D. Candidate
ICTAS Doctoral Scholar
MILES-IGERT Trainee
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin

========================================
--
gmx-users mailing list    gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
Please search the archive at http://www.gromacs.org/search before posting!

Please don't post (un)subscribe requests to the list. Use the www interface or send it to gmx-users-requ...@gromacs.org.

Can't post? Read http://www.gromacs.org/mailing_lists/users.php
