Hi all,

after the brainstorming on GTK+ unit tests I think it is time to start drawing some conclusions. Generally, it seems that people are OK with Tim's proposal, although some points have led to debate.
In this mail I first summarize Tim's proposal for reference, then describe the current state of the open debates around it, and finally list other additions/suggestions/doubts/issues raised by people during the brainstorming.

Regarding the open debates, some important issues still need a decision, such as whether or not to use a testing framework like Check, so any opinions on these topics will be much appreciated.

Note: please keep in mind that I tried to summarize a lot of information, so, although I did my best to gather all opinions, it is possible that I forgot something or made some mistakes. If so, excuse me, and feel free to add/fix whatever you think should be added/fixed.

----------------><--------------------
ORIGINAL PROPOSAL
----------------><--------------------

1.- About tests output:
--------------------------------------

1.1.- Report "All tests passed" vs. "at least one test failed" instead of a list of passed and failed tests.
1.2.- Use a progress indicator for slow tests.
1.3.- Homogeneous/consistent test output. For performance measurements, provide canonical and machine-parsable output.

2.- About tests implementation:
--------------------------------------

2.1.- Provide make targets to split up tests (check, slowcheck, perf).
2.2.- Fork tests that test abort()-like behavior.
2.3.- Fork time-bound tests, aborting and failing them once a timeout has passed.
2.4.- Pass main() arguments to tests.
2.5.- Tests that produce a "CRITICAL **:" or "WARNING **:" message should fail.
2.6.- Use G_LOG_DOMAIN properly.

3.- About the testing framework
--------------------------------------

3.1.- Do not add a new dependency to GTK+: instead of using an existing testing framework like Check, develop a common, reduced set of features needed for unit testing.
These features would be:

  1) An initialization function that takes care of things like calling gtk_init(), preparsing arguments, setting CRITICALS/WARNINGS as fatal, etc.
  2) Register all widget types provided by Gtk+.
  3) Fork off a test and assert it fails in the expected place.
  4) A fork-off and timeout helper function.
  5) Helper macros to indicate test start/progress/assertions/end.
  6) Output formatting function.

4.- Things that would be worth testing:
-----------------------------------------

4.1.- For a specific widget type, test input/output conditions of all API functions (only for valid use cases) for both Gtk and Gdk.
4.2.- Try setting & getting all widget properties on all widgets over the full value ranges (sparsely covered by means of random numbers, for instance).
4.3.- Try setting & getting all container child properties.
4.4.- Check layout algorithms by laying out a child widget and checking the coordinates it is laid out at.
4.5.- Create all widgets with mnemonic constructors and check that their activation works.
4.6.- Generically query all key bindings of stock Gtk+ widgets and activate them, checking that no warnings/criticals are generated.
4.7.- Create a test rcfile covering all rcfile mechanisms, parse it, and assert its values in the resulting GtkStyles.
4.8.- For all widget types, create and destroy them in a loop to:
      a) measure basic object setup performance
      b) catch obvious leaks
      (these would be slowcheck/perf tests)

----------------><--------------------
OPEN DEBATES
----------------><--------------------

* Regarding 1.1.:
--------------------------------------

Some people think that a list of all passed/failed tests should, at least, also be provided. Some comments that support this idea have been:

- People would like to know the overall state (percentage of pass/fail).
- People would like to have a list of tests that fail, so that they can fix the issues.
- It allows a group of people to work on fixing different issues in parallel.
- Some people prefer a more verbose output.
- It could save valuable developer time in some situations.

Some comments against it have been:

- Letting more than one test fail while continuing to work in an unrelated area rapidly leads to confusion about which tests are supposed to work and which aren't, especially in multi-contributor setups.
- Figuring out whether the right tests passed suddenly requires scanning the test logs and remembering the last count of tests that may validly fail.
- It defeats the purpose of using a single quick "make check" run to be confident that one's changes didn't introduce breakage. (Tim Janik)
- You usually have to fix the first issue before being able to move on.

* Regarding 2.2.:
------------------------------------

What about testing segmentation faults? Segfaults are not predictable, so if we want to be able to get a complete list of passed/failed tests we need to fork every single test. The downside, of course, is execution time.

* Regarding 2.5.:
------------------------------------

Some people think that there are situations where you would not want those tests to fail. Some comments that support this idea have been:

- In the docs you describe API usage. Thus a test could check whether, in the debug build, a g_return_if_fail() is used to implement the contract (warn & fail if NULL is passed).
- It is worth knowing whether a function safely handles being passed invalid arguments (like a NULL pointer) or produces a segmentation fault in that case.
- Sometimes it is useful to check that a critical message was indeed shown, and then move on.
- Preemptively deciding that it is always impossible to test resilience to certain known warnings is a misstep.

Some comments against it have been:

- In GLib context, once a program triggers any of g_assert*(), g_error(), g_warning() or g_critical(), the program/library is in an undefined state.
- That can be implemented anyway by installing g_log handlers, reconfiguring the fatality of certain log levels and employing forked test mode. These kinds of tests will be rare, though, and also need to be carefully crafted.
- Functions are simply only defined within the range specified by the implementor/designer.
- The occasional return_if_fail statements in the glib/gtk code base are purely a convenience tool to catch programming mistakes more easily. They can be removed in production environments.

IMHO, there are two issues in this debate:

- On the one hand, people want to assert some criticals/warnings. That could be done, as Tim suggested, by installing g_log handlers, reconfiguring the fatality of certain log levels and employing forked test mode. However, it seems unclear when this kind of test should be written.
- On the other hand, although g_return_if_fail statements have been added only to help developers, some people think they are/should be part of the API contract, and thus worth testing.

* Regarding 3.1.:
------------------------------------

Some people think that using a testing framework like Check would be better. Some comments that support this idea have been:

- It does not make sense to write our own test suite framework. It is reinventing the wheel.
- It wouldn't be a dependency to build GTK+, it would only be a dependency to run the tests.
- It is not too much to ask developers to install Check.
- Most distros have it.
- Check is widely used, and having a standard tool for testing, instead of doing something ad hoc, has its advantages.
- You will need to maintain that ad hoc framework. If new features are needed in the future you will need to add them yourself.

Some comments against it have been:

- Test frameworks like Check would only help us out with 3, 4 and to some extent 5 (see above the list of features that Tim suggested the unit test framework should provide).
  This does not warrant a new package dependency, especially since 5 might be highly customized and 3 or 4 could be useful to provide generally in GLib.

----------------><--------------------
OTHER ISSUES
----------------><--------------------

* How many test programs?
--------------------------

Ideally, one test program per component makes developers' lives easier, because it would allow them to run only the tests for the components they are interested in or working on. I think this is a very important issue if we finally make the test suite fail as soon as we detect the first failed test.

On the other hand, the more test programs we have, the longer the build takes, because libtool needs to relink all the test programs whenever there is a change in the lib, which can become a really long process if we have too many test programs.

I guess we need to reach some kind of agreement on how to organize groups of tests so that we generate neither too few nor too many test programs.

* Code coverage
-------------------------------------

Some people suggested including code coverage statistics (gcov, lcov, etc.). Nobody seemed to be against this.

* Adding testing-only code to the lib
-------------------------------------

Adding conditionalized testing-only code to the lib could be useful for getting effective tests in certain situations. However, for a project of the size and build time of Gtk+, with quite a large legacy code base, the price can be too high.

Closely related to the above issue is the idea of adding the tests to the files containing the code being tested, as Nautilus does. The problem would be the file growth and the fact that GTK+ already has some quite big files. Also, some people added that the additional cruft the tests would add to the files is rather distracting, so they prefer them to be in separate files.
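
As a side note on 2.2/2.3 and on feature 4) of the proposal (the fork-off and timeout helper): such a helper can stay fairly small. Below is a minimal sketch in plain POSIX C. It is not taken from Tim's proposal or from GLib; run_forked_test(), ForkTestResult and the sample_* functions are hypothetical names invented for illustration. It forks the test, relies on the default SIGALRM action to enforce the time limit, and classifies how the child ended, so that a segfault or abort() in one test does not take down the whole run.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result codes; not part of any existing GTK+/GLib API. */
typedef enum {
    FORK_TEST_PASSED,   /* child exited with status 0 */
    FORK_TEST_FAILED,   /* child exited with a non-zero status */
    FORK_TEST_CRASHED,  /* child was killed by a signal (SIGSEGV, SIGABRT, ...) */
    FORK_TEST_TIMEOUT,  /* child was killed by its own SIGALRM */
    FORK_TEST_ERROR     /* fork() or waitpid() itself failed */
} ForkTestResult;

typedef void (*ForkTestFunc)(void);

/* Run test_fn in a child process. A timeout_secs of 0 means "no timeout",
 * because alarm(0) does not schedule an alarm. */
ForkTestResult
run_forked_test(ForkTestFunc test_fn, unsigned int timeout_secs)
{
    pid_t pid, waited;
    int status = 0;

    assert(test_fn != NULL);

    pid = fork();
    if (pid < 0)
        return FORK_TEST_ERROR;

    if (pid == 0) {
        /* Child: the default action for SIGALRM terminates the process,
         * so alarm() is enough to bound the run time. */
        signal(SIGALRM, SIG_DFL);
        alarm(timeout_secs);
        test_fn();
        _exit(0);   /* test returned normally */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    do {
        waited = waitpid(pid, &status, 0);
    } while (waited < 0 && errno == EINTR);
    if (waited < 0)
        return FORK_TEST_ERROR;

    if (WIFSIGNALED(status))
        return WTERMSIG(status) == SIGALRM ? FORK_TEST_TIMEOUT
                                           : FORK_TEST_CRASHED;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return FORK_TEST_PASSED;
    return FORK_TEST_FAILED;
}

/* Sample tests, only here to exercise each outcome. */
void sample_pass(void)  { }
void sample_fail(void)  { _exit(1); }
void sample_abort(void) { abort(); }
void sample_hang(void)  { for (;;) pause(); }
```

A test that is expected to abort would then check for FORK_TEST_CRASHED, e.g. with run_forked_test (sample_abort, 5), while run_forked_test (sample_hang, 1) should come back as FORK_TEST_TIMEOUT after about a second. The cost is one fork() per test, which is the execution-time downside mentioned under 2.2.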

* Misc proposals/concerns
--------------------------------------

This is a miscellaneous set of minor recommendations, doubts and suggestions for future work beyond the scope of unit tests.

- Use AT-SPI (functional tests and accessibility support).
- Clearly document the purpose of each test.
- Develop use-case tests.
- Is signal emission part of the API contract? Should it be tested?

Iago.

_______________________________________________
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list