On 2015-12-01, Kornel Benko wrote:
> On 30 November 2015 at 12:56:22, Guenter Milde <mi...@users.sf.net> wrote:
>> On 2015-11-28, Kornel Benko wrote:

Thanks for explaining the filter chain. The table in Development.lyx gives
quite a good overview.

I still believe there is room for improvement, both in the actual test
setup and naming and in the documentation. (BTW: I reordered the headings
in Development.lyx and pushed to git.)

First, a general question. When tagging tests that fail, do we want tags
for

a) the reason for the failure (TexBug, LyxBug, incompatibility,
   nonstandard, ...),
b) our treatment of the test case (ignored, inverted, suspended, ...), or
c) both?

Currently,

* the filter files have names from b),
* labels are a mixture of test class (export, load, lyx2lyx), b), and c):

  #> ctest --print-label
  Test project /usr/local/src/lyxbuild
  All Labels:
    chemgreek
    cmplyx
    erratic
    examples
    export
    layout
    load
    lyx2lyx
    manuals
    mathmacros
    module
    nonstandard
    reverted
    roundtrip
    suspended
    templates
    unreliable

For clarity, we could give them some structure, e.g.

    cmplyx
    examples
    export
    export:reverted
    export:reverted:chemgreek  # actually, I wouldn't make "chemgreek" a label
    export:suspended
    export:unreliable:erratic
    export:unreliable:nonstandard
    layout
    load
    lyx2lyx
    manuals
    mathmacros
    module
    roundtrip
    templates

Selection by label uses regular expressions, so instead of

    ctest -L export -LE reverted

one could use

    ctest -L "export$"

to test only for new failures.

........................................................................

When creating the tests, I propose the following logic:

1. filter: sort out test cases to ignore

   ignoredTests -> Export combinations matching here are withdrawn

2. categorization: label the remaining tests

   a) *all* export tests get the label "export" to make them easy to
      distinguish from lyx2lyx, roundtrip, mathmacros, ...

   b) Set specific labels: test names are matched against regexps in all
      "*Tests" files.

      Matches regexp in  Label       Sublabels
      =================  ==========  ========================================
      InvertedTests¹     inverted²   wontfix, minor, TeXBug, assertError, ...
      UnreliableTests    unreliable  nonstandard, ...
      SuspendedTests¹    suspended

A test matching a regexp in both UnreliableTests and InvertedTests, say,
would get the labels "unreliable" and "inverted".

Combining this with the proposal above, there would not be independent
sublabels but one label, e.g. "export:inverted:TeXBug" or
"export:unreliable:nonstandard".

¹ see also the discussion below.
² tests with label "inverted" will also get the test property "inverted",
  of course.

.....................................................................

Choice of names:

suspicious
==========

>> I am not happy with the naming "suspicious" (we know for sure there is a
>> problem and the test fail).

> > > I don't like "known problems". We know, there is a problem,
> > > but don't know exactly which (and if it is the only one)

However, we do not merely suspect a problem: we only insert a regexp here
after establishing that there is a problem we cannot currently solve.

We could also say "known to fail" or "acknowledged problem" or just
"problematic".

inverted/reverted
=================

> For me 'reverted' clearly signals that the testresult is inverted.

Actually, "reverted" has a related but somewhat different meaning, e.g.
inversion of argument $\ne$ reversion of argument:

  to revert = turn back, undo, ... (revert a commit),
  to invert = turn round, turn upside down, negate, ...

Hence, using "revert" for inverted tests is confusing (at least for me).
Is there a reason the label cannot be called "inverted"?

.....................................................................

>> I am not happy with the naming "suspended" (what is its use case?)¹

> This one are tests which are failing, but we cannot do anything on them.
> See .*pdf4_texF

We actually solved a lot of "INV.*(dvi3|pdf4|pdf5)_texF" and there is more
that could be done (however, it may not be worth the effort).

But what is this label useful for?
The effect of this label is that if a "suspended" test works again, there
is no feedback when running inverted export tests. Anything else? I don't
see the advantage.

For problems that are unlikely to be fixed soon, I propose one of

  inverted:minor    # problems that may eventually be solved
  inverted:wontfix  # problems that
                    # - are minor and hard to solve (not worth the effort),
                    # - can't be solved due to systematic limitations, or
                    # - are bugs in "historic" packages no one works on.

However, these should be assigned not with "catch-all" regular expressions
but case by case, each with an explanatory comment.

This could be implemented via sublabels.

tests with low signal/noise ratio
=================================

The export with a Unicode-aware TeX engine but TeX fonts

  (dvi3|pdf4|pdf5)_texF

is rather an example of a class of "unreliable" tests -- however, the
unreliability is not "fail/pass depending on the site or the phase of the
moon". The inverted test cases fail for every developer running the tests,
but

> Many other tests with the same signature are not failing.

Rather, export success depends on the document content: adding or removing
one character (which depends on some autoloaded package) can make a
compilable document uncompilable and vice versa. Also, the probability of
hidden problems (erroneous output despite non-failing export) is high.

This means that "(dvi3|pdf4|pdf5)_texF" tests have a low signal/noise
ratio. The statement

  The answer is that if a non-default route is broken it is often because
  a bug was introduced in LyX and not because a document-specific change
  was made that is not supported by the route. In other words, there is a
  high signal/noise ratio in the export tests for some non-default
  formats.

in Development.lyx does not hold for them.

The questions are:

  Do we want a label for all tests with a low signal/noise ratio?
  What should this label be called?
  How should we handle tests carrying this label?

Thanks,

Günter
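
P.S. The labelling logic proposed above (filter first, then categorize into
structured labels) could be sketched roughly as follows. This is only an
illustration in Python, not the CMake code of the actual test setup. The
function names, the example regexps, the sublabel assignments and the test
names are all hypothetical. `select_by_label` merely imitates selecting
tests by a regular expression on their labels.

```python
import re

# Hypothetical stand-ins for the regexp files; real entries will differ.
IGNORED = [r"_ignored_format$"]                    # "ignoredTests"
CATEGORIES = {                                     # the "*Tests" files
    "inverted": [(r"_pdf4_texF$", "TeXBug")],
    "unreliable": [(r"_dvi3_", "nonstandard")],
    "suspended": [(r"^export/templates/", None)],  # no sublabel
}


def label_export_test(name, ignored=IGNORED, categories=CATEGORIES):
    """Return the labels of one export test, or None if it is withdrawn."""
    # Step 1: filter. Combinations matching here are withdrawn.
    if any(re.search(rx, name) for rx in ignored):
        return None
    # Step 2b: specific labels, composed as "export:<label>[:<sublabel>]".
    labels = []
    for label, entries in categories.items():
        for rx, sublabel in entries:
            if re.search(rx, name):
                parts = ["export", label]
                if sublabel:
                    parts.append(sublabel)
                labels.append(":".join(parts))
                break  # first matching regexp in a file decides the sublabel
    # Step 2a: tests without a specific label carry the plain "export" label.
    return labels or ["export"]


def select_by_label(labelled, pattern):
    """Names of tests that have at least one label matching the regexp."""
    return sorted(
        name
        for name, labels in labelled.items()
        if labels is not None and any(re.search(pattern, lab) for lab in labels)
    )
```

With labels built this way, selecting with the pattern "export$" returns
only the tests that carry no specific label, which is the effect intended
by `ctest -L "export$"` above, while the pattern "export" still matches
every export test.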