On Thursday, 26 November 2015 at 22:18:18, Guenter Milde <mi...@users.sf.net> wrote:
> On 2015-11-26, Kornel Benko wrote:
> > On 26 November 2015 at 11:23:46, Guenter Milde <mi...@users.sf.net> wrote:
>
> >> The following proposal for an export test case categorisation tries to
> >> avoid the controversial terms "inverted/reverted", "suspended", and
> >> "ignored".
>
> >> Instead, the basic distinction is between "good" tests and "known
> >> problems".
>
> >> While the concept of "known problems" matches roughly to "inverted", there
> >> are some differences:
>
> >> * tests with "known problems" usually fail, but may also pass.
>
> >> * a line
>
> >>     KNOWN_PROBLEM.<subtag>.export/...
>
> >>   is easier to understand than
>
> >>     INVERTED-SEE-README.export/...
>
> > Hm, yes. But the first entries are not in any subcategory (no subtag).
>
> Even without subtag,
>
>     KNOWN_PROBLEM.export/...
>
> >>   is easier to understand than
>
> >>     INVERTED-SEE-README.export/...
>
> ;-)
>
> ...
>
> >> * There is no need for a top-level category "unreliable".
>
> > I added it to please you ... :(
>
> Oh, I see. I thought it was to allow an easier description:

No, I really took your proposal from the mail mentioning "unreliable" and added
the description later.

>  \begin_layout Description
> -nonstandard In primary sense such test means "requires non-standard resources
> - (LaTeX packages and document classes, fonts, ...
> - that are not a requirement for running this test suite".
> +nonstandard Requires non-standard resources (LaTeX packages and document
> + classes, fonts, ...) that are not a requirement for running this test suite.
>  \end_layout
>
>  \begin_deeper
>  \begin_layout Standard
> -In a wider sense, it is currently used also for "not to be expected to succeed
> - on every site that runs this test suite".
> - This wider definition includes tests that have "arbitrary" result depending
> - on local configuration, OS, TeX distribution, package versions, or the
> - phase of the moon.
> -\end_layout
> -
> -\begin_layout Standard
>  These tests are labelled as
>  \family typewriter
>
>
> ...
>
> Unreliable test cases are test cases with a known problem. The correct,
> full hierarchy would be
>
> * known problems
>   ...
>   - unreliable
>     · nonstandard
>     · erratic
>
> If we do not want 3 levels with subsubcategories, we can just remove the
> level "unreliable" and add its subcategories below "known_problems":
>
> * known problems
>   ...
>   - nonstandard
>   - erratic

I don't like "known problems". We know that there is a problem, but we don't know
exactly which one (or whether it is the only one).

> or use "unreliable" on the same level as "known problems":
>
> * known problems
>   ...
> * unreliable
>   - nonstandard
>   - erratic
>
> whatever suits you more.
>

I am open to implementing something that fits with ctest. Some remarks on its
functionality are below.

> >> Export Test Categorisation
> >> --------------------------
>
> >> To get a feel for the severity of a known problem, it makes sense to
> >> sort known problems in sub-categories, e.g.
>
>
> >> * TODO         # problems we want to solve but currently cannot.
>
> >> * minor        # problems that may be eventually solved
>
> >> * wontfix      # LyX problems with cases so special we decided to
> >>                # leave them, or LaTeX problems that
> >>                # - can't be solved due to systematic limitations, or
> >>                # - are bugs in "historic" packages no one works on.
>
> >> * wrong output # the output is corrupt, LyX should raise an error
> >>                # but export returns success.
>
> >> * LaTeX bug    # problems due to LaTeX packages or other "external"
> >>                # reasons (someone else's problems).
> >>                # that may be eventually solved
> >>                # (In this case, the case goes to "unreliable" until
> >>                # everyone has the version with the fix.)
>
> >> * nonstandard  # requires packages or other resources that are not on CTAN
> >>                # (some developers may have them installed)
>
> >> * erratic      # depending on local configuration, OS, TeX distribution,
> >>                # package versions, or the phase of the moon.
>
>
> > Feels good, but who shall categorize?
>
> This will be a collaborative work. Normally, this would be done when
> addressing a new "known problem".
>
> But first we need to agree on and set up the framework.
>
> Proposal
> ========
>
> * Rename "inverted" to "known_problem" and the file

problematic?

>   autotests/revertedTests to autotests/problematicTests.
>

For me, 'reverted' clearly signals that the test result is inverted.

>   - in test mode (looking for regressions), the result of these test
>     cases is irrelevant

Yes.

>   - in maintenance mode, the label should be removed from test cases
>     that pass.
>
>   This means that `ctest -L export` should not run tests with
>   "known_problems".

Why not? It is already possible to run only non-inverted tests, e.g.
'ctest -L export -LE reverted'. Make an alias.

>   Running `ctest` (without -L) should rather list the failing tests with
>   "known problems" than the passing ones. This is less confusing.

This is not possible.
ctest does not care about labels if called without '-L'.

>   Motivation: In "test mode",
>   · a test that fails to fail is no problem (we search regressions),
>   · a test that fails for a "known reason" is recognised as such by its
>     label and can be ignored by the user or a post-processing script.
>
>   (BTW: in the list of failing tests recently sent by Scott, there were a
>   number of "INVERTED_SEE-README" tests. Does this mean these tests failed
>   or does it mean these tests failed to fail?)

They failed to fail for Scott. But they do fail here. Therefore I moved them to
nonstandard (now unreliable).

> * Handle "unreliable" test cases similar to "known problems": create test
>   instances with a telling label (unless they are wontfix):
>
>   - in test mode (looking for regressions), the result of these test
>     cases is irrelevant
>
>   - in maintenance mode, the label should be removed from test cases
>     that pass everywhere and every time.
>
>   However:
>   The label "unreliable" is an indicator that the test is not "good" if
>   it passes at one site or only one time.

It passes because of a 'better' configuration. Scott and I have similar
platforms, so this is the only difference.

>   To remove this label, you need confirmation from other developers
>   that the problem is really solved.

Sure.

> * Rename autotests/ignoredTests to autotests/wontfixTests and move
>   "wontfix" problems there.

We do not agree.

> * Rename autotests/suspendedTests to autotests/fragileTests.

Suspended is not fragile. A suspended test always fails, and we cannot do
anything about it yet. Maybe later, with a new luatex or xetex version.

>   Use this label for all fragile tests, not only inverted ones.

No, tests which always pass are not fragile.

> > The problem is the huge number of tests which do not fail. They are not
> > categorized ATM.
>
> The idea here is a file autotests/fragileTests with "wide" regular
> expressions, e.g.
>
>   .*pdf4SystemF
>   .*Math.*
>
> This should apply to all tests, not only the ones with "known problems".

No, same reason as above.

> Then, when there is a "regression" in one of the fragile tests, it will
> be shown with the "moderating" label "fragile" telling that the reason is
> more likely a surfacing problem with the document or export format than a
> new one.
>

Such a test would go to unreliableTests (or whatever the name will be).

> >> If we want to make sure that no "good fail" is transformed to a
> >> "wrong output" we would need a category "assert fail" and report
> >> export without failure:
>
> >> * assert fail  # we know the export does not work for a permanent reason
> >>                # and want to test whether LyX correctly fails
> >>                # (e.g. pdflatex with a package requiring LuaTeX)
>
>
> > That is for later, used in autotests/export.
> > All other lyx-files (but attic) are distributed. Normally we expect
> > them in good shape.
>
> Yes.
>

Now to ctest:

Tests can be selected through the test name:
        # ctest -R <some regex to select tests> -E <some regex to exclude from test>
With this kind of selection the given labels are irrelevant. You can give more
than one '-R' parameter, but unfortunately they are connected with 'or'.

The second kind of selection is through labels:
        # ctest -L <some regex to select labels> -LE <some regex to exclude labels>

You can also mix the two, so for instance I use
        # ctest -L export -E "xhtml|lyx16"
(notice '-E' instead of '-LE') to check only tex exports.

To check only tex regressions, you may want to use
        # ctest -L export -E "xhtml|lyx16" -LE reverted

> Günter

        Kornel
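
PS: The "make an alias" remark could look roughly like the sketch below. The
function names ctest_tex_exports and ctest_tex_regressions are made up for
illustration only; the ctest options and regexes are exactly the ones from the
command lines above, and the label name 'reverted' is assumed to be the one
currently in use.

```shell
# Hypothetical wrapper names; only the ctest options and regexes are taken
# from the command lines quoted in this mail.

# Run the tex export tests: select by label 'export', then exclude
# xhtml and lyx16 exports by test name.
ctest_tex_exports() {
    ctest -L export -E "xhtml|lyx16" "$@"
}

# Same selection, but additionally exclude tests carrying the 'reverted'
# label, so that a failure here points to a regression.
ctest_tex_regressions() {
    ctest -L export -E "xhtml|lyx16" -LE reverted "$@"
}
```

Any further arguments given to these functions are handed on to ctest
unchanged. They have to be called from the build directory, like ctest itself.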