categorizing export tests (was: export test failures 2)

Guenter Milde Sat, 21 Nov 2015 15:39:54 -0800

On 2015-11-20, Kornel Benko wrote:
> On 20 November 2015 at 11:20:54, Guenter Milde <mi...@users.sf.net> wrote:



>> My suggestion would be to have the following categories and sub-categories:

>> * export              # we expect the export to succeed
>               this is also a candidate for regressions

>> * reverted            # we expect the export to fail (currently) for a known
>>                       # reason
>               which will not be changed in near future?

Yes. This is for minor issues, or for problems that cannot be solved the
next day or by the developer who introduced the regression (e.g. via a patch
that fixes more important things, so that reverting it would not be an
improvement). Normally, there should be a bug report for the issue.

>>     - correct fail    # we know the export fails for a good reason
>>                       # and want to test whether LyX correctly fails
>>                       # (e.g. pdflatex with non-TeX fonts or polyglossia)
>               makes sense to me

>>     - fragile         # exports that may easily fail because of "critical"
>>                       # combination (e.g. XeTeX + TeX-fonts)

>               I plead for suspended, because in the future XeTeX may
>               become smarter

The reason is normally not XeTeX, but the many packages that assume a
positive test for XeTeX means Unicode fonts! This will only gradually
improve (if at all).
                
Also, many current failures of XeTeX-texF export are due to the need to
use ASCII encoding (but we must also test export with ASCII independently
of XeTeX to be able to tell the two causes apart).

The idea here is to mark test failures that are due to our "abuse" of the
LyX documentation and templates for road-testing outside their intended
use-cases when fixing them is not worth the effort. (Remember, there is more
important work to do than making sure an obscure document can be sent
along a never-intended export route that we know is dangerous, almost never
used and not offering any advantages.)


>> * ignored         # we don't care for the result and hence don't run
>>                   # these test cases
>               Yes

>>     - wrong output    # the output is wrong although export returns success.
>>                            # not LyX's fault, but e.g. incompatible packages.
>               OK

>>     - nofix           # "historic" packages with bugs that prevent working
>>                            # with some export routes.
>               OK

>>     - nonstandard     # requires packages or similar that are not on CTAN

>               Do not ignore them. They _are_ compilable at the end.


>>     - suspended       # - problems that we currently cannot solve
>>                       #   but want to.
>               Yes, but not ignore.
                OK move this to "reverted".

       -  upstream   # SEP (someone else's problem)
                     # export failure not due to LyX.
                     # May work, depending on TeXLive version.
                     # (regressions or fixes in upstream packages)


> To make it clear: Everything ignored cannot be tested. 

Then, there is a fundamental flaw in the test machinery:

We need a category and rule-set for tests where:

* we don't care for the result because it does not tell us anything about
  the "healthiness" of LyX and hence don't run them normally, but

* we may want to run them on special request (because we know the phase of
  the moon or have installed a special package or want to check the status
  of upstream packages or fixed a nofix bug).


> If we want to see, if anything
> changed (like XeTeX), we should be able to retest.
>       like 'ctest -L suspended'

We should be able to retest like 'ctest -L ignored'

If this is currently not possible, please change the test machinery to
enable this without these tests having to be "inverted":

* They should not appear in the "inverted tests" list,
  spoiling our statistics.

* There should be no need to change the "invertedtests" file
  depending on the phase of the moon or after installing a nonstandard
  package.
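
To illustrate the requested selection rule, here is a minimal Python sketch.
It is not the actual LyX/ctest machinery: the function name, the test names
and the label sets are hypothetical, and labels are matched exactly, whereas
'ctest -L' takes a regular expression. The point is only that "ignored"
tests are skipped by default and picked up when asked for by label, without
touching any "inverted" list.

```python
from typing import Dict, Iterable, List, Set


def select_tests(tests: Dict[str, Set[str]],
                 requested: Iterable[str] = ()) -> List[str]:
    """Return the names of the tests to run, sorted alphabetically.

    Without requested labels, run everything except tests labelled
    "ignored". With requested labels, run exactly the tests carrying at
    least one of them (so "ignored" can be asked for explicitly).
    """
    wanted = set(requested)
    if wanted:
        return sorted(name for name, labels in tests.items()
                      if labels & wanted)
    return sorted(name for name, labels in tests.items()
                  if "ignored" not in labels)


# Hypothetical test names and labels, for illustration only.
TESTS = {
    "doc_A_pdf2": {"export"},
    "doc_B_pdf4_texF": {"reverted", "fragile"},
    "doc_C_dvi3": {"ignored", "nonstandard"},
}

print(select_tests(TESTS))               # default run
print(select_tests(TESTS, ["ignored"]))  # special request
```

With this rule, installing a nonstandard package only changes which label
one asks for on the command line, not any file under version control.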

>> The sub-categories are just for sorting the tests.
>> Behaviour would be the same for main categories:

>> * export:   return False if export fails
>> * reverted: return False if export succeeds
>> * ignored:  do not run
               unless specially requested
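
The three rules above can be sketched as a small Python function. This is
an illustration only: the names Category and verdict are hypothetical, and
reporting the plain (non-inverted) export result for a specially requested
"ignored" test is an assumption based on the wish that such tests need not
be "inverted".

```python
from enum import Enum
from typing import Optional


class Category(Enum):
    EXPORT = "export"
    REVERTED = "reverted"
    IGNORED = "ignored"


def verdict(category: Category, export_succeeded: bool,
            requested: bool = False) -> Optional[bool]:
    """Return True (test passes), False (test fails) or None (not run)."""
    if category is Category.EXPORT:
        # export: return False if export fails
        return export_succeeded
    if category is Category.REVERTED:
        # reverted: return False if export succeeds
        return not export_succeeded
    # ignored: do not run unless specially requested
    if not requested:
        return None
    # Assumption: when requested, report the export result as it is.
    return export_succeeded
```

The sub-categories do not appear here because they are only used for
sorting; behaviour depends on the main category alone.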

Günter
