Guenter Milde wrote:

> On 2016-01-21, Scott Kostyshak wrote:
>> On Wed, Jan 20, 2016 at 10:18:14PM +0100, Georg Baum wrote:
>>> Scott Kostyshak wrote:
>>> > On Mon, Jan 18, 2016 at 03:59:16PM +0000, Guenter Milde wrote:
> 
> 
> ...
> 
>>> >> Alternatively, we could have an "untested" branch for commits that
>>> >> should go in after a test run...
>>> > 
>>> > I would like this. But I'm not sure all LyX developers feel
>>> > comfortable with git branches or with learning about git branches.
> 
>>> IMHO developer objections to git branches are the smallest part of the
>>> problem. Similar proposals have been made in the past, and nice
>>> workflows have been proposed which make use of branches, but in the end
>>> a human or a script needs to decide which parts of the untested branch
>>> get merged into master, and this is the point where it fails. We neither
>>> have the resources for setting up an automatism (would be a lot of
>>> work), nor for manual merging.
> 
> The idea would be that instead of posting/mailing the patches, I would
> commit them to the "untested" branch for some kind soul to run the tests.
> If things are OK, I can merge to master and commit.
> Would this make sense?

Definitely, if the needed coordination is still done as before (e.g. nobody 
expects automatic merging of that branch, and if some changes are added to 
that branch someone is asked to do tests).
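
For concreteness, a sketch of what that "untested" branch round-trip could look like, demonstrated in a throwaway local repository (the branch and file names here are illustrative, not the actual LyX repository layout):

```shell
# Sketch of the proposed "untested" branch round-trip in a throwaway repo.
# Branch/file names are illustrative; this is not the real LyX setup.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git config user.email dev@example.org
git config user.name Dev
echo base > file.txt && git add file.txt && git commit -qm "base"
git branch -M master

# 1. The developer commits the risky change to "untested" instead of master.
git checkout -qb untested
echo change >> file.txt && git commit -qam "risky change, needs a test run"

# 2. A volunteer checks out "untested" and runs the export tests there
#    (not shown: whatever test procedure is agreed on).

# 3. Once the tests pass, the change is merged back to master.
git checkout -q master
git merge -q --ff-only untested
git log --oneline -1
```

The `--ff-only` merge keeps master's history linear, so the commits that land on master are exactly the ones that were tested on "untested"; nothing is merged automatically, matching the coordination described above.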


>>> >> Could you live with the current description of "expectations of
>>> >> LyX developers" (sec 3.3.1.1 in Development.lyx):
>>> > 
>>> > I can live with the below for most cases. For trivial changes, no need
>>> > to run the tests. For non-trivial but small or medium changes, then
>>> > the below is fine. However, for significant changes where one thinks
>>> > there is a good chance that many (non-suspicious) tests will be
>>> > broken, then I think the committer should be required to either run
>>> > the tests or to post and wait until someone else runs the tests. I
>>> > only expect this to happen for a few commits every release cycle. For
>>> > example, changing from babel to polyglossia. Looking backward, it was
>>> > expected that many tests could break. Another example is the reporting
>>> > missing characters commit. I don't think there are many other commits
>>> > that fall in this category of "I think there is a large probability
>>> > that many tests will fail".
> 
> The problem with this kind of patch is that the hundreds of failures
> were actually all "false positives" - problems with the tests or
> follow-up LyX bugs, not problems with the patch.

The problem is you can't tell that without running the tests.

> I agree that there
> should be coordination with the test suite maintainers but no requirement
> to ensure a failure-free test run after patches where there is a
> consensus that they do "the right thing". The developer fixing important
> shortcomings usually does not have the time and expertise to solve these
> test failures anyway.

If it turns out that the test failures fall into this category, then the 
developer needs to ask for help. Even if his fix is 100% perfect, if it 
causes many tests to fail, then the most effective approach is to first take 
some other measures that would reduce the number of failing tests after 
applying the patch, and only commit when the number is low or even zero. We 
are talking about very few commits per year that fall into this category, so 
if it means that some fix needs to wait two weeks until somebody else has 
fixed some other parts, then this is not a problem IMHO.

>>> This is quite complicated, but OK for a start. I'd still like to see a
>>> standard set of export tests that we can highly recommend for any
>>> change which is suspected to change something in LaTeX export.
> 
>> Yes this seems like a good idea.
> 
> Actually, this very much depends on what changes were done and what needs
> testing.
> 
> * There are "obscure" formats that are almost never really used (Lua/XeTeX
>   with 8-bit fonts, say).
>   
> * There are tests that replicate most of the pipeline: .lyx->.tex->.dvi is
>   shared by "dvi", "ps", "pdf", and "pdf3" (usually it should suffice to
>   test "dvi").
> 
> * Some changes only regard 8-bit TeX, others only Xe/LuaTeX.
>   
> * It may suffice to test a simple document (splash.lyx or Intro.lyx) with
>   every language, or
>   
> * it may be OK to test with just English,
> 
> * ...

Seems I need to explain what I expect from such a set of standard tests. If 
you look at the export bugs we had in the past, it was quite rare that they 
showed up in only one of the export tests. They usually triggered a lot of 
the tests, and bug fixes usually made lots of failures vanish. Therefore, you 
can simply do some statistics on that and come up with a very small set of 
tests which still retains 90% of the coverage of the complete set of several 
thousand test cases. This is nothing invented by me; I read a paper about 
that some time ago, and can look it up if needed.
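
The selection itself is essentially greedy set cover over "which past bugs each test caught". A minimal sketch, where the failure data is entirely made up for illustration (real data would come from recorded test results across past regressions, and the test names are hypothetical):

```python
# Greedy selection of a small test set that covers most historical bugs.
# The failure matrix below is invented for illustration only.
failures = {
    "export/Intro_pdf2":    {"bug1", "bug2", "bug4"},
    "export/splash_dvi":    {"bug1", "bug3"},
    "export/Math_lualatex": {"bug2", "bug3", "bug5"},
    "export/Intro_dvi3":    {"bug1"},
}

def pick_tests(failures, target=0.9):
    """Greedily pick tests until `target` of all known bugs are covered."""
    all_bugs = set().union(*failures.values())
    covered, chosen = set(), []
    while len(covered) < target * len(all_bugs):
        # Take the test that catches the most not-yet-covered bugs.
        best = max(failures, key=lambda t: len(failures[t] - covered))
        if not failures[best] - covered:
            break  # remaining tests add nothing new
        chosen.append(best)
        covered |= failures[best]
    return chosen

print(pick_tests(failures))
# -> ['export/Intro_pdf2', 'export/Math_lualatex']
```

With this toy data, two of the four tests already cover all five recorded bugs, which is the effect described above: a small standard set retaining most of the coverage of the full suite.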

In our case we do not have enough statistical data yet to collect a good 
small test set (because lots of bugs were fixed before we had tests), but 
we can just as well assemble a good test set manually, based on our 
understanding of the LaTeX export.

If we have such a set of standard tests, then we can recommend it to be run 
before commit in many cases (because it would be fast), and it would find 
many of the problems which have only been found by the full tests in the 
past. The idea is that the developer who fixes something would not need to 
think about what tests to run.

I see the procedure you describe more as a layer on top of that, for ironing 
out the last few percent of bugs.


Georg
