On Tue, 1 May 2007 15:08:56 +0200
Piotr Jaroszyński <[EMAIL PROTECTED]> wrote:

> Hello,
> 
> There was some discussion about forcing/not forcing tests in EAPI-1,
> but there was clearly no compromise. Imho, tests are very important
> and thus I want to discuss them a little more, but in more sensible
> fashion.
> 
> Firstly each test can be (not all categories are mutually exclusive):
> - nonexistent
> - non-functional
> - not runnable from ebuild
> - useful but unreasonable resource-wise
> - useful and reasonable resource-wise
> - necessary
> - known to partially fail but with a way of skipping failing tests
> - known to partially fail but with no easy way of skipping failing
>   tests
> Is that list comprehensive?

I'd approach it a bit differently: before creating fixed classification
groups, I'd first identify the attributes of tests that should be used
for those classifications.
a) cost (in terms of runtime, resource usage, additional deps)
b) effectiveness (does a failing/working test mean the package is
broken/working?)
c) importance (is there a realistic chance for the test to be useful?)
d) correctness (does the test match the implementation? overlaps a bit
with effectiveness)
e) others?

Each of these needs to be considered if we want to find a good
compromise on which tests to run and which not. A test with high cost
can still be worth running if effectiveness, correctness and importance
are also high. On the other hand, a test with little effectiveness,
correctness and/or importance probably isn't worth running even at
zero cost.
Now the tricky question is how to actually measure those attributes.
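
To make the trade-off concrete, here is a minimal sketch, assuming each
attribute could somehow be scored between 0.0 and 1.0 (which is exactly
the open measurement question). The names SuiteAttributes, value and
worth_running, as well as the two thresholds, are hypothetical and not
part of any existing tool:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuiteAttributes:
    """Hypothetical 0.0..1.0 scores for one package's test suite."""
    cost: float           # runtime, resource usage, additional deps
    effectiveness: float  # does pass/fail reflect working/broken?
    importance: float     # realistic chance of the test being useful
    correctness: float    # does the test match the implementation?


def value(t: SuiteAttributes) -> float:
    # Multiplying means a single low score drags the whole value down.
    return t.effectiveness * t.importance * t.correctness


def worth_running(t: SuiteAttributes,
                  min_value: float = 0.2,
                  cost_weight: float = 0.5) -> bool:
    """Assumed decision rule: both thresholds are arbitrary."""
    v = value(t)
    if v < min_value:
        # Low-value tests are skipped even if they cost nothing.
        return False
    # Otherwise the value has to outweigh the (weighted) cost.
    return v >= cost_weight * t.cost
```

With these assumed defaults, an expensive suite scoring 0.9 on the other
three attributes would still be run, while a free suite with low scores
would be skipped.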

> Secondly we must answer the question how precisely we want to
> distinguish them, so users/dev can choose which categories of tests
> they want to run. What comes to mind is:
> - run all tests
> - run only necessary tests
> - run only reasonable tests
> - don't run tests at all
> Again, is that list comprehensive?

The problem is that terms like "reasonable" or "necessary" are quite
subjective (for both humans and machines), and in this particular
context even "all" could be interpreted in different ways (btw, could
someone give some real examples of packages with "necessary" tests?).

So I think a more fine-grained classification is needed, one that can be
adapted to specific use cases (e.g. the mips+embedded profiles might
want different defaults than the amd64+desktop profiles).
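
As a rough illustration of per-profile defaults, the sketch below keeps a
separate policy for each profile. The profile keys, the ProfilePolicy
fields and all numbers are made up for the example and do not correspond
to any existing configuration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfilePolicy:
    """Hypothetical per-profile limits, both on a 0.0..1.0 scale."""
    max_cost: float        # most expensive test the profile tolerates
    min_usefulness: float  # least useful test the profile still runs


# Assumed example values: a small embedded box tolerates far less cost
# than a desktop machine and therefore demands more useful tests.
POLICIES = {
    "mips-embedded": ProfilePolicy(max_cost=0.2, min_usefulness=0.6),
    "amd64-desktop": ProfilePolicy(max_cost=0.8, min_usefulness=0.3),
}


def should_run(profile: str, cost: float, usefulness: float) -> bool:
    """Apply the selected profile's limits to one test's scores."""
    policy = POLICIES[profile]
    return cost <= policy.max_cost and usefulness >= policy.min_usefulness
```

The same test (say cost 0.5, usefulness 0.5) would then run by default
on the desktop profile but not on the embedded one.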

Marius

-- 
Public Key at http://www.genone.de/info/gpg-key.pub

In the beginning, there was nothing. And God said, 'Let there be
Light.' And there was still nothing, but you could see a bit better.
