On Tue, 1 May 2007 15:08:56 +0200 Piotr Jaroszyński <[EMAIL PROTECTED]> wrote:
> Hello,
>
> There was some discussion about forcing/not forcing tests in EAPI-1,
> but there was clearly no compromise. IMHO, tests are very important
> and thus I want to discuss them a little more, but in a more sensible
> fashion.
>
> Firstly, each test can be (not all categories are mutually exclusive):
> - nonexistent
> - non-functional
> - not runnable from ebuild
> - useful but unreasonable resource-wise
> - useful and reasonable resource-wise
> - necessary
> - known to partially fail but with a way of skipping failing tests
> - known to partially fail but with no easy way of skipping failing
>   tests

Is that list comprehensive?
I'd approach it a bit differently: before creating fixed classification
groups, I'd first identify the attributes of tests that should be used
for those classifications.

a) cost (in terms of runtime, resource usage, additional deps)
b) effectiveness (does a failing/working test mean the package is
   broken/working?)
c) importance (is there a realistic chance for the test to be useful?)
d) correctness (does the test match the implementation? overlaps a bit
   with effectiveness)
e) others?

Each of these needs to be considered if we want to find a good
compromise of which tests to run and which not. A test with high cost
can still be worth running if effectiveness, correctness and importance
are also high. On the other hand, a test with little effectiveness,
correctness and/or importance probably isn't worth running even with
zero cost.
Now the tricky question is how to actually measure those attributes.

> Secondly we must answer the question how precisely we want to
> distinguish them, so users/devs can choose which categories of tests
> they want to run. What comes to mind is:
> - run all tests
> - run only necessary tests
> - run only reasonable tests
> - don't run tests at all
>

Again, is that list comprehensive?
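
To make the attribute idea a bit more concrete, here is a rough sketch
of how such a decision could look. Everything in it is an assumption for
illustration only: the names (SuiteRating, Policy, POLICIES,
worth_running), the 0.0 to 1.0 scale and the threshold values are all
hypothetical and are not part of Portage or any EAPI. It also doesn't
answer how to measure the attributes, it only shows how they could be
combined once somebody has rated them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuiteRating:
    """Hypothetical rating of one package's test suite, each value in 0.0..1.0."""
    cost: float           # runtime, resource usage, additional deps
    effectiveness: float  # does pass/fail really mean working/broken?
    importance: float     # realistic chance that the test is useful
    correctness: float    # does the test match the implementation?


@dataclass(frozen=True)
class Policy:
    """Hypothetical per-profile settings (values below are made up)."""
    floor: float        # minimum value a suite needs, whatever it costs
    cost_weight: float  # how strongly cost counts against running the suite


# Example policies only; real profiles would pick their own numbers.
POLICIES = {
    "amd64-desktop": Policy(floor=0.3, cost_weight=0.8),
    "mips-embedded": Policy(floor=0.3, cost_weight=1.5),
}


def worth_running(rating: SuiteRating, policy: Policy) -> bool:
    # The weakest of the three "value" attributes decides: a test that is
    # ineffective, unimportant or incorrect is not worth much overall.
    value = min(rating.effectiveness, rating.importance, rating.correctness)
    if value < policy.floor:
        # Not worth running even at zero cost.
        return False
    # Otherwise the value has to outweigh the (profile-weighted) cost.
    return value >= rating.cost * policy.cost_weight


expensive_but_good = SuiteRating(cost=0.9, effectiveness=0.95,
                                 importance=0.9, correctness=0.95)
free_but_useless = SuiteRating(cost=0.0, effectiveness=0.1,
                               importance=0.8, correctness=0.9)

for name, policy in POLICIES.items():
    print(name,
          worth_running(expensive_but_good, policy),
          worth_running(free_but_useless, policy))
```

With these made-up numbers the expensive but good suite runs on the
desktop policy and is skipped on the embedded one, while the free but
ineffective suite is skipped everywhere.
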
The problem is that terms like "reasonable" or "necessary" are quite
subjective (for both humans and machines), and in this special context
even "all" could be interpreted in different ways (btw, could someone
give some real examples of packages with "necessary" tests?).
So I think a more fine-grained classification is needed that can be
adapted to specific use cases (e.g. the mips+embedded profiles might
want different defaults than the amd64+desktop profiles).

Marius

-- 
Public Key at http://www.genone.de/info/gpg-key.pub

In the beginning, there was nothing. And God said, 'Let there be Light.'
And there was still nothing, but you could see a bit better.