Re: Random sampling in tests

2018-10-09 Thread Steve Loughran

Randomized testing can, in theory, help you explore a far larger area of an app's environment than you could explore explicitly, such as "does everything work in the Turkish locale, where "I".toLower() != "i"?", etc. Good: faster tests, especially on an essentially-non-finite set of options bad

Re: Random sampling in tests

2018-10-08 Thread Dongjoon Hyun

Sean's approach looks much better to me (https://github.com/apache/spark/pull/22672). It achieves both contradictory goals simultaneously: keeping all the test coverage and reducing the time from 2:31 to 0:24. Since we can remove test coverage anytime, can we proceed with Sean's non-intrusive appro

Re: Random sampling in tests

2018-10-08 Thread Xiao Li

Yes. Testing all the timezones is not needed.

Xiao

On Mon, Oct 8, 2018 at 8:36 AM Maxim Gekk wrote:
> Hi All,
>
> I believe we should also take into account what we test, for example, I
> don't think it makes sense to check all timezones for JSON/CSV
> functions/datasources because those timezo

Re: Random sampling in tests

2018-10-08 Thread Maxim Gekk

Hi All, I believe we should also take into account what we test. For example, I don't think it makes sense to check all timezones for JSON/CSV functions/datasources because those timezones are just passed to external libraries. So, the same code is involved in testing each of the 650 timezone

Re: Random sampling in tests

2018-10-08 Thread Sean Owen

If the problem is simply reducing the wall-clock time of tests, then even before we get to this question, I'm advocating: 1) try simple parallelization of tests within the suite. In this instance there's no reason not to test these in parallel and get an 8x or 16x speedup from cores. This assumes,
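A minimal sketch of the "parallelize within the suite" idea, written in Python purely for illustration rather than as the thread's actual Spark/Scala test code. The offset list, the `check_round_trip` helper, and the worker count are all hypothetical, and the sketch assumes the individual checks share no mutable state.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for the zones under test: fixed UTC offsets in hours.
OFFSETS = list(range(-12, 15))

def check_round_trip(offset_hours):
    """Format a timestamp in the given zone, parse it back, and compare."""
    tz = timezone(timedelta(hours=offset_hours))
    original = datetime(2018, 10, 8, 12, 0, tzinfo=tz)
    parsed = datetime.fromisoformat(original.isoformat())
    return offset_hours, parsed == original

def run_all(offsets, workers=8):
    """Run every check on a worker pool; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(check_round_trip, offsets))

results = run_all(OFFSETS)
failures = [offset for offset, ok in results if not ok]
print(f"checked {len(results)} offsets, failures: {failures}")
```

Every case still runs, so coverage is unchanged; only the scheduling differs. In CPython, threads mainly help when the checks wait on I/O, so CPU-bound checks would need a process pool to approach a per-core speedup.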

Re: Random sampling in tests

2018-10-08 Thread Marco Gaido

Yes, I see. It makes sense. Thanks.

On Mon, Oct 8, 2018 at 16:35, Reynold Xin wrote:
> Marco - the issue is to reproduce. It is much more annoying for somebody
> else who might not have touched this test case to be able to reproduce the
> error, just given a timezone. It is much e

Re: Random sampling in tests

2018-10-08 Thread Reynold Xin

Marco - the issue is to reproduce. It is much more annoying for somebody else who might not have touched this test case to be able to reproduce the error, just given a timezone. It is much easier to just follow some documentation saying "please run TEST_SEED=5 build/sbt ~ ". On Mon, Oct 8, 20
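A rough sketch of the seed-based reproduction idea described above, in Python for illustration only. The `TEST_SEED` variable name comes from the message; the zone list, the helper names, and the sample size are hypothetical and are not taken from the Spark code base.

```python
import os
import random

# Hypothetical stand-in for the full list of time zone IDs under test.
ALL_TIMEZONES = [
    "UTC", "America/Los_Angeles", "America/Sao_Paulo", "Europe/Istanbul",
    "Africa/Cairo", "Asia/Kolkata", "Asia/Tokyo", "Australia/Sydney",
]

def pick_seed(env=os.environ):
    """Use TEST_SEED when it is set; otherwise draw a fresh seed."""
    raw = env.get("TEST_SEED")
    if raw is not None:
        return int(raw)
    return random.SystemRandom().randrange(2**32)

def sample_timezones(seed, k=3):
    """Pick k zones deterministically from the seed."""
    rng = random.Random(seed)
    return rng.sample(ALL_TIMEZONES, k)

seed = pick_seed()
# Always log the seed so a failing run can be replayed with TEST_SEED=<seed>.
print(f"TEST_SEED={seed}")
print(sample_timezones(seed))
```

Because the sample depends only on the seed, anyone can replay a failing run by exporting the logged value, without needing to know which test parameter was randomized.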

Re: Random sampling in tests

2018-10-08 Thread Marco Gaido

Hi all, thanks for bringing up the topic, Sean. I also agree with Reynold's idea, but in this specific case, if there is an error, the timezone is part of the error message. So we know exactly which timezone caused the failure. Hence I thought that logging the seed is not necessary, as we can directly

Re: Random sampling in tests

2018-10-08 Thread Xiao Li

For this specific case, I do not think we should test all the timezones. If this were fast, I would be fine leaving it unchanged. However, it is very slow. Thus, I would even prefer reducing the tested timezones to a smaller number or just hardcoding some specific time zones. In general, I like Reynold’s id

Re: Random sampling in tests

2018-10-08 Thread Reynold Xin

I'm personally not a big fan of doing it that way in the PR. It is perfectly fine to employ randomized tests, and in this case it might even be fine to just pick a couple of different timezones, the way it was done in the PR, but we should: 1. Document in the code comment why we did it that way. 2
