Hello everyone,

Here are the highlights of this week:

1. I am keeping the script customizable, so it can do simple or complex runs as needed. I am also documenting the options.

2. I have added logic to take test file changes into account. Finding first failures has improved significantly with this change, which is useful.

3. I am now working on taking the correlation between test failures into account (a rough sketch of the idea follows below, after this list). Adding this should be quite quick, and I expect to be testing it by the end of the week.

4. I am fairly happy with the results so far. I have been testing with the first 5000 test runs (2000 for training, 3000 for simulation). Once the correlation factor is implemented, I will run tests on a much wider range (~20,000 or ~50,000 test runs). If those results look good, we could review them, make any changes/suggestions, and, if everyone agrees, discuss implementation details while I finish the last touches on the simulation script. How does that sound?
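To make point 3 concrete, here is a minimal sketch of the correlation idea in Python. All names here (build_cofail_counts, correlation_boost, the run format) are made up for the example and are not the script's actual API; the idea is simply to count how often pairs of tests fail in the same run, and to boost tests that historically fail alongside tests that failed recently:

from collections import Counter
from itertools import combinations

def build_cofail_counts(failed_runs):
    """failed_runs: one set of failed test names per historical run.
    Returns per-pair and per-test failure counts."""
    pair_counts = Counter()
    fail_counts = Counter()
    for failed in failed_runs:
        fail_counts.update(failed)
        # Count every pair of tests that failed together in this run.
        for a, b in combinations(sorted(failed), 2):
            pair_counts[(a, b)] += 1
    return pair_counts, fail_counts

def correlation_boost(test, recent_failures, pair_counts, fail_counts):
    """Estimate max over recent failures of
    P(test fails | other test failed), from historical counts."""
    boost = 0.0
    for other in recent_failures:
        pair = tuple(sorted((test, other)))
        if fail_counts[other]:
            boost = max(boost, pair_counts[pair] / fail_counts[other])
    return boost

A test's final priority could then combine this boost with the existing factors (recency of failure, changed test files), but exactly how to weigh it is something I still need to experiment with.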
Regards, and best to everyone.

Pablo

On Tue, Jun 3, 2014 at 7:42 AM, Sergei Golubchik <s...@mariadb.org> wrote:
> Hi, Pablo!
>
> To add to my last reply:
>
> On Jun 03, Sergei Golubchik wrote:
> > Well, it's your project, you can keep any measure you want.
> > But please mark clearly (in comments or whatever) what factors affect
> > results and what don't.
> >
> > It would be very useful to be able to see the simplest possible model
> > that still delivers reasonably good results. Even if we decide to use
> > something more complicated in the end.
> ...
> > Same as above, basically. I'd prefer to use not the model that simply
> > "looks realistic", but the one that makes the best predictions.
> >
> > You can use whatever criteria you prefer, but if taking changed tests
> > into account does not improve the results, I'd like that to be clearly
> > documented or visible in the code.
>
> Alternatively, you can deliver (when this GSoC project ends) two
> versions of the script - one with anything you want in it, and the
> second one - as simple as possible.
>
> For example, the only really important metric is "recall as a
> function of total testing time". We want to reach as high a recall as
> possible in the shortest possible testing time, right? But by this
> criterion one needs to take into account individual test execution
> times (it's better to run 5 fast tests than 1 slow test) and individual
> builder speed factors (better to run 10 tests on a fast builder than 5
> tests on a slow builder). And in my tests it turned out that these
> complications don't improve results much. So, while they make perfect
> sense and make the model more realistic, the simple model can perfectly
> survive without them and use the "recall vs. number of tests" metric.
>
> Regards,
> Sergei
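For reference, a minimal sketch of the "recall vs. number of tests" metric Sergei describes, assuming each simulated run records which tests were available to run and which actually failed. The names (recall_at_n, rank_tests) and the run format are placeholders for illustration, not the actual script's interface:

def recall_at_n(runs, rank_tests, n):
    """runs: list of (available_tests, failed_tests) pairs, where
    failed_tests is a set. rank_tests orders tests by priority.
    Returns the fraction of all failures caught by running only
    the top-N ranked tests in each run."""
    caught = total = 0
    for available, failed in runs:
        chosen = set(rank_tests(available)[:n])  # run only top-N tests
        caught += len(failed & chosen)
        total += len(failed)
    return caught / total if total else 1.0

Computing recall_at_n over a range of N values gives the recall curve, which makes it straightforward to compare the simplest model against more complicated ones on equal terms.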