Hello Elena,

> Can you give a raw estimation of a ratio of failures missed due to being
> low in the priority queue vs those that were not in the queue at all?
I sent this information in a previous email, here:
https://lists.launchpad.net/maria-developers/msg07482.html

> Also, once again, I would like you to start using an incoming test list as
> an initial point of your test set generation. It must be done sooner or
> later, I already explained earlier why; and while it's not difficult to
> implement even after the end of your project, it might affect the result
> considerably, so we need to know if it makes it better or worse, and adjust
> the algorithm accordingly.

You are right. I understand that this information is not fully available
for all the test_runs, so could you upload the information going back as far
as possible? I can parse these files and adjust the program to work with
them. I will get to work on this; I think it should significantly improve
the results, and might even push my current strategy from promising to
genuinely attractive.

> There are several options which change the way the tests are executed;
> e.g. tests can be run in a "normal" mode, or in PS protocol mode, or with
> valgrind, or with embedded server. And it might well be that some tests
> always fail e.g. with valgrind, but almost never fail otherwise.
> Information about these options is partially available in test_run.info,
> but it would require some parsing. It would be perfect if you could analyze
> the existing data to understand whether using it can affect your results
> before spending time on actual code changes.

I will keep this in mind, but for now I will focus on two main things:

- Improving the precision of selecting code changes, to estimate their
  correlation with test failures
- Adding the use of an incoming test list

> When we are trying to watch all code changes and find correlation with
> test failures, if it's done well, it should actually provide immediate
> gain; however, it's very difficult to do it right, there is way too much
> noise in the statistical data to get a reliable picture. So, while it will
> be nice if you get it to work (since you already started doing it), don't
> take it as a defeat if you eventually find out that it doesn't work very
> well.

Well, actually, this is the only big difference between the original
strategy, which uses just a weighted average of failures, and the new
strategy, which performs *significantly better* in longer testing settings.
It has been running for a few weeks, and is up on github. Either way, as I
said before, from today I will focus on improving the precision of
selecting code changes to estimate their correlation with test failures.

Regards
Pablo
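
P.S. For concreteness, here is a minimal sketch of the two strategies as
described above. The data layout (run history as a list of dicts holding
per-test results plus changed files), the decay constant, and all function
names are placeholders for illustration, not the actual code on github:

    from collections import defaultdict

    DECAY = 0.95  # assumed: recent runs weigh more than older ones

    def weighted_failure_rate(history):
        """Original strategy: per-test weighted average of past failures."""
        score, weight_sum = defaultdict(float), defaultdict(float)
        w = 1.0
        for run in reversed(history):  # newest run first, weight decays going back
            for test, failed in run["results"].items():
                score[test] += w * (1.0 if failed else 0.0)
                weight_sum[test] += w
            w *= DECAY
        return {t: score[t] / weight_sum[t] for t in score}

    def change_correlation(history):
        """New strategy's extra signal: fraction of the runs touching a
        given file in which a given test failed."""
        fails = defaultdict(lambda: defaultdict(int))
        touched = defaultdict(int)
        for run in history:
            for f in run["changed_files"]:
                touched[f] += 1
                for test, failed in run["results"].items():
                    if failed:
                        fails[f][test] += 1
        return {f: {t: n / touched[f] for t, n in tests.items()}
                for f, tests in fails.items()}

    def prioritize(history, changed_files, incoming_tests):
        """Rank only the tests on the incoming list, combining the base
        failure rate with the change-correlation signal."""
        base = weighted_failure_rate(history)
        corr = change_correlation(history)
        def score(test):
            s = base.get(test, 0.0)
            for f in changed_files:
                s += corr.get(f, {}).get(test, 0.0)
            return s
        return sorted(incoming_tests, key=score, reverse=True)

With an empty correlation table this degenerates to the original
weighted-average ranking, which is exactly the difference described above.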