Certainly, if your tests have found a problem, open a JIRA and/or a pull request with the fix and relevant tests.
More tests generally can't hurt, though we should have a look at them first. If they're mostly boilerplate covering basic functionality that is already exercised by other tests, they're not as useful, but tests covering new cases should probably be added.

On Wed, Aug 22, 2018 at 6:14 AM Steffen Herbold <herb...@cs.uni-goettingen.de> wrote:

> Dear developers,
>
> I am writing to you because I applied an approach for the automated
> testing of classification algorithms to Spark MLlib and would like to
> forward the results to you.
>
> The approach is a combination of smoke testing and metamorphic testing.
> The smoke tests try to find problems by executing the training and
> prediction functions of classifiers with different data. These smoke
> tests should ensure the basic functioning of classifiers. I defined 20
> different data sets, some very simple (uniform features in [0,1]), some
> with extreme distributions, e.g., data close to machine precision. The
> metamorphic tests determine whether classification results change as
> expected when the training data is modified, e.g., by reordering
> features, flipping class labels, or reordering instances.
>
> I generated 70 different JUnit tests for six different Spark ML
> classifiers. In summary, I found the following potential problems:
> - One error due to a value being out of bounds for the
> LogisticRegression classifier if data approaches MAXDOUBLE. Which bound
> is affected is not explained.
> - The classification of NaiveBayes and LinearSVC sometimes changed when
> one is added to each feature value.
> - The classification of LogisticRegression, DecisionTree, and
> RandomForest was not inverted when all binary class labels were flipped.
> - The classification of LogisticRegression, DecisionTree, GBT, and
> RandomForest sometimes changed when the features were reordered.
> - The classification of LogisticRegression, RandomForest, and LinearSVC
> sometimes changed when the instances were reordered.
>
> You can find details of our results online [1]. The provided resources
> include the current draft of the paper that describes the tests as well
> as the detailed results. Moreover, we provide an executable test suite
> with all tests we executed, as well as an export of our test results as
> an XML file that contains all details of the test execution, including
> stack traces in case of exceptions. The preprint and online materials
> also contain the results for two other machine learning libraries,
> i.e., Weka and scikit-learn. Additionally, you can find the atoml tool
> used to generate the tests on GitHub [2].
>
> I hope that these tests may help with the future development of Spark
> MLlib. You could help me a lot by answering the following questions:
> - Do you consider the tests helpful?
> - Are you considering any source code or documentation changes based on
> our findings?
> - Would you be interested in a pull request or any other type of
> integration of (a subset of) the tests into your project?
> - Would you be interested in more such tests, e.g., covering
> hyperparameters, other algorithm types like clustering, or more complex
> algorithm-specific metamorphic tests?
>
> I am looking forward to your feedback.
>
> Best regards,
> Steffen Herbold
>
> [1] http://user.informatik.uni-goettingen.de/~sherbold/atoml-results/
> [2] https://github.com/sherbold/atoml
>
> --
> Dr. Steffen Herbold
> Institute of Computer Science
> University of Goettingen
> Goldschmidtstraße 7
> 37077 Göttingen, Germany
> mailto. herb...@cs.uni-goettingen.de
> tel. +49 551 39-172037
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
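For concreteness, here is a minimal sketch of the kind of metamorphic test described in the quoted message: train LogisticRegression on data whose feature dimensions have been consistently reordered and check that the predicted classes do not change. This is not one of the atoml-generated JUnit tests; the tiny synthetic data set, column names, object name, and exact-equality check are illustrative assumptions only.

import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

// Sketch of a metamorphic relation: swapping the feature dimensions
// consistently in the training data should not change the predicted classes.
object FeatureReorderMetamorphicSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("feature-reorder-metamorphic-sketch")
      .master("local[2]")
      .getOrCreate()
    import spark.implicits._

    // Tiny synthetic training set: (label, features).
    val original = Seq(
      (0.0, Vectors.dense(0.1, 0.9)),
      (0.0, Vectors.dense(0.2, 0.8)),
      (1.0, Vectors.dense(0.9, 0.1)),
      (1.0, Vectors.dense(0.8, 0.2))
    ).toDF("label", "features")

    // Morphed variant: the two feature dimensions are swapped everywhere.
    val reordered = Seq(
      (0.0, Vectors.dense(0.9, 0.1)),
      (0.0, Vectors.dense(0.8, 0.2)),
      (1.0, Vectors.dense(0.1, 0.9)),
      (1.0, Vectors.dense(0.2, 0.8))
    ).toDF("label", "features")

    val lr = new LogisticRegression()

    // Train on each variant and predict on its own training data.
    val predsOriginal = lr.fit(original).transform(original)
      .select("prediction").as[Double].collect()
    val predsReordered = lr.fit(reordered).transform(reordered)
      .select("prediction").as[Double].collect()

    // Metamorphic relation: the predicted classes should be identical.
    assert(predsOriginal.sameElements(predsReordered),
      s"Predictions changed after feature reordering: " +
        s"${predsOriginal.mkString(",")} vs ${predsReordered.mkString(",")}")

    spark.stop()
  }
}

A fuller version would compare predictions on held-out test data rather than on the training data itself, and would cover the other relations mentioned above, such as flipping binary class labels or reordering instances.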