Note: by the time I finished writing this email I had come to strongly suspect that the duplication relates to how the filtering in the table works... but how one easily gets a non-duplicative set when investigating failures is still a question I don't have an answer for.

Seeing TimeRoutedAliasUpdateProcessorTest on the 7.6 bad apple list, having recently been looking at that test, and waiting on a long build for other work, I went to http://fucit.org/solr-jenkins-reports/failure-report.html to gather recent failures, and when I started looking I began to suspect there were duplicates... So I downloaded/extracted everything that comes up when I filter on TimeR, fiddled about a bit with grep etc., and got this result, which I admit I don't understand....

Filtered results on fucit.org looked like:

NS2-MacBook-Pro:TRA gus$ column -ts, failure-rates-data.csv
"Suite?"  "Class"                                                                "Method"                  "Rate"              "Runs"  "Fails"
"false"   "org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest"  "test"                    "4.76190476190476"  "42"    "2"
"false"   "org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest"  "testSliceRouting"        "2.27272727272727"  "308"   "7"
"true"    "org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest"  ""                        "1.3986013986014"   "286"   "4"
"false"   "org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest"  "testPreemptiveCreation"  "1.2987012987013"   "308"   "4"

I clicked on each line and opened a tab for each line in the modal dialog, and then from each tab downloaded jenkins.log.txt.gz into a folder corresponding to the day on the file timestamp.

gus$ find . -name *.gz -print0 | xargs -0 gunzip
NS2-MacBook-Pro:testfails gus$ find . -name *.txt
./TRA/2018-12-01/jenkins.log.txt
./TRA/2018-12-01/jenkins.log (2).txt
./TRA/2018-12-01/jenkins.log (3).txt
./TRA/2018-12-01/jenkins.log (1).txt
./TRA/2018-11-30/jenkins.log (4).txt
./TRA/2018-11-30/jenkins.log.txt
./TRA/2018-11-30/jenkins.log (2).txt
./TRA/2018-11-30/jenkins.log (3).txt
./TRA/2018-11-30/jenkins.log (1).txt
./TRA/2018-12-02/jenkins.log.txt
./TRA/2018-12-02/jenkins.log (2).txt
./TRA/2018-12-02/jenkins.log (3).txt
./TRA/2018-12-02/jenkins.log (1).txt
./TRA/2018-12-03/jenkins.log.txt

gus$ grep -r 'reprod' * | grep TimeRouted | perl -pe 's/(^[^[]*).*reproduce with:(.*Dtests\.seed=(\w+)\s.*)/\3 \1 \2/' | sort
2E743D2D45BF625E TRA/2018-12-02/jenkins.log (1).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=test -Dtests.seed=2E743D2D45BF625E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=guz-KE -Dtests.timezone=Etc/GMT-5 -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
2E743D2D45BF625E TRA/2018-12-02/jenkins.log (1).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testSliceRouting -Dtests.seed=2E743D2D45BF625E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=guz-KE -Dtests.timezone=Etc/GMT-5 -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
2E743D2D45BF625E TRA/2018-12-02/jenkins.log (1).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.seed=2E743D2D45BF625E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=guz-KE -Dtests.timezone=Etc/GMT-5 -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
2E743D2D45BF625E TRA/2018-12-02/jenkins.log (2).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=test -Dtests.seed=2E743D2D45BF625E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=guz-KE -Dtests.timezone=Etc/GMT-5 -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
2E743D2D45BF625E TRA/2018-12-02/jenkins.log (2).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testSliceRouting -Dtests.seed=2E743D2D45BF625E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=guz-KE -Dtests.timezone=Etc/GMT-5 -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
2E743D2D45BF625E TRA/2018-12-02/jenkins.log (2).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.seed=2E743D2D45BF625E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=guz-KE -Dtests.timezone=Etc/GMT-5 -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
2E743D2D45BF625E TRA/2018-12-02/jenkins.log.txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=test -Dtests.seed=2E743D2D45BF625E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=guz-KE -Dtests.timezone=Etc/GMT-5 -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
2E743D2D45BF625E TRA/2018-12-02/jenkins.log.txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testSliceRouting -Dtests.seed=2E743D2D45BF625E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=guz-KE -Dtests.timezone=Etc/GMT-5 -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
2E743D2D45BF625E TRA/2018-12-02/jenkins.log.txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.seed=2E743D2D45BF625E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=guz-KE -Dtests.timezone=Etc/GMT-5 -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
653A3F94747D4B6C TRA/2018-12-02/jenkins.log (3).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testSliceRouting -Dtests.seed=653A3F94747D4B6C -Dtests.slow=true -Dtests.locale=is -Dtests.timezone=Pacific/Kiritimati -Dtests.asserts=true -Dtests.file.encoding=UTF-8
85F52ED219B35581 TRA/2018-11-30/jenkins.log (2).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testPreemptiveCreation -Dtests.seed=85F52ED219B35581 -Dtests.slow=true -Dtests.locale=lv-LV -Dtests.timezone=America/Resolute -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
85F52ED219B35581 TRA/2018-11-30/jenkins.log (2).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testSliceRouting -Dtests.seed=85F52ED219B35581 -Dtests.slow=true -Dtests.locale=lv-LV -Dtests.timezone=America/Resolute -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
85F52ED219B35581 TRA/2018-11-30/jenkins.log (4).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testPreemptiveCreation -Dtests.seed=85F52ED219B35581 -Dtests.slow=true -Dtests.locale=lv-LV -Dtests.timezone=America/Resolute -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
85F52ED219B35581 TRA/2018-11-30/jenkins.log (4).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testSliceRouting -Dtests.seed=85F52ED219B35581 -Dtests.slow=true -Dtests.locale=lv-LV -Dtests.timezone=America/Resolute -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
87AA84094394A25D TRA/2018-11-30/jenkins.log (3).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testPreemptiveCreation -Dtests.seed=87AA84094394A25D -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=fr-FR -Dtests.timezone=Pacific/Efate -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
8DF7794AB2D01C00 TRA/2018-12-01/jenkins.log (1).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testSliceRouting -Dtests.seed=8DF7794AB2D01C00 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=en-IO -Dtests.timezone=America/Belize -Dtests.asserts=true -Dtests.file.encoding=UTF-8
953C910946955E70 TRA/2018-12-01/jenkins.log (2).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testSliceRouting -Dtests.seed=953C910946955E70 -Dtests.slow=true -Dtests.locale=uk -Dtests.timezone=Indian/Christmas -Dtests.asserts=true -Dtests.file.encoding=UTF-8
96F44DBF886ECD38 TRA/2018-12-01/jenkins.log.txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testSliceRouting -Dtests.seed=96F44DBF886ECD38 -Dtests.slow=true -Dtests.locale=de-LU -Dtests.timezone=Etc/GMT+2 -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
B72E2FF953D47986 TRA/2018-12-03/jenkins.log.txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=test -Dtests.seed=B72E2FF953D47986 -Dtests.multiplier=2 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=sl-SI -Dtests.timezone=Asia/Famagusta -Dtests.asserts=true -Dtests.file.encoding=UTF-8
B9DA44E3F56E5A8C TRA/2018-12-01/jenkins.log (3).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testSliceRouting -Dtests.seed=B9DA44E3F56E5A8C -Dtests.slow=true -Dtests.locale=hr -Dtests.timezone=Mexico/BajaSur -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
C3EC920833C32D9F TRA/2018-11-30/jenkins.log (1).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testPreemptiveCreation -Dtests.seed=C3EC920833C32D9F -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=be-BY -Dtests.timezone=Asia/Dubai -Dtests.asserts=true -Dtests.file.encoding=Cp1252
C3EC920833C32D9F TRA/2018-11-30/jenkins.log (1).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testPreemptiveCreation -Dtests.seed=C3EC920833C32D9F -Dtests.slow=true -Dtests.locale=be-BY -Dtests.timezone=Asia/Dubai -Dtests.asserts=true -Dtests.file.encoding=Cp1252
C3EC920833C32D9F TRA/2018-11-30/jenkins.log (1).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.seed=C3EC920833C32D9F -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=be-BY -Dtests.timezone=Asia/Dubai -Dtests.asserts=true -Dtests.file.encoding=Cp1252
C3EC920833C32D9F TRA/2018-11-30/jenkins.log (1).txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.seed=C3EC920833C32D9F -Dtests.slow=true -Dtests.locale=be-BY -Dtests.timezone=Asia/Dubai -Dtests.asserts=true -Dtests.file.encoding=Cp1252
C3EC920833C32D9F TRA/2018-11-30/jenkins.log.txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testPreemptiveCreation -Dtests.seed=C3EC920833C32D9F -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=be-BY -Dtests.timezone=Asia/Dubai -Dtests.asserts=true -Dtests.file.encoding=Cp1252
C3EC920833C32D9F TRA/2018-11-30/jenkins.log.txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.method=testPreemptiveCreation -Dtests.seed=C3EC920833C32D9F -Dtests.slow=true -Dtests.locale=be-BY -Dtests.timezone=Asia/Dubai -Dtests.asserts=true -Dtests.file.encoding=Cp1252
C3EC920833C32D9F TRA/2018-11-30/jenkins.log.txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.seed=C3EC920833C32D9F -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=be-BY -Dtests.timezone=Asia/Dubai -Dtests.asserts=true -Dtests.file.encoding=Cp1252
C3EC920833C32D9F TRA/2018-11-30/jenkins.log.txt: ant test -Dtestcase=TimeRoutedAliasUpdateProcessorTest -Dtests.seed=C3EC920833C32D9F -Dtests.slow=true -Dtests.locale=be-BY -Dtests.timezone=Asia/Dubai -Dtests.asserts=true -Dtests.file.encoding=Cp1252

What I've done is sort by seed, and found a LOT of duplication, even across files, and some apparent running of specific test methods. I'd like to understand what's happening with the build servers here... why are we seeing so many duplicates? I would guess that this really boils down to 1 fail per seed value seen? I'm trying to figure out how many and which of these I need to consider, and I'm interested in the frequency of different failure scenarios, which is hard to gauge if there's duplication.

Other interesting stuff...
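
As an aside on the seed question above: one way the by-seed grouping could be scripted is sketched below in Python. It is only a sketch under assumptions. The function name distinct_failures is made up for illustration, the "[junit4] NOTE:" prefix in the sample lines is a guess at what precedes "reproduce with:" in the raw grep output, and the sample command lines are abbreviated. It treats a (seed, test class, method) triple as one failure no matter how many files or flag variants repeat it, which is itself just the "1 fail per seed" guess made concrete.

```python
import re
from collections import defaultdict

# Flags taken from the "reproduce with:" lines shown above.
SEED = re.compile(r"-Dtests\.seed=(\w+)")
CASE = re.compile(r"-Dtestcase=(\S+)")
METHOD = re.compile(r"-Dtests\.method=(\S+)")


def distinct_failures(lines):
    """Map each seed to the set of (testcase, method) pairs reported for it.

    Repeats of the same pair (other files, with or without
    -Dtests.badapples) collapse into a single entry.
    """
    by_seed = defaultdict(set)
    for line in lines:
        if "reproduce with:" not in line:
            continue
        seed = SEED.search(line)
        case = CASE.search(line)
        if not (seed and case):
            continue
        method = METHOD.search(line)
        # No -Dtests.method flag means a suite-level line; record it as None.
        by_seed[seed.group(1)].add(
            (case.group(1), method.group(1) if method else None)
        )
    return dict(by_seed)


# Abbreviated sample lines; the text before "reproduce with:" is assumed.
PREFIX = "[junit4] NOTE: reproduce with: ant test "
CASE_FLAG = "-Dtestcase=TimeRoutedAliasUpdateProcessorTest "
sample = [
    "TRA/2018-12-02/jenkins.log.txt: " + PREFIX + CASE_FLAG
    + "-Dtests.method=test -Dtests.seed=2E743D2D45BF625E -Dtests.badapples=true",
    "TRA/2018-12-02/jenkins.log (1).txt: " + PREFIX + CASE_FLAG
    + "-Dtests.method=test -Dtests.seed=2E743D2D45BF625E -Dtests.badapples=true",
    "TRA/2018-12-02/jenkins.log (1).txt: " + PREFIX + CASE_FLAG
    + "-Dtests.seed=2E743D2D45BF625E -Dtests.badapples=true",
    "TRA/2018-12-02/jenkins.log (3).txt: " + PREFIX + CASE_FLAG
    + "-Dtests.method=testSliceRouting -Dtests.seed=653A3F94747D4B6C -Dtests.slow=true",
]

for seed, tests in sorted(distinct_failures(sample).items()):
    print(seed, sorted(tests, key=str))
```

Feeding it the real output of grep -r 'reproduce with:' instead of the sample list would give one entry per seed, with the methods that failed under that seed. Whether that is the right unit to count is exactly the open question. The counts that follow were done the quick way, with grep and perl.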

gus$ grep -r 'reprod' * | grep Time | perl -pe 's/(^[^[]*).*reproduce with:(.*Dtests\.seed=(\w+)\s.*)/\1/' | sort | uniq -c
   4 TRA/2018-11-30/jenkins.log (1).txt:
   2 TRA/2018-11-30/jenkins.log (2).txt:
   1 TRA/2018-11-30/jenkins.log (3).txt:
   2 TRA/2018-11-30/jenkins.log (4).txt:
   4 TRA/2018-11-30/jenkins.log.txt:
   1 TRA/2018-12-01/jenkins.log (1).txt:
   1 TRA/2018-12-01/jenkins.log (2).txt:
   1 TRA/2018-12-01/jenkins.log (3).txt:
   1 TRA/2018-12-01/jenkins.log.txt:
   3 TRA/2018-12-02/jenkins.log (1).txt:
   3 TRA/2018-12-02/jenkins.log (2).txt:
   1 TRA/2018-12-02/jenkins.log (3).txt:
   3 TRA/2018-12-02/jenkins.log.txt:
   1 TRA/2018-12-03/jenkins.log.txt:

gus$ grep -r 'reprod' * | grep Time | perl -pe 's/.*reproduce with:(.*Dtests\.seed=(\w+)\s.*)/\2/' | sort | uniq -c
   9 2E743D2D45BF625E
   1 653A3F94747D4B6C
   4 85F52ED219B35581
   1 87AA84094394A25D
   1 8DF7794AB2D01C00
   1 953C910946955E70
   1 96F44DBF886ECD38
   1 B72E2FF953D47986
   1 B9DA44E3F56E5A8C
   8 C3EC920833C32D9F

gus$ grep -r 'reprod' * | grep Time | perl -pe 's/.*reproduce with:(.*Dtests\.seed=(\w+)\s.*)/\2/' | wc -l
      28

So the 17 failures listed on fucit.org led me to 14 files containing 10 distinct seeds and 28 lines that contain "reproduce with:". Not quite sure how to interpret that. Even if I say each seed is a unique fail, I have no idea how many total builds that relates to...

-Gus

On Wed, Dec 5, 2018 at 12:02 PM Nicholas Knize <[email protected]> wrote:

> Hi All,
>
> https://issues.apache.org/jira/browse/SOLR-13039 contains a patch that
> sets a list of common failing Tests to BadApple. As mentioned above I cross
> referenced our CI builds to make sure there aren't any new test failures
> that we haven't seen before. Let me know if any of these come as a
> surprise. I'll plan to commit this change to the 7.6 branch only to
> continue along in the release process. Once 7.6 is released I can revert
> the change to continue CI testing on the bug fix branch.
>
> Thanks for the patience on this.
>
