Note: by the time I got to the end of writing this email I had come to
strongly suspect that the duplication relates to how the filtering in the
table works... but how one easily gets a non-duplicative set when
investigating failures is still a question I don't have an answer for....

Seeing TimeRoutedAliasUpdateProcessorTest on the 7.6 bad apple list, having
recently been looking at that test, and waiting on a long build for other
work, I went to http://fucit.org/solr-jenkins-reports/failure-report.html
to gather recent failures, and when I started looking I began to suspect
there were duplicates... So I downloaded/extracted everything that comes up
when I filter on TimeR and fiddled about a bit with grepping/etc and got
this result that I admit I don't understand....

Filtered results on fucit.org looked like:

NS2-MacBook-Pro:TRA gus$ column -ts, failure-rates-data.csv

"Suite?"  "Class"                                                                "Method"                  "Rate"              "Runs"  "Fails"
"false"   "org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest"  "test"                    "4.76190476190476"  "42"    "2"
"false"   "org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest"  "testSliceRouting"        "2.27272727272727"  "308"   "7"
"true"    "org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest"  ""                        "1.3986013986014"   "286"   "4"
"false"   "org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest"  "testPreemptiveCreation"  "1.2987012987013"   "308"   "4"

I clicked on each line, opened a tab for each line in the modal dialog,
and then from each tab downloaded jenkins.log.txt.gz into a folder
corresponding to the day in the file timestamp:

gus$ find . -name '*.gz' -print0 | xargs -0 gunzip

NS2-MacBook-Pro:testfails gus$ find . -name '*.txt'

./TRA/2018-12-01/jenkins.log.txt

./TRA/2018-12-01/jenkins.log (2).txt

./TRA/2018-12-01/jenkins.log (3).txt

./TRA/2018-12-01/jenkins.log (1).txt

./TRA/2018-11-30/jenkins.log (4).txt

./TRA/2018-11-30/jenkins.log.txt

./TRA/2018-11-30/jenkins.log (2).txt

./TRA/2018-11-30/jenkins.log (3).txt

./TRA/2018-11-30/jenkins.log (1).txt

./TRA/2018-12-02/jenkins.log.txt

./TRA/2018-12-02/jenkins.log (2).txt

./TRA/2018-12-02/jenkins.log (3).txt

./TRA/2018-12-02/jenkins.log (1).txt

./TRA/2018-12-03/jenkins.log.txt


gus$ grep -r 'reprod' * | grep TimeRouted | perl -pe 's/(^[^[]*).*reproduce with:(.*Dtests\.seed=(\w+)\s.*)/\3 \1 \2/' | sort

2E743D2D45BF625E TRA/2018-12-02/jenkins.log (1).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=test -Dtests.seed=2E743D2D45BF625E -Dtests.multiplier=3
-Dtests.slow=true -Dtests.badapples=true -Dtests.locale=guz-KE
-Dtests.timezone=Etc/GMT-5 -Dtests.asserts=true
-Dtests.file.encoding=US-ASCII

2E743D2D45BF625E TRA/2018-12-02/jenkins.log (1).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testSliceRouting -Dtests.seed=2E743D2D45BF625E
-Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=guz-KE
-Dtests.timezone=Etc/GMT-5 -Dtests.asserts=true
-Dtests.file.encoding=US-ASCII

2E743D2D45BF625E TRA/2018-12-02/jenkins.log (1).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.seed=2E743D2D45BF625E -Dtests.multiplier=3 -Dtests.slow=true
-Dtests.badapples=true -Dtests.locale=guz-KE -Dtests.timezone=Etc/GMT-5
-Dtests.asserts=true -Dtests.file.encoding=US-ASCII

2E743D2D45BF625E TRA/2018-12-02/jenkins.log (2).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=test -Dtests.seed=2E743D2D45BF625E -Dtests.multiplier=3
-Dtests.slow=true -Dtests.badapples=true -Dtests.locale=guz-KE
-Dtests.timezone=Etc/GMT-5 -Dtests.asserts=true
-Dtests.file.encoding=US-ASCII

2E743D2D45BF625E TRA/2018-12-02/jenkins.log (2).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testSliceRouting -Dtests.seed=2E743D2D45BF625E
-Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=guz-KE
-Dtests.timezone=Etc/GMT-5 -Dtests.asserts=true
-Dtests.file.encoding=US-ASCII

2E743D2D45BF625E TRA/2018-12-02/jenkins.log (2).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.seed=2E743D2D45BF625E -Dtests.multiplier=3 -Dtests.slow=true
-Dtests.badapples=true -Dtests.locale=guz-KE -Dtests.timezone=Etc/GMT-5
-Dtests.asserts=true -Dtests.file.encoding=US-ASCII

2E743D2D45BF625E TRA/2018-12-02/jenkins.log.txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=test -Dtests.seed=2E743D2D45BF625E -Dtests.multiplier=3
-Dtests.slow=true -Dtests.badapples=true -Dtests.locale=guz-KE
-Dtests.timezone=Etc/GMT-5 -Dtests.asserts=true
-Dtests.file.encoding=US-ASCII

2E743D2D45BF625E TRA/2018-12-02/jenkins.log.txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testSliceRouting -Dtests.seed=2E743D2D45BF625E
-Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=guz-KE
-Dtests.timezone=Etc/GMT-5 -Dtests.asserts=true
-Dtests.file.encoding=US-ASCII

2E743D2D45BF625E TRA/2018-12-02/jenkins.log.txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.seed=2E743D2D45BF625E -Dtests.multiplier=3 -Dtests.slow=true
-Dtests.badapples=true -Dtests.locale=guz-KE -Dtests.timezone=Etc/GMT-5
-Dtests.asserts=true -Dtests.file.encoding=US-ASCII

653A3F94747D4B6C TRA/2018-12-02/jenkins.log (3).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testSliceRouting -Dtests.seed=653A3F94747D4B6C
-Dtests.slow=true -Dtests.locale=is -Dtests.timezone=Pacific/Kiritimati
-Dtests.asserts=true -Dtests.file.encoding=UTF-8

85F52ED219B35581 TRA/2018-11-30/jenkins.log (2).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testPreemptiveCreation -Dtests.seed=85F52ED219B35581
-Dtests.slow=true -Dtests.locale=lv-LV -Dtests.timezone=America/Resolute
-Dtests.asserts=true -Dtests.file.encoding=US-ASCII

85F52ED219B35581 TRA/2018-11-30/jenkins.log (2).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testSliceRouting -Dtests.seed=85F52ED219B35581
-Dtests.slow=true -Dtests.locale=lv-LV -Dtests.timezone=America/Resolute
-Dtests.asserts=true -Dtests.file.encoding=US-ASCII

85F52ED219B35581 TRA/2018-11-30/jenkins.log (4).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testPreemptiveCreation -Dtests.seed=85F52ED219B35581
-Dtests.slow=true -Dtests.locale=lv-LV -Dtests.timezone=America/Resolute
-Dtests.asserts=true -Dtests.file.encoding=US-ASCII

85F52ED219B35581 TRA/2018-11-30/jenkins.log (4).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testSliceRouting -Dtests.seed=85F52ED219B35581
-Dtests.slow=true -Dtests.locale=lv-LV -Dtests.timezone=America/Resolute
-Dtests.asserts=true -Dtests.file.encoding=US-ASCII

87AA84094394A25D TRA/2018-11-30/jenkins.log (3).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testPreemptiveCreation -Dtests.seed=87AA84094394A25D
-Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=fr-FR
-Dtests.timezone=Pacific/Efate -Dtests.asserts=true
-Dtests.file.encoding=US-ASCII

8DF7794AB2D01C00 TRA/2018-12-01/jenkins.log (1).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testSliceRouting -Dtests.seed=8DF7794AB2D01C00
-Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=en-IO
-Dtests.timezone=America/Belize -Dtests.asserts=true
-Dtests.file.encoding=UTF-8

953C910946955E70 TRA/2018-12-01/jenkins.log (2).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testSliceRouting -Dtests.seed=953C910946955E70
-Dtests.slow=true -Dtests.locale=uk -Dtests.timezone=Indian/Christmas
-Dtests.asserts=true -Dtests.file.encoding=UTF-8

96F44DBF886ECD38 TRA/2018-12-01/jenkins.log.txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testSliceRouting -Dtests.seed=96F44DBF886ECD38
-Dtests.slow=true -Dtests.locale=de-LU -Dtests.timezone=Etc/GMT+2
-Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1

B72E2FF953D47986 TRA/2018-12-03/jenkins.log.txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=test -Dtests.seed=B72E2FF953D47986 -Dtests.multiplier=2
-Dtests.slow=true -Dtests.badapples=true -Dtests.locale=sl-SI
-Dtests.timezone=Asia/Famagusta -Dtests.asserts=true
-Dtests.file.encoding=UTF-8

B9DA44E3F56E5A8C TRA/2018-12-01/jenkins.log (3).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testSliceRouting -Dtests.seed=B9DA44E3F56E5A8C
-Dtests.slow=true -Dtests.locale=hr -Dtests.timezone=Mexico/BajaSur
-Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1

C3EC920833C32D9F TRA/2018-11-30/jenkins.log (1).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testPreemptiveCreation -Dtests.seed=C3EC920833C32D9F
-Dtests.slow=true -Dtests.badapples=true -Dtests.locale=be-BY
-Dtests.timezone=Asia/Dubai -Dtests.asserts=true
-Dtests.file.encoding=Cp1252

C3EC920833C32D9F TRA/2018-11-30/jenkins.log (1).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testPreemptiveCreation -Dtests.seed=C3EC920833C32D9F
-Dtests.slow=true -Dtests.locale=be-BY -Dtests.timezone=Asia/Dubai
-Dtests.asserts=true -Dtests.file.encoding=Cp1252

C3EC920833C32D9F TRA/2018-11-30/jenkins.log (1).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.seed=C3EC920833C32D9F -Dtests.slow=true -Dtests.badapples=true
-Dtests.locale=be-BY -Dtests.timezone=Asia/Dubai -Dtests.asserts=true
-Dtests.file.encoding=Cp1252

C3EC920833C32D9F TRA/2018-11-30/jenkins.log (1).txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.seed=C3EC920833C32D9F -Dtests.slow=true -Dtests.locale=be-BY
-Dtests.timezone=Asia/Dubai -Dtests.asserts=true
-Dtests.file.encoding=Cp1252

C3EC920833C32D9F TRA/2018-11-30/jenkins.log.txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testPreemptiveCreation -Dtests.seed=C3EC920833C32D9F
-Dtests.slow=true -Dtests.badapples=true -Dtests.locale=be-BY
-Dtests.timezone=Asia/Dubai -Dtests.asserts=true
-Dtests.file.encoding=Cp1252

C3EC920833C32D9F TRA/2018-11-30/jenkins.log.txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.method=testPreemptiveCreation -Dtests.seed=C3EC920833C32D9F
-Dtests.slow=true -Dtests.locale=be-BY -Dtests.timezone=Asia/Dubai
-Dtests.asserts=true -Dtests.file.encoding=Cp1252

C3EC920833C32D9F TRA/2018-11-30/jenkins.log.txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.seed=C3EC920833C32D9F -Dtests.slow=true -Dtests.badapples=true
-Dtests.locale=be-BY -Dtests.timezone=Asia/Dubai -Dtests.asserts=true
-Dtests.file.encoding=Cp1252

C3EC920833C32D9F TRA/2018-11-30/jenkins.log.txt:     ant test
-Dtestcase=TimeRoutedAliasUpdateProcessorTest
-Dtests.seed=C3EC920833C32D9F -Dtests.slow=true -Dtests.locale=be-BY
-Dtests.timezone=Asia/Dubai -Dtests.asserts=true
-Dtests.file.encoding=Cp1252

What I've done is sort by seed, and I found a LOT of duplication, even
across files, plus what appear to be runs of specific test methods. I'd like
to understand what's happening with the build servers here... why are we
seeing so many duplicates? I would guess that this really boils down to 1
fail per seed value seen? I'm trying to figure out how many and which of
these I need to consider, and I'm interested in the frequency of different
failure scenarios, which is hard to gauge if there's duplication.
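
One way to get a non-duplicative set might be to collapse the "reproduce
with:" lines to one entry per (seed, test class, method), ignoring which
file they came from and ignoring flags such as -Dtests.badapples. This is
only a sketch: the function name and the sample lines are hypothetical
(abbreviated from the output above), and whether two lines with the same
seed really are the same failure is an assumption, not something I've
confirmed.

```python
import re

# Patterns for the pieces of an "ant test ..." reproduce line.
SEED_RE = re.compile(r"-Dtests\.seed=(\w+)")
CASE_RE = re.compile(r"-Dtestcase=(\S+)")
METHOD_RE = re.compile(r"-Dtests\.method=(\S+)")


def unique_failures(lines):
    """Return {(seed, testcase, method): first matching line}.

    method is None for suite-level reproduce lines that have no
    -Dtests.method flag.
    """
    seen = {}
    for line in lines:
        if "reproduce with:" not in line:
            continue
        seed = SEED_RE.search(line)
        case = CASE_RE.search(line)
        if seed is None or case is None:
            continue
        method = METHOD_RE.search(line)
        key = (seed.group(1), case.group(1),
               method.group(1) if method else None)
        # Keep only the first line seen for each key.
        seen.setdefault(key, line.strip())
    return seen


if __name__ == "__main__":
    # Abbreviated, hypothetical sample lines modeled on the output above.
    sample = [
        "a.txt: reproduce with: ant test -Dtestcase=T -Dtests.method=test -Dtests.seed=2E743D2D45BF625E -Dtests.badapples=true",
        "b.txt: reproduce with: ant test -Dtestcase=T -Dtests.method=test -Dtests.seed=2E743D2D45BF625E -Dtests.badapples=true",
        "a.txt: reproduce with: ant test -Dtestcase=T -Dtests.seed=2E743D2D45BF625E -Dtests.badapples=true",
    ]
    for key in sorted(unique_failures(sample), key=str):
        print(key)
```

Keying on the method as well as the seed keeps a suite-level failure
separate from a method-level one for the same seed; dropping the method
from the key would give the "1 fail per seed" count instead.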

Other interesting stuff...
gus$ grep -r 'reprod' * | grep Time | perl -pe 's/(^[^[]*).*reproduce with:(.*Dtests\.seed=(\w+)\s.*)/\1/' | sort | uniq -c

   4 TRA/2018-11-30/jenkins.log (1).txt:

   2 TRA/2018-11-30/jenkins.log (2).txt:

   1 TRA/2018-11-30/jenkins.log (3).txt:

   2 TRA/2018-11-30/jenkins.log (4).txt:

   4 TRA/2018-11-30/jenkins.log.txt:

   1 TRA/2018-12-01/jenkins.log (1).txt:

   1 TRA/2018-12-01/jenkins.log (2).txt:

   1 TRA/2018-12-01/jenkins.log (3).txt:

   1 TRA/2018-12-01/jenkins.log.txt:

   3 TRA/2018-12-02/jenkins.log (1).txt:

   3 TRA/2018-12-02/jenkins.log (2).txt:

   1 TRA/2018-12-02/jenkins.log (3).txt:

   3 TRA/2018-12-02/jenkins.log.txt:

   1 TRA/2018-12-03/jenkins.log.txt:

gus$ grep -r 'reprod' * | grep Time | perl -pe 's/.*reproduce with:(.*Dtests\.seed=(\w+)\s.*)/\2/' | sort | uniq -c

   9 2E743D2D45BF625E

   1 653A3F94747D4B6C

   4 85F52ED219B35581

   1 87AA84094394A25D

   1 8DF7794AB2D01C00

   1 953C910946955E70

   1 96F44DBF886ECD38

   1 B72E2FF953D47986

   1 B9DA44E3F56E5A8C

   8 C3EC920833C32D9F

gus$ grep -r 'reprod' * | grep Time | perl -pe 's/.*reproduce with:(.*Dtests\.seed=(\w+)\s.*)/\2/' | wc -l

      28

So the 17 failures listed on fucit.org led me to 14 files containing 10
distinct seeds and 28 lines that contain "reproduce with:".

Not quite sure how to interpret that. Even if I say each seed is a unique
fail, I have no idea how many total builds that relates to...
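
To get a rough handle on how seeds map to log files, something like the
following could group the grep output by seed and list the distinct files
each seed appears in. It is a sketch only: files_per_seed is a made-up
name, it assumes grep -r style "path: text" lines with no colon in the
path, and it says nothing about whether files sharing a seed are copies of
the same build log.

```python
import re
from collections import defaultdict

SEED_RE = re.compile(r"-Dtests\.seed=(\w+)")


def files_per_seed(grep_lines):
    """Map each seed to the set of file paths whose lines mention it."""
    by_seed = defaultdict(set)
    for line in grep_lines:
        match = SEED_RE.search(line)
        if match is None or ":" not in line:
            continue
        # grep -r prefixes each hit with "path:"; take everything
        # before the first colon as the path.
        path = line.split(":", 1)[0]
        by_seed[match.group(1)].add(path)
    return dict(by_seed)


if __name__ == "__main__":
    # Abbreviated, hypothetical sample lines modeled on the output above.
    sample = [
        "TRA/2018-12-02/jenkins.log (1).txt: reproduce with: ant test -Dtests.seed=2E743D2D45BF625E",
        "TRA/2018-12-02/jenkins.log (2).txt: reproduce with: ant test -Dtests.seed=2E743D2D45BF625E",
        "TRA/2018-12-02/jenkins.log (3).txt: reproduce with: ant test -Dtests.seed=653A3F94747D4B6C",
    ]
    for seed, paths in sorted(files_per_seed(sample).items()):
        print(seed, len(paths), sorted(paths))
```

A seed that shows up in several files is a candidate for "same build,
downloaded more than once", which would then need checking against the
build number in each log.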

-Gus

On Wed, Dec 5, 2018 at 12:02 PM Nicholas Knize <[email protected]> wrote:

> Hi All,
>
> https://issues.apache.org/jira/browse/SOLR-13039 contains a patch that
> sets a list of common failing Tests to BadApple. As mentioned above I cross
> referenced our CI builds to make sure there aren't any new test failures
> that we haven't seen before. Let me know if any of these come as a
> surprise. I'll plan to commit this change to the 7.6 branch only to
> continue along in the release process. Once 7.6 is released I can revert
> the change to continue CI testing on the bug fix branch.
>
> Thanks for the patience on this.
>
