Hey Angelo. Here's my suspicion - only had a brief time to poke at this this morning but it might help you bottom out on this. In short: I think the logic in the test is flawed, and that the Assert.assertEquals check should instead do a >= check against total mutations read from midpoint. The test iterates through all files and plays mutations >= the midpoint in terms of file offset, but what we often see testing the CommitLog is that other things (system tables, etc) are being written to CommitLog's so we can't deterministically say "check that X mutations are present, no more, no less" because some runs will end up with extra things written in the CL, some not, and it's just a timing concern.
Each of your cases, you see that you have >= expected actual readCount from midpoint (50) mutations in your testHandler.seenMutationCount(), and I suspect that's because mutations from disparate tables and keyspaces are all interleaved in the CL segment so you can get more than you expected. Hope that helps and isn't woefully incorrect. :) On Thu, Dec 19, 2019 at 7:01 AM Angelo Polo <language.de...@gmail.com> wrote: > Hi C* Devs, > > I've got two questions relating to the CommitLogReader. > > 1. > I maintain the Cassandra port for FreeBSD. For C* 3.11.5 (OpenJDK > 1.8.0_222, FreeBSD 12.1-RELEASE-p1 GENERIC amd64), there are currently test > failures in org.apache.cassandra.db.commitlog.CommitLogReaderTest with zero > mutations read in each case. Only testReadCount passes and it looks like > that's because the number of generated samples in the test is smaller than > the others. Increase the sample size to around 300 and it will fail too; > similarly, the others pass with a smaller sample size. > > The tests are failing because CommitLogDescriptor#readHeader returns null, > a result of the checksum check failing: the checksum read from the file > (CommitLogDescriptor:175) is zero. Similarly version, id, and > parameterLength are all zero and from what I can tell the commit log in > these instances is just all zeros. > > Any idea why a "large" number of mutations could cause a problem with the > commit log? Has this been seen and fixed on any of the officially supported > platforms (Linux, Darwin)? > > 2. > Because of #1 above, I started playing around with CommitLogReaderTest on > Linux (OpenJDK 1.8.0_232, Ubuntu 18.04). Checked out the 3.11.5 release > commit and changed just samples and readCount in > CommitLogReaderTest#testReadFromMidpoint as follows: > > public void testReadFromMidpoint() throws Throwable > { > int samples = 100; // formerly 1000 > int readCount = 60; // formerly 500 > > Shouldn't this fail uniformly every time with "Expected 60 seen mutations, > got: 50"? It produces somewhat random results: > > [junit-timeout] Testcase: > > testReadFromMidpoint(org.apache.cassandra.db.commitlog.CommitLogReaderTest): > FAILED > [junit-timeout] Expected 60 seen mutations, got: 55 expected:<60> but > was:<55> > [junit-timeout] junit.framework.AssertionFailedError: Expected 60 seen > mutations, got: 55 expected:<60> but was:<55> > [junit-timeout] at > > org.apache.cassandra.db.commitlog.CommitLogReaderTest.testReadFromMidpoint(CommitLogReaderTest.java:111) > > [junit-timeout] Testcase: > > testReadFromMidpoint(org.apache.cassandra.db.commitlog.CommitLogReaderTest): > FAILED > [junit-timeout] Expected 60 seen mutations, got: 57 expected:<60> but > was:<57> > [junit-timeout] junit.framework.AssertionFailedError: Expected 60 seen > mutations, got: 57 expected:<60> but was:<57> > [junit-timeout] at > > org.apache.cassandra.db.commitlog.CommitLogReaderTest.testReadFromMidpoint(CommitLogReaderTest.java:111) > > Including, occasionally, the anticipated answer :) > [junit-timeout] Testcase: > > testReadFromMidpoint(org.apache.cassandra.db.commitlog.CommitLogReaderTest): > FAILED > [junit-timeout] Expected 60 seen mutations, got: 50 expected:<60> but > was:<50> > [junit-timeout] junit.framework.AssertionFailedError: Expected 60 seen > mutations, got: 50 expected:<60> but was:<50> > [junit-timeout] at > > org.apache.cassandra.db.commitlog.CommitLogReaderTest.testReadFromMidpoint(CommitLogReaderTest.java:111) > > Shall I open a bug in JIRA for this or have I misunderstood how > testReadFromMidpoint works? > > Thanks, > Angelo Polo >