[ https://issues.apache.org/jira/browse/CASSANDRA-12580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452820#comment-15452820 ]
Paulo Motta commented on CASSANDRA-12580:
-----------------------------------------

Attaching patch to fix the calculation formula to:
{{int depth = numPartitions > 0 ? (int) Math.min(Math.ceil(Math.log(numPartitions) / Math.log(2)), 20) : 0;}}

Besides fixing from {{ln}} to {{lg}}, this also changes the rounding from {{floor}} to {{ceil}}, so we overestimate the depth rather than underestimate it.

I added a new test to {{ValidationTest}} that runs a validation compaction with N=128 and N=1500 keys and expects the merkle tree depth to be {{ceil(lg(N))}}. I also modified the other tests in this class to use a {{ListenableFuture}} ({{CompletableFuture}} on 3.0+) instead of {{SimpleCondition}}, since JUnit assertions are not enforced in other threads.

Patch and tests available below:
||2.1||2.2||3.0||trunk||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:2.1-CASSANDRA-12580]|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-CASSANDRA-12580]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-CASSANDRA-12580]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-CASSANDRA-12580]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-CASSANDRA-12580-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-CASSANDRA-12580-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-CASSANDRA-12580-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-CASSANDRA-12580-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-CASSANDRA-12580-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-CASSANDRA-12580-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-CASSANDRA-12580-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-CASSANDRA-12580-dtest/lastCompletedBuild/testReport/]|

> Fix merkle tree size calculation
> --------------------------------
>
>                 Key: CASSANDRA-12580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12580
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Paulo Motta
>            Assignee: Paulo Motta
>
> CASSANDRA-5263 introduced dynamic merkle tree sizing based on the estimated number of partitions, as {{estimatedDepth = lg(numPartitions)}}, but in [CompactionManager.doValidationCompaction|https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L1052] this is being calculated as:
> {{int depth = numPartitions > 0 ? (int) Math.min(Math.floor(Math.log(numPartitions)), 20) : 0;}}
> This actually calculates {{ln(numPartitions)}} (base-e) instead of {{lg(numPartitions)}} (base-2), which causes merkle trees to lose resolution and may result in overstreaming.
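
To make the difference between the two formulas concrete, here is a minimal standalone sketch that evaluates both side by side. The class name {{MerkleDepthSketch}}, the method names {{oldDepth}} and {{newDepth}}, and the {{MAX_DEPTH}} constant are illustrative only and are not part of Cassandra; the two expressions themselves are copied from the formulas quoted in this message.

```java
class MerkleDepthSketch {
    // Depth cap taken from the literal 20 in both quoted formulas.
    static final int MAX_DEPTH = 20;

    // Formula as quoted in the issue description: natural log, rounded down.
    static int oldDepth(long numPartitions) {
        return numPartitions > 0
             ? (int) Math.min(Math.floor(Math.log(numPartitions)), MAX_DEPTH)
             : 0;
    }

    // Formula as quoted in the patch comment: base-2 log, rounded up.
    static int newDepth(long numPartitions) {
        return numPartitions > 0
             ? (int) Math.min(Math.ceil(Math.log(numPartitions) / Math.log(2)), MAX_DEPTH)
             : 0;
    }

    public static void main(String[] args) {
        // Sample partition counts; 128 and 1500 are the values used in the new test.
        for (long n : new long[] {128, 1500, 10_000_000}) {
            System.out.printf("N=%d old=%d new=%d%n", n, oldDepth(n), newDepth(n));
        }
    }
}
```

For N=1500, for instance, the old formula yields a depth of 7 (floor of ln(1500), about 7.31) while the patched one yields 11 (ceil of lg(1500), about 10.55). Assuming a tree of depth d has 2^d leaves, that is 128 leaves versus 2048 for the same 1500 partitions, which illustrates the loss of resolution described in the issue.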