This is a great addition to our Cassandra validation framework/tools. I can see many teams in the community benefiting from tooling like this.
I like the idea of the generic repo (repos/asf/cassandra-contrib.git or *whatever the name is*) for tools like this, for the following 2 main reasons.

1. Easily accessible/reachable/searchable
2. Welcomes the community in the Cassandra ecosystem to contribute more easily

Thanks,
Vinay Chella

On Wed, Aug 21, 2019 at 11:39 PM Marcus Eriksson <marc...@apache.org> wrote:

> Hi, we are about to open source our tooling for comparing two cassandra
> clusters and want to get some feedback where to push it. I think the
> options are: (name bike-shedding welcome)
>
> 1. create repos/asf/cassandra-diff.git
> 2. create a generic repos/asf/cassandra-contrib.git where we can add more
> contributed tools in the future
>
> Temporary location: https://github.com/krummas/cassandra-diff
>
> Cassandra-diff is a spark job that compares the data in two clusters - it
> pages through all partitions and reads all rows for those partitions in
> both clusters to make sure they are identical. Based on the configuration
> variable “reverse_read_probability” the rows are either read forward or in
> reverse order.
>
> Our main use case for cassandra-diff has been to set up two identical
> clusters, transfer a snapshot from the cluster we want to test to these
> clusters and upgrade one side. When that is done we run this tool to make
> sure that 2.1 and 3.0 gives the same results. A few examples of the bugs we
> have found using this tool:
>
> * CASSANDRA-14823: Legacy sstables with range tombstones spanning multiple
> index blocks create invalid bound sequences on 3.0+
> * CASSANDRA-14803: Rows that cross index block boundaries can cause
> incomplete reverse reads in some cases
> * CASSANDRA-15178: Skipping illegal legacy cells can break reverse
> iteration of indexed partitions
>
> /Marcus
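
For readers new to the idea, the comparison Marcus describes can be sketched roughly as below. This is only a toy illustration and not the cassandra-diff implementation: it uses plain Python dicts in place of two clusters and no Spark, and the names `read_partition` and `diff_clusters` are made up for this example. Only `reverse_read_probability` comes from the original message. With in-memory lists the read direction cannot surface a bug by itself; against real clusters it matters because reverse reads exercise a different read path.

```python
import random


def read_partition(cluster, key, reverse):
    """Return the rows of one partition, forward or in reverse order.

    `cluster` is a hypothetical stand-in: a dict mapping partition key
    to a list of rows. A missing partition reads as an empty list.
    """
    rows = cluster.get(key, [])
    return list(reversed(rows)) if reverse else list(rows)


def diff_clusters(source, target, reverse_read_probability=0.5, rng=None):
    """Return the sorted partition keys whose rows differ between clusters."""
    if rng is None:
        rng = random.Random()
    mismatches = []
    # Walk every partition that exists on either side.
    for key in sorted(set(source) | set(target)):
        # Pick the read direction per partition, then read both sides
        # in that same direction so the results are comparable.
        reverse = rng.random() < reverse_read_probability
        source_rows = read_partition(source, key, reverse)
        target_rows = read_partition(target, key, reverse)
        if source_rows != target_rows:
            mismatches.append(key)
    return mismatches


if __name__ == "__main__":
    before_upgrade = {"p1": [("a", 1), ("b", 2)], "p2": [("c", 3)]}
    after_upgrade = {"p1": [("a", 1), ("b", 2)], "p2": [("c", 4)]}
    print(diff_clusters(before_upgrade, after_upgrade))
```

In this sketch a partition that is missing on one side is reported as a mismatch, which is an assumption of the example rather than something stated in the thread.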