A different repo will be better 

> On Aug 22, 2019, at 6:16 AM, Per Otterström <per.otterst...@ericsson.com> 
> wrote:
> 
> Very powerful tool indeed, thanks for sharing!
> 
> I believe it is best to keep tools like this in different repos since 
> different tools will probably have different life cycles and tool chains. 
> Yes, that could be handled in a single repo, but with different repos we'd 
> get natural boundaries.
> 
> -----Original Message-----
> From: Sumanth Pasupuleti <spasupul...@netflix.com.INVALID> 
> Sent: 22 August 2019 14:40
> To: dev@cassandra.apache.org
> Subject: Re: Contributing cassandra-diff
> 
> No hard preference on the repo, but just excited about this tool! Looking 
> forward to employing this for upgrade testing (very timely :))
> 
>> On Thu, Aug 22, 2019 at 3:38 AM Sam Tunnicliffe <s...@beobal.com> wrote:
>> 
>> My own weak preference would be for a dedicated repo in the first 
>> instance. If/when additional tools are contributed we should look at 
>> co-locating common stuff, but rushing toward a monorepo would be a 
>> mistake IMO.
>> 
>>>> On 22 Aug 2019, at 11:10, Jeff Jirsa <jji...@gmail.com> wrote:
>>> 
>>> I weakly prefer contrib.
>>> 
>>> 
>>> On Thu, Aug 22, 2019 at 12:09 PM Marcus Eriksson <marc...@apache.org> wrote:
>>> 
>>>> Hi, we are about to open source our tooling for comparing two 
>>>> Cassandra clusters and want to get some feedback on where to push it. 
>>>> I think the options are (name bike-shedding welcome):
>>>> 
>>>> 1. create repos/asf/cassandra-diff.git
>>>> 2. create a generic repos/asf/cassandra-contrib.git where we can add more
>>>> contributed tools in the future
>>>> 
>>>> Temporary location: 
>>>> https://github.com/krummas/cassandra-diff
>>>> 
>>>> Cassandra-diff is a Spark job that compares the data in two clusters - it
>>>> pages through all partitions and reads all rows for those partitions in
>>>> both clusters to make sure they are identical. Based on the configuration
>>>> variable “reverse_read_probability”, the rows are read in either forward
>>>> or reverse order.
>>>> 
>>>> Our main use case for cassandra-diff has been to set up two identical
>>>> clusters, transfer a snapshot from the cluster we want to test to these
>>>> clusters, and upgrade one side. When that is done we run this tool to make
>>>> sure that 2.1 and 3.0 give the same results. A few examples of the bugs we
>>>> have found using this tool:
>>>> 
>>>> * CASSANDRA-14823: Legacy sstables with range tombstones spanning multiple
>>>> index blocks create invalid bound sequences on 3.0+
>>>> * CASSANDRA-14803: Rows that cross index block boundaries can cause 
>>>> incomplete reverse reads in some cases
>>>> * CASSANDRA-15178: Skipping illegal legacy cells can break reverse 
>>>> iteration of indexed partitions
>>>> 
>>>> /Marcus
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>>> 
>>>> 
>> 
>> 

