Right, that's sort of a half-repair: it will repair differences in the replies it got, but it won't double-check MD5s on the rest of the replicas in the background. So if you're doing CL.ONE reads this is a no-op.
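A minimal sketch of that "half-repair" idea, with hypothetical names (the real 0.6 path is ReadResponseResolver.maybeScheduleRepairs, quoted below): the coordinator only compares digests of the replies it actually received, so replicas it never contacted (e.g. the other two at CL.ONE) are never checked or repaired.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.*;

// Hypothetical sketch, not Cassandra's actual code: repair only the
// replicas whose reply digest differs from the resolved value.
public class RepairSketch {
    static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e); // MD5 is always available
        }
    }

    // Returns the indices of the replicas that replied with a stale version.
    // Replicas that never replied (not in the list) are invisible here --
    // that is the "half" in half-repair.
    static Set<Integer> replicasToRepair(List<byte[]> replies, byte[] resolved) {
        byte[] want = md5(resolved);
        Set<Integer> stale = new HashSet<>();
        for (int i = 0; i < replies.size(); i++) {
            if (!Arrays.equals(md5(replies.get(i)), want)) {
                stale.add(i); // schedule a repair write to this replica
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        byte[] current = "value-v2".getBytes();
        byte[] old = "value-v1".getBytes();
        // Two replicas replied; at CL.ONE only one would, and this set
        // would always be empty -- hence "no-op".
        System.out.println(replicasToRepair(Arrays.asList(current, old), current));
    }
}
```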
On Sat, May 7, 2011 at 4:25 PM, aaron morton <aa...@thelastpickle.com> wrote:
> I remembered something like that so had a look at
> RangeSliceResponseResolver.resolve() in 0.6.12 and it looks like it
> schedules the repairs...
>
>     protected Row getReduced()
>     {
>         ColumnFamily resolved =
>             ReadResponseResolver.resolveSuperset(versions);
>         ReadResponseResolver.maybeScheduleRepairs(resolved, table,
>             key, versions, versionSources);
>         versions.clear();
>         versionSources.clear();
>         return new Row(key, resolved);
>     }
>
> Is that right?
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 8 May 2011, at 00:48, Jonathan Ellis wrote:
>
>> range_slices respects consistencylevel, but only single-row reads and
>> multiget do the *repair* part of RR.
>>
>> On Sat, May 7, 2011 at 1:44 AM, aaron morton <aa...@thelastpickle.com> wrote:
>>> get_range_slices() does read repair if enabled (checked
>>> DoConsistencyChecksBoolean in the config, it's on by default) so you should
>>> be getting good reads. If you want belt-and-braces, run nodetool repair
>>> first.
>>>
>>> Hope that helps.
>>>
>>> On 7 May 2011, at 11:46, Jeremy Hanna wrote:
>>>
>>>> Great! I just wanted to make sure you were getting the information you
>>>> needed.
>>>>
>>>> On May 6, 2011, at 6:42 PM, Henrik Schröder wrote:
>>>>
>>>>> Well, I already completed the migration program. Using get_range_slices I
>>>>> could migrate a few thousand rows per second, which means that migrating
>>>>> all of our data would take a few minutes, and we'll end up with pristine
>>>>> datafiles for the new cluster. Problem solved!
>>>>>
>>>>> I'll see if I can create datafiles in 0.6 that are uncleanable in 0.7 so
>>>>> that you all can repeat this and hopefully fix it.
>>>>>
>>>>> /Henrik Schröder
>>>>>
>>>>> On Sat, May 7, 2011 at 00:35, Jeremy Hanna <jeremy.hanna1...@gmail.com> wrote:
>>>>> If you're able, go into the #cassandra channel on freenode (IRC) and talk
>>>>> to driftx or jbellis or aaron_morton about your problem. It could be
>>>>> that you don't have to do all of this based on a conversation there.
>>>>>
>>>>> On May 6, 2011, at 5:04 AM, Henrik Schröder wrote:
>>>>>
>>>>>> I'll see if I can make some example broken files this weekend.
>>>>>>
>>>>>> /Henrik Schröder
>>>>>>
>>>>>> On Fri, May 6, 2011 at 02:10, aaron morton <aa...@thelastpickle.com> wrote:
>>>>>> The difficulty is the different Thrift clients between 0.6 and 0.7.
>>>>>>
>>>>>> If you want to roll your own solution, I would consider:
>>>>>> - Write an app to talk to 0.6 and pull out the data using keys from the
>>>>>>   other system (so you can check referential integrity while you are
>>>>>>   at it). Dump the data to flat files.
>>>>>> - Write an app to talk to 0.7 to load the data back in.
>>>>>>
>>>>>> I've not given up digging on your migration problem; having to manually
>>>>>> dump and reload when you've done nothing wrong is not the best solution.
>>>>>> I'll try to find some time this weekend to test with:
>>>>>>
>>>>>> - 0.6 server, random partitioner, standard CFs, byte columns
>>>>>> - load with Python or the cli on OS X or Ubuntu (don't have a Windows
>>>>>>   machine any more)
>>>>>> - migrate and see what's going on.
>>>>>>
>>>>>> If you can spare some sample data to load, please send it over on the
>>>>>> user group or to my email address.
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> -----------------
>>>>>> Aaron Morton
>>>>>> Freelance Cassandra Developer
>>>>>> @aaronmorton
>>>>>> http://www.thelastpickle.com
>>>>>>
>>>>>> On 6 May 2011, at 05:52, Henrik Schröder wrote:
>>>>>>
>>>>>>> We can't do a straight upgrade from 0.6.13 to 0.7.5 because we have
>>>>>>> rows stored that have Unicode keys, and Cassandra 0.7.5 thinks those
>>>>>>> rows in the sstables are corrupt, and it seems impossible to clean them
>>>>>>> up without losing data.
>>>>>>>
>>>>>>> However, we can still read all rows perfectly via Thrift, so we are now
>>>>>>> looking at building a simple tool that will copy all rows from our
>>>>>>> 0.6.3 cluster to a parallel 0.7.5 cluster. Our question is now how to
>>>>>>> do that and ensure that we actually get all rows migrated. It's a
>>>>>>> pretty small cluster: 3 machines, a single keyspace, a single
>>>>>>> columnfamily, ~2 million rows, a few GB of data, and a replication
>>>>>>> factor of 3.
>>>>>>>
>>>>>>> So what's the best way? Call get_range_slices and move through the
>>>>>>> entire token space? We also have all row keys in a secondary system;
>>>>>>> would it be better to use that and make calls to get_multi or
>>>>>>> get_multi_slices instead? Are we correct in assuming that if we use
>>>>>>> consistency level ALL we'll get all rows?
>>>>>>>
>>>>>>> /Henrik Schröder
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
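The paging pattern Henrik describes ("call get_range_slices and move through the entire token space") can be sketched as below. This is a simplified, hypothetical illustration: fetchPage stands in for the real Thrift get_range_slices call, it orders keys lexically rather than by random-partitioner token, and the writes to the 0.7 cluster are elided. The important detail it shows is resuming each page from the last key of the previous one and skipping that duplicated boundary row.

```java
import java.util.*;

// Hypothetical sketch of a get_range_slices migration loop.
// fetchPage simulates the server side; a real tool would call the
// Thrift API against the 0.6 cluster and write rows into the 0.7 one.
public class RangePager {
    static final int PAGE = 3;
    static final List<String> store = Arrays.asList("a", "b", "c", "d", "e", "f", "g");

    // Simulated server: up to 'count' rows with key >= start.
    static List<String> fetchPage(String start, int count) {
        List<String> page = new ArrayList<>();
        for (String k : store) {
            if (k.compareTo(start) >= 0 && page.size() < count) page.add(k);
        }
        return page;
    }

    static List<String> migrateAll() {
        List<String> migrated = new ArrayList<>();
        String start = "";
        while (true) {
            List<String> page = fetchPage(start, PAGE);
            for (String key : page) {
                // The first row of each page after the first repeats the
                // previous page's last row -- skip it.
                if (!migrated.isEmpty() && key.equals(start)) continue;
                migrated.add(key); // a real tool would write the row to the new cluster here
            }
            if (page.size() < PAGE) break;     // short page: end of the range
            start = page.get(page.size() - 1); // resume from the last key seen
        }
        return migrated;
    }

    public static void main(String[] args) {
        System.out.println(migrateAll()); // every key, each exactly once
    }
}
```

Reading at consistency level ALL, as Henrik suggests, forces every replica to answer, so with RF=3 and all three nodes up the loop should see every row; the trade-off is that the migration fails if any node is down.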