Can you create a ticket?

On Fri, Apr 30, 2010 at 4:55 PM, Joost Ouwerkerk <jo...@openplaces.org> wrote:
> There's a bug in ColumnFamilyRecordReader that appears when processing
> a single split. When the start and end tokens of the split are equal,
> duplicate rows can be returned.
>
> Example with 5 rows:
> token (start and end) = 53193025635115934196771903670925341736
>
> Tokens returned by first get_range_slices iteration:
> 16955237001963240173058271559858726497
> 40670782773005619916245995581909898190
> 99079589977253916124855502156832923443
> 144992942750327304334463589818972416113
> 166860289390734216023086131251507064403
>
> Tokens returned by next iteration (first token is last token from
> previous, end token is unchanged)
> 16955237001963240173058271559858726497
> 40670782773005619916245995581909898190
>
> Tokens returned by final iteration (first token is last token from
> previous, end token is unchanged)
> [] (empty)
>
> In this example, the mapper has processed 7 rows in total, 2 of which
> were duplicates.
>
> Joost.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
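
For anyone filing the ticket, the quoted report can be reproduced with a small toy model. This is only a sketch under stated assumptions, not Cassandra's code: range_slice() and read_split() are hypothetical stand-ins for get_range_slices and the ColumnFamilyRecordReader paging loop, and the rule that an equal start and end token selects the whole ring in ascending token order is assumed purely because it matches the first iteration in the report.

```python
# Row tokens from the report, kept in ascending order.
RING = sorted([
    16955237001963240173058271559858726497,
    40670782773005619916245995581909898190,
    99079589977253916124855502156832923443,
    144992942750327304334463589818972416113,
    166860289390734216023086131251507064403,
])

# The split's start and end token (equal, as in the report).
SPLIT_TOKEN = 53193025635115934196771903670925341736


def range_slice(start, end):
    """Hypothetical stand-in for get_range_slices over (start, end]."""
    if start == end:
        # Assumption: equal tokens mean "the whole ring", in ascending order,
        # which is what the first iteration in the report shows.
        return list(RING)
    if start < end:
        return [t for t in RING if start < t <= end]
    # start > end: the range wraps around the end of the ring.
    return [t for t in RING if t > start] + [t for t in RING if t <= end]


def read_split(start, end):
    """Hypothetical paging loop: restart from the last token seen until empty."""
    rows = []
    while True:
        batch = range_slice(start, end)
        if not batch:
            return rows
        rows.extend(batch)
        start = batch[-1]  # end token is left unchanged


rows = read_split(SPLIT_TOKEN, SPLIT_TOKEN)
duplicates = len(rows) - len(set(rows))
print(len(rows), "rows processed,", duplicates, "duplicates")
```

In this model the second query runs from the highest token back to the unchanged end token, so it wraps around the ring and returns the two lowest tokens again. That gives the 7 rows and 2 duplicates described in the report.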