Re: inconsistent hadoop/cassandra results

aaron morton Thu, 10 Jan 2013 19:46:19 -0800

> But this is the first time I've tried to use the wide-row support, which
> makes me a little suspicious. The wide-row support is not very well
> documented, so maybe I'm doing something wrong there in ignorance.
This was the area I was thinking about.


Can you drill in and see a pattern?
Are the differences in rows that would be paged by the wide-row support?
Could it be an off-by-one error in the wide-row paging?
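
To make the off-by-one idea concrete, here is a minimal sketch of how a paged
column count can drift. It is not Cassandra's actual record reader: the class,
the in-memory TreeSet "row" and the skipRepeatedColumn flag are all hypothetical.
It only assumes that each page is fetched with an inclusive start column, so
every page after the first begins with the last column of the previous page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical illustration only; this is NOT Cassandra's paging code.
class WideRowPagingSketch {

    // Returns up to pageSize column names >= start (inclusive start).
    // A null start means "from the beginning of the row".
    static List<String> slice(TreeSet<String> row, String start, int pageSize) {
        NavigableSet<String> view = (start == null) ? row : row.tailSet(start, true);
        List<String> page = new ArrayList<>();
        for (String name : view) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(name);
        }
        return page;
    }

    // Counts the columns of one row page by page. pageSize must be >= 2,
    // otherwise a continuation page can never advance past its start column.
    static int countColumns(TreeSet<String> row, int pageSize, boolean skipRepeatedColumn) {
        int count = 0;
        String start = null;
        while (true) {
            List<String> page = slice(row, start, pageSize);
            int fresh = page.size();
            // On continuation pages the first column was already counted.
            if (start != null && skipRepeatedColumn && fresh > 0) {
                fresh -= 1;
            }
            count += fresh;
            if (page.size() < pageSize) {
                break; // short page: end of row
            }
            start = page.get(page.size() - 1);
        }
        return count;
    }
}
```

With a 10-column row and a page size of 4, the version that skips the repeated
column counts 10, while the version that forgets to skip it counts 13. An error
of this kind would only show up in rows large enough to be paged.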

It all sounds strange, so I would make sure that what your job is outputting
matches what it is reading from C*. Maybe add some logging in there.
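
One way to look for a pattern is to compare two output files key by key rather
than with a plain diff. The sketch below is only an illustration: the class
name and report format are made up, and it assumes each output line is
"rowkey<TAB>count", which may not match what your reducer writes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper; assumes "rowkey<TAB>count" lines in both files.
class OutputDiffSketch {

    // Parses lines into a sorted map of row key -> column count.
    static Map<String, Long> parse(List<String> lines) {
        Map<String, Long> counts = new TreeMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            String[] parts = line.split("\t");
            counts.put(parts[0], Long.parseLong(parts[1].trim()));
        }
        return counts;
    }

    // Reports keys missing from one run and keys whose counts differ.
    static List<String> diff(Map<String, Long> a, Map<String, Long> b) {
        List<String> report = new ArrayList<>();
        TreeSet<String> keys = new TreeSet<>(a.keySet());
        keys.addAll(b.keySet());
        for (String key : keys) {
            Long ca = a.get(key);
            Long cb = b.get(key);
            if (cb == null) {
                report.add("ONLY_IN_A " + key + " " + ca);
            } else if (ca == null) {
                report.add("ONLY_IN_B " + key + " " + cb);
            } else if (!ca.equals(cb)) {
                report.add("DIFF " + key + " " + ca + " " + cb + " delta=" + (cb - ca));
            }
        }
        return report;
    }

    // Usage: java OutputDiffSketch run1.txt run2.txt
    public static void main(String[] args) throws IOException {
        Map<String, Long> a = parse(Files.readAllLines(Paths.get(args[0])));
        Map<String, Long> b = parse(Files.readAllLines(Paths.get(args[1])));
        for (String line : diff(a, b)) {
            System.out.println(line);
        }
    }
}
```

If the DIFF lines cluster on the very wide rows, or the deltas are small and
regular, that would point at the paging rather than at the replicas.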

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 10/01/2013, at 1:24 AM, Brian Jeltema <brian.jelt...@digitalenvoy.net> wrote:

> Sorry if this is a duplicate - I was having mailer problems last night:
> 
>> Assuming there were no further writes, running repair or using CL ALL should
>> have fixed it.
>> 
>> Can you describe the inconsistency between runs? 
> 
> Sure. The job output is generated by a single reducer and consists of a list
> of key/value pairs where the key is the row key of the original table, and
> the value is the total count of all columns in the row. Each run produces a
> file with a different size, and running a diff against various output file
> pairs displays rows that only appear in one file, or rows with the same key
> but different counts.
> 
> What seems particularly hard to explain is the behavior after setting CL to
> ALL, where the results eventually become reproducible (making it hard to
> place the blame on my trivial mapper/reducer implementations) but only after
> about half a dozen runs. And once reaching this state, setting CL to QUORUM
> results in additional inconsistent results.
> 
> I can say with certainty that there were no other writes. I'm the sole
> developer working with the CF in question. I haven't seen behavior like this
> before, though I don't have a tremendous amount of experience. But this is
> the first time I've tried to use the wide-row support, which makes me a
> little suspicious. The wide-row support is not very well documented, so maybe
> I'm doing something wrong there in ignorance.
> 
> Brian
> 
>> 
>> Cheers
>> 
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> New Zealand
>> 
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 8/01/2013, at 2:16 AM, Brian Jeltema <brian.jelt...@digitalenvoy.net> wrote:
>> 
>>> I need some help understanding unexpected behavior I saw in some recent 
>>> experiments with Cassandra 1.1.5 and Hadoop 1.0.3:
>>> 
>>> I've written a small map/reduce job that simply counts the number of 
>>> columns in each row of a static CF (call it Foo) 
>>> and generates a list of every row and column count. A relatively small 
>>> fraction of the rows have a large number
>>> of columns; worst case is approximately 36 million. So when I set up the 
>>> job, I used wide-row support:
>>> 
>>>     // where WIDE_ROWS == true
>>>     ConfigHelper.setInputColumnFamily(job.getConfiguration(), "fooKS", "Foo", WIDE_ROWS);
>>> 
>>> When I ran this job using the default CL (1) I noticed that the results 
>>> varied from run to run, which I attributed to inconsistent
>>> replicas, since Foo was generated with CL == 1 and the RF == 3. 
>>> 
>>> So I ran repair for that CF on every node. The cassandra log on every node 
>>> contains lines similar to:
>>> 
>>>   INFO [AntiEntropyStage:1] 2013-01-05 20:38:48,605 AntiEntropyService.java (line 778) [repair #e4a1d7f0-579d-11e2-0000-d64e0a75e6df] Foo is fully synced
>>> 
>>> However, repeated runs were still inconsistent. Then I set CL to ALL, which 
>>> I presumed would always result in identical
>>> output, but repeated runs initially continued to be inconsistent. However, 
>>> I noticed that the results seemed to
>>> be converging, and after several runs (somewhere between 4 and 6) I finally 
>>> was producing identical results on every run.
>>> Then I set CL to QUORUM, and again generated inconsistent results.
>>> 
>>> Does this behavior make sense?
>>> 
>>> Brian
>> 
>
