Does anyone have a feel for how well M/R operations perform when
backed by Cassandra as opposed to HDFS, in terms of network utilization
and the volume of data being pushed around?

jesse

--
jesse mcconnell
jesse.mcconn...@gmail.com



On Fri, May 7, 2010 at 08:54, Ian Kallen <spidaman.l...@gmail.com> wrote:
> On 5/6/10 3:26 PM, Stu Hood wrote:
>>
>> Ian: I think that as get_range_slice gets faster, the approach that Mark
>> was heading toward may be considerably more efficient than reading the old
>> value in the OutputFormat.
>>
>
> Interesting. I'm trying to understand the performance impact of the
> different ways to do this. Under Mark's approach, the prior values are
> pulled out of Cassandra in the mapper in bulk, then merged and written back
> to Cassandra in the reducer, and the get_range_slice is faster than the
> individual row fetches that my approach does in the reducer. Is that what
> you mean, or are you referring to something else?
> thanks!
> -Ian
>
> --
> Ian Kallen
> blog: http://www.arachna.com/roller/spidaman
> tweetz: http://twitter.com/spidaman
> vox: 925.385.8426
>
>
>
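
To make the two approaches in the quoted message concrete, here is a
minimal sketch in Python. It is a toy model, not Cassandra or Hadoop code:
FakeStore, merge_per_row and merge_bulk are hypothetical names, and
range_slice only imitates the paging style of get_range_slice. It counts
round trips and rows read so the trade-off between the two is visible.

```python
class FakeStore:
    """Hypothetical in-memory stand-in for a Cassandra column family."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.round_trips = 0  # number of simulated client calls
        self.rows_read = 0    # number of rows shipped back to the client

    def get(self, key):
        # Models one single-row fetch (the reducer-side read).
        self.round_trips += 1
        self.rows_read += 1
        return self.rows.get(key, 0)

    def range_slice(self, start, count):
        # Models one page of a range scan: up to `count` rows with key >= start.
        self.round_trips += 1
        keys = sorted(k for k in self.rows if k >= start)[:count]
        self.rows_read += len(keys)
        return [(k, self.rows[k]) for k in keys]

    def put(self, key, value):
        self.round_trips += 1
        self.rows[key] = value


def merge_per_row(store, new_counts):
    """Fetch each prior value individually, add the new count, write back."""
    for key, delta in new_counts.items():
        store.put(key, store.get(key) + delta)


def merge_bulk(store, new_counts, page_size=100):
    """Page through all prior values in bulk, merge in memory, write back."""
    merged = dict(new_counts)
    start = ""
    while True:
        page = store.range_slice(start, page_size)
        for key, old in page:
            if key in merged:
                merged[key] += old
        if len(page) < page_size:
            break
        start = page[-1][0] + "\0"  # resume just after the last key seen
    for key, value in merged.items():
        store.put(key, value)


if __name__ == "__main__":
    rows = {f"k{i:03d}": i for i in range(250)}
    updates = {f"k{i:03d}": 1 for i in range(0, 250, 5)}
    for merge in (merge_per_row, merge_bulk):
        store = FakeStore(rows)
        merge(store, updates)
        print(merge.__name__, store.round_trips, store.rows_read)
```

In this toy setup (250 stored rows, 50 updated keys, pages of 100 rows) the
per-row merge makes 100 round trips and reads 50 rows, while the bulk merge
makes 53 round trips but reads all 250 rows. Which one wins on a real
cluster depends on how much of the data set each job touches, which the
sketch does not try to measure.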
