Re: Anyone using hadoop/MapReduce integration currently?

朱蓝天 Wed, 26 May 2010 21:05:14 -0700

2010/5/26 Utku Can Topçu <u...@topcu.gen.tr>

> Hi Jeremy,
>
>
> > Why are you using Cassandra versus using data stored in HDFS or HBase?
> - I'm thinking of using it for realtime streaming of user data. While
> streaming the requests, I'm also using Lucandra for indexing the data in
> realtime. It's a better option when you compare it with HBase or the native
> HDFS flat files, because of low latency in writes.



I'm interested in realtime indexing with Lucandra, but how do you intersect
the posting lists of multiple terms with Cassandra? If the lists have to be
fetched over the network, I think it would be very inefficient.
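
To make the concern concrete: a multi-term query needs the intersection of
the per-term posting lists. Below is a minimal sketch, assuming each term's
posting list has already been pulled to the client as a sorted list of
document ids. The function names are hypothetical and this is not Lucandra's
actual code; it only shows the merge step that would follow the network
fetches.

```python
def intersect_postings(a, b):
    """Intersect two sorted lists of document ids with a linear merge."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def intersect_all(postings):
    """Intersect several posting lists, starting with the shortest ones."""
    if not postings:
        return []
    # Processing short lists first keeps the intermediate result small.
    ordered = sorted(postings, key=len)
    result = ordered[0]
    for plist in ordered[1:]:
        result = intersect_postings(result, plist)
        if not result:
            break  # no document contains every term
    return result
```

The merge itself is linear in the list lengths, so the cost I am worried
about is transferring every full posting list before it can start.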

>
>
> > Is there anything holding you back from using it (if you would like to
> use it but currently cannot)?
>
> My answer to this would be:
> - The current integration only supports using the whole range of the CF as
> input for the map phase. It would be much better if the InputFormat
> supported a KeyRange.
>
> Best Regards,
> Utku
>
>
> On Tue, May 25, 2010 at 6:35 PM, Jeremy Hanna
> <jeremy.hanna1...@gmail.com> wrote:
>
>> I'll be doing a presentation on Cassandra's (0.6+) hadoop integration next
>> week. Is anyone currently using MapReduce or the initial Pig integration?
>>
>> (If you're unaware of such integration, see
>> http://wiki.apache.org/cassandra/HadoopSupport)
>>
>> If so, could you post to this thread on how you're using it or planning on
>> using it (if not covered by the shroud of secrecy)?
>>
>> e.g.
>> What is the use case?
>>
>> Why are you using Cassandra versus using data stored in HDFS or HBase?
>>
>> Are you using a separate Hadoop cluster to run the MR jobs on, or perhaps
>> are you running the Job Tracker and Task Trackers on Cassandra nodes?
>>
>> Is there anything holding you back from using it (if you would like to use
>> it but currently cannot)?
>>
>> Thanks!
>
>
>
