Hadoop over Cassandra
Thanks.
But 1) can be overcome with the C* API for the commitlog and memtables, or
with mixed access (direct I/O plus traditional connectors, or pure CQL if
the data model allows; we have experimented with this).
2) is harder to solve in a universal way. In our case C* runs without
replication (RF=1) because of the huge data volume.
If you access the C* sstables directly from those frameworks, you will:
1) miss live data that is still in memory and has not yet been flushed to disk
2) skip the Dynamo layer of C*, which is responsible for data consistency
On 16 Sept 2014 at 10:58, "platon.tema" wrote:
> Hi.
>
> As I see massive data processing tools
Hi.
As I see it, massive data processing tools (map/reduce) for C* data include
these connectors (a minimal usage sketch follows the list):
- Calliope http://tuplejump.github.io/calliope/
- DataStax spark-cassandra-connector
  https://github.com/datastax/spark-cassandra-connector
- Stratio Deep https://github.com/Stratio/stratio-deep
- other free
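For illustration, here is a minimal Java sketch of reading a table through
the DataStax connector's Java API; the contact host, keyspace, and table
names are placeholders, and the exact API may differ between connector
versions.

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

    public class ConnectorExample {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf()
                    .setAppName("cassandra-read-example")
                    .set("spark.cassandra.connection.host", "127.0.0.1"); // placeholder
            JavaSparkContext sc = new JavaSparkContext(conf);
            // Read the table as an RDD of rows and count them; "ks" and
            // "events" are placeholder keyspace/table names.
            long rows = javaFunctions(sc).cassandraTable("ks", "events").count();
            System.out.println("row count: " + rows);
            sc.stop();
        }
    }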
I was able to work around this problem by modifying the
ColumnFamilyRecordReader class from the org.apache.cassandra.hadoop package.
Since the errors were TimedOutExceptions, I added sleep-and-retry logic
around
rows = client.get_range_slices(keyspace,
new ColumnParent(cfName),
predicate,
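The call above is cut off in the archive; here is a condensed sketch of the
sleep-and-retry idea, written against the 0.6-era Thrift signature (the
retry count, sleep interval, and method name are illustrative, not the
poster's actual code):

    import java.util.List;
    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.ColumnParent;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.apache.cassandra.thrift.KeyRange;
    import org.apache.cassandra.thrift.KeySlice;
    import org.apache.cassandra.thrift.SlicePredicate;
    import org.apache.cassandra.thrift.TimedOutException;

    public class RetryingRangeReader {
        private static final int MAX_RETRIES = 5;        // illustrative knob
        private static final long RETRY_SLEEP_MS = 5000; // illustrative knob

        static List<KeySlice> getRangeSlicesWithRetry(
                Cassandra.Client client, String keyspace, String cfName,
                SlicePredicate predicate, KeyRange range, ConsistencyLevel cl)
                throws Exception {
            for (int attempt = 1; ; attempt++) {
                try {
                    // The same call as above, retried on server-side timeouts.
                    return client.get_range_slices(keyspace, new ColumnParent(cfName),
                                                   predicate, range, cl);
                } catch (TimedOutException e) {
                    if (attempt >= MAX_RETRIES)
                        throw e;                  // give up after a few attempts
                    Thread.sleep(RETRY_SLEEP_MS); // back off before retrying
                }
            }
        }
    }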
The cassandra logs, strangely, show no errors at the time of failure.
Changing the RpcTimeoutInMillis value seemed to help. Though it slowed the
job down considerably, the job seems to finish after raising the timeout to
1 minute. Unfortunately, I cannot be sure it will continue to work if the
data in
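For reference, in 0.6 that timeout lives in storage-conf.xml; raising it to
the 1 minute mentioned above would look roughly like this (element name
recalled from the 0.6 config format, so double-check against your own file):

    <!-- storage-conf.xml: how long (in ms) to wait on replicas before timing out -->
    <RpcTimeoutInMillis>60000</RpcTimeoutInMillis>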
On Jan 12, 2011, at 12:40 PM, Jairam Chandar wrote:
> Hi folks,
>
> We have a Cassandra 0.6.6 cluster running in production. We want to run
> Hadoop (version 0.20.2) jobs over this cluster in order to generate reports.
> I modified the word_count example in the contrib folder of the cassandra
On Wed, 2011-01-12 at 23:04 +0100, mck wrote:
> > Caused by: TimedOutException()
>
> What is the exception in the cassandra logs?
Or have you tried increasing rpc_timeout_in_ms?
~mck
--
"When there is no enemy within, the enemies outside can't hurt you."
African proverb | www.semb.wever.org | www.sesat.no
On Wed, 2011-01-12 at 18:40 +, Jairam Chandar wrote:
> Caused by: TimedOutException()
What is the exception in the cassandra logs?
~mck
--
"Don't use Outlook. Outlook is really just a security hole with a small
e-mail client attached to it." Brian Trosko | www.semb.wever.org |
www.sesat.no
What's happening in the cassandra server logs when you get these errors?
Reading through the 0.6.6 hadoop code, it looks like it creates a thrift
client with an infinite timeout. So it may be an internode timeout, which
is set in storage-conf.xml.

Aaron

On 13 Jan, 2011, at 07:40 AM, Jairam Chandar wrote:
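For comparison with the infinite default Aaron describes, a minimal sketch
of opening a Thrift client with a finite socket timeout (host and port are
placeholders):

    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransportException;

    public class TimeoutClientFactory {
        // Open a client whose socket gives up after timeoutMs rather than
        // blocking forever (a timeout of 0 means "wait indefinitely").
        static Cassandra.Client open(String host, int port, int timeoutMs)
                throws TTransportException {
            TSocket socket = new TSocket(host, port, timeoutMs);
            socket.open();
            return new Cassandra.Client(new TBinaryProtocol(socket));
        }
    }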
Hi folks,
We have a Cassandra 0.6.6 cluster running in production. We want to run
Hadoop (version 0.20.2) jobs over this cluster in order to generate
reports.
I modified the word_count example in the contrib folder of the cassandra
distribution. While the program is running fine for small datasets
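For context, the job wiring in that word_count example looks roughly like
this (reconstructed from memory of the 0.6-era contrib code; ConfigHelper's
methods changed between releases, so treat it as a sketch):

    import java.util.Arrays;
    import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
    import org.apache.cassandra.hadoop.ConfigHelper;
    import org.apache.cassandra.thrift.SlicePredicate;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class WordCountSetup {
        static Job buildJob(Configuration conf) throws Exception {
            Job job = new Job(conf, "wordcount");
            // Read input splits straight from the column family instead of HDFS.
            job.setInputFormatClass(ColumnFamilyInputFormat.class);
            ConfigHelper.setColumnFamily(job.getConfiguration(), "Keyspace1", "Standard1");
            // Fetch only the column the mapper tokenizes ("text" is the
            // example's column name).
            SlicePredicate predicate = new SlicePredicate();
            predicate.setColumn_names(Arrays.asList("text".getBytes()));
            ConfigHelper.setSlicePredicate(job.getConfiguration(), predicate);
            return job;
        }
    }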
On Tue, May 18, 2010 at 9:40 PM, Mark Schnitzius
wrote:
>> If anyone has "war stories" on the topic of Cassandra & Hadoop (or
>> even just Hadoop in general) let me know.
>
> Don't know if it counts as a war story, but I was successful recently in
> implementing something I got advice on in an earlier
>
> If anyone has "war stories" on the topic of Cassandra & Hadoop (or
> even just Hadoop in general) let me know.
Don't know if it counts as a war story, but I was successful recently in
implementing something I got advice on in an earlier thread, namely feeding
both a Cassandra table and a Hadoop
> From: "Maxim Grinev"
> Sent: Tuesday, May 18, 2010 2:42am
> To: user@cassandra.apache.org
> Subject: Re: Hadoop over Cassandra
>
> On Tue, May 18, 2010 at 2:23 AM, Jonathan Ellis wrote:
>
>> On Mon, May 17, 2010 at 4:12 PM, Vick Khera wrote:
>> > On Mon,
: "Maxim Grinev"
Sent: Tuesday, May 18, 2010 2:42am
To: user@cassandra.apache.org
Subject: Re: Hadoop over Cassandra
On Tue, May 18, 2010 at 2:23 AM, Jonathan Ellis wrote:
> On Mon, May 17, 2010 at 4:12 PM, Vick Khera wrote:
> > On Mon, May 17, 2010 at 3:46 PM, Jonathan Ellis
>
Maxim,
Check out the getLocation() method from this file:
http://svn.apache.org/repos/asf/cassandra/trunk/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordReader.java
Basically, it loops over the list of nodes containing this split of
data and if any of them are the local node, it returns
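In outline, that loop has roughly this shape (a simplified sketch of the
behavior described above, not the class's exact code; details differ across
versions):

    import java.io.IOException;
    import java.net.InetAddress;
    import java.net.NetworkInterface;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.Enumeration;
    import java.util.List;

    public class SplitLocality {
        // Prefer a replica of this split that is also a local address, so
        // the map task reads from its own node when it can.
        static String pickLocation(String[] replicaHosts) throws IOException {
            List<InetAddress> local = new ArrayList<InetAddress>();
            for (Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
                 nics.hasMoreElements();) {
                local.addAll(Collections.list(nics.nextElement().getInetAddresses()));
            }
            for (String host : replicaHosts)
                for (InetAddress addr : local)
                    if (addr.equals(InetAddress.getByName(host)))
                        return host;    // a replica lives on this machine
            return replicaHosts[0];     // no local replica: fall back to any
        }
    }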
On Tue, May 18, 2010 at 2:23 AM, Jonathan Ellis wrote:
> On Mon, May 17, 2010 at 4:12 PM, Vick Khera wrote:
> > On Mon, May 17, 2010 at 3:46 PM, Jonathan Ellis
> wrote:
> >> Moving to the user@ list.
> >>
> >> http://wiki.apache.org/cassandra/HadoopSupport should be useful.
> >
> > That document
On Mon, May 17, 2010 at 4:12 PM, Vick Khera wrote:
> On Mon, May 17, 2010 at 3:46 PM, Jonathan Ellis wrote:
>> Moving to the user@ list.
>>
>> http://wiki.apache.org/cassandra/HadoopSupport should be useful.
>
> That document doesn't really answer the "is data locality preserved"
> when running the
On Mon, May 17, 2010 at 3:46 PM, Jonathan Ellis wrote:
> Moving to the user@ list.
>
> http://wiki.apache.org/cassandra/HadoopSupport should be useful.
That document doesn't really answer whether data locality is preserved
when running the map phase, but my hunch is "no".
>
> On Mon, May 17, 2010
Moving to the user@ list.
http://wiki.apache.org/cassandra/HadoopSupport should be useful.
On Mon, May 17, 2010 at 2:41 PM, Yan Virin wrote:
> Hi,
> Can someone explain how this works? As far as I know, there is no execution
> engine in Cassandra alone, so I assume that Hadoop gives the MapReduce