HBase has the same problem. Your choices are basically (a) figure out a way to not do all writes sequentially, or (b) figure out a way to model w/o OPP. Most Cassandra users go with option (b).
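Concretely, "model w/o OPP" usually means: keep RandomPartitioner, put the ordered dimension into column names (columns within a row are always stored sorted by the comparator), and bucket row keys so sequential writes hash to different nodes. A minimal sketch of that idea -- the class name and the day-bucket scheme are just an illustration, not anything from this thread:

    // Illustrative only: one common way to keep time-ordered data
    // readable under RandomPartitioner instead of OPP.
    public final class BucketedKeys {
        private static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

        // Row key = entity id + coarse time bucket. RandomPartitioner
        // hashes these keys, so consecutive days land on different nodes
        // instead of all sequential writes hitting one replica set
        // (the OPP hot spot).
        public static String rowKey(String entityId, long timestampMillis) {
            return entityId + ":" + (timestampMillis / MILLIS_PER_DAY);
        }

        // Column name = fine-grained timestamp. With a LongType
        // comparator, columns inside each bucket row stay sorted, so a
        // column slice returns events in time order without OPP.
        public static long columnName(long timestampMillis) {
            return timestampMillis;
        }
    }

A ranged read then becomes: enumerate the buckets covering the time range and slice the columns of each bucket row.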
On Thu, May 20, 2010 at 8:21 AM, Sonny Heer <sonnyh...@gmail.com> wrote:
> Yes, I'm using OPP, because of the way we modeled our data. Does
> Cassandra not handle OPP-intensive write operations? Is HBase a
> better approach if one must use OPP?
>
> On Thu, May 20, 2010 at 7:41 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>> Are you using OPP? That will tend to create hot spots like this,
>> which is why most people deploy on RP.
>>
>> If you are using RP, you may simply need to add C* capacity, or take
>> TimeoutException as a signal to throttle your activity.
>>
>> On Tue, May 18, 2010 at 4:37 PM, Sonny Heer <sonnyh...@gmail.com> wrote:
>>> Yeah, there are many writes happening at the same time to any given
>>> Cassandra node.
>>>
>>> E.g., assume 10 machines, all running Hadoop and Cassandra. The Hadoop
>>> nodes are randomly picking a Cassandra node and writing directly using
>>> batch_mutate.
>>>
>>> After increasing the timeout even more, I don't get that exception
>>> anymore, but now I'm getting UnavailableException.
>>>
>>> The wiki states this happens when not all the replicas required could
>>> be created and/or read. How do we resolve this problem? The write
>>> consistency is ONE.
>>>
>>> Thanks
>>>
>>> On Sat, May 15, 2010 at 8:02 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>>> RpcTimeout should be sufficient.
>>>>
>>>> You can turn on debug logging to see how long it's actually taking the
>>>> destination node to do the write (or look at cfstats, if no other
>>>> writes are going on).
>>>>
>>>> On Fri, May 14, 2010 at 11:55 AM, Sonny Heer <sonnyh...@gmail.com> wrote:
>>>>> Hey,
>>>>>
>>>>> I'm running a map/reduce job, reading from an HDFS directory and
>>>>> reducing to Cassandra using the batch_mutate method.
>>>>>
>>>>> The reducer builds the list of row mutations for a single row and
>>>>> calls batch_mutate at the end. As I move to a larger dataset, I'm
>>>>> seeing the following exception:
>>>>>
>>>>> Caused by: TimedOutException()
>>>>>     at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:15361)
>>>>>     at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:796)
>>>>>     at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:772)
>>>>>
>>>>> I changed RpcTimeoutInMillis to 60 seconds with no change. What
>>>>> configuration changes should I make when doing intensive write
>>>>> operations using batch_mutate?

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
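P.S. Since the advice above is to treat TimeoutException as a throttle signal rather than raising RpcTimeoutInMillis further, here is a rough sketch of what that can look like on the client side. The ThrottledWriter helper, its parameters, and the backoff constants are my own illustration, not part of Cassandra; the exception types are the Thrift-generated ones from the stack trace above.

    import org.apache.cassandra.thrift.TimedOutException;
    import org.apache.cassandra.thrift.UnavailableException;

    // Hypothetical wrapper around a Thrift write such as batch_mutate.
    // Instead of only raising the server-side rpc timeout, it backs off
    // and retries when the cluster signals overload.
    public final class ThrottledWriter {

        /** Any write operation, e.g. a call to client.batch_mutate(...). */
        public interface Write {
            void run() throws Exception;
        }

        public static void writeWithBackoff(Write write, int maxAttempts)
                throws Exception {
            long sleepMillis = 100;
            for (int attempt = 1; ; attempt++) {
                try {
                    write.run();
                    return;
                } catch (TimedOutException e) {
                    // Replicas exist but did not ack within the rpc
                    // timeout: the cluster is behind, so slow down
                    // instead of retrying at full speed.
                    backoffOrRethrow(e, attempt, maxAttempts, sleepMillis);
                } catch (UnavailableException e) {
                    // Too few replicas were reachable to satisfy the
                    // requested ConsistencyLevel (even ONE); also worth
                    // a pause and retry.
                    backoffOrRethrow(e, attempt, maxAttempts, sleepMillis);
                }
                sleepMillis = Math.min(sleepMillis * 2, 10000);
            }
        }

        private static void backoffOrRethrow(Exception e, int attempt,
                int maxAttempts, long sleepMillis) throws Exception {
            if (attempt >= maxAttempts) {
                throw e; // give up after maxAttempts tries
            }
            Thread.sleep(sleepMillis);
        }
    }

With the reducer described in this thread, each batch_mutate call would go through writeWithBackoff, so a loaded node slows the job down instead of failing it outright.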