Re: Regarding Cassandra Scalability

2010-04-19 Thread dir dir
Hi Paul, I am under no pressure to build software using Cassandra right now. I am studying and exploring Cassandra, so I am very curious about it. OK, I will continue my study and wait for better documentation. Dir. On Mon, Apr 19, 2010 at 1:44 PM, Paul Prescod wrote: > On S

cassandra monitoring

2010-04-19 Thread Simeonov, Daniel
Hi, What is the preferred way of monitoring Cassandra clusters? Is Cassandra integrated with Ganglia? Thank you very much! Best regards, Daniel.

0.6 insert performance .... Re: [RELEASE] 0.6.1

2010-04-19 Thread Masood Mortazavi
I wonder if anyone can use: * Add logging of GC activity (CASSANDRA-813) to confirm this: http://www.slideshare.net/schubertzhang/cassandra-060-insert-throughput - m. On Sun, Apr 18, 2010 at 6:58 PM, Eric Evans wrote: > > Hot on the trails of 0.6.0 comes our latest, 0.6.1. This stable point

RE: 0.6 insert performance .... Re: [RELEASE] 0.6.1

2010-04-19 Thread Mark Jones
I'm seeing some issues like this as well; in fact, I think seeing your graphs has helped me understand the dynamics of my cluster better. Using some ballpark figures for inserting single-column objects of ~500 bytes onto individual nodes (not when combined as a cluster): Node1: Inserts 12000/s N

Re: Regarding Cassandra Scalability

2010-04-19 Thread Gary Dusbabek
On Sun, Apr 18, 2010 at 11:14, dir dir wrote: > Hi Gary, > >>The main reason is that the compaction operation (removing deleted >>values) currently requires that an entire row be read into memory. > > Thank you for your explanation. But I still do not understand what do you > mean. > When you del

RE: Cassandra Java Client

2010-04-19 Thread Dop Sun
May I take this chance to share this link here: http://code.google.com/p/jassandra/ It is currently based on the Cassandra 0.6 Thrift APIs. The classes ThriftCriteria and ThriftColumnFamily make direct use of the Thrift API. Also, the site itself has test code, which actually works on Jassandra abs

Re: Cassandra Java Client

2010-04-19 Thread Jonathan Ellis
How is Jassandra different from http://github.com/rantav/hector ? On Mon, Apr 19, 2010 at 9:21 AM, Dop Sun wrote: > May I take this chance to share this link here: > > http://code.google.com/p/jassandra/ > > > > It currently based with Cassandra 0.6 Thrift APIs. > > > > The class ThriftCriteria a

RE: Cassandra Java Client

2010-04-19 Thread Dop Sun
Well, there are a couple of points behind why Jassandra was created: 1. First of all, I wanted to create something like this because I come from a JDBC background and am familiar with the Hibernate API. The ICriteria (which is created for querying) is inspired by the Criteria API from Hibernate. Actually, maybe

tcp CLOSE_WAIT bug

2010-04-19 Thread Ingram Chen
Hi all, We have observed several connections between nodes in CLOSE_WAIT after several hours of operation: At node 87: netstat -tn | grep 7000 tcp 0 0 :::192.168.2.87:7000 :::192.168.2.88:57625 CLOSE_WAIT tcp 0 0 :::192.168.2.87:7000 :::192.168.2
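To make the observation above easier to repeat, here is a minimal Python sketch that tallies TCP states per local port from `netstat -tn` style text. The function name `count_states`, the column layout it expects, and the `SAMPLE` lines are assumptions made for illustration; they are not taken from the original report.

```python
from collections import Counter

def count_states(netstat_output, local_port):
    """Count TCP states for sockets whose local address ends with :local_port."""
    counts = Counter()
    suffix = ":" + str(local_port)
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[3].endswith(suffix):
            counts[fields[5]] += 1
    return counts

# Hypothetical sample text, shaped like `netstat -tn` output.
SAMPLE = """\
tcp 0 0 192.168.2.87:7000 192.168.2.88:57625 CLOSE_WAIT
tcp 0 0 192.168.2.87:7000 192.168.2.88:57626 ESTABLISHED
tcp 0 0 192.168.2.87:9160 192.168.2.90:40000 ESTABLISHED
"""

if __name__ == "__main__":
    print(count_states(SAMPLE, 7000))
```

Running it periodically against real `netstat -tn` output would show whether the CLOSE_WAIT count on the inter-node port keeps growing over time.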

Re: [RELEASE] 0.6.1

2010-04-19 Thread Eric Evans
On Sun, 2010-04-18 at 19:04 -0700, Jeff Hodges wrote: > It does, however, include a change the networking layout[1]. It's not > a simple rolling deploy. You will have to do a full cluster restart to > upgrade. You're right; I wasn't aware of this one. We probably should have considered pushing thi

Re: tcp CLOSE_WAIT bug

2010-04-19 Thread Brandon Williams
On Mon, Apr 19, 2010 at 10:27 AM, Ingram Chen wrote: > Hi all, > > We have observed several connections between nodes in CLOSE_WAIT after > several hours of operation: > This is symptomatic of not pooling your client connections correctly. Be sure you're using one connection per thread, not
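The "one connection per thread" advice above can be sketched in Python with thread-local storage. This is only an illustration under stated assumptions: `DummyConnection` is a hypothetical stand-in for a real Thrift client connection, and `get_connection` / `release_connection` are names invented here, not part of any Cassandra client library.

```python
import threading

class DummyConnection:
    """Hypothetical stand-in for a Thrift client connection (not a real API)."""
    def __init__(self, host, port):
        self.host = host
        self.port = port
        self.closed = False

    def close(self):
        self.closed = True

# Each thread sees its own independent attributes on this object.
_local = threading.local()

def get_connection(host="127.0.0.1", port=9160):
    """Return the calling thread's own connection, creating it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None or conn.closed:
        conn = DummyConnection(host, port)
        _local.conn = conn
    return conn

def release_connection():
    """Close and forget the calling thread's connection, if it has one."""
    conn = getattr(_local, "conn", None)
    if conn is not None:
        conn.close()
        _local.conn = None
```

Closing the connection explicitly when a worker thread finishes matters here, since sockets that are never closed by the application are a common way to accumulate CLOSE_WAIT entries.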

Re: tcp CLOSE_WAIT bug

2010-04-19 Thread Ingram Chen
Thank you for the information. We do use connection pools with the Thrift client, and ThriftAddress is on port 9160. The problematic connections we found are all on port 7000, which is the internal communication port between nodes. I guess this is related to StreamingService. On Mon, Apr 19, 2010 at 23:46, Brand

RE: 0.6 insert performance .... Re: [RELEASE] 0.6.1

2010-04-19 Thread Daniel Kluesing
We see this behavior as well with 0.6; heap usage graphs look almost identical. The GC is a noticeable bottleneck; we've tried the jdku19 and JRockit VMs. It basically kills any kind of soft real time behavior. From: Masood Mortazavi [mailto:masoodmortaz...@gmail.com] Sent: Monday, April 19, 2010 4

Map/Reduce Cassandra Output

2010-04-19 Thread Sonny Heer
Unlike the wordcount example, my input source is a directory, and I have a split class and record reader defined. Also unlike wordcount, I need to insert into Cassandra during reduce. I notice that for the wordcount input it retrieves a handle on a Cassandra client like this: TSocket soc

Re: [RELEASE] 0.6.0

2010-04-19 Thread Ted Zlatanov
On Wed, 14 Apr 2010 13:09:13 -0500 Ted Zlatanov wrote: TZ> On Wed, 14 Apr 2010 12:23:19 -0500 Eric Evans wrote: EE> On Wed, 2010-04-14 at 10:16 -0500, Ted Zlatanov wrote: >>> Can it support a non-root user through /etc/default/cassandra? I've >>> been patching the init script myself but was h

Modelling assets and user permissions

2010-04-19 Thread tsuraan
Suppose I have a CF that holds some sort of assets that some users of my program have access to, and that some do not. In SQL-ish terms it would look something like this: TABLE Assets ( asset_id serial primary key, ... ); TABLE Users ( user_id serial primary key, user_name text ); TABLE

Re: Cassandra Java Client

2010-04-19 Thread Ran Tavory
Hi Dop, you may want to look at hector as a low level cassandra client on which you build jassandra, adding hibernate style magic etc like other ppl have done with ORM layers on top of it. Hector's main features include extensive jmx counters, failover and connection pooling. It's available for all

Re: [RELEASE] 0.6.0

2010-04-19 Thread Eric Evans
On Mon, 2010-04-19 at 12:02 -0500, Ted Zlatanov wrote: > > EE> It's the first item on debian/TODO, but, you know, patches welcome > and > EE> all that. > > TZ> The appended patch has been sufficient for me. > > Eric, do you need me to open a ticket for this, too, or is what I > posted sufficient

PropertyFileEndPointSnitch

2010-04-19 Thread Erik Holstad
When building the PropertyFileEndPointSnitch into the jar cassandra-propsnitch.jar, the files in the jar end up at src/java/org/apache/cassandra/locator/PropertyFileEndPointSnitch.class instead of org/apache/cassandra/locator/PropertyFileEndPointSnitch.class. Am I doing something wrong, or is this int

Re: PropertyFileEndPointSnitch

2010-04-19 Thread Jonathan Ellis
this is a bug in 0.6. 984.txt attached to https://issues.apache.org/jira/browse/CASSANDRA-984 should fix it. On Mon, Apr 19, 2010 at 12:56 PM, Erik Holstad wrote: > When building the PropertyFileEndPointSnitch into the jar > cassandra-propsnitch.jar > the files in the jar end up on > src/java/or

RE: Map/Reduce Cassandra Output

2010-04-19 Thread Stu Hood
If you used that snippet of code, all connections would go through the same seed: the input code does additional work to determine which nodes are holding particular key ranges, and then connects directly. For outputting from Hadoop to Cassandra, you may want to consider using a Java clie

Re: PropertyFileEndPointSnitch

2010-04-19 Thread Erik Holstad
Thanks Jonathan!

restore with snapshot

2010-04-19 Thread Lee Parker
I am working on finalizing our backup and restore procedures for a cassandra cluster running on EC2. I understand based on the wiki that in order to replace a single node, I don't actually need to put data on that node. I just need to bootstrap the new node into the cluster and it will get data fr

Re: [RELEASE] 0.6.0

2010-04-19 Thread Ted Zlatanov
On Mon, 19 Apr 2010 12:47:52 -0500 Eric Evans wrote: EE> On Mon, 2010-04-19 at 12:02 -0500, Ted Zlatanov wrote: >> EE> It's the first item on debian/TODO, but, you know, patches welcome >> and EE> all that. >> TZ> The appended patch has been sufficient for me. >> >> Eric, do you need me to op

Re: Data model question - column names sort

2010-04-19 Thread Jonathan Ellis
On Thu, Apr 15, 2010 at 6:01 PM, Sonny Heer wrote: > Need a way to have two different types of indexes. > > Key: aTextKey > ColumnName: aTextColumnName:55 > Value: "" > > Key: aTextKey > ColumnName: 55:aTextColumnName > Value: "" > > All the valuable information is stored in the column name itself
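The two column-name layouts in the question above sort differently when names are compared as raw bytes. The Python sketch below illustrates this under the assumption of a plain byte-wise (lexical) comparator; the helper names `encode` and `sorted_names` and the sample pairs are hypothetical. It also shows that an unpadded number sorts as text ("100" before "55"), which zero-padding avoids.

```python
def encode(text, number, number_first, width=0):
    """Build a composite column name; optionally zero-pad the numeric part."""
    num = str(number).zfill(width)
    return f"{num}:{text}" if number_first else f"{text}:{num}"

# Hypothetical (text, number) pairs to index both ways.
PAIRS = [("beta", 7), ("alpha", 55), ("alpha", 100)]

def sorted_names(number_first, width=0):
    """Sort the encoded names byte-wise, as a lexical comparator would."""
    names = [encode(t, n, number_first, width).encode("utf-8") for t, n in PAIRS]
    return [b.decode("utf-8") for b in sorted(names)]

if __name__ == "__main__":
    print(sorted_names(number_first=False))           # text-first layout
    print(sorted_names(number_first=True))            # number-first layout
    print(sorted_names(number_first=True, width=3))   # zero-padded numbers
```

Whether padding is needed depends on the comparator actually configured for the column family; this sketch only covers the byte-wise case.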

RE: Cassandra Java Client

2010-04-19 Thread Dop Sun
Hi Ran: Yep, it looks like it is possible for me to add a dependency on hector and enhance Jassandra's functionality. I would take this chance to extend the discussion about “xxx Client for Cassandra” a little bit: In short, Cassandra may need a kind of sub-project to define

Re: Clarification on Ring operations in Cassandra 0.5.1

2010-04-19 Thread Jonathan Ellis
On Thu, Apr 15, 2010 at 6:10 PM, Anthony Molinaro wrote: > 1) shutdown cassandra on instance I want to replace > 2) create a new instance, start cassandra with AutoBootstrap = true > 3) run nodeprobe removetoken against the token of the instance I am >   replacing > > Then according to the 'Handli

Re: effective modeling for fixed limit columns

2010-04-19 Thread Jonathan Ellis
Limiting by number of columns in a row will perform very poorly. Limiting by the time a column has existed can perform quite well, and was added by Sylvain for 0.7 in https://issues.apache.org/jira/browse/CASSANDRA-699 On Fri, Apr 16, 2010 at 1:50 PM, Chris Shorrock wrote: > I'm attempting to co

Re: why read operation use so much of memory?

2010-04-19 Thread Jonathan Ellis
(Moving to users@ list.) Like any Java server, Cassandra will use as much memory in its heap as you allow it to. You can request a GC from jconsole to see what its approximate "real" working set is. http://wiki.apache.org/cassandra/SSTableMemtable explains why reads are slower than writes. You

Re: cassandra monitoring

2010-04-19 Thread Jonathan Ellis
Anything that can consume JMX. On Mon, Apr 19, 2010 at 5:34 AM, Simeonov, Daniel wrote: > Hi, >    What is the preferred way of monitoring Cassandra clusters? Is Cassandra > integrated with Ganglia? Thank you very much! > Best regards, Daniel. >

Re: tcp CLOSE_WAIT bug

2010-04-19 Thread Jonathan Ellis
Is this after doing a bootstrap or other streaming operation? Or did a node go down? The internal sockets are supposed to remain open, otherwise. On Mon, Apr 19, 2010 at 10:56 AM, Ingram Chen wrote: > Thank your information. > > We do use connection pools with thrift client and ThriftAdress is

Re: 0.6 insert performance .... Re: [RELEASE] 0.6.1

2010-04-19 Thread Jonathan Ellis
It's hard to tell from those slides, but it looks like the slowdown doesn't hit until after several GCs. Perhaps this is compaction kicking in, not GCs? Definitely the extra I/O + CPU load from compaction will cause a drop in throughput. On Mon, Apr 19, 2010 at 6:14 AM, Masood Mortazavi wrote:

Re: Map/Reduce Cassandra Output

2010-04-19 Thread Sonny Heer
Thanks Stu. I will take a look at Hector. Do you know where the input code does the additional work? On Mon, Apr 19, 2010 at 11:20 AM, Stu Hood wrote: > If you used that snippet of code, all connections would go through the same > seed: the input code does additional work to determine which

Re: 0.6 insert performance .... Re: [RELEASE] 0.6.1

2010-04-19 Thread Masood Mortazavi
Minimizing GC pauses or minimizing time slots allocated to GC pauses -- either through configuration or re-implementations of garbage collection "bottlenecks" (i.e. object-generation "bottlenecks") -- seem to be the immediate approach. (Other approaches appear to be more intrusive.) At code level,

Re: busy thread on IncomingStreamReader ?

2010-04-19 Thread Rob Coli
On 4/17/10 6:47 PM, Ingram Chen wrote: after upgrading jdk from 1.6.0_16 to 1.6.0_20, the problem solved. FYI, this sounds like it might be : https://issues.apache.org/jira/browse/CASSANDRA-896 http://bugs.sun.com/view_bug.do;jsessionid=60c39aa55d3666c0c84dd70eb826?bug_id=6805775 Where garb

Re: Clarification on Ring operations in Cassandra 0.5.1

2010-04-19 Thread Anthony Molinaro
On Mon, Apr 19, 2010 at 03:28:26PM -0500, Jonathan Ellis wrote: > > Can I then 'nodeprobe move ', and > > achieve the same as step 2 above? > > You can't have two nodes with the same token in the ring at once. So, > you can removetoken the old node first, then bootstrap the new one > (just speci

get_range_slices in hector

2010-04-19 Thread Chris Dean
Is there a version of hector that has an interface to get_range_slices ? or should I provide a patch? Cheers, Chris Dean

Re: Help with MapReduce

2010-04-19 Thread Joost Ouwerkerk
I'm slowly getting somewhere with Cassandra... I have successfully imported 1.5 million rows using MapReduce. This took about 8 minutes on an 8-node cluster, which is comparable to the time it takes with HBase. Now I'm having trouble scanning this data. I've created a simple MapReduce job that c

Re: restore with snapshot

2010-04-19 Thread Edward M. Goldberg
This is a very important thread for me also. I have assumed up to this point that the node's IP is not part of the equation at all. I have assumed that a new node with the exact same data store files and configuration but with a new IP value can replace that same node. I will test this out

Re: get_range_slices in hector

2010-04-19 Thread Nathan McCall
Not yet. If you wanted to provide a patch that would be much appreciated. A fork and pull request would be best logistically, but whatever works. -Nate On Mon, Apr 19, 2010 at 5:10 PM, Chris Dean wrote: > Is there a version of hector that has an interface to get_range_slices ? > or should I prov

Re: get_range_slices in hector

2010-04-19 Thread Chris Dean
Ok, thanks. Cheers, Chris Dean Nathan McCall writes: > Not yet. If you wanted to provide a patch that would be much > appreciated. A fork and pull request would be best logistically, but > whatever works. > > -Nate > > On Mon, Apr 19, 2010 at 5:10 PM, Chris Dean wrote: >> Is there a version of

Re: Help with MapReduce

2010-04-19 Thread Jesse McConnell
Most likely this means that the count() operation is taking too long for the configured RPCTimeout. In my experience, counts get unreliable after a certain number of columns under a key. jesse -- jesse mcconnell jesse.mcconn...@gmail.com On Mon, Apr 19, 2010 at 19:12, Joost Ouwerkerk wrote: > I'm sl

Re: Help with MapReduce

2010-04-19 Thread Jesse McConnell
Err, not count in your case, but the same symptom: Cassandra can't return the answer to your query within the configured RPC timeout. cheers, jesse -- jesse mcconnell jesse.mcconn...@gmail.com On Mon, Apr 19, 2010 at 19:40, Jesse McConnell wrote: > most likely means that the count() operation is ta

Re: Help with MapReduce

2010-04-19 Thread Jonathan Ellis
Possibly you are asking it to retrieve too many columns per row. Possibly there is something else causing poor performance, like swapping. On Mon, Apr 19, 2010 at 7:12 PM, Joost Ouwerkerk wrote: > I'm slowly getting somewhere with Cassandra... I have successfully imported > 1.5 million rows usin

Re: Help with MapReduce

2010-04-19 Thread Joost Ouwerkerk
hmm, might be too much data. In the case of a supercolumn, how do I specify which sub-columns to retrieve? Or can I only retrieve entire supercolumns? On Mon, Apr 19, 2010 at 8:47 PM, Jonathan Ellis wrote: > Possibly you are asking it to retrieve too many columns per row. > > Possibly there is

0.6.1 insert 1B rows, crashed when using py_stress

2010-04-19 Thread Ken Sandney
Hi, I am doing an insert test with 9 nodes, using the command: > stress.py -n 10 -t 1000 -c 10 -o insert -i 5 -d > 10.0.0.1,10.0.0.2. 5 of the 9 nodes crashed, and only about 6'500'000 rows were inserted. I checked the system.log and it seems the reason is 'out of memory'. I don't know if th

Re: Help with MapReduce

2010-04-19 Thread Jonathan Ellis
the latter, if you are retrieving multiple supercolumns. On Mon, Apr 19, 2010 at 8:10 PM, Joost Ouwerkerk wrote: > hmm, might be too much data.  In the case of a supercolumn, how do I specify > which sub-columns to retrieve?  Or can I only retrieve entire supercolumns? > On Mon, Apr 19, 2010 at 8

Re: 0.6.1 insert 1B rows, crashed when using py_stress

2010-04-19 Thread Jonathan Ellis
http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts On Mon, Apr 19, 2010 at 8:22 PM, Ken Sandney wrote: > Hi > I am doing a insert test with 9 nodes, the command: >> >> stress.py -n 10 -t 1000 -c 10 -o insert -i 5 -d >> 10.0.0.1,10.0.0.2. > > and  5 of the 9 nodes were

Re: 0.6.1 insert 1B rows, crashed when using py_stress

2010-04-19 Thread Schubert Zhang
Please also post your JVM heap and GC options, i.e. the settings in cassandra.in.sh. And what about your node hardware? On Tue, Apr 20, 2010 at 9:22 AM, Ken Sandney wrote: > Hi > I am doing a insert test with 9 nodes, the command: > >> stress.py -n 10 -t 1000 -c 10 -o insert -i 5 -d >> 10.0.

Re: 0.6.1 insert 1B rows, crashed when using py_stress

2010-04-19 Thread Schubert Zhang
It seems you should configure a larger JVM heap. On Tue, Apr 20, 2010 at 9:32 AM, Schubert Zhang wrote: > Please also post your jvm-heap and GC options, i.e. the seting in > cassandra.in.sh > And what about you node hardware? > > On Tue, Apr 20, 2010 at 9:22 AM, Ken Sandney wrote: > >> Hi >> I am do

Re: Help with MapReduce

2010-04-19 Thread Joost Ouwerkerk
And when retrieving only one supercolumn? Can I further specify which subcolumns to retrieve? On Mon, Apr 19, 2010 at 9:29 PM, Jonathan Ellis wrote: > the latter, if you are retrieving multiple supercolumns. > > On Mon, Apr 19, 2010 at 8:10 PM, Joost Ouwerkerk > wrote: > > hmm, might be too mu

Re: Help with MapReduce

2010-04-19 Thread Jonathan Ellis
yes On 4/19/10, Joost Ouwerkerk wrote: > And when retrieving only one supercolumn? Can I further specify which > subcolumns to retrieve? > > On Mon, Apr 19, 2010 at 9:29 PM, Jonathan Ellis wrote: > >> the latter, if you are retrieving multiple supercolumns. >> >> On Mon, Apr 19, 2010 at 8:10 PM

Re: 0.6.1 insert 1B rows, crashed when using py_stress

2010-04-19 Thread Ken Sandney
Here are my JVM options from cassandra.in.sh; they are the defaults, I didn't modify them: # Arguments to pass to the JVM JVM_OPTS=" \ -ea \ -Xms128M \ -Xmx1G \ -XX:TargetSurvivorRatio=90 \ -XX:+AggressiveOpts \ -XX:+UseParNewGC \ -XX:+UseConcMar

Re: 0.6.1 insert 1B rows, crashed when using py_stress

2010-04-19 Thread Schubert Zhang
-Xmx1G is too small. In my cluster, each node has 8GB of RAM, and I grant 6GB to Cassandra. Please see my test @ http://www.slideshare.net/schubertzhang/presentations Slide 5 –Memory, GC..., always to be the bottleneck and big issue of java-based infrastructure software! References: –http://wiki.apach

Re: 0.6.1 insert 1B rows, crashed when using py_stress

2010-04-19 Thread Brandon Williams
On Mon, Apr 19, 2010 at 9:06 PM, Schubert Zhang wrote: > > 2. Reject the request when be short of resource, instead of throws OOME and > exit (crash). > Right, that is the crux of the problem It will be addressed here: https://issues.apache.org/jira/browse/CASSANDRA-685 -Brandon

Re: 0.6.1 insert 1B rows, crashed when using py_stress

2010-04-19 Thread Jonathan Ellis
Schubert, I don't know if you saw this in the other thread referencing your slides: It looks like the slowdown doesn't hit until after several GCs, although it's hard to tell since the scale is different on the GC graph and the insert throughput ones. Perhaps this is compaction kicking in, not GC

Re: 0.6.1 insert 1B rows, crashed when using py_stress

2010-04-19 Thread Ken Sandney
I am just running Cassandra on normal boxes, and granting 1GB of the total 2GB to Cassandra is reasonable, I think. Can this problem be resolved by tuning the thresholds described on this page, or by just waiting for the 0.7 release as Brandon mentione

Re: busy thread on IncomingStreamReader ?

2010-04-19 Thread Ingram Chen
Ouch! I spoke too soon! We still suffer the same problems after upgrading to 1.6.0_20. In JMX StreamingService, I see several weird incoming/outgoing transfers: In Host A, 192.168.2.87 StreamingService Status: Done with transfer to /192.168.2.88 StreamingService StreamSources: [/192.168.2.88] Stre

Re: 0.6.1 insert 1B rows, crashed when using py_stress

2010-04-19 Thread Jonathan Ellis
Ken, I linked you to the FAQ answering your problem in the first reply you got. Please don't hijack my replies to other people; that's rude. On Mon, Apr 19, 2010 at 9:32 PM, Ken Sandney wrote: > I am just running Cassandra on normal boxes, and grants 1GB of total 2GB to > Cassandra is reasonable

Re: Clarification on Ring operations in Cassandra 0.5.1

2010-04-19 Thread Schubert Zhang
You can have a look at org.apache.cassandra.service.StorageService public void initServer() throws IOException 1. If AutoBootstrap=false, it means the node is already bootstrapped (not a new node). Usually, the first new node is set to false. (1) check the system table to find the saved token, if found

Re: 0.6.1 insert 1B rows, crashed when using py_stress

2010-04-19 Thread Ken Sandney
Sorry I just don't know how to resolve this :) On Tue, Apr 20, 2010 at 10:37 AM, Jonathan Ellis wrote: > Ken, I linked you to the FAQ answering your problem in the first reply > you got. Please don't hijack my replies to other people; that's rude. > > On Mon, Apr 19, 2010 at 9:32 PM, Ken Sandne

Re: busy thread on IncomingStreamReader ?

2010-04-19 Thread Jonathan Ellis
I don't see csArena-tmp-6-Index.db in the incoming files list. If it's not there, that means that it did break out of that while loop. Did you check both logs for exceptions? On Mon, Apr 19, 2010 at 9:36 PM, Ingram Chen wrote: > Ouch ! I talk too early ! > > We still suffer same problems after

Re: 0.6.1 insert 1B rows, crashed when using py_stress

2010-04-19 Thread Schubert Zhang
Jonathan, Thanks. Yes, the scale of the GC graph is different from the throughput one. I will do more checking and tuning in our next test immediately. On Tue, Apr 20, 2010 at 10:39 AM, Ken Sandney wrote: > Sorry I just don't know how to resolve this :) > > > On Tue, Apr 20, 2010 at 10:37 AM, Jonathan

Re: tcp CLOSE_WAIT bug

2010-04-19 Thread Ingram Chen
This happened after several hours of operation, and both nodes were started at the same time (clean start without any data), so it might not relate to bootstrap. In system.log I do not see any logs like "xxx node dead" or exceptions, and both nodes in the test are alive. They serve read/write well, too

Re: busy thread on IncomingStreamReader ?

2010-04-19 Thread Ingram Chen
I checked system.log on both nodes, but there is no exception logged. On Tue, Apr 20, 2010 at 10:40, Jonathan Ellis wrote: > I don't see csArena-tmp-6-Index.db in the incoming files list. If > it's not there, that means that it did break out of that while loop. > > Did you check both logs for exceptions?

Re: 0.6 insert performance .... Re: [RELEASE] 0.6.1

2010-04-19 Thread Schubert Zhang
Since the scale of the GC graph in the slides is different from the throughput ones, I will do another test for this issue. Thanks for your advice, Masood and Jonathan. --- Here I just post my cassandra.in.sh. JVM_OPTS=" \ -ea \ -Xms128M \ -Xmx6G \ -XX:Tar

Re: why read operation use so much of memory?

2010-04-19 Thread dir dir
Hi Jonathan, I see this page (http://wiki.apache.org/cassandra/SSTableMemtable) does not exist yet. thanks. Dir. On Tue, Apr 20, 2010 at 3:41 AM, Jonathan Ellis wrote: > (Moving to users@ list.) > > Like any Java server, Cassandra will use as much memory in its heap as > you allow it to. You

Re: why read operation use so much of memory?

2010-04-19 Thread Brandon Williams
On Mon, Apr 19, 2010 at 10:28 PM, dir dir wrote: > Hi Jonathan, > > I see this page (http://wiki.apache.org/cassandra/SSTableMemtable) does > not exist yet. > > I think he meant: http://wiki.apache.org/cassandra/MemtableSSTable -Brandon

Re: Help with MapReduce

2010-04-19 Thread Joost Ouwerkerk
Ok. This should be ok for now, although not optimal for some jobs. Next issue is node stability during the insert job. The stacktrace below occurred on several nodes while inserting 10 million rows. We're running on 4G machines, 1G of which is allocated to Cassandra. What's the best config to p

Re: Help with MapReduce

2010-04-19 Thread Jonathan Ellis
http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts On Tue, Apr 20, 2010 at 12:48 AM, Joost Ouwerkerk wrote: > Ok.  This should be ok for now, although not optimal for some jobs. > > Next issue is node stability during the insert job.  The stacktrace below > occured on several nod