So at the moment I'm not running my loader, and I'm looking at one node which is slow to respond to nodetool requests. At this point it has a pile of hinted handoffs pending which don't seem to be draining out, and system.log shows that it's GCing pretty much constantly.

$ /usr/local/src/cassandra/bin/nodetool --host node7 tpstats
Pool Name                    Active   Pending   Completed
FILEUTILS-DELETE-POOL             0         0         178
STREAM-STAGE                      0         0           0
RESPONSE-STAGE                    0         0       21852
ROW-READ-STAGE                    0         0           0
LB-OPERATIONS                     0         0           0
MESSAGE-DESERIALIZER-POOL         0         0     1648536
GMFD                              0         0      125430
LB-TARGET                         0         0           0
CONSISTENCY-MANAGER               0         0           0
ROW-MUTATION-STAGE                2         2     1886537
MESSAGE-STREAMING-POOL            0         0           0
LOAD-BALANCER-STAGE               0         0           0
FLUSH-SORTER-POOL                 0         0           0
MEMTABLE-POST-FLUSHER             0         0         206
FLUSH-WRITER-POOL                 0         0         206
AE-SERVICE-STAGE                  0         0           0
HINTED-HANDOFF-POOL               1       158          23

Ian
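A quick way to watch these same counters across the whole ring rather than one host at a time (this also bears on the JMX-aggregation question further down the thread) is to poll each node's stage pools over JMX. A minimal sketch, assuming the 0.6 MBean layout (pools registered under the org.apache.cassandra.concurrent domain with ActiveCount, PendingTasks, and CompletedTasks attributes, which is where nodetool tpstats reads from) and JMX on port 8080 (check JMX_PORT in cassandra.in.sh); the host names are hypothetical:

    import java.util.Set;
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class TpstatsAll {
        public static void main(String[] args) throws Exception {
            // Ring members to poll; hypothetical default, pass your own on the command line.
            String[] hosts = args.length > 0 ? args : new String[] { "node7" };
            for (String host : hosts) {
                JMXServiceURL url = new JMXServiceURL(
                        "service:jmx:rmi:///jndi/rmi://" + host + ":8080/jmxrmi");
                JMXConnector jmxc = JMXConnectorFactory.connect(url);
                try {
                    MBeanServerConnection mbsc = jmxc.getMBeanServerConnection();
                    // Every thread-pool stage the node registers, e.g. HINTED-HANDOFF-POOL.
                    Set<ObjectName> stages = mbsc.queryNames(
                            new ObjectName("org.apache.cassandra.concurrent:*"), null);
                    for (ObjectName stage : stages) {
                        System.out.printf("%-8s %-28s active=%s pending=%s completed=%s%n",
                                host, stage.getKeyProperty("type"),
                                mbsc.getAttribute(stage, "ActiveCount"),
                                mbsc.getAttribute(stage, "PendingTasks"),
                                mbsc.getAttribute(stage, "CompletedTasks"));
                    }
                } finally {
                    jmxc.close();
                }
            }
        }
    }

Run from cron or a loop, this gives a poor man's aggregate view: a pool whose pending count only ever grows, like the hinted-handoff pool above, stands out immediately.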
On Fri, May 21, 2010 at 10:37 AM, Ian Soboroff <isobor...@gmail.com> wrote:
> On the to-do list for today. Is there a tool to aggregate all the JMX stats from all nodes? I mean, something a little more complete than Nagios.
>
> Ian
>
> On Fri, May 21, 2010 at 10:23 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>> You should check the JMX stages I posted about.
>>
>> On Fri, May 21, 2010 at 7:05 AM, Ian Soboroff <isobor...@gmail.com> wrote:
>>> Just an update. I rolled the memtable size back to 128MB. I am still seeing that the daemon runs for a while with reasonable heap usage, but then the heap climbs up to the max (6GB in this case, which should be plenty) and it starts GCing, without much getting cleared. The client catches lots of exceptions, where I wait 30 seconds and try again, with a new client if necessary, but it doesn't clear up.
>>>
>>> Could this be related to memory leak problems I've skimmed past on the list here?
>>>
>>> It can't be that I'm creating rows a bit at a time... once I stick a web page into two CFs, it's over and done with for this application. I'm just trying to get stuff loaded.
>>>
>>> Is there a limit to how much on-disk data a Cassandra daemon can manage? Is there runtime overhead associated with stuff on disk?
>>>
>>> Ian
>>>
>>> On Thu, May 20, 2010 at 9:31 PM, Ian Soboroff <isobor...@gmail.com> wrote:
>>>> Excellent leads, thanks. cassandra.in.sh has a heap of 6GB, but I didn't realize that I was trying to float so many memtables. I'll poke at it tomorrow and report whether that fixes it.
>>>>
>>>> Ian
>>>>
>>>> On Thu, May 20, 2010 at 10:40 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>>>> Some possibilities:
>>>>>
>>>>> - You didn't adjust the Cassandra heap size in cassandra.in.sh (1GB is too small).
>>>>> - You're inserting at CL.ZERO (ROW-MUTATION-STAGE in tpstats will show large pending ops -- large = 100s).
>>>>> - You're creating large rows a bit at a time, and Cassandra OOMs when it tries to compact them (the OOM should usually be in the compaction thread).
>>>>> - You have your 5 disks each with a separate data directory, which will allow up to 12 total memtables in flight internally, and 12*256MB is too much for the heap size you have (FLUSH-WRITER-POOL in tpstats will show large pending ops -- large = more than 2 or 3).
>>>>>
>>>>> On Tue, May 18, 2010 at 6:24 AM, Ian Soboroff <isobor...@gmail.com> wrote:
>>>>>> I hope this isn't too much of a newbie question. I am using Cassandra 0.6.1 on a small cluster of Linux boxes - 14 nodes, each with 8GB RAM and 5 data drives. The nodes are running HDFS to serve files within the cluster, but at the moment the rest of Hadoop is shut down. I'm trying to load a large set of web pages (the ClueWeb collection, but more is coming) and my Cassandra daemons keep dying.
>>>>>>
>>>>>> I'm loading the pages into a simple column family that lets me fetch out pages by an internal ID or by URL. The biggest thing in the row is the page content, maybe 15-20k per page of raw HTML. There aren't a lot of columns. I tried Thrift, Hector, and the BMT interface, and at the moment I'm doing batch mutations over Thrift, about 2500 pages per batch, because that was fastest for me in testing.
>>>>>>
>>>>>> At this point, each Cassandra node has between 500GB and 1.5TB according to nodetool ring. Let's say I start the daemons up, and they all go live after a couple minutes of scanning the tables. I then start my importer, which is a single Java process reading ClueWeb bundles over HDFS, cutting them up, and sending the mutations to Cassandra. I only talk to one node at a time, switching to a new node when I get an exception. As the job runs over a few hours, the Cassandra daemons eventually fall over, either with no error in the log or reporting that they are out of heap.
>>>>>>
>>>>>> Each daemon is getting 6GB of RAM and has scads of disk space to play with. I've set storage-conf.xml to take 256MB in a memtable before flushing (like the BMT case), to do batch commit log flushes, and to not have any caching in the CFs. I'm sure I must be tuning something wrong. I would eventually like this Cassandra setup to serve a light request load, but over say 50-100TB of data. I'd appreciate any help or advice you can offer.
>>>>>>
>>>>>> Thanks,
>>>>>> Ian
>>>>>
>>>>> --
>>>>> Jonathan Ellis
>>>>> Project Chair, Apache Cassandra
>>>>> co-founder of Riptano, the source for professional Cassandra support
>>>>> http://riptano.com
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
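Putting rough numbers on Jonathan's last point makes the failure mode concrete. The 2-3x multiplier below is an assumption (a commonly quoted rule of thumb for the in-heap overhead of 0.6-era memtables), not a figure from the thread:

    12 memtables in flight x 256MB flush threshold  = 3GB of serialized data
    x ~2-3x Java object overhead (assumed)          = 6-9GB of live heap
    against a 6GB maximum heap                      -> constant GC, then OOM

Even after rolling the threshold back to 128MB, the same budget is 3-4.5GB of heap for memtables alone, which is consistent with the heap still filling up under sustained load.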
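For the loader itself, here is a minimal sketch of the batch_mutate-over-Thrift pattern described above, with the 30-second back-off and node rotation Ian mentions, writing at CL.ONE rather than CL.ZERO per Jonathan's second point. It assumes the 0.6 Thrift API (string row keys, byte[] column names and values); the keyspace, column family, and host names are hypothetical stand-ins, and a real run would pack ~2500 pages into the mutation map before each call:

    import java.util.*;
    import org.apache.cassandra.thrift.*;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;

    public class PageLoader {
        // Hypothetical ring members; the loader talks to one node and rotates on failure.
        static final String[] NODES = { "node1", "node2", "node7" };
        static int current = 0;

        static Cassandra.Client connect() throws Exception {
            TSocket socket = new TSocket(NODES[current % NODES.length], 9160);
            socket.open();
            return new Cassandra.Client(new TBinaryProtocol(socket));
        }

        // One row's worth of mutations: row key -> CF name -> mutations.
        // "Pages" and "content" are made-up stand-ins for the real by-ID/by-URL CFs.
        static Map<String, Map<String, List<Mutation>>> mutationsFor(String docId, byte[] html) {
            long ts = System.currentTimeMillis() * 1000;  // microseconds, the usual convention
            ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
            cosc.setColumn(new Column("content".getBytes(), html, ts));
            Mutation m = new Mutation();
            m.setColumn_or_supercolumn(cosc);
            Map<String, List<Mutation>> byCf = new HashMap<String, List<Mutation>>();
            byCf.put("Pages", Arrays.asList(m));
            Map<String, Map<String, List<Mutation>>> batch =
                    new HashMap<String, Map<String, List<Mutation>>>();
            batch.put(docId, byCf);
            return batch;
        }

        static void sendWithRetry(Cassandra.Client client,
                                  Map<String, Map<String, List<Mutation>>> batch) throws Exception {
            while (true) {
                try {
                    // CL.ONE rather than CL.ZERO, so a struggling node pushes back
                    // instead of silently queueing mutations in memory.
                    client.batch_mutate("ClueWeb", batch, ConsistencyLevel.ONE);  // keyspace name is hypothetical
                    return;
                } catch (Exception e) {
                    Thread.sleep(30000);   // back off 30 seconds, as described in the thread
                    current++;             // then move to the next node
                    client = connect();    // with a fresh client (old socket abandoned for brevity)
                }
            }
        }
    }

The one deliberate change from the loader described above is the consistency level: at CL.ZERO the coordinator acknowledges before applying anything, so an overloaded node accumulates work invisibly until it OOMs, whereas CL.ONE surfaces the backpressure as a timeout the retry loop can react to.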