Some possibilities:

- You didn't adjust the Cassandra heap size in cassandra.in.sh (the stock
  1GB is too small).
- You're inserting at CL.ZERO (ROW-MUTATION-STAGE in nodetool tpstats will
  show large pending ops -- large = hundreds). See the sketch after this
  list.
- You're creating large rows a bit at a time, and Cassandra OOMs when it
  tries to compact them (the OOM will usually be in the compaction thread).
- You have your 5 disks each set up as a separate data directory, which
  allows up to 12 memtables in flight internally, and 12 * 256MB is about
  3GB of memtable data before JVM overhead -- too much for the heap size
  you have (FLUSH-WRITER-STAGE in tpstats will show large pending ops --
  large = more than 2 or 3).
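
If it's the CL.ZERO case, switching the importer to CL.ONE makes each
batch_mutate block until at least one node has applied the write, so the
client can't outrun the cluster. A minimal sketch against the 0.6 Thrift
API (the host, keyspace, column family, and row key names here are made up):

    import java.util.*;
    import org.apache.cassandra.thrift.*;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    public class PageLoader {
        public static void main(String[] args) throws Exception {
            // Plain (unframed) Thrift connection, the 0.6 default.
            TTransport transport = new TSocket("cassandra-node", 9160);
            transport.open();
            Cassandra.Client client =
                new Cassandra.Client(new TBinaryProtocol(transport));

            // One page's HTML as a column value; the CF "Pages" and the
            // keyspace "ClueWeb" are hypothetical names.
            byte[] html = "<html>...</html>".getBytes("UTF-8");
            long ts = System.currentTimeMillis() * 1000; // microseconds
            Column content = new Column("content".getBytes("UTF-8"), html, ts);

            ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
            cosc.setColumn(content);
            Mutation m = new Mutation();
            m.setColumn_or_supercolumn(cosc);

            // row key -> (column family -> mutations); batch many rows here.
            Map<String, Map<String, List<Mutation>>> mutations =
                new HashMap<String, Map<String, List<Mutation>>>();
            Map<String, List<Mutation>> byCf =
                new HashMap<String, List<Mutation>>();
            byCf.put("Pages", Collections.singletonList(m));
            mutations.put("clueweb-doc-00001", byCf);

            // ONE waits for an ack from one replica; ZERO is fire-and-forget.
            client.batch_mutate("ClueWeb", mutations, ConsistencyLevel.ONE);

            transport.close();
        }
    }

The same mutation map can carry a whole 2500-page batch; only the
consistency level changes the backpressure behavior.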

On Tue, May 18, 2010 at 6:24 AM, Ian Soboroff <isobor...@gmail.com> wrote:
> I hope this isn't too much of a newbie question.  I am using Cassandra 0.6.1
> on a small cluster of Linux boxes - 14 nodes, each with 8GB RAM and 5 data
> drives.  The nodes are running HDFS to serve files within the cluster, but
> at the moment the rest of Hadoop is shut down.  I'm trying to load a large
> set of web pages (the ClueWeb collection, but more is coming) and my
> Cassandra daemons keep dying.
>
> I'm loading the pages into a simple column family that lets me fetch out
> pages by an internal ID or by URL.  The biggest thing in the row is the page
> content, maybe 15-20k per page of raw HTML.  There aren't a lot of columns.
> I tried Thrift, Hector, and the BMT interface, and at the moment I'm doing
> batch mutations over Thrift, about 2500 pages per batch, because that was
> fastest for me in testing.
>
> At this point, each Cassandra node has between 500GB and 1.5TB according to
> nodetool ring.  Let's say I start the daemons up, and they all go live after
> a couple minutes of scanning the tables.  I then start my importer, which is
> a single Java process reading Clueweb bundles over HDFS, cutting them up,
> and sending the mutations to Cassandra.  I only talk to one node at a time,
> switching to a new node when I get an exception.  As the job runs over a few
> hours, the Cassandra daemons eventually fall over, either with no error in
> the log or reporting that they are out of heap.
>
> Each daemon is getting 6GB of RAM and has scads of disk space to play with.
> I've set the storage-conf.xml to take 256MB in a memtable before flushing
> (like the BMT case), and to do batch commit log flushes, and to not have any
> caching in the CFs.  I'm sure I must be tuning something wrong.  I would
> eventually like this Cassandra setup to serve a light request load but over
> say 50-100 TB of data.  I'd appreciate any help or advice you can offer.
>
> Thanks,
> Ian
>
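
For reference, the settings you describe live in storage-conf.xml; roughly,
and from memory of the 0.6 layout (paths and names made up), the relevant
pieces are:

    <!-- one entry per data drive; each extra directory adds to the
         number of memtables that can be in flight at once -->
    <DataFileDirectories>
        <DataFileDirectory>/data1/cassandra</DataFileDirectory>
        <DataFileDirectory>/data2/cassandra</DataFileDirectory>
    </DataFileDirectories>

    <!-- flush a memtable once it holds this much data -->
    <MemtableThroughputInMB>256</MemtableThroughputInMB>

    <!-- batch commit log sync instead of the periodic default -->
    <CommitLogSync>batch</CommitLogSync>
    <CommitLogSyncBatchWindowInMS>1</CommitLogSyncBatchWindowInMS>

    <!-- per-CF caches off -->
    <ColumnFamily Name="Pages" CompareWith="UTF8Type"
                  KeysCached="0" RowsCached="0"/>

Whatever values end up there, the heap in cassandra.in.sh has to be sized
against what the memtables, compaction, and caches can hold at once.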



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
