So at the moment I'm not running my loader, and I'm looking at one node which is slow to respond to nodetool requests. At this point it has a pile of hinted handoffs pending which don't seem to be draining out, and system.log shows that it's GCing pretty much constantly.

$ /usr/local/src/cassandra/bin/nodetool --host node7 tpstats
Pool Name                    Active   Pending   Completed
FILEUTILS-DELETE-POOL             0         0         178
STREAM-STAGE                      0         0           0
RESPONSE-STAGE                    0         0       21852
ROW-READ-STAGE                    0         0           0
LB-OPERATIONS                     0         0           0
MESSAGE-DESERIALIZER-POOL         0         0     1648536
GMFD                              0         0      125430
LB-TARGET                         0         0           0
CONSISTENCY-MANAGER               0         0           0
ROW-MUTATION-STAGE                2         2     1886537
MESSAGE-STREAMING-POOL            0         0           0
LOAD-BALANCER-STAGE               0         0           0
FLUSH-SORTER-POOL                 0         0           0
MEMTABLE-POST-FLUSHER             0         0         206
FLUSH-WRITER-POOL                 0         0         206
AE-SERVICE-STAGE                  0         0           0
HINTED-HANDOFF-POOL               1       158          23

Ian
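A quick way to watch these same counters across the whole ring rather than one host at a time (this also bears on the JMX-aggregation question further down the thread) is to poll each node's stage pools over JMX. A minimal sketch, assuming the 0.6 MBean layout (pools registered under the org.apache.cassandra.concurrent domain with ActiveCount, PendingTasks, and CompletedTasks attributes, which is where nodetool tpstats reads from) and JMX on port 8080 (check JMX_PORT in cassandra.in.sh); the host names are hypothetical:

    import java.util.Set;
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class TpstatsAll {
        public static void main(String[] args) throws Exception {
            // Ring members to poll; hypothetical default, pass your own on the command line.
            String[] hosts = args.length > 0 ? args : new String[] { "node7" };
            for (String host : hosts) {
                JMXServiceURL url = new JMXServiceURL(
                        "service:jmx:rmi:///jndi/rmi://" + host + ":8080/jmxrmi");
                JMXConnector jmxc = JMXConnectorFactory.connect(url);
                try {
                    MBeanServerConnection mbsc = jmxc.getMBeanServerConnection();
                    // Every thread-pool stage the node registers, e.g. HINTED-HANDOFF-POOL.
                    Set<ObjectName> stages = mbsc.queryNames(
                            new ObjectName("org.apache.cassandra.concurrent:*"), null);
                    for (ObjectName stage : stages) {
                        System.out.printf("%-8s %-28s active=%s pending=%s completed=%s%n",
                                host, stage.getKeyProperty("type"),
                                mbsc.getAttribute(stage, "ActiveCount"),
                                mbsc.getAttribute(stage, "PendingTasks"),
                                mbsc.getAttribute(stage, "CompletedTasks"));
                    }
                } finally {
                    jmxc.close();
                }
            }
        }
    }

Run from cron or a loop, this gives a poor man's aggregate view: a pool whose pending count only ever grows, like the hinted-handoff pool above, stands out immediately.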
On Fri, May 21, 2010 at 10:37 AM, Ian Soboroff <isobor...@gmail.com> wrote:
> On the to-do list for today. Is there a tool to aggregate all the JMX stats from all nodes? I mean, something a little more complete than Nagios.
>
> Ian
>
> On Fri, May 21, 2010 at 10:23 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>> You should check the JMX stages I posted about.
>>
>> On Fri, May 21, 2010 at 7:05 AM, Ian Soboroff <isobor...@gmail.com> wrote:
>>> Just an update. I rolled the memtable size back to 128MB. I am still seeing that the daemon runs for a while with reasonable heap usage, but then the heap climbs up to the max (6GB in this case, which should be plenty) and it starts GCing, without much getting cleared. The client catches lots of exceptions, where I wait 30 seconds and try again, with a new client if necessary, but it doesn't clear up.
>>>
>>> Could this be related to memory leak problems I've skimmed past on the list here?
>>>
>>> It can't be that I'm creating rows a bit at a time... once I stick a web page into two CFs, it's over and done with for this application. I'm just trying to get stuff loaded.
>>>
>>> Is there a limit to how much on-disk data a Cassandra daemon can manage? Is there runtime overhead associated with stuff on disk?
>>>
>>> Ian
>>>
>>> On Thu, May 20, 2010 at 9:31 PM, Ian Soboroff <isobor...@gmail.com> wrote:
>>>> Excellent leads, thanks. cassandra.in.sh has a heap of 6GB, but I didn't realize that I was trying to float so many memtables. I'll poke at it tomorrow and report whether that fixes it.
>>>>
>>>> Ian
>>>>
>>>> On Thu, May 20, 2010 at 10:40 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>>>> Some possibilities:
>>>>>
>>>>> - You didn't adjust the Cassandra heap size in cassandra.in.sh (1GB is too small).
>>>>> - You're inserting at CL.ZERO (ROW-MUTATION-STAGE in tpstats will show large pending ops -- large = 100s).
>>>>> - You're creating large rows a bit at a time, and Cassandra OOMs when it tries to compact them (the OOM should usually be in the compaction thread).
>>>>> - You have your 5 disks each with a separate data directory, which will allow up to 12 total memtables in flight internally, and 12*256MB is too much for the heap size you have (FLUSH-WRITER-POOL in tpstats will show large pending ops -- large = more than 2 or 3).
>>>>>
>>>>> On Tue, May 18, 2010 at 6:24 AM, Ian Soboroff <isobor...@gmail.com> wrote:
>>>>>> I hope this isn't too much of a newbie question. I am using Cassandra 0.6.1 on a small cluster of Linux boxes - 14 nodes, each with 8GB RAM and 5 data drives. The nodes are running HDFS to serve files within the cluster, but at the moment the rest of Hadoop is shut down. I'm trying to load a large set of web pages (the ClueWeb collection, but more is coming) and my Cassandra daemons keep dying.
>>>>>>
>>>>>> I'm loading the pages into a simple column family that lets me fetch out pages by an internal ID or by URL. The biggest thing in the row is the page content, maybe 15-20k per page of raw HTML. There aren't a lot of columns. I tried Thrift, Hector, and the BMT interface, and at the moment I'm doing batch mutations over Thrift, about 2500 pages per batch, because that was fastest for me in testing.
>>>>>>
>>>>>> At this point, each Cassandra node has between 500GB and 1.5TB according to nodetool ring. Let's say I start the daemons up, and they all go live after a couple minutes of scanning the tables. I then start my importer, which is a single Java process reading ClueWeb bundles over HDFS, cutting them up, and sending the mutations to Cassandra. I only talk to one node at a time, switching to a new node when I get an exception. As the job runs over a few hours, the Cassandra daemons eventually fall over, either with no error in the log or reporting that they are out of heap.
>>>>>>
>>>>>> Each daemon is getting 6GB of RAM and has scads of disk space to play with. I've set storage-conf.xml to take 256MB in a memtable before flushing (like the BMT case), to do batch commit log flushes, and to not have any caching in the CFs. I'm sure I must be tuning something wrong. I would eventually like this Cassandra setup to serve a light request load, but over say 50-100TB of data. I'd appreciate any help or advice you can offer.
>>>>>>
>>>>>> Thanks,
>>>>>> Ian
>>>>>
>>>>> --
>>>>> Jonathan Ellis
>>>>> Project Chair, Apache Cassandra
>>>>> co-founder of Riptano, the source for professional Cassandra support
>>>>> http://riptano.com
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
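Putting rough numbers on Jonathan's last point makes the failure mode concrete. The 2-3x multiplier below is an assumption (a commonly quoted rule of thumb for the in-heap overhead of 0.6-era memtables), not a figure from the thread:

    12 memtables in flight x 256MB flush threshold  = 3GB of serialized data
    x ~2-3x Java object overhead (assumed)          = 6-9GB of live heap
    against a 6GB maximum heap                      -> constant GC, then OOM

Even after rolling the threshold back to 128MB, the same budget is 3-4.5GB of heap for memtables alone, which is consistent with the heap still filling up under sustained load.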
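For the loader itself, here is a minimal sketch of the batch_mutate-over-Thrift pattern described above, with the 30-second back-off and node rotation Ian mentions, writing at CL.ONE rather than CL.ZERO per Jonathan's second point. It assumes the 0.6 Thrift API (string row keys, byte[] column names and values); the keyspace, column family, and host names are hypothetical stand-ins, and a real run would pack ~2500 pages into the mutation map before each call:

    import java.util.*;
    import org.apache.cassandra.thrift.*;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;

    public class PageLoader {
        // Hypothetical ring members; the loader talks to one node and rotates on failure.
        static final String[] NODES = { "node1", "node2", "node7" };
        static int current = 0;

        static Cassandra.Client connect() throws Exception {
            TSocket socket = new TSocket(NODES[current % NODES.length], 9160);
            socket.open();
            return new Cassandra.Client(new TBinaryProtocol(socket));
        }

        // One row's worth of mutations: row key -> CF name -> mutations.
        // "Pages" and "content" are made-up stand-ins for the real by-ID/by-URL CFs.
        static Map<String, Map<String, List<Mutation>>> mutationsFor(String docId, byte[] html) {
            long ts = System.currentTimeMillis() * 1000;  // microseconds, the usual convention
            ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
            cosc.setColumn(new Column("content".getBytes(), html, ts));
            Mutation m = new Mutation();
            m.setColumn_or_supercolumn(cosc);
            Map<String, List<Mutation>> byCf = new HashMap<String, List<Mutation>>();
            byCf.put("Pages", Arrays.asList(m));
            Map<String, Map<String, List<Mutation>>> batch =
                    new HashMap<String, Map<String, List<Mutation>>>();
            batch.put(docId, byCf);
            return batch;
        }

        static void sendWithRetry(Cassandra.Client client,
                                  Map<String, Map<String, List<Mutation>>> batch) throws Exception {
            while (true) {
                try {
                    // CL.ONE rather than CL.ZERO, so a struggling node pushes back
                    // instead of silently queueing mutations in memory.
                    client.batch_mutate("ClueWeb", batch, ConsistencyLevel.ONE);  // keyspace name is hypothetical
                    return;
                } catch (Exception e) {
                    Thread.sleep(30000);   // back off 30 seconds, as described in the thread
                    current++;             // then move to the next node
                    client = connect();    // with a fresh client (old socket abandoned for brevity)
                }
            }
        }
    }

The one deliberate change from the loader described above is the consistency level: at CL.ZERO the coordinator acknowledges before applying anything, so an overloaded node accumulates work invisibly until it OOMs, whereas CL.ONE surfaces the backpressure as a timeout the retry loop can react to.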