Why use such a large batch size?
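
One alternative, as a rough sketch only: split the mutations into smaller batches and send them one call at a time. Here `send_batch` is a hypothetical stand-in for whatever client call performs the batch mutation, and the batch size of 1000 is an arbitrary assumption, not a tuned or recommended value.

```python
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def insert_in_batches(
    mutations: Sequence[T],
    send_batch: Callable[[Sequence[T]], None],  # hypothetical client call
    batch_size: int = 1000,                     # assumed value, not a recommendation
) -> int:
    """Send `mutations` in small batches and return the number of calls made."""
    calls = 0
    for batch in chunked(mutations, batch_size):
        send_batch(batch)
        calls += 1
    return calls
```

With the 124360 row mutations mentioned below and a batch size of 1000, this would make 125 calls instead of one.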

-ryan

On Thu, Mar 10, 2011 at 6:31 AM, Desimpel, Ignace
<ignace.desim...@nuance.com> wrote:
>
>
> Hello,
>
> I had a demo application with embedded Cassandra version 0.6.x, inserting
> about 120 K row mutations in one call.
>
> In version 0.6.x that usually took about 5 seconds, and I could repeat this
> step, adding the same amount of data each time.
>
> That was on a single-CPU computer with a single hard disk, a 32-bit XP OS, and 1 GB of memory.
>
> I tested this again on a 64-bit CentOS OS with 6 GB of memory and different
> settings of memtable_throughput_in_mb and memtable_operations_in_millions.
>
> I also tried version 0.7.3, with the same behavior.
>
>
>
> Now with version 0.7.2 the call returns with a timeout exception, even with
> a timeout of 120000 ms (2 minutes). I see CPU usage going to 100%, a lot of
> disk writing (gigabytes), and a lot of log messages about compacting,
> flushing, commitlog, …
>
>
>
> Below you can find some information from nodetool at the start of the batch
> mutation and again after 14 minutes. The MutationStage line clearly shows how
> slowly the system handles the row mutations.
>
>
>
> Attached: Cassandra.yaml, with the description of my database structure
> (in YAML) at the end.
>
> Attached: log file with Cassandra output.
>
>
>
> Any idea what I could be doing wrong?
>
>
>
> Regards,
>
>
>
> Ignace Desimpel
>
>
>
> ignace.desim...@nuance.com
>
>
>
> At the start of the insert (after inserting 124360 row mutations), I get the
> following info from nodetool:
>
>
>
> C:\apache-cassandra-07.2\bin>nodetool --host ads.nuance.com info
>
> Starting NodeTool
> 34035877798200531112672274220979640561
> Gossip active    : true
> Load             : 5.49 MB
> Generation No    : 1299502115
> Uptime (seconds) : 1152
> Heap Memory (MB) : 179,84 / 1196,81
>
>
>
> C:\apache-cassandra-07.2\bin>nodetool --host ads.nuance.com tpstats
>
> Starting NodeTool
> Pool Name                    Active   Pending      Completed
> ReadStage                         0         0          40637
> RequestResponseStage              0         0             30
> MutationStage                    32    121679          72149
> GossipStage                       0         0              0
> AntiEntropyStage                  0         0              0
> MigrationStage                    0         0              1
> MemtablePostFlusher               0         0              6
> StreamStage                       0         0              0
> FlushWriter                       0         0              5
> MiscStage                         0         0              0
> FlushSorter                       0         0              0
> InternalResponseStage             0         0              0
> HintedHandoff                     0         0              0
>
>
>
> After 14 minutes (timeout exception after 2 minutes; see log file), I get:
>
>
>
> C:\apache-cassandra-07.2\bin>nodetool --host ads.nuance.com info
>
> Starting NodeTool
> 34035877798200531112672274220979640561
> Gossip active    : true
> Load             : 10.31 MB
> Generation No    : 1299502115
> Uptime (seconds) : 2172
> Heap Memory (MB) : 733,82 / 1196,81
>
>
>
> C:\apache-cassandra-07.2\bin>nodetool --host ads.nuance.com tpstats
>
> Starting NodeTool
> Pool Name                    Active   Pending      Completed
> ReadStage                         0         0          40646
> RequestResponseStage              0         0             30
> MutationStage                    32    103310          90526
> GossipStage                       0         0              0
> AntiEntropyStage                  0         0              0
> MigrationStage                    0         0              1
> MemtablePostFlusher               0         0             69
> StreamStage                       0         0              0
> FlushWriter                       0         0             68
> FILEUTILS-DELETE-POOL             0         0             42
> MiscStage                         0         0              0
> FlushSorter                       0         0              0
> InternalResponseStage             0         0              0
> HintedHandoff                     0         0              0
>
>
>
>
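
For a sense of scale, the two snapshots above can be turned into a rough drain rate for MutationStage. The sketch below only uses the Completed, Pending, and Uptime figures quoted from nodetool; the function name is made up for illustration, and the estimate assumes the rate stays constant and nothing new is queued, which may well not hold.

```python
def drain_estimate(completed_start: int, completed_end: int,
                   uptime_start_s: int, uptime_end_s: int,
                   pending_end: int) -> tuple[float, float]:
    """Return (mutations per second, seconds needed to drain pending_end)."""
    elapsed_s = uptime_end_s - uptime_start_s
    if elapsed_s <= 0:
        raise ValueError("second snapshot must be later than the first")
    rate = (completed_end - completed_start) / elapsed_s
    # With no progress between snapshots the queue would never drain.
    remaining_s = pending_end / rate if rate > 0 else float("inf")
    return rate, remaining_s


# MutationStage figures from the two tpstats/info snapshots quoted above.
rate, remaining_s = drain_estimate(
    completed_start=72149, completed_end=90526,
    uptime_start_s=1152, uptime_end_s=2172,
    pending_end=103310,
)
print(f"{rate:.1f} mutations/s, about {remaining_s / 60:.0f} minutes to drain")
```

That works out to roughly 18 mutations per second, compared with the roughly 24,000 per second implied by the 0.6.x figure (about 120 K in about 5 seconds). At that pace the remaining queue would need around an hour and a half.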
