To reduce the number of SSTables, increase the memtable_threshold for the CF.
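If I remember the 0.7/0.8 per-CF attributes right, that is memtable_throughput / memtable_operations in cassandra-cli, so something along these lines (a sketch only; the 256 MB and 2 million figures are just illustrative, and the attribute names vary a little by version, so check "help update column family;" on yours first):

    update column family PUBLIC_MONTHLY
        with memtable_throughput=256
        and memtable_operations=2;

memtable_throughput is roughly the MB of data the memtable holds before it is flushed, and memtable_operations is the number of writes (in millions) before a flush. Bigger memtables mean fewer flushes, so fewer (but larger) SSTables for each read to touch.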
The IO numbers may be because of compaction kicking in. The CompactionManager provides information via JMX on its progress, or you can check the logs. You could increase the min_compaction_threshold for the CF, or disable compaction during the bulk load if you want to (there's a CLI sketch at the bottom of this mail). In your case that's probably a bad idea, as every write requires a read and compaction will help the read performance.

These numbers show that about 37GB of data was written to disk, but compaction has shrunk this down to about 7GB. If you are not doing deletes then you are doing a lot of overwrites:

> Space used (live): 8283980539
> Space used (total): 39972827945

I'm getting a bit confused about what the problem is. Is it increasing latency for read requests, or the unexplained IO?

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 12 Jun 2011, at 08:55, Philippe wrote:

> More info below
>
> I just loaded 4.8GB of similar data in another keyspace and ran the same
> process as in my previous tests, but on that data.
> I started with three threads hitting Cassandra. No I/O, hardly any CPU (15%
> on a 4 core server).
> After an hour or so, I raised it to 6 threads in parallel, then to 9 threads
> in parallel.
>
> I never got any IO; in fact iostat showed me there weren't any disk reads. I
> hardly saw the CPU elevate except at the end.
>
> The only difference between the two datasets is that the size of the other
> one is 8.4GB, so the second one doesn't fit completely in memory. So my woes
> are related to how well Cassandra is fetching the data in the SSTables, right?
>
> So what are my options? My rows are very small at the moment (well < 4
> kBytes). Should I reduce the read buffer? Should I reduce the number of
> SSTables?
> I'm reloading the data from scratch using the incremental update I will be
> using in production. I'm hitting the cluster with 6 threads. Yes, I should
> have decreased it, but I was too optimistic and I don't want to stop it now.
> The data I'm loading is used for computing running averages (sum & total), and
> so every update requires a read.
>
> As soon as the data no longer fits in memory, I'm seeing huge amounts of IO
> (almost 380MB/s reading) that I'd like to understand.
> My hypothesis is that because I am updating the same key so many times
> (dozens if not hundreds of times in some cases), the row is split across the
> SSTables and every read needs to go through all the SSTables.
> Unfortunately, at some point, Cassandra compacted the keys from 5 tables to 3
> and the throughput did not increase after that, so I'm not even sure this
> makes sense.
> 1) Is there another explanation? Can I do something about this?
> 2) Why is the read latency displayed twice in "cfstats" and why does it differ?
>
> Thanks
>
> vmstat
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>   r  b   swpd   free   buff    cache   si   so     bi    bo    in    cs us sy id wa
>   0  7  47724  94604  22556  7373252    0    0 391646  2152 10498 10297  6  6 26 62
>   0  6  47724  93736  22556  7372400    0    0 396774     0 10881 11177  5  6 29 60
>   2  5  47724  92496  22556  7374824    0    0 372406    15 11212 11149  8  7 25 59
>   0  5  47724  89520  22568  7378484    0    0 399730   526 10964 11975  6  7 24 63
>   0  7  47724  87908  22568  7379444    0    0 396216     0 10405 10880  5  7 22 66
>
> iostat -dmx 2
> Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s avgrq-sz avgqu-sz   await r_await w_await   svctm   %util
> sdb       168.50    0.00 3152.00    0.50  185.49    0.25   120.66    54.86   17.39   17.39    4.00    0.31   96.80
> sda       178.50    0.50 3090.50    0.50  184.47    0.19   122.35    61.16   19.71   19.71    4.00    0.31   97.20
> md1         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00    0.00    0.00    0.00
> md5         0.00    0.00 6612.50    1.50  372.82    0.44   115.58     0.00    0.00    0.00    0.00    0.00    0.00
> dm-0        0.00    0.00 6612.50    0.00  372.82    0.00   115.47   123.15   18.58   18.58    0.00    0.15   97.20
>
> cfstats
> Read Count: 88215069
> Read Latency: 1.821254258759351 ms.
> Write Count: 88215059
> Write Latency: 0.013311765885686253 ms.
> Pending Tasks: 0
> Column Family: PUBLIC_MONTHLY
> SSTable count: 3
> Space used (live): 8283980539
> Space used (total): 39972827945
> Memtable Columns Count: 449201
> Memtable Data Size: 21788245
> Memtable Switch Count: 72
> Read Count: 88215069
> Read Latency: 7.433 ms.
> Write Count: 88215069
> Write Latency: 0.016 ms.
> Pending Tasks: 0
> Key cache capacity: 2000000
> Key cache size: 2000000
> Key cache hit rate: 2.93152540950838E-4
> Row cache capacity: 50000
> Row cache size: 50000
> Row cache hit rate: 0.20131872518304333
> Compacted row minimum size: 125
> Compacted row maximum size: 17084
> Compacted row mean size: 176
>
> tpstats
> Pool Name                Active  Pending      Completed
> ReadStage                     6        7       88200770
> RequestResponseStage          0        0              0
> MutationStage                 0        0       88200760
> ReadRepairStage               0        0              0
> GossipStage                   0        0              0
> AntiEntropyStage              0        0              0
> MigrationStage                0        0              0
> MemtablePostFlusher           0        0             74
> StreamStage                   0        0              0
> FlushWriter                   0        0             74
> FILEUTILS-DELETE-POOL         0        0            358
> MiscStage                     0        0              0
> FlushSorter                   0        0              0
> InternalResponseStage         0        0              0
> HintedHandoff                 0        0              0
>
> cfhistograms
> Offset  SSTables  Write Latency  Read Latency  Row Size  Column Count
>      1     27628              0             0         0      29101531
>      2     12405              3             9         0       6677734
>      3         0             30           130         0       2477249
>      4         0            191           451         0        391129
>      5         0            414           990         0         51983
>      6         0            644          1587         0          4041
>      7         0           1131          1957         0           710
>      8         0           1328          2291         0           220
>     10         0           1663          5515         0           131
>     12         0           1349          6133         0            24
>     14         0           1557          6461         0             5
>     17         0            880          8911         0             0
>     20         0            594          6685         0             0
>     24         0           1099          5316         0             0
>     29         0           2044          2283         0             0
>     35         0           2901           474         0             0
>     42         0           2737            66         0             0
>     50         0            876            10         0             0
>     60         0            192             4         0             0
>     72         0            361             0         0             0
>     86         0            771             0         0             0
>    103         0           1486             0         0             0
>    124         0           2285             0         0            11
>    149         0           2618             0  24141265             0
>    179         0           2438             0   1791177             0
>    215         0           1909             0   5326712             0
>    258         0           1008             0   1841927             0
>    310         0            420             0   2298870             0
>    372         0            105             0   1234895             0
>    446         0              3             0   1036341             0
>    535         0              3             0    553275             0
>    642         0              2             0    273339             0
>    770         0              1             0    144826             0
>    924         0              1             0     48619             0
>   1109         0              1             0     10275             0
>   1331         0              1             0      2134             0
>   1597                        5             0       637             0
>   1916                        5             0       291             0
>   2299                       10             0       115             0
>   2759                       15             0        42             0
>   3311                       51             0        16             0
>   3973                       86             0         1             0
>   4768                      100             0         0             0
>   5722                      165             0         0             0
>   6866                      255             0         0             0
>   8239                      412             0         0             0
>   9887                      556             0         0             0
>  11864                      945             0         0             0
>  14237                     1358             0         6             0
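PS: the compaction tuning I mentioned above would look something like this in cassandra-cli (again just a sketch: attribute names from memory for 0.7/0.8 and the numbers are only illustrative, so check "help update column family;" on your version first):

    update column family PUBLIC_MONTHLY
        with min_compaction_threshold=8
        and max_compaction_threshold=32;

A higher min_compaction_threshold makes minor compactions wait for more similarly sized SSTables before they run. To stop minor compaction entirely for the duration of the bulk load, I believe setting both thresholds to 0 does it:

    update column family PUBLIC_MONTHLY
        with min_compaction_threshold=0
        and max_compaction_threshold=0;

As above though, with every write needing a read you probably want to leave compaction running.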