Sorry for the delayed reply, but thanks very much - this pointed me at the exact problem. I found that the queue size here was equal to the number of configured DataFileDirectories, so a good test was to lie to Cassandra and claim that there were more DataFileDirectories than I needed. Interestingly, it still only ever wrote to the first configured DataFileDirectory, but it certainly eliminated the problem, which I think means that for my use case at least, it will be good enough to patch Cassandra to introduce more control of the queue size.
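
Concretely, the "lie" was just padding the DataFileDirectories list in
storage-conf.xml with extra entries (the paths below are placeholders
only); each extra entry adds another slot to the flush queue, even
though, as mentioned, data only ever landed in the first directory:

    <DataFileDirectories>
        <DataFileDirectory>/var/lib/cassandra/data</DataFileDirectory>
        <!-- Extra entries added only to enlarge the flush queue;
             data still only went to the first directory. -->
        <DataFileDirectory>/var/lib/cassandra/data-pad1</DataFileDirectory>
        <DataFileDirectory>/var/lib/cassandra/data-pad2</DataFileDirectory>
    </DataFileDirectories>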

On 08/01/11 18:20, Peter Schuller wrote:
[multiple active cf:s, often triggering flush at the same time]

Can anyone confirm whether or not this behaviour is expected, and
suggest anything that I could do about it? This is on 0.6.6, by the way.
Patched with time-to-live code, if that makes a difference.

I looked at the code (trunk though, not 0.6.6) and was a bit
surprised. There seems to be a single shared (static) executor for the
sorting and writing stages of memtable flushing (so far so good). But
what I didn't expect was that they seem to have a work queue of a size
equal to the concurrency.

In the case of the writer, the concurrency is the
memtable_flush_writers option (not available in 0.6.6). For the
sorter, it is the number of CPU cores on the system. This makes sense
for the concurrency aspect.
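
To make the effect concrete, here is a minimal, hypothetical sketch in
plain java.util.concurrent - not the actual Cassandra executors, and
the blocking rejection handler is my assumption about how a full queue
shows up - of a pool whose work queue capacity equals its concurrency.
Once every worker is busy and one task per worker is queued, further
flush submissions stall:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.RejectedExecutionHandler;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class BoundedFlushQueueSketch {
        public static void main(String[] args) throws InterruptedException {
            int concurrency = Runtime.getRuntime().availableProcessors();

            // Work queue capacity equal to the concurrency, which is the
            // shape of the flushSorter/flushWriter setup described above.
            ThreadPoolExecutor flushSorter = new ThreadPoolExecutor(
                    concurrency, concurrency, 60, TimeUnit.SECONDS,
                    new ArrayBlockingQueue<Runnable>(concurrency),
                    new RejectedExecutionHandler() {
                        // Block the submitter when the queue is full; this
                        // is where simultaneous flushes from many cf:s stall.
                        public void rejectedExecution(Runnable task,
                                                      ThreadPoolExecutor pool) {
                            try {
                                pool.getQueue().put(task);
                            } catch (InterruptedException e) {
                                Thread.currentThread().interrupt();
                            }
                        }
                    });

            // Submit more "flushes" than workers plus queue slots: the extra
            // submissions block in the handler above instead of queueing up.
            for (int cf = 0; cf < concurrency * 3; cf++) {
                flushSorter.execute(new Runnable() {
                    public void run() {
                        try {
                            Thread.sleep(200);
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                });
            }
            flushSorter.shutdown();
            flushSorter.awaitTermination(1, TimeUnit.MINUTES);
        }
    }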

If my understanding is correct and I am not missing something else,
this means that with multiple column families you should indeed expect
to hit this problem. The more column families, the greater the
probability.

What I expected to find was that each cf would be guaranteed to have
at least one memtable in the queue before writes would block for that
cf.

Assuming the same holds true in your case on 0.6.6 (it looks to be so
on the 0.6 branch by quick examination), I would have to assume that
either one of the following is true:

(1) You have more cf:s actively written to than the number of CPU
cores on your machine so that you're waiting on flushSorter.
   or
(2) Your write speed is overall higher than what can be sustained by
an sstable writer.

If you are willing to patch Cassandra and do the appropriate testing,
and are fine with the implications on heap size, you should be able to
work around this by adjusting the size of the work queues for the
flushSorter and flushWriter in ColumnFamilyStore.java.

Note that I did not test this, so proceed with caution if you do.
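
Purely as an illustration of the kind of change I mean (an untested
sketch against no particular revision, with a made-up queueSize
parameter - it is not the actual ColumnFamilyStore.java), the
adjustment amounts to decoupling the queue capacity from the
concurrency:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class FlushQueueSizeSketch {
        // queueSize is a hypothetical knob; currently the capacity is
        // effectively the same as the concurrency.
        static ThreadPoolExecutor makeFlushWriter(int concurrency,
                                                  int queueSize) {
            return new ThreadPoolExecutor(
                    concurrency, concurrency, 60, TimeUnit.SECONDS,
                    new ArrayBlockingQueue<Runnable>(queueSize));
        }

        public static void main(String[] args) {
            // e.g. allow two queued memtables per writer thread
            ThreadPoolExecutor flushWriter = makeFlushWriter(2, 4);
            flushWriter.shutdown();
        }
    }

Keep in mind that every extra queue slot is another full memtable
sitting on the heap while it waits to be written out, which is where
the heap-size implication below comes from.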

It will definitely mean that you will eat more heap space if you
submit writes to the cluster faster than they are processed. So in
particular, if you're relying on backpressure mechanisms to avoid
causing problems when you do non-rate-limited writes to the cluster,
this change will probably make things worse.

I'll file a bug about this to (1) elicit feedback in case I'm wrong,
and (2) get it fixed.


--
Andy Burgess
Principal Development Engineer
Application Delivery
WorldPay Ltd.
270-289 Science Park, Milton Road
Cambridge, CB4 0WE, United Kingdom (Depot Code: 024)
Office: +44 (0)1223 706 779| Mobile: +44 (0)7909 534 940
andy.burg...@worldpay.com
