Benjamin Black <b <at> b3k.us> writes:
>
> I am only saying something obvious: if you don't have sufficient
> resources to handle the demand, you should reduce demand, increase
> resources, or expect errors. Doing lots of writes without much heap
> space is such a situation (whether or not it is happening in this
> instance), but there are many others. This constraint is not specific
> to Cassandra. Hence, there is no free lunch.
>
> b

I guess my point is that I have rarely run across database servers that die from either too many client connections or too rapid client requests. They generally stop accepting incoming connections when there are too many connection requests, and they do not queue and acknowledge an unbounded number of client requests on any given connection.

In the example at hand, Julie has 8 clients, each of which is in a loop that writes 100 rows at a time (via batch_mutate), waits for successful completion, then writes another batch of 100, until it has written all of the rows it is supposed to write (typically 100,000). So at any one time, each client should have about 10 MB of request data outstanding (100 rows x 100 KB/row), times 8 clients, for a total of no more than about 80 MB of pending requests. Further, each request runs with CL=ALL, so in theory the request should not complete until each row has been handed off to the ultimate destination node, and perhaps written to the commit log (that part is not clear to me).

It sounds like something else must be gobbling up either an unbounded amount of heap or, alternatively, a bounded but large amount of heap. In the former case it is unclear how to make the application robust. In the latter, it would be helpful to understand what the upper bound on heap usage is, and which parameters might have a significant effect on that value.

To clarify the history here -- initially we were writing with CL=0 and had great performance, but ended up killing the server. It was pointed out that we were really asking the server to accept and acknowledge an unbounded number of requests without waiting for any final disposition of the rows. So we had a "doh!" moment. That is why we went to the other extreme of CL=ALL, to let the server fully dispose of each request before acknowledging it and accepting the next.

TIA
-- Charlie
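
P.S. To make the 80 MB estimate above concrete, here is a rough sketch in Python. It is only a back-of-envelope model: the function names are made up for illustration, and send_batch is a hypothetical stand-in for a blocking batch_mutate call, not a real client API. It counts only the request payload as sent by the clients; whatever the server allocates on top of that is not modeled.

```python
# Figures taken from the post; everything else is an illustrative assumption.
ROWS_PER_BATCH = 100   # rows per batch_mutate call
ROW_SIZE_KB = 100      # approximate size of one row
NUM_CLIENTS = 8        # each client waits for its batch before sending the next


def max_pending_mb(rows_per_batch, row_size_kb, num_clients):
    """Upper bound on unacknowledged request payload, in MB (1 MB = 1000 KB)."""
    per_client_kb = rows_per_batch * row_size_kb
    return per_client_kb * num_clients / 1000


def write_all(total_rows, rows_per_batch, row_size_kb, send_batch):
    """Synchronous write loop: send one batch, wait for the ack, repeat.

    send_batch is a hypothetical callable that blocks until the server
    acknowledges the batch. Returns (rows_sent, peak_in_flight_kb).
    """
    sent = 0
    peak_kb = 0
    while sent < total_rows:
        n = min(rows_per_batch, total_rows - sent)
        peak_kb = max(peak_kb, n * row_size_kb)  # only one batch is ever in flight
        send_batch(n)
        sent += n
    return sent, peak_kb


if __name__ == "__main__":
    print(max_pending_mb(ROWS_PER_BATCH, ROW_SIZE_KB, NUM_CLIENTS), "MB pending at most")
    print(write_all(100_000, ROWS_PER_BATCH, ROW_SIZE_KB, lambda n: None))
```

Because each client blocks on its batch before sending the next, the peak per client never exceeds one batch, which is where the bound comes from. With CL=0 there is no such wait, so this model no longer applies.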