The multi get batches range from 100 to 200 keys. The tests I'm running need to do get_slices and then multigets on those results, so I can't turn either of them off.
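Since each key in a multiget becomes one message in the read pool, batches of 100-200 can fill the pool (32 threads by default) from a single request. One mitigation is to split the key list client-side. This is a minimal pure-Python sketch, not any client's actual API: `fetch_rows` is a hypothetical stand-in for whatever multiget call your Thrift client exposes.

```python
# Sketch: cap multiget fan-out by splitting a large key list into
# smaller batches, so a single request never floods a node's read
# thread pool (32 threads by default).
# `fetch_rows` is a hypothetical stand-in for the client's multiget
# call; it takes a list of keys and returns {key: row}.

def chunked(seq, size):
    """Yield successive slices of seq, each at most `size` long."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

def multiget_in_batches(fetch_rows, keys, batch_size=32):
    """Issue several small multigets instead of one 100-200 key request.

    Keeping batch_size at or below the read pool size leaves threads
    free to serve other clients' queries between batches.
    """
    results = {}
    for batch in chunked(keys, batch_size):
        results.update(fetch_rows(batch))
    return results
```

The trade-off is more round trips per logical multiget, in exchange for shorter queue residency per request and less starvation of concurrent readers.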
I was only setting 16 threads for reading, but I'll boost it up to 32 and see what happens.

On May 9, 2012, at 11:03 AM, aaron morton wrote:

> How big are the multi get batches?
>
> How do the wide row get_slice calls behave when the multi gets are not running?
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 9/05/2012, at 1:47 AM, Luís Ferreira wrote:
>
>> Maybe one of the problems is that I am reading both the columns in a row and the rows themselves in batches, using the count attribute in the SliceRange (and its counterpart in the KeyRange for rows) and advancing the start column or start key. According to your blog post, using a start key to read millions of rows/columns has a lot of latency, but how else can I read an entire row that does not fit into memory?
>>
>> I'll have to run some tests again and check the tpstats. Still, do you think that adding more machines to the cluster will help a lot? I ask because I started with a 3-node cluster and have scaled to a 5-node cluster with little improvement...
>>
>> Thanks anyway.
>>
>> On May 8, 2012, at 9:54 AM, aaron morton wrote:
>>
>>> If I was rebuilding my power after spending the first thousand years of the Third Age as a shapeless evil, I would cast my Eye of Fire in the direction of the filthy little multi_gets.
>>>
>>> A node can fail to respond to a query with rpc_timeout for two reasons: either the command did not run, or the command started but did not complete. The former is much more likely. If it is happening, you will see large pending counts and dropped messages in nodetool tpstats, and you will also see log entries about dropped messages.
>>>
>>> When you send a multi_get, each row you request becomes a message in the read thread pool. If you request 100 rows, you will put 100 messages in the pool, which by default has 32 threads.
>>> If some clients are sending large multi gets (or batch mutations), you can overload nodes and starve other clients.
>>>
>>> For background, here are some metrics for selecting from 10 million columns:
>>> http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/
>>>
>>> Hope that helps.
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 6/05/2012, at 7:14 AM, Luís Ferreira wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm doing get_slice on huge rows (3 million columns) and even though I am doing it iteratively, I keep getting TimeoutExceptions. I've tried changing the number of columns fetched, but it did not help.
>>>>
>>>> I have a 5-machine cluster, each node with 4GB of RAM, of which 3GB are dedicated to Cassandra's heap, but the nodes still consume all of the memory and get huge IO wait due to the amount of reads.
>>>>
>>>> I am running tests with 100 clients, all performing multiple operations, mostly get_slice, get and multi_get, but the timeouts only occur in the get_slice.
>>>>
>>>> Does this have anything to do with Cassandra's ability (or lack thereof) to keep the rows in memory? Or am I doing something wrong? Any tips?
>>>>
>>>> Cumprimentos,
>>>> Luís Ferreira
>>
>> Cumprimentos,
>> Luís Ferreira

Cumprimentos,
Luís Ferreira
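The iterative get_slice pagination discussed in the thread (a count limit plus a moving start column) can be sketched in pure Python. This is not a real client API: `slice_columns` is a hypothetical stand-in for the client's get_slice call, assumed to return columns in order starting at `start` (inclusive), at most `count` of them.

```python
# Sketch of wide-row pagination: read a row of millions of columns in
# pages of `count`, restarting each slice at the last column seen, so
# the whole row never has to fit in memory at once.
# `slice_columns(start, count)` is a hypothetical stand-in for
# get_slice; "" as start means "begin at the first column", and the
# start column is returned inclusively, so each page after the first
# overlaps the previous one by exactly one column.

def iterate_row(slice_columns, count=100):
    """Yield every column of a wide row, one page at a time."""
    start = ""                      # begin at the first column
    while True:
        page = slice_columns(start, count)
        if not page:
            return
        if start:
            page = page[1:]         # drop the overlapping start column
            if not page:
                return              # only the start column came back: done
        for col in page:
            yield col
        start = page[-1]            # resume from the last column seen
```

The inclusive-start overlap is why each page effectively advances by count - 1 columns; dropping the duplicate client-side keeps the yielded stream free of repeats.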