Sorry, I didn't answer your question in my response. At this point I have:

Key(ID)
    When/Where SuperColumn Tag: and Columns {Data: One Value (not yet written, tags, flags)}

Under some keys (very small #) there will be 2 values like:

Key(ID)
    When/Where SuperColumn Tag: and Columns {Data: One Value (not yet written, tags, flags)}
    When/Where SuperColumn Tag: and Columns {Data: One Value (not yet written, tags, flags)}

Long term this list will be in the 1000's, possibly millions.

-----Original Message-----
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: Tuesday, April 20, 2010 10:47 AM
To: user@cassandra.apache.org
Subject: Re: How to increase cassandra's performance in read?

How many columns are in the supercolumn total?

"in super columnfamilies there is a third level of subcolumns; these
are not indexed, and any request for a subcolumn deserializes _all_
the subcolumns in that supercolumn"

http://wiki.apache.org/cassandra/CassandraLimitations

On Tue, Apr 20, 2010 at 9:50 AM, Mark Jones <mjo...@imagehawk.com> wrote:
> I too am seeing very slow performance while testing worst case scenarios of
> 1 key leading to 1 supercolumn and 1 column beyond that.
>
> Key -> SuperColumn -> 1 Column (of ~ 500 bytes)
>
> Drive utilization is 80-90% and I'm only dealing with 50-70 million rows
> (with NO swapping). So far, I've found nothing that helps, including
> increasing the key cache from 200k to 500k keys. I'm guessing the hashing
> prevents better cache performance.
>
> Read performance is definitely not 3 IOs based on the utilization factors on
> my drives. I'm not sure the issue was ever settled in the previous e-mails
> as to how to calculate how many IOs were being done for each read. I've
> been testing with clusters of 1, 2, 3, or 4 machines, and so far all I'm
> seeing with multiple machines is lower performance in a cluster than alone.
> I keep assuming that at some number of nodes, the performance will begin to
> pick up. Three of my nodes are running with 8GB (6GB Java Heap), and one
> has 4GB (3GB Java Heap). The machine with the smallest memory footprint is
> the fastest performer on inserts, but definitely not the fastest on reads.
>
> I suspect the read path relies heavily on the assumption that you want to
> get many columns that are closely related, because lookup by key appears to
> be incredibly slow.
>
> From: yangfeng [mailto:yea...@gmail.com]
> Sent: Tuesday, April 20, 2010 7:59 AM
> To: user@cassandra.apache.org; d...@cassandra.apache.org
> Subject: How to increase cassandra's performance in read?
>
> I get 10 column families by key, and one column family has 30 columns.
>
> I use multigetSlice once to get the 10 column families, but the performance
> is so poor.
>
> Does anyone have other thoughts on how to increase the performance?
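
For reference, the CassandraLimitations passage quoted in this thread (subcolumns are not indexed, so a request for one subcolumn deserializes all of them) can be illustrated with a toy model. This is only a sketch under stated assumptions: the byte layout and the helper names serialize_subcolumns and read_subcolumn are hypothetical and are not Cassandra's on-disk format or API. It only shows why read cost grows with the total number of subcolumns in a supercolumn, which is why the total column count matters.

```python
import struct

# Toy model only: NOT Cassandra's real storage format.
# Subcolumns are packed back to back with no index, so finding one
# requires decoding every entry in the supercolumn.

def serialize_subcolumns(subcolumns):
    """Pack (name, value) byte pairs sequentially, without any index."""
    out = bytearray()
    for name, value in subcolumns:
        out += struct.pack(">H", len(name)) + name    # 2-byte name length
        out += struct.pack(">I", len(value)) + value  # 4-byte value length
    return bytes(out)

def read_subcolumn(blob, wanted):
    """Return (value, count of subcolumns deserialized to find it)."""
    decoded = {}
    pos = 0
    while pos < len(blob):
        (name_len,) = struct.unpack_from(">H", blob, pos)
        pos += 2
        name = blob[pos:pos + name_len]
        pos += name_len
        (value_len,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        decoded[name] = blob[pos:pos + value_len]
        pos += value_len
    # Every subcolumn was decoded, even though only one was requested.
    return decoded.get(wanted), len(decoded)

# One ~500-byte column, as in the worst-case test described above.
small = serialize_subcolumns([(b"data", b"x" * 500)])
# A hypothetical supercolumn holding 1000 subcolumns.
large = serialize_subcolumns(
    [(f"col{i}".encode(), b"x" * 500) for i in range(1000)]
)

_, touched_small = read_subcolumn(small, b"data")
_, touched_large = read_subcolumn(large, b"col7")
print(touched_small, touched_large)
```

In this model a supercolumn with a single subcolumn pays almost nothing extra, so for the 1-key, 1-supercolumn, 1-column case the limitation alone would not account for slow reads.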