#1 There are 2 levels of JMX metrics: one for each table and one related to the StorageProxy. Depending on which "readcount" you're looking at, the meaning can be different. Watch this video for more explanation: https://www.youtube.com/watch?v=w6aD4vAY_a8&index=14&list=PLqcm6qE9lgKJkxYZUOIykswDndrOItnn2

#2 Same answer as above, it depends on which level you're looking at. The per-table read latency only covers the local read on that node, while the StorageProxy read latency is measured on the coordinator and includes the time spent waiting for other replicas to satisfy the consistency level.

#3 It means that on a given level, each partition fits entirely in 1 SSTable. Because of that, you cannot have a partition whose head is in 1 SSTable and whose tail is in another SSTable. Consequently, when you're fetching a range of "CQL rows" from a partition, Cassandra only needs to touch 1 physical SSTable on disk on that level and read sequentially from the beginning of the partition. With STCS the partition may span several SSTables, so it will cost time to touch many files.

#4 It really depends on your data model. How do you perform your "changes"? Do you just issue "UPDATE" statements frequently, or do you use a time series pattern? The first scenario may cause performance issues because, until compaction catches up, you force C* to load a bunch of "old" values into memory and filter all of them to return the latest one. Have a look at these slides: http://www.slideshare.net/doanduyhai/cassandra-best-practices-and-worst-anti-patterns-meetup-in-germany (slides 23-28)

#5 Again, no universal answer, it all depends on your insertion rate, update frequency and data access pattern. Beware that leveled compaction eats a lot of your disk I/O. You'd better have enough I/O bandwidth, otherwise the result could be worse than SizeTiered. In theory, if your I/O can keep up with leveled compaction, the read performance will be steady.

Hope that helps

On Tue, Oct 21, 2014 at 5:24 AM, Jimmy Lin <y2klyf+w...@gmail.com> wrote:

> Hi,
> I have a column family/ table that has frequent update on one of the
> column, and one column that has infrequent update. Rest of the columns
> never changed. Our application also read frequently on this table.
>
> We have seen some read latency issue on this table and plan to switch to
> use level compaction on this table. Few questions:
>
> #1
> In Cassandra Server JMX, there is "readcount" attribute, what is consider
> a read? accessing a row will consider a read count?
>
> #2
> For JMX "read latency", does it include consistency level (fetching data
> from other nodes) or coordinator related work load?
>
> #3
> From the doc, level compaction stated it will guarantee that all sstables
> in same level are 'non-overlapping", what does it really mean? (trying to
> visualize how this can reduce read latency)
>
> #4
> If I change my select CQL query not to include the frequently changed
> column, will that improve read latency?
>
> #5
> How significant of the improvement of change compaction from sized to
> level? is it day and night difference? and it is true that while sized
> compaction latency will get worse, level compaction can give very
> consistent read latency for long long time?
>
>
> Thanks
>
>
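
PS: to help visualize the "non-overlapping" guarantee from #3, here is a toy Python sketch. It is only an illustration under simplifying assumptions, not how Cassandra is implemented: each SSTable is modelled as a (name, first_token, last_token) tuple, and the names, token values and the sstables_to_check helper are all made up for the example. It just counts how many SSTables could contain a given partition within one leveled-compaction level versus a set of size-tiered SSTables.

```python
def sstables_to_check(sstables, token):
    """Return the names of the SSTables whose [first, last] token range covers `token`."""
    return [name for name, first, last in sstables if first <= token <= last]

# Hypothetical single LCS level: token ranges never overlap,
# so a given partition can live in at most one SSTable of that level.
lcs_level = [("L1-a", 0, 99), ("L1-b", 100, 199), ("L1-c", 200, 299)]

# Hypothetical STCS SSTables: ranges overlap freely,
# so fragments of the same partition may sit in several files.
stcs = [("S-1", 0, 299), ("S-2", 0, 299), ("S-3", 50, 250)]

token = 150  # made-up token of the partition being read
print("LCS level candidates:", sstables_to_check(lcs_level, token))
print("STCS candidates:     ", sstables_to_check(stcs, token))
```

In this toy model the leveled case has a single candidate file on the level while the size-tiered case has three. A real leveled read may still have to check one SSTable per level, plus whatever has not yet been compacted out of the first level.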