Re: frequently update/read table and level compaction

DuyHai Doan Thu, 23 Oct 2014 07:55:06 -0700

#1
There are two levels of JMX metrics: the per-table ones and the ones
related to the StorageProxy. Depending on which "readcount" you're looking
at, the meaning can be different. Watch this video for more explanation:
https://www.youtube.com/watch?v=w6aD4vAY_a8&index=14&list=PLqcm6qE9lgKJkxYZUOIykswDndrOItnn2

#2 Same answer as above: it depends on which level you're looking at.

#3
It means that on a given level, each partition fits entirely in 1 SSTable.
Because of that, you cannot have a partition whose head is in 1 SSTable and
whose tail is in another SSTable of the same level. Consequently, when
you're fetching a range of "CQL rows" from a partition, Cassandra needs to
touch at most 1 physical SSTable per level on disk and can read
sequentially from the beginning of the partition. With STCS the partition
may span several SSTables, so it costs time to touch many files.
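
To help visualize the difference, here is a toy sketch in Python. It is
not how Cassandra is implemented: the SSTable names and key ranges are made
up, plain partition keys stand in for tokens, and bloom filters are
ignored. It only counts how many SSTables could hold a given partition.

```python
# Toy model: each SSTable is described by the (first, last) partition
# keys it covers. Names and ranges below are hypothetical.
def sstables_to_touch(sstables, key):
    """Return the names of the SSTables whose key range may contain `key`."""
    return [name for name, (first, last) in sstables.items()
            if first <= key <= last]

# Size-tiered layout: key ranges overlap freely.
stcs = {
    "st-1": ("a", "z"),
    "st-2": ("c", "x"),
    "st-3": ("b", "q"),
    "st-4": ("m", "p"),
}

# One level under leveled compaction: ranges do not overlap.
lcs_level = {
    "l1-1": ("a", "f"),
    "l1-2": ("g", "l"),
    "l1-3": ("m", "r"),
    "l1-4": ("s", "z"),
}

stcs_hits = sstables_to_touch(stcs, "n")
lcs_hits = sstables_to_touch(lcs_level, "n")
print(len(stcs_hits), "candidate SSTables with overlapping ranges")
print(len(lcs_hits), "candidate SSTable within the non-overlapping level")
```

In this made-up layout the partition "n" may live in all four size-tiered
SSTables but in only one SSTable of the leveled level.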

#4
It really depends on your data model. How do you perform your "changes"? Do
you just issue "UPDATE" statements frequently, or do you use a time-series
pattern? The first scenario may cause performance issues because, until
compaction catches up, you force C* to load a bunch of "old" values into
memory and filter through all of them to return the latest one. Have a look
at these slides:
http://www.slideshare.net/doanduyhai/cassandra-best-practices-and-worst-anti-patterns-meetup-in-germany
(slides 23-28)
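
A rough Python sketch of that effect, under simplifying assumptions: each
SSTable is a plain dict, each cell is a (timestamp, value) pair, and the
function name, row key and values are invented for illustration. The read
has to gather every version that has not been compacted away yet and keep
the one with the highest timestamp.

```python
def read_latest(sstables, key, column):
    """Gather every stored version of a cell, keep the newest one.

    Returns (value, number_of_versions_read).
    """
    versions = [
        table[key][column]
        for table in sstables
        if key in table and column in table[key]
    ]
    if not versions:
        return None, 0
    # Each cell is (timestamp, value); the highest timestamp wins.
    latest = max(versions, key=lambda cell: cell[0])
    return latest[1], len(versions)

# Hypothetical row updated four times, each update flushed to a
# different SSTable before compaction had a chance to merge them.
sstables = [
    {"user1": {"status": (1, "new")}},
    {"user1": {"status": (2, "active")}},
    {"user1": {"status": (3, "idle")}},
    {"user1": {"status": (4, "active")}},
]

value, versions_read = read_latest(sstables, "user1", "status")
print(value, "returned after reading", versions_read, "versions")
```

Once compaction merges those SSTables, only the newest version remains and
the same read has a single value to look at.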

#5
Again, there is no universal answer; it all depends on your insertion
rate, update frequency and data access pattern. Beware that leveled
compaction eats a lot of your disk I/O. You'd better have enough I/O
bandwidth, otherwise the result could be worse than with SizeTiered.

 In theory, if your I/O can keep up with leveled compaction, read
performance will stay steady.

 Hope that helps

On Tue, Oct 21, 2014 at 5:24 AM, Jimmy Lin <y2klyf+w...@gmail.com> wrote:

> Hi,
> I have a column family/ table that has frequent update on one of the
> column, and one column that has infrequent update. Rest of the columns
> never changed. Our application also read frequently on this table.
>
> We have seen some read latency issue on this table and plan to switch to
> use level compaction on this table. Few questions:
>
> #1
> In Cassandra Server JMX, there is "readcount" attribute, what is consider
> a read? accessing a row will consider a read count?
>
> #2
> For JMX "read latency", does it include consistency level (fetching data
> from other nodes) or coordinator related work load?
>
> #3
> From the doc, level compaction stated it will guarantee that all sstables
> in same level are 'non-overlapping", what does it really mean? (trying to
> visualize how this can reduce read latency)
>
> #4
> If I change my select CQL query not to include the frequently changed
> column, will that improve read latency?
>
> #5
> How significant of the improvement of change compaction from sized to
> level? is  it day and night difference? and it is true that while sized
> compaction latency will get worse, level compaction can give very
> consistent read latency for long long time?
>
>
> Thanks
>
>
