In addition to what Jeff mentioned, there was an optimization in 3.4 that can significantly reduce the number of sstables accessed when a LIMIT clause is used. This can be a pretty big win with TWCS.
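To make the idea concrete, here is a toy model of a newest-first read with a LIMIT that stops visiting sstables once the remaining ones cannot contribute. It is only a sketch of the concept, not Cassandra's read path: `ToySSTable` and `read_newest` are hypothetical names, and the 80 twelve-hour windows with one row per hour are made-up numbers.

```python
from dataclasses import dataclass, field


@dataclass
class ToySSTable:
    """Hypothetical stand-in for an sstable: clustering bounds plus its rows."""
    min_clustering: int
    max_clustering: int
    rows: list = field(default_factory=list)  # clustering values held in this sstable


def read_newest(sstables, limit):
    """Return (rows, sstables_touched) for a newest-first read with a LIMIT."""
    result = []
    touched = 0
    # Visit the sstables holding the newest data first.
    for table in sorted(sstables, key=lambda t: t.max_clustering, reverse=True):
        # Stop once the limit is met and this sstable cannot hold anything
        # newer than the oldest row already selected.
        if len(result) >= limit and table.max_clustering < result[limit - 1]:
            break
        touched += 1
        result = sorted(result + table.rows, reverse=True)[:limit]
    return result, touched


# 80 twelve-hour windows, one row per hour in each (toy numbers).
tables = [
    ToySSTable(i * 12, i * 12 + 11, list(range(i * 12, i * 12 + 12)))
    for i in range(80)
]
rows, touched = read_newest(tables, limit=20)
print(touched, "sstables touched for", len(rows), "rows")
```

With non-overlapping windows, as TWCS tends to produce, the loop in this model stops after the second sstable instead of merging all 80.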
http://thelastpickle.com/blog/2017/03/07/The-limit-clause-in-cassandra-might-not-work-as-you-think.html

On Thu, Jan 31, 2019 at 5:50 PM Jeff Jirsa <jji...@gmail.com> wrote:

> In my original TWCS talk a few years back, I suggested that people make
> the partitions match the time window to avoid exactly what you’re
> describing. I added that to the talk because my first team that used TWCS
> (the team for which I built TWCS) had a data model not unlike yours, and
> the read-every-sstable thing turns out not to work that well if you have
> lots of windows (or very large partitions). If you do this, you can fan out
> a bunch of async reads for the first few days and ask for more as you need
> to fill the page - this means the reads are more distributed, too, which is
> an extra bonus when you have noisy partitions.
>
> In 3.0 and newer (I think, don’t quote me on the specific version), the
> sstable metadata has the min and max clustering, which helps exclude
> sstables from the read path quite well if everything in the table is using
> timestamp clustering columns. I know there was some issue with this and RTs
> recently, so I’m not sure of its current state, but it is worth considering
> that this may be much better on 3.0+
>
> --
> Jeff Jirsa
>
> On Jan 31, 2019, at 1:56 PM, Carl Mueller
> <carl.muel...@smartthings.com.invalid> wrote:
>
> > Situation:
> >
> > We use TWCS for a task history table (partition is user, column key is
> > timeuuid of task, TWCS is used due to tombstone TTLs that rotate out the
> > tasks every, say, month).
> >
> > However, we want to get a "slice" of tasks (say, tasks in the last two
> > days, and we are using TWCS sstable blocks of 12 hours).
> >
> > The problem is, this is a frequent user and they have tasks in ALL the
> > sstables that are organized by the TWCS into time-bucketed sstables.
> >
> > So Cassandra has to first read in, say, 80 sstables to reconstruct the row,
> > THEN it can exclude/slice on the column key.
> >
> > Question:
> >
> > Or am I wrong that the read path needs to grab all relevant sstables before
> > applying column key slicing and this is possible? Admittedly we are on 2.1
> > for this table (we are in the process of upgrading now that we have an
> > automated upgrading program that seems to work pretty well).
> >
> > If my assumption is correct, then the compaction strategy knows as it
> > writes the sstables what it is bucketing them as (and could encode in
> > sstable metadata?). If my assumption about slicing is that the whole row
> > needs reconstruction, if we had a perfect infinite monkey coding team that
> > could generate whatever we wanted within some feasibility, could we provide
> > special hooks to do sstable exclusion based on metadata if we know that
> > the metadata will indicate exclusion/inclusion of columns based on
> > metadata?
> >
> > Goal:
> >
> > The overall goal would be to support exclusion of sstables from a read
> > path, in case we had compaction strategies hand-tailored for other queries.
> > Essentially we would be doing a first-pass bucketsort exclusion with the
> > sstable metadata marking the buckets. This might aid support of superwide
> > rows and paging through column keys if we allowed the table creator to
> > specify bucketing as flushing occurs. In general it appears query
> > performance quickly degrades based on # sstables required for a lookup.
> >
> > I still don't know the code nearly well enough to do patches, but it would
> > seem, based on my looking at custom compaction strategies and the basic read
> > path, that this would be a useful extension for advanced users.
> >
> > The fallback would be a set of tables to serve as buckets and we span the
> > buckets with queries when one bucket runs out. The tables rotate.
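
The bucket-spanning approach discussed in the quoted thread (partitions that match the time window, read newest-first until the page fills) might look roughly like this on the client side. This is a minimal sketch under assumptions: `bucket_of`, `fill_page`, and `fetch_bucket` are hypothetical names, the 12-hour window is taken from the thread, and the in-memory `store` stands in for real queries against a table keyed by (user, bucket).

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(hours=12)  # assumed to match the TWCS window from the thread
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_of(ts):
    """Index of the time window containing ts; would be part of the partition key."""
    return int((ts - EPOCH) / WINDOW)


def fill_page(fetch_bucket, user, newest, page_size, max_buckets=60):
    """Walk buckets newest-first until page_size rows are collected.

    fetch_bucket(user, bucket) is a hypothetical callable returning that
    partition's rows, newest first; a real client could issue several of
    these reads asynchronously instead of one at a time.
    """
    rows = []
    start = bucket_of(newest)
    for bucket in range(start, start - max_buckets, -1):
        rows.extend(fetch_bucket(user, bucket))
        if len(rows) >= page_size:
            break
    return rows[:page_size]


# Toy in-memory store: (user, bucket) -> rows, three tasks per window.
now = datetime(2019, 1, 31, 18, 0, tzinfo=timezone.utc)
store = {
    ("carl", bucket_of(now) - i): [f"task-{i}-{n}" for n in range(3)]
    for i in range(10)
}
page = fill_page(lambda u, b: store.get((u, b), []), "carl", now, page_size=7)
print(page)
```

Each bucket read touches one partition, so in this model a two-day slice only needs the handful of partitions covering those windows rather than one partition spread across every sstable.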

-- 
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade