In addition to what Jeff mentioned, there was an optimization in 3.4 that
can significantly reduce the number of sstables accessed when a LIMIT
clause is used.  This can be a pretty big win with TWCS.

http://thelastpickle.com/blog/2017/03/07/The-limit-clause-in-cassandra-might-not-work-as-you-think.html
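
To give a rough feel for why that helps, here is a toy Python model of the
idea. It is only a sketch under my own assumptions, not Cassandra's actual
read path: the SSTable class and newest_rows function are hypothetical names,
and clustering values are plain integers. Sstables are visited newest-first by
their max clustering value, and the loop stops as soon as the LIMIT is
satisfied and no remaining sstable can hold a newer row.

```python
from typing import List, NamedTuple, Tuple


class SSTable(NamedTuple):
    """Hypothetical stand-in for one sstable's rows in a single partition."""
    name: str
    rows: List[int]  # clustering values, e.g. timestamps (must be non-empty)

    @property
    def max_clustering(self) -> int:
        return max(self.rows)


def newest_rows(sstables: List[SSTable], limit: int) -> Tuple[List[int], List[str]]:
    """Return the `limit` newest rows plus the names of the sstables read."""
    if limit <= 0:
        return [], []
    touched: List[str] = []
    result: List[int] = []
    # Visit sstables newest-first according to their max clustering value.
    for table in sorted(sstables, key=lambda t: t.max_clustering, reverse=True):
        # Once the limit is met and this sstable (and therefore every later
        # one) is strictly older than the oldest row we hold, stop reading.
        if len(result) >= limit and table.max_clustering < result[-1]:
            break
        touched.append(table.name)
        result = sorted(set(result) | set(table.rows), reverse=True)[:limit]
    return result, touched
```

With four non-overlapping time windows and a limit of 2, only the newest
window is read in this model. Without that early exit, every window that
contains the partition would be touched.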

On Thu, Jan 31, 2019 at 5:50 PM Jeff Jirsa <jji...@gmail.com> wrote:

> In my original TWCS talk a few years back, I suggested that people make
> the partitions match the time window to avoid exactly what you’re
> describing. I added that to the talk because my first team that used TWCS
> (the team for which I built TWCS) had a data model not unlike yours, and
> the read-every-sstable thing turns out not to work that well if you have
> lots of windows (or very large partitions). If you do this, you can fan out
> a bunch of async reads for the first few days and ask for more as you need
> to fill the page - this means the reads are more distributed, too, which is
> an extra bonus when you have noisy partitions.
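
The fan-out described above could look roughly like the following Python
sketch. Everything here is an assumption for illustration: the table is a
plain dict keyed by (user, day) standing in for a table partitioned by user
and day bucket, read_bucket stands in for one async single-partition query,
and the fanout and max_days parameters are made up.

```python
import asyncio
from datetime import date, timedelta
from typing import Dict, List, Tuple

# Hypothetical stand-in for a table whose partition key is (user, day).
Table = Dict[Tuple[str, date], List[str]]


async def read_bucket(table: Table, user: str, day: date) -> List[str]:
    """Stand-in for one async single-partition read, newest task first."""
    await asyncio.sleep(0)  # yield control, as a real driver call would
    return sorted(table.get((user, day), []), reverse=True)


async def read_recent(table: Table, user: str, today: date, page_size: int,
                      fanout: int = 2, max_days: int = 30) -> List[str]:
    """Fill one page by reading `fanout` day buckets at a time, newest first."""
    page: List[str] = []
    offset = 0
    while len(page) < page_size and offset < max_days:
        count = min(fanout, max_days - offset)
        days = [today - timedelta(days=offset + i) for i in range(count)]
        # Issue the reads for this batch of buckets concurrently.
        batches = await asyncio.gather(
            *(read_bucket(table, user, d) for d in days))
        # gather() preserves argument order, so buckets stay newest-first.
        for rows in batches:
            page.extend(rows)
        offset += count
    return page[:page_size]
```

Each bucket is its own partition, so a read for the last two days only touches
the sstables for those windows, and older buckets are queried only if the page
is still short.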
>
> In 3.0 and newer (I think, don’t quote me on the specific version), the
> sstable metadata has the min and max clustering values, which helps exclude
> sstables from the read path quite well if everything in the table is using
> timestamp clustering columns. I know there was some issue with this and
> range tombstones (RTs) recently, so I’m not sure of its current state, but
> it’s worth considering that this may be much better on 3.0+.
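
As a minimal illustration of that kind of exclusion, here is a Python sketch
of an interval-overlap check against per-sstable min/max clustering values.
The SSTableMeta class and sstables_for_slice function are hypothetical names,
clustering values are simplified to integer hours, and this is not the actual
Cassandra code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SSTableMeta:
    """Hypothetical per-sstable metadata: clustering bounds only."""
    name: str
    min_clustering: int
    max_clustering: int


def sstables_for_slice(sstables: List[SSTableMeta],
                       start: int, end: int) -> List[SSTableMeta]:
    """Keep only sstables whose [min, max] range overlaps [start, end]."""
    return [
        t for t in sstables
        if t.max_clustering >= start and t.min_clustering <= end
    ]


# 80 twelve-hour windows, with clustering values expressed in hours.
windows = [SSTableMeta(f"w{i}", 12 * i, 12 * i + 11) for i in range(80)]

# A slice covering the last 48 hours overlaps only the last four windows.
selected = sstables_for_slice(windows, start=912, end=959)
```

In this toy setup a two-day slice reads 4 sstables instead of 80, which is the
effect Carl is asking about.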
>
>
>
> --
> Jeff Jirsa
>
>
> > On Jan 31, 2019, at 1:56 PM, Carl Mueller <carl.muel...@smartthings.com.invalid> wrote:
> >
> > Situation:
> >
> > We use TWCS for a task history table (partition is user, column key is
> > timeuuid of task; TWCS is used because of TTLs that rotate out the tasks
> > every month or so).
> >
> > However, we want to get a "slice" of tasks (say, tasks from the last two
> > days, with TWCS sstable windows of 12 hours).
> >
> > The problem is, this is a frequent user and they have tasks in ALL the
> > sstables that are organized by the TWCS into time-bucketed sstables.
> >
> > So Cassandra has to first read in, say, 80 sstables to reconstruct the
> > row, THEN it can exclude/slice on the column key.
> >
> > Question:
> >
> > Or am I wrong that the read path needs to grab all relevant sstables
> > before applying column key slicing, and excluding sstables up front is
> > already possible? Admittedly we are on 2.1 for this table (we are in the
> > process of upgrading now that we have an automated upgrading program that
> > seems to work pretty well).
> >
> > If my assumption is correct, then the compaction strategy knows as it
> > writes the sstables what it is bucketing them as (and could encode that in
> > sstable metadata?). If the whole row really does need reconstruction
> > before slicing, and if we had a perfect infinite monkey coding team that
> > could generate whatever we wanted within some feasibility, could we
> > provide special hooks to do sstable exclusion when we know that the
> > metadata will indicate exclusion/inclusion of columns?
> >
> > Goal:
> >
> > The overall goal would be to support exclusion of sstables from a read
> > path, in case we had compaction strategies hand-tailored for other queries.
> > Essentially we would be doing a first-pass bucketsort exclusion with the
> > sstable metadata marking the buckets. This might aid support of superwide
> > rows and paging through column keys if we allowed the table creator to
> > specify bucketing as flushing occurs. In general it appears query
> > performance quickly degrades based on # sstables required for a lookup.
> >
> > I still don't know the code nearly well enough to do patches, but it
> > would seem, based on my looking at custom compaction strategies and the
> > basic read path, that this would be a useful extension for advanced users.
> >
> > The fallback would be a set of tables to serve as buckets and we span the
> > buckets with queries when one bucket runs out. The tables rotate.
>

-- 
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade
