Iterate over all of the possible time buckets.
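Something like this, for example (an untested sketch in Python with the DataStax driver; the events table, keyspace name, and bucket format are assumptions based on Jeff's day-of-year example below):

    from datetime import timedelta
    from cassandra.cluster import Cluster

    # Assumed schema, per the bucketing discussed below:
    #   CREATE TABLE events (
    #       app_id text,
    #       day_bucket int,          -- e.g. 2019032 = year 2019, day 032
    #       event_time timeuuid,
    #       payload text,
    #       PRIMARY KEY ((app_id, day_bucket), event_time)
    #   );

    cluster = Cluster(["127.0.0.1"])
    session = cluster.connect("my_keyspace")
    by_bucket = session.prepare(
        "SELECT * FROM events WHERE app_id = ? AND day_bucket = ?")

    def all_events(app_id, start_day, end_day):
        """Walk every day bucket from newest to oldest for one app_id."""
        day = end_day
        while day >= start_day:
            bucket = int(day.strftime("%Y%j"))   # 2019-02-01 -> 2019032
            for row in session.execute(by_bucket, (app_id, bucket)):
                yield row
            day -= timedelta(days=1)

The "all events for app_id" query then becomes a loop over a bounded, enumerable set of partitions rather than one read of a giant partition.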
On Fri, Feb 1, 2019 at 1:36 PM Carl Mueller <carl.muel...@smartthings.com.invalid> wrote:

> I'd still need an "all events for app_id" query. We have seconds-level events :-(
>
> On Fri, Feb 1, 2019 at 3:02 PM Jeff Jirsa <jji...@gmail.com> wrote:
>
> > On Fri, Feb 1, 2019 at 12:58 PM Carl Mueller <carl.muel...@smartthings.com.invalid> wrote:
> >
> > > Jeff: so the partition key with timestamp would then need a separate index table to track the app_id -> partition keys. Which isn't horrible, but it also ties into another desire of mine: some way to make the replica mapping match locally between the index table and the data table.
> > >
> > > So in the composite partition key for the TWCS table, you'd have app_id + timestamp, BUT ONLY THE app_id GENERATES the hash/key.
> >
> > Huh? No, you'd have a composite partition key of app_id + timestamp ROUNDED/CEIL/FLOORed to some time window, and both would be used for the hash/key.
> >
> > And you don't need any extra table, because app_id is known and the timestamp can be calculated (e.g., 4 digits of year + 3 digits for day of year makes today 2019032).
> >
> > > Thus it would match with the index table that is just partition key app_id, column key timestamp.
> > >
> > > And then theoretically a node-local "join" could be done without an additional query hop, and batched updates would be more easily atomic to a single node.
> > >
> > > Now how we would communicate all that in CQL/etc: who knows. Hm. Maybe materialized views cover this, but I haven't tracked that since we don't have versions that support them and they got "deprecated".
> > >
> > > On Fri, Feb 1, 2019 at 2:53 PM Carl Mueller <carl.muel...@smartthings.com> wrote:
> > >
> > > > Interesting. Now that we have semiautomated upgrades, we are hopefully going to get everything to 3.11.x once we get the intermediate hop to 2.2.
> > > >
> > > > I'm thinking we could also use sstable metadata markings + custom compactors for things like multiple customers on the same table. You could sequester a customer's data in their own sstables, and then queries could effectively be subdivided against only the sstables that have that customer. Maybe the min and max would cover that; I'd have to look at the details.
> > > >
> > > > On Thu, Jan 31, 2019 at 8:11 PM Jonathan Haddad <j...@jonhaddad.com> wrote:
> > > >
> > > > > In addition to what Jeff mentioned, there was an optimization in 3.4 that can significantly reduce the number of sstables accessed when a LIMIT clause is used. This can be a pretty big win with TWCS.
> > > > >
> > > > > http://thelastpickle.com/blog/2017/03/07/The-limit-clause-in-cassandra-might-not-work-as-you-think.html
> > > > >
> > > > > On Thu, Jan 31, 2019 at 5:50 PM Jeff Jirsa <jji...@gmail.com> wrote:
> > > > >
> > > > > > In my original TWCS talk a few years back, I suggested that people make the partitions match the time window to avoid exactly what you're describing. I added that to the talk because my first team that used TWCS (the team for which I built TWCS) had a data model not unlike yours, and the read-every-sstable thing turns out not to work that well if you have lots of windows (or very large partitions).
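(Jeff's ROUNDED/CEIL/FLOOR bucket above, concretely - a sketch where the 12-hour window size is my assumption, chosen to match the sstable blocks mentioned further down:)

    from datetime import datetime, timezone

    WINDOW_SECONDS = 12 * 3600   # assumed: must match the TWCS window

    def window_bucket(ts: datetime) -> int:
        """FLOOR a timestamp to the start of its time window, in epoch
        seconds. Writers and readers both derive the bucket from the
        event timestamp, so no app_id -> partition-key index table is
        needed."""
        epoch = int(ts.timestamp())
        return epoch - (epoch % WINDOW_SECONDS)

    # window_bucket(datetime(2019, 2, 1, 13, 36, tzinfo=timezone.utc))
    # -> the epoch second of 2019-02-01 12:00 UTC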
> > > > > > If you do this, you can fan out a bunch of async reads for the first few days and ask for more as you need to fill the page - this means the reads are more distributed, too, which is an extra bonus when you have noisy partitions.
> > > > > >
> > > > > > In 3.0 and newer (I think, don't quote me on the specific version), the sstable metadata has the min and max clustering, which helps exclude sstables from the read path quite well if everything in the table is using timestamp clustering columns. I know there was some issue with this and RTs recently, so I'm not sure if it's current state, but it's worth considering that this may be much better on 3.0+.
> > > > > >
> > > > > > --
> > > > > > Jeff Jirsa
> > > > > >
> > > > > > On Jan 31, 2019, at 1:56 PM, Carl Mueller <carl.muel...@smartthings.com.invalid> wrote:
> > > > > >
> > > > > > > Situation:
> > > > > > >
> > > > > > > We use TWCS for a task history table (partition is user, column key is timeuuid of task; TWCS is used due to tombstone TTLs that rotate out the tasks every, say, month).
> > > > > > >
> > > > > > > However, sometimes we want to get a "slice" of tasks (say, tasks in the last two days, while we are using TWCS sstable blocks of 12 hours).
> > > > > > >
> > > > > > > The problem is, this is a frequent user, and they have tasks in ALL the sstables that TWCS has organized into time buckets.
> > > > > > >
> > > > > > > So Cassandra has to first read in, say, 80 sstables to reconstruct the row, and only THEN can it exclude/slice on the column key.
> > > > > > >
> > > > > > > Question:
> > > > > > >
> > > > > > > Or am I wrong that the read path needs to grab all relevant sstables before applying column key slicing, and this is already possible? Admittedly we are on 2.1 for this table (we're in the process of upgrading now that we have an automated upgrade program that seems to work pretty well).
> > > > > > >
> > > > > > > If my assumption is correct, then the compaction strategy knows, as it writes the sstables, what it is bucketing them as (and could encode that in the sstable metadata?). If slicing really does require reconstructing the whole row, and we had a perfect infinite-monkey coding team that could generate whatever we wanted within some feasibility, could we provide special hooks to do sstable exclusion based on metadata, where the metadata indicates inclusion/exclusion of column ranges?
> > > > > > >
> > > > > > > Goal:
> > > > > > >
> > > > > > > The overall goal would be to support exclusion of sstables from a read path, in case we had compaction strategies hand-tailored for other queries. Essentially we would be doing a first-pass bucket-sort exclusion, with the sstable metadata marking the buckets.
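(Jeff's fan-out above might look roughly like this - a sketch reusing the session and prepared statement from the first snippet; the bucket list and page size are the caller's:)

    def fan_out_page(app_id, buckets, page_size):
        """Issue async reads for several time buckets at once, then
        fill one page from the results, newest bucket first."""
        futures = [(b, session.execute_async(by_bucket, (app_id, b)))
                   for b in buckets]
        rows = []
        for bucket, future in futures:
            rows.extend(future.result())   # block on this bucket's read
            if len(rows) >= page_size:
                break                      # page is full; stop early
        return rows[:page_size]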
> > > > > > > This might aid support of superwide rows and paging through column keys, if we allowed the table creator to specify bucketing as flushing occurs. In general, query performance appears to degrade quickly with the number of sstables required for a lookup.
> > > > > > >
> > > > > > > I still don't know the code nearly well enough to do patches, but from my look at custom compaction strategies and the basic read path, it seems this would be a useful extension for advanced users.
> > > > > > >
> > > > > > > The fallback would be a set of tables to serve as buckets, and we span the buckets with queries when one bucket runs out. The tables rotate.
> > > > >
> > > > > --
> > > > > Jon Haddad
> > > > > http://www.rustyrazorblade.com
> > > > > twitter: rustyrazorblade
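For reference, the fallback Carl describes at the end (rotating bucket tables spanned by queries) might look roughly like this - the per-month table naming and the three-month span are invented for illustration:

    def month_tables(prefix, now, months):
        """Yield hypothetical per-month table names, newest first,
        e.g. tasks_2019_02, tasks_2019_01, ..."""
        year, month = now.year, now.month
        for _ in range(months):
            yield f"{prefix}_{year:04d}_{month:02d}"
            month -= 1
            if month == 0:
                year, month = year - 1, 12

    def span_tasks(user_id, now, limit, months=3):
        """Query each bucket table in turn until the page is full."""
        rows = []
        for table in month_tables("tasks", now, months):
            cql = f"SELECT * FROM {table} WHERE user_id = %s LIMIT %s"
            rows.extend(session.execute(cql, (user_id, limit - len(rows))))
            if len(rows) >= limit:
                break
        return rows

Presumably dropping whole tables as they age out would then replace tombstone-based expiry, at the cost of client-side query spanning like the above.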