Jeff: so the partition key with the timestamp would then need a separate
index table to track the app_id -> partition key mapping. Which isn't
horrible, but it also ties into another desire of mine: some way to make the
replica mapping match between the index table and the data table, so the
corresponding rows land on the same nodes:

So in the composite partition key for the TWCS table, you'd have app_id +
timestamp, BUT ONLY THE app_id GENERATES the hash/key.

Thus its replica placement would match the index table, which is just
partition key app_id, column key timestamp.

And then, theoretically, a node-local "join" could be done without an
additional query hop, and batched updates across the two tables could more
easily be made atomic on a single node.
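
Roughly, as a sketch (names are made up, and there is currently no way in
CQL to say "hash only app_id" for the first table -- that is the
hypothetical part):

CREATE TABLE task_data (
    app_id      text,
    time_bucket timestamp,  -- truncated to the TWCS window
    task_id     timeuuid,
    payload     blob,
    PRIMARY KEY ((app_id, time_bucket), task_id)
) WITH compaction = {'class': 'TimeWindowCompactionStrategy'};

-- Index table: partition key app_id, column key time_bucket. With the
-- placement trick above, its replicas would be the same nodes that hold
-- the corresponding task_data partitions.
CREATE TABLE task_index (
    app_id      text,
    time_bucket timestamp,
    PRIMARY KEY (app_id, time_bucket)
);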

Now, how we would communicate all that in CQL/etc.: who knows. Hmm. Maybe
materialized views cover this, but I haven't tracked them since we don't run
versions that support them and they got "deprecated".
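
If MVs do apply, it might look something like this (again just a sketch
against the tables above; note that the view's replicas are chosen by
token(app_id) alone while the base table's are chosen by the full
(app_id, time_bucket) partition key, so they still wouldn't necessarily be
co-located -- which is exactly the open question):

CREATE MATERIALIZED VIEW task_index_mv AS
    SELECT app_id, time_bucket, task_id
    FROM task_data
    WHERE app_id IS NOT NULL AND time_bucket IS NOT NULL
      AND task_id IS NOT NULL
    PRIMARY KEY (app_id, time_bucket, task_id);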


On Fri, Feb 1, 2019 at 2:53 PM Carl Mueller <carl.muel...@smartthings.com>
wrote:

> Interesting. Now that we have semi-automated upgrades, we are hopefully
> going to get everything to 3.11.x once we get through the intermediate hop
> to 2.2.
>
> I'm thinking we could also use sstable metadata markings + custom
> compactors for things like multiple customers on the same table. So you
> could sequester a customer's data in its own sstables, and then queries
> could effectively be narrowed to only the sstables that contain that
> customer. Maybe the min and max clustering metadata would cover that; I'd
> have to look at the details.
>
> On Thu, Jan 31, 2019 at 8:11 PM Jonathan Haddad <j...@jonhaddad.com> wrote:
>
>> In addition to what Jeff mentioned, there was an optimization in 3.4 that
>> can significantly reduce the number of sstables accessed when a LIMIT
>> clause is used. This can be a pretty big win with TWCS.
>>
>>
>> http://thelastpickle.com/blog/2017/03/07/The-limit-clause-in-cassandra-might-not-work-as-you-think.html
>>
>> On Thu, Jan 31, 2019 at 5:50 PM Jeff Jirsa <jji...@gmail.com> wrote:
>>
>> > In my original TWCS talk a few years back, I suggested that people make
>> > the partitions match the time window to avoid exactly what you're
>> > describing. I added that to the talk because my first team that used
>> > TWCS (the team for which I built TWCS) had a data model not unlike
>> > yours, and the read-every-sstable thing turns out not to work that well
>> > if you have lots of windows (or very large partitions). If you do this,
>> > you can fan out a bunch of async reads for the first few days and ask
>> > for more as you need to fill the page - this means the reads are more
>> > distributed, too, which is an extra bonus when you have noisy
>> > partitions.
>> >
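>> > A minimal sketch of that layout (table and bucket names are just
>> > illustrative): put the window in the partition key, then have the client
>> > fan out one async query per recent bucket, newest first, until the page
>> > is filled:
>> >
>> > CREATE TABLE task_history_bucketed (
>> >     user_id  text,
>> >     bucket   timestamp,  -- truncated to the 12-hour TWCS window
>> >     task_id  timeuuid,
>> >     payload  blob,
>> >     PRIMARY KEY ((user_id, bucket), task_id)
>> > ) WITH CLUSTERING ORDER BY (task_id DESC)
>> >   AND compaction = {'class': 'TimeWindowCompactionStrategy',
>> >                     'compaction_window_unit': 'HOURS',
>> >                     'compaction_window_size': '12'};
>> >
>> > -- Issued concurrently from the client, newest buckets first, stopping
>> > -- once the page is full:
>> > SELECT * FROM task_history_bucketed
>> >   WHERE user_id = ? AND bucket = '2019-02-01 12:00:00';
>> > SELECT * FROM task_history_bucketed
>> >   WHERE user_id = ? AND bucket = '2019-02-01 00:00:00';
>> >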
>> > In 3.0 and newer (I think, don't quote me on the specific version), the
>> > sstable metadata has the min and max clustering, which helps exclude
>> > sstables from the read path quite well if everything in the table is
>> > using timestamp clustering columns. I know there was some issue with
>> > this and RTs recently, so I'm not sure if it's the current state, but
>> > it's worth considering that this may be much better on 3.0+.
>> >
>> >
>> >
>> > --
>> > Jeff Jirsa
>> >
>> >
>> > > On Jan 31, 2019, at 1:56 PM, Carl Mueller
>> > > <carl.muel...@smartthings.com.invalid> wrote:
>> > >
>> > > Situation:
>> > >
>> > > We use TWCS for a task history table (partition is the user, column key
>> > > is the timeuuid of the task; TWCS is used because of the tombstones/TTLs
>> > > that rotate out the tasks every month or so).
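>> > >
>> > > Concretely, something like this (names are illustrative, not our real
>> > > schema):
>> > >
>> > > CREATE TABLE task_history (
>> > >     user_id  text,
>> > >     task_id  timeuuid,
>> > >     payload  blob,
>> > >     PRIMARY KEY (user_id, task_id)
>> > > ) WITH compaction = {'class': 'TimeWindowCompactionStrategy',
>> > >                      'compaction_window_unit': 'HOURS',
>> > >                      'compaction_window_size': '12'}
>> > >   AND default_time_to_live = 2592000;  -- ~30 days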
>> > >
>> > > However, we may want to get a "slice" of tasks (say, tasks from the
>> > > last two days, while we are using TWCS sstable windows of 12 hours).
>> > >
>> > > The problem is that this is a frequent user, and they have tasks in ALL
>> > > of the time-bucketed sstables that TWCS produces.
>> > >
>> > > So Cassandra has to first read in, say, 80 sstables to reconstruct the
>> > > row, and only THEN can it exclude/slice on the column key.
>> > >
>> > > Question:
>> > >
>> > > Or am I wrong that the read path needs to grab all relevant sstables
>> > > before applying column key slicing, and this exclusion is already
>> > > possible? Admittedly we are on 2.1 for this table (we are in the process
>> > > of upgrading now that we have an automated upgrade program that seems to
>> > > work pretty well).
>> > >
>> > > If my assumption is correct, then the compaction strategy knows, as it
>> > > writes the sstables, what it is bucketing them as (and could encode that
>> > > in the sstable metadata?). If my assumption is right that the whole row
>> > > needs reconstruction before slicing, and we had a perfect infinite-monkey
>> > > coding team that could generate whatever we wanted within reason, could
>> > > we provide special hooks on the read path to exclude sstables whose
>> > > metadata indicates they cannot contain the requested columns?
>> > >
>> > > Goal:
>> > >
>> > > The overall goal would be to support excluding sstables from the read
>> > > path, in case we had compaction strategies hand-tailored for particular
>> > > queries. Essentially we would be doing a first-pass bucket-sort
>> > > exclusion, with the sstable metadata marking the buckets. This might
>> > > also help with superwide rows and paging through column keys, if we
>> > > allowed the table creator to specify bucketing as flushing occurs. In
>> > > general, query performance appears to degrade quickly with the number of
>> > > sstables required for a lookup.
>> > >
>> > > I still don't know the code nearly well enough to do patches, but based
>> > > on my look at custom compaction strategies and the basic read path, it
>> > > would seem that this would be a useful extension for advanced users.
>> > >
>> > > The fallback would be a set of tables to serve as buckets, where we
>> > > span the buckets with queries when one bucket runs out, and the tables
>> > > rotate.
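>> > >
>> > > For example (names made up), one table per month, with the client
>> > > walking backwards across them until it has enough results, and old
>> > > tables getting dropped as they rotate out:
>> > >
>> > > CREATE TABLE task_history_2019_01 (
>> > >     user_id  text,
>> > >     task_id  timeuuid,
>> > >     payload  blob,
>> > >     PRIMARY KEY (user_id, task_id)
>> > > );
>> > >
>> > > -- If this doesn't fill the page, the client falls back to
>> > > -- task_history_2018_12, and so on.
>> > > SELECT * FROM task_history_2019_01 WHERE user_id = ? LIMIT 100;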
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> > For additional commands, e-mail: dev-h...@cassandra.apache.org
>> >
>> >
>>
>> --
>> Jon Haddad
>> http://www.rustyrazorblade.com
>> twitter: rustyrazorblade
>>
>
