Benedict, I agree with you that TCM might be overkill for capabilities. It’s truly something that’s fine to be eventually consistent. Riak's implementation used a local ETS table (ETS is built into Erlang; the equivalent for us would be a local-only system table) and an efficient and reliable gossip protocol. The data was basically a simple CRDT (a map<string, list<string>> of supported features in preference order, with the only operations being additions and reads).
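To illustrate the shape of that data, a rough sketch in Java (hypothetical names, not Riak's or Cassandra's actual APIs) of a capability map whose only operations are additions and local reads, merged by union of entries:

import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: capability name -> (node id -> modes supported by that
// node, in preference order). Add-only writes and local reads make it safe to
// replicate lazily; eventual consistency is fine.
final class CapabilityMap
{
    private final Map<String, Map<String, List<String>>> supported = new ConcurrentHashMap<>();

    // Record what a node supports; writes are rare (startup or operator action).
    void add(String capability, String nodeId, List<String> modesInPreferenceOrder)
    {
        supported.computeIfAbsent(capability, k -> new ConcurrentHashMap<>())
                 .put(nodeId, List.copyOf(modesInPreferenceOrder));
    }

    // Merge state learned from another node: a union of entries, never a removal.
    void merge(CapabilityMap other)
    {
        other.supported.forEach((capability, byNode) ->
            byNode.forEach((node, modes) -> add(capability, node, modes)));
    }

    // Local, cheap read for the hot path: the most-preferred mode that every
    // known node supports for this capability, if any.
    Optional<String> negotiated(String capability, List<String> localPreference)
    {
        Map<String, List<String>> byNode = supported.getOrDefault(capability, Map.of());
        return localPreference.stream()
                              .filter(mode -> !byNode.isEmpty()
                                              && byNode.values().stream().allMatch(m -> m.contains(mode)))
                              .findFirst();
    }
}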
So I agree with you that we could be using TCM as a hammer for every nail here. But I'm also hesitant to introduce something new. Distributed tables, or a virtual table with some way to aggregate across the cluster, would also work. In either case we would need a local cache (like Denylist). From a requirements perspective, reads need to be local (because they may be done in a hot path) but writes can be slow (they typically only change on startup or during operator intervention).

Jordan

On Fri, Dec 20, 2024 at 01:53 Benedict <bened...@apache.org> wrote:

> If you perform a read from a distributed table on startup you will find the latest information. What catchup are you thinking of? I don’t think any of the features we talked about need a log, only the latest information.
>
> We can (and should) probably introduce event listeners for distributed tables, as this is also a really great feature, but I don’t think this should be necessary here.
>
> Regarding disagreements: if you use LWTs then there are no consistency issues to worry about.
>
> Again, I’m not opposed to using TCM, although I am a little worried TCM is becoming our new hammer with everything a nail. It would be better IMO to keep TCM scoped to essential functionality as it’s critical to correctness. Perhaps we could extend its APIs to less critical services without intertwining them with membership, schema and epoch handling.
>
> On 20 Dec 2024, at 09:43, Štefan Miklošovič <smikloso...@apache.org> wrote:
>
> I find TCM way more comfortable to work with. The capability of the log being replayed on restart and catching up with everything else automatically is god-sent. If we had that on "good old distributed tables", is it not true that we would need to take extra care of that, e.g. we would need to repair it etc.? It might be the source of discrepancies / disagreements etc. TCM is just "maintenance-free" and _just works_.
>
> I think I was also investigating distributed tables but was just pulled towards TCM naturally because of its goodies.
>
> On Fri, Dec 20, 2024 at 10:08 AM Benedict <bened...@apache.org> wrote:
>
>> TCM is a perfectly valid basis for this, but TCM is only really *necessary* to solve meta config problems where we can’t rely on the rest of the database working. Particularly placement issues, which is why schema and membership need to live there.
>>
>> It should be possible to use distributed system tables just fine for capabilities, config and guardrails.
>>
>> That said, it’s possible config might be better represented as part of the schema (and we already store some relevant config there), in which case it would live in TCM automatically. Migrating existing configs to a distributed setup will be fun however we do it though.
>>
>> Capabilities also feel naturally related to other membership information, so TCM might be the most suitable place, particularly for handling downgrades after capabilities have been enabled (if we ever expect to support turning off capabilities and then downgrading - which today we mostly don’t).
>>
>> On 20 Dec 2024, at 08:42, Štefan Miklošovič <smikloso...@apache.org> wrote:
>>
>> Jordan,
>>
>> I also think that having it on TCM would be ideal and we should explore this path first before doing anything custom.
>>
>> Regarding my idea about the guardrails in TCM, when I prototyped that and wanted to make it happen, there was a little bit of pushback (1) (even though a super reasonable one) that TCM is just too young at the moment and it would be desirable to go through some stabilisation period.
>>
>> Another idea was that we should not make just guardrails happen but that the whole config should be in TCM. From what I put together, Sam / Alex do not seem to be opposed to this idea, rather the opposite, but having a CEP about that is way more involved than having just guardrails there. I consider guardrails to be kind of special and I do not think that having all configuration in TCM (which guardrails are part of) is an absolute must in order to deliver that. I may start with a guardrails CEP and you may explore a Capabilities CEP on TCM too, if that makes sense?
>>
>> I just wanted to raise the point about the time this would be delivered. If Capabilities are built on TCM and I wanted to do Guardrails on TCM too but was told it is probably too soon, I guess you would experience something similar.
>>
>> Sam's comment is from May and maybe a lot has changed since then and his comment is not applicable anymore. It would be great to know if we could build on top of the current trunk already or if we will have to wait until 5.1/6.0 is delivered.
>>
>> (1) https://issues.apache.org/jira/browse/CASSANDRA-19593?focusedCommentId=17844326&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17844326
>>
>> On Fri, Dec 20, 2024 at 2:17 AM Jordan West <jorda...@gmail.com> wrote:
>>
>>> Firstly, glad to see the support and enthusiasm here and in the recent Slack discussion. I think there is enough for me to start drafting a CEP.
>>>
>>> Stefan, global configuration and capabilities do have some overlap but not full overlap. For example, you may want to set globally that a cluster enables feature X or control the threshold for a guardrail, but you still need to know whether all nodes support feature X or have that guardrail; the latter is what capabilities target. I do think capabilities are a step towards supporting global configuration and the work you described is another step (that we could do after capabilities or in parallel with them in mind). I am also supportive of exploring global configuration for the reasons you mentioned.
>>>
>>> In terms of how capabilities get propagated across the cluster, I hadn't put much thought into it yet past likely TCM, since this will be a new feature that lands after TCM. In Riak, we had gossip (but more mature than C*'s -- this was an area I contributed to a lot, so I'm very familiar with it) to disseminate less critical information such as capabilities, and a separate layer that did what TCM does. Since we don't have this in C*, I don't think we would want to build a separate distribution channel for capabilities metadata when we already have TCM in place. But I plan to explore this more as I draft the CEP.
>>>
>>> Jordan
>>>
>>> On Thu, Dec 19, 2024 at 1:48 PM Štefan Miklošovič <smikloso...@apache.org> wrote:
>>>
>>>> Hi Jordan,
>>>>
>>>> what would this look like from the implementation perspective?
>>>> I was experimenting with transactional guardrails where an operator would control the content of a virtual table backed by TCM, so whatever guardrail we changed would be automatically and transparently propagated to every node in the cluster. The POC worked quite nicely. TCM is just a vehicle to commit a change which spreads around, and all these settings survive restarts. We would have the same configuration everywhere, which is not currently the case because guardrails are configured per node and, if not persisted to yaml, their values are forgotten on restart.
>>>>
>>>> Guardrails are just an example; the obvious next step is to expand this idea to the whole configuration in yaml. Of course, not all properties in yaml make sense to be the same cluster-wide (ip addresses etc ...), but the ones which do would again be set the same way everywhere.
>>>>
>>>> The approach I described above is that we make sure the configuration is the same everywhere, hence there can be no misunderstanding about what features this or that node has: if we say that all nodes have to have a particular feature because we said so in the TCM log, then on restart / replay a node will "catch up" with whatever features it is asked to turn on.
>>>>
>>>> Your approach seems to be that we distribute what capabilities / features a cluster supports, and each individual node then configures itself (or not) to comply?
>>>>
>>>> Is there any intersection between these approaches? At first sight they seem somehow related. How do they differ from your point of view?
>>>>
>>>> Regards
>>>>
>>>> (1) https://issues.apache.org/jira/browse/CASSANDRA-19593
>>>>
>>>> On Thu, Dec 19, 2024 at 12:00 AM Jordan West <jw...@apache.org> wrote:
>>>>
>>>>> In a recent discussion on the pains of upgrading, one topic that came up was a feature that Riak had called Capabilities [1]. A major pain with upgrades is that each node independently decides when to start using new or modified functionality. Even when we put this behind a config (like storage compatibility mode), each node immediately enables the feature when the config is changed and the node is restarted. This causes various types of upgrade pain such as failed streams and schema disagreement. A recent example of this is CASSANDRA-20118 [2]. In some cases operators can prevent this from happening through careful coordination (e.g. ensuring upgradesstables only runs after the whole cluster is upgraded) but this typically requires custom code in whatever control plane the operator is using. A capabilities framework would distribute the state of what features each node has (and their status, e.g. enabled or not) so that the cluster can choose to opt in to new features once the whole cluster has them available. From experience, having this in Riak made upgrades a significantly less risky process and also paved a path towards repeatable downgrades. I think Cassandra would benefit from it as well.
>>>>>
>>>>> Further, other tools like analytics could benefit from having this information since currently it's up to the operator to manually determine the state of the cluster in some cases.
>>>>>
>>>>> I am considering drafting a CEP proposal for this feature but wanted to take the general temperature of the community and get some early thoughts while working on the draft.
>>>>>
>>>>> Looking forward to hearing y'all's thoughts,
>>>>> Jordan
>>>>>
>>>>> [1] https://github.com/basho/riak_core/blob/25d9a6fa917eb8a2e95795d64eb88d7ad384ed88/src/riak_core_capability.erl#L23-L72
>>>>>
>>>>> [2] https://issues.apache.org/jira/browse/CASSANDRA-20118
>>>>
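To make the TCM-backed guardrails idea quoted above a bit more concrete, a minimal sketch (hypothetical names only, not the actual TCM API): a guardrail change is committed as an entry in a replicated metadata log, and applying entries, whether at commit time or during log replay after a restart, always yields the same node-local state, which is what makes the configuration converge everywhere:

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not the actual TCM API: a node-local view of guardrail
// settings, rebuilt deterministically from a replicated log of changes.
final class ReplicatedGuardrails
{
    // A committed log entry recording one guardrail change (e.g. produced when
    // an operator updates a settings virtual table).
    record SetGuardrail(String name, String value) {}

    private final Map<String, String> current = new ConcurrentHashMap<>();

    // Apply one committed change; used both for live commits and for replay.
    void apply(SetGuardrail change)
    {
        current.put(change.name(), change.value());
    }

    // Rebuild the full state from the log, e.g. when a node restarts, so a
    // restarted node ends up with the same values as every other node.
    void replay(List<SetGuardrail> log)
    {
        current.clear();
        log.forEach(this::apply);
    }

    // Local, cheap read suitable for the hot path.
    String get(String name, String defaultValue)
    {
        return current.getOrDefault(name, defaultValue);
    }
}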