Re: Trait propagation in heterogeneous plans

Julian Hyde Wed, 05 May 2021 20:00:52 -0700

Vladimir,

You are arguing for pragmatism over idealism. I get that.


The problem with your argument is that you go on to say

> If in the future we invest in the
> proper integration 

That’s a big “If”. Who is the “we” who is going to do this work? Now you are 
the one being unrealistic.

Calcite is a sophisticated framework that has many high-level abstractions to 
support scenarios that are not tested in the core code base. We built those 
abstractions by being idealistic. We couldn’t possibly test them because we 
didn’t have the use case to exercise them.

How do these abstractions get fully baked into production quality? When the 
downstream projects that need them refine the features, and contribute fixes 
back.

It’s not in Calcite’s interests to make it easy for downstream projects to fork 
the code when they need to do the complex stuff. We need to use our 
abstractions (in this case, the idea that traits are pluggable) and if those 
abstractions are wrong or limiting, those downstream projects will come and fix 
them.

Julian



> On May 5, 2021, at 12:32 PM, Vladimir Ozerov <[email protected]> wrote:
> 
> Hi Vladimir, Julian,
> 
> I want to distinguish between two cases.
> 
> Some projects may decide to use Calcite's distribution trait. To my
> knowledge, this is not a common pattern because it is not really integrated
> into Calcite. It is not destroyed/adjusted in rules and operators as
> needed, not integrated into EnumerableConvention.enforce, etc.
> 
> Other projects may decide to use a custom distribution trait. Examples are
> Apache Flink, Hazelcast, and some other private projects we work on. There
> are many reasons to do this. A couple of examples:
> 1. Calcite's distribution produces a logical exchange, while
> production-grade optimizers are typically multi-phase and want the
> distribution convention to produce physical exchanges in dedicated
> physical phase(s).
> 2. Some systems may have custom requirements for distribution, such as
> propagating the number of shards, supporting multiple equivalent keys, etc.
> 
> But in both cases, the bottom line is that Enumerable currently cannot
> work with either built-in or custom distributions because the associated
> code is not implemented in Calcite's core. And even if we add the
> fully-fledged support of the built-in distribution to Enumerable, many
> projects will continue using custom distribution traits because the
> exchange is a physical operation with lots of backend-dependent specific
> quirks, and any attempt to model it abstractly in Calcite's core is
> unlikely to cover some edge cases.
> 
> The same applies to any other custom trait that depends on columns -
> Enumerable will not be able to process it correctly.
> 
> Therefore, instead of having definitely broken code, it might be better
> to take a defensive approach in which the whole Enumerable backend provides
> a clear and consistent contract: we support collation and reset everything
> else. IMO it is better because it matches the current behavior and would
> never cause strange bugs in user code. If in the future we invest in the
> proper integration of the built-in distribution or figure out how to
> "externalize" the trait propagation for Enumerable operators, we may relax
> this statement.
> 
> Please let me know if it makes any sense.
> 
> Regards,
> Vladimir.
> 
> On Tue, May 4, 2021 at 21:02, Julian Hyde <[email protected]> wrote:
> 
>>> I would say known in-core vs unknown trait is a reasonable approach to
>>> distinguish traits.
>> 
>> Easy, but not reasonable. It will make it very difficult to reuse
>> existing rels and rules (e.g. Enumerable) in a downstream project that
>> has defined its own traits.
>> 
>> On Tue, May 4, 2021 at 10:44 AM Vladimir Sitnikov
>> <[email protected]> wrote:
>>> 
>>>> It seems arbitrary to include Collation but exclude other traits.
>>> 
>>> I would say known in-core vs unknown trait is a reasonable approach to
>>> distinguish traits.
>>> 
>>> Vladimir
>>
