Thanks Francesco,

The two arguments you posted, further strengthen the need to make it
DISABLED by default.

On Tue, Feb 22, 2022 at 12:10 PM Francesco Guardiani <
france...@ververica.com> wrote:

> Hi all,
> I'm +1 with what everything you said Marios.
>
> I'm gonna add another argument on top of that: the "legacy-cast-behavior"
> has also a broken type inference, leading to incorrect results or further
> errors down in the pipeline[1]. For example, take this:
>
> SELECT COALESCE(CAST('a' AS INT), 0) ...
>
> With the legacy cast behavior ENABLED, this is going to lead to the wrong
> result, as 'a' is inferred as VARCHAR NOT NULL, the CAST return value is
> inferred as INT NOT NULL, so the planner will drop COALESCE, and will never
> return 0. Essentially, CAST will infer the wrong nullability leading to
> assigning its result to a NOT NULL type, when its value can effectively be
> NULL.
>
> > You introduce a deprecated flag to help users
> using old versions of the software to smoothly transition to the new
> version, while the new users experience the new features/behavior,
> without the need to set a flag.
>
> This is IMO the major point on why we should disable the legacy cast
> behavior by default. This is even more relevant with 1.15 and the upgrade
> story, as per the problem described above, because users will now end up
> generating plans with wrong type inference, which will be hard to migrate
> in the next flink versions.
>
> FG
>
> [1] In case you're wondering why it wasn't fixed, the reason is that fixing
> it means creating a breaking change, for details
> https://github.com/apache/flink/pull/18611#issuecomment-1028174877
>
>
> On Tue, Feb 22, 2022 at 10:07 AM Marios Trivyzas <mat...@gmail.com> wrote:
>
> > Hello all!
> >
> > I would like to bring back the discussion regarding the
> > *table.exec.legacy-cast-behaviour*
> > configuration option which we are introducing with Flink *1.15*. This
> > option provides the users
> > with the flexibility to continue using the old (incorrect, according to
> SQL
> > standards) behaviour
> > of *CAST.*
> >
> > With Flink *1.15* we have introduced a bunch of fixes, improvements and
> new
> > casting functionality
> > between types, see https://issues.apache.org/jira/browse/FLINK-24403,
> and
> > some of them are
> > guarded behind the legacy behaviour option:
> >
> >    - Trimming and padding when casting to CHAR/VARCHAR types to respect
> the
> >    specified length
> >    - Changes for casting various types to CHAR/VARCHAR/STRING
> >    - Runtime errors for CAST no longer emit *null *as result but
> exceptions
> >    are thrown with a meaningful message for the cause, that fail the
> > pipeline. *TRY_CAST
> >    *is introduced instead, which emits *null* results instead of throwing
> >    exceptions.
> >
> > Those changes become active if users set the
> > *table.exec.legacy-cast-behaviour
> > *option to *DISABLED*, otherwise
> > they will continue to experience the old, *erroneous*, behaviour of
> *CAST*.
> >
> > Currently, we have set the *table.exec.legacy-cast-behaviour *option
> > to be *ENABLED
> > *by default, so if users want
> > to get the new correct behaviour, they are required to set explicitly the
> > option to *DISABLED*. Moreover, the option
> > itself is marked as deprecated, since the plan is to be removed in the
> > future, so that the old, erroneous behaviour
> > won't be an option, and the *TRY_CAST* would be the way to go if users
> > don't want to have errors and failed pipelines,
> > but have *null*s emitted in case of runtime errors when casting.
> >
> > I would like to start a discussion and maybe ask for voting, so that we
> set
> > the *table.exec.legacy-cast-behaviour* option
> > to *DISABLED *by default, so that users that want to keep their old
> > pipelines working the same way, without changing their
> > SQL/TableAPI code, would need to explicitly set it to *ENABLED.*
> >
> > My main argument for changing the default value for the option, is that
> the
> > *DISABLED* value is the one that enables
> > the *correct* behaviour for CAST which should be the default for all new
> > users. This way, new FLINK users, or users which
> > build new pipelines, from now on would get the correct behaviour by
> default
> > without the need of changing some flag in their
> > configuration. It feels weird to me, especially for people very familiar
> > with standard SQL, to be obliged to set some config
> > flag, to be able to get the correct behaviour for CAST. On top, users
> that
> > won't read about this option in our docs, will,
> > "blindly", experience the old incorrect behaviour for their new
> pipelines,
> > and issues that could cause the CAST to fail
> > will remain hidden from them, since *nulls* would be emitted. IMHO, this
> > last part is also very important during the development
> > stages of an application/pipeline. Normally, developers would want to see
> > all possible errors/scenarios during development
> > stages , in order to build a robust production system. If errors, are
> > hidden then, they can easily end up with those errors
> > in the production system which would be even harder to discover and
> debug,
> > since no exception will ever be thrown.
> > Imagine that there is a CAST which generates nulls because of runtime
> > errors, and its result is used in an aggregate function:
> >
> > SELECT AVG(CAST(col1 AS FLOAT)), ... FROM t GROUP BY ....
> >
> > If nulls are emitted by the CAST (let's say because some records have a
> > quoted string value for col1, then simply the AVG result
> > would be wrong and in would be extremely difficult to realise and fix the
> > issue by simply wrapping col1 first with, for example,
> > a REGEXP_REPLACE, to get rid of the quotes.
> >
> > My second argument is that, since we have marked
> > *table.exec.legacy-cast-behaviour* as deprecated, and we want to
> completely
> > remove it in the future, if the default value is *DISABLED*, when it's
> > removed we also make a breaking change, by changing the
> > default behaviour for all users, which is against the common software
> > practices. You introduce a deprecated flag to help users
> > using old versions of the software to smoothly transition to the new
> > version, while the new users experience the new features/behaviour,
> > without the need to set a flag. So, when in the future this flag is
> > completely removed, for those "new" users it would be completely
> > transparent. If we continue with having the default value *ENABLED*,
> those
> > new user would experience an "unnatural" breaking change
> > when this option is completely removed.
> >
> > Best,
> > Marios
> >
>


-- 
Marios

Reply via email to