[ https://issues.apache.org/jira/browse/FLINK-23385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17382009#comment-17382009 ]
Wenlong Lyu commented on FLINK-23385: ------------------------------------- hi, [~twalthr] I think the root cause of the issue is the return type of REGEXP_EXTRACT should be force nullable instead of depending on input type, introduce by FLINK-13783 > Fix nullability of COALESCE > --------------------------- > > Key: FLINK-23385 > URL: https://issues.apache.org/jira/browse/FLINK-23385 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner > Affects Versions: 1.13.1 > Reporter: Maciej Bryński > Priority: Major > > EDIT: Simpler case: > {code:java} > SELECT COALESCE(REGEXP_EXTRACT('22','[A-Z]+'),'-'); > [ERROR] Could not execute SQL statement. Reason: > org.apache.flink.table.api.TableException: Column 'EXPR$0' is NOT NULL, > however, a null value is being written into it. You can set job configuration > 'table.exec.sink.not-null-enforcer'='drop' to suppress this exception and > drop such records silently. > {code} > When using REGEXP_EXTRACT on NOT NULL column I'm getting following exception > {code:java} > select COALESCE(REGEXP_EXTRACT(test, '[A-Z]+'), '-') from test limit 10 > [ERROR] Could not execute SQL statement. Reason: > org.apache.flink.table.api.TableException: Column 'EXPR$0' is NOT NULL, > however, a null value is being written into it. You can set job configuration > 'table.exec.sink.not-null-enforcer'='drop' to suppress this exception and > drop such records silently. > {code} > I think the reason is that nullability of result is wrongly calculated. > Example: > {code:java} > create table test ( > test STRING NOT NULL > ) WITH ( > 'connector' = 'datagen' > ); > explain select COALESCE(REGEXP_EXTRACT(test, '[A-Z]+'), '-') from test > == Abstract Syntax Tree == > LogicalProject(EXPR$0=[REGEXP_EXTRACT($0, _UTF-16LE'[A-Z]+')]) > +- LogicalTableScan(table=[[default_catalog, default_database, test]])== > Optimized Physical Plan == > Calc(select=[REGEXP_EXTRACT(test, _UTF-16LE'[A-Z]+') AS EXPR$0]) > +- TableSourceScan(table=[[default_catalog, default_database, test]], > fields=[test])== Optimized Execution Plan == > Calc(select=[REGEXP_EXTRACT(test, _UTF-16LE'[A-Z]+') AS EXPR$0]) > +- TableSourceScan(table=[[default_catalog, default_database, test]], > fields=[test]){code} > As you can see Flink is removing COALESCE from query which is wrong. > > Same for view (null = false): > {code:java} > create view v as select COALESCE(REGEXP_EXTRACT(test, '[A-Z]+'), '-') from > test > describe v; > +--------+--------+-------+-----+--------+-----------+ > | name | type | null | key | extras | watermark | > +--------+--------+-------+-----+--------+-----------+ > | EXPR$0 | STRING | false | | | | > +--------+--------+-------+-----+--------+-----------+ > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)