[ 
https://issues.apache.org/jira/browse/FLINK-23385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17382009#comment-17382009
 ] 

Wenlong Lyu commented on FLINK-23385:
-------------------------------------

hi, [~twalthr] I think the root cause of the issue is the return type of 
REGEXP_EXTRACT should be force nullable instead of  depending on input type, 
introduce by FLINK-13783

> Fix nullability of COALESCE
> ---------------------------
>
>                 Key: FLINK-23385
>                 URL: https://issues.apache.org/jira/browse/FLINK-23385
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Planner
>    Affects Versions: 1.13.1
>            Reporter: Maciej Bryński
>            Priority: Major
>
> EDIT: Simpler case:
> {code:java}
> SELECT COALESCE(REGEXP_EXTRACT('22','[A-Z]+'),'-');
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.flink.table.api.TableException: Column 'EXPR$0' is NOT NULL, 
> however, a null value is being written into it. You can set job configuration 
> 'table.exec.sink.not-null-enforcer'='drop' to suppress this exception and 
> drop such records silently.
> {code}
> When using REGEXP_EXTRACT on NOT NULL column I'm getting following exception
> {code:java}
> select COALESCE(REGEXP_EXTRACT(test, '[A-Z]+'), '-') from test limit 10
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.flink.table.api.TableException: Column 'EXPR$0' is NOT NULL, 
> however, a null value is being written into it. You can set job configuration 
> 'table.exec.sink.not-null-enforcer'='drop' to suppress this exception and 
> drop such records silently.
> {code}
> I think the reason is that nullability of result is wrongly calculated.
>  Example:
> {code:java}
> create table test (
>      test STRING NOT NULL
> ) WITH (
>     'connector' = 'datagen'
> );
> explain select COALESCE(REGEXP_EXTRACT(test, '[A-Z]+'), '-') from test
> == Abstract Syntax Tree ==
> LogicalProject(EXPR$0=[REGEXP_EXTRACT($0, _UTF-16LE'[A-Z]+')])
> +- LogicalTableScan(table=[[default_catalog, default_database, test]])== 
> Optimized Physical Plan ==
> Calc(select=[REGEXP_EXTRACT(test, _UTF-16LE'[A-Z]+') AS EXPR$0])
> +- TableSourceScan(table=[[default_catalog, default_database, test]], 
> fields=[test])== Optimized Execution Plan ==
> Calc(select=[REGEXP_EXTRACT(test, _UTF-16LE'[A-Z]+') AS EXPR$0])
> +- TableSourceScan(table=[[default_catalog, default_database, test]], 
> fields=[test]){code}
> As you can see Flink is removing COALESCE from query which is wrong.
>  
> Same for view (null = false):
> {code:java}
> create view v as select COALESCE(REGEXP_EXTRACT(test, '[A-Z]+'), '-') from 
> test
> describe v;
> +--------+--------+-------+-----+--------+-----------+
> |   name |   type |  null | key | extras | watermark |
> +--------+--------+-------+-----+--------+-----------+
> | EXPR$0 | STRING | false |     |        |           |
> +--------+--------+-------+-----+--------+-----------+
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to