Talat Uyarer created BEAM-13577:
-----------------------------------

             Summary: Beam Select's uniquifyNames function loses nullability of 
Complex types while inferring schema 
                 Key: BEAM-13577
                 URL: https://issues.apache.org/jira/browse/BEAM-13577
             Project: Beam
          Issue Type: Bug
          Components: dsl-sql, sdk-java-core
    Affects Versions: 2.34.0, 2.33.0, 2.32.0
            Reporter: Talat Uyarer


We use BeamSQL in our project. When we use any JOIN. SQL generates 
BeamCoGBKJoinRel plan which uses Select from core sdk. While Select infer 
output schema it loses nullability of complex types such as Array, Map. You can 
see an example error. 

 

 
{code:java}
INFO: SQL:
SELECT `o1`.`order_id`, `o1`.`site_id`, `o1`.`price`, `o1`.`f_stringArr`, 
`o2`.`order_id` AS `order_id0`, `o2`.`site_id` AS `site_id0`, `o2`.`price` AS 
`price0`
FROM `beam`.`ORDER_DETAILS1_WITH_ARRAY` AS `o1`
INNER JOIN `beam`.`ORDER_DETAILS2` AS `o2` ON `o1`.`order_id` = `o2`.`site_id` 
AND `o2`.`price` = `o1`.`site_id`
Dec 28, 2021 1:20:14 PM 
org.apache.beam.sdk.extensions.sql.impl.CalciteQueryPlanner convertToBeamRel
INFO: SQLPlan>
LogicalProject(order_id=[$0], site_id=[$1], price=[$2], f_stringArr=[$3], 
order_id0=[$4], site_id0=[$5], price0=[$6])
  LogicalJoin(condition=[AND(=($0, $5), =($6, $1))], joinType=[inner])
    BeamIOSourceRel(table=[[beam, ORDER_DETAILS1_WITH_ARRAY]])
    BeamIOSourceRel(table=[[beam, ORDER_DETAILS2]])Dec 28, 2021 1:20:14 PM 
org.apache.beam.sdk.extensions.sql.impl.CalciteQueryPlanner convertToBeamRel
INFO: BEAMPlan>
BeamCoGBKJoinRel(condition=[AND(=($0, $5), =($6, $1))], joinType=[inner])
  BeamIOSourceRel(table=[[beam, ORDER_DETAILS1_WITH_ARRAY]])
  BeamIOSourceRel(table=[[beam, ORDER_DETAILS2]])
Types not equal. provided output schema: Fields:
Field{name=order_id, description=, type=INT32 NOT NULL, options={{}}}
Field{name=site_id, description=, type=INT32 NOT NULL, options={{}}}
Field{name=price, description=, type=INT32 NOT NULL, options={{}}}
Field{name=f_stringArr, description=, type=ARRAY<STRING NOT NULL>, options={{}}}
Field{name=order_id0, description=, type=INT32 NOT NULL, options={{}}}
Field{name=site_id0, description=, type=INT32 NOT NULL, options={{}}}
Field{name=price0, description=, type=INT32 NOT NULL, options={{}}}
Encoding positions:
{f_stringArr=3, price0=6, price=2, site_id=1, order_id0=4, order_id=0, 
site_id0=5}
Options:{{}}UUID: null Schema inferred from select: Fields:
Field{name=317499b1-9c8a-4bb9-8897-4ecda110d02a, description=, type=INT32 NOT 
NULL, options={{}}}
Field{name=46249f84-b89e-439d-b799-a039b427a60a, description=, type=INT32 NOT 
NULL, options={{}}}
Field{name=565d397c-d36c-4387-b2e4-5d6402c839bd, description=, type=INT32 NOT 
NULL, options={{}}}
Field{name=bbe10404-3c44-41a4-b942-16f484211dab, description=, 
type=ARRAY<STRING NOT NULL> NOT NULL, options={{}}}
Field{name=bd3e3adf-ae12-4155-9770-b0123c8bb18c, description=, type=INT32 NOT 
NULL, options={{}}}
Field{name=2b2a14ed-dcd6-45d5-b49d-ae8d3037d6c6, description=, type=INT32 NOT 
NULL, options={{}}}
Field{name=60a346c5-fa18-40e1-819f-c06af92ff033, description=, type=INT32 NOT 
NULL, options={{}}}
Encoding positions:
{2b2a14ed-dcd6-45d5-b49d-ae8d3037d6c6=5, 
60a346c5-fa18-40e1-819f-c06af92ff033=6, 565d397c-d36c-4387-b2e4-5d6402c839bd=2, 
bbe10404-3c44-41a4-b942-16f484211dab=3, bd3e3adf-ae12-4155-9770-b0123c8bb18c=4, 
46249f84-b89e-439d-b799-a039b427a60a=1, 317499b1-9c8a-4bb9-8897-4ecda110d02a=0}
Options:{{}}UUID: null from input type: Fields:
Field{name=lhs, description=, type=ROW<order_id INT32 NOT NULL, site_id INT32 
NOT NULL, price INT32 NOT NULL, f_stringArr ARRAY<STRING NOT NULL>> NOT NULL, 
options={{}}}
Field{name=rhs, description=, type=ROW<order_id INT32 NOT NULL, site_id INT32 
NOT NULL, price INT32 NOT NULL> NOT NULL, options={{}}}
Encoding positions:
{lhs=0, rhs=1}
Options:{{}}UUID: a35cf07b-2bc1-48b8-b229-3c2368993738
java.lang.IllegalArgumentException: Types not equal. provided output schema: 
Fields:
Field{name=order_id, description=, type=INT32 NOT NULL, options={{}}}
Field{name=site_id, description=, type=INT32 NOT NULL, options={{}}}
Field{name=price, description=, type=INT32 NOT NULL, options={{}}}
Field{name=f_stringArr, description=, type=ARRAY<STRING NOT NULL>, options={{}}}
Field{name=order_id0, description=, type=INT32 NOT NULL, options={{}}}
Field{name=site_id0, description=, type=INT32 NOT NULL, options={{}}}
Field{name=price0, description=, type=INT32 NOT NULL, options={{}}}
Encoding positions:
{f_stringArr=3, price0=6, price=2, site_id=1, order_id0=4, order_id=0, 
site_id0=5}
Options:{{}}UUID: null Schema inferred from select: Fields:
Field{name=317499b1-9c8a-4bb9-8897-4ecda110d02a, description=, type=INT32 NOT 
NULL, options={{}}}
Field{name=46249f84-b89e-439d-b799-a039b427a60a, description=, type=INT32 NOT 
NULL, options={{}}}
Field{name=565d397c-d36c-4387-b2e4-5d6402c839bd, description=, type=INT32 NOT 
NULL, options={{}}}
Field{name=bbe10404-3c44-41a4-b942-16f484211dab, description=, 
type=ARRAY<STRING NOT NULL> NOT NULL, options={{}}}
Field{name=bd3e3adf-ae12-4155-9770-b0123c8bb18c, description=, type=INT32 NOT 
NULL, options={{}}}
Field{name=2b2a14ed-dcd6-45d5-b49d-ae8d3037d6c6, description=, type=INT32 NOT 
NULL, options={{}}}
Field{name=60a346c5-fa18-40e1-819f-c06af92ff033, description=, type=INT32 NOT 
NULL, options={{}}}
Encoding positions:
{2b2a14ed-dcd6-45d5-b49d-ae8d3037d6c6=5, 
60a346c5-fa18-40e1-819f-c06af92ff033=6, 565d397c-d36c-4387-b2e4-5d6402c839bd=2, 
bbe10404-3c44-41a4-b942-16f484211dab=3, bd3e3adf-ae12-4155-9770-b0123c8bb18c=4, 
46249f84-b89e-439d-b799-a039b427a60a=1, 317499b1-9c8a-4bb9-8897-4ecda110d02a=0}
Options:{{}}UUID: null from input type: Fields:
Field{name=lhs, description=, type=ROW<order_id INT32 NOT NULL, site_id INT32 
NOT NULL, price INT32 NOT NULL, f_stringArr ARRAY<STRING NOT NULL>> NOT NULL, 
options={{}}}
Field{name=rhs, description=, type=ROW<order_id INT32 NOT NULL, site_id INT32 
NOT NULL, price INT32 NOT NULL> NOT NULL, options={{}}}
Encoding positions:
{lhs=0, rhs=1}
Options:{{}}UUID: a35cf07b-2bc1-48b8-b229-3c2368993738
    at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument(Preconditions.java:141)
    at 
org.apache.beam.sdk.schemas.transforms.Select$Fields.expand(Select.java:205)
    at 
org.apache.beam.sdk.schemas.transforms.Select$Fields.expand(Select.java:157)
    at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:548)
    at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:482)
    at org.apache.beam.sdk.values.PCollection.apply(PCollection.java:363)
    at 
org.apache.beam.sdk.extensions.sql.impl.rel.BeamCoGBKJoinRel.standardJoin(BeamCoGBKJoinRel.java:196)
    at 
org.apache.beam.sdk.extensions.sql.impl.rel.BeamCoGBKJoinRel.access$400(BeamCoGBKJoinRel.java:75)
    at 
org.apache.beam.sdk.extensions.sql.impl.rel.BeamCoGBKJoinRel$StandardJoin.expand(BeamCoGBKJoinRel.java:135)
    at 
org.apache.beam.sdk.extensions.sql.impl.rel.BeamCoGBKJoinRel$StandardJoin.expand(BeamCoGBKJoinRel.java:93)
    at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:548)
    at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:499)
    at 
org.apache.beam.sdk.extensions.sql.impl.rel.BeamSqlRelUtils.toPCollection(BeamSqlRelUtils.java:72)
    at 
org.apache.beam.sdk.extensions.sql.impl.rel.BeamSqlRelUtils.toPCollection(BeamSqlRelUtils.java:42)
    at 
org.apache.beam.sdk.extensions.sql.impl.rel.BaseRelTest.compilePipeline(BaseRelTest.java:34)
    at 
org.apache.beam.sdk.extensions.sql.impl.rel.BeamCoGBKJoinRelBoundedVsBoundedTest.testInnerJoin(BeamCoGBKJoinRelBoundedVsBoundedTest.java:83)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
    at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at 
org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:322)
    at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:258)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
    at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
    at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
    at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
    at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
    at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
    at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
    at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
    at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
    at 
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
    at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
    at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
    at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
    at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
    at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
    at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
    at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:119)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
    at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
    at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
    at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:182)
    at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:164)
    at 
org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:414)
    at 
org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64)
    at 
org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:48)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at 
org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:56)
    at java.base/java.lang.Thread.run(Thread.java:829) {code}
 

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to