[ 
https://issues.apache.org/jira/browse/BEAM-13577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475982#comment-17475982
 ] 

Talat Uyarer commented on BEAM-13577:
-------------------------------------

[~kenn] I am waiting for [~ibzib]'s review. Once that is done, I hope we can 
merge to master. 

> Beam Select's uniquifyNames function loses nullability of Complex types while 
> inferring schema 
> -----------------------------------------------------------------------------------------------
>
>                 Key: BEAM-13577
>                 URL: https://issues.apache.org/jira/browse/BEAM-13577
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql, sdk-java-core
>    Affects Versions: 2.32.0, 2.33.0, 2.34.0
>            Reporter: Talat Uyarer
>            Assignee: Talat Uyarer
>            Priority: P2
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> We use BeamSQL in our project. For any JOIN, BeamSQL generates a 
> BeamCoGBKJoinRel plan, which uses Select from the core SDK. While Select 
> infers the output schema, it loses the nullability of complex types such as 
> Array and Map. An example error is shown below. 
> {code:java}
> INFO: SQL:
> SELECT `o1`.`order_id`, `o1`.`site_id`, `o1`.`price`, `o1`.`f_stringArr`, 
> `o2`.`order_id` AS `order_id0`, `o2`.`site_id` AS `site_id0`, `o2`.`price` AS 
> `price0`
> FROM `beam`.`ORDER_DETAILS1_WITH_ARRAY` AS `o1`
> INNER JOIN `beam`.`ORDER_DETAILS2` AS `o2` ON `o1`.`order_id` = 
> `o2`.`site_id` AND `o2`.`price` = `o1`.`site_id`
> Dec 28, 2021 1:20:14 PM 
> org.apache.beam.sdk.extensions.sql.impl.CalciteQueryPlanner convertToBeamRel
> INFO: SQLPlan>
> LogicalProject(order_id=[$0], site_id=[$1], price=[$2], f_stringArr=[$3], 
> order_id0=[$4], site_id0=[$5], price0=[$6])
>   LogicalJoin(condition=[AND(=($0, $5), =($6, $1))], joinType=[inner])
>     BeamIOSourceRel(table=[[beam, ORDER_DETAILS1_WITH_ARRAY]])
>     BeamIOSourceRel(table=[[beam, ORDER_DETAILS2]])
> Dec 28, 2021 1:20:14 PM 
> org.apache.beam.sdk.extensions.sql.impl.CalciteQueryPlanner convertToBeamRel
> INFO: BEAMPlan>
> BeamCoGBKJoinRel(condition=[AND(=($0, $5), =($6, $1))], joinType=[inner])
>   BeamIOSourceRel(table=[[beam, ORDER_DETAILS1_WITH_ARRAY]])
>   BeamIOSourceRel(table=[[beam, ORDER_DETAILS2]])
> Types not equal. provided output schema: Fields:
> Field{name=order_id, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=site_id, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=price, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=f_stringArr, description=, type=ARRAY<STRING NOT NULL>, 
> options={{}}}
> Field{name=order_id0, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=site_id0, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=price0, description=, type=INT32 NOT NULL, options={{}}}
> Encoding positions:
> {f_stringArr=3, price0=6, price=2, site_id=1, order_id0=4, order_id=0, 
> site_id0=5}
> Options:{{}}UUID: null Schema inferred from select: Fields:
> Field{name=317499b1-9c8a-4bb9-8897-4ecda110d02a, description=, type=INT32 NOT 
> NULL, options={{}}}
> Field{name=46249f84-b89e-439d-b799-a039b427a60a, description=, type=INT32 NOT 
> NULL, options={{}}}
> Field{name=565d397c-d36c-4387-b2e4-5d6402c839bd, description=, type=INT32 NOT 
> NULL, options={{}}}
> Field{name=bbe10404-3c44-41a4-b942-16f484211dab, description=, 
> type=ARRAY<STRING NOT NULL> NOT NULL, options={{}}}
> Field{name=bd3e3adf-ae12-4155-9770-b0123c8bb18c, description=, type=INT32 NOT 
> NULL, options={{}}}
> Field{name=2b2a14ed-dcd6-45d5-b49d-ae8d3037d6c6, description=, type=INT32 NOT 
> NULL, options={{}}}
> Field{name=60a346c5-fa18-40e1-819f-c06af92ff033, description=, type=INT32 NOT 
> NULL, options={{}}}
> Encoding positions:
> {2b2a14ed-dcd6-45d5-b49d-ae8d3037d6c6=5, 
> 60a346c5-fa18-40e1-819f-c06af92ff033=6, 
> 565d397c-d36c-4387-b2e4-5d6402c839bd=2, 
> bbe10404-3c44-41a4-b942-16f484211dab=3, 
> bd3e3adf-ae12-4155-9770-b0123c8bb18c=4, 
> 46249f84-b89e-439d-b799-a039b427a60a=1, 
> 317499b1-9c8a-4bb9-8897-4ecda110d02a=0}
> Options:{{}}UUID: null from input type: Fields:
> Field{name=lhs, description=, type=ROW<order_id INT32 NOT NULL, site_id INT32 
> NOT NULL, price INT32 NOT NULL, f_stringArr ARRAY<STRING NOT NULL>> NOT NULL, 
> options={{}}}
> Field{name=rhs, description=, type=ROW<order_id INT32 NOT NULL, site_id INT32 
> NOT NULL, price INT32 NOT NULL> NOT NULL, options={{}}}
> Encoding positions:
> {lhs=0, rhs=1}
> Options:{{}}UUID: a35cf07b-2bc1-48b8-b229-3c2368993738
> java.lang.IllegalArgumentException: Types not equal. provided output schema: 
> Fields:
> Field{name=order_id, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=site_id, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=price, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=f_stringArr, description=, type=ARRAY<STRING NOT NULL>, 
> options={{}}}
> Field{name=order_id0, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=site_id0, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=price0, description=, type=INT32 NOT NULL, options={{}}}
> Encoding positions:
> {f_stringArr=3, price0=6, price=2, site_id=1, order_id0=4, order_id=0, 
> site_id0=5}
> Options:{{}}UUID: null Schema inferred from select: Fields:
> Field{name=317499b1-9c8a-4bb9-8897-4ecda110d02a, description=, type=INT32 NOT 
> NULL, options={{}}}
> Field{name=46249f84-b89e-439d-b799-a039b427a60a, description=, type=INT32 NOT 
> NULL, options={{}}}
> Field{name=565d397c-d36c-4387-b2e4-5d6402c839bd, description=, type=INT32 NOT 
> NULL, options={{}}}
> Field{name=bbe10404-3c44-41a4-b942-16f484211dab, description=, 
> type=ARRAY<STRING NOT NULL> NOT NULL, options={{}}}
> Field{name=bd3e3adf-ae12-4155-9770-b0123c8bb18c, description=, type=INT32 NOT 
> NULL, options={{}}}
> Field{name=2b2a14ed-dcd6-45d5-b49d-ae8d3037d6c6, description=, type=INT32 NOT 
> NULL, options={{}}}
> Field{name=60a346c5-fa18-40e1-819f-c06af92ff033, description=, type=INT32 NOT 
> NULL, options={{}}}
> Encoding positions:
> {2b2a14ed-dcd6-45d5-b49d-ae8d3037d6c6=5, 
> 60a346c5-fa18-40e1-819f-c06af92ff033=6, 
> 565d397c-d36c-4387-b2e4-5d6402c839bd=2, 
> bbe10404-3c44-41a4-b942-16f484211dab=3, 
> bd3e3adf-ae12-4155-9770-b0123c8bb18c=4, 
> 46249f84-b89e-439d-b799-a039b427a60a=1, 
> 317499b1-9c8a-4bb9-8897-4ecda110d02a=0}
> Options:{{}}UUID: null from input type: Fields:
> Field{name=lhs, description=, type=ROW<order_id INT32 NOT NULL, site_id INT32 
> NOT NULL, price INT32 NOT NULL, f_stringArr ARRAY<STRING NOT NULL>> NOT NULL, 
> options={{}}}
> Field{name=rhs, description=, type=ROW<order_id INT32 NOT NULL, site_id INT32 
> NOT NULL, price INT32 NOT NULL> NOT NULL, options={{}}}
> Encoding positions:
> {lhs=0, rhs=1}
> Options:{{}}UUID: a35cf07b-2bc1-48b8-b229-3c2368993738
>     at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument(Preconditions.java:141)
>     at 
> org.apache.beam.sdk.schemas.transforms.Select$Fields.expand(Select.java:205)
>     at 
> org.apache.beam.sdk.schemas.transforms.Select$Fields.expand(Select.java:157)
>     at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:548)
>     at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:482)
>     at org.apache.beam.sdk.values.PCollection.apply(PCollection.java:363)
>     at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BeamCoGBKJoinRel.standardJoin(BeamCoGBKJoinRel.java:196)
>     at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BeamCoGBKJoinRel.access$400(BeamCoGBKJoinRel.java:75)
>     at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BeamCoGBKJoinRel$StandardJoin.expand(BeamCoGBKJoinRel.java:135)
>     at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BeamCoGBKJoinRel$StandardJoin.expand(BeamCoGBKJoinRel.java:93)
>     at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:548)
>     at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:499)
>     at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BeamSqlRelUtils.toPCollection(BeamSqlRelUtils.java:72)
>     at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BeamSqlRelUtils.toPCollection(BeamSqlRelUtils.java:42)
>     at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BaseRelTest.compilePipeline(BaseRelTest.java:34)
>     at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BeamCoGBKJoinRelBoundedVsBoundedTest.testInnerJoin(BeamCoGBKJoinRelBoundedVsBoundedTest.java:83)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
>     at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at 
> org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:322)
>     at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:258)
>     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>     at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>     at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>     at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>     at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>     at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>     at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>     at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
>     at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>     at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>     at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
>     at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
>     at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
>     at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>     at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
>     at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
>     at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>     at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:119)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
>     at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
>     at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>     at 
> org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:182)
>     at 
> org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:164)
>     at 
> org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:414)
>     at 
> org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64)
>     at 
> org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:48)
>     at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at 
> org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:56)
>     at java.base/java.lang.Thread.run(Thread.java:829) {code}
>  
>  
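The mismatch in the log above (ARRAY<STRING NOT NULL> in the provided schema versus ARRAY<STRING NOT NULL> NOT NULL in the schema inferred from select) can be reproduced with a small stand-alone model. This is only a sketch under stated assumptions: FieldType, rebuildLossy and rebuildPreserving are hypothetical names invented for illustration and are not Beam's Schema.FieldType or the actual uniquifyNames code. It shows how rebuilding a container type from its element type can drop the outer nullable flag unless the flag is copied over explicitly.

```java
import java.util.Objects;

// Hypothetical, simplified model of a schema field type. NOT Beam's API.
class NullabilitySketch {

    static final class FieldType {
        final String typeName;   // e.g. "STRING", "INT32", "ARRAY"
        final FieldType element; // element type for ARRAY, otherwise null
        final boolean nullable;

        FieldType(String typeName, FieldType element, boolean nullable) {
            this.typeName = typeName;
            this.element = element;
            this.nullable = nullable;
        }

        // Factory methods create non-nullable types by default.
        static FieldType of(String typeName) {
            return new FieldType(typeName, null, false);
        }

        static FieldType array(FieldType element) {
            return new FieldType("ARRAY", element, false);
        }

        FieldType withNullable(boolean value) {
            return new FieldType(typeName, element, value);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof FieldType)) {
                return false;
            }
            FieldType other = (FieldType) o;
            return typeName.equals(other.typeName)
                && nullable == other.nullable
                && Objects.equals(element, other.element);
        }

        @Override
        public int hashCode() {
            return Objects.hash(typeName, element, nullable);
        }

        // Mirrors the log format: nullable types carry no "NOT NULL" suffix.
        @Override
        public String toString() {
            String base = element == null ? typeName : typeName + "<" + element + ">";
            return nullable ? base : base + " NOT NULL";
        }
    }

    // Lossy rebuild: the array is recreated from its element type,
    // so the outer nullable flag silently falls back to the default (false).
    static FieldType rebuildLossy(FieldType type) {
        if (type.element != null) {
            return FieldType.array(rebuildLossy(type.element));
        }
        return type;
    }

    // Preserving rebuild: same recursion, but the original flag is copied over.
    static FieldType rebuildPreserving(FieldType type) {
        if (type.element != null) {
            return FieldType.array(rebuildPreserving(type.element))
                .withNullable(type.nullable);
        }
        return type;
    }

    public static void main(String[] args) {
        // A nullable array of non-null strings, like f_stringArr in the log.
        FieldType provided = FieldType.array(FieldType.of("STRING")).withNullable(true);

        System.out.println(provided);                    // ARRAY<STRING NOT NULL>
        System.out.println(rebuildLossy(provided));      // ARRAY<STRING NOT NULL> NOT NULL
        System.out.println(rebuildPreserving(provided)); // ARRAY<STRING NOT NULL>
    }
}
```

In this model the lossy rebuild no longer equals the provided type, which is the kind of inequality a "Types not equal" check would reject, while the preserving rebuild compares equal.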



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
