[ 
https://issues.apache.org/jira/browse/FLINK-29046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582966#comment-17582966
 ] 

Yunhong Zheng commented on FLINK-29046:
---------------------------------------

Hi [~hxbks2ks] and [~chesnay], I tried to reproduce the error in these two 
failed tests and found that the problem is not caused by 
HiveTableSourceStatisticsReport itself. It is caused by 
_org.apache.orc.impl.writer.StructTreeWriter_ in {_}hive-exec-3.1.1{_}. In this 
class, the method _writeFileStatistics_ produces a wrong min value in the 
column stats when it encounters decimal type data, for example:

!image-2022-08-22-21-06-29-980.png|width=395,height=273!

I also found that there are no test cases covering decimal type data in Hive 
3.1.1 or later versions. So I think the best solution for now is to add tests 
that cover the decimal type stats report of Hive 3.x for the ORC format, and 
to open an issue in Hive to fix this error.
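
As a rough sketch of the check such a test could make (this is not the actual 
Flink test code), the min of a decimal column can be computed independently in 
plain Java and compared with the reported stats. The names DecimalMinCheck, 
trueMin and minMatches are hypothetical, and the two sample values are simply 
the expected min and max of f_decimal5 from the failure log quoted below, where 
the failing run reported min=0 instead of 123.45.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Flink, Hive or ORC.
class DecimalMinCheck {

    /** Smallest value of the column, computed without going through the ORC writer. */
    public static BigDecimal trueMin(List<BigDecimal> values) {
        return Collections.min(values);
    }

    /** True if the reported min equals the real min numerically (scale is ignored). */
    public static boolean minMatches(List<BigDecimal> values, BigDecimal reportedMin) {
        return trueMin(values).compareTo(reportedMin) == 0;
    }

    public static void main(String[] args) {
        // Assumed sample data: expected min and max of f_decimal5 from the log.
        List<BigDecimal> fDecimal5 =
                Arrays.asList(new BigDecimal("123.45"), new BigDecimal("223.45"));

        // min from the expected stats
        System.out.println(minMatches(fDecimal5, new BigDecimal("123.45"))); // true
        // min reported in the failing run
        System.out.println(minMatches(fDecimal5, BigDecimal.ZERO)); // false
    }
}
```

compareTo is used instead of equals so that values such as 123.45 and 123.450 
are treated as the same min even when their scales differ.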

> HiveTableSourceStatisticsReportTest fails with Hadoop 3
> -------------------------------------------------------
>
>                 Key: FLINK-29046
>                 URL: https://issues.apache.org/jira/browse/FLINK-29046
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Hive, Tests
>    Affects Versions: 1.16.0
>            Reporter: Chesnay Schepler
>            Priority: Critical
>             Fix For: 1.16.0
>
>         Attachments: image-2022-08-22-21-06-29-980.png
>
>
> {code:java}
> 2022-08-19T13:35:56.1882498Z Aug 19 13:35:56 [ERROR] 
> org.apache.flink.connectors.hive.HiveTableSourceStatisticsReportTest.testFlinkOrcFormatHiveTableSourceStatisticsReport
>   Time elapsed: 9.442 s  <<< FAILURE!
> 2022-08-19T13:35:56.1883817Z Aug 19 13:35:56 
> org.opentest4j.AssertionFailedError: 
> 2022-08-19T13:35:56.1884543Z Aug 19 13:35:56 
> 2022-08-19T13:35:56.1890435Z Aug 19 13:35:56 expected: TableStats{rowCount=3, 
> colStats={f_boolean=ColumnStats(nullCount=1), 
> f_smallint=ColumnStats(nullCount=0, max=128, min=100), 
> f_decimal5=ColumnStats(nullCount=0, max=223.45, min=123.45), f_array=null, 
> f_binary=null, f_decimal38=ColumnStats(nullCount=1, 
> max=123433343334333433343334333433343334.34, 
> min=123433343334333433343334333433343334.33), f_map=null, 
> f_float=ColumnStats(nullCount=1, max=33.33300018310547, 
> min=33.31100082397461), f_row=null, f_tinyint=ColumnStats(nullCount=0, max=3, 
> min=1), f_decimal14=ColumnStats(nullCount=0, max=123333333355.33, 
> min=123333333333.33), f_date=ColumnStats(nullCount=0, max=1990-10-16, 
> min=1990-10-14), f_bigint=ColumnStats(nullCount=0, max=1238123899121, 
> min=1238123899000), f_timestamp3=ColumnStats(nullCount=0, max=1990-10-16 
> 12:12:43.123, min=1990-10-14 12:12:43.123), f_double=ColumnStats(nullCount=0, 
> max=10.1, min=1.1), f_string=ColumnStats(nullCount=0, max=def, min=abcd), 
> f_int=ColumnStats(nullCount=1, max=45536, min=31000)}}
> 2022-08-19T13:35:56.1902811Z Aug 19 13:35:56  but was: TableStats{rowCount=3, 
> colStats={f_boolean=ColumnStats(nullCount=1), 
> f_smallint=ColumnStats(nullCount=0, max=128, min=100), 
> f_decimal5=ColumnStats(nullCount=0, max=223.45, min=0), f_array=null, 
> f_binary=null, f_decimal38=ColumnStats(nullCount=1, 
> max=123433343334333433343334333433343334.34, 
> min=123433343334333433343334333433343334.33), f_map=null, 
> f_float=ColumnStats(nullCount=1, max=33.33300018310547, 
> min=33.31100082397461), f_row=null, f_tinyint=ColumnStats(nullCount=0, max=3, 
> min=1), f_decimal14=ColumnStats(nullCount=0, max=123333333355.33, min=0), 
> f_date=ColumnStats(nullCount=0, max=1990-10-16, min=1990-10-14), 
> f_bigint=ColumnStats(nullCount=0, max=1238123899121, min=1238123899000), 
> f_timestamp3=ColumnStats(nullCount=0, max=1990-10-16 12:12:43.123, 
> min=1990-10-14 12:12:43.123), f_double=ColumnStats(nullCount=0, max=10.1, 
> min=1.1), f_string=ColumnStats(nullCount=0, max=def, min=abcd), 
> f_int=ColumnStats(nullCount=1, max=45536, min=31000)}}
> 2022-08-19T13:35:56.1908634Z Aug 19 13:35:56  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 2022-08-19T13:35:56.1910402Z Aug 19 13:35:56  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> 2022-08-19T13:35:56.1912266Z Aug 19 13:35:56  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 2022-08-19T13:35:56.1913257Z Aug 19 13:35:56  at 
> org.apache.flink.connectors.hive.HiveTableSourceStatisticsReportTest.assertHiveTableOrcFormatTableStatsEquals(HiveTableSourceStatisticsReportTest.java:339)
> 2022-08-19T13:35:56.1914512Z Aug 19 13:35:56  at 
> org.apache.flink.connectors.hive.HiveTableSourceStatisticsReportTest.testFlinkOrcFormatHiveTableSourceStatisticsReport(HiveTableSourceStatisticsReportTest.java:118)
> 2022-08-19T13:35:56.1915444Z Aug 19 13:35:56  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2022-08-19T13:35:56.1916130Z Aug 19 13:35:56  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2022-08-19T13:35:56.1916856Z Aug 19 13:35:56  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2022-08-19T13:35:56.1917571Z Aug 19 13:35:56  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2022-08-19T13:35:56.1918278Z Aug 19 13:35:56  at 
> org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
> 2022-08-19T13:35:56.1919020Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
> 2022-08-19T13:35:56.1919923Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
> 2022-08-19T13:35:56.1920841Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
> 2022-08-19T13:35:56.1921877Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
> 2022-08-19T13:35:56.1922778Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84)
> 2022-08-19T13:35:56.1923726Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115)
> 2022-08-19T13:35:56.1924761Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105)
> 2022-08-19T13:35:56.1925690Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
> 2022-08-19T13:35:56.1926590Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
> 2022-08-19T13:35:56.1927507Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
> 2022-08-19T13:35:56.1928422Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
> 2022-08-19T13:35:56.1929216Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104)
> 2022-08-19T13:35:56.1930018Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98)
> 2022-08-19T13:35:56.1930866Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:214)
> 2022-08-19T13:35:56.1931868Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 2022-08-19T13:35:56.1932794Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:210)
> 2022-08-19T13:35:56.1933757Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:135)
> 2022-08-19T13:35:56.1934645Z Aug 19 13:35:56  at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:66)
> 2022-08-19T13:35:56.1935581Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
> 2022-08-19T13:35:56.1936483Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 2022-08-19T13:35:56.1937381Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
> 2022-08-19T13:35:56.1938153Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
> 2022-08-19T13:35:56.1938980Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
> 2022-08-19T13:35:56.1939899Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 2022-08-19T13:35:56.1940713Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
> 2022-08-19T13:35:56.1941642Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
> 2022-08-19T13:35:56.1942726Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.ForkJoinPoolHierarchicalTestExecutorService$ExclusiveTask.compute(ForkJoinPoolHierarchicalTestExecutorService.java:185)
> 2022-08-19T13:35:56.1943944Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.ForkJoinPoolHierarchicalTestExecutorService.executeNonConcurrentTasks(ForkJoinPoolHierarchicalTestExecutorService.java:155)
> 2022-08-19T13:35:56.1945074Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.ForkJoinPoolHierarchicalTestExecutorService.invokeAll(ForkJoinPoolHierarchicalTestExecutorService.java:135)
> 2022-08-19T13:35:56.1946207Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
> 2022-08-19T13:35:56.1947104Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 2022-08-19T13:35:56.1947941Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
> 2022-08-19T13:35:56.1948776Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
> 2022-08-19T13:35:56.1949613Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
> 2022-08-19T13:35:56.1950509Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 2022-08-19T13:35:56.1951326Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
> 2022-08-19T13:35:56.1952371Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
> 2022-08-19T13:35:56.1953430Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.ForkJoinPoolHierarchicalTestExecutorService$ExclusiveTask.compute(ForkJoinPoolHierarchicalTestExecutorService.java:185)
> 2022-08-19T13:35:56.1954545Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.ForkJoinPoolHierarchicalTestExecutorService.invokeAll(ForkJoinPoolHierarchicalTestExecutorService.java:129)
> 2022-08-19T13:35:56.1955575Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
> 2022-08-19T13:35:56.1956466Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 2022-08-19T13:35:56.1957359Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
> 2022-08-19T13:35:56.1958137Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
> 2022-08-19T13:35:56.1959059Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
> 2022-08-19T13:35:56.1959962Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 2022-08-19T13:35:56.1960783Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
> 2022-08-19T13:35:56.1961688Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
> 2022-08-19T13:35:56.1962766Z Aug 19 13:35:56  at 
> org.junit.platform.engine.support.hierarchical.ForkJoinPoolHierarchicalTestExecutorService$ExclusiveTask.compute(ForkJoinPoolHierarchicalTestExecutorService.java:185)
> 2022-08-19T13:35:56.1963674Z Aug 19 13:35:56  at 
> java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
> 2022-08-19T13:35:56.1964372Z Aug 19 13:35:56  at 
> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> 2022-08-19T13:35:56.1965093Z Aug 19 13:35:56  at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> 2022-08-19T13:35:56.1965758Z Aug 19 13:35:56  at 
> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
> 2022-08-19T13:35:56.1966500Z Aug 19 13:35:56  at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=40205&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
