[ 
https://issues.apache.org/jira/browse/IMPALA-13471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892426#comment-17892426
 ] 

Daniel Becker commented on IMPALA-13471:
----------------------------------------

If NDV stats were generally unavailable in Ozone, 
{{custom_cluster.test_iceberg_with_puffin.TestIcebergTableWithPuffinStats.test_puffin_stats}}
 would also have failed.
In the logs we can see:

{code:java}
W1022 12:51:59.890749 31907 PuffinStatsLoader.java:116] Could not load Iceberg 
Puffin column statistics for table 
'functional_parquet.iceberg_with_puffin_stats' from Puffin file 
'/test-warehouse/iceberg_test/iceberg_with_puffin_stats/metadata/20240906_085606_00006_wsfgs-4d9242d5-bd79-4069-be8b-2cfced8e0647.stats'.
 Exception: org.apache.iceberg.exceptions.RuntimeIOException: Failed to open 
input stream for file: 
/test-warehouse/iceberg_test/iceberg_with_puffin_stats/metadata/20240906_085606_00006_wsfgs-4d9242d5-bd79-4069-be8b-2cfced8e0647.stats
{code}

I think the file path should start with {{ofs://}} on Ozone, as it does in the 
passing test 
{{custom_cluster.test_iceberg_with_puffin.TestIcebergTableWithPuffinStats.test_puffin_stats}}.
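
To make the suspected difference concrete, here is a minimal Python sketch. It is not Impala code: the helper name {{has_scheme}}, the shortened file name and the Ozone authority {{ozone-host}} are all made up for illustration. It only shows how a bare path like the one in the warning differs from a scheme-qualified location.

```python
from urllib.parse import urlparse


def has_scheme(location: str) -> bool:
    """Return True if the location carries a filesystem scheme (e.g. ofs://)."""
    return urlparse(location).scheme != ""


# Bare path in the style of the warning above; the file name is a
# hypothetical placeholder, not the real stats file.
bare = "/test-warehouse/iceberg_test/iceberg_with_puffin_stats/metadata/example.stats"

# Hypothetical fully qualified Ozone location; "ozone-host" is an assumption.
qualified = "ofs://ozone-host" + bare

print(has_scheme(bare))
print(has_scheme(qualified))
```

If the path stored for the dataload table really lacks the scheme, a check of this kind would tell the two cases apart. Whether that is the actual cause still has to be confirmed.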

The failing test uses a table loaded at dataload, whereas the passing Puffin 
test creates its table on the fly. I'll check how we handle the paths.


> test_enable_reading_puffin() seems to fail in the Ozone build
> -------------------------------------------------------------
>
>                 Key: IMPALA-13471
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13471
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Fang-Yu Rao
>            Assignee: Daniel Becker
>            Priority: Major
>              Labels: broken-build
>
> We found that the test 
> [test_enable_reading_puffin()|https://github.com/apache/impala/blame/master/tests/custom_cluster/test_iceberg_with_puffin.py#L59]
>  added in IMPALA-13247 seems to fail in the Ozone build.
> +*Error Message*+
> {code}
> assert [-1, -1] == [2, 2]   At index 0 diff: -1 != 2   Full diff:   - [-1, 
> -1]   + [2, 2]
> {code}
> +*Stacktrace*+
> {code}
> custom_cluster/test_iceberg_with_puffin.py:50: in test_enable_reading_puffin
>     self._read_ndv_stats_expect_result([2, 2])
> custom_cluster/test_iceberg_with_puffin.py:59: in 
> _read_ndv_stats_expect_result
>     assert ndvs == expected_ndv_stats
> E   assert [-1, -1] == [2, 2]
> E     At index 0 diff: -1 != 2
> E     Full diff:
> E     - [-1, -1]
> E     + [2, 2]
> {code}
> According to the above, in the Ozone build the result of "show column stats" 
> was [-1, -1]. It looks like the NDV statistics are not available in the Ozone 
> build.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

