[ https://issues.apache.org/jira/browse/IMPALA-13535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17904246#comment-17904246 ]
ASF subversion and git services commented on IMPALA-13535:
----------------------------------------------------------
Commit 8e71f5ec8609cc046cf35eb044d91bf34ae9f9c7 in impala's branch
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=8e71f5ec8 ]
IMPALA-13535: Add script to restore stats on PlannerTest
Impala has several PlannerTests that validate against the EXTENDED profile
and validate cardinality. At the EXTENDED level, the profile displays stored
table stats from HMS, such as 'numRows' and 'totalSize', which can vary
between data loads. PlannerTest does not validate these lines, but their
frequent changes disrupt code review because they are mostly noise.
This patch provides a Python script, restore-stats-on-planner-tests.py, to
restore the table stats information in selected .test files. The test files
whose table stats are checked and restored are declared inside the script.
It currently focuses on tests under
functional-planner/queries/PlannerTest/tpcds/ and some that test against
the tpcds_partitioned_parquet_snap table. critique-gerrit-review.py is
updated to run with python3, trigger restore-stats-on-planner-tests.py,
and warn if any unnecessary table stats change is detected.
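The restore step described above can be pictured with a minimal sketch. This
is not the code of restore-stats-on-planner-tests.py. It assumes that a plan
in a .test file names the scanned table on a line such as
"00:SCAN HDFS [db.t, RANDOM]" and prints the stored stats on a later line of
the form "table: rows=<n> size=<s>". The function name restore_stats, the
canonical mapping, and the sample values are hypothetical.

```python
import re

# Assumed plan format: a scan node line names the table, and a later
# "table: rows=... size=..." line shows the stored stats for that table.
SCAN_RE = re.compile(r"SCAN \w+ \[([\w.]+)")
STATS_RE = re.compile(r"^(\s*table: rows=)\S+( size=)\S+\s*$")


def restore_stats(text, canonical):
    """Rewrite stored-stats lines to the canonical values.

    canonical maps a table name to a (rows, size) pair of strings.
    Returns the rewritten text and the number of lines that changed.
    """
    current_table = None
    changed = 0
    out = []
    for line in text.splitlines():
        scan = SCAN_RE.search(line)
        if scan:
            # Remember which table the following stats lines belong to.
            current_table = scan.group(1)
        stats = STATS_RE.match(line)
        if stats and current_table in canonical:
            rows, size = canonical[current_table]
            fixed = f"{stats.group(1)}{rows}{stats.group(2)}{size}"
            if fixed != line:
                changed += 1
            line = fixed
        out.append(line)
    # Preserve the trailing newline if the input had one.
    trailer = "\n" if text.endswith("\n") else ""
    return "\n".join(out) + trailer, changed
```

A nonzero change count is the kind of signal a review helper could use to
warn about stats-only noise in a patch. How critique-gerrit-review.py actually
detects it is not shown here.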
This patch also fixes the table size for tests under
functional-planner/queries/PlannerTest/tpcds_cpu_cost/ because all tests
there run with synthetic stats declared in stats-3TB.json. Before this
patch, the table stats printed in the plan were the real stats from HMS.
After this patch, the displayed table stats are calculated from
stats-3TB.json. See IMPALA-12726 for more detail on large-scale planner
test simulation.
Testing:
- Manually ran the script and confirmed that stats lines are replaced
correctly.
- Ran the affected PlannerTests; all passed.
Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682
Reviewed-on: http://gerrit.cloudera.org:8080/22045
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Add script to restore stats on PlannerTest files
> ------------------------------------------------
>
> Key: IMPALA-13535
> URL: https://issues.apache.org/jira/browse/IMPALA-13535
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend, Test
> Reporter: Riza Suminto
> Assignee: Riza Suminto
> Priority: Major
>
> We have several PlannerTests that validate against the EXTENDED profile and
> validate cardinality. At the EXTENDED level, the profile displays stored
> table stats from HMS, such as 'numRows' and 'totalSize', which can vary
> between data loads. PlannerTest does not validate these lines, so they will
> not fail the test, but their frequent changes disrupt code review because
> they are mostly noise.
> We need a script to help restore the stored table stats information in
> those .test files.