[
https://issues.apache.org/jira/browse/IMPALA-14491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18029828#comment-18029828
]
Joe McDonnell commented on IMPALA-14491:
----------------------------------------
With Beeswax, the client is the ImpalaBeeswaxClient from
tests/beeswax/impala_beeswax.py. It fetches the exec summary and then does
extra processing to construct a summary table.
[https://github.com/apache/impala/blob/master/tests/beeswax/impala_beeswax.py#L241-L269]
Since the switch to HS2, the code uses Impyla directly from
tests/performance/query_exec_functions.py. It fetches the exec summary, but it
does not construct the summary table in the same way. It simply converts the
Thrift structure to a string:
[https://github.com/apache/impala/blob/master/tests/performance/query_exec_functions.py#L137]
{noformat}
exec_result.exec_summary = str(cursor.get_summary())
{noformat}
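To illustrate why the stringified summary produces the "string indices must be integers" error in the report, here is a minimal sketch. The OPERATOR value, the row layout, and the helper name first_operator_mismatch are assumptions made for illustration; they are not copied from report_benchmark_results.py.

```python
# Assumption: the report indexes each summary row by a string key.
OPERATOR = "operator"


def first_operator_mismatch(summary_a, summary_b):
    """Compare operator names row by row, as a schema check might do."""
    for row, comp_row in zip(summary_a, summary_b):
        if row[OPERATOR] != comp_row[OPERATOR]:
            return row[OPERATOR], comp_row[OPERATOR]
    return None


# A summary table (list of dict rows) works as the check expects.
table = [{"operator": "00:SCAN HDFS"}, {"operator": "01:AGGREGATE"}]
table_result = first_operator_mismatch(table, table)

# A stringified summary does not: iterating a str yields one-character
# strings, and indexing those with a string key raises TypeError.
stringified = str(table)
raised = None
try:
    first_operator_mismatch(stringified, stringified)
except TypeError as exc:
    raised = exc
```

With a string input, each "row" is a single character, so the failure happens on the first comparison rather than on a later schema mismatch.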
I think one way to fix this is to change that line to construct the summary
table in the structure the report expects.
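A rough sketch of what such a conversion could look like follows. It assumes the summary is a pre-order flattened list of plan nodes, each carrying a label, a child count, and per-instance stats. The Stats and Node classes are hypothetical stand-ins, not the real Thrift types, and the output keys are illustrative rather than the exact ones impala_beeswax.py produces.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Stats:
    """Hypothetical stand-in for one instance's execution stats."""
    latency_ns: int


@dataclass
class Node:
    """Hypothetical stand-in for one plan node in the flattened summary."""
    label: str
    num_children: int
    exec_stats: List[Stats] = field(default_factory=list)


def build_summary_table(nodes):
    """Walk a pre-order flattened plan and emit one dict per operator."""
    rows = []

    def visit(idx, depth):
        node = nodes[idx]
        latencies = [s.latency_ns for s in node.exec_stats]
        rows.append({
            "prefix": "  " * depth,  # indentation mirrors plan depth
            "operator": node.label,
            "num_instances": len(latencies),
            "avg_time": sum(latencies) / len(latencies) if latencies else 0,
            "max_time": max(latencies, default=0),
        })
        # Children follow their parent immediately in pre-order.
        next_idx = idx + 1
        for _ in range(node.num_children):
            next_idx = visit(next_idx, depth + 1)
        return next_idx

    if nodes:
        visit(0, 0)
    return rows


nodes = [
    Node("02:AGGREGATE", 1, [Stats(2000)]),
    Node("01:EXCHANGE", 1, [Stats(1000)]),
    Node("00:SCAN HDFS", 0, [Stats(3000), Stats(5000)]),
]
rows = build_summary_table(nodes)
```

Each row is a dict, so a consumer that indexes rows by key (for example by operator name) would receive the shape it expects instead of a string.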
> benchmark/report_benchmark_results.py fails to process exec summary
> -------------------------------------------------------------------
>
> Key: IMPALA-14491
> URL: https://issues.apache.org/jira/browse/IMPALA-14491
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Affects Versions: Impala 5.0.0
> Reporter: Joe McDonnell
> Priority: Critical
>
> After running the queries, the perf-AB-test job is failing when generating
> the performance difference report:
> {noformat}
> 20:58:02 Traceback (most recent call last):
> 20:58:02   File "/home/ubuntu/Impala/tests/benchmark/report_benchmark_results.py", line 1157, in <module>
> 20:58:02     report = Report(grouped, ref_grouped)
> 20:58:02   File "/home/ubuntu/Impala/tests/benchmark/report_benchmark_results.py", line 509, in __init__
> 20:58:02     self.__analyze()
> 20:58:02   File "/home/ubuntu/Impala/tests/benchmark/report_benchmark_results.py", line 532, in __analyze
> 20:58:02     query_variability_row = Report.QueryVariabilityRow(results, ref_results)
> 20:58:02   File "/home/ubuntu/Impala/tests/benchmark/report_benchmark_results.py", line 494, in __init__
> 20:58:02     self.exec_summary_str = build_exec_summary_str(
> 20:58:02   File "/home/ubuntu/Impala/tests/benchmark/report_benchmark_results.py", line 1104, in build_exec_summary_str
> 20:58:02     combined_summary = CombinedExecSummaries(exec_summaries)
> 20:58:02   File "/home/ubuntu/Impala/tests/benchmark/report_benchmark_results.py", line 649, in __init__
> 20:58:02     ok, err_str = self.__check_exec_summary_schema(exec_summaries)
> 20:58:02   File "/home/ubuntu/Impala/tests/benchmark/report_benchmark_results.py", line 773, in __check_exec_summary_schema
> 20:58:02     if row[OPERATOR] != comp_row[OPERATOR]:
> 20:58:02 TypeError: string indices must be integers
> {noformat}
> https://jenkins.impala.io/job/perf-AB-test-ub2004/325/
> The report is trying to process the exec summary, but the summary doesn't
> have the structure the report expects. This could be related to the switch
> from beeswax to HS2.