Surya Hebbar has uploaded a new patch set (#21). ( http://gerrit.cloudera.org:8080/23154 )
Change subject: IMPALA-9846: Enable Aggregated Runtime Profile by Default
......................................................................

IMPALA-9846: Enable Aggregated Runtime Profile by Default

The traditional query profile generates large forests of runtime profiles and child counters, from fragments and operators up to the instance level. This structure can stress the memory allocator and use far more memory and cache than necessary.

To mitigate these issues, aggregated profiles were introduced; they are substantially denser and faster to process at higher 'mt_dop' values.

In the aggregated profile, the depth of the forests has been reduced by transforming instance-level counters into operator-level arrays or maps. The aggregation is also done in a single step, merging the aggregated thrift profiles from the executor directly into the final aggregated profile, without converting them to an unaggregated profile first.

This representation also helps produce a concise, high-level, readable text profile by default, with the option to produce more detailed profiles and alternate views of the profile when required.

With this change, the aggregated runtime profile is enabled by default, with the 'aggregated_profile' flag set to 'true'. This flag replaces the previous 'gen_experimental_profile' flag.

To enable the aggregated profile, more than 2700 tests, both backend and end-to-end, have been implemented, modified, or corrected to work with both the aggregated runtime profile and the traditional runtime profile.

More than 2500 of these tests have been optimized to run 10x-50x faster by using a single regex search over the entire runtime profile, instead of repeatedly comparing the runtime profile line by line for each statement.

These optimized tests can largely be categorized into either counter value aggregation or row regex searches.
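To illustrate the row regex optimization described above, here is a minimal Python sketch that contrasts a per-line comparison with a single multi-line regex search over the whole profile text. The sample profile, its layout, and the helper names (row_regex_line_by_line, row_regex_single_search) are hypothetical and are not taken from the Impala test framework.

```python
import re

# Hypothetical sample text; real runtime profiles are far larger.
SAMPLE_PROFILE = """\
Execution Profile:
  Fragment F00:
    HDFS_SCAN_NODE (id=0):
      - RowsRead: 150
      - BytesRead: 1024
"""


def row_regex_line_by_line(profile, pattern):
    """Check the pattern against every line of the profile, one line at a time."""
    compiled = re.compile(pattern)
    for line in profile.splitlines():
        if compiled.fullmatch(line.strip()):
            return True
    return False


def row_regex_single_search(profile, pattern):
    """Run one search over the entire profile text.

    With re.MULTILINE, ^ and $ anchor at line boundaries, so the pattern
    still has to cover a whole line. [ \\t]* skips indentation without
    crossing into the next line.
    """
    whole_line = r"^[ \t]*(?:" + pattern + r")[ \t]*$"
    return re.search(whole_line, profile, re.MULTILINE) is not None


if __name__ == "__main__":
    print(row_regex_line_by_line(SAMPLE_PROFILE, r"- RowsRead: \d+"))
    print(row_regex_single_search(SAMPLE_PROFILE, r"- RowsRead: \d+"))
```

Both helpers give the same answer for a given pattern; the second hands the whole text to the regex engine in one call rather than looping over lines in Python.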
However, several other tests have been optimized separately (i.e. assert metrics); these do not fall into either category.

For users and tests to differentiate between profile types, an info string 'Profile Type' has been added to the execution profile. The supported values of the info string are represented by the following enum values.

RuntimeProfileBase::Type:
1. AGGREGATED - Aggregated profile
2. UNAGGREGATED - Traditional profile

For nearly all of these tests, the major requirement was the sum of counter values across fragment instances. This was not present in the text representation of the aggregated profile. To present this information to the tests, the 'total' statistic has been added to the text representation of averaged counters, along with the existing min/max/avg.

  - BytesRead: total=243.99 KB (246834) mean=81.31 KB (85258) min=...

The test outputs for impala-profile-tool have been updated.

With these changes, all the existing, updated and newly added tests pass for both the aggregated and the traditional profile.

Note: Instance-level time series counters (e.g. MemoryUsage, ThreadUsage) are currently not present in the aggregated profile, as they are considered too large for longer queries. Including these counters requires further sampling or aggregation (see IMPALA-14256).
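As a rough illustration of how a test could consume the 'Profile Type' info string and the new 'total' statistic, the following Python sketch extracts both from a text profile. The exact text layout ("Profile Type: <value>" on a line of its own), the second sample counter line, and the helper names (profile_type, counter_totals) are assumptions made for this example; they are not the helpers used in the Impala test suite.

```python
import re

# Hypothetical sample. The first BytesRead line follows the commit message
# example (truncated after 'mean'); the second one is invented.
SAMPLE_PROFILE = """\
Profile Type: AGGREGATED
HDFS_SCAN_NODE (id=0):
  - BytesRead: total=243.99 KB (246834) mean=81.31 KB (85258)
HDFS_SCAN_NODE (id=1):
  - BytesRead: total=1.00 KB (1024) mean=512 B (512)
"""

# Captures the counter name and the raw value in parentheses after 'total='.
COUNTER_TOTAL_RE = re.compile(r"- (?P<name>\w+): total=[^(]*\((?P<total>\d+)\)")
PROFILE_TYPE_RE = re.compile(r"Profile Type: (AGGREGATED|UNAGGREGATED)\b")


def profile_type(profile):
    """Return 'AGGREGATED' or 'UNAGGREGATED', or None if the string is absent."""
    match = PROFILE_TYPE_RE.search(profile)
    return match.group(1) if match else None


def counter_totals(profile):
    """Sum the raw 'total' values per counter name across the whole profile."""
    totals = {}
    for match in COUNTER_TOTAL_RE.finditer(profile):
        name = match.group("name")
        totals[name] = totals.get(name, 0) + int(match.group("total"))
    return totals


if __name__ == "__main__":
    print(profile_type(SAMPLE_PROFILE))
    print(counter_totals(SAMPLE_PROFILE))
```

A test written this way can branch on the profile type first and then compare the summed counter value against its expectation in one pass over the text.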

Change-Id: If41d6322361fba82c946efd614cc7d28cb1c36e8
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/service/impala-server.cc
M be/src/service/query-state-record.cc
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/protobuf/control_service.proto
M testdata/impala-profiles/README
M testdata/impala-profiles/impala_profile_log_tpcds_compute_stats.expected.json
M testdata/impala-profiles/impala_profile_log_tpcds_compute_stats.expected.pretty.json
M testdata/impala-profiles/impala_profile_log_tpcds_compute_stats.expected.txt
M testdata/impala-profiles/impala_profile_log_tpcds_compute_stats_default.expected.txt
M testdata/impala-profiles/impala_profile_log_tpcds_compute_stats_extended.expected.pretty.json
M testdata/impala-profiles/impala_profile_log_tpcds_compute_stats_extended.expected.txt
M testdata/impala-profiles/impala_profile_log_tpcds_compute_stats_v2.expected.json
M testdata/impala-profiles/impala_profile_log_tpcds_compute_stats_v2_default.expected.txt
M testdata/impala-profiles/impala_profile_log_tpcds_compute_stats_v2_extended.expected.pretty.json
M testdata/impala-profiles/impala_profile_log_tpcds_compute_stats_v2_extended.expected.txt
M testdata/workloads/functional-query/queries/DataErrorsTest/avro-errors.test
M testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-json-scan-node-errors.test
M testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-scan-node-errors.test
M testdata/workloads/functional-query/queries/QueryTest/acid-clear-statsaccurate.test
M testdata/workloads/functional-query/queries/QueryTest/acid-profile.test
M testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test
M testdata/workloads/functional-query/queries/QueryTest/admission-max-min-mem-limits.test
M testdata/workloads/functional-query/queries/QueryTest/admission-reject-mem-estimate.test
M testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/ai_generate_text_exprs.test
M testdata/workloads/functional-query/queries/QueryTest/all_runtime_filters.test
M testdata/workloads/functional-query/queries/QueryTest/alter-table.test
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns-tpcds-partitioned-topn.test
M testdata/workloads/functional-query/queries/QueryTest/basic-spilling.test
M testdata/workloads/functional-query/queries/QueryTest/binary-type.test
M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test
M testdata/workloads/functional-query/queries/QueryTest/calcite.test
M testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test
M testdata/workloads/functional-query/queries/QueryTest/codegen-mem-limit.test
M testdata/workloads/functional-query/queries/QueryTest/compute-stats-incremental.test
M testdata/workloads/functional-query/queries/QueryTest/compute-stats.test
M testdata/workloads/functional-query/queries/QueryTest/data-cache.test
M testdata/workloads/functional-query/queries/QueryTest/datasketches-cpc.test
M testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test
M testdata/workloads/functional-query/queries/QueryTest/datasketches-kll.test
M testdata/workloads/functional-query/queries/QueryTest/datastream-sender-codegen.test
M testdata/workloads/functional-query/queries/QueryTest/dedicated-coord-mem-estimates.test
M testdata/workloads/functional-query/queries/QueryTest/disable-codegen.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level1.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test
M testdata/workloads/functional-query/queries/QueryTest/full-acid-original-file.test
M testdata/workloads/functional-query/queries/QueryTest/hbase-hms-column-order.test
M testdata/workloads/functional-query/queries/QueryTest/hdfs_parquet_scan_node_profile.test
M testdata/workloads/functional-query/queries/QueryTest/hdfs_scanner_profile.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-compute-stats.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-load.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-mixed-format-position-deletes.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitions.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-scan-metrics-basic.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-scan-metrics-with-deletes.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-directed-mode.test
M testdata/workloads/functional-query/queries/QueryTest/impala-ext-jdbc-tables.test
M testdata/workloads/functional-query/queries/QueryTest/in_list_filters.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/insert_null.test
M testdata/workloads/functional-query/queries/QueryTest/jdbc-data-source.test
M testdata/workloads/functional-query/queries/QueryTest/joins_mt_dop.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert_mem_limit.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_runtime_filter_with_timestamp_conversion.test
M testdata/workloads/functional-query/queries/QueryTest/max-mt-dop.test
M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test
M testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet-scheduling.test
M testdata/workloads/functional-query/queries/QueryTest/nested-types-scanner-array-materialization.test
M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test
M testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test
M testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters_on_sorted_columns.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-corrupt-footer-len-decr.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-corrupt-footer-len-incr.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-corrupt-rle-counts.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-late-materialization.test
M testdata/workloads/functional-query/queries/QueryTest/processing-cost-admission-slots.test
M testdata/workloads/functional-query/queries/QueryTest/query-impala-13138.test
M testdata/workloads/functional-query/queries/QueryTest/query-resource-limits-hbase.test
M testdata/workloads/functional-query/queries/QueryTest/query-resource-limits-kudu.test
M testdata/workloads/functional-query/queries/QueryTest/query-resource-limits.test
M testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test
M testdata/workloads/functional-query/queries/QueryTest/runtime_filters_mt_dop.test
M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filter_reservations.test
M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test
M testdata/workloads/functional-query/queries/QueryTest/scanner-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/scratch-limit.test
M testdata/workloads/functional-query/queries/QueryTest/sfs.test
M testdata/workloads/functional-query/queries/QueryTest/single-node-joins-with-limits-exhaustive.test
M testdata/workloads/functional-query/queries/QueryTest/single-node-large-sorts.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-aggs.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-broadcast-joins.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-no-debug-action.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-query-options.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive-no-default-buffer-size.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test
M testdata/workloads/functional-query/queries/QueryTest/spilling.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M testdata/workloads/functional-query/queries/QueryTest/strict-mode.test
M testdata/workloads/functional-query/queries/QueryTest/thread-limits.test
M testdata/workloads/functional-query/queries/QueryTest/union-const-scalar-expr-codegen.test
M testdata/workloads/targeted-perf/queries/aggregation.test
M testdata/workloads/tpcds-insert/queries/partitioned-insert.test
M testdata/workloads/tpch/queries/datastream-sender.test
M testdata/workloads/tpch/queries/runtime-profile-aggregated.test
M testdata/workloads/tpch/queries/tpch-passthrough-aggregations.test
M testdata/workloads/tpch/queries/tpch-q1.test
M testdata/workloads/tpch/queries/tpch-q10.test
M testdata/workloads/tpch/queries/tpch-q11.test
M testdata/workloads/tpch/queries/tpch-q12.test
M testdata/workloads/tpch/queries/tpch-q13.test
M testdata/workloads/tpch/queries/tpch-q14.test
M testdata/workloads/tpch/queries/tpch-q15.test
M testdata/workloads/tpch/queries/tpch-q16.test
M testdata/workloads/tpch/queries/tpch-q17.test
M testdata/workloads/tpch/queries/tpch-q18.test
M testdata/workloads/tpch/queries/tpch-q19.test
M testdata/workloads/tpch/queries/tpch-q2.test
M testdata/workloads/tpch/queries/tpch-q20.test
M testdata/workloads/tpch/queries/tpch-q21.test
M testdata/workloads/tpch/queries/tpch-q22.test
M testdata/workloads/tpch/queries/tpch-q3.test
M testdata/workloads/tpch/queries/tpch-q4.test
M testdata/workloads/tpch/queries/tpch-q5.test
M testdata/workloads/tpch/queries/tpch-q6.test
M testdata/workloads/tpch/queries/tpch-q7.test
M testdata/workloads/tpch/queries/tpch-q8.test
M testdata/workloads/tpch/queries/tpch-q9.test
M testdata/workloads/tpch_nested/queries/QueryTest/nested-types-subplan-single-node.test
M tests/common/impala_test_suite.py
M tests/common/test_result_verifier.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_executor_groups.py
M tests/custom_cluster/test_query_live.py
M tests/custom_cluster/test_query_log.py
M tests/custom_cluster/test_query_retries.py
M tests/custom_cluster/test_runtime_profile.py
M tests/custom_cluster/test_tuple_cache.py
M tests/query_test/test_aggregation.py
M tests/query_test/test_fetch.py
M tests/query_test/test_hash_join_timer.py
M tests/query_test/test_iceberg.py
M tests/query_test/test_observability.py
M tests/query_test/test_parquet_bloom_filter.py
M tests/query_test/test_queries.py
M tests/query_test/test_result_spooling.py
M tests/query_test/test_runtime_filters.py
M tests/query_test/test_scanners.py
M tests/query_test/test_sort.py
M tests/unittests/test_result_verifier.py
M tests/util/workload_management.py
161 files changed, 4,587 insertions(+), 4,978 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/23154/21
--
To view, visit http://gerrit.cloudera.org:8080/23154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If41d6322361fba82c946efd614cc7d28cb1c36e8
Gerrit-Change-Number: 23154
Gerrit-PatchSet: 21
Gerrit-Owner: Surya Hebbar <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Surya Hebbar <[email protected]>
