Hello Joe McDonnell, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/22061

to look at the new patch set (#2).

Change subject: IMPALA-13543: single_node_perf_run.py must accept 
tpcds_partitioned
......................................................................

IMPALA-13543: single_node_perf_run.py must accept tpcds_partitioned

The tpcds_partitioned dataset is a fully partitioned version of the
tpcds dataset (the latter partitions only the store_sales table). Unlike
tpcds, it has no default text-format database of its own. Instead, it
relies on a pre-existing text-format tpcds database, whose contents are
loaded into the equivalent tpcds_partitioned database via INSERT
OVERWRITE. It does not have its own query set either; it uses a symlink
to share testdata/workloads/tpcds/queries. Its schema also differs
slightly from tpcds: the column "c_last_review_date" in tpcds is
"c_last_review_date_sk" in tpcds_partitioned (TPC-DS v2.11.0, section
2.4.7). For these reasons, tpcds_partitioned has been ineligible for
perf-AB-test (single_node_perf_run.py).

This patch updates single_node_perf_run.py and related scripts to make
tpcds_partitioned eligible as a benchmark dataset. It adds an initial
step that loads the text database from the tpcds dataset at the selected
scale before running the load script for the tpcds_partitioned dataset.
The compute stats step is also limited to run one at a time so that
concurrent compute stats queries do not overadmit the cluster.
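
The ordering described above can be sketched roughly as follows. This is
only an illustration under stated assumptions: plan_load_steps,
compute_stats_serially, the step names, and the run_query callback are
hypothetical and are not taken from single_node_perf_run.py.

```python
# Hypothetical sketch: none of these names come from the actual patch.

def plan_load_steps(dataset, scale_factor):
    """Return the ordered load steps for a benchmark dataset."""
    steps = []
    if dataset == "tpcds_partitioned":
        # tpcds_partitioned is populated from the text-format tpcds
        # database, so that database has to be loaded first.
        steps.append(("load_text_source", "tpcds", scale_factor))
    steps.append(("load_dataset", dataset, scale_factor))
    return steps


def compute_stats_serially(tables, run_query):
    """Issue COMPUTE STATS for one table at a time.

    run_query is an assumed callback that blocks until the statement
    finishes, so no two COMPUTE STATS statements run concurrently.
    """
    for table in tables:
        run_query("COMPUTE STATS {0}".format(table))
```

Running the statements in a plain loop trades load time for a
predictable admission footprint, which matches the stated goal of not
overadmitting the cluster.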

Testing
- Ran perf-AB-test-ub2004 with this commit included and confirmed that
  the benchmark works with the tpcds_partitioned dataset.
- Ran normal data loading. Passed FE tests and
  query_test/test_tpcds_queries.py.

Change-Id: I4b6f435705dcf873696ffd151052ebeab35d9898
---
M bin/single_node_perf_run.py
M testdata/bin/generate-schema-statements.py
M testdata/datasets/tpcds_partitioned/tpcds_partitioned_schema_template.sql
M tests/util/test_file_parser.py
4 files changed, 118 insertions(+), 77 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/22061/2
--
To view, visit http://gerrit.cloudera.org:8080/22061
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4b6f435705dcf873696ffd151052ebeab35d9898
Gerrit-Change-Number: 22061
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
