Hello Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/23628
to look at the new patch set (#8).
Change subject: Generate parallel data load with batch files
......................................................................
Generate parallel data load with batch files
Creates num_processes files for each phase of the schema SQL data load
so that the files can be executed in parallel.
Analyzes SQL statements to build a dependency graph with networkx, then
batches statements by independent subgraph, so that dependent statements
always execute sequentially while independent statements may execute
concurrently.
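The batching idea above can be sketched roughly as follows. This is not the code from the patch: it uses only the standard library (a small union-find) in place of networkx, where `networkx.weakly_connected_components` on a directed graph would yield the same grouping. The dependency rule (two statements are related if they mention the same table), the regular expression, and the names `tables_in` and `batch_statements` are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical, simplified rule: pick up names that follow TABLE, INTO or FROM.
# Real schema SQL would need a far more careful analysis than this.
TABLE_RE = re.compile(
    r"\b(?:INTO\s+TABLE|TABLE|INTO|FROM)\s+(?:IF\s+(?:NOT\s+)?EXISTS\s+)?([\w.]+)",
    re.IGNORECASE,
)


def tables_in(stmt):
    """Return the lower-cased table names mentioned in one statement."""
    return {name.lower() for name in TABLE_RE.findall(stmt)}


def batch_statements(statements, num_batches):
    """Split statements into num_batches lists.

    Statements that share a table end up in the same list, in their
    original relative order; unrelated groups may land in different lists.
    """
    parent = list(range(len(statements)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link each statement to the first statement that touched the same table.
    first_user = {}
    for idx, stmt in enumerate(statements):
        for table in tables_in(stmt):
            if table in first_user:
                parent[find(idx)] = find(first_user[table])
            else:
                first_user[table] = idx

    # Collect the independent groups, preserving statement order inside each.
    groups = defaultdict(list)
    for idx, stmt in enumerate(statements):
        groups[find(idx)].append(stmt)

    # Greedy balancing: largest group first, always into the shortest batch.
    batches = [[] for _ in range(num_batches)]
    for group in sorted(groups.values(), key=len, reverse=True):
        min(batches, key=len).extend(group)
    return batches


if __name__ == "__main__":
    example = [
        "CREATE TABLE a (id INT)",
        "CREATE TABLE b (id INT)",
        "INSERT INTO a VALUES (1)",
        "CREATE TABLE c AS SELECT * FROM a",
        "INSERT INTO b VALUES (2)",
    ]
    for number, batch in enumerate(batch_statements(example, 2)):
        print(number, batch)
```

Each returned list could then be written to its own file and run by a separate process. Because a whole group always stays in one list, the statements for tables a and c never race each other, while the statements for table b are free to run alongside them.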
Updates the load order documented in load-data.py to match the actual
order.
Speeds up the devdata functional-query load by ~30s, but the load is now
bound by TPC-DS, so there is no significant change overall:
Loading TPC-H data OK (Took: 0 min 13 sec)
Loading functional-query data OK (Took: 0 min 54 sec)
Loading TPC-DS data OK (Took: 1 min 35 sec)
Change-Id: I9586504f6cb91f873f7ed978fda3df32e759ba90
---
M bin/load-data.py
M infra/python/deps/py3-requirements.txt
M testdata/bin/generate-schema-statements.py
3 files changed, 118 insertions(+), 49 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/23628/8
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9586504f6cb91f873f7ed978fda3df32e759ba90
Gerrit-Change-Number: 23628
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Smith <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>