Hey guys,

I put up a small project on GitHub [1] with Hive metastore dumps from
tpcds10tb/tpcds30tb (+partitioning) and some scripts to quickly spin up a
dockerized Postgres with those loaded.

Personally, I find it useful to check the plans of TPC-DS queries with the
usual qtest mechanism (no external tools and no need to tap into a real
cluster) while having beefy stats + partitioning info at hand. The driver
and other changes needed to run these tests are located in [2].

I am sharing it here in case it might be of use to somebody else.

The two main commands that you will need if you wanna try this out:

  docker build --tag postgres-tpcds-metastore:1.0 .
  mvn test -Dtest=TestTezPerfDBCliDriver -Dtest.output.overwrite=true -Dtest.metastore.db=postgres.tpcds
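
In case it helps, the two steps can be wrapped in a small script. This is
only a sketch: the run helper, the function names and the RUN variable are
made up for illustration and are not part of [1] or [2]. By default it just
prints each command as a single line; set RUN=1 to actually execute them.
I am assuming the docker build happens in a checkout of [1] and the mvn
command in a checkout of [2], from wherever you normally run qtests.

```shell
#!/bin/sh
# Sketch only: wraps the two commands from this mail.
# Prints the commands unless RUN=1 is set in the environment.

IMAGE_TAG="postgres-tpcds-metastore:1.0"   # image tag used in this mail

# Hypothetical helper: execute the command if RUN=1, otherwise print it.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Step 1 (assumed to run in a checkout of [1]): build the Postgres image.
build_image() {
  run docker build --tag "$IMAGE_TAG" .
}

# Step 2 (assumed to run in a checkout of [2]): run the perf driver against
# the dockerized metastore and overwrite the expected plan outputs.
run_perf_tests() {
  run mvn test -Dtest=TestTezPerfDBCliDriver \
    -Dtest.output.overwrite=true \
    -Dtest.metastore.db=postgres.tpcds
}

build_image
run_perf_tests
```

Note that the three -D options belong to one mvn invocation; the backslashes
in the script are only there to keep the lines short.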

Small caveat: currently in [2] the dockerized Postgres is restarted for
every query, which makes things slow. This will be fixed later on.

Best,
Stamatis

[1] https://github.com/zabetak/hive-postgres-metastore
[2] https://github.com/zabetak/hive/tree/qtest_postgres_driver
