Awesome. Thank you (all)! We've had so many conversations about it and it is great to have it running continuously.

Kenn

On Fri, Sep 16, 2022 at 9:29 AM Sachin Agarwal via dev <[email protected]> wrote:

> This is wonderful - thank you so much to you and the whole Talend team for
> making Beam better!
>
> On Fri, Sep 16, 2022 at 9:11 AM Alexey Romanenko <[email protected]> wrote:
>
>> Hi everybody,
>>
>> As some of you may know, at Talend we've been working for a while to add
>> the TPC-DS benchmark suite to Beam. We believe that having TPC-DS as part
>> of the Beam testing workflow and release routine will help the community
>> quickly detect performance regressions or improvements, identify missing
>> or incorrect Beam SQL features, and execute Beam SQL on different runtime
>> environments with different runners.
>>
>> What is TPC-DS? From the TPC-DS specification document [1]:
>>
>> *“TPC-DS is a decision support benchmark that models several generally
>> applicable aspects of a decision support system, including queries and data
>> maintenance. The benchmark provides a representative evaluation of
>> performance as a general purpose decision support system.”*
>>
>> The TPC-DS benchmark suite for Beam is implemented as a separate testing
>> tool for the Java SDK (like the well-known Nexmark benchmark suite) [2].
>> It supports a limited number of TPC-DS SQL queries for now (mostly because
>> of limited SQL syntax support in Beam), with CSV and Parquet as input data
>> formats, and it runs on Jenkins with the three most popular Beam runners
>> (Spark [3], Flink [4], Dataflow [5]). The job metrics are stored in
>> InfluxDB and can be accessed through Grafana dashboards [6][7][8].
>>
>> More details can be found in the Beam documentation [9].
>>
>> There are certainly still plenty of things to do, like adding new runners
>> and support for other SDKs, data formats, etc., so your contributions are
>> very welcome in any form. Still, we now have a first working, automated
>> version that can be used by the community.
>>
>> Also, I'd like to thank everybody who worked on this improvement!
>>
>> --
>> Alexey
>>
>> [1] https://www.tpc.org/tpc_documents_current_versions/current_specifications5.asp
>> [2] https://github.com/apache/beam/tree/master/sdks/java/testing/tpcds
>> [3] https://ci-beam.apache.org/job/beam_PostCommit_Java_Tpcds_Spark/
>> [4] https://ci-beam.apache.org/job/beam_PostCommit_Java_Tpcds_Flink/
>> [5] https://ci-beam.apache.org/job/beam_PostCommit_Java_Tpcds_Dataflow/
>> [6] http://metrics.beam.apache.org/d/tkqc0AdGk2/tpc-ds-spark-classic-new-sql?orgId=1
>> [7] http://metrics.beam.apache.org/d/8INnSY9Mv/tpc-ds-flink-sql?orgId=1
>> [8] http://metrics.beam.apache.org/d/tkqc0AdGk2/tpc-ds-spark-classic-new-sql?orgId=1
>> [9] https://beam.apache.org/documentation/sdks/java/testing/tpcds/
