Hi Pierre,

Thanks!

I have a general comment: do we want the benchmark tool as part of the Polaris "core" repo or in polaris-tools?

As we can consider this a benchmark "tool", maybe it makes sense to host it in https://github.com/apache/polaris-tools.

Thoughts?

Regards
JB

On Wed, Mar 19, 2025 at 4:06 PM Pierre Laporte <pie...@pingtimeout.fr> wrote:
>
> Hi
>
> I have been working on a set of benchmarks for Polaris [1] and would like
> to contribute them to the project. I have opened a PR with the code, in
> case anybody is interested.
>
> The benchmarks are written using Gatling. The core design decision
> consists in building a procedural dataset, loading it to Polaris, and then
> reusing it for all subsequent benchmarks. The procedural aspect makes it
> possible to deterministically regenerate the same dataset at runtime over
> and over, without having to store the actual data.
>
> With this, it is trivial to generate a large number of Polaris entities.
> Typically, I used this to benchmark the NoSQL persistence implementation
> with 65k namespaces, 65k tables and 65k views. Increasing that to millions
> would only require a one-parameter change. Additionally, the dataset
> currently includes property updates for namespaces, tables and views, which
> can quickly create hundreds of manifests. This can be useful for table
> maintenance testing.
>
> Three benchmarks have been created so far:
>
> - A benchmark that populates an empty Polaris server with a dataset that
>   has predefined attributes
> - A benchmark that issues only read queries over that dataset
> - A benchmark that issues read and write queries (entity updates) over
>   that dataset, with a configurable read/write ratio
>
> The benchmarks/README.md contains instructions to build and run the
> benchmarks, as well as to describe the kind of dataset that should be
> generated.
>
> As with every Gatling benchmark, an HTML page is generated with interactive
> charts showing query performance over time, response time percentiles,
> etc.
>
> I would love to hear your feedback on it.
>
> Pierre
>
> [1] https://github.com/apache/polaris/pull/1208
> --
>
> Pierre Laporte
> @pingtimeout <https://twitter.com/pingtimeout>
> pie...@pingtimeout.fr
> http://www.pingtimeout.fr/
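
To illustrate the procedural-dataset idea described in the quoted message, here is a minimal Java sketch. It assumes a complete n-ary namespace tree in which every name is derived from an ordinal, so the same dataset can be recomputed on each run without storing it. The class name, the `WIDTH` parameter and the `NS_` naming scheme are hypothetical and are not taken from the PR.

```java
import java.util.ArrayList;
import java.util.List;

public class ProceduralDataset {
    // Hypothetical parameter: number of child namespaces per namespace.
    static final int WIDTH = 2;

    // Parent ordinal in a complete WIDTH-ary tree; the root (ordinal 0) has none.
    static int parent(int ordinal) {
        return ordinal == 0 ? -1 : (ordinal - 1) / WIDTH;
    }

    // Full namespace path, root first, computed from the ordinal alone.
    // No state is kept, so repeated calls always yield the same path.
    static List<String> namespacePath(int ordinal) {
        List<String> path = new ArrayList<>();
        for (int n = ordinal; n >= 0; n = parent(n)) {
            path.add(0, "NS_" + n);
        }
        return path;
    }

    public static void main(String[] args) {
        // Print the first seven namespaces of the generated tree.
        for (int i = 0; i < 7; i++) {
            System.out.println(String.join(".", namespacePath(i)));
        }
    }
}
```

In a sketch like this, growing the dataset only means iterating over a larger ordinal range, which matches the "one-parameter change" mentioned in the quoted message.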