I don't mind contributing the benchmarks to `polaris-tools`. It seems that the consensus is clearly in that direction.
I want to address some comments that were made in the PR but that are not really related to code review per se.

> You can write gatling benchmarks in a language other than Scala.
>
> There are also frameworks other than gatling.

To me, the big question is: assuming the code goes to `polaris-tools`, _will this contribution be rejected if it uses Scala?_ I understand that this is a controversial topic, and that the expected maintenance cost is a key factor here.

I made sure that the code is documented and that a comprehensive readme file describes how datasets work. That way, nobody needs to be a Scala developer to leverage or understand the tool.

Those benchmarks have already been used to detect, reproduce and fix multiple issues in the codebase. Issues that had not been caught before [1] [2] [3]. This shows that the benchmarks already bring value to the community in their current state.

Now, I want to avoid any misunderstanding. My current focus is on evolving the benchmarks and covering new cases. Not on completely rewriting the code in Java/another framework. Essentially: focus on the area that brings the most value to Polaris users. Hence my asking on dev@. If anything, there will be more Scala code pushed to the benchmarks branch in the upcoming weeks. Not less.

I would completely understand if the Gatling/Scala design choice is a reason for rejection. The discussion simply needs to happen.

[1] https://github.com/apache/polaris/issues/1044
[2] https://github.com/apache/polaris/issues/1076
[3] https://github.com/apache/polaris/issues/1123

--

Pierre

On Sat, Mar 22, 2025 at 3:47 PM Russell Spitzer <russell.spit...@gmail.com> wrote:

> I think it makes sense for us to also build some capabilities into the
> tools repo to build Polaris at a specific commit for testing purposes. If
> the Spark Catalog and Benchmarking code goes there they could both share
> this code for testing, ditto for the migration code.
>
> On Fri, Mar 21, 2025 at 4:59 PM Yufei Gu <flyrain...@gmail.com> wrote:
>
> > I’m leaning toward placing it in a separate repository rather than in
> > https://github.com/apache/polaris. The benchmark tool is largely
> > self-contained and doesn’t have a strong dependency on the main codebase.
> >
> > IIUC, the only requirement is a running Polaris instance, which the tool
> > can connect to using the following configuration:
> > export CLIENT_ID=your_client_id
> > export CLIENT_SECRET=your_client_secret
> > export BASE_URL=http://your-polaris-instance:8181
> >
> > Yufei
> >
> >
> > On Thu, Mar 20, 2025 at 6:05 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> > wrote:
> >
> > > Hi Ajantha,
> > >
> > > That's a good request.
> > >
> > > Imho, right now, before distributing any artifact (either on nightly
> > > build space https://nightlies.apache.org/), I prefer to have it "good
> > > enough" from a "legal" standpoint (e.g. LICENSE/NOTICE).
> > >
> > > I'm almost done about that for all artifacts (jar and distributions).
> > > I will open a PR soon.
> > > Once this PR is done, I will submit a way to provide nightly builds.
> > >
> > > Regards
> > > JB
> > >
> > > On Thu, Mar 20, 2025 at 10:27 AM Ajantha Bhat <ajanthab...@gmail.com>
> > > wrote:
> > >
> > > > > I cannot think of any issue with storing that code in the
> > > > > polaris-tools repository.
> > > >
> > > > While contributing the `catalog migrator tool` to `polaris-tools`, I
> > > > encountered a challenge because this external repository needs to
> > > > depend on Apache Polaris jars, which haven't been published yet by
> > > > Apache Polaris. If we keep the tool in polaris-tools, we may need to
> > > > wait for the nightly build or official jar publication.
> > > >
> > > > - Ajantha
> > > >
> > > > On Thu, Mar 20, 2025 at 2:46 PM Pierre Laporte <pie...@pingtimeout.fr>
> > > > wrote:
> > > >
> > > > > On Wed, Mar 19, 2025 at 4:53 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> > > > > wrote:
> > > > >
> > > > > > Hi Pierre
> > > > > >
> > > > > > Thanks !
> > > > > >
> > > > > > I have a general comment: do we want the benchmark tool as part of
> > > > > > Polaris "core" repo or on polaris-tools ?
> > > > > > As we can consider this as a benchmark "tool", maybe it makes sense
> > > > > > to host it in https://github.com/apache/polaris-tools.
> > > > > >
> > > > > >
> > > > > At this point, apart from the Gradle build files, the benchmark code
> > > > > is completely contained under the benchmarks/ directory. And given it
> > > > > relies on the REST API, there is no real dependency to any specific
> > > > > Polaris version.
> > > > >
> > > > > I cannot think of any issue with storing that code in the
> > > > > polaris-tools repository.
> > > > >
> > > > > --
> > > > >
> > > > > Pierre
> > > > >
> > > >
> >