Personally, I don’t mind if I have to maintain a bit of Scala code - I like
Scala, though every time the question of using it comes up, I see the same
concerns that Russell brought up.

I will say that if the alternative is to introduce JMeter into the repo,
I’m a hard -1. I’ll write Scala all day long to avoid that.

Mike

On Sat, Mar 22, 2025 at 1:13 PM Russell Spitzer <russell.spit...@gmail.com>
wrote:

> I think we should start a new thread just to gauge consensus on whether
> Scala will be allowed in the tools repository or not. Here are my quick
> thoughts.
>
> I like Scala but I have to be realistic in saying that it is a rather
> esoteric language choice and limits the number of community members that
> can contribute. So it would be a hard -1 for it being included in the main
> repository.
>
> Now for the tools repository I would also be a -1 for brand new proposals
> without code. Scala raises the bar for contributing so it still wouldn't be
> a great thing to add when other language bindings exist that are much more
> popular (even if we didn't choose Java).
>
> The current situation is a little different as we already have code written
> and I am usually focused on immediate practical benefits over hypothetical
> problems. So in the current situation I'm more of a -.1.  The reason I am
> still negative is that inclusion of the benchmarks into the project isn't
> just about utility to the project, but about whether the community should
> take up responsibility for maintaining the code. What is important here is
> not whether the code can be used by the project and contributors but about
> whether we have enough contributors who are familiar with Scala that the
> benchmarks can be maintained. We don't want to be in a situation where you
> win the lottery and we are left high and dry :)
>
> The value of the code is clearly high, but whether or not it is reasonable
> for the community to take on responsibility for Scala code (and build)
> needs to be polled. As long as a significant fraction of contributors don't
> have a problem working on Scala code I'm a +1.
>
> If this contribution was in Java or Python I would be +1 without
> reservation.
>
>
> On Sat, Mar 22, 2025 at 12:06 PM Pierre Laporte <pie...@pingtimeout.fr>
> wrote:
>
> > I don't mind contributing the benchmarks to `polaris-tools`.  It seems that
> > the consensus is clearly in that direction.
> >
> > I want to address some comments that were made in the PR but that are not
> > really related to code review per se.
> >
> > > You can write gatling benchmarks in a language other than Scala.
> > >
> > > There are also frameworks other than gatling.
> >
> > To me, the big question is: assuming the code goes to `polaris-tools`,
> > _will this contribution be rejected if it uses Scala?_
> >
> > I understand that this is a controversial topic, and that the expected
> > maintenance cost is a key factor here.  I made sure that the code is
> > documented and that a comprehensive readme file describes how datasets
> > work.  That way, nobody needs to be a Scala developer to leverage or
> > understand the tool.
> >
> > Those benchmarks have already been used to detect, reproduce and fix
> > multiple issues in the codebase.  Issues that had not been caught before
> > [1] [2] [3].  This shows that the benchmarks already bring value to the
> > community in their current state.
> >
> > Now, I want to avoid any misunderstanding.  My current focus is on evolving
> > the benchmarks and covering new cases.  Not on completely rewriting the
> > code in Java/another framework.  Essentially: focus on the area that brings
> > the most value to Polaris users.
> >
> > Hence my asking on dev@.  If anything, there will be more Scala code pushed
> > to the benchmarks branch in the upcoming weeks.  Not less.  I would
> > completely understand if the Gatling/Scala design choice is a reason for
> > rejection.  The discussion simply needs to happen.
> >
> > [1] https://github.com/apache/polaris/issues/1044
> > [2] https://github.com/apache/polaris/issues/1076
> > [3] https://github.com/apache/polaris/issues/1123
> >
> >
> > --
> >
> > Pierre
> >
> >
> > On Sat, Mar 22, 2025 at 3:47 PM Russell Spitzer <russell.spit...@gmail.com>
> > wrote:
> >
> > > I think it makes sense for us to also build some capabilities into the
> > > tools repo to build Polaris at a specific commit for testing purposes. If
> > > the Spark Catalog and Benchmarking code go there, they could both share
> > > this code for testing, ditto for the migration code.
> > >
> > > On Fri, Mar 21, 2025 at 4:59 PM Yufei Gu <flyrain...@gmail.com> wrote:
> > >
> > > > I’m leaning toward placing it in a separate repository rather than in
> > > > https://github.com/apache/polaris. The benchmark tool is largely
> > > > self-contained and doesn’t have a strong dependency on the main codebase.
> > > >
> > > > IIUC, the only requirement is a running Polaris instance, which the tool
> > > > can connect to using the following configuration:
> > > > export CLIENT_ID=your_client_id
> > > > export CLIENT_SECRET=your_client_secret
> > > > export BASE_URL=http://your-polaris-instance:8181
> > > >
> > > > Yufei
> > > >
> > > >
> > > > On Thu, Mar 20, 2025 at 6:05 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> > > > wrote:
> > > >
> > > > > Hi Ajantha,
> > > > >
> > > > > That's a good request.
> > > > >
> > > > > Imho, right now, before distributing any artifact (either on nightly
> > > > > build space https://nightlies.apache.org/), I prefer to have it "good
> > > > > enough" from a "legal" standpoint (e.g. LICENSE/NOTICE).
> > > > >
> > > > > I'm almost done with that for all artifacts (jar and distributions).
> > > > > I will open a PR soon.
> > > > > Once this PR is done, I will submit a way to provide nightly builds.
> > > > >
> > > > > Regards
> > > > > JB
> > > > >
> > > > > On Thu, Mar 20, 2025 at 10:27 AM Ajantha Bhat <ajanthab...@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > > I cannot think of any issue with storing that code in the
> > > > > > > polaris-tools repository.
> > > > > >
> > > > > > While contributing the `catalog migrator tool` to `polaris-tools`, I
> > > > > > encountered a challenge because this external repository needs to depend
> > > > > > on Apache Polaris jars, which haven't been published yet by Apache Polaris.
> > > > > > If we keep the tool in polaris-tools, we may need to wait for the nightly
> > > > > > build or official jar publication.
> > > > > >
> > > > > > - Ajantha
> > > > > >
> > > > > > On Thu, Mar 20, 2025 at 2:46 PM Pierre Laporte <pie...@pingtimeout.fr>
> > > > > > wrote:
> > > > > >
> > > > > > > On Wed, Mar 19, 2025 at 4:53 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Pierre
> > > > > > > >
> > > > > > > > Thanks !
> > > > > > > >
> > > > > > > > I have a general comment: do we want the benchmark tool as part of
> > > > > > > > the Polaris "core" repo or in polaris-tools?
> > > > > > > > As we can consider this as a benchmark "tool", maybe it makes sense to
> > > > > > > > host it in https://github.com/apache/polaris-tools.
> > > > > > > >
> > > > > > > >
> > > > > > > At this point, apart from the Gradle build files, the benchmark code is
> > > > > > > completely contained under the benchmarks/ directory.  And given it relies
> > > > > > > on the REST API, there is no real dependency on any specific Polaris
> > > > > > > version.
> > > > > > >
> > > > > > > I cannot think of any issue with storing that code in the
> > > > > > > polaris-tools repository.
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Pierre
> > > > > > >
> > > > >
> > > >
> > >
> >
>
