+1 to what JB said.

My concern with Scala has mostly been that it can alienate new contributors
and add ambiguity about when we should use Scala vs. Java. If we’re putting
this in polaris-tools for now and the philosophy for polaris-tools is to
more or less use whatever language you prefer, there should be no issues.

It does make me think that we should more or less isolate each “tool” from
the others, though. What if contributor A wants a different version of a
language or dependency than contributor B? But that’s something we can
figure out as we go.

On Mon, Mar 24, 2025 at 1:46 AM Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> Hi,
>
> Personally, I'm more in favor of hosting the benchmark tool in
> polaris-tools (it looks logical :)).
>
> Now, about Scala, and generally speaking about "maintenance
> questions", I think we should not consider what we (individuals) can
> or want to maintain, but more, what the community (including all
> contributors) can/would like to maintain.
> If we take an analogy with Apache Iceberg, Apache Arrow or Apache
> Beam, we can see python, rust, go, maintained by the community,
> whereas it was probably not the main "skill" of the first
> committers.
>
> So, I don't consider Scala as a question. I also am more in favor of
> moving forward, adding scala support on polaris-tools repo. In the
> lifetime of a project, things can change and refactoring happens, so
> we will always be able to replace Scala or find alternative (to the
> benchmark tool) if there's an ask from the community.
>
> My $0.10 :)
>
> Regards
> JB
>
> On Sun, Mar 23, 2025 at 4:42 PM Michael Collado <collado.m...@gmail.com>
> wrote:
> >
> > Personally, I don’t mind if I have to maintain a bit of Scala code - I like
> > Scala, though every time the question of using it comes up, I see the same
> > concerns that Russell brought up.
> >
> > I will say that if the alternative is to introduce JMeter into the repo,
> > I’m a hard -1. I’ll write Scala all day long to avoid that.
> >
> > Mike
> >
> > On Sat, Mar 22, 2025 at 1:13 PM Russell Spitzer <
> russell.spit...@gmail.com>
> > wrote:
> >
> > > I think we should start a new thread just to gauge consensus on whether
> > > Scala will be allowed in the tools repository or not. To go through my
> > > quick thoughts here.
> > >
> > > I like Scala but I have to be realistic in saying that it is a rather
> > > esoteric language choice and limits the number of community members
> that
> > > can contribute. So it would be a hard -1 for it being included in the
> main
> > > repository.
> > >
> > > Now for the tools repository I would also be a -1 for brand new
> proposals
> > > without code. Scala raises the bar for contributing so it still
> wouldn't be
> > > a great thing to add when other language bindings exist that are much
> more
> > > popular (even if we didn't choose Java)
> > >
> > > The current situation is a little different as we already have code
> written
> > > and I am usually focused on immediate practical benefits over
> hypothetical
> > > problems. So in the current situation I'm more of a -.1.  The reason I
> am
> > > still negative is that inclusion of the benchmarks into the project
> isn't
> > > just about utility to the project, but about whether the community
> should
> > > take up responsibility for maintaining the code. What is important
> here is
> > > not whether the code can be used by the project and contributors but
> about
> > > whether we have enough contributors who are familiar with Scala that
> the
> > > benchmarks can be maintained. We don't want to be in a situation where
> you
> > > win the lottery and we are left high and dry :)
> > >
> > > The value of the code is clearly high, but whether or not it is
> reasonable
> > > for the community to take on responsibility for Scala code (and build)
> > > needs to be polled. As long as a significant fraction of contributors
> don't
> > > have a problem working on Scala code I'm a +1.
> > >
> > > If this contribution was in Java or Python I would be +1 without
> > > reservation.
> > >
> > >
> > > On Sat, Mar 22, 2025 at 12:06 PM Pierre Laporte <pie...@pingtimeout.fr
> >
> > > wrote:
> > >
> > > > I don't mind contributing the benchmarks to `polaris-tools`.  It
> seems
> > > that
> > > > the consensus is clearly in that direction.
> > > >
> > > > I want to address some comments that were made in the PR but that
> are not
> > > > really related to code review per se.
> > > >
> > > > > You can write gatling benchmarks in a language other than Scala.
> > > > >
> > > > > There are also frameworks other than gatling.
> > > >
> > > > To me, the big question is : Assuming the code goes to
> `polaris-tools`,
> > > > _will this contribution be rejected if it uses Scala?_
> > > >
> > > > > I understand that this is a controversial topic, and that the
> > > expected
> > > > maintenance cost is a key factor here.  I made sure that the code is
> > > > documented and that a comprehensive readme file describes how
> datasets
> > > > work.  That way, nobody needs to be a Scala developer to leverage or
> > > > understand the tool.
> > > >
> > > > Those benchmarks have already been used to detect, reproduce and fix
> > > > multiple issues in the codebase.  Issues that had not been caught
> before
> > > > [1] [2] [3].  This shows that the benchmarks already bring value to
> the
> > > > community in their current state.
> > > >
> > > > Now, I want to avoid any misunderstanding.  My current focus is on
> > > evolving
> > > > the benchmarks and covering new cases.  Not on completely rewriting
> the
> > > > code in Java/another framework.  Essentially: focus on the area that
> > > brings
> > > > the most value to Polaris users.
> > > >
> > > > Hence my asking on dev@.  If anything, there will be more Scala code
> > > > pushed
> > > > to the benchmarks branch in the upcoming weeks.  Not less.  I would
> > > > completely understand if the Gatling/Scala design choice is a reason
> for
> > > > rejection.  The discussion simply needs to happen.
> > > >
> > > > [1] https://github.com/apache/polaris/issues/1044
> > > > [2] https://github.com/apache/polaris/issues/1076
> > > > [3] https://github.com/apache/polaris/issues/1123
> > > >
> > > >
> > > > --
> > > >
> > > > Pierre
> > > >
> > > >
> > > > On Sat, Mar 22, 2025 at 3:47 PM Russell Spitzer <
> > > russell.spit...@gmail.com
> > > > >
> > > > wrote:
> > > >
> > > > > I think it makes sense for us to also build some capabilities into
> the
> > > > > tools repo to build Polaris at a specific commit for testing
> purposes.
> > > If
> > > > > the Spark Catalog and Benchmarking code goes there they could both
> > > share
> > > > > this code for testing, ditto for the migration code.
> > > > >
> > > > > On Fri, Mar 21, 2025 at 4:59 PM Yufei Gu <flyrain...@gmail.com>
> wrote:
> > > > >
> > > > > > I’m leaning toward placing it in a separate repository rather
> than in
> > > > > > https://github.com/apache/polaris. The benchmark tool is largely
> > > > > > self-contained and doesn’t have a strong dependency on the main
> > > > codebase.
> > > > > >
> > > > > > IIUC, the only requirement is a running Polaris instance, which
> the
> > > > tool
> > > > > > can connect to using the following configuration:
> > > > > > export CLIENT_ID=your_client_id
> > > > > > export CLIENT_SECRET=your_client_secret
> > > > > > export BASE_URL=http://your-polaris-instance:8181
> > > > > >
> > > > > > Yufei
> > > > > >
> > > > > >
> > > > > > On Thu, Mar 20, 2025 at 6:05 AM Jean-Baptiste Onofré <
> > > j...@nanthrax.net>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Ajantha,
> > > > > > >
> > > > > > > That's a good request.
> > > > > > >
> > > > > > > Imho, right now, before distributing any artifact (either on
> > > nightly
> > > > > > > build space https://nightlies.apache.org/), I prefer to have
> it
> > > > "good
> > > > > > > enough" from a "legal" standpoint (e.g. LICENSE/NOTICE).
> > > > > > >
> > > > > > > I'm almost done about that for all artifacts (jar and
> > > distributions).
> > > > > > > I will open a PR soon.
> > > > > > > Once this PR is done, I will submit a way to provide nightly
> > > builds.
> > > > > > >
> > > > > > > Regards
> > > > > > > JB
> > > > > > >
> > > > > > > On Thu, Mar 20, 2025 at 10:27 AM Ajantha Bhat <
> > > ajanthab...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > I cannot think of any issue with storing that code in the
> > > > > > polaris-tools
> > > > > > > > repository.
> > > > > > > >
> > > > > > > > While contributing the `catalog migrator tool` to
> > > `polaris-tools`,
> > > > I
> > > > > > > > encountered a challenge because this external repository
> needs to
> > > > > > depend
> > > > > > > on
> > > > > > > > Apache Polaris jars, which haven't been published yet by
> Apache
> > > > > > Polaris.
> > > > > > > If
> > > > > > > > we keep the tool in polaris-tools, we may need to wait for
> the
> > > > > nightly
> > > > > > > > build or official jar publication.
> > > > > > > >
> > > > > > > > - Ajantha
> > > > > > > >
> > > > > > > > On Thu, Mar 20, 2025 at 2:46 PM Pierre Laporte <
> > > > > pie...@pingtimeout.fr>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > On Wed, Mar 19, 2025 at 4:53 PM Jean-Baptiste Onofré <
> > > > > > j...@nanthrax.net>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Pierre
> > > > > > > > > >
> > > > > > > > > > Thanks !
> > > > > > > > > >
> > > > > > > > > > I have a general comment: do we want the benchmark tool
> as
> > > part
> > > > > of
> > > > > > > > > > Polaris "core" repo or on polaris-tools ?
> > > > > > > > > > As we can consider this as a benchmark "tool", maybe it
> makes
> > > > > sense
> > > > > > > to
> > > > > > > > > > host it in https://github.com/apache/polaris-tools.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > At this point, apart from the Gradle build files, the
> benchmark
> > > > > code
> > > > > > is
> > > > > > > > > completely contained under the benchmarks/ directory.  And
> > > given
> > > > it
> > > > > > > relies
> > > > > > > > > on the REST API, there is no real dependency to any
> specific
> > > > > Polaris
> > > > > > > > > version.
> > > > > > > > >
> > > > > > > > > I cannot think of any issue with storing that code in the
> > > > > > polaris-tools
> > > > > > > > > repository.
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > >
> > > > > > > > > Pierre
> > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
>
