Sounds good!

On Tue, Apr 1, 2025 at 10:38 AM Pierre Laporte <pie...@pingtimeout.fr>
wrote:

> Ok so it seems there is a consensus.  The benchmarks can be written in
> Scala as long as they are contributed to the tools repository.  I just
> closed the initial PR that was against the `apache/polaris` repository and
> opened a new one against the `apache/polaris-tools` repository (
> https://github.com/apache/polaris-tools/pull/2).
>
> Thanks for your feedback
>
> --
>
> Pierre
>
>
> On Mon, Mar 24, 2025 at 7:05 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
> > Hi Eric
> >
> > That's a good point. I think that it's something we can manage with
> > each tool in a separate folder/module. And, I'm sure we will find a
> > solution if/when the problem occurs :)
> >
> > Regards
> > JB
> >
> > On Mon, Mar 24, 2025 at 5:51 PM Eric Maynard <eric.w.mayn...@gmail.com>
> > wrote:
> > >
> > > +1 to what JB said.
> > >
> > > My concern with Scala has mostly been that it can alienate new contributors
> > > and add ambiguity about when we should use Scala vs. Java. If we’re putting
> > > this in polaris-tools for now and the philosophy for polaris-tools is to
> > > more or less use whatever language you prefer, there should be no issues.
> > >
> > > It does make me think that we should more or less isolate each “tool”
> > > though. What if contributor A wants a different version of a language or
> > > dependency than contributor B? But that’s something we can figure
> > > out as we go.
> > >
> > > On Mon, Mar 24, 2025 at 1:46 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Personally, I'm more in favor of hosting the benchmark tool in
> > > > polaris-tools (it looks logical :)).
> > > >
> > > > Now, about Scala, and generally speaking about "maintenance
> > > > questions", I think we should not consider what we (individuals) can
> > > > or want to maintain, but more, what the community (including all
> > > > contributors) can/would like to maintain.
> > > > > If we take an analogy with Apache Iceberg, Apache Arrow or Apache
> > > > > Beam, we can see Python, Rust, and Go maintained by the community,
> > > > > whereas they were probably not the main "skill" of the first
> > > > > committers.
> > > >
> > > > > So, I don't consider Scala as a question. I also am more in favor of
> > > > > moving forward, adding Scala support on polaris-tools repo. In the
> > > > > lifetime of a project, things can change and refactoring happens, so
> > > > > we will always be able to replace Scala or find an alternative (to the
> > > > > benchmark tool) if there's an ask from the community.
> > > >
> > > > My $0.10 :)
> > > >
> > > > Regards
> > > > JB
> > > >
> > > > > On Sun, Mar 23, 2025 at 4:42 PM Michael Collado <collado.m...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Personally, I don’t mind if I have to maintain a bit of Scala code - I like
> > > > > > Scala, though every time the question of using it comes up, I see the same
> > > > > > concerns that Russell brought up.
> > > > > >
> > > > > > I will say that if the alternative is to introduce JMeter into the repo,
> > > > > > I’m a hard -1. I’ll write Scala all day long to avoid that.
> > > > >
> > > > > Mike
> > > > >
> > > > > > On Sat, Mar 22, 2025 at 1:13 PM Russell Spitzer <russell.spit...@gmail.com>
> > > > > > wrote:
> > > > >
> > > > > > > I think we should start a new thread just to gauge consensus on whether
> > > > > > > Scala will be allowed in the tools repository or not. Here are my
> > > > > > > quick thoughts.
> > > > > >
> > > > > > > I like Scala but I have to be realistic in saying that it is a rather
> > > > > > > esoteric language choice and limits the number of community members that
> > > > > > > can contribute. So it would be a hard -1 for it being included in the main
> > > > > > > repository.
> > > > > >
> > > > > > > Now for the tools repository I would also be a -1 for brand new proposals
> > > > > > > without code. Scala raises the bar for contributing so it still wouldn't be
> > > > > > > a great thing to add when other language bindings exist that are much more
> > > > > > > popular (even if we didn't choose Java).
> > > > > >
> > > > > > > The current situation is a little different as we already have code written
> > > > > > > and I am usually focused on immediate practical benefits over hypothetical
> > > > > > > problems. So in the current situation I'm more of a -.1.  The reason I am
> > > > > > > still negative is that inclusion of the benchmarks into the project isn't
> > > > > > > just about utility to the project, but about whether the community should
> > > > > > > take up responsibility for maintaining the code. What is important here is
> > > > > > > not whether the code can be used by the project and contributors but about
> > > > > > > whether we have enough contributors who are familiar with Scala that the
> > > > > > > benchmarks can be maintained. We don't want to be in a situation where you
> > > > > > > win the lottery and we are left high and dry :)
> > > > > >
> > > > > > > The value of the code is clearly high, but whether or not it is reasonable
> > > > > > > for the community to take on responsibility for Scala code (and build)
> > > > > > > needs to be polled. As long as a significant fraction of contributors don't
> > > > > > > have a problem working on Scala code I'm a +1.
> > > > > >
> > > > > > If this contribution was in Java or Python I would be +1 without
> > > > > > reservation.
> > > > > >
> > > > > >
> > > > > > > On Sat, Mar 22, 2025 at 12:06 PM Pierre Laporte <pie...@pingtimeout.fr>
> > > > > > > wrote:
> > > > > >
> > > > > > > > I don't mind contributing the benchmarks to `polaris-tools`.  It seems that
> > > > > > > > the consensus is clearly in that direction.
> > > > > > > >
> > > > > > > > I want to address some comments that were made in the PR but that are not
> > > > > > > > really related to code review per se.
> > > > > > >
> > > > > > > > > You can write gatling benchmarks in a language other than Scala.
> > > > > > > > >
> > > > > > > > > There are also frameworks other than gatling.
> > > > > > > >
> > > > > > > > To me, the big question is: assuming the code goes to `polaris-tools`,
> > > > > > > > _will this contribution be rejected if it uses Scala?_
> > > > > > >
> > > > > > > > I understand that this is a controversial topic, and that the expected
> > > > > > > > maintenance cost is a key factor here.  I made sure that the code is
> > > > > > > > documented and that a comprehensive readme file describes how datasets
> > > > > > > > work.  That way, nobody needs to be a Scala developer to leverage or
> > > > > > > > understand the tool.
> > > > > > >
> > > > > > > > Those benchmarks have already been used to detect, reproduce and fix
> > > > > > > > multiple issues in the codebase.  Issues that had not been caught before
> > > > > > > > [1] [2] [3].  This shows that the benchmarks already bring value to the
> > > > > > > > community in their current state.
> > > > > > > >
> > > > > > > > Now, I want to avoid any misunderstanding.  My current focus is on evolving
> > > > > > > > the benchmarks and covering new cases.  Not on completely rewriting the
> > > > > > > > code in Java/another framework.  Essentially: focus on the area that brings
> > > > > > > > the most value to Polaris users.
> > > > > > >
> > > > > > > > Hence my asking on dev@.  If anything, there will be more Scala code pushed
> > > > > > > > to the benchmarks branch in the upcoming weeks.  Not less.  I would
> > > > > > > > completely understand if the Gatling/Scala design choice is a reason for
> > > > > > > > rejection.  The discussion simply needs to happen.
> > > > > > >
> > > > > > > [1] https://github.com/apache/polaris/issues/1044
> > > > > > > [2] https://github.com/apache/polaris/issues/1076
> > > > > > > [3] https://github.com/apache/polaris/issues/1123
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Pierre
> > > > > > >
> > > > > > >
> > > > > > > > On Sat, Mar 22, 2025 at 3:47 PM Russell Spitzer <russell.spit...@gmail.com>
> > > > > > > > wrote:
> > > > > > >
> > > > > > > > > I think it makes sense for us to also build some capabilities into the
> > > > > > > > > tools repo to build Polaris at a specific commit for testing purposes. If
> > > > > > > > > the Spark Catalog and Benchmarking code goes there they could both share
> > > > > > > > > this code for testing, ditto for the migration code.
> > > > > > > >
> > > > > > > > > On Fri, Mar 21, 2025 at 4:59 PM Yufei Gu <flyrain...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > > I’m leaning toward placing it in a separate repository rather than in
> > > > > > > > > > https://github.com/apache/polaris. The benchmark tool is largely
> > > > > > > > > > self-contained and doesn’t have a strong dependency on the main codebase.
> > > > > > > > > >
> > > > > > > > > > IIUC, the only requirement is a running Polaris instance, which the tool
> > > > > > > > > > can connect to using the following configuration:
> > > > > > > > > export CLIENT_ID=your_client_id
> > > > > > > > > export CLIENT_SECRET=your_client_secret
> > > > > > > > > export BASE_URL=http://your-polaris-instance:8181
> > > > > > > > >
> > > > > > > > > Yufei
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > On Thu, Mar 20, 2025 at 6:05 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> > > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Ajantha,
> > > > > > > > > >
> > > > > > > > > > That's a good request.
> > > > > > > > > >
> > > > > > > > > > > Imho, right now, before distributing any artifact (either on nightly
> > > > > > > > > > > build space https://nightlies.apache.org/), I prefer to have it "good
> > > > > > > > > > > enough" from a "legal" standpoint (e.g. LICENSE/NOTICE).
> > > > > > > > > > >
> > > > > > > > > > > I'm almost done with that for all artifacts (jar and distributions).
> > > > > > > > > > > I will open a PR soon.
> > > > > > > > > > > Once this PR is done, I will submit a way to provide nightly builds.
> > > > > > > > > >
> > > > > > > > > > Regards
> > > > > > > > > > JB
> > > > > > > > > >
> > > > > > > > > > > On Thu, Mar 20, 2025 at 10:27 AM Ajantha Bhat <ajanthab...@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > > I cannot think of any issue with storing that code in the polaris-tools
> > > > > > > > > > > > > repository.
> > > > > > > > > > >
> > > > > > > > > > > > While contributing the `catalog migrator tool` to `polaris-tools`, I
> > > > > > > > > > > > encountered a challenge because this external repository needs to depend on
> > > > > > > > > > > > Apache Polaris jars, which haven't been published yet by Apache Polaris. If
> > > > > > > > > > > > we keep the tool in polaris-tools, we may need to wait for the nightly
> > > > > > > > > > > > build or official jar publication.
> > > > > > > > > > >
> > > > > > > > > > > - Ajantha
> > > > > > > > > > >
> > > > > > > > > > > > On Thu, Mar 20, 2025 at 2:46 PM Pierre Laporte <pie...@pingtimeout.fr>
> > > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Mar 19, 2025 at 4:53 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Pierre
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks !
> > > > > > > > > > > > >
> > > > > > > > > > > > > > I have a general comment: do we want the benchmark tool as part of
> > > > > > > > > > > > > > Polaris "core" repo or on polaris-tools?
> > > > > > > > > > > > > > As we can consider this as a benchmark "tool", maybe it makes sense to
> > > > > > > > > > > > > > host it in https://github.com/apache/polaris-tools.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > At this point, apart from the Gradle build files, the benchmark code is
> > > > > > > > > > > > > completely contained under the benchmarks/ directory.  And given it relies
> > > > > > > > > > > > > on the REST API, there is no real dependency to any specific Polaris
> > > > > > > > > > > > > version.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I cannot think of any issue with storing that code in the polaris-tools
> > > > > > > > > > > > > repository.
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > >
> > > > > > > > > > > > Pierre
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> >
>
