Having benchmark results against individual commits is a great thing to
have.
However, the small GH-hosted runners are not suitable for producing
deterministic/comparable results.
It would be possible though, if the hardware (or bare-metal compute
instances in the cloud) is available to the project. I
Hi Pierre,
Thanks !
I will take a look at the new PR :)
Regards
JB
On Tue, Apr 1, 2025 at 5:38 PM Pierre Laporte wrote:
>
> Ok so it seems there is a consensus. The benchmarks can be written in
> Scala as long as they are contributed to the tools repository. I just
> closed the initial PR th
Ok so it seems there is a consensus. The benchmarks can be written in
Scala as long as they are contributed to the tools repository. I just
closed the initial PR that was against the `apache/polaris` repository and
opened a new one against the `apache/polaris-tools` repository (
https://github.co
Sounds good!
On Tue, Apr 1, 2025 at 10:38 AM Pierre Laporte
wrote:
> Ok so it seems there is a consensus. The benchmarks can be written in
> Scala as long as they are contributed to the tools repository. I just
> closed the initial PR that was against the `apache/polaris` repository and
> open
I think having a tool like this is a great idea. Would we be able to host
the results over time as well? Like an official build run that triggers on
a daily basis?
On Wed, Mar 19, 2025 at 10:07 AM Pierre Laporte
wrote:
> Hi
>
> I have been working on a set of benchmarks for Polaris [1] and would
Hi Eric
That's a good point. I think that it's something we can manage with
each tool in a separate folder/module. And, I'm sure we will find a
solution if/when the problem occurs :)
Regards
JB
On Mon, Mar 24, 2025 at 5:51 PM Eric Maynard wrote:
>
> +1 to what JB said.
>
> My concern with S
+1 to what JB said.
My concern with Scala has mostly been that it can alienate new contributors
and add ambiguity about when we should use Scala vs. Java. If we’re putting
this in polaris-tools for now and the philosophy for polaris-tools is to
more or less use whatever language you prefer, there
Hi,
Personally, I'm more in favor of hosting the benchmark tool in
polaris-tools (it looks logical :)).
Now, about Scala, and generally speaking about "maintenance
questions", I think we should not consider what we (individuals) can
or want to maintain, but more, what the community (including all
Personally, I don’t mind if I have to maintain a bit of Scala code - I like
Scala, though every time the question of using it comes up, I see the same
concerns that Russell brought up.
I will say that if the alternative is to introduce JMeter into the repo,
I’m a hard -1. I’ll write Scala all day long
I’m leaning toward placing it in a separate repository rather than in
https://github.com/apache/polaris. The benchmark tool is largely
self-contained and doesn’t have a strong dependency on the main codebase.
IIUC, the only requirement is a running Polaris instance, which the tool
can connect to u
I think we should start a new thread just to gauge consensus on whether
Scala will be allowed in the tools repository or not. To go through my
quick thoughts here.
I like Scala but I have to be realistic in saying that it is a rather
esoteric language choice and limits the number of community memb
I don't mind contributing the benchmarks to `polaris-tools`. It seems that
the consensus is clearly in that direction.
I want to address some comments that were made in the PR but that are not
really related to code review per se.
> You can write gatling benchmarks in a language other than Scala
I think it makes sense for us to also build some capabilities into the
tools repo to build Polaris at a specific commit for testing purposes. If
the Spark Catalog and Benchmarking code goes there they could both share
this code for testing, ditto for the migration code.
On Fri, Mar 21, 2025 at 4:5
Hi Ajantha,
That's a good request.
Imho, right now, before distributing any artifact (either on nightly
build space https://nightlies.apache.org/), I prefer to have it "good
enough" from a "legal" standpoint (e.g. LICENSE/NOTICE).
I'm almost done with that for all artifacts (jar and distributio
> I cannot think of any issue with storing that code in the polaris-tools
repository.
While contributing the `catalog migrator tool` to `polaris-tools`, I
encountered a challenge because this external repository needs to depend on
Apache Polaris jars, which haven't been published yet by Apache Pol
On Wed, Mar 19, 2025 at 4:53 PM Jean-Baptiste Onofré
wrote:
> Hi Pierre
>
> Thanks !
>
> I have a general comment: do we want the benchmark tool as part of
> Polaris "core" repo or on polaris-tools ?
> As we can consider this as a benchmark "tool", maybe it makes sense to
> host it in https://git
Hey,
Yes, we have precedent for sponsored "machines/executors".
For instance, at Apache Beam, we had (and still have) sponsored
Jenkins executors (there are some requirements from ASF Infra, but it is
possible).
Regards
JB
On Wed, Mar 19, 2025 at 5:23 PM Robert Stupp wrote:
>
> Having benchmark
Thanks Pierre!
It's great to have a benchmark tool to measure performance. It'd be awesome
to make decisions based on numbers instead of theories.
Yufei
On Wed, Mar 19, 2025 at 8:53 AM Jean-Baptiste Onofré
wrote:
> Hi Pierre
>
> Thanks !
>
> I have a general comment: do we want the benchmark
Thank you so much for the benchmarks !
+1, having benchmark results committed will help catch any degradation
/ correctness issue that can creep in !
It is equivalent to the golden files of TPC-DS / TPC-H in the Spark repo.
Best,
Prashant Sungh
On Wed, Mar 19, 2025 at 8:53 AM Russell Spitzer
wrote:
> I t
Hi Pierre
Thanks !
I have a general comment: do we want the benchmark tool as part of
Polaris "core" repo or on polaris-tools ?
As we can consider this as a benchmark "tool", maybe it makes sense to
host it in https://github.com/apache/polaris-tools.
Thoughts ?
Regards
JB
On Wed, Mar 19, 2025