On Wed, Apr 10, 2024 at 9:54 PM Binwei Yang wrote:
>
> Gluten already supports the Velox backend and the ClickHouse backend.
> DataFusion support has also been proposed, but no one has worked on it.
>
> Gluten isn't a POC. It's under active development, but some companies
> already use it.
>
>
> On 2024/
Gluten already supports the Velox backend and the ClickHouse backend.
DataFusion support has also been proposed, but no one has worked on it.
Gluten isn't a POC. It's under active development, but some companies already
use it.
On 2024/04/11 03:32:01 Dongjoon Hyun wrote:
> I'm interested in your cla
The Java part of Gluten is pretty stable now. Most development is in the C++
code: the Velox code as well as the ClickHouse backend.
The SPIP doesn't plan to introduce the whole Gluten stack into Spark, only the
way to serialize a Spark physical plan and send it to a native backend, through
JNI or gRPC.
We (the Gluten and Arrow folks) actually did plan to put the plan conversion in
the substrait-java repo. But to me it makes more sense to put it in the
Spark repo. Native library and accelerator support will become more and more
important in the future.
On 2024/04/10 08:29:08 Wenchen Fan wrote:
> It'
I'm interested in your claim.
Could you elaborate or provide some evidence for your claim, *a door for
all native libraries*, Binwei?
For example, is there any POC for that claim? Maybe, did I miss something
in that SPIP?
Dongjoon.
On Wed, Apr 10, 2024 at 8:19 PM Binwei Yang wrote:
>
> The SP
The SPIP is not for the current Gluten; it opens a door for all native
libraries and accelerator support.
On 2024/04/11 00:27:43 Weiting Chen wrote:
> Yes, the 1st Apache release (v1.2.0) for Gluten will be in September.
> For Spark version support, Gluten v1.1.1 currently supports Spark 3.2 and 3.3.
Hi Everyone,
I had explored IBM's and AWS's S3 shuffle plugins (some time back), and I
had also explored AWS FSx for Lustre in a few of my production jobs which have
~20TB of shuffle operations with 200-300 executors. What I have observed is
S3 and FSx behaviour was fine during the write phase, however I
Yes, the 1st Apache release (v1.2.0) for Gluten will be in September.
For Spark version support, Gluten v1.1.1 currently supports Spark 3.2 and 3.3.
We are planning to support Spark 3.4 and 3.5 in Gluten v1.2.0.
Spark 4.0 support for Gluten depends on the release schedule in the Spark
community.
On 2
+1 for Wenchen's point.
I don't see a strong reason to pull these transformations into Spark
instead of keeping them in third party packages/projects.
On Wed, Apr 10, 2024 at 5:32 AM Wenchen Fan wrote:
>
> It's good to reduce duplication between different native accelerators of
> Spark, and AFA
This approach makes sense to me.
If Spark K8s operator is aligned with Spark versions, for example, it
uses 4.0.0 now.
Because these JIRA tickets are not actually targeting Spark 4.0.0, it
will cause confusion and more questions, like when we are going to cut
Spark release,
should we include Spark
Cool, looks like we have two options here.
Option 1: Spark Operator and Connect Go Client versioning independent of
Spark, e.g. starting with 0.1.0.
Pros: they can evolve versions independently.
Cons: people will need an extra step to decide the version when using Spark
Operator and Connect Go Cli
I read the SPIP. I have a number of points, if I may:
- Maturity of Gluten: as the excerpt mentions, Gluten is a project, and its
feature set and stability IMO are still under development. Integrating a
non-core component could introduce risks if it is not fully mature
- Complexity: integrating Gl
It's good to reduce duplication between different native accelerators of
Spark, and AFAIK there is already a project trying to solve it:
https://substrait.io/
I'm not sure why we need to do this inside Spark, instead of doing
the unification for a wider scope (for all engines, not only Spark).
O
Ya, that would work.
Inevitably, I looked at Apache Flink K8s Operator's JIRA and GitHub repo.
It looks reasonable to me.
Although they share the same JIRA, they choose different patterns per place.
1. In POM file and Maven Artifact, independent version number.
1.8.0
2. Tag is also based on th