I think both Celeborn and Uniffle are good options for a general-purpose
shuffle service, and I recommend that you try them :). For any questions
about Celeborn, we're happy to discuss on Celeborn's mailing lists [1][2] or
Slack [3].

[1] u...@celeborn.apache.org
[2] d...@celeborn.apache.org
[3] https://join.slack.com/t/apachecelebor-kw08030/shared_invite/zt-1ju3hd5j8-4Z5keMdzpcVMspe4UJzF4Q

Thanks,
Keyong Zhou

On 2023/10/31 14:24:38 "Battula, Brahma Reddy" wrote:
> Thanks for bringing this up. Good to see that it supports Spark and Flink.
> 
> Have you done a comparison between Uniffle and Celeborn?
> 
> 
> On 30/10/23, 8:01 AM, "Keyong Zhou" <zho...@apache.org> wrote:
> 
> 
> Great to hear this! It's encouraging that Celeborn helps MR3.
> 
> 
> Celeborn is a general-purpose remote shuffle service that stores and serves
> shuffle data (and other intermediate data in the future). It helps compute
> engines make better use of disaggregated architectures and become more
> efficient and stable for jobs with very large shuffles.
> 
> 
> Currently Celeborn supports Hive on MR, and I think the MR3 integration
> provides a good example for supporting Hive on Tez.
> 
> 
> Thanks,
> Keyong Zhou
> 
> 
> On 2023/10/24 12:08:54 Sungwoo Park wrote:
> > Hi Hive users,
> >
> > Before the impending release of MR3 1.8, we would like to announce the
> > release of Hive-MR3 with Celeborn (Hive 3.1.3 on MR3 1.8 with Celeborn
> > 0.3.1).
> >
> > Apache Celeborn [1] is a remote shuffle service, similar to Magnet [2] and
> > Apache Uniffle [3] (which was discussed in this Hive mailing list a while
> > ago). Celeborn officially supports Spark and Flink, and we have implemented
> > an MR3-extension for Celeborn.
> >
> > In addition to all the benefits of using a remote shuffle service,
> > Hive-MR3-Celeborn supports direct processing of mapper output on the
> > reducer side, which means that reducers do not store mapper output on local
> > disks (for unordered edges). In this way, Hive-MR3-Celeborn can eliminate
> > over 95% of local disk writes when tested on the 10TB TPC-DS benchmark.
> > This can be particularly useful when running Hive-MR3 on public clouds
> > where fast local disk storage is expensive or not available.
> >
> > We have documented the usage of Hive-MR3-Celeborn in [4]. You can download
> > Hive-MR3-Celeborn from [5].
> >
> > FYI, MR3 is an execution engine providing native support for Hadoop,
> > Kubernetes, and standalone mode [6]. Hive-MR3, its main application,
> > provides the performance of LLAP yet is very easy to install and operate.
> > If you are using Hive-Tez for running ETL jobs, switching to Hive-MR3 will
> > give you a much higher throughput thanks to its advanced resource sharing
> > model.
> >
> > We have recently opened a Slack channel. If interested, please join the
> > Slack channel and ask any questions about MR3:
> >
> > https://join.slack.com/t/mr3-help/shared_invite/zt-1wpqztk35-AN8JRDznTkvxFIjtvhmiNg
> >
> > Thank you,
> >
> > --- Sungwoo
> >
> > [1] https://celeborn.apache.org/
> > [2] https://www.vldb.org/pvldb/vol13/p3382-shen.pdf
> > [3] https://uniffle.apache.org/
> > [4] https://mr3docs.datamonad.com/docs/mr3/features/celeborn/
> > [5] https://github.com/mr3project/mr3-release/releases/tag/v1.8
> > [6] https://mr3docs.datamonad.com/
> >
> 
> 
> 
> 
