Sounds good, though expect no commitment from me to review anything.
My main concerns are about dependency libraries (what are they?) and
testing.

On Tue, 11 Feb 2025 at 05:10, Xiaoqiao He <hexiaoq...@apache.org> wrote:

> Thanks, Jinglun, for your work. Basically +1 from me to bring it into the
> Hadoop codebase.
> a. After a quick review of the JIRA and PR, I think it is solid, including
> documentation and code style.
> b. The contributors involved are diverse, coming from different projects
> and companies, and are active enough.
> c. I have communicated with Jinglun offline many times, and IMO he could
> be responsible for reviewing and testing this module.
> Besides that, I just suggest following the Hadoop guidelines [1] when
> developing the new features.
>
> @Steve Loughran <ste...@cloudera.com> @Shilun Fan <slfan1...@foxmail.com>
> left some comments, including some concerns, in JIRA; would you mind
> giving more suggestions for this discussion?
> Thanks.
>
> Best Regards,
> - He Xiaoqiao
>
> [1] https://hadoop.apache.org/bylaws.html
>
>
> On Sun, Jan 26, 2025 at 3:39 PM jinglun <jinglun...@qq.com.invalid> wrote:
>
>> Hello everyone, I'd like to discuss the integration of Volcano Engine TOS
>> into Hadoop.
>>
>>
>> Volcano Engine is a fast-growing cloud vendor launched by ByteDance, and
>> TOS is the object storage service of Volcano Engine. A common pattern is
>> to store data in TOS and run Hadoop/Spark/Flink applications that access
>> it. But there is no native support for TOS in Hadoop, so it is not easy
>> for users to build their big data systems on TOS.
>>
>> My proposal is to integrate TOS with Hadoop to help users run their
>> applications on TOS. Users only need some simple configuration, and then
>> their applications can read and write TOS without any code changes. This
>> work is similar to the existing support for AWS S3, AzureBlob, AliyunOSS,
>> Tencent COS and HuaweiCloud Object Storage in Hadoop.
>>
>>
>> More details can be found at
>> https://issues.apache.org/jira/browse/HADOOP-19236.
>>
>>
>> 1. What is the progress of the work now?
>> The work is currently complete on branch HADOOP_19236. It was developed
>> by the EMR team of Volcano Engine and has served many users, both cloud
>> and IDC, for more than 2 years.
>>
>>
>> 2. How are long-term maintenance and testing guaranteed?
>> The contributors are open-source friendly, including ZhengHu (PMC member
>> of HBase and Iceberg), Jinglun (Committer of Hadoop), SunXin (Committer
>> of HBase), XianyinXin (Contributor of Spark), Rascal Wu (Contributor of
>> Flink), FangBo (Contributor of Hive) and Yuanzhihuan. We will all be
>> involved in the long-term maintenance of this work. As time goes by,
>> more people from the EMR team and the hadoop-tos users may join this work.
>> So I'm confident in the long-term maintenance and testing.
>>
>>
>> 3. Why should hadoop-tos be integrated into the Hadoop codebase? Should
>> we use an independent project instead?
>> Integration is for a better user experience. First, users don't need to
>> go to another repo to find the TOS support. Second, users don't need to
>> worry about the version mapping between Hadoop and hadoop-tos. Finally, a
>> connector provided by the Hadoop community is more reliable and
>> trustworthy.
>>
>>
>> If you have any questions or concerns, or if anything else is unclear,
>> please let me know. Sincerely looking forward to your reply, thanks
>> very much.
>
>
