[ 
https://issues.apache.org/jira/browse/HADOOP-19236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17899090#comment-17899090
 ] 

Jinglun commented on HADOOP-19236:
----------------------------------

Thanks to [~openinx], [~Ddupg], [~xinxianyin], [~stayrascal], [~fangbo_worker] 
and @yuanzhihuan for your great work. They are the contributors to the 
hadoop-tos module. I have reviewed all the PRs and merged them into branch 
HADOOP-19236.

 

Here is a brief introduction to the hadoop-tos integration work.

 

*Hadoop-tos module*

A new hadoop-tos module is added to hadoop-cloud-storage-project, with two 
sub-modules: hadoop-tos-core and hadoop-tos-shade. hadoop-tos-shade shades all 
TOS SDK dependencies to avoid potential conflicts. hadoop-tos-core contains 
the TOS filesystem implementation.

The final output is a bundle jar under hadoop-tos-core named 
hadoop-tos-core-\{version}.jar, with the TOS SDK dependencies packaged into 
it. Put the jar under $HADOOP_HOME/share/hadoop/hdfs and Hadoop can then 
access TOS. See the documentation in hadoop-tos for more details.

 

*Dependencies*

Hadoop-tos introduces a new dependency, `com.volcengine:ve-tos-java-sdk:2.8.6`. 
It is an open source project under the Apache 2.0 license 
(https://github.com/volcengine/ve-tos-java-sdk/blob/main/LICENSE).

 

Here are the transitive dependencies of `com.volcengine:ve-tos-java-sdk:2.8.6`. 
They (okhttp, okio, kotlin, jackson) are open source under the Apache 2.0 
license too. 
```
[INFO] org.apache.hadoop:hadoop-tos-shade:jar:3.5.0-SNAPSHOT
[INFO] \- com.volcengine:ve-tos-java-sdk:jar:2.8.6:compile
[INFO]    +- com.squareup.okhttp3:okhttp:jar:4.10.0:compile
[INFO]    |  +- com.squareup.okio:okio-jvm:jar:3.0.0:compile
[INFO]    |  |  +- org.jetbrains.kotlin:kotlin-stdlib-jdk8:jar:1.6.20:test
[INFO]    |  |  |  \- org.jetbrains.kotlin:kotlin-stdlib-jdk7:jar:1.6.20:test
[INFO]    |  |  \- org.jetbrains.kotlin:kotlin-stdlib-common:jar:1.6.20:compile
[INFO]    |  \- org.jetbrains.kotlin:kotlin-stdlib:jar:1.6.20:compile
[INFO]    |     \- org.jetbrains:annotations:jar:13.0:compile
[INFO]    +- com.fasterxml.jackson.core:jackson-annotations:jar:2.12.7:compile
[INFO]    +- com.fasterxml.jackson.core:jackson-databind:jar:2.12.7.1:compile
[INFO]    |  \- com.fasterxml.jackson.core:jackson-core:jar:2.12.7:compile
[INFO]    \- org.slf4j:slf4j-api:jar:1.7.36:compile
```
 

All the dependencies (excluding slf4j) are shaded to avoid potential conflicts.
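
To illustrate what shading means for class names, here is a minimal sketch of a relocation rule: classes from the bundled libraries are moved under a private package prefix, while slf4j is left untouched. The prefix `org.example.shaded.` and the list of package roots are assumptions made for illustration; the actual relocation rules are whatever the hadoop-tos-shade pom defines.

```java
import java.util.Arrays;
import java.util.List;

final class ShadeRelocationSketch {
    // Hypothetical prefix, for illustration only; not taken from the hadoop-tos-shade pom.
    static final String SHADED_PREFIX = "org.example.shaded.";

    // Assumed package roots of okhttp, okio, kotlin and jackson.
    static final List<String> RELOCATED_ROOTS =
        Arrays.asList("okhttp3.", "okio.", "kotlin.", "com.fasterxml.jackson.");

    // Returns the class name as it would appear after relocation.
    static String relocate(String className) {
        for (String root : RELOCATED_ROOTS) {
            if (className.startsWith(root)) {
                return SHADED_PREFIX + className;
            }
        }
        // Anything else, such as org.slf4j.*, keeps its original name.
        return className;
    }
}
```

With these assumed rules, `relocate("okhttp3.OkHttpClient")` gains the private prefix, so it cannot clash with another okhttp version on the classpath, while `relocate("org.slf4j.Logger")` is returned unchanged.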

 

*How to run unit tests*

To run the hadoop-tos unit tests, you need a server that can connect to TOS. 
See the documentation in hadoop-tos for more details. I can provide a test 
environment; please let me know if you need to test hadoop-tos 
([email protected]). 

 

*Documents*

The documentation is in the hadoop-tos-core module, at 
`src/site/markdown/cloudstorage/index.md`.

 

*Future work*
 # FileSystem#createBulkDelete is a useful interface; it would be nice to 
implement it.
 # Consider adding the jars from hadoop-cloud-storage-project to hadoop-dist. 
Currently they are not included in the final tar file.
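
For the first item, a rough sketch of the paging step a bulk delete typically needs, assuming the store accepts only a bounded number of keys per request. The class name, method name and page size are hypothetical, and the code deliberately uses no Hadoop or TOS SDK calls; it only shows how a long list of keys would be split into request-sized batches.

```java
import java.util.ArrayList;
import java.util.List;

final class BulkDeletePagingSketch {
    // Splits keys into consecutive pages of at most pageSize entries each.
    static <T> List<List<T>> pages(List<T> keys, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += pageSize) {
            int end = Math.min(i + pageSize, keys.size());
            // Copy the sub-list so each page is independent of the input list.
            out.add(new ArrayList<>(keys.subList(i, end)));
        }
        return out;
    }
}
```

Each returned page could then be passed to one delete request; five keys with a page size of two yield three pages, the last holding a single key.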

 

 

I think it is ready for public review now. Hi [[email protected]] 
[~hexiaoqiao] [~leosun], could you kindly take a look? Thanks very much!

> Integration of Volcano Engine TOS in Hadoop.
> --------------------------------------------
>
>                 Key: HADOOP-19236
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19236
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs, tools
>    Affects Versions: 3.4.0
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Major
>         Attachments: Integration of Volcano Engine TOS in Hadoop.pdf
>
>
> Volcano Engine is a fast-growing cloud vendor launched by ByteDance, and TOS 
> is the object storage service of Volcano Engine. A common pattern is to store 
> data in TOS and run Hadoop/Spark/Flink applications that access it. But 
> there is no native support for TOS in Hadoop, so it is not easy for users 
> to build their big data systems on TOS.
>  
> This work aims to integrate TOS with Hadoop to help users run their 
> applications on TOS. Users only need some simple configuration, and then 
> their applications can read and write TOS without any code change. This work 
> is similar to the AWS S3, AzureBlob, AliyunOSS, Tencent COS and HuaweiCloud 
> Object Storage integrations in Hadoop.
>  
>  Please see the attached document "Integration of Volcano Engine TOS in 
> Hadoop" for more details.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
