Also forwarding to hdfs-dev@. I misspelled the address. --Siyao

---------- Forwarded message ---------
From: Siyao Meng <sm...@cloudera.com>
Date: Thu, Jun 4, 2020 at 9:14 AM
Subject: [Ozone] [VOTE] Merge HDDS-2665 branch (OFS) to master
To: <ozone-...@hadoop.apache.org>, <hadoop-...@hadoop.apache.org>


Hi Ozone developers,

I'd like to propose merging feature branch HDDS-2665 into the master branch
for the new Ozone Filesystem scheme *ofs://*. The new scheme is intended to
improve the Ozone user experience. OFS is a client-side FileSystem
implementation. It can co-exist with o3fs, and the two can be used
interchangeably on the same client if needed.

The OFS scheme in a nutshell:

  ofs://<Host name[:Port] or OM Service ID>/[<volumeName>/<bucketName>/path/to/key]

And here is a short list of valid OFS URIs -- this should cover all expected
daily usage:

  ofs://om1/
  ofs://om3:9862/
  ofs://omservice/
  ofs://omservice/volume1/
  ofs://omservice/volume1/bucket1/
  ofs://omservice/volume1/bucket1/dir1
  ofs://omservice/volume1/bucket1/dir1/key1
  ofs://omservice/tmp/
  ofs://omservice/tmp/key1

Located at the root of an OFS filesystem are volumes and mount(s). Inside a
volume lie all of its buckets. Inside buckets are keys and directories. For
mounts, only the temp mount */tmp/* is supported at the moment -- more on
this later.

So naturally, OFS *doesn't allow creating keys (files) directly under the
root or under volumes*. Users will receive an error message if they try to
do so:

  $ ozone fs -mkdir /volume1
  2020-06-04 00:00:00,000 [main] INFO rpc.RpcClient: Creating Volume:
  volume1, with hadoop as owner.

  $ ozone fs -touch /volume1/key1
  touch: Cannot create file under root or volume.

A short note: `ozone fs`, `hadoop fs` and `hdfs dfs` can be used
interchangeably, as long as the jars and the client config for OFS are in
place.

1. With OFS, fs.defaultFS (in core-site.xml) no longer needs to have a
specific volume and bucket in its path like o3fs did. Simply put the OM host
or service ID:

  <property>
    <name>fs.defaultFS</name>
    <value>ofs://omservice</value>
  </property>

Then the client should be able to access every volume and bucket in that
cluster without specifying the host name or service ID:

  ozone fs -mkdir -p /volume1/bucket1

The same works programmatically through the standard Hadoop FileSystem API;
see the sketch below.
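For illustration, here is a minimal Java sketch of the equivalent access
through the Hadoop FileSystem API. The class name is hypothetical, and it
assumes the OFS client jars and config are on the classpath, with omservice
being the example service ID from above:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class OfsQuickStart {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      // Same value as fs.defaultFS in core-site.xml above.
      conf.set("fs.defaultFS", "ofs://omservice");

      // Returns the FileSystem implementation backing fs.defaultFS.
      FileSystem fs = FileSystem.get(conf);

      // Volume and bucket are simply the first two path components;
      // like `-mkdir -p`, mkdirs() creates them if they don't exist.
      fs.mkdirs(new Path("/volume1/bucket1"));
      fs.close();
    }
  }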
2. Admins can create and delete volumes and buckets easily with the Hadoop
FS shell. Volumes and buckets are treated similarly to directories, so with
*-p* they will be created if they don't exist:

  ozone fs -mkdir -p ofs://omservice/volume1/bucket1/dir1/

Note that the supported volume and bucket name character set rule still
applies. For example, volume and bucket names may not contain an
*underscore* (_):

  $ ozone fs -mkdir -p /volume_1
  mkdir: Bucket or Volume name has an unsupported character : _

3. To be compatible with legacy Hadoop applications that use /tmp/, we have
a special temp mount located at the root of the FS. In order to use it, an
admin first needs to create the volume *tmp* (the volume name is hardcoded
at the moment) and set its ACL to world ALL access. This only needs to be
done *once per cluster*:

  $ ozone sh volume create tmp
  $ ozone sh volume setacl tmp -al world::a

Then *each user* needs to mkdir once to initialize their own temp bucket.
After that, they can write to it just like they would to a regular
directory (a programmatic sketch follows this example):

  $ ozone fs -mkdir /tmp
  2020-06-04 00:00:00,050 [main] INFO rpc.RpcClient: Creating Bucket:
  tmp/0238775c7bd96e2eab98038afe0c4279, with Versioning false and Storage
  Type set to DISK and Encryption set to false

  $ ozone fs -touch /tmp/key1
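For illustration, a minimal Java sketch of the same temp mount usage via the
Hadoop FileSystem API. The class name and the written content are made up,
and it assumes fs.defaultFS points at the cluster as in point 1:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class OfsTmpWrite {
    public static void main(String[] args) throws Exception {
      FileSystem fs = FileSystem.get(new Configuration());

      // One-time step per user: initialize this user's temp bucket.
      fs.mkdirs(new Path("/tmp"));

      // After that, /tmp behaves like a regular directory.
      try (FSDataOutputStream out = fs.create(new Path("/tmp/key1"))) {
        out.writeBytes("hello from ofs\n");
      }
      fs.close();
    }
  }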
4. When keys are deleted to trash, they are moved to a trash directory under
each *bucket*, because keys can't be moved (renamed) between buckets:

  $ ozone fs -rm /volume1/bucket1/key1
  2020-06-04 00:00:00,100 [main] INFO fs.TrashPolicyDefault: Moved:
  'ofs://id1/volume1/bucket1/key1' to trash at:
  ofs://id1/volume1/bucket1/.Trash/hadoop/Current/volume1/bucket1/key1

This is similar to how HDFS encryption zones handle trash locations.

5. OFS supports recursive volume, bucket and key listing, i.e. `ozone fs -ls
-R ofs://omservice/` will recursively list all volumes, buckets and keys the
user has LIST permission to (if ACL is enabled). Note this shouldn't degrade
server performance, as the logic is entirely client-side: the client simply
issues multiple requests to the server to gather the information.

So far the OFS feature set is complete; see the sub-tasks of HDDS-2665. The
OFS feature branch was rebased one day ago to include the latest Ozone
master branch commits, and it passes the existing checks in my latest rebase
PR. FileSystem contract tests and basic integration tests are also in place.

I ran the basic shell commands shown in the examples above; they work fine.
I ran WordCount in the docker-compose environment (without YARN); it
succeeded, and I manually confirmed the correctness of the result. I also
ran the TeraSort suite (only 1000 rows, in docker-compose); the result looks
fine. We have also tested compatibility with MapReduce and Hive.

I think it is time to merge HDDS-2665 into master. We can continue future
work (refactoring, improvements, performance analysis) from there.

I have submitted a PR for easier review of all the OFS code changes here:
https://github.com/apache/hadoop-ozone/pull/1021

Please vote on this thread. The vote will run for 7 days, through Thursday,
June 11 11:59 PM GMT.

Thanks,
Siyao Meng