+1 for merge with the compatibility issue addressed.

Thanks Uma for putting this together.

- Hanisha

> On Feb 15, 2022, at 12:17 AM, Uma gangumalla <umamah...@apache.org> wrote:
> 
> Dear Ozone Devs,
> 
> As you may know, we have been actively developing Ozone Erasure Coding
> support in a separate branch HDDS-3816-ec.
> 
> We have finished the development of EC key write and read functionality.
> Support for offline recovery (recovering replicas after node loss) will be
> part of the second phase of work.
> 
> Since the branch has grown and merges are becoming increasingly
> complicated, we would like to propose merging the current EC branch into
> master.
> 
> We will file a new JIRA for the second phase of work and continue the
> offline recovery work there.
> 
> Details on Changes:
> 
>   - Most of the EC core logic went into newly extended classes. Key changes
>     went into the EC*OutputStream and EC*InputStream classes for write and
>     read respectively. Based on the replication type, ECPipelineProvider is
>     chosen for creating EC pipelines.
> 
> 
> 
>   - Since we cannot represent EC replication with the existing replication
>     factor, we have introduced ECReplicationConfig. The ReplicationConfig
>     interface is already in master, so it is not a new idea arriving with
>     this branch merge. What is new here is the ECReplicationConfig class,
>     which can be used to express an EC replication configuration.
> 
> 
> 
>   - We wanted to support enabling EC at the bucket level. To simplify some
>     complications, we have moved the default replication configuration
>     from the client to the server.
> 
> 
> 
>   - The client-side replication type and replication factor have been
>     removed from the configuration files, and
>     ozone.server.default.replication and
>     ozone.server.default.replication.type have been introduced. We continue
>     to respect a replication setting that is configured explicitly on the
>     client side or passed through the APIs; otherwise the server-side
>     bucket-level properties or the server-side default configuration take
>     effect.
> 
> 
> 
>   - Other than this change, the rest of the EC code should not impact any
>     of the existing code flows.
> 
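The fallback order described in the bullets above (explicit client setting, then bucket-level property, then server default) can be sketched as follows. This is only an illustration of the precedence as stated in the proposal, not Ozone's actual code: `Replication` and `resolve_replication` are hypothetical names, and the RATIS/THREE default is an assumed example value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Replication:
    """Hypothetical stand-in for a replication setting (type plus config string)."""
    type: str
    config: str


def resolve_replication(
    client: Optional[Replication],
    bucket: Optional[Replication],
    server_default: Replication,
) -> Replication:
    """Return the effective replication: client first, then bucket, then server default."""
    if client is not None:
        # An explicit client-side setting (config or API argument) always wins.
        return client
    if bucket is not None:
        # Otherwise the bucket-level property applies.
        return bucket
    # Finally fall back to the server-side default
    # (ozone.server.default.replication / ozone.server.default.replication.type).
    return server_default


# Assumed example values, for illustration only.
server_default = Replication("RATIS", "THREE")
ec_bucket = Replication("EC", "3:2")

print(resolve_replication(None, ec_bucket, server_default))
print(resolve_replication(None, None, server_default))
```

With no client setting, a key written to the EC bucket picks up the bucket's configuration; with neither a client setting nor a bucket property, the server default applies.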
> 
> We have finished the documentation JIRA (HDDS-6172) covering this feature,
> and we will continue to improve it in master.
> 
> JIRA: HDDS-3816
> 
> Completed tasks: ~ 90
> 
> We wanted to cover the following compatibility issue before the merge:
> 
> HDDS-6209: EC: [Forward compatibility issue] New client to older server
> could fail due to the unavailability for client default replication config
> 
> A few other JIRAs under HDDS-3816 are still open, but I believe they are
> not blockers for the merge.
> 
> In short, here is what you can do now with this feature:
> 
>   - You can enable EC at the bucket level and at the cluster level. To
>     enable it at the bucket level, create the bucket with the EC
>     replication options.
>   - You can create EC keys and read them back.
>   - You should be able to continue writing even when chosen nodes are
>     failing. (Of course, at least Data+Parity live nodes must be available
>     in the cluster to complete the write.)
>   - You should be able to read the file back even if a few nodes in the
>     same EC block group have failed. (The number of failed nodes must not
>     exceed the parity count.)
> 
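The two availability conditions in the list above reduce to simple arithmetic on the data and parity counts. A minimal sketch, assuming only what the proposal states (a write needs at least Data+Parity live nodes; a read tolerates at most Parity failed nodes in a block group); the function names are hypothetical and not part of Ozone:

```python
def can_complete_write(live_nodes: int, data: int, parity: int) -> bool:
    """A write needs at least data + parity live nodes in the cluster."""
    return live_nodes >= data + parity


def can_read_block_group(failed_nodes: int, parity: int) -> bool:
    """A block group stays readable while failures do not exceed the parity count."""
    return failed_nodes <= parity


def spare_nodes_for_write(cluster_size: int, data: int, parity: int) -> int:
    """How many nodes the cluster can lose before writes can no longer complete."""
    return max(cluster_size - (data + parity), 0)


# Example: a 3:2 bucket on a 10-node cluster.
print(can_complete_write(live_nodes=10, data=3, parity=2))
print(can_read_block_group(failed_nodes=2, parity=2))
print(can_read_block_group(failed_nodes=3, parity=2))
print(spare_nodes_for_write(cluster_size=10, data=3, parity=2))
```

For the 3:2 layout, writes need five live nodes, and a block group remains readable with up to two of its nodes down.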
> What is pending? Offline recovery of lost/missing EC containers. As
> mentioned above, after this branch is merged, I will create a separate JIRA
> to start the offline recovery work.
> 
> 
> Automated acceptance test cases have already been added (HDDS-6231).
> 
> In addition, we have also performed basic acceptance testing on a
> physical cluster:
> 
>   1. Installed a 10-node cluster and created an EC bucket (3:2).
>      Uploaded a 10GB key.
>      Downloaded the same key and checked the md5sum.
>   2. Uploaded an 8GB key.
>      Downloaded the same key and checked the md5sum.
>   3. Uploaded a 3MB key.
>      Downloaded the same key and verified the md5sum.
>   4. Changed the bucket to (6:3).
>      Uploaded an 8GB key.
>      Downloaded the same key.
>      Also verified that the new key uses the 6:3 policy and the old keys
>      remain 3:2.
>   5. Verified writes and reads with keys of several different sizes.
> 
> 
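The upload/download checks above compare md5 checksums of the original and the downloaded file. A minimal sketch of such a comparison, assuming both files are available on local disk; the helper names are hypothetical, and this is not the script that was used in the testing described above:

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Compute the md5 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        # Chunked reads keep memory use flat even for multi-gigabyte keys.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(uploaded: Path, downloaded: Path) -> bool:
    """True when the downloaded file has the same md5 as the uploaded one."""
    return md5_of_file(uploaded) == md5_of_file(downloaded)
```

Calling `same_content(Path("key.orig"), Path("key.downloaded"))` (hypothetical file names) mirrors the "download and check the md5sum" step.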
> Merge checklist items assessment is here:
> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> 
> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com> and Istvan Fajth
> <pi...@cloudera.com> for their great efforts in core development, and
> thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie for collaborating on
> some of the EC tasks.
> 
> Thanks to Marton for the design discussions and for help with some dev
> tasks as well.
> 
> Thanks to many others who were involved in design discussions, Arpit, Sidd,
> Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth, Rakesh,
> Yiqun Lin.
> Sorry if I missed anyone here, but your efforts are much appreciated.
> Without your tremendous help, we would not have reached this point yet.
> 
> If there are no objections to the merge, I will start the official vote
> later.
> 
> Regards,
> 
> EC Branch Devs


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
For additional commands, e-mail: dev-h...@ozone.apache.org