+1 for merge.
Thanks, Uma, for driving this. Good luck!

-Ayush

> On 18-Feb-2022, at 9:38 PM, Rakesh Radhakrishnan <rake...@apache.org> wrote:
> 
> +1 for merging EC into master branch.
> 
> Thanks @Uma for putting this together. Kudos to everyone who contributed to
> this feature!
> 
> Best,
> Rakesh
> 
>> On Tue, Feb 15, 2022 at 1:48 PM Uma gangumalla <umamah...@apache.org> wrote:
>> 
>> Dear Ozone Devs,
>> 
>> As you may know, we have been actively developing Ozone Erasure Coding
>> support in a separate branch HDDS-3816-ec.
>> 
>> We have finished the development of EC key write and read functionality.
>> Support for offline recovery (recovering replicas after node loss) will be
>> part of the second phase of work.
>> 
>> Since the branch has grown considerably and merges are becoming increasingly
>> complicated, we would like to propose merging the current EC branch into
>> master.
>> 
>> We will file a new JIRA for the second phase of work and continue the
>> offline recovery work there.
>> 
>> Details on Changes:
>> 
>>   - Most of the EC core logic went into newly extended classes. Key changes
>>     went into the EC*OutputStream and EC*InputStream classes for write and
>>     read respectively. Based on the replication type, ECPipelineProvider
>>     will be chosen for creating EC pipelines.
>> 
>>   - Since we cannot represent EC replication with the existing replication
>>     factor, we have introduced ECReplicationConfig. The ReplicationConfig
>>     interface has already been pushed to master, so it is not a new idea
>>     coming through this branch merge. What is new here is the
>>     ECReplicationConfig class, which can be used to express an EC
>>     replication configuration.
>> 
>>   - We wanted to support enabling EC at the bucket level. To simplify some
>>     complications, we have moved the default replication configurations
>>     from client to server.
>> 
>>   - The client-side replication type and replication factor have been
>>     removed from the configuration files, and we have introduced
>>     ozone.server.default.replication and
>>     ozone.server.default.replication.type. We will continue to respect a
>>     value that is configured explicitly on the client side or passed
>>     through the APIs; otherwise, the server-side bucket-level properties or
>>     the server-side default configuration take effect.
>> 
>>   - Other than this change, the rest of the EC code should not impact any
>>     of the existing code flows.
>> 
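The fallback order described in the list above (explicit client setting, then bucket-level property, then server default) can be sketched roughly as follows. This is only an illustration of the precedence, not the Ozone implementation: the `Replication` class, the `resolve_replication` function, and the sample values are hypothetical names made up for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Replication:
    """Hypothetical stand-in for a replication setting, e.g. ("EC", "3:2")."""
    type: str
    value: str


def resolve_replication(
    client: Optional[Replication],
    bucket: Optional[Replication],
    server_default: Replication,
) -> Replication:
    """Pick the effective replication using the precedence from the mail."""
    # 1. An explicit client-side setting (config or API argument) wins.
    if client is not None:
        return client
    # 2. Otherwise use the bucket-level property, if the bucket has one.
    if bucket is not None:
        return bucket
    # 3. Otherwise fall back to the server-side default.
    return server_default


# Illustrative values only (assumptions, not taken from a real cluster).
server_default = Replication("RATIS", "THREE")
ec_bucket = Replication("EC", "3:2")

print(resolve_replication(None, ec_bucket, server_default))
print(resolve_replication(None, None, server_default))
```

With no client setting, a key written to the EC bucket picks up the bucket's 3:2 setting, and a key written to a bucket without its own setting picks up the server default.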
>> 
>> We have finished the documentation JIRA (HDDS-6172) covering this feature,
>> and we will continue to improve it in master.
>> 
>> JIRA: HDDS-3816
>> 
>> Completed tasks: ~ 90
>> 
>> We wanted to cover the following compatibility issue before the merge:
>> 
>> HDDS-6209: EC: [Forward compatibility issue] New client to older server
>> could fail due to the unavailability for client default replication config
>> 
>> A few other JIRAs under HDDS-3816 are still open, but I believe they are
>> not blockers for the merge.
>> 
>> In short, what you can do now with this feature:
>> 
>>   - You can enable EC at the bucket level and at the cluster level. How to
>>     enable it at the bucket level? Just create the bucket by passing the EC
>>     replication options.
>> 
>>   - You can create EC keys and read them back.
>> 
>>   - You should be able to continue writing even when chosen nodes are
>>     failing. (Of course, a minimum of Data+Parity live nodes must be
>>     available in the cluster to complete the write.)
>> 
>>   - You should be able to read the file back even if a few nodes have
>>     failed in the same EC block group. (The number of failures must not
>>     exceed the parity count.)
>> 
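The two availability conditions in the list above reduce to simple arithmetic on the data and parity counts. A minimal sketch, assuming the "data:parity" shorthand used in this mail (for example 3:2); `ECScheme` and its methods are hypothetical names for illustration and are not part of the Ozone API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ECScheme:
    """Hypothetical model of an EC layout such as 3:2 (data:parity)."""
    data: int
    parity: int

    @classmethod
    def parse(cls, text: str) -> "ECScheme":
        # Parse the "data:parity" shorthand used in this mail, e.g. "3:2".
        data, parity = (int(part) for part in text.split(":"))
        return cls(data, parity)

    def can_write(self, live_nodes: int) -> bool:
        # A write needs at least data + parity live nodes in the cluster.
        return live_nodes >= self.data + self.parity

    def can_read(self, failed_in_group: int) -> bool:
        # A block group stays readable while failures do not exceed parity.
        return failed_in_group <= self.parity


scheme = ECScheme.parse("3:2")
print(scheme.can_write(live_nodes=10))     # 10 >= 3 + 2
print(scheme.can_read(failed_in_group=2))  # 2 <= 2
print(scheme.can_read(failed_in_group=3))  # 3 > 2
```

Under these assumptions a 6:3 layout needs at least nine live nodes to complete a write and tolerates up to three failed nodes per block group on read.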
>> What is pending? Offline recovery of lost/missing EC containers. As
>> mentioned above, after this branch is merged, I will create a separate JIRA
>> to start the offline recovery work.
>> 
>> 
>> Automated acceptance test cases have already been added (HDDS-6231).
>> 
>> In addition, we have also performed basic acceptance testing on a
>> physical cluster:
>> 
>>   1. Installed a 10-node cluster and created an EC bucket (3:2). Uploaded
>>      a 10GB key. Downloaded the same key and checked the md5sum.
>> 
>>   2. Uploaded an 8GB key. Downloaded the same key and checked the md5sum.
>> 
>>   3. Uploaded a 3MB key. Downloaded the same key and verified the md5sum.
>> 
>>   4. Changed the bucket to (6:3). Uploaded an 8GB key. Downloaded the same
>>      key. Also verified that the new key uses the 6:3 policy and the old
>>      keys remain 3:2.
>> 
>>   5. Verified with several key writes and reads of different sizes.
>> 
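The upload/download/md5sum check in the steps above amounts to comparing digests of the local source file and the downloaded copy. A minimal sketch of that comparison, assuming both files are already on local disk; the upload and download steps themselves are left out, and `md5_of` and `round_trip_ok` are hypothetical helper names, not part of any Ozone tooling.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def round_trip_ok(uploaded: Path, downloaded: Path) -> bool:
    """True when the downloaded copy matches the file that was uploaded."""
    return md5_of(uploaded) == md5_of(downloaded)
```

Reading in chunks keeps memory use flat for multi-gigabyte keys such as the 8GB and 10GB ones mentioned above.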
>> 
>> Merge checklist items assessment is here:
>> 
>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>> 
>> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com> and Istvan Fajth
>> <pi...@cloudera.com> for their great efforts in core development, and thanks
>> a lot to Sammi, Mingchao Zhao, Mark Gui, and Kaijie for collaborating on some
>> of the EC tasks.
>> 
>> Thanks to Marton for the design discussions and for help with some dev
>> tasks as well.
>> 
>> Thanks to the many others who were involved in design discussions: Arpit,
>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth,
>> Rakesh, and Yiqun Lin.
>> Sorry if I have missed anyone here; your efforts are much appreciated.
>> Without your tremendous help, we would not have reached this point.
>> 
>> If there are no objections to the merge, I will start the official vote
>> later.
>> 
>> Regards,
>> 
>> EC Branch Devs
>> 
