+1 for merge. Thank you, Uma, for driving this.

Thanks,
Bharat

On Fri, Feb 18, 2022 at 6:25 AM Dinesh Chitlangia <dine...@apache.org> wrote:

> +1 to merge.
>
> Thanks,
> Dinesh
>
> On Fri, Feb 18, 2022 at 3:44 AM mingchao zhao <captain...@apache.org> wrote:
>
> > +1 for merge EC to master.
> >
> > Thanks Uma for pushing this forward.
> >
> > guimark <guim...@126.com> wrote on Fri, Feb 18, 2022 at 14:17:
> >
> > > +1
> > > Great work from Uma, Stephen, Istvan, Marton and all Ozone EC
> > > developers!
> > >
> > > We'll keep helping to fix EC bugs to make the EC feature more and
> > > more robust.
> > >
> > > We are also really looking forward to starting the detailed design
> > > and discussions about the Offline Recovery feature, which is very
> > > important for a production-ready EC implementation.
> > >
> > > Cheers!
> > >
> > > At 2022-02-15 16:17:55, "Uma gangumalla" <umamah...@apache.org> wrote:
> > > > Dear Ozone Devs,
> > > >
> > > > As you may know, we have been actively developing Ozone Erasure
> > > > Coding support in a separate branch, HDDS-3816-ec.
> > > >
> > > > We have finished the development of EC key write and read
> > > > functionality. Support for offline recovery (recovering replicas
> > > > after node loss) will be part of the second phase of work.
> > > >
> > > > Since the code has grown and we are increasingly seeing merge
> > > > complications, we would like to propose merging the current EC
> > > > branch into master.
> > > >
> > > > We will file a new JIRA for the second phase of work and continue
> > > > the offline recovery work there.
> > > >
> > > > Details on Changes:
> > > >
> > > > - Most of the EC core logic went into newly extended classes. Key
> > > >   changes went into the EC*OutputStream and EC*InputStream classes
> > > >   for write and read respectively. Based on the replication type,
> > > >   ECPipelineProvider will be chosen for creating EC pipelines.
> > > >
> > > > - Since we cannot represent EC replication in the existing
> > > >   replication factor, we have introduced ECReplicationConfig. The
> > > >   ReplicationConfig interface is already pushed to master, so it’s
> > > >   not a new idea coming through this branch merge now. What is
> > > >   new here is the ECReplicationConfig class, which can be used
> > > >   to express the EC replication configuration.
> > > >
> > > > - We wanted to provide support for enabling EC at bucket level. To
> > > >   simplify some complications, we have moved the default
> > > >   replication configurations from client to server.
> > > >
> > > > - The client-side replication type and replication factor were
> > > >   removed from the configuration files, and
> > > >   ozone.server.default.replication and
> > > >   ozone.server.default.replication.type were introduced. We
> > > >   continue to respect a replication config that is set explicitly
> > > >   on the client side or passed through APIs; otherwise the
> > > >   server-side bucket-level properties or the server-side default
> > > >   configuration take effect.
> > > >
> > > > - Other than this change, the rest of the EC-side code should not
> > > >   impact any of the existing code flows.
> > > >
> > > > We have finished the documentation JIRA (HDDS-6172) covering this
> > > > feature and we will continue to improve it further in master.
> > > >
> > > > JIRA: HDDS-3816
> > > >
> > > > Completed tasks: ~90
> > > >
> > > > We wanted to cover the following compatibility issue before the
> > > > merge:
> > > >
> > > > HDDS-6209: EC: [Forward compatibility issue] New client to older
> > > > server could fail due to the unavailability for client default
> > > > replication config
> > > >
> > > > A few other JIRAs in HDDS-3816 are still open, but I believe they
> > > > are not blockers for the merge.
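
A rough illustration of the precedence described in the configuration bullets of the proposal (an explicit client setting wins, then the bucket-level property, then the server-side default). This is only a sketch, not code from the HDDS-3816-ec branch: the function name, the plain-string representation of a replication config, and the default value shown are all hypothetical.

```python
from typing import Optional

# Hypothetical stand-in for whatever the server-side default
# (ozone.server.default.replication / .type) is set to.
# The value is illustrative only, not a statement of Ozone's actual default.
ASSUMED_SERVER_DEFAULT = "server-default"


def resolve_replication(
    client_explicit: Optional[str],
    bucket_level: Optional[str],
    server_default: str = ASSUMED_SERVER_DEFAULT,
) -> str:
    """Pick the replication config for a new key.

    Precedence, as described in the proposal:
      1. config set explicitly on the client side or passed through APIs
      2. server-side bucket-level property
      3. server-side default configuration
    """
    if client_explicit is not None:
        return client_explicit
    if bucket_level is not None:
        return bucket_level
    return server_default


# A key written into an EC bucket without any client-side setting
# picks up the bucket's policy.
print(resolve_replication(None, "EC 3:2"))
# An explicit client-side choice overrides the bucket policy.
print(resolve_replication("EC 6:3", "EC 3:2"))
```

Moving the fallback to the server is what lets a bucket-level EC policy take effect for clients that do not configure anything themselves.
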
> > > >
> > > > In short, what you can do now with this feature:
> > > >
> > > > - You can enable EC at bucket level and cluster level.
> > > >   How to enable it at bucket level? Just create the bucket by
> > > >   passing the EC replication options.
> > > >
> > > > - You can create EC keys and read the same back.
> > > >
> > > > - You should be able to continue writing even when chosen nodes
> > > >   are failing. (Of course, a minimum of Data+Parity live nodes
> > > >   should be available in the cluster to complete the write.)
> > > >
> > > > - You should be able to read the file back even if a few nodes
> > > >   failed in the same EC block group. (Failures should not exceed
> > > >   the parity number of nodes.)
> > > >
> > > > What is pending? Offline recovery of lost/missing EC containers.
> > > > As mentioned above, after the merge of this branch, I will create
> > > > a separate JIRA for starting the work on Offline Recovery.
> > > >
> > > > Automated acceptance test cases have already been added
> > > > (HDDS-6231).
> > > >
> > > > In addition to that, we have also performed basic acceptance
> > > > testing in a physical cluster:
> > > >
> > > > 1. Installed a 10-node cluster and created an EC bucket (3:2).
> > > >    Uploaded a 10GB key.
> > > >    Downloaded the same key and checked the md5sum.
> > > >
> > > > 2. Uploaded an 8GB key.
> > > >    Downloaded the same key and checked the md5sum.
> > > >
> > > > 3. Uploaded a 3MB key.
> > > >    Downloaded the same and verified the md5sum.
> > > >
> > > > 4. Changed the bucket to (6:3).
> > > >    Uploaded an 8GB key.
> > > >    Downloaded the same.
> > > >    Also verified that the new key is in the 6:3 policy and the
> > > >    old keys remain 3:2.
> > > >
> > > > 5. Verified with several different size key writes and reads.
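
For readers less familiar with the data:parity notation, here is a small sketch of the two limits stated in the capability list, applied to the 3:2 and 6:3 policies from the test run. It only restates the arithmetic in the message; the class and method names are made up for illustration and are not Ozone APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ECScheme:
    """Illustrative data:parity pair, e.g. ECScheme(3, 2) for "3:2"."""

    data: int
    parity: int

    @property
    def min_live_nodes_for_write(self) -> int:
        # The message states a write needs at least Data+Parity live nodes.
        return self.data + self.parity

    def readable_after(self, failed_nodes: int) -> bool:
        # The message states a read survives as long as failures within one
        # block group do not exceed the parity count.
        return 0 <= failed_nodes <= self.parity


for scheme in (ECScheme(3, 2), ECScheme(6, 3)):
    print(
        f"{scheme.data}:{scheme.parity}",
        "needs", scheme.min_live_nodes_for_write, "live nodes to write,",
        "tolerates", scheme.parity, "failed nodes per block group on read",
    )
```

By this arithmetic 3:2 needs 5 live nodes and 6:3 needs 9, so the 10-node test cluster can host either policy.
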
> > > >
> > > > Merge checklist items assessment is here:
> > > >
> > > > https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> > > >
> > > > Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com>, Istvan
> > > > Fajth <pi...@cloudera.com> for great efforts in core development
> > > > and also thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie
> > > > for collaborating on some of the EC tasks.
> > > >
> > > > Thanks to Marton for design discussion and on some dev tasks as
> > > > well.
> > > >
> > > > Thanks to many others who were involved in design discussions,
> > > > Arpit, Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda,
> > > > Shashi, Prashanth, Rakesh, Yiqun Lin.
> > > > Sorry if I miss anyone here, but your efforts are much
> > > > appreciated. Without your tremendous help, we would have not
> > > > reached this position yet.
> > > >
> > > > If there are no objections for the merge, I will start the
> > > > official vote later.
> > > >
> > > > Regards,
> > > >
> > > > EC Branch Devs