+1 for the EC branch merge.

Best,
Sid

On Wed, Apr 6, 2022 at 8:05 PM guimark <guim...@126.com> wrote:

> Great news!
> +1 to merge.
>
> At 2022-04-06 22:18:31, "Stephen O'Donnell"
> <sodonn...@cloudera.com.INVALID> wrote:
> >I have been working on the code on this branch for some time, and I
> >believe it is in a good state to merge now. It is mostly new code, and if
> >nothing attempts to use EC, none of the EC code paths will be executed.
> >
> >+1 to merge from me.
> >
> >Stephen.
> >
> >On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <umamah...@apache.org>
> >wrote:
> >
> >> =====Few Edits Below===================
> >>
> >> Dear Ozone Devs,
> >>
> >> As you may know, we have been actively developing Ozone Erasure Coding
> >> support in a separate branch HDDS-3816-ec.
> >>
> >> We have finished the development of EC key write and read functionality.
> >> The support of offline recovery (recovering replica from node loss) will
> >> be part of second phase work.
> >>
> >> Since the code has already grown and increasingly started seeing merge
> >> complications, we would like to merge the current EC branch into master.
> >>
> >> We filed the new JIRA (HDDS-6462) for the second phase of work and
> >> continued the offline recovery work there. (We have uploaded the design
> >> doc there.)
> >>
> >> Details on Changes:
> >>
> >>    - Most of the EC core logic went to newly extended classes. Key
> >>      changes went into EC*OutputStream and EC*InputStream classes for
> >>      write and read respectively. Based on replication type,
> >>      ECPipelineProvider will be chosen for creating EC pipelines.
> >>
> >>    - Since we cannot represent the EC replication in the existing
> >>      replication factor, we have introduced ECReplicationConfig. The
> >>      ReplicationConfig interface is already pushed to master, so it’s
> >>      not a new idea coming through this branch merge now.
> >>      What is newly coming here is the ECReplicationConfig class, which
> >>      can be used to express EC replication configuration.
> >>
> >>    - We wanted to provide the support to enable EC at bucket level. To
> >>      simplify some complications, we have moved the default replication
> >>      configurations from client to server.
> >>
> >>    - Client-side replication type and replication factor were removed
> >>      from the configuration files, and ozone.server.default.replication
> >>      and ozone.server.default.replication.type were introduced. We
> >>      would continue to respect a value configured explicitly at the
> >>      client side or passed through APIs; otherwise server-side
> >>      bucket-level properties or the server-side default configuration
> >>      would take effect.
> >>
> >>    - Other than this change, the rest of the EC-side code should not
> >>      impact any of the existing code flows.
> >>
> >> We have finished the documentation JIRA (HDDS-6172) covering this
> >> feature, and we will continue to improve it further in master.
> >>
> >> Git Branch Name: HDDS-3816-ec
> >>
> >> JIRAs: HDDS-3816 and HDDS-5351
> >>
> >> Completed tasks: ~ 142
> >>
> >> + We are covering the following two mandatory JIRAs to come in:
> >>
> >> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
> >>    server could fail due to the unavailability for client default
> >>    replication config
> >>
> >> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> >>
> >> PR reviews are in progress and expected to close in a day or two.
> >>
> >> A few other JIRAs in HDDS-3816 are still open, but I believe they're
> >> not blockers for the merge.
> >>
> >> In short, what you can do now with this feature:
> >>
> >>    - You can enable EC at bucket level and cluster level.
> >>
> >>      How to enable it at bucket level? Just create the bucket by
> >>      passing the EC replication options.
> >>
> >>    - You can create EC keys and read the same back.
> >>    - You should be able to continue writing even when chosen nodes are
> >>      failing. (Of course, a minimum of Data+Parity live nodes should be
> >>      available in the cluster to complete the write.)
> >>
> >>    - You should be able to read the file back even if a few nodes
> >>      failed in the same EC block group. (Failures should not be more
> >>      than the parity number of nodes.)
> >>
> >> What is pending? Offline recovery of lost/missing EC containers. As
> >> mentioned above, post merge of this branch, I will create a separate
> >> JIRA for starting the work for OfflineRecovery.
> >>
> >> Automated acceptance test cases have already been added (HDDS-6231).
> >>
> >> In addition to that, we have also performed basic acceptance testing in
> >> a physical cluster:
> >>
> >>    1. Installed a 10-node cluster and created an EC bucket (3:2).
> >>       Uploaded a 10GB key.
> >>       Downloaded the same key and checked the md5sum.
> >>
> >>    2. Uploaded an 8GB key.
> >>       Downloaded the same key and checked the md5sum.
> >>
> >>    3. Uploaded a 3MB key.
> >>       Downloaded the same and verified the md5sum.
> >>
> >>    4. Changed the bucket to (6:3).
> >>       Uploaded an 8GB key.
> >>       Downloaded the same.
> >>
> >> Also verified that the new key is in the 6:3 policy and the old keys
> >> remain 3:2. Verified with key writes and reads of several different
> >> sizes.
> >>
> >> Since the merge discussion thread, we have further stabilized the code
> >> and fixed several bugs.
> >>
> >> Merge checklist items assessment is here:
> >>
> >> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> >>
> >> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com>, Istvan
> >> Fajth <pi...@cloudera.com> for great efforts in core development and
> >> also thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for
> >> collaborating on some of the EC tasks.
> >>
> >> Thanks to Marton for design discussion and on some dev tasks as well.
> >>
> >> Thanks to many others who were involved in design discussions: Arpit,
> >> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi,
> >> Prashanth, Rakesh, Yiqun Lin.
> >> Sorry if I miss anyone here, but your efforts are much appreciated.
> >> Without your tremendous help, we would not have reached this position
> >> yet.
> >>
> >> To start with, here is my +1.
> >>
> >> The vote will run for 5 days.
> >>
> >> Regards,
> >> Uma
> >>
> >> On Tue, Apr 5, 2022 at 10:58 PM Uma gangumalla <umamah...@apache.org>
> >> wrote:
> >>
> >> > Dear Ozone Devs,
> >> >
> >> > As you may know, we have been actively developing Ozone Erasure
> >> > Coding support in a separate branch HDDS-3816-ec.
> >> >
> >> > We have finished the development of EC key write and read
> >> > functionality. The support of offline recovery (recovering replica
> >> > from node loss) will be part of second phase work.
> >> >
> >> > Since the code has already grown and increasingly started seeing
> >> > merge complications, we would like to propose to merge the current
> >> > EC branch into master.
> >> >
> >> > We filed the new JIRA (HDDS-6462) for the second phase of work and
> >> > continued the offline recovery work there.
> >> >
> >> > Details on Changes:
> >> >
> >> >    - Most of the EC core logic went to newly extended classes. Key
> >> >      changes went into EC*OutputStream and EC*InputStream classes
> >> >      for write and read respectively. Based on replication type,
> >> >      ECPipelineProvider will be chosen for creating EC pipelines.
> >> >
> >> >    - Since we cannot represent the EC replication in the existing
> >> >      replication factor, we have introduced ECReplicationConfig. The
> >> >      ReplicationConfig interface is already pushed to master, so
> >> >      it’s not a new idea coming through this branch merge now.
> >> >      What is newly coming here is the ECReplicationConfig class,
> >> >      which can be used to express EC replication configuration.
> >> >
> >> >    - We wanted to provide the support to enable EC at bucket level.
> >> >      To simplify some complications, we have moved the default
> >> >      replication configurations from client to server.
> >> >
> >> >    - Client-side replication type and replication factor were
> >> >      removed from the configuration files, and
> >> >      ozone.server.default.replication and
> >> >      ozone.server.default.replication.type were introduced. We would
> >> >      continue to respect a value configured explicitly at the client
> >> >      side or passed through APIs; otherwise server-side bucket-level
> >> >      properties or the server-side default configuration would take
> >> >      effect.
> >> >
> >> >    - Other than this change, the rest of the EC-side code should not
> >> >      impact any of the existing code flows.
> >> >
> >> > We have finished the documentation JIRA (HDDS-6172) covering this
> >> > feature, and we will continue to improve it further in master.
> >> >
> >> > Git Branch Name: HDDS-3816-ec
> >> >
> >> > JIRAs: HDDS-3816 and HDDS-5351
> >> >
> >> > Completed tasks: ~ 142
> >> >
> >> > + We are covering the following two mandatory JIRAs:
> >> >
> >> > 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
> >> >    server could fail due to the unavailability for client default
> >> >    replication config
> >> >
> >> > 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> >> >
> >> > PR reviews are in progress and expected to close in a day or two.
> >> >
> >> > A few other JIRAs in HDDS-3816 are still open, but I believe they're
> >> > not blockers for the merge.
> >> >
> >> > In short, what you can do now with this feature:
> >> >
> >> >    - You can enable EC at bucket level and cluster level.
> >> >
> >> >      How to enable it at bucket level?
> >> >      Just create the bucket by passing the EC replication options.
> >> >
> >> >    - You can create EC keys and read the same back.
> >> >
> >> >    - You should be able to continue writing even when chosen nodes
> >> >      are failing. (Of course, a minimum of Data+Parity live nodes
> >> >      should be available in the cluster to complete the write.)
> >> >
> >> >    - You should be able to read the file back even if a few nodes
> >> >      failed in the same EC block group. (Failures should not be more
> >> >      than the parity number of nodes.)
> >> >
> >> > What is pending? Offline recovery of lost/missing EC containers. As
> >> > mentioned above, post merge of this branch, I will create a separate
> >> > JIRA for starting the work for OfflineRecovery.
> >> >
> >> > Automated acceptance test cases have already been added (HDDS-6231).
> >> >
> >> > In addition to that, we have also performed basic acceptance testing
> >> > in a physical cluster:
> >> >
> >> >    1. Installed a 10-node cluster and created an EC bucket (3:2).
> >> >       Uploaded a 10GB key.
> >> >       Downloaded the same key and checked the md5sum.
> >> >
> >> >    2. Uploaded an 8GB key.
> >> >       Downloaded the same key and checked the md5sum.
> >> >
> >> >    3. Uploaded a 3MB key.
> >> >       Downloaded the same and verified the md5sum.
> >> >
> >> >    4. Changed the bucket to (6:3).
> >> >       Uploaded an 8GB key.
> >> >       Downloaded the same.
> >> >
> >> > Also verified that the new key is in the 6:3 policy and the old keys
> >> > remain 3:2. Verified with key writes and reads of several different
> >> > sizes.
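
As a rough illustration of the write and read conditions quoted above (a write needs at least Data+Parity live nodes; a read survives failures up to the parity count within one EC block group), here is a minimal Python sketch. ECScheme and its methods are hypothetical names used only for illustration, not Ozone classes, and the overhead figure is plain arithmetic, not a measurement from the tests described in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ECScheme:
    """Hypothetical helper (not an Ozone class): `data` data blocks
    plus `parity` parity blocks per EC block group."""
    data: int
    parity: int

    @property
    def width(self) -> int:
        # Number of distinct nodes one block group is spread across.
        return self.data + self.parity

    def can_write(self, live_nodes: int) -> bool:
        # A write needs at least Data+Parity live nodes in the cluster.
        return live_nodes >= self.width

    def can_read(self, failed_in_group: int) -> bool:
        # A read succeeds while failures in the group do not exceed parity.
        return 0 <= failed_in_group <= self.parity

    def storage_overhead(self) -> float:
        # Raw bytes stored per byte of user data.
        return self.width / self.data


ec_3_2 = ECScheme(data=3, parity=2)  # the "(3:2)" bucket from the tests
ec_6_3 = ECScheme(data=6, parity=3)  # the "(6:3)" bucket from the tests

for scheme in (ec_3_2, ec_6_3):
    print(scheme, "writable on 10 nodes:", scheme.can_write(10),
          "overhead:", round(scheme.storage_overhead(), 2))
```

On the 10-node cluster mentioned in the thread, both schemes satisfy the write condition under this model, since 3+2 and 6+3 are both at most 10.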
> >> >
> >> > Merge checklist items assessment is here:
> >> >
> >> > https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> >> >
> >> > Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com>, Istvan
> >> > Fajth <pi...@cloudera.com> for great efforts in core development
> >> > and also thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie for
> >> > collaborating on some of the EC tasks.
> >> >
> >> > Thanks to Marton for design discussion and on some dev tasks as
> >> > well.
> >> >
> >> > Thanks to many others who were involved in design discussions:
> >> > Arpit, Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda,
> >> > Shashi, Prashanth, Rakesh, Yiqun Lin.
> >> > Sorry if I miss anyone here, but your efforts are much appreciated.
> >> > Without your tremendous help, we would not have reached this
> >> > position yet.
> >> >
> >> > If there are no objections to the merge, I will start the official
> >> > vote later.
> >> >
> >> > Regards,
> >> >
> >> > EC Branch Devs
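
The replication-default precedence described in the thread (an explicit client-side setting or API argument wins, otherwise the bucket-level property applies, otherwise the server-side default) can be sketched as below. This is only an illustration under stated assumptions: resolve_replication and the string values are hypothetical and are not Ozone APIs or real configuration values.

```python
from typing import Optional


def resolve_replication(client_explicit: Optional[str],
                        bucket_default: Optional[str],
                        server_default: str) -> str:
    """Hypothetical resolver mirroring the precedence quoted in the thread.

    client_explicit: value configured on the client or passed through an API.
    bucket_default:  bucket-level replication property, if the bucket has one.
    server_default:  cluster-wide default held on the server side.
    """
    if client_explicit is not None:
        return client_explicit
    if bucket_default is not None:
        return bucket_default
    return server_default


# Placeholder labels, not real Ozone configuration values.
print(resolve_replication("client-ec-6-3", "bucket-ec-3-2", "server-default"))
print(resolve_replication(None, "bucket-ec-3-2", "server-default"))
print(resolve_replication(None, None, "server-default"))
```

Keeping the fallback chain on the server side is what lets a bucket created with EC options apply to clients that set nothing explicitly.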