Thanks everyone for voting. The vote has passed with the following stats:

+1s: 15 (Stephen, Mark, Siddharth, Prashant, Aravindan, Nicholas, Bharat, Ayush, Lokesh, Mukul, Mingchao, Jackson, Hanisha, Shashikant, Neil, Sammi, Dinesh, Janus)
No -1s.

As promised, HDDS-5909 and HDDS-6209 have been committed to the branch, so the EC branch now covers the compatibility issues.

We are also tracking the merge PR at https://github.com/apache/ozone/pull/3301 to make sure we get a green CI run before the merge. CI is now green with all expected changes in the branch, so I will go ahead and merge the branch soon.

Regards,
Uma

On Tue, Apr 5, 2022 at 11:11 PM Uma gangumalla <umamah...@apache.org> wrote:

> =====Few Edits Below===================
>
> Dear Ozone Devs,
>
> As you may know, we have been actively developing Ozone Erasure Coding
> support in a separate branch, HDDS-3816-ec.
>
> We have finished the development of EC key write and read functionality.
> Support for offline recovery (recovering replicas after node loss) will be
> part of the second phase of work.
>
> Since the code has grown and we are increasingly seeing merge
> complications, we would like to merge the current EC branch into master.
>
> We filed a new JIRA (HDDS-6462) for the second phase of work and have
> continued the offline recovery work there. (We have uploaded the design doc
> there.)
>
> Details on Changes:
>
> - Most of the EC core logic went into newly extended classes. Key changes
>   went into the EC*OutputStream and EC*InputStream classes for write and
>   read respectively. Based on the replication type, ECPipelineProvider
>   will be chosen for creating EC pipelines.
>
> - Since we cannot represent EC replication with the existing replication
>   factor, we have introduced ECReplicationConfig. The ReplicationConfig
>   interface is already in master, so it is not a new idea coming through
>   this branch merge. What is new here is the ECReplicationConfig class,
>   which can be used to express an EC replication configuration.
>
> - We wanted to support enabling EC at the bucket level. To simplify some
>   complications, we have moved the default replication configurations
>   from client to server.
>
> - The client-side replication type and replication factor have been
>   removed from the configuration files, and
>   ozone.server.default.replication and
>   ozone.server.default.replication.type have been introduced. We will
>   continue to honor a replication configuration that is set explicitly on
>   the client side or passed through the APIs; otherwise, the server-side
>   bucket-level properties or the server-side default configuration take
>   effect.
>
> - Other than this change, the rest of the EC code should not impact any
>   of the existing code flows.
>
> We have finished the documentation JIRA (HDDS-6172) covering this feature,
> and we will continue to improve it in master.
>
> Git Branch Name: HDDS-3816-ec
>
> JIRAs: HDDS-3816 and HDDS-5351
>
> Completed tasks: ~142
>
> We are also covering the following two mandatory JIRAs, which need to
> come in:
>
> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
>    server could fail due to the unavailability for client default
>    replication config
>
> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
>
> PR reviews are in progress and expected to close in a day or two.
>
> A few other JIRAs under HDDS-3816 are still open, but I believe they are
> not blockers for the merge.
>
> In short, what you can do now with this feature:
>
> - You can enable EC at the bucket level and the cluster level.
>   How to enable it at the bucket level? Just create the bucket, passing
>   the EC replication options.
>
> - You can create EC keys and read them back.
>
> - You should be able to continue writing even when chosen nodes are
>   failing. (Of course, a minimum of Data+Parity live nodes must be
>   available in the cluster to complete the write.)
>
> - You should be able to read the file back even if a few nodes have
>   failed in the same EC block group. (The number of failures must not
>   exceed the number of parity nodes.)
>
> What is pending? Offline recovery of lost/missing EC containers.
> As mentioned above, after this branch is merged, I will create a separate
> JIRA to start the work on offline recovery.
>
> Automated acceptance test cases have already been added (HDDS-6231).
>
> In addition, we have performed basic acceptance testing on a physical
> cluster:
>
> 1. Installed a 10-node cluster and created an EC bucket (3:2).
>    Uploaded a 10GB key.
>    Downloaded the same key and checked the md5sum.
>
> 2. Uploaded an 8GB key.
>    Downloaded the same key and checked the md5sum.
>
> 3. Uploaded a 3MB key.
>    Downloaded the same key and verified the md5sum.
>
> 4. Changed the bucket to (6:3).
>    Uploaded an 8GB key.
>    Downloaded the same key.
>    Also verified that the new key uses the 6:3 policy and that old keys
>    remain 3:2.
>
> Verified with key writes and reads of several different sizes.
>
> Since the merge discussion thread, we have stabilized the code and fixed
> several bugs.
>
> The merge checklist assessment is here:
> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>
> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com> and Istvan
> Fajth <pi...@cloudera.com> for their great efforts in core development,
> and thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, and Attila for
> collaborating on some of the EC tasks.
>
> Thanks to Marton for the design discussions and for some dev tasks as
> well.
>
> Thanks to the many others who were involved in design discussions: Arpit,
> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth,
> Rakesh, and Yiqun Lin.
> Sorry if I missed anyone here; your efforts are much appreciated.
> Without your tremendous help, we would not have reached this point.
>
> To start with, here is my +1.
>
> The vote will run for 5 days.
>
> Regards,
> Uma
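
The replication-config precedence described in the quoted mail (an explicit client-side or API setting wins, then the bucket-level property, then the server-side default) can be sketched roughly as below. This is only an illustration in Python, not Ozone code: ReplicationSetting, resolve_replication, and all example values are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReplicationSetting:
    """Hypothetical stand-in for a replication configuration."""
    kind: str    # replication type (illustrative value, e.g. "EC")
    detail: str  # e.g. "3:2" meaning 3 data + 2 parity (illustrative value)


def resolve_replication(
    client: Optional[ReplicationSetting],
    bucket: Optional[ReplicationSetting],
    server_default: ReplicationSetting,
) -> ReplicationSetting:
    """Pick the effective setting using the precedence described in the mail."""
    if client is not None:
        # Explicit client-side configuration or a value passed through the API.
        return client
    if bucket is not None:
        # Server-side bucket-level property.
        return bucket
    # Server-side default configuration.
    return server_default


# Illustrative values only; they are not taken from a real cluster.
server_default = ReplicationSetting("RATIS", "THREE")
ec_bucket = ReplicationSetting("EC", "3:2")
explicit = ReplicationSetting("EC", "6:3")

print(resolve_replication(None, None, server_default))
print(resolve_replication(None, ec_bucket, server_default))
print(resolve_replication(explicit, ec_bucket, server_default))
```

Under these assumptions, a key written to an EC bucket without an explicit client setting picks up the bucket's configuration, and the server default applies only when neither the client nor the bucket specifies one.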