+1 for the merge. Thanks for the great work!
On Wed, Apr 6, 2022 at 9:45 PM Prashant Pogde <ppo...@cloudera.com.invalid> wrote:

> +1 for the EC branch merge.
>
> Regards,
> Prashant
>
> > On Apr 6, 2022, at 8:20 PM, Siddharth Wagle <swa...@apache.org> wrote:
> >
> > +1 for the EC branch merge.
> >
> > Best,
> > Sid
> >
> > On Wed, Apr 6, 2022 at 8:05 PM guimark <guim...@126.com> wrote:
> >
> >> Great news!
> >> +1 to merge.
> >>
> >> At 2022-04-06 22:18:31, "Stephen O'Donnell" <sodonn...@cloudera.com.INVALID> wrote:
> >>> I have been working on the code on this branch for some time, and I
> >>> believe it is in a good state to merge now. It is mostly new code, and if
> >>> nothing attempts to use EC, none of the EC code paths will be executed.
> >>>
> >>> +1 to merge from me.
> >>>
> >>> Stephen.
> >>>
> >>> On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <umamah...@apache.org> wrote:
> >>>
> >>>> =====Few Edits Below===================
> >>>>
> >>>> Dear Ozone Devs,
> >>>>
> >>>> As you may know, we have been actively developing Ozone Erasure Coding
> >>>> support in a separate branch HDDS-3816-ec.
> >>>>
> >>>> We have finished the development of EC key write and read functionality.
> >>>> The support of offline recovery (recovering replicas after node loss) will
> >>>> be part of the second phase of work.
> >>>>
> >>>> Since the code has already grown and we are increasingly seeing merge
> >>>> complications, we would like to merge the current EC branch into master.
> >>>>
> >>>> We filed the new JIRA (HDDS-6462) for the second phase of work and continued
> >>>> the offline recovery work there. (We have uploaded the design doc there.)
> >>>>
> >>>> Details on Changes:
> >>>>
> >>>> - Most of the EC core logic went to newly extended classes. Key changes
> >>>>   went into EC*OutputStream and EC*InputStream classes for write and read
> >>>>   respectively.
> >>>>   Based on replication type, ECPipelineProvider will be chosen for
> >>>>   creating EC pipelines.
> >>>>
> >>>> - Since we cannot represent the EC replication in the existing replication
> >>>>   factor, we have introduced ECReplicationConfig. The ReplicationConfig
> >>>>   interface is already pushed to master, so it’s not a new idea coming
> >>>>   through this branch merge now. What is new here is the
> >>>>   ECReplicationConfig class, which can be used to express an EC replication
> >>>>   configuration.
> >>>>
> >>>> - We wanted to provide the support to enable EC at bucket level. To
> >>>>   simplify some complications, we have moved the default replication
> >>>>   configurations from client to server.
> >>>>
> >>>> - Client-side replication type and replication factor were removed from the
> >>>>   configuration files, and ozone.server.default.replication and
> >>>>   ozone.server.default.replication.type were introduced. We will continue
> >>>>   to respect replication settings that are configured explicitly on the
> >>>>   client side or passed through APIs; otherwise, the server-side
> >>>>   bucket-level properties or the server-side default configuration take
> >>>>   effect.
> >>>>
> >>>> - Other than this change, the rest of the EC-side code should not impact
> >>>>   any of the existing code flows.
> >>>>
> >>>> We have finished the documentation JIRA (HDDS-6172) covering this feature
> >>>> and we will continue to improve it further in master.
> >>>>
> >>>> Git Branch Name: HDDS-3816-ec
> >>>>
> >>>> JIRAs: HDDS-3816 and HDDS-5351
> >>>>
> >>>> Completed tasks: ~142
> >>>>
> >>>> + We are covering the following two mandatory JIRAs to come in:
> >>>>
> >>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older server
> >>>>    could fail due to the unavailability for client default replication
> >>>>    config
> >>>>
> >>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> >>>>
> >>>> PR reviews are in progress and expected to close in a day or two.
> >>>>
> >>>> A few other JIRAs in HDDS-3816 are still open, but I believe they're not
> >>>> blockers for the merge.
> >>>>
> >>>> In short, what you can do now with this feature:
> >>>>
> >>>> - You can enable EC at bucket level and cluster level.
> >>>>   How to enable it at bucket level? Just create the bucket by passing the
> >>>>   EC replication options.
> >>>>
> >>>> - You can create EC keys and read the same back.
> >>>>
> >>>> - You should be able to continue writing even when chosen nodes are
> >>>>   failing. (Of course, at least Data+Parity live nodes must be available
> >>>>   in the cluster to complete the write.)
> >>>>
> >>>> - You should be able to read the file back even if a few nodes failed in
> >>>>   the same EC block group. (Failures should not exceed the parity number
> >>>>   of nodes.)
> >>>>
> >>>> What is pending? Offline recovery of lost/missing EC containers. As
> >>>> mentioned above, post merge of this branch, I will create a separate JIRA
> >>>> for starting the work for OfflineRecovery.
> >>>>
> >>>> There are automated acceptance test cases already added. HDDS-6231
> >>>>
> >>>> In addition to that, we have also performed basic acceptance testing in a
> >>>> physical cluster:
> >>>>
> >>>> 1. Installed a 10-node cluster and created an EC bucket (3:2).
> >>>>    Uploaded a 10GB key.
> >>>>    Downloaded the same key and checked the md5sum.
> >>>>
> >>>> 2. Uploaded an 8GB key.
> >>>>    Downloaded the same key and checked the md5sum.
> >>>>
> >>>> 3. Uploaded a 3MB key.
> >>>>    Downloaded the same and verified the md5sum.
> >>>>
> >>>> 4. Changed the bucket to (6:3).
> >>>>    Uploaded an 8GB key.
> >>>>    Downloaded the same.
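
The md5sum comparison in the acceptance steps above can be sketched in Java as follows. This is only an illustration, assuming the uploaded source file and the downloaded copy are both available as local files; the class and method names (Md5RoundTripCheck, md5Hex, sameContent) are hypothetical and are not part of Ozone.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not an Ozone class: compares MD5 digests of two local files.
final class Md5RoundTripCheck {
    private static MessageDigest newMd5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    private static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Digest of an in-memory buffer (handy for small keys such as the 3MB case).
    static String md5Hex(byte[] data) {
        return toHex(newMd5().digest(data));
    }

    // Streams the file in 64 KiB chunks so multi-GB keys never sit in memory.
    static String md5Hex(Path file) {
        MessageDigest digest = newMd5();
        byte[] buffer = new byte[64 * 1024];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return toHex(digest.digest());
    }

    // True when the downloaded copy has the same MD5 digest as the uploaded source.
    static boolean sameContent(Path uploaded, Path downloaded) {
        return md5Hex(uploaded).equals(md5Hex(downloaded));
    }
}
```

Calling Md5RoundTripCheck.sameContent(source, downloaded) after each download gives the same pass/fail signal as comparing md5sum output by hand.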
> >>>>
> >>>> Also verified that new keys use the 6:3 policy and old keys remain 3:2.
> >>>> Verified with several different key sizes for writes and reads.
> >>>>
> >>>> Since the merge discussion thread, we have stabilized the code and fixed
> >>>> several bugs.
> >>>>
> >>>> Merge checklist items assessment is here:
> >>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> >>>>
> >>>> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com> and Istvan Fajth
> >>>> <pi...@cloudera.com> for great efforts in core development, and also thanks
> >>>> a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for collaborating
> >>>> on some of the EC tasks.
> >>>>
> >>>> Thanks to Marton for design discussions and for some dev tasks as well.
> >>>>
> >>>> Thanks to many others who were involved in design discussions: Arpit, Sidd,
> >>>> Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth, Rakesh,
> >>>> Yiqun Lin.
> >>>> Sorry if I missed anyone here, but your efforts are much appreciated.
> >>>> Without your tremendous help, we would not have reached this position yet.
> >>>>
> >>>> To start with, here is my +1.
> >>>>
> >>>> The vote will run for 5 days.
> >>>>
> >>>> Regards,
> >>>> Uma
> >>>>
> >>>> On Tue, Apr 5, 2022 at 10:58 PM Uma gangumalla <umamah...@apache.org> wrote:
> >>>>
> >>>>> Dear Ozone Devs,
> >>>>>
> >>>>> As you may know, we have been actively developing Ozone Erasure Coding
> >>>>> support in a separate branch HDDS-3816-ec.
> >>>>>
> >>>>> We have finished the development of EC key write and read functionality.
> >>>>> The support of offline recovery (recovering replicas after node loss) will
> >>>>> be part of the second phase of work.
> >>>>>
> >>>>> Since the code has already grown and we are increasingly seeing merge
> >>>>> complications, we would like to propose merging the current EC branch
> >>>>> into master.
> >>>>>
> >>>>> We filed the new JIRA (HDDS-6462) for the second phase of work and
> >>>>> continued the offline recovery work there.
> >>>>>
> >>>>> Details on Changes:
> >>>>>
> >>>>> - Most of the EC core logic went to newly extended classes. Key changes
> >>>>>   went into EC*OutputStream and EC*InputStream classes for write and read
> >>>>>   respectively. Based on replication type, ECPipelineProvider will be
> >>>>>   chosen for creating EC pipelines.
> >>>>>
> >>>>> - Since we cannot represent the EC replication in the existing
> >>>>>   replication factor, we have introduced ECReplicationConfig. The
> >>>>>   ReplicationConfig interface is already pushed to master, so it’s not a
> >>>>>   new idea coming through this branch merge now. What is new here is the
> >>>>>   ECReplicationConfig class, which can be used to express an EC
> >>>>>   replication configuration.
> >>>>>
> >>>>> - We wanted to provide the support to enable EC at bucket level. To
> >>>>>   simplify some complications, we have moved the default replication
> >>>>>   configurations from client to server.
> >>>>>
> >>>>> - Client-side replication type and replication factor were removed from
> >>>>>   the configuration files, and ozone.server.default.replication and
> >>>>>   ozone.server.default.replication.type were introduced. We will continue
> >>>>>   to respect replication settings that are configured explicitly on the
> >>>>>   client side or passed through APIs; otherwise, the server-side
> >>>>>   bucket-level properties or the server-side default configuration take
> >>>>>   effect.
> >>>>>
> >>>>> - Other than this change, the rest of the EC-side code should not impact
> >>>>>   any of the existing code flows.
> >>>>>
> >>>>> We have finished the documentation JIRA (HDDS-6172) covering this feature
> >>>>> and we will continue to improve it further in master.
> >>>>>
> >>>>> Git Branch Name: HDDS-3816-ec
> >>>>>
> >>>>> JIRAs: HDDS-3816 and HDDS-5351
> >>>>>
> >>>>> Completed tasks: ~142
> >>>>>
> >>>>> + We are covering the following two mandatory JIRAs:
> >>>>>
> >>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
> >>>>>    server could fail due to the unavailability for client default
> >>>>>    replication config
> >>>>>
> >>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> >>>>>
> >>>>> PR reviews are in progress and expected to close in a day or two.
> >>>>>
> >>>>> A few other JIRAs in HDDS-3816 are still open, but I believe they're not
> >>>>> blockers for the merge.
> >>>>>
> >>>>> In short, what you can do now with this feature:
> >>>>>
> >>>>> - You can enable EC at bucket level and cluster level.
> >>>>>   How to enable it at bucket level? Just create the bucket by passing
> >>>>>   the EC replication options.
> >>>>>
> >>>>> - You can create EC keys and read the same back.
> >>>>>
> >>>>> - You should be able to continue writing even when chosen nodes are
> >>>>>   failing. (Of course, at least Data+Parity live nodes must be available
> >>>>>   in the cluster to complete the write.)
> >>>>>
> >>>>> - You should be able to read the file back even if a few nodes failed in
> >>>>>   the same EC block group. (Failures should not exceed the parity number
> >>>>>   of nodes.)
> >>>>>
> >>>>> What is pending? Offline recovery of lost/missing EC containers. As
> >>>>> mentioned above, post merge of this branch, I will create a separate JIRA
> >>>>> for starting the work for OfflineRecovery.
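
The write and read conditions in the capability list above reduce to simple arithmetic on the data and parity counts. The Java sketch below illustrates that arithmetic only. It assumes the thread's "data:parity" shorthand (for example 3:2 or 6:3), and EcPolicySketch and its methods are hypothetical names, not Ozone's ECReplicationConfig API.

```java
// Hypothetical illustration of the Data+Parity rules quoted above; not Ozone code.
final class EcPolicySketch {
    final int data;
    final int parity;

    EcPolicySketch(int data, int parity) {
        if (data <= 0 || parity <= 0) {
            throw new IllegalArgumentException("data and parity must be positive");
        }
        this.data = data;
        this.parity = parity;
    }

    // Parses the thread's shorthand, e.g. "3:2" means 3 data and 2 parity blocks.
    static EcPolicySketch parse(String spec) {
        String[] parts = spec.trim().split(":");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected data:parity, got " + spec);
        }
        return new EcPolicySketch(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
    }

    // A write needs at least Data+Parity live nodes in the cluster.
    int minLiveNodesForWrite() {
        return data + parity;
    }

    boolean canCompleteWrite(int liveNodes) {
        return liveNodes >= minLiveNodesForWrite();
    }

    // A read succeeds while failures in one block group do not exceed the parity count.
    boolean canRead(int failedNodesInGroup) {
        return failedNodesInGroup >= 0 && failedNodesInGroup <= parity;
    }
}
```

Under these assumptions, a 3:2 bucket needs 5 live nodes to write and can read through up to 2 failed nodes in a block group, while a 6:3 bucket needs 9 live nodes and can read through up to 3.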
> >>>>>
> >>>>> There are automated acceptance test cases already added. HDDS-6231
> >>>>>
> >>>>> In addition to that, we have also performed basic acceptance testing in a
> >>>>> physical cluster:
> >>>>>
> >>>>> 1. Installed a 10-node cluster and created an EC bucket (3:2).
> >>>>>    Uploaded a 10GB key.
> >>>>>    Downloaded the same key and checked the md5sum.
> >>>>>
> >>>>> 2. Uploaded an 8GB key.
> >>>>>    Downloaded the same key and checked the md5sum.
> >>>>>
> >>>>> 3. Uploaded a 3MB key.
> >>>>>    Downloaded the same and verified the md5sum.
> >>>>>
> >>>>> 4. Changed the bucket to (6:3).
> >>>>>    Uploaded an 8GB key.
> >>>>>    Downloaded the same.
> >>>>>
> >>>>> Also verified that new keys use the 6:3 policy and old keys remain 3:2.
> >>>>> Verified with several different key sizes for writes and reads.
> >>>>>
> >>>>> Merge checklist items assessment is here:
> >>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> >>>>>
> >>>>> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com> and Istvan
> >>>>> Fajth <pi...@cloudera.com> for great efforts in core development, and
> >>>>> also thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie for
> >>>>> collaborating on some of the EC tasks.
> >>>>>
> >>>>> Thanks to Marton for design discussions and for some dev tasks as well.
> >>>>>
> >>>>> Thanks to many others who were involved in design discussions: Arpit,
> >>>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth,
> >>>>> Rakesh, Yiqun Lin.
> >>>>> Sorry if I missed anyone here, but your efforts are much appreciated.
> >>>>> Without your tremendous help, we would not have reached this position yet.
> >>>>>
> >>>>> If there are no objections to the merge, I will start the official vote
> >>>>> later.
> >>>>>
> >>>>> Regards,
> >>>>>
> >>>>> EC Branch Devs
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> For additional commands, e-mail: dev-h...@ozone.apache.org

--
Thanks & Regards,
Aravindan
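
The replication-default precedence described in the thread (an explicit client-side or API setting first, then bucket-level properties, then the server-side default) can be sketched as a simple fallback chain. This is a minimal Java illustration, not Ozone's implementation: ReplicationResolutionSketch and resolve are hypothetical names, and the string values stand in for whatever replication configuration objects the real code uses.

```java
import java.util.Optional;

// Hypothetical sketch of the fallback order described in the thread; not Ozone code.
final class ReplicationResolutionSketch {
    // Returns the first setting that is present, in the order:
    // explicit client/API setting, bucket-level property, server default.
    static String resolve(Optional<String> clientSetting,
                          Optional<String> bucketSetting,
                          String serverDefault) {
        if (clientSetting.isPresent()) {
            return clientSetting.get();
        }
        if (bucketSetting.isPresent()) {
            return bucketSetting.get();
        }
        return serverDefault;
    }
}
```

In this sketch the server default plays the role the thread assigns to ozone.server.default.replication and ozone.server.default.replication.type: it only applies when neither the client nor the bucket specifies anything.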