+1 We should merge it so that more people can try it. We can work on the remaining tasks in the master branch. Thanks a lot!
Tsz-Wo

On Thu, Apr 7, 2022 at 1:17 PM Aravindan Vijayan <avija...@cloudera.com.invalid> wrote:

> +1 for the merge. Thanks for the great work!
>
> On Wed, Apr 6, 2022 at 9:45 PM Prashant Pogde <ppo...@cloudera.com.invalid> wrote:
>
> > +1 for the EC branch merge.
> >
> > Regards,
> > Prashant
> >
> > > On Apr 6, 2022, at 8:20 PM, Siddharth Wagle <swa...@apache.org> wrote:
> > >
> > > +1 for the EC branch merge.
> > >
> > > Best,
> > > Sid
> > >
> > > On Wed, Apr 6, 2022 at 8:05 PM guimark <guim...@126.com> wrote:
> > >
> > >> Great news!
> > >> +1 to merge.
> > >>
> > >> At 2022-04-06 22:18:31, "Stephen O'Donnell" <sodonn...@cloudera.com.INVALID> wrote:
> > >>> I have been working on the code on this branch for some time, and I
> > >>> believe it is in a good state to merge now. It is mostly new code, and
> > >>> if nothing attempts to use EC, none of the EC code paths will be executed.
> > >>>
> > >>> +1 to merge from me.
> > >>>
> > >>> Stephen.
> > >>>
> > >>> On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <umamah...@apache.org> wrote:
> > >>>
> > >>>> =====Few Edits Below===================
> > >>>>
> > >>>> Dear Ozone Devs,
> > >>>>
> > >>>> As you may know, we have been actively developing Ozone Erasure Coding
> > >>>> support in a separate branch HDDS-3816-ec.
> > >>>>
> > >>>> We have finished the development of EC key write and read functionality.
> > >>>> The support of offline recovery (Recovering replica from node loss) will
> > >>>> be part of second phase work.
> > >>>>
> > >>>> Since the code has already grown and increasingly started seeing merge
> > >>>> complications, we would like to merge the current EC branch into master.
> > >>>>
> > >>>> We filed the new JIRA (HDDS-6462) for the second phase of work and
> > >>>> continued the offline recovery work there.
> > >>>> (we have uploaded the design doc there)
> > >>>>
> > >>>> Details on Changes:
> > >>>>
> > >>>> - Most of the EC core logic went to newly extended classes. Key changes
> > >>>>   went into EC*OutputStream and EC*InputStream classes for write and
> > >>>>   read respectively. Based on replication type, ECPipelineProvider will
> > >>>>   be chosen for creating EC pipelines.
> > >>>>
> > >>>> - Since we cannot represent the EC replication in the existing
> > >>>>   replication factor, we have introduced ECReplicationConfig. The
> > >>>>   ReplicationConfig interface is already pushed to master, so it’s not a
> > >>>>   new idea coming through this branch merge now. What is newly coming
> > >>>>   here is the ECReplicationConfig class which can be used to express EC
> > >>>>   replication configuration.
> > >>>>
> > >>>> - We wanted to provide the support to enable EC at bucket level. To
> > >>>>   simplify some complications, we have moved the default replication
> > >>>>   configurations from client to server.
> > >>>>
> > >>>> - Client side replication type and replication factor removed from the
> > >>>>   configuration files and introduced the ozone.server.default.replication
> > >>>>   and ozone.server.default.replication.type. We would continue to respect
> > >>>>   if one configures at client side explicitly or passed through APIs,
> > >>>>   otherwise server side bucket level properties or server side default
> > >>>>   configuration would take effect.
> > >>>>
> > >>>> - Other than this change, the rest of EC side code should not impact any
> > >>>>   of the existing code flows.
> > >>>>
> > >>>> We have finished documentation JIRA (HDDS-6172) for covering this feature
> > >>>> and we will continue to improve further in master.
> > >>>>
> > >>>> Git Branch Name: HDDS-3816-ec
> > >>>>
> > >>>> JIRAs: HDDS-3816 and HDDS-5351
> > >>>>
> > >>>> Completed tasks: ~142
> > >>>>
> > >>>> + We are covering the following two mandatory JIRAs to come in:
> > >>>>
> > >>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
> > >>>>    server could fail due to the unavailability for client default
> > >>>>    replication config
> > >>>>
> > >>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> > >>>>
> > >>>> PRs reviews in-progress and expected to close in a day or two.
> > >>>>
> > >>>> Few other JIRAs in HDDS-3816 are still open but I believe they're not
> > >>>> blockers for merge.
> > >>>>
> > >>>> In short what you can do now with this feature:
> > >>>>
> > >>>> - You can enable EC at bucket level and cluster level.
> > >>>>
> > >>>> How to enable it at bucket level? Just create the bucket by passing the
> > >>>> ec replication options.
> > >>>>
> > >>>> - You can create EC keys and read the same back.
> > >>>> - You should be able to continue writing even when chosen nodes are
> > >>>>   failing. (Of course minimum of Data+Parity live nodes should be
> > >>>>   available in cluster for complete the write)
> > >>>> - You should be able to read the file back even if a few nodes failed in
> > >>>>   the same ec block group (Failures should not be more than parity number
> > >>>>   of nodes.).
> > >>>>
> > >>>> What is pending? Offline recovery of lost/missing EC containers. As
> > >>>> mentioned above, post merge of this branch, I will create a separate JIRA
> > >>>> for starting the work for OfflineRecovery.
> > >>>>
> > >>>> There are automated acceptance test cases already added. HDDS-6231
> > >>>>
> > >>>> In addition to that, we have also performed basic Acceptance Testing in
> > >>>> physical cluster:
> > >>>>
> > >>>> 1. Installed 10 nodes cluster and created EC bucket (3:2).
> > >>>>    Uploaded 10GB key.
> > >>>>    Downloaded the same key and checked the md5sum.
> > >>>>
> > >>>> 2. Uploaded 8GB key.
> > >>>>    Downloaded the same key and checked the md5sum.
> > >>>>
> > >>>> 3. Uploaded 3MB key.
> > >>>>    Downloaded the same and verified md5sum.
> > >>>>
> > >>>> 4. Changed bucket to (6:3).
> > >>>>    Uploaded 8GB key.
> > >>>>    Download the same.
> > >>>>
> > >>>> Also verified the new key should be in 6:3 policy and old keys must be
> > >>>> 3:2. Verified with several different size key writes and reads.
> > >>>>
> > >>>> Since the merge discussion thread, we have well stabilized code and fixed
> > >>>> several bugs.
> > >>>>
> > >>>> Merge checklist items assessment is here:
> > >>>>
> > >>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> > >>>>
> > >>>> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com>, Istvan Fajth
> > >>>> <pi...@cloudera.com> for great efforts in core development and also
> > >>>> thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for
> > >>>> collaborating on some of the EC tasks.
> > >>>>
> > >>>> Thanks to Marton for design discussion and on some dev tasks as well.
> > >>>>
> > >>>> Thanks to many others who were involved in design discussions, Arpit,
> > >>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth,
> > >>>> Rakesh, Yiqun Lin.
> > >>>> Sorry if I miss anyone here, but your efforts are much appreciated.
> > >>>> Without your tremendous help, we would have not reached this position yet.
> > >>>>
> > >>>> To start with, here is my +1
> > >>>>
> > >>>> The vote will run for 5 days.
> > >>>>
> > >>>> Regards,
> > >>>> Uma
> > >>>>
> > >>>> On Tue, Apr 5, 2022 at 10:58 PM Uma gangumalla <umamah...@apache.org> wrote:
> > >>>>
> > >>>>> Dear Ozone Devs,
> > >>>>>
> > >>>>> As you may know, we have been actively developing Ozone Erasure Coding
> > >>>>> support in a separate branch HDDS-3816-ec.
> > >>>>>
> > >>>>> We have finished the development of EC key write and read functionality.
> > >>>>> The support of offline recovery (Recovering replica from node loss) will
> > >>>>> be part of second phase work.
> > >>>>>
> > >>>>> Since the code has already grown and increasingly started seeing merge
> > >>>>> complications, we would like to propose to merge the current EC branch
> > >>>>> into master.
> > >>>>>
> > >>>>> We filed the new JIRA (HDDS-6462) for the second phase of work and
> > >>>>> continued the offline recovery work there.
> > >>>>>
> > >>>>> Details on Changes:
> > >>>>>
> > >>>>> - Most of the EC core logic went to newly extended classes. Key changes
> > >>>>>   went into EC*OutputStream and EC*InputStream classes for write and
> > >>>>>   read respectively. Based on replication type, ECPipelineProvider will
> > >>>>>   be chosen for creating EC pipelines.
> > >>>>>
> > >>>>> - Since we cannot represent the EC replication in the existing
> > >>>>>   replication factor, we have introduced ECReplicationConfig. The
> > >>>>>   ReplicationConfig interface is already pushed to master, so it’s not a
> > >>>>>   new idea coming through this branch merge now. What is newly coming
> > >>>>>   here is the ECReplicationConfig class which can be used to express EC
> > >>>>>   replication configuration.
> > >>>>>
> > >>>>> - We wanted to provide the support to enable EC at bucket level. To
> > >>>>>   simplify some complications, we have moved the default replication
> > >>>>>   configurations from client to server.
> > >>>>>
> > >>>>> - Client side replication type and replication factor removed from the
> > >>>>>   configuration files and introduced the ozone.server.default.replication
> > >>>>>   and ozone.server.default.replication.type. We would continue to respect
> > >>>>>   if one configures at client side explicitly or passed through APIs,
> > >>>>>   otherwise server side bucket level properties or server side default
> > >>>>>   configuration would take effect.
> > >>>>>
> > >>>>> - Other than this change, the rest of EC side code should not impact any
> > >>>>>   of the existing code flows.
> > >>>>>
> > >>>>> We have finished documentation JIRA (HDDS-6172) for covering this feature
> > >>>>> and we will continue to improve further in master.
> > >>>>>
> > >>>>> Git Branch Name: HDDS-3816-ec
> > >>>>>
> > >>>>> JIRAs: HDDS-3816 and HDDS-5351
> > >>>>>
> > >>>>> Completed tasks: ~142
> > >>>>>
> > >>>>> + We are covering the following two mandatory JIRAs:
> > >>>>>
> > >>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
> > >>>>>    server could fail due to the unavailability for client default
> > >>>>>    replication config
> > >>>>>
> > >>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> > >>>>>
> > >>>>> PRs reviews in-progress and expected to close in a day or two.
> > >>>>>
> > >>>>> Few other JIRAs in HDDS-3816 are still open but I believe they're not
> > >>>>> blockers for merge.
> > >>>>>
> > >>>>> In short what you can do now with this feature:
> > >>>>>
> > >>>>> - You can enable EC at bucket level and cluster level.
> > >>>>>
> > >>>>> How to enable it at bucket level? Just create the bucket by passing the
> > >>>>> ec replication options.
> > >>>>>
> > >>>>> - You can create EC keys and read the same back.
> > >>>>> - You should be able to continue writing even when chosen nodes are
> > >>>>>   failing. (Of course minimum of Data+Parity live nodes should be
> > >>>>>   available in cluster for complete the write)
> > >>>>> - You should be able to read the file back even if a few nodes failed in
> > >>>>>   the same ec block group (Failures should not be more than parity number
> > >>>>>   of nodes.).
> > >>>>>
> > >>>>> What is pending? Offline recovery of lost/missing EC containers. As
> > >>>>> mentioned above, post merge of this branch, I will create a separate JIRA
> > >>>>> for starting the work for OfflineRecovery.
> > >>>>>
> > >>>>> There are automated acceptance test cases already added. HDDS-6231
> > >>>>>
> > >>>>> In addition to that, we have also performed basic Acceptance Testing in
> > >>>>> physical cluster:
> > >>>>>
> > >>>>> 1. Installed 10 nodes cluster and created EC bucket (3:2).
> > >>>>>    Uploaded 10GB key.
> > >>>>>    Downloaded the same key and checked the md5sum.
> > >>>>>
> > >>>>> 2. Uploaded 8GB key.
> > >>>>>    Downloaded the same key and checked the md5sum.
> > >>>>>
> > >>>>> 3. Uploaded 3MB key.
> > >>>>>    Downloaded the same and verified md5sum.
> > >>>>>
> > >>>>> 4. Changed bucket to (6:3).
> > >>>>>    Uploaded 8GB key.
> > >>>>>    Download the same.
> > >>>>>
> > >>>>> Also verified the new key should be in 6:3 policy and old keys must be
> > >>>>> 3:2. Verified with several different size key writes and reads.
> > >>>>>
> > >>>>> Merge checklist items assessment is here:
> > >>>>>
> > >>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> > >>>>>
> > >>>>> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com>, Istvan Fajth
> > >>>>> <pi...@cloudera.com> for great efforts in core development and also
> > >>>>> thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie for collaborating
> > >>>>> on some of the EC tasks.
> > >>>>>
> > >>>>> Thanks to Marton for design discussion and on some dev tasks as well.
> > >>>>>
> > >>>>> Thanks to many others who were involved in design discussions, Arpit,
> > >>>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth,
> > >>>>> Rakesh, Yiqun Lin.
> > >>>>> Sorry if I miss anyone here, but your efforts are much appreciated.
> > >>>>> Without your tremendous help, we would have not reached this position yet.
> > >>>>>
> > >>>>> If there are no objections for the merge, I will start the official vote
> > >>>>> later.
> > >>>>>
> > >>>>> Regards,
> > >>>>>
> > >>>>> EC Branch Devs
> > >>>>>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> > For additional commands, e-mail: dev-h...@ozone.apache.org
> >
>
> --
> Thanks & Regards,
> Aravindan
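
For readers following the thread, the replication precedence described in the proposal (an explicit client setting wins, otherwise the bucket-level property, otherwise the server-side default) and the stated failure limits can be sketched roughly as below. This is only an illustrative model, not Ozone code: ECConfig, resolve_replication and the "server-default" placeholder are hypothetical names made up for this sketch, and it assumes that "3:2" means 3 data and 2 parity nodes.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class ECConfig:
    """Hypothetical stand-in for an EC setting such as 3:2 (data:parity assumed)."""
    data: int
    parity: int

    def min_nodes_for_write(self) -> int:
        # The proposal says a write needs at least Data+Parity live nodes.
        return self.data + self.parity

    def can_read(self, failed_nodes: int) -> bool:
        # The proposal says a read survives as long as failures in a block
        # group do not exceed the parity count.
        return failed_nodes <= self.parity


# A setting is either an EC config or an opaque label (placeholder for any
# non-EC replication setting; the real values are not given in the thread).
Setting = Union[ECConfig, str]


def resolve_replication(client: Optional[Setting],
                        bucket: Optional[Setting],
                        server_default: Optional[Setting]) -> Setting:
    """Return the first setting that is present: client, bucket, server default."""
    for candidate in (client, bucket, server_default):
        if candidate is not None:
            return candidate
    raise ValueError("no replication configuration available")


if __name__ == "__main__":
    ec32 = ECConfig(data=3, parity=2)
    # No client setting: the bucket-level EC config takes effect.
    print(resolve_replication(None, ec32, "server-default"))
    # Neither client nor bucket set: fall back to the server default.
    print(resolve_replication(None, None, "server-default"))
    print(ec32.min_nodes_for_write(), ec32.can_read(2), ec32.can_read(3))
```

Under these assumptions a 3:2 bucket needs five live nodes to complete a write and can still serve reads with up to two failed nodes in a block group.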