+1 for the EC branch merge.

Regards,
Prashant

> On Apr 6, 2022, at 8:20 PM, Siddharth Wagle <swa...@apache.org> wrote:
>
> +1 for the EC branch merge.
>
> Best,
> Sid
>
> On Wed, Apr 6, 2022 at 8:05 PM guimark <guim...@126.com> wrote:
>
>> Great news!
>> +1 to merge.
>>
>> At 2022-04-06 22:18:31, "Stephen O'Donnell" <sodonn...@cloudera.com.INVALID> wrote:
>>> I have been working on the code on this branch for some time, and I
>>> believe it is in a good state to merge now. It is mostly new code, and
>>> if nothing attempts to use EC, none of the EC code paths will be
>>> executed.
>>>
>>> +1 to merge from me.
>>>
>>> Stephen.
>>>
>>> On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <umamah...@apache.org> wrote:
>>>
>>>> =====Few Edits Below===================
>>>>
>>>> Dear Ozone Devs,
>>>>
>>>> As you may know, we have been actively developing Ozone Erasure Coding
>>>> support in a separate branch, HDDS-3816-ec.
>>>>
>>>> We have finished the development of EC key write and read
>>>> functionality. Support for offline recovery (recovering replicas after
>>>> node loss) will be part of the second phase of work.
>>>>
>>>> Since the code has grown and we are increasingly seeing merge
>>>> complications, we would like to merge the current EC branch into
>>>> master.
>>>>
>>>> We filed a new JIRA (HDDS-6462) for the second phase of work and have
>>>> continued the offline recovery work there (the design doc is uploaded
>>>> there).
>>>>
>>>> Details on Changes:
>>>>
>>>> - Most of the EC core logic went into newly extended classes. Key
>>>>   changes went into the EC*OutputStream and EC*InputStream classes for
>>>>   write and read respectively. Based on the replication type,
>>>>   ECPipelineProvider will be chosen for creating EC pipelines.
>>>>
>>>> - Since we cannot represent EC replication in the existing replication
>>>>   factor, we have introduced ECReplicationConfig.
>>>>   The ReplicationConfig interface is already pushed to master, so it's
>>>>   not a new idea coming through this branch merge. What is new here is
>>>>   the ECReplicationConfig class, which can be used to express an EC
>>>>   replication configuration.
>>>>
>>>> - We wanted to provide support for enabling EC at the bucket level. To
>>>>   simplify some complications, we have moved the default replication
>>>>   configurations from client to server.
>>>>
>>>> - The client-side replication type and replication factor were removed
>>>>   from the configuration files, and we introduced
>>>>   ozone.server.default.replication and
>>>>   ozone.server.default.replication.type. We will continue to respect a
>>>>   replication config that is set explicitly on the client side or
>>>>   passed through the APIs; otherwise the server-side bucket-level
>>>>   properties or the server-side default configuration take effect.
>>>>
>>>> - Other than this change, the rest of the EC code should not impact any
>>>>   of the existing code flows.
>>>>
>>>> We have finished the documentation JIRA (HDDS-6172) covering this
>>>> feature, and we will continue to improve it in master.
>>>>
>>>> Git Branch Name: HDDS-3816-ec
>>>>
>>>> JIRAs: HDDS-3816 and HDDS-5351
>>>>
>>>> Completed tasks: ~ 142
>>>>
>>>> + We are covering the following two mandatory JIRAs to come in:
>>>>
>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
>>>>    server could fail due to the unavailability for client default
>>>>    replication config
>>>>
>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
>>>>
>>>> PR reviews are in progress and expected to close in a day or two.
>>>>
>>>> A few other JIRAs in HDDS-3816 are still open, but I believe they're
>>>> not blockers for the merge.
>>>>
>>>> In short, what you can do now with this feature:
>>>>
>>>> - You can enable EC at bucket level and cluster level.
>>>>
>>>>   How to enable it at bucket level?
>>>>   Just create the bucket by passing the EC replication options.
>>>>
>>>> - You can create EC keys and read them back.
>>>>
>>>> - You should be able to continue writing even when chosen nodes are
>>>>   failing. (Of course, a minimum of Data+Parity live nodes should be
>>>>   available in the cluster to complete the write.)
>>>>
>>>> - You should be able to read the file back even if a few nodes failed
>>>>   in the same EC block group. (Failures should not exceed the parity
>>>>   number of nodes.)
>>>>
>>>> What is pending? Offline recovery of lost/missing EC containers. As
>>>> mentioned above, post merge of this branch, I will create a separate
>>>> JIRA for starting the work for OfflineRecovery.
>>>>
>>>> Automated acceptance test cases have already been added (HDDS-6231).
>>>>
>>>> In addition to that, we have also performed basic acceptance testing in
>>>> a physical cluster:
>>>>
>>>> 1. Installed a 10-node cluster and created an EC bucket (3:2).
>>>>    Uploaded a 10GB key.
>>>>    Downloaded the same key and checked the md5sum.
>>>>
>>>> 2. Uploaded an 8GB key.
>>>>    Downloaded the same key and checked the md5sum.
>>>>
>>>> 3. Uploaded a 3MB key.
>>>>    Downloaded the same and verified the md5sum.
>>>>
>>>> 4. Changed the bucket to (6:3).
>>>>    Uploaded an 8GB key.
>>>>    Downloaded the same.
>>>>
>>>> Also verified that the new key uses the 6:3 policy and the old keys
>>>> remain 3:2. Verified with several different size key writes and reads.
>>>>
>>>> Since the merge discussion thread, we have stabilized the code and
>>>> fixed several bugs.
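The write and read conditions quoted above can be sketched as simple arithmetic on a (data:parity) policy. This is only an illustrative sketch in Python, not Ozone code: the ECPolicy class and the can_write and can_read helpers are hypothetical names, and the only rules encoded are the two stated in the mail (a write needs at least Data+Parity live nodes, and a read tolerates at most Parity failures within one block group).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ECPolicy:
    """Hypothetical stand-in for an EC policy such as (3:2) or (6:3)."""
    data: int
    parity: int

    @property
    def min_live_nodes(self) -> int:
        # A write needs one live node per data block and per parity block.
        return self.data + self.parity

    @property
    def max_tolerated_failures(self) -> int:
        # A block group stays readable while failures do not exceed parity.
        return self.parity


def can_write(policy: ECPolicy, live_nodes: int) -> bool:
    """True if the cluster has enough live nodes to complete an EC write."""
    return live_nodes >= policy.min_live_nodes


def can_read(policy: ECPolicy, failed_in_group: int) -> bool:
    """True if a block group with this many failed nodes is still readable."""
    return failed_in_group <= policy.max_tolerated_failures


if __name__ == "__main__":
    for policy in (ECPolicy(3, 2), ECPolicy(6, 3)):
        print(policy, "min live nodes:", policy.min_live_nodes,
              "tolerated failures:", policy.max_tolerated_failures)
```

Under these assumptions, the 10-node test cluster mentioned above has enough nodes for both the (3:2) and the (6:3) bucket settings.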
>>>>
>>>> Merge checklist items assessment is here:
>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>>>>
>>>> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com> and Istvan
>>>> Fajth <pi...@cloudera.com> for great efforts in core development, and
>>>> thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, and Attila for
>>>> collaborating on some of the EC tasks.
>>>>
>>>> Thanks to Marton for design discussions and for some dev tasks as well.
>>>>
>>>> Thanks to the many others who were involved in design discussions:
>>>> Arpit, Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi,
>>>> Prashanth, Rakesh, Yiqun Lin.
>>>> Sorry if I missed anyone here, but your efforts are much appreciated.
>>>> Without your tremendous help, we would not have reached this position
>>>> yet.
>>>>
>>>> To start with, here is my +1.
>>>>
>>>> The vote will run for 5 days.
>>>>
>>>> Regards,
>>>> Uma
>>>>
>>>> On Tue, Apr 5, 2022 at 10:58 PM Uma gangumalla <umamah...@apache.org> wrote:
>>>>
>>>>> Dear Ozone Devs,
>>>>>
>>>>> As you may know, we have been actively developing Ozone Erasure Coding
>>>>> support in a separate branch, HDDS-3816-ec.
>>>>>
>>>>> We have finished the development of EC key write and read
>>>>> functionality. Support for offline recovery (recovering replicas
>>>>> after node loss) will be part of the second phase of work.
>>>>>
>>>>> Since the code has grown and we are increasingly seeing merge
>>>>> complications, we would like to propose to merge the current EC
>>>>> branch into master.
>>>>>
>>>>> We filed a new JIRA (HDDS-6462) for the second phase of work and have
>>>>> continued the offline recovery work there.
>>>>>
>>>>> Details on Changes:
>>>>>
>>>>> - Most of the EC core logic went into newly extended classes.
>>>>>   Key changes went into the EC*OutputStream and EC*InputStream classes
>>>>>   for write and read respectively. Based on the replication type,
>>>>>   ECPipelineProvider will be chosen for creating EC pipelines.
>>>>>
>>>>> - Since we cannot represent EC replication in the existing
>>>>>   replication factor, we have introduced ECReplicationConfig. The
>>>>>   ReplicationConfig interface is already pushed to master, so it's
>>>>>   not a new idea coming through this branch merge. What is new here
>>>>>   is the ECReplicationConfig class, which can be used to express an
>>>>>   EC replication configuration.
>>>>>
>>>>> - We wanted to provide support for enabling EC at the bucket level.
>>>>>   To simplify some complications, we have moved the default
>>>>>   replication configurations from client to server.
>>>>>
>>>>> - The client-side replication type and replication factor were
>>>>>   removed from the configuration files, and we introduced
>>>>>   ozone.server.default.replication and
>>>>>   ozone.server.default.replication.type. We will continue to respect
>>>>>   a replication config that is set explicitly on the client side or
>>>>>   passed through the APIs; otherwise the server-side bucket-level
>>>>>   properties or the server-side default configuration take effect.
>>>>>
>>>>> - Other than this change, the rest of the EC code should not impact
>>>>>   any of the existing code flows.
>>>>>
>>>>> We have finished the documentation JIRA (HDDS-6172) covering this
>>>>> feature, and we will continue to improve it in master.
>>>>>
>>>>> Git Branch Name: HDDS-3816-ec
>>>>>
>>>>> JIRAs: HDDS-3816 and HDDS-5351
>>>>>
>>>>> Completed tasks: ~ 142
>>>>>
>>>>> + We are covering the following two mandatory JIRAs:
>>>>>
>>>>> 1.
>>>>>    HDDS-6209: EC: [Forward compatibility issue] New client to older
>>>>>    server could fail due to the unavailability for client default
>>>>>    replication config
>>>>>
>>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
>>>>>
>>>>> PR reviews are in progress and expected to close in a day or two.
>>>>>
>>>>> A few other JIRAs in HDDS-3816 are still open, but I believe they're
>>>>> not blockers for the merge.
>>>>>
>>>>> In short, what you can do now with this feature:
>>>>>
>>>>> - You can enable EC at bucket level and cluster level.
>>>>>
>>>>>   How to enable it at bucket level? Just create the bucket by passing
>>>>>   the EC replication options.
>>>>>
>>>>> - You can create EC keys and read them back.
>>>>>
>>>>> - You should be able to continue writing even when chosen nodes are
>>>>>   failing. (Of course, a minimum of Data+Parity live nodes should be
>>>>>   available in the cluster to complete the write.)
>>>>>
>>>>> - You should be able to read the file back even if a few nodes failed
>>>>>   in the same EC block group. (Failures should not exceed the parity
>>>>>   number of nodes.)
>>>>>
>>>>> What is pending? Offline recovery of lost/missing EC containers. As
>>>>> mentioned above, post merge of this branch, I will create a separate
>>>>> JIRA for starting the work for OfflineRecovery.
>>>>>
>>>>> Automated acceptance test cases have already been added (HDDS-6231).
>>>>>
>>>>> In addition to that, we have also performed basic acceptance testing
>>>>> in a physical cluster:
>>>>>
>>>>> 1. Installed a 10-node cluster and created an EC bucket (3:2).
>>>>>    Uploaded a 10GB key.
>>>>>    Downloaded the same key and checked the md5sum.
>>>>>
>>>>> 2. Uploaded an 8GB key.
>>>>>    Downloaded the same key and checked the md5sum.
>>>>>
>>>>> 3. Uploaded a 3MB key.
>>>>>    Downloaded the same and verified the md5sum.
>>>>>
>>>>> 4.
>>>>>    Changed the bucket to (6:3).
>>>>>    Uploaded an 8GB key.
>>>>>    Downloaded the same.
>>>>>
>>>>> Also verified that the new key uses the 6:3 policy and the old keys
>>>>> remain 3:2. Verified with several different size key writes and
>>>>> reads.
>>>>>
>>>>> Merge checklist items assessment is here:
>>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>>>>>
>>>>> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com> and Istvan
>>>>> Fajth <pi...@cloudera.com> for great efforts in core development, and
>>>>> thanks a lot to Sammi, Mingchao Zhao, Mark Gui, and Kaijie for
>>>>> collaborating on some of the EC tasks.
>>>>>
>>>>> Thanks to Marton for design discussions and for some dev tasks as
>>>>> well.
>>>>>
>>>>> Thanks to the many others who were involved in design discussions:
>>>>> Arpit, Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi,
>>>>> Prashanth, Rakesh, Yiqun Lin.
>>>>> Sorry if I missed anyone here, but your efforts are much appreciated.
>>>>> Without your tremendous help, we would not have reached this position
>>>>> yet.
>>>>>
>>>>> If there are no objections to the merge, I will start the official
>>>>> vote later.
>>>>>
>>>>> Regards,
>>>>>
>>>>> EC Branch Devs
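
For anyone skimming the thread, the replication-config precedence described in the proposal (an explicit client-side setting wins, then the bucket-level property, then the server-side default) can be sketched as below. This is a minimal Python illustration, not the Ozone implementation: resolve_replication and the string labels such as "EC 3:2" are hypothetical and are used only to show the order of the fallbacks.

```python
from typing import Optional


def resolve_replication(client_config: Optional[str],
                        bucket_config: Optional[str],
                        server_default: str) -> str:
    """Pick the effective replication config for a new key.

    Assumed order, taken from the description in the thread:
    1. a config set explicitly on the client side or passed via the API,
    2. the bucket-level property stored on the server,
    3. the server-side default configuration.
    """
    if client_config is not None:
        return client_config
    if bucket_config is not None:
        return bucket_config
    return server_default


if __name__ == "__main__":
    # The labels below are illustrative placeholders, not real config values.
    print(resolve_replication(None, "EC 3:2", "RATIS THREE"))
    print(resolve_replication("EC 6:3", "EC 3:2", "RATIS THREE"))
    print(resolve_replication(None, None, "RATIS THREE"))
```

The point of the fallback chain is that a client which sets nothing picks up whatever the bucket or the cluster is configured for.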