Great news!
+1 to merge.



At 2022-04-06 22:18:31, "Stephen O'Donnell" <sodonn...@cloudera.com.INVALID> wrote:
>I have been working on the code on this branch for some time, and I believe
>it is in a good state to merge now. It is mostly new code, and if nothing
>attempts to use EC, none of the EC code paths will be executed.
>
>+1 to merge from me.
>
>Stephen.
>
>On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <umamah...@apache.org> wrote:
>
>> =====Few Edits Below===================
>>
>> Dear Ozone Devs,
>>
>> As you may know, we have been actively developing Ozone Erasure Coding
>> support in a separate branch HDDS-3816-ec.
>>
>> We have finished the development of EC key write and read functionality.
>> Support for offline recovery (recovering replicas after node loss) will be
>> part of the second phase of work.
>>
>> Since the branch has grown and merges are becoming increasingly
>> complicated, we would like to merge the current EC branch into master.
>>
>> We filed a new JIRA (HDDS-6462) for the second phase of work and are
>> continuing the offline recovery work there. (We have uploaded the design
>> doc there.)
>>
>> Details on Changes:
>>
>>    - Most of the EC core logic went into newly extended classes. The key
>>      changes went into the EC*OutputStream and EC*InputStream classes for
>>      write and read respectively. Based on the replication type,
>>      ECPipelineProvider will be chosen for creating EC pipelines.
>>
>>    - Since we cannot represent EC replication with the existing
>>      replication factor, we have introduced ECReplicationConfig. The
>>      ReplicationConfig interface is already in master, so it is not a new
>>      idea coming in with this branch merge. What is new here is the
>>      ECReplicationConfig class, which can be used to express an EC
>>      replication configuration.
>>
>>    - We wanted to support enabling EC at the bucket level. To simplify
>>      some complications, we have moved the default replication
>>      configurations from the client to the server.
>>
>>    - The client-side replication type and replication factor were removed
>>      from the configuration files, and ozone.server.default.replication
>>      and ozone.server.default.replication.type were introduced. We
>>      continue to respect values that are configured explicitly on the
>>      client side or passed through APIs; otherwise the server-side
>>      bucket-level properties or the server-side default configuration
>>      take effect.
>>
>>    - Other than this change, the rest of the EC code should not impact any
>>      of the existing code flows.
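
The fallback order described above can be illustrated with a minimal Java sketch. The class, method, and string labels are hypothetical and are not Ozone's actual API. The sketch also assumes that a bucket-level setting wins over the server-side default, which the text implies but does not spell out.

```java
import java.util.Optional;

/** Hypothetical illustration of the fallback order; not an Ozone class. */
final class ReplicationResolver {
    private ReplicationResolver() {}

    /**
     * Returns the effective replication setting:
     * the explicit client value if present, else the bucket-level value,
     * else the server-side default (assumed order, see the note above).
     */
    static String resolve(Optional<String> clientExplicit,
                          Optional<String> bucketLevel,
                          String serverDefault) {
        if (clientExplicit.isPresent()) {
            return clientExplicit.get();
        }
        if (bucketLevel.isPresent()) {
            return bucketLevel.get();
        }
        return serverDefault;
    }

    public static void main(String[] args) {
        // No client value, bucket configured for EC: the bucket setting applies.
        System.out.println(
            resolve(Optional.empty(), Optional.of("EC 3:2"), "server default"));
    }
}
```

Keeping the server default as a plain (non-optional) argument reflects that there is always a last-resort value once the defaults live on the server side.
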
>>
>>
>> We have finished the documentation JIRA (HDDS-6172) covering this feature,
>> and we will continue to improve the documentation in master.
>>
>> Git Branch Name : HDDS-3816-ec
>>
>> JIRAs: HDDS-3816 and HDDS-5351
>>
>> Completed tasks: ~ 142
>>
>> + The following two mandatory JIRAs still need to come in:
>>
>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older server
>> could fail due to the unavailability for client default replication config
>>
>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
>>
>> PR reviews are in progress and expected to close in a day or two.
>>
>> A few other JIRAs in HDDS-3816 are still open, but I believe they are not
>> blockers for the merge.
>>
>> In short, what you can do now with this feature:
>>
>>    - You can enable EC at the bucket level and the cluster level.
>>      To enable it at the bucket level, just create the bucket, passing
>>      the EC replication options.
>>
>>    - You can create EC keys and read them back.
>>
>>    - You should be able to continue writing even when chosen nodes are
>>      failing. (Of course, at least Data+Parity live nodes must be
>>      available in the cluster to complete the write.)
>>
>>    - You should be able to read the file back even if a few nodes in
>>      the same EC block group have failed. (The number of failures must
>>      not exceed the number of parity nodes.)
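
The two conditions above (enough live nodes to finish a write, and no more failures than parity nodes on read) can be written down as a small Java sketch. The class and method names are hypothetical. This only restates the arithmetic from the list and does not reflect how Ozone actually tracks node state.

```java
/** Hypothetical helper restating the write and read conditions; not an Ozone class. */
final class EcTolerance {
    private EcTolerance() {}

    /** A write can complete only if at least data + parity live nodes are available. */
    static boolean canCompleteWrite(int data, int parity, int liveNodes) {
        return liveNodes >= data + parity;
    }

    /** A block group stays readable while failures do not exceed the parity count. */
    static boolean canRead(int parity, int failedNodesInGroup) {
        return failedNodesInGroup >= 0 && failedNodesInGroup <= parity;
    }

    public static void main(String[] args) {
        // EC 3:2 on a cluster with 10 live nodes: 10 >= 5, so the write can complete.
        System.out.println(canCompleteWrite(3, 2, 10));
        // Three failed nodes in a 3:2 block group exceed the two parity nodes.
        System.out.println(canRead(2, 3));
    }
}
```
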
>>
>> What is pending? Offline recovery of lost/missing EC containers. As
>> mentioned above, post merge of this branch, I will create a separate JIRA
>> for starting the work for OfflineRecovery.
>>
>>
>> Automated acceptance test cases have already been added (HDDS-6231).
>>
>> In addition to that, we have also performed basic acceptance testing on a
>> physical cluster:
>>
>>    1. Installed a 10-node cluster and created an EC bucket (3:2).
>>       Uploaded a 10GB key.
>>       Downloaded the same key and checked the md5sum.
>>
>>    2. Uploaded an 8GB key.
>>       Downloaded the same key and checked the md5sum.
>>
>>    3. Uploaded a 3MB key.
>>       Downloaded the same key and verified the md5sum.
>>
>>    4. Changed the bucket to (6:3).
>>       Uploaded an 8GB key.
>>       Downloaded the same key.
>>
>> Also verified that the new key uses the 6:3 policy and the old keys remain
>> 3:2. Verified with key writes and reads of several different sizes.
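
For anyone who wants to repeat the round-trip check, the md5sum comparison can be sketched in Java as below. The class and method names are hypothetical, and the sketch hashes in-memory byte arrays for brevity. For multi-gigabyte keys like the ones above, the same digest would have to be fed from a stream in chunks rather than from a single array.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical round-trip check: compare MD5 digests of uploaded and downloaded bytes. */
final class Md5Check {
    private Md5Check() {}

    /** Returns the MD5 digest of the data as a lowercase hex string. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(data);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to an unsigned value so each byte prints as two hex digits.
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** True when both byte arrays hash to the same MD5 value. */
    static boolean sameContent(byte[] uploaded, byte[] downloaded) {
        return md5Hex(uploaded).equals(md5Hex(downloaded));
    }

    public static void main(String[] args) {
        byte[] original = "example key content".getBytes(StandardCharsets.UTF_8);
        byte[] readBack = "example key content".getBytes(StandardCharsets.UTF_8);
        System.out.println(sameContent(original, readBack));
    }
}
```
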
>>
>>
>>
>> Since the merge discussion thread, we have stabilized the code and fixed
>> several bugs.
>>
>>
>> Merge checklist items assessment is here:
>>
>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>>
>> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com>, Istvan Fajth
>> <pi...@cloudera.com> for great efforts in core development and also thanks
>> a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for collaborating
>> on some of the EC tasks.
>>
>> Thanks to Marton for design discussions and for some dev tasks as well.
>>
>> Thanks to the many others who were involved in design discussions: Arpit,
>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth,
>> Rakesh, Yiqun Lin.
>> Sorry if I missed anyone here, but your efforts are much appreciated.
>> Without your tremendous help, we would not have reached this position yet.
>>
>>
>>
>> To start with, here is my +1
>>
>> The vote will run for 5 days.
>>
>> Regards,
>> Uma
>>
