I have just fixed the broken links in the wiki. Ethan also pointed these out in an offline chat. Thanks.
On Thu, Feb 17, 2022 at 11:40 AM Uma gangumalla <umamah...@apache.org> wrote:

> Thanks Sidd for pointing that out. Yes, I used the ref doc from FSO; my bad, I think some links were left out. I will fix it.
> Thanks Arpit.
>
> Others, please provide your feedback if any; otherwise we will move to the official vote soon after we track down the compat-related issues.
>
> Regards,
> Uma
>
> On Thu, Feb 17, 2022 at 8:48 AM Siddharth Wagle <swa...@apache.org> wrote:
>
>> +1 to merge the EC feature with read/write and online recovery part complete.
>>
>> Thanks, Uma, for putting together the wiki page with all of the details.
>> Minor nit: The documentation link points to the wrong markdown (FSO) instead of the EC documentation.
>>
>> - Sid
>>
>> On Thu, Feb 17, 2022 at 8:29 AM Arpit Agarwal <aagar...@cloudera.com.invalid> wrote:
>>
>> > Thanks for the detailed explanation.
>> >
>> > +1 to merge with HDDS-6209 addressed.
>> >
>> > > On Feb 16, 2022, at 10:49 AM, Uma gangumalla <umamah...@apache.org> wrote:
>> > >
>> > > Thanks a lot Arpit for your feedback.
>> > >
>> > > [Arpit Wrote] - New client writing to old server with 3-way and 1-way replication.
>> > > [Uma] As mentioned in the proposal mail, we have a forward compatibility issue (HDDS-6209) because we removed the client-side default configurations. Once that is in, this should work. We will make sure to get this in before the merge.
>> > >
>> > > [Arpit Wrote] - Old client writing to new server in bucket without EC policy [both 1-way and 3-way]
>> > > [Uma] Old clients always passed the replication configs. Irrespective of the bucket policy, we respect the client-passed replication config, so this is fine.
>> > >
>> > > [Arpit Wrote] - Old client writing to new server in bucket with EC policy [both 1-way and 3-way]
>> > > [Uma] As mentioned above, old clients always passed non-EC replication options while creating keys. Even when a call comes to a bucket with an EC policy, we allow non-EC keys to be created in EC buckets.
>> > >
>> > > Also, a newer client writing keys with EC options to an old server would be rejected. That should be covered as part of HDDS-6209. We are using a server/client versioning mechanism to detect an old server that cannot support EC.
>> > >
>> > > @Pifta, you may want to add your thoughts if any?
>> > >
>> > > Regards,
>> > > Uma
>> > >
>> > > On Wed, Feb 16, 2022 at 8:23 AM Arpit Agarwal <aagar...@cloudera.com.invalid> wrote:
>> > >
>> > >> Thanks Uma for starting this discussion. Excited to see EC support for Ozone coming together at last.
>> > >>
>> > >> We should verify the compatibility matrix prior to merge:
>> > >>
>> > >> - New client writing to old server with 3-way and 1-way replication.
>> > >> - Old client writing to new server in bucket without EC policy [both 1-way and 3-way]
>> > >> - Old client writing to new server in bucket with EC policy [both 1-way and 3-way]
>> > >>
>> > >> Arpit
>> > >>
>> > >>> On Feb 15, 2022, at 12:17 AM, Uma gangumalla <umamah...@apache.org> wrote:
>> > >>>
>> > >>> Dear Ozone Devs,
>> > >>>
>> > >>> As you may know, we have been actively developing Ozone Erasure Coding support in a separate branch, HDDS-3816-ec.
>> > >>>
>> > >>> We have finished the development of EC key write and read functionality. Support for offline recovery (recovering replicas after node loss) will be part of the second phase of work.
>> > >>>
>> > >>> Since the code has grown and we are increasingly seeing merge complications, we would like to propose merging the current EC branch into master.
>> > >>>
>> > >>> We will file a new JIRA for the second phase of work and continue the offline recovery work there.
>> > >>>
>> > >>> Details on Changes:
>> > >>>
>> > >>> - Most of the EC core logic went into newly extended classes. Key changes went into the EC*OutputStream and EC*InputStream classes for write and read respectively. Based on the replication type, ECPipelineProvider will be chosen for creating EC pipelines.
>> > >>>
>> > >>> - Since we cannot represent EC replication in the existing replication factor, we have introduced ECReplicationConfig. The ReplicationConfig interface is already pushed to master, so it’s not a new idea coming through this branch merge. What is new here is the ECReplicationConfig class, which can be used to express an EC replication configuration.
>> > >>>
>> > >>> - We wanted to provide support for enabling EC at the bucket level. To simplify some complications, we have moved the default replication configurations from client to server.
>> > >>>
>> > >>> - The client-side replication type and replication factor were removed from the configuration files, and ozone.server.default.replication and ozone.server.default.replication.type were introduced. We continue to respect a replication config that is set explicitly on the client side or passed through APIs; otherwise the server-side bucket-level properties or the server-side default configuration take effect.
>> > >>>
>> > >>> - Other than this change, the rest of the EC code should not impact any of the existing code flows.
>> > >>>
>> > >>> We have finished the documentation JIRA (HDDS-6172) covering this feature, and we will continue to improve it further in master.
>> > >>>
>> > >>> JIRA: HDDS-3816
>> > >>>
>> > >>> Completed tasks: ~ 90
>> > >>>
>> > >>> We wanted to cover the following compatibility issue before the merge:
>> > >>>
>> > >>> HDDS-6209: EC: [Forward compatibility issue] New client to older server could fail due to the unavailability for client default replication config
>> > >>>
>> > >>> A few other JIRAs in HDDS-3816 are still open, but I believe they're not blockers for the merge.
>> > >>>
>> > >>> In short, what you can do now with this feature:
>> > >>>
>> > >>> - You can enable EC at bucket level and cluster level.
>> > >>>   How to enable it at bucket level? Just create the bucket by passing the EC replication options.
>> > >>>
>> > >>> - You can create EC keys and read the same back.
>> > >>>
>> > >>> - You should be able to continue writing even when chosen nodes are failing. (Of course, a minimum of Data+Parity live nodes must be available in the cluster to complete the write.)
>> > >>>
>> > >>> - You should be able to read the file back even if a few nodes failed in the same EC block group. (Failures should not exceed the parity number of nodes.)
>> > >>>
>> > >>> What is pending? Offline recovery of lost/missing EC containers. As mentioned above, after this branch is merged, I will create a separate JIRA for starting the work on OfflineRecovery.
>> > >>>
>> > >>> Automated acceptance test cases have already been added (HDDS-6231).
>> > >>>
>> > >>> In addition to that, we have also performed basic acceptance testing on a physical cluster:
>> > >>>
>> > >>> 1. Installed a 10-node cluster and created an EC bucket (3:2).
>> > >>>    Uploaded a 10GB key.
>> > >>>    Downloaded the same key and checked the md5sum.
>> > >>>
>> > >>> 2. Uploaded an 8GB key.
>> > >>>    Downloaded the same key and checked the md5sum.
>> > >>>
>> > >>> 3. Uploaded a 3MB key.
>> > >>>    Downloaded the same and verified the md5sum.
>> > >>>
>> > >>> 4. Changed the bucket to (6:3).
>> > >>>    Uploaded an 8GB key.
>> > >>>    Downloaded the same.
>> > >>>    Also verified that the new key is in the 6:3 policy and the old keys are still 3:2.
>> > >>>
>> > >>> 5. Verified with several different size key writes and reads.
>> > >>>
>> > >>> Merge checklist items assessment is here:
>> > >>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>> > >>>
>> > >>> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com> and Istvan Fajth <pi...@cloudera.com> for great efforts in core development, and also thanks a lot to Sammi, Mingchao Zhao, Mark Gui, and Kaijie for collaborating on some of the EC tasks.
>> > >>>
>> > >>> Thanks to Marton for the design discussions and for some dev tasks as well.
>> > >>>
>> > >>> Thanks to many others who were involved in design discussions: Arpit, Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth, Rakesh, Yiqun Lin.
>> > >>> Sorry if I missed anyone here, but your efforts are much appreciated. Without your tremendous help, we would not have reached this position yet.
>> > >>>
>> > >>> If there are no objections to the merge, I will start the official vote later.
>> > >>>
>> > >>> Regards,
>> > >>>
>> > >>> EC Branch Devs
>> > >>
>> > >> ---------------------------------------------------------------------
>> > >> To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
>> > >> For additional commands, e-mail: dev-h...@ozone.apache.org
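
The replication precedence described in the proposal above (an explicit client config first, then the bucket-level policy, then the server default) and the read-tolerance rule for an EC block group can be illustrated with a minimal sketch. This is an assumption-laden illustration in Python, not Ozone code: the names `Replication`, `resolve_replication`, and `readable` are hypothetical, and the real logic lives in the Ozone Java sources.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Replication:
    """Hypothetical stand-in for a replication setting (not an Ozone class)."""
    kind: str    # illustrative label, e.g. "RATIS" or "EC"
    detail: str  # illustrative value, e.g. "THREE" or "3:2"


def resolve_replication(client: Optional[Replication],
                        bucket: Optional[Replication],
                        server_default: Replication) -> Replication:
    """Pick the effective replication for a new key.

    Order taken from the thread: a client-supplied config wins, then the
    bucket-level policy, then the server-side default.
    """
    if client is not None:
        return client
    if bucket is not None:
        return bucket
    return server_default


def readable(parity: int, failed_nodes: int) -> bool:
    """A block group stays readable while failures do not exceed the parity count."""
    return 0 <= failed_nodes <= parity


if __name__ == "__main__":
    ec_3_2 = Replication("EC", "3:2")
    ratis_3 = Replication("RATIS", "THREE")
    # An old client always passes a non-EC config, so it wins even on an EC bucket.
    print(resolve_replication(ratis_3, ec_3_2, ratis_3))
    # A client that passes nothing picks up the bucket's EC policy.
    print(resolve_replication(None, ec_3_2, ratis_3))
    # With 2 parity nodes, 2 failures are tolerable and 3 are not.
    print(readable(2, 2), readable(2, 3))
```

This mirrors the compatibility answers in the thread: an old client writing to an EC bucket still creates non-EC keys because its explicitly passed config takes precedence.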