+1 to merge the EC feature with read/write and online recovery part complete.
Thanks, Uma, for putting together the wiki page with all of the details. Minor nit: the documentation link points to the wrong markdown (FSO) instead of the EC documentation.

- Sid

On Thu, Feb 17, 2022 at 8:29 AM Arpit Agarwal <aagar...@cloudera.com.invalid> wrote:

> Thanks for the detailed explanation.
>
> +1 to merge with HDDS-6209 addressed.
>
>
> > On Feb 16, 2022, at 10:49 AM, Uma gangumalla <umamah...@apache.org> wrote:
> >
> > Thanks a lot Arpit for your feedback.
> >
> > [Arpit Wrote] - New client writing to old server with 3-way and 1-way
> > replication.
> > [Uma] As mentioned in the proposal mail, we have a forward
> > compatibility issue (HDDS-6209) as we have removed the client side default
> > configurations. Once that is in, this should work.
> > We will make sure to get this in before merge.
> >
> > [Arpit Wrote] - Old client writing to new server in bucket without EC
> > policy [both 1-way and 3-way]
> > [Uma] Old clients always passed the replication configs. Irrespective of
> > bucket policy, we respect the client-passed replication config, so this is
> > fine.
> >
> > [Arpit Wrote] - Old client writing to new server in bucket with EC policy
> > [both 1-way and 3-way]
> > [Uma] As mentioned above, old clients always passed non-EC replication
> > options while creating keys. Even when a call comes to the EC policy
> > bucket, we allow non-EC keys to be created on EC buckets.
> >
> > Also, a newer client writing EC option keys to an old server would be
> > rejected. That should be covered as part of HDDS-6209. We are using a
> > server/client versioning mechanism to detect an old server that cannot
> > support EC.
> >
> > @Pifta, you may want to add your thoughts if any?
> >
> > Regards,
> > Uma
> >
> > On Wed, Feb 16, 2022 at 8:23 AM Arpit Agarwal <aagar...@cloudera.com.invalid> wrote:
> >
> >> Thanks Uma for starting this discussion. Excited to see EC support for
> >> Ozone coming together at last.
> >>
> >> We should verify the compatibility matrix prior to merge:
> >>
> >> - New client writing to old server with 3-way and 1-way replication.
> >> - Old client writing to new server in bucket without EC policy [both 1-way
> >>   and 3-way]
> >> - Old client writing to new server in bucket with EC policy [both 1-way
> >>   and 3-way]
> >>
> >>
> >> Arpit
> >>
> >>
> >>> On Feb 15, 2022, at 12:17 AM, Uma gangumalla <umamah...@apache.org> wrote:
> >>>
> >>> Dear Ozone Devs,
> >>>
> >>> As you may know, we have been actively developing Ozone Erasure Coding
> >>> support in a separate branch HDDS-3816-ec.
> >>>
> >>> We have finished the development of EC key write and read functionality.
> >>> Support for offline recovery (recovering replicas after node loss) will be
> >>> part of the second phase of work.
> >>>
> >>> Since the code has already grown and we have increasingly started seeing
> >>> merge complications, we would like to propose merging the current EC branch
> >>> into master.
> >>>
> >>> We will file a new JIRA for the second phase of work and continue the
> >>> offline recovery work there.
> >>>
> >>> Details on Changes:
> >>>
> >>> - Most of the EC core logic went to newly extended classes. Key changes
> >>>   went into EC*OutputStream and EC*InputStream classes for write and read
> >>>   respectively. Based on replication type, ECPipelineProvider will be chosen
> >>>   for creating EC pipelines.
> >>>
> >>> - Since we cannot represent the EC replication in the existing replication
> >>>   factor, we have introduced ECReplicationConfig. The ReplicationConfig
> >>>   interface is already pushed to master, so it's not a new idea coming
> >>>   through this branch merge now. What is newly coming here is the
> >>>   ECReplicationConfig class which can be used to express EC replication
> >>>   configuration.
> >>>
> >>> - We wanted to provide the support to enable EC at bucket level.
> >>>   To simplify some complications, we have moved the default replication
> >>>   configurations from client to server.
> >>>
> >>> - Client side replication type and replication factor were removed from the
> >>>   configuration files, and we introduced ozone.server.default.replication
> >>>   and ozone.server.default.replication.type. We would continue to respect
> >>>   a replication config that is configured explicitly at the client side or
> >>>   passed through APIs; otherwise server side bucket level properties or the
> >>>   server side default configuration would take effect.
> >>>
> >>> - Other than this change, the rest of the EC side code should not impact any
> >>>   of the existing code flows.
> >>>
> >>>
> >>> We have finished the documentation JIRA (HDDS-6172) covering this feature
> >>> and we will continue to improve it further in master.
> >>>
> >>> JIRA: HDDS-3816
> >>>
> >>> Completed tasks: ~ 90
> >>>
> >>> We wanted to cover the following compatibility issue before the merge:
> >>>
> >>> HDDS-6209: EC: [Forward compatibility issue] New client to older server
> >>> could fail due to the unavailability for client default replication config
> >>>
> >>> A few other JIRAs in HDDS-3816 are still open but I believe they're not
> >>> blockers for merge.
> >>>
> >>> In short, what you can do now with this feature:
> >>>
> >>> - You can enable EC at bucket level and cluster level.
> >>>
> >>>   How to enable it at bucket level? Just create the bucket by passing the
> >>>   EC replication options.
> >>>
> >>> - You can create EC keys and read the same back.
> >>>
> >>> - You should be able to continue writing even when chosen nodes are
> >>>   failing.
> >>>   (Of course, a minimum of Data+Parity live nodes should be available in
> >>>   the cluster to complete the write.)
> >>>
> >>> - You should be able to read the file back even if a few nodes failed in
> >>>   the same EC block group. (Failures should not be more than the parity
> >>>   number of nodes.)
> >>>
> >>> What is pending? Offline recovery of lost/missing EC containers. As
> >>> mentioned above, post merge of this branch, I will create a separate JIRA
> >>> for starting the work for OfflineRecovery.
> >>>
> >>>
> >>> There are automated acceptance test cases already added. HDDS-6231
> >>>
> >>> In addition to that, we have also performed basic acceptance testing in
> >>> a physical cluster:
> >>>
> >>> 1. Installed a 10-node cluster and created an EC bucket (3:2).
> >>>    Uploaded a 10GB key.
> >>>    Downloaded the same key and checked the md5sum.
> >>>
> >>> 2. Uploaded an 8GB key.
> >>>    Downloaded the same key and checked the md5sum.
> >>>
> >>> 3. Uploaded a 3MB key.
> >>>    Downloaded the same and verified the md5sum.
> >>>
> >>> 4. Changed the bucket to (6:3).
> >>>    Uploaded an 8GB key.
> >>>    Downloaded the same.
> >>>    Also verified that the new key uses the 6:3 policy and old keys remain 3:2.
> >>>
> >>> 5. Verified with several different size key writes and reads.
> >>>
> >>>
> >>> Merge checklist items assessment is here:
> >>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> >>>
> >>> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com>, Istvan Fajth
> >>> <pi...@cloudera.com> for great efforts in core development and also thanks
> >>> a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie for collaborating on some
> >>> of the EC tasks.
> >>>
> >>> Thanks to Marton for design discussion and on some dev tasks as well.
> >>>
> >>> Thanks to many others who were involved in design discussions: Arpit, Sidd,
> >>> Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth, Rakesh,
> >>> Yiqun Lin.
> >>> Sorry if I missed anyone here, but your efforts are much appreciated. Without
> >>> your tremendous help, we would not have reached this position yet.
> >>>
> >>> If there are no objections to the merge, I will start the official vote
> >>> later.
> >>>
> >>> Regards,
> >>>
> >>> EC Branch Devs
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> >> For additional commands, e-mail: dev-h...@ozone.apache.org
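
For readers following the compatibility discussion, the precedence described in the thread (an explicitly passed client setting wins, then the bucket-level setting, then the server default) and the read-tolerance rule (failures must not exceed the parity count) can be sketched roughly as below. This is a minimal Python illustration under stated assumptions, not Ozone code: `ReplicationChoice`, `resolve_replication`, `can_read`, and the label strings such as "three-way" and "ec-3-2" are hypothetical names chosen only for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReplicationChoice:
    """Hypothetical record of which replication setting was picked and why."""
    value: str   # opaque label, e.g. "three-way" or "ec-3-2" (illustrative only)
    source: str  # "client", "bucket", or "server-default"


def resolve_replication(client: Optional[str],
                        bucket: Optional[str],
                        server_default: str) -> ReplicationChoice:
    """Pick a replication setting using the precedence described in the thread.

    1. A setting passed explicitly by the client is always respected.
    2. Otherwise the bucket-level setting applies, if the bucket has one.
    3. Otherwise the server-side default applies.
    """
    if client is not None:
        return ReplicationChoice(client, "client")
    if bucket is not None:
        return ReplicationChoice(bucket, "bucket")
    return ReplicationChoice(server_default, "server-default")


def can_read(data: int, parity: int, failed: int) -> bool:
    """Return True if a block group with `failed` lost nodes is still readable.

    Assumes the usual erasure-coding rule: at least `data` of the
    `data + parity` nodes must survive, i.e. failures <= parity.
    """
    if data <= 0 or parity < 0 or not 0 <= failed <= data + parity:
        raise ValueError("invalid EC parameters")
    surviving = data + parity - failed
    return surviving >= data


if __name__ == "__main__":
    # Old client on an EC bucket: it always passes a non-EC setting, which wins.
    print(resolve_replication("three-way", "ec-3-2", "three-way"))
    # Client that passes nothing on an EC bucket: the bucket setting applies.
    print(resolve_replication(None, "ec-3-2", "three-way"))
    # A 3:2 block group tolerates two failed nodes but not three.
    print(can_read(3, 2, 2), can_read(3, 2, 3))
```

The first case mirrors the "old client writing to a bucket with EC policy" row of the compatibility matrix: the key ends up non-EC because the client-passed setting takes precedence.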