+1 to merge the EC feature with the read/write and online recovery parts
complete.

Thanks, Uma, for putting together the wiki page with all of the details.
Minor nit: The documentation link points to the wrong markdown (FSO)
instead of the EC documentation.

- Sid

On Thu, Feb 17, 2022 at 8:29 AM Arpit Agarwal <aagar...@cloudera.com.invalid>
wrote:

> Thanks for the detailed explanation.
>
> +1 to merge with HDDS-6209 addressed.
>
>
> > On Feb 16, 2022, at 10:49 AM, Uma gangumalla <umamah...@apache.org>
> > wrote:
> >
> > Thanks a lot Arpit for your feedback.
> >
> > [Arpit Wrote] - New client writing to old server with 3-way and 1-way
> > replication.
> > [Uma] As mentioned in the proposal mail, we have a forward
> > compatibility issue (HDDS-6209) because we removed the client-side
> > default configurations. Once that fix is in, this should work.
> > We will make sure to get it in before the merge.
> >
> > [Arpit Wrote] - Old client writing to new server in bucket without EC
> > policy [both 1-way and 3-way]
> > [Uma] Old clients always pass the replication config. Irrespective of
> > bucket policy, we respect the client-passed replication config, so this
> > is fine.
> >
> > [Arpit Wrote] - Old client writing to new server in bucket with EC policy
> > [both 1-way and 3-way]
> > [Uma] As mentioned above, old clients always pass non-EC replication
> > options while creating keys. Even when a call comes to a bucket with an
> > EC policy, we allow non-EC keys to be created in it.
> >
> > Also, a newer client writing keys with EC options to an old server would
> > be rejected. That should be covered as part of HDDS-6209. We are using a
> > client/server versioning mechanism to detect an old server that cannot
> > support EC.
> >
> > @Pifta, you may want to add your thoughts if any?
> >
> > Regards,
> > Uma
> >
> > On Wed, Feb 16, 2022 at 8:23 AM Arpit Agarwal
> > <aagar...@cloudera.com.invalid> wrote:
> >
> >> Thanks Uma for starting this discussion. Excited to see EC support for
> >> Ozone coming together at last.
> >>
> >> We should verify the compatibility matrix prior to merge:
> >>
> >> - New client writing to old server with 3-way and 1-way replication.
> >> - Old client writing to new server in bucket without EC policy [both
> >> 1-way and 3-way]
> >> - Old client writing to new server in bucket with EC policy [both 1-way
> >> and 3-way]
> >>
> >>
> >> Arpit
> >>
> >>
> >>> On Feb 15, 2022, at 12:17 AM, Uma gangumalla <umamah...@apache.org>
> >>> wrote:
> >>>
> >>> Dear Ozone Devs,
> >>>
> >>> As you may know, we have been actively developing Ozone Erasure Coding
> >>> support in a separate branch HDDS-3816-ec.
> >>>
> >>> We have finished the development of EC key write and read
> >>> functionality. Support for offline recovery (recovering replicas after
> >>> node loss) will be part of the second phase of work.
> >>>
> >>> Since the code has grown and we are increasingly seeing merge
> >>> complications, we would like to propose merging the current EC branch
> >>> into master.
> >>>
> >>> We will file a new JIRA for the second phase of work and continue the
> >>> offline recovery work there.
> >>>
> >>> Details on Changes:
> >>>
> >>>  - Most of the EC core logic went into newly extended classes. Key
> >>>    changes went into the EC*OutputStream and EC*InputStream classes for
> >>>    write and read respectively. Based on the replication type,
> >>>    ECPipelineProvider will be chosen for creating EC pipelines.
> >>>
> >>>
> >>>
> >>>  - Since we cannot represent EC replication with the existing
> >>>    replication factor, we have introduced ECReplicationConfig. The
> >>>    ReplicationConfig interface has already been pushed to master, so it
> >>>    is not a new idea coming in through this branch merge. What is new
> >>>    here is the ECReplicationConfig class, which can be used to express
> >>>    an EC replication configuration.
> >>>
> >>>
> >>>
> >>>  - We wanted to support enabling EC at the bucket level. To simplify
> >>>    some complications, we have moved the default replication
> >>>    configurations from client to server.
> >>>
> >>>
> >>>
> >>>  - The client-side replication type and replication factor were removed
> >>>    from the configuration files, and we introduced
> >>>    ozone.server.default.replication and
> >>>    ozone.server.default.replication.type. We continue to respect a
> >>>    replication config that is set explicitly on the client side or
> >>>    passed through APIs; otherwise the server-side bucket-level
> >>>    properties or the server-side default configuration take effect.
> >>>
> >>>
> >>>
> >>>  - Other than this change, the rest of the EC code should not impact
> >>>    any of the existing code flows.
> >>>
> >>>
> >>> We have finished the documentation JIRA (HDDS-6172) covering this
> >>> feature, and we will continue to improve it further in master.
> >>>
> >>> JIRA: HDDS-3816
> >>>
> >>> Completed tasks: ~ 90
> >>>
> >>> We wanted to cover the following compatibility issue before the merge:
> >>>
> >>> HDDS-6209: EC: [Forward compatibility issue] New client to older server
> >>> could fail due to the unavailability for client default replication
> >>> config
> >>>
> >>> A few other JIRAs in HDDS-3816 are still open, but I believe they're not
> >>> blockers for the merge.
> >>>
> >>> In short, what you can do now with this feature:
> >>>
> >>>  - You can enable EC at bucket level and cluster level.
> >>>    How to enable it at bucket level? Just create the bucket by passing
> >>>    the EC replication options.
> >>>  - You can create EC keys and read them back.
> >>>  - You should be able to continue writing even when chosen nodes are
> >>>    failing. (Of course, a minimum of Data+Parity live nodes must be
> >>>    available in the cluster to complete the write.)
> >>>  - You should be able to read the file back even if a few nodes have
> >>>    failed in the same EC block group. (The number of failures must not
> >>>    exceed the number of parity nodes.)
> >>>
> >>> What is pending? Offline recovery of lost/missing EC containers. As
> >>> mentioned above, after this branch is merged, I will create a separate
> >>> JIRA to start the work on offline recovery.
> >>>
> >>>
> >>> Automated acceptance test cases have already been added (HDDS-6231).
> >>>
> >>> In addition to that, we have also performed basic acceptance testing
> >>> on a physical cluster:
> >>>
> >>>  1. Installed a 10-node cluster and created an EC bucket (3:2).
> >>>     Uploaded a 10GB key.
> >>>     Downloaded the same key and checked the md5sum.
> >>>
> >>>  2. Uploaded an 8GB key.
> >>>     Downloaded the same key and checked the md5sum.
> >>>
> >>>  3. Uploaded a 3MB key.
> >>>     Downloaded the same key and verified the md5sum.
> >>>
> >>>  4. Changed the bucket to (6:3).
> >>>     Uploaded an 8GB key.
> >>>     Downloaded the same key.
> >>>     Also verified that the new key uses the 6:3 policy and the old keys
> >>>     remain 3:2.
> >>>
> >>>  5. Verified with key writes and reads of several different sizes.
> >>>
> >>>
> >>> Merge checklist items assessment is here:
> >>>
> >>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> >>>
> >>> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com> and Istvan
> >>> Fajth <pi...@cloudera.com> for their great efforts in core development,
> >>> and thanks a lot to Sammi, Mingchao Zhao, Mark Gui, and Kaijie for
> >>> collaborating on some of the EC tasks.
> >>>
> >>> Thanks to Marton for the design discussions and for some dev tasks as
> >>> well.
> >>>
> >>> Thanks to the many others who were involved in design discussions:
> >>> Arpit, Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi,
> >>> Prashanth, Rakesh, and Yiqun Lin.
> >>> Sorry if I missed anyone here; your efforts are much appreciated.
> >>> Without your tremendous help, we would not have reached this point yet.
> >>>
> >>> If there are no objections to the merge, I will start the official vote
> >>> later.
> >>>
> >>> Regards,
> >>>
> >>> EC Branch Devs
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> >> For additional commands, e-mail: dev-h...@ozone.apache.org
> >>
> >>
>
>
>
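
For readers skimming the thread, here is a rough sketch of two rules
discussed above: the precedence used to pick a key's replication (explicit
client setting, then bucket-level setting, then server default) and the
node-count conditions for EC writes and reads. It is illustrative Python
only, not Ozone code; the Replication class and all function names are
hypothetical, and it assumes the rules work exactly as worded in the
proposal mail.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Replication:
    """Hypothetical stand-in for a replication setting (not an Ozone class)."""
    kind: str    # e.g. "RATIS" or "EC"
    detail: str  # e.g. "THREE" or "3:2"


def resolve_replication(client: Optional[Replication],
                        bucket: Optional[Replication],
                        server_default: Replication) -> Replication:
    """Pick the effective replication for a new key.

    Assumed precedence, as worded in the thread: an explicit client-side
    setting wins, then the bucket-level setting, then the server default.
    """
    if client is not None:
        return client
    if bucket is not None:
        return bucket
    return server_default


def can_complete_write(live_nodes: int, data: int, parity: int) -> bool:
    """A write needs at least data + parity live nodes in the cluster."""
    return live_nodes >= data + parity


def can_read_block_group(failed_nodes: int, parity: int) -> bool:
    """A block group stays readable while failures do not exceed parity."""
    return failed_nodes <= parity


if __name__ == "__main__":
    ratis_three = Replication("RATIS", "THREE")
    ec_3_2 = Replication("EC", "3:2")
    # Old client: always sends its own non-EC config, even to an EC bucket.
    print(resolve_replication(ratis_three, ec_3_2, ratis_three))
    # Client with no explicit config writing to an EC bucket.
    print(resolve_replication(None, ec_3_2, ratis_three))
    # 10-node cluster with a 6:3 policy: write and read feasibility checks.
    print(can_complete_write(10, 6, 3), can_read_block_group(3, 3))
```

This matches the compatibility answers in the thread: an old client that
always passes a non-EC config gets a non-EC key even in an EC bucket, while
a client that passes nothing inherits the bucket or server setting.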
