+1 for the idea.

On Nov 3, 2015 07:22, "Zheng, Kai" <kai.zh...@intel.com> wrote:
> Sounds good to me. Once it is decided to include EC in the 2.9 release, it would be good to have a rough release date, as Zhe asked, so that the scope of EC can be worked out accordingly. We still have quite a few Phase I follow-on tasks to do before EC can be deployed in a production system. Phase II, to develop non-striping EC for cold data, would possibly be started after that. Depending on the rough release date, we might consider including only Phase I and leaving Phase II for the next release.
>
> Regards,
> Kai
>
> -----Original Message-----
> From: Gangumalla, Uma [mailto:uma.ganguma...@intel.com]
> Sent: Tuesday, November 03, 2015 5:41 AM
> To: hdfs-dev@hadoop.apache.org
> Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]
>
> +1 for EC to go into 2.9. Yes, 3.x would be a long way off when we plan to have 2.8 and 2.9 releases.
>
> Regards,
> Uma
>
> On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vino...@hortonworks.com> wrote:
>
> >Forking the thread. Started looking at the 2.8 list, various features' status and arrived here.
> >
> >While I understand the pervasive nature of EC and a need for a significant bake-in, moving this to a 3.x release is not a good idea. We will surely get a 2.8 out this year and, as needed, I can even spend time getting started on a 2.9. OTOH, 3.x is a long way off, and given all the incompatibilities there, it would be a while before users can get their hands on EC if it were to be only on 3.x. At best, this may force sites that want EC to backport the entire EC feature to older releases; at worst, this will repeat the mess of the 0.20 security release forks.
> >
> >If we think adding this to 2.8 (even if it is switched off) is too much risk per our original plan, let's move this to 2.9, thereby leaving enough time for stability, integration testing and bake-in, and a realistic chance of having it end up on users' clusters soonish.
> >
> >+Vinod
> >
> >> On Oct 19, 2015, at 1:44 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:
> >>
> >> I think our plan thus far has been to target this for 3.0. I'm okay with putting it in branch-2 if we've given a hard look at compatibility, but I'll note, though, that 2.8 is already looking like quite a large release, and our release bandwidth has been focused on the 2.6 and 2.7 maintenance releases. Adding another few hundred JIRAs to 2.8 might make it too unwieldy to get out the door. If we bump EC past that, 3.0 might very well be our next release vehicle. I do plan to revive the 3.0 schedule some time next year. With EC and JDK8 in a good spot, the only big feature remaining is classpath isolation.
> >>
> >> EC is also a pretty fundamental change to HDFS. Even if it's compatible, in terms of size and impact it might best belong in a new major release.
> >>
> >> Best,
> >> Andrew
> >>
> >> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <vinayakumarb.apa...@gmail.com> wrote:
> >>
> >>> Does anyone else also think that the feature is ready to go to branch-2 as well?
> >>>
> >>> It has been more than 2 weeks since EC landed on trunk. IMO it looks quite stable since then and ready to go into branch-2.
> >>>
> >>> -Vinay
> >>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zhezh...@cloudera.com> wrote:
> >>>
> >>>> Thanks Vinay for capturing the issue and Uma for offering the help.
> >>>>
> >>>> ---
> >>>> Zhe Zhang
> >>>>
> >>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <uma.ganguma...@intel.com> wrote:
> >>>>
> >>>>> Vinay,
> >>>>>
> >>>>> I would merge them as part of HDFS-9182.
> >>>>>
> >>>>> Thanks,
> >>>>> Uma
> >>>>>
> >>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vinayakum...@apache.org> wrote:
> >>>>>
> >>>>>> Hi Andrew,
> >>>>>> I see CHANGES.txt entries not yet merged from CHANGES-HDFS-EC-7285.txt.
> >>>>>>
> >>>>>> Was this intentional?
> >>>>>>
> >>>>>> Regards,
> >>>>>> Vinay
> >>>>>>
> >>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:
> >>>>>>
> >>>>>>> Branch has been merged to trunk, thanks again to everyone who worked on the feature!
> >>>>>>>
> >>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <zhezh...@cloudera.com> wrote:
> >>>>>>>
> >>>>>>>> Thanks everyone who has participated in this discussion.
> >>>>>>>>
> >>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote has passed. I will do a final 'git merge' with trunk and work with Andrew to merge the branch to trunk. I'll update on this thread when the merge is done.
> >>>>>>>>
> >>>>>>>> ---
> >>>>>>>> Zhe Zhang
> >>>>>>>>
> >>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi.a....@intel.com> wrote:
> >>>>>>>>
> >>>>>>>>> (Change it to binding.)
> >>>>>>>>>
> >>>>>>>>> +1
> >>>>>>>>> I have been involved in the development and code review on the feature branch. It's a great feature and I think it's ready to be merged into trunk.
> >>>>>>>>>
> >>>>>>>>> Thanks all for the contribution.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Yi Liu
> >>>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Liu, Yi A
> >>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
> >>>>>>>>> To: hdfs-dev@hadoop.apache.org
> >>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> >>>>>>>>>
> >>>>>>>>> +1 (non-binding)
> >>>>>>>>> I have been involved in the development and code review on the feature branch. It's a great feature and I think it's ready to be merged into trunk.
> >>>>>>>>>
> >>>>>>>>> Thanks all for the contribution.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Yi Liu
> >>>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Vinayakumar B [mailto:vinayakum...@apache.org]
> >>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
> >>>>>>>>> To: hdfs-dev@hadoop.apache.org
> >>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> >>>>>>>>>
> >>>>>>>>> +1,
> >>>>>>>>>
> >>>>>>>>> I've been involved from the start of the design and development of ErasureCoding. I think phase 1 of this development is ready to be merged to trunk. It has come a long way to the current state with significant effort from many Contributors and Reviewers for both design and code.
> >>>>>>>>>
> >>>>>>>>> Thanks Everyone for the efforts.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Vinay
> >>>>>>>>>
> >>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org> wrote:
> >>>>>>>>>
> >>>>>>>>>> +1
> >>>>>>>>>>
> >>>>>>>>>> I've been involved in both development and review on the branch, and I believe it's now ready to get merged into trunk. Many thanks to all the contributors and reviewers!
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> -Jing
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <kai.zh...@intel.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Non-binding +1
> >>>>>>>>>>>
> >>>>>>>>>>> According to our extensive performance tests, erasure coding based on striping and the ISA-L coder can not only save storage but also increase the throughput of a client or a cluster. It will be a great addition to HDFS and its users. Based on the latest branch code, we also observed that it is very reliable in the concurrent tests. We'll provide the perf test report after it's sorted out and hope it helps.
> >>>>>>>>>>> Thanks!
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Kai
> >>>>>>>>>>>
> >>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.ganguma...@intel.com]
> >>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
> >>>>>>>>>>> To: hdfs-dev@hadoop.apache.org; common-...@hadoop.apache.org
> >>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> >>>>>>>>>>>
> >>>>>>>>>>> +1
> >>>>>>>>>>>
> >>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the nice work.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Uma
> >>>>>>>>>>>
> >>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zhezh...@cloudera.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature branch back to trunk. Since November 2014 we have been designing and developing this feature under the umbrella JIRAs HDFS-7285 and HADOOP-11264, and have committed approximately 210 patches.
> >>>>>>>>>>>>
> >>>>>>>>>>>> The HDFS-7285 feature branch was created to support the first phase of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC is to significantly reduce storage space usage in HDFS clusters. Instead of always creating 3 replicas of each block with 200% storage space overhead, HDFS-EC provides data durability through parity data blocks. With most EC configurations, the storage overhead is no more than 50%. Based on profiling results of production clusters, we decided to support EC with the striped block layout in the first phase, so that small files can be better handled. This means dividing each logical HDFS file block into smaller units (striping cells) and spreading them on a set of DataNodes in round-robin fashion. Parity cells are generated for each stripe of original data cells. We have made changes to NameNode, client, and DataNode to generalize the block concept and handle the mapping between a logical file block and its internal storage blocks. For further details please see the design doc on HDFS-7285.
> >>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and high-performance codec calculation support.
> >>>>>>>>>>>>
> >>>>>>>>>>>> The nightly Jenkins job of the branch has reported several successful runs, and doesn't show new flaky tests compared with trunk. We have posted several versions of the test plan including both unit testing and cluster testing, and have executed most tests in the plan. The most basic functionalities have been extensively tested and verified in several real clusters with different hardware configurations; results have been very stable. We have created follow-on tasks for more advanced error handling and optimization under the umbrella HDFS-8031.
> >>>>>>>>>>>> We also plan to implement or harden the integration of EC with existing features such as WebHDFS, snapshot, append, truncate, hflush, hsync, and so forth.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Development of this feature has been a collaboration across many companies and institutions. I'd like to thank J. Andreina, Takanobu Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong Zhang, Jing Zhao, Hui Zheng and Kai Zheng for their code contributions and reviews.
> >>>>>>>>>>>> Andrew and Kai Zheng also made fundamental contributions to the initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and many other contributors have made great efforts in system testing. Many thanks go to Weihua Jiang for proposing the JIRA, and ATM, Todd Lipcon, Silvius Rus, Suresh, as well as many others for providing helpful feedback.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Following the community convention, this vote will last for 7 days (ending September 29th). Votes from Hadoop committers are binding but non-binding votes are very welcome as well. And here's my non-binding +1.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> ---
> >>>>>>>>>>>> Zhe Zhang
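
For readers following the overhead numbers and the striped layout described in the original vote proposal, here is a rough sketch of the arithmetic. It is only an illustration: the (6, 3) and (10, 4) data/parity splits, the 64 KiB cell size and all function names are assumptions made for the example and are not taken from the thread or from the HDFS code.

```python
from fractions import Fraction


def replication_overhead(replicas: int) -> Fraction:
    """Extra storage relative to the raw data when keeping `replicas` full copies."""
    return Fraction(replicas - 1, 1)


def ec_overhead(data_units: int, parity_units: int) -> Fraction:
    """Extra storage relative to the raw data for a stripe of
    `data_units` data cells plus `parity_units` parity cells."""
    return Fraction(parity_units, data_units)


def cell_location(offset: int, cell_size: int, data_units: int) -> tuple[int, int]:
    """Map a byte offset in the logical block to (stripe index, data block index),
    assuming data cells are laid out round-robin across `data_units` blocks."""
    cell_index = offset // cell_size
    return cell_index // data_units, cell_index % data_units


if __name__ == "__main__":
    # 3 replicas: two extra copies of every byte.
    print(f"{float(replication_overhead(3)):.0%}")  # 200%
    # Hypothetical (6, 3) and (10, 4) layouts, chosen only for illustration.
    print(f"{float(ec_overhead(6, 3)):.0%}")        # 50%
    print(f"{float(ec_overhead(10, 4)):.0%}")       # 40%
    # With an assumed 64 KiB cell, the 8th cell lands in stripe 1, data block 1.
    print(cell_location(7 * 65536 + 5, 65536, 6))   # (1, 1)
```

The round-robin mapping is what lets a small file spread across several DataNodes instead of occupying a single mostly empty block, which matches the small-file motivation given in the proposal.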