Forking the thread. Started looking at the 2.8 list, various features’ status 
and arrived here.

While I understand the pervasive nature of EC and the need for a significant
bake-in, moving this to a 3.x release is not a good idea. We will surely get a
2.8 out this year and, as needed, I can even spend time getting started on a
2.9. OTOH, 3.x is a long way off, and given all the incompatibilities there, it
would be a while before users can get their hands on EC if it were to be only
on 3.x. At best, this may force sites that want EC to backport the entire EC
feature to older releases; at worst, this will repeat the mess of the 0.20
security release forks.

If we think adding this to 2.8 (even if it is switched off) is too much risk per
our original plan, let’s move this to 2.9, thereby leaving enough time for
stability, integration testing and bake-in, and a realistic chance of having it
end up on users’ clusters soonish.

+Vinod

> On Oct 19, 2015, at 1:44 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:
> 
> I think our plan thus far has been to target this for 3.0. I'm okay with
> putting it in branch-2 if we've given a hard look at compatibility, but
> I'll note though that 2.8 is already looking like quite a large release,
> and our release bandwidth has been focused on the 2.6 and 2.7 maintenance
> releases. Adding another multi-hundred JIRAs to 2.8 might make it too
> unwieldy to get out the door. If we bump EC past that, 3.0 might very well
> be our next release vehicle. I do plan to revive the 3.0 schedule some time
> next year. With EC and JDK8 in a good spot, the only big feature remaining
> is classpath isolation.
> 
> EC is also a pretty fundamental change to HDFS. Even if it's compatible, in
> terms of size and impact it might best belong in a new major release.
> 
> Best,
> Andrew
> 
> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <
> vinayakumarb.apa...@gmail.com> wrote:
> 
>> Does anyone else also think the feature is ready to go to branch-2 as well?
>> 
>> It has been more than 2 weeks since EC landed on trunk. IMO it looks quite
>> stable since then and ready to go into branch-2.
>> 
>> -Vinay
>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zhezh...@cloudera.com> wrote:
>> 
>>> Thanks Vinay for capturing the issue and Uma for offering the help.
>>> 
>>> ---
>>> Zhe Zhang
>>> 
>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <uma.ganguma...@intel.com>
>>> wrote:
>>> 
>>>> Vinay,
>>>> 
>>>> 
>>>> I would merge them as part of HDFS-9182.
>>>> 
>>>> Thanks,
>>>> Uma
>>>> 
>>>> 
>>>> 
>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vinayakum...@apache.org> wrote:
>>>> 
>>>>> Hi Andrew,
>>>>> I see CHANGES.txt entries not yet merged from CHANGES-HDFS-EC-7285.txt.
>>>>> 
>>>>> Was this intentional?
>>>>> 
>>>>> Regards,
>>>>> Vinay
>>>>> 
>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <andrew.w...@cloudera.com>
>>>>> wrote:
>>>>> 
>>>>>> Branch has been merged to trunk, thanks again to everyone who worked on
>>>>>> the feature!
>>>>>> 
>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <zhezh...@cloudera.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Thanks everyone who has participated in this discussion.
>>>>>>> 
>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote has
>>>>>>> passed. I will do a final 'git merge' with trunk and work with Andrew
>>>>>>> to merge the branch to trunk. I'll update on this thread when the
>>>>>>> merge is done.
>>>>>>> 
>>>>>>> ---
>>>>>>> Zhe Zhang
>>>>>>> 
>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi.a....@intel.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> (Change it to binding.)
>>>>>>>> 
>>>>>>>> +1
>>>>>>>> I have been involved in the development and code review on the
>>>>>>>> feature branch. It's a great feature and I think it's ready to merge
>>>>>>>> it into trunk.
>>>>>>>> 
>>>>>>>> Thanks all for the contribution.
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> Yi Liu
>>>>>>>> 
>>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Liu, Yi A
>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>> 
>>>>>>>> +1 (non-binding)
>>>>>>>> I have been involved in the development and code review on the
>>>>>>>> feature branch. It's a great feature and I think it's ready to merge
>>>>>>>> it into trunk.
>>>>>>>> 
>>>>>>>> Thanks all for the contribution.
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> Yi Liu
>>>>>>>> 
>>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Vinayakumar B [mailto:vinayakum...@apache.org]
>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>> 
>>>>>>>> +1,
>>>>>>>> 
>>>>>>>> I've been involved starting from design and development of
>>>>>>>> ErasureCoding.
>>>>>>>> I think phase 1 of this development is ready to be merged to trunk.
>>>>>>>> It had come a long way to the current state with significant effort
>>>>>>>> of many Contributors and Reviewers for both design and code.
>>>>>>>> 
>>>>>>>> Thanks Everyone for the efforts.
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> Vinay
>>>>>>>> 
>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> +1
>>>>>>>>> 
>>>>>>>>> I've been involved in both development and review on the branch,
>>>>>>>>> and I believe it's now ready to get merged into trunk. Many thanks
>>>>>>>>> to all the contributors and reviewers!
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> -Jing
>>>>>>>>> 
>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <kai.zh...@intel.com>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Non-binding +1
>>>>>>>>>> 
>>>>>>>>>> According to our extensive performance tests, striping + ISA-L
>>>>>>>>>> coder based erasure coding not only can save storage, but also can
>>>>>>>>>> increase the throughput of a client or a cluster. It will be a
>>>>>>>>>> great addition to HDFS and its users. Based on the latest branch
>>>>>>>>>> codes, we also observed it's very reliable in the concurrent
>>>>>>>>>> tests. We'll provide the perf test report after it's sorted out
>>>>>>>>>> and hope it helps.
>>>>>>>>>> Thanks!
>>>>>>>>>> 
>>>>>>>>>> Regards,
>>>>>>>>>> Kai
>>>>>>>>>> 
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.ganguma...@intel.com]
>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
>>>>>>>>>> To: hdfs-dev@hadoop.apache.org; common-...@hadoop.apache.org
>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to
>>>>>>>>>> trunk
>>>>>>>>>> 
>>>>>>>>>> +1
>>>>>>>>>> 
>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the nice work.
>>>>>>>>>> 
>>>>>>>>>> Regards,
>>>>>>>>>> Uma
>>>>>>>>>> 
>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zhezh...@cloudera.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi,
>>>>>>>>>>> 
>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature branch
>>>>>>>>>>> back to trunk. Since November 2014 we have been designing and
>>>>>>>>>>> developing this feature under the umbrella JIRAs HDFS-7285 and
>>>>>>>>>>> HADOOP-11264, and have committed approximately 210 patches.
>>>>>>>>>>> 
>>>>>>>>>>> The HDFS-7285 feature branch was created to support the first
>>>>>>>>>>> phase of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC
>>>>>>>>>>> is to significantly reduce storage space usage in HDFS clusters.
>>>>>>>>>>> Instead of always creating 3 replicas of each block with 200%
>>>>>>>>>>> storage space overhead, HDFS-EC provides data durability through
>>>>>>>>>>> parity data blocks. With most EC configurations, the storage
>>>>>>>>>>> overhead is no more than 50%.
>>>>>>>>>>> Based on profiling results of production clusters, we decided to
>>>>>>>>>>> support EC with the striped block layout in the first phase, so
>>>>>>>>>>> that small files can be better handled. This means dividing each
>>>>>>>>>>> logical HDFS file block into smaller units (striping cells) and
>>>>>>>>>>> spreading them on a set of DataNodes in round-robin fashion.
>>>>>>>>>>> Parity cells are generated for each stripe of original data
>>>>>>>>>>> cells. We have made changes to NameNode, client, and DataNode to
>>>>>>>>>>> generalize the block concept and handle the mapping between a
>>>>>>>>>>> logical file block and its internal storage blocks. For further
>>>>>>>>>>> details please see the design doc on HDFS-7285.
>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and high-performance
>>>>>>>>>>> codec calculation support.
>>>>>>>>>>> 
>>>>>>>>>>> The nightly Jenkins job of the branch has reported several
>>>>>>>>>>> successful runs, and doesn't show new flaky tests compared with
>>>>>>>>>>> trunk. We have posted several versions of the test plan including
>>>>>>>>>>> both unit testing and cluster testing, and have executed most
>>>>>>>>>>> tests in the plan. The most basic functionalities have been
>>>>>>>>>>> extensively tested and verified in several real clusters with
>>>>>>>>>>> different hardware configurations; results have been very stable.
>>>>>>>>>>> We have created follow-on tasks for more advanced error handling
>>>>>>>>>>> and optimization under the umbrella HDFS-8031.
>>>>>>>>>>> We also plan to implement or harden the integration of EC with
>>>>>>>>>>> existing features such as WebHDFS, snapshot, append, truncate,
>>>>>>>>>>> hflush, hsync, and so forth.
>>>>>>>>>>> 
>>>>>>>>>>> Development of this feature has been a collaboration across many
>>>>>>>>>>> companies and institutions. I'd like to thank J. Andreina,
>>>>>>>>>>> Takanobu Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
>>>>>>>>>>> Maheswara Rao G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin,
>>>>>>>>>>> Rakesh R, Gao Rui, Kai Sasaki, Walter Su, Tsz Wo Nicholas Sze,
>>>>>>>>>>> Andrew Wang, Yong Zhang, Jing Zhao, Hui Zheng and Kai Zheng for
>>>>>>>>>>> their code contributions and reviews.
>>>>>>>>>>> Andrew and Kai Zheng also made fundamental contributions to the
>>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and many
>>>>>>>>>>> other contributors have made great efforts in system testing.
>>>>>>>>>>> Many thanks go to Weihua Jiang for proposing the JIRA, and ATM,
>>>>>>>>>>> Todd Lipcon, Silvius Rus, Suresh, as well as many others for
>>>>>>>>>>> providing helpful feedback.
>>>>>>>>>>> 
>>>>>>>>>>> Following the community convention, this vote will last for 7
>>>>>>>>>>> days (ending September 29th). Votes from Hadoop committers are
>>>>>>>>>>> binding but non-binding votes are very welcome as well. And
>>>>>>>>>>> here's my non-binding +1.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> ---
>>>>>>>>>>> Zhe Zhang
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>> 
>> 
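
For reference, the overhead figures in the vote proposal quoted above (200% for
3 replicas, no more than 50% with most EC configurations) and the round-robin
striping layout can be checked with a rough sketch. The (6 data, 3 parity) and
(10 data, 4 parity) schemes and all function names below are illustrative
assumptions, not taken from the branch or from any HDFS API.

```python
def replication_overhead(replicas: int) -> float:
    """Extra storage, as a fraction of logical data size, for N-way replication."""
    return float(replicas - 1)


def ec_overhead(data_units: int, parity_units: int) -> float:
    """Extra storage, as a fraction of logical data size, for a (data, parity) scheme."""
    return parity_units / data_units


def cell_to_data_block(cell_index: int, data_units: int) -> int:
    """Round-robin striping: index of the internal data block that holds a cell.

    Hypothetical helper; assumes cells of a logical block are numbered from 0.
    """
    return cell_index % data_units


if __name__ == "__main__":
    # 3 replicas: two extra copies of every byte.
    print(f"3x replication: {replication_overhead(3):.0%}")
    # Assumed example schemes, chosen only to illustrate the arithmetic.
    print(f"EC (6 data, 3 parity): {ec_overhead(6, 3):.0%}")
    print(f"EC (10 data, 4 parity): {ec_overhead(10, 4):.0%}")
    # The first 8 striping cells spread over 6 data blocks wrap around after 6.
    print([cell_to_data_block(i, 6) for i in range(8)])
```

The parity cells of each stripe would sit on additional blocks beyond the data
blocks; that placement is left out of the sketch.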
