Re: Do I need a +1 to merge a backport PR?

2021-06-02 Thread epa...@apache.org
It is okay to go ahead and backport as long as no major refactoring is
necessary. Minor conflict fixes should be fine.
-Eric

On Tuesday, June 1, 2021, 11:43:44 PM CDT, Wei-Chiu Chuang  
wrote: 
 
 I'm curious about the GitHub PR conventions we use today... say I want to
backport a commit from trunk to branch-3.3, and there's a small code
conflict so I push a PR against branch-3.3 using GitHub to go through the
precommit check.

Do I need explicit approval from another committer to merge the backport
PR? (As a committer, I know I can merge at any time.) Or can I merge when
the precommit comes back okay?
  

Re: [VOTE] Hadoop 3.1.x EOL

2021-06-07 Thread epa...@apache.org
+1 (binding)
-Eric


   On Thursday, June 3, 2021, 1:14:51 AM CDT, Akira Ajisaka 
 wrote:  
 
 Dear Hadoop developers,

Given the feedback from the discussion thread [1], I'd like to start
an official vote thread for the community to vote and start the 3.1 EOL
process.

What this entails:

(1) an official announcement that no further regular Hadoop 3.1.x releases
will be made after 3.1.4.
(2) resolve JIRAs that specifically target 3.1.5 as won't fix.

This vote will run for 7 days and conclude by June 10th, 16:00 JST [2].

Committers are eligible to cast binding votes. Non-committers are welcome
to cast non-binding votes.

Here is my vote, +1

[1] https://s.apache.org/w9ilb
[2] 
https://www.timeanddate.com/worldclock/fixedtime.html?msg=4&iso=20210610T16&p1=248

Regards,
Akira
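
The binding/non-binding distinction described in the vote mail can be sketched as a small tally. This is only an illustration: the `Vote` type, the `tally` helper, and the second voter are hypothetical, and the sketch deliberately does not encode any pass/fail rule, since the thread does not state one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str      # who voted (illustrative names only)
    value: int      # +1, 0, or -1
    binding: bool   # True for committers, per the vote mail


def tally(votes):
    """Count binding +1/-1 votes and non-binding votes separately."""
    binding = [v.value for v in votes if v.binding]
    return {
        "binding_plus": sum(1 for x in binding if x > 0),
        "binding_minus": sum(1 for x in binding if x < 0),
        "non_binding": sum(1 for v in votes if not v.binding),
    }


votes = [
    Vote("Eric", +1, True),                       # "+1 (binding)" in this thread
    Vote("hypothetical-contributor", +1, False),  # made-up non-committer vote
]
result = tally(votes)
print(result)
```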

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org

  

Re: [VOTE] Release Apache Hadoop 3.3.1 RC3

2021-06-11 Thread epa...@apache.org
+1 (binding)
Eric


   On Tuesday, June 1, 2021, 5:29:49 AM CDT, Wei-Chiu Chuang 
 wrote:  
 
 Hi community,

This is the release candidate RC3 of Apache Hadoop 3.3.1 line. All blocker
issues have been resolved [1] again.

There are 2 additional issues resolved for RC3:
* Revert "MAPREDUCE-7303. Fix TestJobResourceUploader failures after
HADOOP-16878"
* Revert "HADOOP-16878. FileUtil.copy() to throw IOException if the source
and destination are the same"

There are 4 issues resolved for RC2:
* HADOOP-17666. Update LICENSE for 3.3.1
* MAPREDUCE-7348. TestFrameworkUploader#testNativeIO fails. (#3053)
* Revert "HADOOP-17563. Update Bouncy Castle to 1.68. (#2740)" (#3055)
* HADOOP-17739. Use hadoop-thirdparty 1.1.1. (#3064)

The Hadoop-thirdparty 1.1.1, as previously mentioned, contains two extra
fixes compared to hadoop-thirdparty 1.1.0:
* HADOOP-17707. Remove jaeger document from site index.
* HADOOP-17730. Add back error_prone

*RC tag is release-3.3.1-RC3*
https://github.com/apache/hadoop/releases/tag/release-3.3.1-RC3

*The RC3 artifacts are at*:
https://home.apache.org/~weichiu/hadoop-3.3.1-RC3/
ARM artifacts: https://home.apache.org/~weichiu/hadoop-3.3.1-RC3-arm/

*The maven artifacts are hosted here:*
https://repository.apache.org/content/repositories/orgapachehadoop-1320/

*My public key is available here:*
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS


Things I've verified:
* all blocker issues targeting 3.3.1 have been resolved.
* stable/evolving API changes between 3.3.0 and 3.3.1 are compatible.
* LICENSE and NOTICE files checked
* RELEASENOTES and CHANGELOG
* rat check passed.
* Built HBase master branch on top of Hadoop 3.3.1 RC2, ran unit tests.
* Built Ozone master on top of Hadoop 3.3.1 RC2, ran unit tests.
* Extra: built 50 other open source projects on top of Hadoop 3.3.1 RC2.
Had to patch some of them due to commons-lang migration (Hadoop 3.2.0) and
dependency divergence. Issues are being identified but so far nothing is a
blocker for Hadoop itself.

Please try the release and vote. The vote will run for 5 days.

My +1 to start,

[1] https://issues.apache.org/jira/issues/?filter=12350491
[2]
https://github.com/apache/hadoop/compare/release-3.3.1-RC1...release-3.3.1-RC3
  

Re: [DISCUSS] Tips for improving productivity, workflow in the Hadoop project?

2021-07-14 Thread epa...@apache.org
Ahmed has pinpointed my main concern with having both a JIRA and a PR for the
same issue. I often go back to older issues to try to understand the reasons
behind the designs. It is somewhat cumbersome to follow discussions split
between JIRA and PRs.
Thanks,
-Eric



   On Wednesday, July 14, 2021, 9:50:47 AM CDT, Ahmed Hussein 
 wrote:  
 
 Have you considered migrating Jira issues to GitHub issues?

I am a little bit concerned that there are some committers who still prefer
Jira-precommits over GitHub PR
(P.S. I am not a committer).

Their point is that Github-PR confuses them with discussions/comments being
in two places rather than one.

Personally, I have found several GitHub PR comments discussing the validity
of the feature/bug.
As a result:
- Recently, JIRA has become a sort of "number generator" with insufficient
  description/details, as developers and reviewers spend more time
  discussing in the PR.
- The relation between a single Jira and GitHub PRs is 1-to-M. To find
  related discussions, the user may need to visit every PR (possibly
  including closed ones).
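
To make the 1-to-M relation concrete, here is a minimal sketch that groups PRs by the JIRA key found in their titles. It assumes PR titles start with the JIRA key, as the commit subjects quoted elsewhere in this archive do; the helper name, the `NO-JIRA` bucket, and PR numbers 9001 and 9002 are hypothetical.

```python
import re
from collections import defaultdict

# Project keys taken from the JQL link quoted elsewhere in this archive.
JIRA_KEY = re.compile(r"\b(HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def group_prs_by_jira(prs):
    """Map each JIRA key to the list of PR numbers whose title mentions it."""
    grouped = defaultdict(list)
    for number, title in prs:
        match = JIRA_KEY.search(title)
        key = match.group(0) if match else "NO-JIRA"
        grouped[key].append(number)
    return dict(grouped)


prs = [
    (3053, "MAPREDUCE-7348. TestFrameworkUploader#testNativeIO fails."),
    (3064, "HADOOP-17739. Use hadoop-thirdparty 1.1.1."),
    (9001, "HADOOP-17739. Follow-up fix"),  # hypothetical second PR
    (9002, "Fix typo in README"),           # hypothetical PR without a key
]
grouped = group_prs_by_jira(prs)
print(grouped)
```

A listing like this only locates the PRs; the discussion itself still lives in each PR, which is the concern raised above.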



On Wed, Jul 14, 2021 at 8:46 AM Steve Loughran 
wrote:

> not sure about stale PR closing; when you've a patch which is still pending
> review it's not that fun to have it closed.
>
> maybe better to have review sessions. I recall many, many years ago
> attempts to try and catch up with all outstanding patch reviews.
>
>
>
>
> On Wed, 14 Jul 2021 at 03:00, Akira Ajisaka  wrote:
>
> > Thank you Wei-Chiu for starting the discussion,
> >
> > > 3. JIRA security
> > I'm +1 to use private JIRA issues to handle vulnerabilities.
> >
> > > 5. Doc update
> > +1, I build the document daily and it helps me fix documents:
> > https://aajisaka.github.io/hadoop-document/ It would be great if the
> > latest document were built and published by the Apache Hadoop community.
> >
> > My idea related to GitHub PR:
> > 1. Disable the precommit jobs for JIRA, always use GitHub PR. It saves
> > the cost of configuring and debugging the precommit jobs.
> > https://issues.apache.org/jira/browse/HADOOP-17798
> > 2. Improve the pull request template for the contributors
> > https://issues.apache.org/jira/browse/HADOOP-17799
> >
> > Regards,
> > Akira
> >
> > On Tue, Jul 13, 2021 at 12:35 PM Wei-Chiu Chuang 
> > wrote:
> > >
> > > I work on multiple projects and learned a bunch from those projects.
> > > There are nice add-ons that help with productivity. There are things
> > > we can do to help us manage the project better.
> > >
> > > 1. Add new issue types.
> > > We can add an "Epic" jira type to organize a set of related jiras.
> > > This could be easier to manage than using a regular JIRA and calling
> > > it "umbrella".
> > >
> > > 2. GitHub Actions
> > > I am seeing more projects moving to GitHub Actions for precommits. We
> > > don't necessarily need to migrate off Jenkins, but there are nice
> > > add-ons that can perform static analysis, catching potential issues.
> > > For example, Ozone adds SonarQube to post-commit, and exports the
> > > report to SonarCloud. Other add-ons are available to scan Docker
> > > images and to run vulnerability scans.
> > >
> > > 3. JIRA security
> > > It is possible to set up a security level (public/private) in JIRA.
> > > This can be used to track vulnerability issues and make them visible
> > > only to committers. Example: INFRA-15258
> > >
> > > 4. New JIRA fields
> > > It's possible to add new fields. For example, we can add a "Reviewer"
> > > field, which could help improve the attention to issues.
> > >
> > > 5. Doc update
> > > It is possible to set up automation such that the doc on the Hadoop
> > > website is refreshed for every commit, providing the latest doc to
> > > the public.
> > >
> > > 6. Webhook
> > > It's possible to set up a webhook such that every commit in GitHub
> > > sends a notification to the ASF slack. It can be used for other kinds
> > > of automation. Sky's the limit.
> > >
> > > Thoughts? What else can we do?
> >
> > -
> > To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> >
> >
>


-- 
Best Regards,

*Ahmed Hussein, PhD*
  

Re: [DISCUSS] Tips for improving productivity, workflow in the Hadoop project?

2021-07-14 Thread epa...@apache.org
+1 for a review of the backlog!
-Eric

 On Wednesday, July 14, 2021, 10:02:39 AM CDT, Wei-Chiu Chuang 
 wrote:  
 
 We have more than 400 open PRs. I would be happy to find a way to reduce
that number to a more manageable size.
Otherwise it just becomes another JIRA where issues are filed and sink to
the bottom of a black hole.

A review session is a good idea.
We can
(1) decide on a date
(2) sign up.
(3) decide how much we're planning to complete on that day
(4) those who sign up should stay in the slack channel, or zoom for
communication on that day.

If anyone else is interested, let's spawn a new thread for discussion.

  

Re: [DISCUSS] Tips for improving productivity, workflow in the Hadoop project?

2021-07-15 Thread epa...@apache.org
 > I usually use PR comments to discuss about the patch submitted.
My concern is that still leaves multiple places to look in order to get a full 
picture of an issue.
-Eric

On Wednesday, July 14, 2021, 7:07:30 PM CDT, Masatake Iwasaki 
 wrote: 

> - recently, JIRA became some sort of a "number generator" with insufficient
>   description/details as the developers and the reviewers spend more time
>   discussing in the PR.

JIRA issues contain useful information in the fields.
We are leveraging them in the development and release processes.

* https://yetus.apache.org/documentation/0.13.0/releasedocmaker/
* https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12336122

I usually use PR comments to discuss about the patch submitted.
JIRA comments are used for background or design discussion before and after 
submitting PR.
There would be no problem having no comment in minor/trivial JIRA issues.



Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)

2019-10-23 Thread epa...@apache.org
Hi Jonathan,

Thanks very much for all of your work on this release.

I have a concern about cross-queue (inter-queue) preemption in 2.10.

In 2.8, on a 6-node pseudo-cluster, preempting from one queue to meet the
needs of another queue seems to work as expected. However, in 2.10 on the same
pseudo-cluster (with the same config properties), only one container was
preempted for the AM, and then nothing else.

I don't know how the community feels about holding up the 2.10.0 release for 
this issue, but we need to get to the bottom of this before we can go to 
2.10.x. I am still investigating.

Thanks,
-Eric




 On Tuesday, October 22, 2019, 4:55:29 PM CDT, Jonathan Hung 
 wrote: 
> Hi folks,
> 
> This is the second release candidate for the first release of Apache Hadoop
> 2.10 line. It contains 362 fixes/improvements since 2.9 [1]. It includes
> features such as:
> 
> - User-defined resource types
> - Native GPU support as a schedulable resource type
> - Consistent reads from standby node
> - Namenode port based selective encryption
> - Improvements related to rolling upgrade support from 2.x to 3.x
> - Cost based fair call queue
> 
> The RC1 artifacts are at: http://home.apache.org/~jhung/hadoop-2.10.0-RC1/
> 
> RC tag is release-2.10.0-RC1.
> 
> The maven artifacts are hosted here:
> https://repository.apache.org/content/repositories/orgapachehadoop-1243/
> 
> My public key is available here:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> 
> The vote will run for 5 weekdays, until Tuesday, October 29 at 3:00 pm PDT.
> 
> Thanks,
> Jonathan Hung

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)

2019-10-26 Thread epa...@apache.org
I ran a few compatibility tests between 2.10.0 and 3.3.0 (trunk).

Unfortunately, I ran into the following problem:

Running with a 2.10 RM and a 3.3.0 (trunk) NM, task attempts fail with the
following error:

2019-10-26 15:44:06,885 WARN [main] org.apache.hadoop.mapred.YarnChild: 
Exception running child : 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RPC$VersionMismatch):
 Protocol org.apache.hadoop.mapred.TaskUmbilicalProtocol version mismatch. 
(client = 19, server = 21)

The AM happened to launch on the 3.3.0 node.

Is this a protobuf issue? I thought we addressed that?

-Eric Payne
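
For anyone triaging similar failures, the version pair can be pulled out of the exception text with a small parser. This is a sketch only: the regular expression is written against the exact message quoted above, and the function name is hypothetical. In this report the task (`YarnChild`) logged the error while the AM ran on the 3.3.0 node, which would be consistent with the client being the older side.

```python
import re

# The exception text from the report above, joined onto one line.
LOG_LINE = (
    "org.apache.hadoop.ipc.RemoteException"
    "(org.apache.hadoop.ipc.RPC$VersionMismatch): "
    "Protocol org.apache.hadoop.mapred.TaskUmbilicalProtocol version mismatch. "
    "(client = 19, server = 21)"
)

MISMATCH = re.compile(
    r"Protocol (?P<protocol>[\w.$]+) version mismatch\. "
    r"\(client = (?P<client>\d+), server = (?P<server>\d+)\)"
)


def parse_version_mismatch(line):
    """Return (protocol, client_version, server_version), or None if absent."""
    match = MISMATCH.search(line)
    if match is None:
        return None
    return (
        match.group("protocol"),
        int(match.group("client")),
        int(match.group("server")),
    )


parsed = parse_version_mismatch(LOG_LINE)
print(parsed)
```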



On Tuesday, October 22, 2019, 8:39:38 PM CDT, Jonathan Hung 
 wrote: 





Hi Eric, we've run some basic HDFS commands with a 3.2.1 namenode and
2.10.0 clients and datanodes. Everything worked as expected.

Jonathan Hung


On Tue, Oct 22, 2019 at 3:04 PM Eric Badger 
wrote:

> Hi Jonathan,
>
> Thanks for putting this RC together. You stated that there are
> improvements related to rolling upgrades from 2.x to 3.x and I know I have
> seen multiple JIRAs getting committed to that effect. Could you describe
> any tests that you have done to verify rolling upgrade compatibility
> for 3.x servers talking to 2.x clients and vice versa?
>
> Thanks,
>
> Eric
>
> On Tue, Oct 22, 2019 at 1:49 PM Jonathan Hung 
> wrote:
>
>> Thanks Konstantin and Zhankun. Unfortunately a feature slipped our radar
>> (HDFS-14667). Since this is the first of a minor release, we would like to
>> get it into 2.10.0.
>>
>> HDFS-14667 has been committed to branch-2.10.0, I will be rolling an RC1
>> shortly.
>>
>> Jonathan Hung
>>
>>
>> On Tue, Oct 22, 2019 at 1:39 AM Zhankun Tang  wrote:
>>
>> > Thanks for the effort, Jonathan!
>> >
>> > +1 (non-binding) on RC0.
>> >  - Set up a single node cluster with the binary tarball
>> >  - Run Spark Pi and pySpark job
>> >
>> > BR,
>> > Zhankun
>> >
>> > On Tue, 22 Oct 2019 at 14:31, Konstantin Shvachko
>> > wrote:
>> >
>> >> +1 on RC0.
>> >> - Verified signatures
>> >> - Built from sources
>> >> - Ran unit tests for new features
>> >> - Checked artifacts on Nexus, made sure the sources are present.
>> >>
>> >> Thanks
>> >> --Konstantin
>> >>
>> >>
>> >> On Wed, Oct 16, 2019 at 6:01 PM Jonathan Hung 
>> >> wrote:
>> >>
>> >> > Hi folks,
>> >> >
>> >> > This is the first release candidate for the first release of Apache
>> >> > Hadoop 2.10 line. It contains 361 fixes/improvements since 2.9 [1].
>> >> > It includes features such as:
>> >> >
>> >> > - User-defined resource types
>> >> > - Native GPU support as a schedulable resource type
>> >> > - Consistent reads from standby node
>> >> > - Namenode port based selective encryption
>> >> > - Improvements related to rolling upgrade support from 2.x to 3.x
>> >> >
>> >> > The RC0 artifacts are at:
>> >> > http://home.apache.org/~jhung/hadoop-2.10.0-RC0/
>> >> >
>> >> > RC tag is release-2.10.0-RC0.
>> >> >
>> >> > The maven artifacts are hosted here:
>> >> > https://repository.apache.org/content/repositories/orgapachehadoop-1241/
>> >> >
>> >> > My public key is available here:
>> >> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>> >> >
>> >> > The vote will run for 5 weekdays, until Wednesday, October 23 at
>> >> > 6:00 pm PDT.
>> >> >
>> >> > Thanks,
>> >> > Jonathan Hung
>> >> >
>> >> > [1]
>> >> > https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0)
>> >> >
>> >>
>> >
>>
>
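
The percent-encoded JIRA link [1] in the quoted RC0 announcement is hard to read as-is. A short sketch using only the Python standard library recovers the JQL filter it carries; the helper name `decode_jql` is made up for this example.

```python
from urllib.parse import parse_qs, urlsplit

# The link [1] from the quoted RC0 announcement, copied verbatim.
URL = (
    "https://issues.apache.org/jira/issues/?jql=project%20in%20"
    "(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution"
    "%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20"
    "fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0)"
)


def decode_jql(link):
    """Extract and percent-decode the 'jql' query parameter of a JIRA link."""
    query = urlsplit(link).query
    return parse_qs(query)["jql"][0]


jql = decode_jql(URL)
print(jql)
```

Decoded, the filter selects issues fixed in 2.10.0 that were not already fixed in a 2.9.x release, which is how the "fixes/improvements since 2.9" count in the announcement is scoped.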




Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)

2019-10-27 Thread epa...@apache.org
 Ah! Yes! That makes sense. I will use the mapredonhdfs framework in my next 
set of tests.
The other compatibility tests that I ran worked as expected.
-Eric

On Saturday, October 26, 2019, 12:29:54 PM CDT, Jonathan Hung 
 wrote:  
 
 Hi Eric, I took a quick look. Are you using
mapreduce.application.framework.path to run your MR jobs? If not, this seems
like expected behavior when the AM and tasks are launched on different NMs
with different locally installed Hadoop versions.

Jonathan Hung
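
As a rough illustration of the setting Jonathan mentions, the sketch below builds the `mapreduce.application.framework.path` property as it might appear in mapred-site.xml. The property name is the one from the message; the HDFS archive location, the `mrframework` alias, and the helper names are assumptions for this example, not values from the thread. In a real deployment `mapreduce.application.classpath` normally also has to reference the unpacked archive; that is omitted here.

```python
import xml.etree.ElementTree as ET


def property_element(name, value):
    """Build a Hadoop-style <property><name/><value/></property> element."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return prop


def framework_path_property(archive_uri, alias):
    """Point MR jobs at one shared framework archive instead of local jars."""
    return property_element(
        "mapreduce.application.framework.path", f"{archive_uri}#{alias}"
    )


# Hypothetical archive location and alias, for illustration only.
prop = framework_path_property(
    "hdfs:///mapred/framework/hadoop-mapreduce-2.10.0.tar.gz", "mrframework"
)
print(ET.tostring(prop, encoding="unicode"))
```

With every AM and task localizing the same archive, both ends of the umbilical protocol come from one Hadoop version regardless of what each NodeManager has installed locally.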


  

Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)

2019-10-28 Thread epa...@apache.org
Compatibility testing has gone well for me.

- In a 4-node cluster, I ran YARN rolling upgrade tests between 2.8.5 and 2.10.0
- In a 4-node cluster, I ran YARN rolling upgrade tests between 2.10.0 and trunk
- With one 4-node cluster running 2.10.0 and one 4-node cluster running trunk, 
I ran a word count job in each cluster whose inputs and outputs were from and 
to the opposite cluster.
- I verified that HDFS replication works as expected in a trunk cluster that 
has one 2.10.0 datanode.

Thanks,
-Eric




Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)

2019-10-29 Thread epa...@apache.org
Jonathan,

I actually did all my testing on RC1. Sorry for the confusion. I'll respond on 
the RC1 thread.

-Eric

 On Monday, October 28, 2019, 8:00:20 PM CDT, Jonathan Hung 
 wrote: 

Thanks Eric! I sent out an RC1 earlier last week, not sure if you saw that. The 
only diff between RC1 and RC0 is HDFS-14667. If RC1 looks good to you then it'd 
be great to get your testing results on that thread.

Jonathan Hung


On Mon, Oct 28, 2019 at 1:06 PM epa...@apache.org  wrote:
> Compatibility testing has gone well for me.
> 
> - In a 4-node cluster, I ran YARN rolling upgrade tests between 2.8.5 and 
> 2.10.0
> - In a 4-node cluster, I ran YARN rolling upgrade tests between 2.10.0 and 
> trunk
> - With one 4-node cluster running 2.10.0 and one 4-node cluster running 
> trunk, I ran a word count job in each cluster whose inputs and outputs were 
> from and to the opposite cluster.
> - I verified that HDFS replication works as expected in a trunk cluster that 
> has one 2.10.0 datanode.
> 
> Thanks,
> -Eric
> 
> On Tuesday, October 22, 2019, 8:39:38 PM CDT, Jonathan Hung 
>  wrote: 
> 
> 
> 
> 
> 
> Hi Eric, we've run some basic HDFS commands with a 3.2.1 namenode and
> 2.10.0 clients and datanodes. Everything worked as expected.
> 
> Jonathan Hung
> 
> 
> On Tue, Oct 22, 2019 at 3:04 PM Eric Badger 
> wrote:
> 
>> Hi Jonathan,
>>
>> Thanks for putting this RC together. You stated that there are
>> improvements related to rolling upgrades from 2.x to 3.x and I know I have
>> seen multiple JIRAs getting committed to that effect. Could you describe
>> any tests that you have done to verify rolling upgrade compatibility
>> for 3.x servers talking to 2.x clients and vice versa?
>>
>> Thanks,
>>
>> Eric
>>
>> On Tue, Oct 22, 2019 at 1:49 PM Jonathan Hung 
>> wrote:
>>
>>> Thanks Konstantin and Zhankun. Unfortunately a feature slipped our radar
>>> (HDFS-14667). Since this is the first of a minor release, we would like to
>>> get it into 2.10.0.
>>>
>>> HDFS-14667 has been committed to branch-2.10.0, I will be rolling an RC1
>>> shortly.
>>>
>>> Jonathan Hung
>>>
>>>
>>> On Tue, Oct 22, 2019 at 1:39 AM Zhankun Tang  wrote:
>>>
>>> > Thanks for the effort, Jonathan!
>>> >
>>> > +1 (non-binding) on RC0.
>>> >  - Set up a single node cluster with the binary tarball
>>> >  - Run Spark Pi and pySpark job
>>> >
>>> > BR,
>>> > Zhankun
>>> >
>>> > On Tue, 22 Oct 2019 at 14:31, Konstantin Shvachko
>>> > wrote:
>>> >
>>> >> +1 on RC0.
>>> >> - Verified signatures
>>> >> - Built from sources
>>> >> - Ran unit tests for new features
>>> >> - Checked artifacts on Nexus, made sure the sources are present.
>>> >>
>>> >> Thanks
>>> >> --Konstantin
>>> >>
>>> >>
>>> >> On Wed, Oct 16, 2019 at 6:01 PM Jonathan Hung 
>>> >> wrote:
>>> >>
>>> >> > Hi folks,
>>> >> >
>>> >> > This is the first release candidate for the first release of Apache
>>> >> Hadoop
>>> >> > 2.10 line. It contains 361 fixes/improvements since 2.9 [1]. It
>>> includes
>>> >> > features such as:
>>> >> >
>>> >> > - User-defined resource types
>>> >> > - Native GPU support as a schedulable resource type
>>> >> > - Consistent reads from standby node
>>> >> > - Namenode port based selective encryption
>>> >> > - Improvements related to rolling upgrade support from 2.x to 3.x
>>> >> >
>>> >> > The RC0 artifacts are at:
>>> >> http://home.apache.org/~jhung/hadoop-2.10.0-RC0/
>>> >> >
>>> >> > RC tag is release-2.10.0-RC0.
>>> >> >
>>> >> > The maven artifacts are hosted here:
>>> >> >
>>> >>
>>> https://repository.apache.org/content/repositories/orgapachehadoop-1241/
>>> >> >
>>> >> > My public key is available here:
>>> >> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>>> >> >
>>> >> > The vote will run for 5 weekdays, until Wednesday, October 23 at
>>> 6:00 pm
>>> >> > PDT.
>>> >> >
>>> >> > Thanks,
>>> >> > Jonathan Hung
>>> >> >
>>> >> > [1]
>>> >> >
>>> >> >
>>> >>
>>> https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0)
>>> >> >
>>> >>
>>> >
>>>
>>
> 




Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)

2019-10-29 Thread epa...@apache.org
Compatibility testing has gone well for me.

- In a 4-node cluster, I ran YARN rolling upgrade tests between 2.8.5 and
  2.10.0
- In a 4-node cluster, I ran YARN rolling upgrade tests between 2.10.0 and
  trunk
- With one 4-node cluster running 2.10.0 and one 4-node cluster running trunk,
  I ran a word count job in each cluster whose inputs and outputs were from and
  to the opposite cluster.
- I verified that HDFS replication works as expected in a trunk cluster that
  has one 2.10.0 datanode.

Thanks,
-Eric


> On Tuesday, October 22, 2019, 4:55:29 PM CDT, Jonathan Hung wrote:
>
> Hi folks,
>
> This is the second release candidate for the first release of the Apache
> Hadoop 2.10 line. It contains 362 fixes/improvements since 2.9 [1]. It
> includes features such as:
>
> - User-defined resource types
> - Native GPU support as a schedulable resource type
> - Consistent reads from standby node
> - Namenode port based selective encryption
> - Improvements related to rolling upgrade support from 2.x to 3.x
> - Cost based fair call queue
> 
> The RC1 artifacts are at: http://home.apache.org/~jhung/hadoop-2.10.0-RC1/
> 
> RC tag is release-2.10.0-RC1.
> 
> The maven artifacts are hosted here:
> https://repository.apache.org/content/repositories/orgapachehadoop-1243/
> 
> My public key is available here:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> 
> The vote will run for 5 weekdays, until Tuesday, October 29 at 3:00 pm PDT.
> 
> Thanks,
> Jonathan Hung
 




Re: [DISCUSS] Making 2.10 the last minor 2.x release

2019-11-15 Thread epa...@apache.org
Thanks Jonathan for opening the discussion.

I am not in favor of this proposal. 2.10 was very recently released, and moving 
to 2.10 will take some time for the community. It seems premature to make a 
decision at this point that there will never be a need for a 2.11 release.

-Eric


 On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung 
 wrote: 

Hi folks,

Given the release of 2.10.0, and the fact that it's intended to be a bridge
release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
release line in branch-2. Currently, the main issue is that there are many
fixes going into branch-2 (the theoretical 2.11.0) that are not going into
branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
likely never see the light of day unless they are backported to branch-2.10.

To do this, I propose we:

  - Delete branch-2.10
  - Rename branch-2 to branch-2.10
  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT

This way we get all the current branch-2 fixes into the 2.10.x release
line. Then the commit chain will look like: trunk -> branch-3.2 ->
branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
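
A minimal sketch of what the commit chain above implies for a single fix,
assuming (this is not stated in the thread) that a fix lands on trunk first
and is then cherry-picked down the chain without skipping a branch. The names
COMMIT_CHAIN and branches_to_commit are hypothetical and only for illustration:

```python
# Commit chain as written in the proposal, newest line first.
COMMIT_CHAIN = [
    "trunk",
    "branch-3.2",
    "branch-3.1",
    "branch-2.10",
    "branch-2.9",
    "branch-2.8",
]


def branches_to_commit(oldest_target, chain=COMMIT_CHAIN):
    """Return every branch a fix passes through to reach oldest_target.

    Assumption: fixes start at the head of the chain (trunk) and are
    cherry-picked to each following branch in order.
    """
    if oldest_target not in chain:
        raise ValueError(f"unknown branch: {oldest_target}")
    return chain[: chain.index(oldest_target) + 1]


# A fix that must reach the 2.10.x line touches four branches.
print(branches_to_commit("branch-2.10"))
```

Under this reading, dropping the separate branch-2 removes one stop between
branch-3.1 and branch-2.10 for every backport.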

Thoughts?

Jonathan Hung

[1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html




Re: [DISCUSS] Making 2.10 the last minor 2.x release

2019-11-19 Thread epa...@apache.org
Hi Konstantin,

Sure, I understand those concerns. On the other hand, I worry about the
stability of 2.10, since we will be on it for a couple of years at least. I
worry that some committers may want to put new features into a branch 2
release, and without a branch-2, they will go directly into 2.10. Since we
don't always catch corner cases or performance problems for some time (usually
not until the release is deployed to a busy, 4-thousand node cluster), it may
be very difficult to back out those changes.

It sounds like I'm in the minority here, so I'm not nixing the idea, but I do
have these reservations.

Thanks,
-Eric



On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko 
 wrote: 
Hi Eric,

We had a long discussion on this list regarding making the 2.10 release the
last of the branch-2 releases. We intended 2.10 as a bridge release between
Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is not in
the picture right now, and many people may object to this idea.

I understand Jonathan's proposal as an attempt to
1. eliminate confusion about which branches people should commit their
   back-ports to
2. save engineering effort committing to more branches than necessary

"Branches are cheap" as our founder used to say. If we ever decide to
release 2.11 we can resurrect the branch.
Until then I am in favor of Jonathan's proposal +1.

Thanks,
--Konstantin


On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung  wrote:

> Thanks Eric for the comments - regarding your concerns, I feel the pros
> outweigh the cons. To me, the chances of patch releases on 2.10.x are much
> higher than a new 2.11 minor release. (There didn't seem to be many people
> outside of our company who expressed interest in getting new features to
> branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the 2.10.0
> release, there are 29 patches that have gone into branch-2 and 9 in
> branch-2.10, so the branches have already diverged quite a bit.
>
> In any case, we can always reverse this decision if we really need to, by
> recreating branch-2. But this proposal would reduce a lot of confusion IMO.
>
> Jonathan Hung
>
>
> On Fri, Nov 15, 2019 at 11:41 AM epa...@apache.org 
> wrote:
>
> > Thanks Jonathan for opening the discussion.
> >
> > I am not in favor of this proposal. 2.10 was very recently released, and
> > moving to 2.10 will take some time for the community. It seems premature to
> > make a decision at this point that there will never be a need for a 2.11
> > release.
> >
> > -Eric
> >
> >
> >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
> > jyhung2...@gmail.com> wrote:
> >
> > Hi folks,
> >
> > Given the release of 2.10.0, and the fact that it's intended to be a bridge
> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
> > release line in branch-2. Currently, the main issue is that there are many
> > fixes going into branch-2 (the theoretical 2.11.0) that are not going into
> > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> > likely never see the light of day unless they are backported to
> > branch-2.10.
> >
> > To do this, I propose we:
> >
> >  - Delete branch-2.10
> >  - Rename branch-2 to branch-2.10
> >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >
> > This way we get all the current branch-2 fixes into the 2.10.x release
> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >
> > Thoughts?
> >
> > Jonathan Hung
> >
> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >
>




[DISCUSS] Guidelines for Code cleanup JIRAs

2020-01-09 Thread epa...@apache.org
There was some discussion on https://issues.apache.org/jira/browse/YARN-9052
about concerns surrounding the costs/benefits of code cleanup JIRAs. This email
is to get the discussion going within a wider audience.

The positive points for code cleanup JIRAs:
- Clean up tech debt
- Make code more readable
- Make code more maintainable
- Make code more performant

The concerns regarding code cleanup JIRAs are as follows:
- If the changes only go into trunk, then contributors and committers trying to
  backport to prior releases will have to create and test multiple patch
  versions.
- Some have voiced concerns that code cleanup JIRAs may not be tested as
  thoroughly as features and bug fixes because functionality is not supposed to
  change.
- Any patches awaiting review that are touching the same code will have to be
  redone, re-tested, and re-reviewed.
- JIRAs that are opened for code cleanup and not worked on right away tend to
  clutter up the JIRA space.

Here are my opinions:
- Code changes of any kind force a non-trivial amount of overhead for other
  developers. For code cleanup JIRAs, sometimes the gains in usability,
  maintainability, and performance are worth the overhead (as in the case of
  YARN-9052).
- Before opening any JIRA, please always consider whether or not the added
  usability will outweigh the added pain you are causing other developers.
- If you believe the benefits outweigh the costs, please backport the changes
  yourself to all active lines. My preference is to port all the way back to
  2.10.
- Please don't run code analysis tools and then open many JIRAs that document
  those findings. That activity does not put any thought into this cost-benefit
  analysis.

Thanks everyone. I'm looking forward to your thoughts. I appreciate all you do
for the open source community and it is always a pleasure to work with you.
-Eric Payne




[ANNOUNCE] Jim Brennan is a new Hadoop Committer

2020-08-03 Thread epa...@apache.org
I am pleased to announce that Jim Brennan has accepted the invitation to become 
a Hadoop committer focusing on the YARN space.

Please reach out to Jim and welcome him in his new role.

Congratulations, Jim! Well-deserved!

-Eric Payne




2.x client accessing 3.x HDFS and vice versa

2020-08-03 Thread epa...@apache.org
Hello,

We are investigating upgrading to 3.x, but we are very concerned about the 
differences in the HDFS features, interfaces, etc. between 2.10 and 3.3+. Our 
requirements are to not have any cluster downtime and to allow 2.10 HDFS 
clients to communicate with 3.x clusters and 3.x HDFS clients to communicate 
with 2.10 clusters.

Have you encountered these use cases with your users and customers and, if so, 
how have they addressed the issues?

Thank you,
-Eric Payne




Re: 2.x client accessing 3.x HDFS and vice versa

2020-08-04 Thread epa...@apache.org
Thanks Craig. Are you enabling any special HDFS features? Specifically, are you 
enabling and using encryption zones?
-Eric


On Tuesday, August 4, 2020, 8:49:26 AM CDT, Craig Condit wrote:

Eric,

I can't speak to 2.10, but we have been running a production Hadoop 3.2.0
cluster for ~1 year now in parallel with a legacy 2.7.3 cluster, and have
clients from both communicating with HDFS on the other cluster frequently.

As usual, YMMV, but we haven't encountered any serious problems.

- Craig Condit

____

From: epa...@apache.org 
Sent: Monday, August 3, 2020 2:12 PM
To: HDFS Dev 
Subject: [EXTERNAL] 2.x client accessing 3.x HDFS and vice versa

Hello,

We are investigating upgrading to 3.x, but we are very concerned about the 
differences in the HDFS features, interfaces, etc. between 2.10 and 3.3+. Our 
requirements are to not have any cluster downtime and to allow 2.10 HDFS 
clients to communicate with 3.x clusters and 3.x HDFS clients to communicate 
with 2.10 clusters.

Have you encountered these use cases with your users and customers and, if so, 
how have they addressed the issues?

Thank you,
-Eric Payne




[Virtual MEETUP]: Migration to Hadoop 3

2020-08-24 Thread epa...@apache.org
Hello everyone!

We are considering migrating to Hadoop 3, and we would be very interested to
hear about your experiences. If you have migrated from Hadoop 2 to Hadoop 3
and can provide insights, please kindly consider attending the following:

Date: Wednesday, Aug 26, 2020
Time: 10:00 A.M. PDT / 12:00 P.M. CDT / 01:00 P.M. EDT / 05:00 P.M. GMT
Location: Zoom: https://cloudera.zoom.us/j/880548968

Hope to see you there!

Thank you!
Eric Payne
@ Verizon Media




Re: [Virtual MEETUP]: Migration to Hadoop 3

2020-08-26 Thread epa...@apache.org
Hello. Just a reminder that today I would like to invite you all to discuss your
experiences migrating from Hadoop 2 to Hadoop 3.

-Eric

On Monday, August 24, 2020, 1:58:37 PM CDT, epa...@apache.org 
 wrote: 

Hello everyone!

We are considering migrating to Hadoop 3, and we would be very interested to
hear about your experiences. If you have migrated from Hadoop 2 to Hadoop 3
and can provide insights, please kindly consider attending the following:

Date: Wednesday, Aug 26, 2020
Time: 10:00 A.M. PDT / 12:00 P.M. CDT / 01:00 P.M. EDT / 05:00 P.M. GMT
Location: Zoom: https://cloudera.zoom.us/j/880548968

Hope to see you there!

Thank you!
Eric Payne
@ Verizon Media




Re: [Virtual MEETUP]: Migration to Hadoop 3

2020-08-26 Thread epa...@apache.org
Thank you! This will be very helpful. And I really appreciate your
participation in today's meeting.
-Eric

On Wednesday, August 26, 2020, 12:36:38 PM CDT, Brahma Reddy Battula 
 wrote: 

Hi Eric,

check the following references for the same.

01/02/2020 Didi talked about their large scale HDFS cluster upgrade
experience.

Slides:
https://drive.google.com/open?id=1iwJ1asalYfgnOCBuE-RfeG-NpSocjIcy

Recording:
https://cloudera.zoom.us/rec/share/7MF_dLX0339OY5391xvkZP8NLrXieaa8gyZK-fYJnUkGOUUXvaUh5cl_6AVYetQl

Didi studied two upgrade approaches from the community documentation:
express upgrade and rolling upgrade. Rolling upgrade was selected.

Yahoo Japan was trying out an upgrade from hadoop-2.6 to hadoop-3.2.1

https://techblog.yahoo.co.jp/entry/20191206786320/

On Wed, Aug 26, 2020 at 6:56 PM epa...@apache.org  wrote:

> Hello. Just a reminder that today I would like to invite you all to
> discuss your
> experiences migrating from Hadoop 2 to Hadoop 3.
>
> -Eric
>
> On Monday, August 24, 2020, 1:58:37 PM CDT, epa...@apache.org <
> epa...@apache.org> wrote:
>
> Hello everyone!
>
> We are considering migrating to Hadoop 3, and we would be very interested to
> hear about your experiences. If you have migrated from Hadoop 2 to Hadoop 3
> and can provide insights, please kindly consider attending the following:
>
> Date: Wednesday, Aug 26, 2020
> Time: 10:00 A.M. PDT / 12:00 P.M. CDT / 01:00 P.M. EDT / 05:00 P.M. GMT
> Location: Zoom: https://cloudera.zoom.us/j/880548968
>
> Hope to see you there!
>
> Thank you!
> Eric Payne
> @ Verizon Media

>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>

-- 



--Brahma Reddy Battula





Re: [Virtual MEETUP]: Migration to Hadoop 3

2020-08-27 Thread epa...@apache.org
Wei-Chiu, our plans are tentative at the moment, but we have internally 
discussed migrating from 2.10 to 3.2.

And, thank you all again for the great participation, especially those of you 
in timezones where it was late in the evening.

-Eric

On Thursday, August 27, 2020, 12:47:43 AM CDT, Wei-Chiu Chuang 
 wrote: 

Thanks Brahma,

Eric, do you have a target Hadoop 3 release line in mind?

The "unofficial" plan here at Cloudera is to rebase our current dev
codebase from Hadoop 3.1.1 to 3.3 some time later. The Hadoop 3.1 code line
will approach its 3rd anniversary by this year's end so perhaps we can
start to sunset it.

On Wed, Aug 26, 2020 at 10:51 AM Brahma Reddy Battula 
wrote:

> One more update from me.
>
> We didn't face any issues with YARN. For HDFS, you can have a look at the
> following JIRAs.
>
> https://issues.apache.org/jira/browse/HDFS-13596
> https://issues.apache.org/jira/browse/HDFS-14396
> https://issues.apache.org/jira/browse/HDFS-14509
>
> The following JIRA is incompatible for ACL commands. Only hadoop-3 clients
> will work against a hadoop-3 server during the upgrade.
>
> https://issues.apache.org/jira/browse/HDFS-6984
>
>
>
> On Wed, Aug 26, 2020 at 11:06 PM Brahma Reddy Battula 
> wrote:
>
> >
> > Hi Eric,
> >
> > check the following references for the same.
> >
> > 01/02/2020 Didi talked about their large scale HDFS cluster upgrade
> > experience.
> >
> > Slides:
> > https://drive.google.com/open?id=1iwJ1asalYfgnOCBuE-RfeG-NpSocjIcy
> >
> > Recording:
> > https://cloudera.zoom.us/rec/share/7MF_dLX0339OY5391xvkZP8NLrXieaa8gyZK-fYJnUkGOUUXvaUh5cl_6AVYetQl
> >
> > Didi studied two upgrade approaches from the community documentation:
> > express upgrade and rolling upgrade. Rolling upgrade was selected.
> >
> > Yahoo Japan was trying out an upgrade from hadoop-2.6 to hadoop-3.2.1
> >
> > https://techblog.yahoo.co.jp/entry/20191206786320/
> >
> > On Wed, Aug 26, 2020 at 6:56 PM epa...@apache.org 
> > wrote:
> >
> >> Hello. Just a reminder that today I would like to invite you all to
> >> discuss your
> >> experiences migrating from Hadoop 2 to Hadoop 3.
> >>
> >> -Eric
> >>
> >> On Monday, August 24, 2020, 1:58:37 PM CDT, epa...@apache.org <
> >> epa...@apache.org> wrote:
> >>
> >> Hello everyone!
> >>
> >> We are considering migrating to Hadoop 3, and we would be very interested
> >> to hear about your experiences. If you have migrated from Hadoop 2 to
> >> Hadoop 3 and can provide insights, please kindly consider attending the
> >> following:
> >>
> >> Date: Wednesday, Aug 26, 2020
> >> Time: 10:00 A.M. PDT / 12:00 P.M. CDT / 01:00 P.M. EDT / 05:00 P.M. GMT
> >> Location: Zoom: https://cloudera.zoom.us/j/880548968
> >>
> >> Hope to see you there!
> >>
> >> Thank you!
> >> Eric Payne
> >> @ Verizon Media
> >>
> >
> > --
> >
> >
> >
> > --Brahma Reddy Battula
> >
>
>
> --
>
>
>
> --Brahma Reddy Battula
>




Re: [ANNOUNCE] Hui Fei is a new Apache Hadoop Committer

2020-09-24 Thread epa...@apache.org
Congratulations Hui Fei!

On Wednesday, September 23, 2020, 1:07:11 PM CDT, Wei-Chiu Chuang 
 wrote: 
I am pleased to announce that Hui Fei has accepted the invitation to become
a Hadoop committer.

He started contributing to the project in October 2016. Over the past 4
years he has contributed a lot in HDFS, especially in Erasure Coding,
Hadoop 3 upgrade, RBF and Standby Serving reads.

One of the biggest contributions is Hadoop 2->3 rolling upgrade support.
This was a major blocker for any existing Hadoop users to adopt Hadoop 3.
The adoption of Hadoop 3 has gone up after this. In the past the community
discussed a lot about Hadoop 3 rolling upgrade being a must-have, but no
one took the initiative to make it happen. I am personally very grateful
for this.

The work on EC is impressive as well. He managed to onboard EC in
production at scale, fixing tricky problems. Again, I am impressed and
grateful for the contribution in EC.

In addition to code contributions, he invested a lot in the community:

- Apache Hadoop Community 2019 Beijing Meetup
  https://blogs.apache.org/hadoop/entry/hadoop-community-meetup-beijing-aug
  where he discussed the operational experience of RBF in production
- Apache Hadoop Storage Community Sync Online
  https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit#heading=h.irqxw1iy16zo
  where he discussed the Hadoop 3 rolling upgrade support

Let's congratulate Hui for this new role!

Cheers,
Wei-Chiu Chuang (on behalf of the Apache Hadoop PMC)




Re: [DISCUSS] check style changes

2021-05-15 Thread epa...@apache.org
I would be fine with a discussion and vote on relaxing some checkstyle 
restrictions.

Regarding line length, my personal preference is to leave it at 80, but 80 is 
arbitrary and I would not oppose 100 if that's what people want.

Another one that I think should be relaxed is the limit on number of arguments 
to a method. I understand that a ton of arguments makes a method messy, but I 
find it irritating when I add an argument to something that is already over the 
limit and I get penalized for it. The ones I have seen are all constructor 
methods.

-Eric

On Thursday, May 13, 2021, 10:10:27 AM CDT, Sean Busbey wrote:

Hi folks!

I’d like to start cleaning up our nightly tests. As a bit of low hanging fruit 
I’d like to alter some of our check style rules to match what I think we’ve 
been doing in the community. How would folks prefer I make sure we have 
consensus on such changes?

As an example, our last nightly run had ~81k check style violations (it’s a big 
number but it’s not that bad given the size of the repo) and roughly 16% of 
those were for line lengths in excess of 80 characters but <= 100 characters.

If I wanted to change our line length check to be 100 characters rather than 
the default of 80, would folks rather I have a DISCUSS thread first? Or would 
they rather a Jira + PR with the discussion of the merits happening there?
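
A rough sketch of the arithmetic above, and of how one might bucket source
lines by length to estimate what moving the limit from 80 to 100 would clear.
This is plain Python, not Checkstyle's own line-length check (which may treat
tabs and ignored patterns differently), and the function names are
hypothetical:

```python
def estimated_relaxed(total_violations, share):
    """Violations that would disappear if the given share were allowed."""
    return round(total_violations * share)


def bucket_line_lengths(lines, old_limit=80, new_limit=100):
    """Count lines within the old limit, within the new limit only, or over both."""
    counts = {"ok": 0, "relaxed": 0, "over": 0}
    for line in lines:
        length = len(line.rstrip("\n"))
        if length <= old_limit:
            counts["ok"] += 1
        elif length <= new_limit:
            counts["relaxed"] += 1  # a violation today, fine at 100
        else:
            counts["over"] += 1  # still a violation at 100
    return counts


# Roughly 16% of ~81k violations, using the figures quoted above.
print(estimated_relaxed(81_000, 0.16))

sample = ["a" * 80, "b" * 81, "c" * 100, "d" * 101]
print(bucket_line_lengths(sample))
```

With the quoted figures, the change would clear on the order of 13k of the
~81k reported violations.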

—
busbey






Re: Hadoop Community Sync Up Schedule

2019-08-20 Thread epa...@apache.org
Hi Wangda,
Thank you for continuing to keep us moving forward and refining these vital 
sync-ups.

> 3) Update the US [YARN/MapReduce] sync up time from 9AM to 10AM PDT.

That puts it at noon central time, which is during our lunch hour. However, I 
am +1 for this if we are able to allow greater participation from folks on the 
U.S. west coast.
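
For anyone checking the conversion, a small sketch using fixed UTC offsets.
It assumes daylight time is in effect on both ends (PDT = UTC-7, CDT = UTC-5);
the example date and the helper name to_central are hypothetical:

```python
from datetime import datetime, timedelta, timezone

# Fixed daylight-time offsets; an assumption that holds only while DST applies.
PDT = timezone(timedelta(hours=-7), "PDT")
CDT = timezone(timedelta(hours=-5), "CDT")


def to_central(hour_pdt, day=(2019, 8, 21)):
    """Convert a whole-hour PDT meeting time on the given day to CDT."""
    meeting = datetime(*day, hour_pdt, 0, tzinfo=PDT)
    return meeting.astimezone(CDT)


# The proposed 10AM PDT slot lands at noon central; the old 9AM slot at 11AM.
print(to_central(10).strftime("%H:%M %Z"))
print(to_central(9).strftime("%H:%M %Z"))
```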


-Eric Payne


On Monday, August 19, 2019, 10:32:20 PM CDT, Wangda Tan  
wrote: 

> Hi folks, 
> 
> We have run community sync up for 1.5 months. I spoke to folks offline and 
> got some feedback. Here's a summary of what I've observed from sync ups and 
> talked to organizers.
> 
> Following sync ups have very good participants (sometimes 10+ folks joined): 
> - YARN/MR monthly sync up in APAC (Mandarin)
> - HDFS monthly sync up in APAC (Mandarin). 
> - Submarine weekly sync up in APAC (Mandarin).
> 
> Following sync up have OK-ish participants: (3-5 folks joined). 
> - Storage monthly sync up in APAC (English)
> - Storage bi-weekly sync up in US (English)
> - YARN bi-weekly sync up in US (English).
> 
> Following sync ups don't have good participants: (Skipped a couple of times).
> - YARN monthly sync up in APAC (English). 
> - Submarine bi-weekly sync up in US (English).
> 
> So I'd like to propose the following changes and fixes of the schedule: 
> 1) Cancel the YARN/MR monthly sync up in APAC (English). Folks from APAC who 
> speak English can choose to join the US session. 
> 2) Cancel the Submarine bi-weekly sync up in US (English). Now Submarine 
> developers and users are fast-growing in Mandarin-speaking areas. We can 
> resume the sync if we do see demands from English-
> speaking areas. 
> 3) Update the US sync up time from 9AM to 10AM PDT. 9AM is too early for most
> of the west-coast folks.
>
> Following are fixes for the schedule:
> 1) In the proposal, the repeats are not set up properly. (I used bi-weekly
> instead of 2nd/4th week as the repeat frequency.) I'd like to fix the
> frequency on Thu and it will take effect starting next week.
> 
> Overall, thanks for everybody who participated in the sync ups. I do see 
> community contributions grow in the last one month! 
> 
> Any thoughts about the proposal?
> 
> Thanks, 
> Wangda




Re: Hadoop Community Sync Up Schedule

2019-08-21 Thread epa...@apache.org
Let's go with bi-weekly (every 2 weeks). Sometimes this gives us 3 sync-ups in 
one month, which I think is fine.
-Eric Payne
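
A short sketch of why a strict every-two-weeks schedule sometimes yields three
sync-ups in one month. The start date (Thursday, 2019-08-01) and the function
name biweekly_counts are hypothetical, chosen only to illustrate the point:

```python
from collections import Counter
from datetime import date, timedelta


def biweekly_counts(start, occurrences):
    """Count meetings per (year, month) for a meeting repeating every 14 days."""
    counts = Counter()
    for i in range(occurrences):
        day = start + timedelta(weeks=2 * i)
        counts[(day.year, day.month)] += 1
    return counts


counts = biweekly_counts(date(2019, 8, 1), 14)

# Months that get three meetings instead of the usual two.
print(sorted(month for month, n in counts.items() if n == 3))
```

A "2nd/4th week" rule would cap every month at two meetings, at the cost of an
occasional three-week gap between sync-ups.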

On Wednesday, August 21, 2019, 5:01:52 AM CDT, Wangda Tan  
wrote: 
> 
> For folks in other US time zones: is 11am PDT better, or would 10am PDT be
> better? I will be fine with both.
> 
> Hi Matt,
> 
> Thanks for mentioning this issue, this is exactly the issue I saw 🤣.
> 
> Basically there’re two options:
> 
> - a. weekly, bi-weekly (for odd/even week) and every four months.
> - b. weekly, 1st/3rd week or 2nd/4th week, x-th week monthly.
> 
> I’m not sure which one is easier for people to understand as the issue you
> mentioned.
> 
> After thinking about it. I prefer a. since it is more consistent for
> audience and not disrupted because of calendar.
> 
> If we choose a. I will redo the proposal and make it align with a.
> 
> Thoughts?
> 
> Thanks,
> Wangda





Re: branch-3. Was it a created by mistake?

2019-08-28 Thread epa...@apache.org
Yes, I think it must have been. branch-3 is not needed right now. I would be
in favor of its removal, since it is confusing for committers.

Thanks,
-Eric


On Tuesday, August 27, 2019, 6:24:09 PM CDT, Wei-Chiu Chuang 
 wrote:  
 
 I just realized there is a branch-3 in the Hadoop repo. Was this created by
mistake?

I don't think we've decided to create a branch-3. It also looks like the
branch is rarely used. The last commit was

commit bf90a27b51b1f1ac102fa861eb28025d21aad19b (origin/branch-3, branch-3)
Author: Chen Liang 
Date:  Thu Nov 29 13:31:58 2018 -0800

    HDFS-13547. Add ingress port based sasl resolver. Contributed by Chen Liang.
  

Re: [VOTE] Release Hadoop-3.1.3-RC0

2019-09-19 Thread epa...@apache.org



+1 (binding)

Thanks Zhankun for all of your hard work on this release.

I downloaded and built the source and ran it on an insecure multi-node pseudo 
cluster.

I performed various YARN manual tests, including creating custom resources, 
creating queue submission ACLs, and queue refreshes.

One concern is that preemption does not seem to be working when only the custom 
resources are over the queue capacity, but I don't think this is something 
introduced with this release.

-Eric



On Thursday, September 12, 2019, 3:04:44 AM CDT, Zhankun Tang wrote:

Hi folks,

Thanks to everyone's help on this release. Special thanks to Rohith,
Wei-Chiu, Akira, Sunil, Wangda!

I have created a release candidate (RC0) for Apache Hadoop 3.1.3.

The RC release artifacts are available at:
http://home.apache.org/~ztang/hadoop-3.1.3-RC0/

The maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapachehadoop-1228/

The RC tag in git is here:
https://github.com/apache/hadoop/tree/release-3.1.3-RC0

And my public key is at:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

*This vote will run for 7 days, ending on Sept.19th at 11:59 pm PST.*

For the testing, I have run several Spark and distributed shell jobs in my
pseudo cluster.

My +1 (non-binding) to start.

BR,
Zhankun

On Wed, 4 Sep 2019 at 15:56, zhankun tang  wrote:

> Hi all,
>
> Thanks for everyone helping in resolving all the blockers targeting Hadoop
> 3.1.3 [1]. We've cleared all the blockers and moved the non-blocker issues
> out to 3.1.4.
>
> I'll cut the branch today and call a release vote soon. Thanks!
>
>
> [1]. https://s.apache.org/5hj5i
>
> BR,
> Zhankun
>
>
> On Wed, 21 Aug 2019 at 12:38, Zhankun Tang  wrote:
>
>> Hi folks,
>>
>> Apache Hadoop 3.1.2 was released in Feb 2019.
>>
>> More than 6 months have passed, and there are 246 fixes [1], plus 2 blocker
>> and 4 critical issues [2].
>>
>> (As Wei-Chiu Chuang mentioned, HDFS-13596 will be another blocker.)
>>
>>
>> I propose my plan to do a maintenance release of 3.1.3 in the next few
>> (one or two) weeks.
>>
>> Hadoop 3.1.3 release plan:
>>
>> Code Freezing Date: *25th August 2019 PDT*
>>
>> Release Date: *31st August 2019 PDT*
>>
>>
>> Please feel free to share your insights on this. Thanks!
>>
>>
>> [1] https://s.apache.org/zw8l5
>>
>> [2] https://s.apache.org/fjol5
>>
>>
>> BR,
>>
>> Zhankun
>>
>
