Re: [VOTE] Release Apache Hadoop 3.3.0 - RC0

2020-07-13 Thread Rakesh Radhakrishnan
Thanks Brahma for getting this out!

+1 (binding)

Verified the following and looks fine to me.
 * Built from source with CentOS 7.4 and OpenJDK 1.8.0_232.
 * Deployed 3-node cluster.
 * Verified HDFS web UIs.
 * Tried out a few basic hdfs shell commands.
 * Ran sample TeraSort and WordCount MR jobs.

-Rakesh-

On Tue, Jul 7, 2020 at 3:57 AM Brahma Reddy Battula 
wrote:

> Hi folks,
>
> This is the first release candidate for the first release of the Apache
> Hadoop 3.3.0 line.
>
> It contains *1644 [1]* fixed JIRA issues since 3.2.1, which include a lot
> of features and improvements (read the full set of release notes).
>
> The feature additions below are the highlights of the release.
>
> - ARM Support
> - Enhancements and new features in S3A, S3Guard, and ABFS
> - Java 11 Runtime support and TLS 1.3.
> - Support Tencent Cloud COS File System.
> - Added security to HDFS Router.
> - Support for non-volatile storage class memory (SCM) in HDFS cache directives
> - Support Interactive Docker Shell for running Containers.
> - Scheduling of opportunistic containers
> - A pluggable device plugin framework to ease vendor plugin development
>
> *The RC0 artifacts are at*:
> http://home.apache.org/~brahma/Hadoop-3.3.0-RC0/
>
> *This is the first release to include an ARM binary; please check it out.*
> *The RC tag is* release-3.3.0-RC0.
>
>
> *The maven artifacts are hosted here:*
> https://repository.apache.org/content/repositories/orgapachehadoop-1271/
>
> *My public key is available here:*
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> The vote will run for 5 weekdays, until Tuesday, July 13 at 3:50 AM IST.
>
>
> I have done some testing with my pseudo cluster. My +1 to start.
>
>
>
> Regards,
> Brahma Reddy Battula
>
>
> 1. project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.3.0) AND
> fixVersion not in (3.2.0, 3.2.1, 3.1.3) AND status = Resolved ORDER BY
> fixVersion ASC
>


Re: [VOTE] Apache Hadoop Ozone 1.0.0 RC1

2020-08-31 Thread Rakesh Radhakrishnan
Thanks Sammi for getting this out!

+1 (binding)

 * Verified signatures.
 * Built from source.
 * Deployed a small non-HA, unsecured cluster.
 * Verified basic Ozone file system operations.
 * Tried out a few basic Ozone shell commands - create, list, delete
 * Ran a few Freon benchmark tests.

Thanks,
Rakesh

On Tue, Sep 1, 2020 at 11:53 AM Jitendra Pandey
 wrote:

> +1 (binding)
>
> 1. Verified signatures
> 2. Built from source
> 3. Deployed with Docker
> 4. Tested with basic S3 APIs.
>
> On Tue, Aug 25, 2020 at 7:01 AM Sammi Chen  wrote:
>
> > RC1 artifacts are at:
> > https://home.apache.org/~sammichen/ozone-1.0.0-rc1/
> > 
> >
> > Maven artifacts are staged at:
> > https://repository.apache.org/content/repositories/orgapachehadoop-1278
> >
> > The public key used for signing the artifacts can be found at:
> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >
> > The RC1 tag in github is at:
> > https://github.com/apache/hadoop-ozone/releases/tag/ozone-1.0.0-RC1
> > 
> >
> > The change log of RC1 adds:
> > 1. HDDS-4063. Fix InstallSnapshot in OM HA
> > 2. HDDS-4139. Update version number in upgrade tests.
> > 3. HDDS-4144. Update version info in hadoop client dependency readme
> >
> > *The vote will run for 7 days, ending on Aug 31st, 2020 at 11:59 pm PST.*
> >
> > Thanks,
> > Sammi Chen
> >
>


Re: [VOTE] Release Apache Hadoop 2.8.2 (RC1)

2017-10-24 Thread Rakesh Radhakrishnan
Thanks Junping for getting this out.

+1 (non-binding)

* Built from source on CentOS 7.3.1611, jdk1.8.0_111
* Deployed 3 node cluster
* Ran some sample jobs
* Ran balancer
* Operated HDFS from the command line: ls, put, dfsadmin, etc.
* HDFS Namenode UI looks good


Thanks,
Rakesh

On Fri, Oct 20, 2017 at 6:12 AM, Junping Du  wrote:

> Hi folks,
>  I've created our new release candidate (RC1) for Apache Hadoop 2.8.2.
>
>  Apache Hadoop 2.8.2 is the first stable release of the Hadoop 2.8 line
> and will be the latest stable/production release for Apache Hadoop. It
> includes 315 newly fixed issues since 2.8.1, of which 69 fixes are marked
> as blocker/critical.
>
>   More information about the 2.8.2 release plan can be found here:
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release
>
>   New RC is available at:
> http://home.apache.org/~junping_du/hadoop-2.8.2-RC1
>
>   The RC tag in git is: release-2.8.2-RC1, and the latest commit id
> is: 66c47f2a01ad9637879e95f80c41f798373828fb
>
>   The maven artifacts are available via repository.apache.org at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1064
>
>   Please try the release and vote; the vote will run for the usual 5
> days, ending on 10/24/2017 at 6pm PST.
>
> Thanks,
>
> Junping
>
>


Re: [VOTE] Release Apache Hadoop 3.0.0 RC0

2017-11-20 Thread Rakesh Radhakrishnan
Thanks Andrew for getting this out !

+1 (non-binding)

* Built from source on CentOS 7.3.1611, jdk1.8.0_111
* Deployed a non-HA cluster and tested a few EC file operations.
* Ran basic shell commands (ls, mkdir, put, get, ec, dfsadmin).
* Ran some sample jobs.
* HDFS Namenode UI looks good.

Thanks,
Rakesh

On Wed, Nov 15, 2017 at 3:04 AM, Andrew Wang 
wrote:

> Hi folks,
>
> Thanks as always to the many, many contributors who helped with this
> release. I've created RC0 for Apache Hadoop 3.0.0. The artifacts are
> available here:
>
> http://people.apache.org/~wang/3.0.0-RC0/
>
> This vote will run 5 days, ending on Nov 19th at 1:30pm Pacific.
>
> 3.0.0 GA contains 291 fixed JIRA issues since 3.0.0-beta1. Notable
> additions include the merge of YARN resource types, API-based configuration
> of the CapacityScheduler, and HDFS router-based federation.
>
> I've done my traditional testing with a pseudo cluster and a Pi job. My +1
> to start.
>
> Best,
> Andrew
>


Re: [VOTE] Adopt HDSL as a new Hadoop subproject

2018-03-27 Thread Rakesh Radhakrishnan
+1 for the sub-project idea. Thanks to everyone who contributed!

Regards,
Rakesh

On Tue, Mar 27, 2018 at 4:46 PM, Jack Liu  wrote:

>  +1 (non-binding)
>
>
> On Tue, Mar 27, 2018 at 2:16 AM, Tsuyoshi Ozawa  wrote:
>
> > +1(binding),
> >
> > - Tsuyoshi
> >
> > On Tue, Mar 20, 2018 at 14:21 Owen O'Malley 
> > wrote:
> >
> > > All,
> > >
> > > Following our discussions on the previous thread (Merging branch
> > > HDFS-7240 to trunk), I'd like to propose the following:
> > >
> > > * HDSL become a subproject of Hadoop.
> > > * HDSL will release separately from Hadoop. Hadoop releases will not
> > > contain HDSL and vice versa.
> > > * HDSL will get its own jira instance so that the release tags stay
> > > separate.
> > > * On trunk (as opposed to release branches) HDSL will be a separate
> > > module in Hadoop's source tree. This will enable the HDSL team to work
> > > on their trunk and the Hadoop trunk without making releases for every
> > > change.
> > > * Hadoop's trunk will only build HDSL if a non-default profile is
> > > enabled.
> > > * When Hadoop creates a release branch, the RM will delete the HDSL
> > > module from the branch.
> > > * HDSL will have its own Yetus checks and won't cause failures in the
> > > Hadoop patch check.
> > >
> > > I think this accomplishes most of the goals of encouraging HDSL
> > > development while minimizing the potential for disruption of HDFS
> > > development.
> > >
> > > The vote will run the standard 7 days and requires a lazy 2/3 vote. PMC
> > > votes are binding, but everyone is encouraged to vote.
> > >
> > > +1 (binding)
> > >
> > > .. Owen
> > >
> >
>


Re: [VOTE] Release Apache Hadoop 2.9.1 (RC0)

2018-04-27 Thread Rakesh Radhakrishnan
Thanks Sammi for getting this out!

+1 (binding)

Verified the following and looks fine to me.

 * Built from source.
 * Deployed 3 node cluster with NameNode HA.
 * Verified HDFS web UIs.
 * Tried out HDFS shell commands.
 * Ran Mover, Balancer tools.
 * Ran sample MapReduce jobs.


Rakesh

On Thu, Apr 19, 2018 at 2:27 PM, Chen, Sammi  wrote:

> Hi all,
>
> This is the first dot release of Apache Hadoop 2.9 line since 2.9.0 was
> released on November 17, 2017.
>
> It includes 208 changes. Among them are 9 blockers and 15 critical issues;
> the rest are normal bug fixes and feature improvements.
>
> Thanks to the many who contributed to the 2.9.1 development.
>
> The artifacts are available here:
> https://dist.apache.org/repos/dist/dev/hadoop/2.9.1-RC0/
>
> The RC tag in git is release-2.9.1-RC0. Last git commit SHA is
> e30710aea4e6e55e69372929106cf119af06fd0e.
>
> The maven artifacts are available at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1115/
>
> My public key is available from:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> Please try the release and vote; the vote will run for the usual 5 days,
> ending on 4/25/2018 PST.
>
> Also, I would like to thank Lei (Eddy) Xu and Chris Douglas for their help
> during the RC preparation.
>
> Bests,
> Sammi Chen
>


Re: [VOTE] Merge Storage Policy Satisfier (SPS) [HDFS-10285] feature branch to trunk

2018-08-07 Thread Rakesh Radhakrishnan
+1

Thanks,
Rakesh

On Wed, Aug 1, 2018 at 12:08 PM, Uma Maheswara Rao G 
wrote:

> Hi All,
>
>
>
>  Based on the positive responses in the JIRA discussion and no objections
> on the DISCUSS thread below [1], I am converting this into a voting thread.
>
>
>
>  Over the last couple of weeks we spent time testing the feature, and so
> far it is working fine. Surendra uploaded a test report to HDFS-10285: [2]
>
>
>
>  In this phase, SPS can be run outside of the Namenode only; as a next
> phase we will continue to discuss and work on enabling it as an internal
> SPS, as explained below. We have a clean QA report on the branch, and if
> any static-tool comments are triggered later while this thread is running,
> we will make sure to fix them before the merge. We have committed and will
> continue to improve the code on trunk. Please refer to HDFS-10285 for
> discussion details.
>
>
>
>  This has been a long effort and we're grateful for the support we've
> received from the community. In particular, thanks to Andrew Wang, Anoop
> Sam John, Anu Engineer, Chris Douglas, Daryn Sharp, Du Jingcheng, Ewan
> Higgs, Jing Zhao, Kai Zheng, Rakesh R, Ramkrishna, Surendra Singh Lilhore,
> Thomas Demoor, Uma Maheswara Rao G, Vinayakumar, Virajith, Wei Zhou, and
> Yuanbo Liu. Without these members' efforts, this feature might not have
> reached this state.
>
>
>
> To start with, here is my +1
>
> The vote will end on 6th Aug.
>
>
>
> Regards,
>
> Uma
>
> [1]  https://s.apache.org/bhyu
> [2]  https://s.apache.org/AXvL
>
>
> On Wed, Jun 27, 2018 at 3:21 PM, Uma Maheswara Rao G  >
> wrote:
>
> > Hi All,
> >
> >   After long discussions (offline and on JIRA) on SPS, we came to a
> > conclusion on JIRA (HDFS-10285) that we will go ahead with the external
> > SPS merge in the first phase. In this phase the process will not be
> > running inside the Namenode.
> >   We will continue the discussion on internal SPS. The current code base
> > supports both the internal and external options. We have review comments
> > for internal SPS which need some additional work for analysis, testing,
> > etc. We will move the internal SPS work under HDFS-12226 (follow-on work
> > for SPS in NN). We are working on the cleanup task HDFS-13076 for the
> > merge.
> > For more clarity on Internal and External SPS proposal thoughts, please
> > refer to JIRA HDFS-10285.
> >
> > If there are no objections with this, I will go ahead for voting soon.
> >
> > Regards,
> > Uma
> >
> > On Fri, Nov 17, 2017 at 3:16 PM, Uma Maheswara Rao G <
> hadoop@gmail.com
> > > wrote:
> >
> >> Update: We worked on the review comments and additional JIRAs above
> >> mentioned.
> >>
> >> >1. After the feedback from Andrew, Eddy, and Xiao in JIRA reviews, we
> >> planned to take up support for a recursive API. HDFS-12291
> >> <https://issues.apache.org/jira/browse/HDFS-12291>
> >>
> >> We provided the recursive API support now.
> >>
> >> >2. Xattr optimizations HDFS-12225
> >> <https://issues.apache.org/jira/browse/HDFS-12225>
> >> Improved this portion as well
> >>
> >> >3. A few other review comments already fixed and committed: HDFS-12214
> >> <https://issues.apache.org/jira/browse/HDFS-12214>
> >> Fixed the comments.
> >>
> >> We are continuing to test the feature and it is working well so far.
> >> Also, we uploaded a combined patch and got a good QA report.
> >>
> >> If there are no further objections, we would like to go for the merge
> >> vote tomorrow. Please note that by default this feature will be disabled.
> >>
> >> Regards,
> >> Uma
> >>
> >> On Fri, Aug 18, 2017 at 11:27 PM, Gangumalla, Uma <
> >> uma.ganguma...@intel.com> wrote:
> >>
> >>> Hi Andrew,
> >>>
> >>> >Great to hear. It'd be nice to define which use cases are met by the
> >>> current version of SPS, and which will be handled after the merge.
> >>> After the discussions in JIRA, we planned to support a recursive API as
> >>> well. The primary use case we planned for was HBase. Please check the
> >>> next point for use-case details.
> >>>
> >>> >A bit more detail in the design doc on how HBase would use this
> feature
> >>> would also be helpful. Is there an HBase JIRA already?
> >>> Please find the use-case details at this comment in JIRA:
> >>> https://issues.apache.org/jira/browse/HDFS-10285?focusedCommentId=16120227&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16120227
> >>>
> >>> >I also spent some more time with the design doc and posted a few
> >>> questions on the JIRA.
> >>> Thank you for the reviews.
> >>>
> >>> To summarize the discussions in JIRA:
> >>> 1. After the feedback from Andrew, Eddy, and Xiao in JIRA reviews, we
> >>> planned to take up support for a recursive API. HDFS-12291
> >>> <https://issues.apache.org/jira/browse/HDFS-12291> (Rakesh started the
> >>> work on it)
> >>> 2. Xattr optimizations HDFS-12225
> >>> <https://issues.apache.org/jira/browse/HDFS-12225> (patch available)
> >>> 3. A few other review comments already fixed and committed: HDFS-12214
> >>> <https://issues.apache.org/jira/browse/HDFS-12214>
> >>>
> >>>

Re: HDFS Erasuring Coding Block placement policy related reconstruction work not scheduled appropriately

2016-06-09 Thread Rakesh Radhakrishnan
Thanks Rui for reporting this.

With "RS-DEFAULT-6-3-64k EC policy" EC file will have 6 data blocks and 3
parity blocks. Like you described initially the cluster has 5 racks, so the
first 5 data blocks will use those racks. Now while adding rack-6,
reconstruction task will be scheduled for placing 6th data block in rack-6.
Presently I could see for an EC file
"BlockManager#isPlacementPolicySatisfied()" is using "#numDataUnits" count
to verify that the block's placement meets requirement of placement policy,
i.e. replicas are placed on no less than minRacks racks in the system.
Thats the reason while adding rack-7 or more racks its not considering the
parity blocks count and not scheduling further reconstruction tasks to
place the parity blocks. I couldn't see any specific reason not to consider
the parity blocks to most racks. IMHO, its good to distribute all 9 blocks
to 9 diff racks, probably you can file a jira to discuss and reach to an
agreement.
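
To make that concrete, here is a simplified sketch of the rack-count check
under discussion (illustrative only, not the actual BlockManager code; the
method and variable names are assumptions):

    // Illustrative placement check for a striped block group (RS-6-3).
    // Deriving the requirement from numDataUnits alone stops reconstruction
    // at 6 racks; counting parity units as well would keep spreading the
    // blocks up to all 9 racks.
    static boolean isPlacementSatisfied(int racksWithBlocks, int clusterRacks,
                                        int numDataUnits, int numParityUnits) {
        int totalBlocks = numDataUnits + numParityUnits;         // 9 for RS-6-3
        int requiredRacks = Math.min(totalBlocks, clusterRacks);  // proposed
        // int requiredRacks = Math.min(numDataUnits, clusterRacks); // current
        return racksWithBlocks >= requiredRacks;
    }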

Thanks,
Rakesh.

On Thu, Jun 9, 2016 at 11:16 AM, Rui Gao  wrote:

> Hi all,
>
> We found out that under the RS-DEFAULT-6-3-64k EC policy, if an EC file was
> written to 5 racks, reconstruction work would be scheduled when
> the 6th rack is added, while adding the 7th or more racks would not
> trigger reconstruction work. Based on
> “BlockPlacementPolicyRackFaultTolerant.java”,
> an EC file should be scheduled to distribute to 9 racks if possible.
>
> May I file a JIRA to address this issue?
>
> Looking forward to your opinions.
> Thank you
>
> Gao Rui
>
>
>


Re: Improving recovery performance for degraded reads

2016-07-22 Thread Rakesh Radhakrishnan
Hi Roy,

Thanks for your interest in the HDFS erasure coding feature, and for
helping us make it more attractive to users by sharing performance
improvement ideas.

Presently, the reconstruction work is implemented in a centralized manner,
in which the reconstruction task is given to one data node (the first in
the pipeline). For example, with a (k, m) erasure code schema, assume one
chunk (say c bytes) is lost because of a disk or server failure; then k * c
bytes of data need to be retrieved from k servers to recover the lost data.
The reconstructing data node will fetch k chunks (belonging to the same
stripe as the failed chunk) from k different servers and perform decoding
to rebuild the lost data chunk. Yes, this k-factor increases the network
traffic and causes reconstruction to be very slow. IIUC, this point came up
during implementation, but I think priority was given to supporting the
basic functionality first. I can see quite a few JIRA tasks (HDFS-7717,
HDFS-7344) that discuss distributing the coding work to data nodes,
including converting a file to a striped layout, reconstruction, error
handling, etc. But I feel there is still room for discussing/implementing
new approaches to get better performance results.
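
To put rough numbers on that k-factor (a hypothetical example, assuming an
RS-6-3 schema and a 128 MB chunk):

    // Network cost of rebuilding one lost chunk under a (k=6, m=3) schema.
    long c = 128L * 1024 * 1024;      // size of the lost chunk, in bytes
    int k = 6;                        // chunks that must be read for decoding
    long networkBytes = k * c;        // 768 MB read to rebuild 128 MB
    // Plain replication would copy only c bytes from a single replica.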

In the shared paper, it's mentioned that the Partial-Parallel-Repair
technique was successfully implemented on top of the Quantcast File System
(QFS), which supports RS-based erasure-coded storage, with promising
results. That's really encouraging for us. I haven't gone through this
paper deeply; it would be really great if you (or I, or some other folks)
could come up with thoughts on discussing/implementing similar mechanisms
in HDFS as well. Mostly, we will kick-start the performance improvement
activities after the much-awaited 3.0.0-alpha release :)

>>>> Also, I would like to know what others have done to sustain good
>>>> performance even under failures (other than keeping fail-over replicas).
I don't have much idea about this part; probably some other folks can
pitch in and share their thoughts.

Regards,
Rakesh

On Fri, Jul 22, 2016 at 2:03 PM, Roy Leonard 
wrote:

> Greetings!
>
> We are evaluating erasure coding on HDFS to reduce storage cost.
> However, the degraded read latency seems like a crucial bottleneck for our
> system.
> After exploring some strategies for alleviating the pain of degraded read
> latency,
> I found a "tree-like recovery" technique might be useful, as described in
> the following paper:
> "Partial-parallel-repair (PPR): a distributed technique for repairing
> erasure coded storage" (Eurosys-2016)
> http://dl.acm.org/citation.cfm?id=2901328
>
> My question is:
>
> Do you already have such tree-like recovery implemented in HDFS-EC? If not,
> do you have any plans to add a similar technique in the near future?
>
> Also, I would like to know what others have done to sustain good
> performance even under failures (other than keeping fail-over replicas).
>
> Regards,
> R.
>


Re: Improving recovery performance for degraded reads

2016-07-22 Thread Rakesh Radhakrishnan
I'm adding one more point to the above. In my previous reply, I explained
the striped block reconstruction task which is triggered by the Namenode on
identifying a missing/bad block. Similarly, in the case of an HDFS client
read failure, the client currently submits read requests internally to
fetch all the 'k' chunks (belonging to the same stripe as the failed chunk)
from k data nodes and performs decoding to rebuild the lost data chunk on
the client side.
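
At the decoding step, this boils down to something like the following (a
minimal sketch against Hadoop's RawErasureDecoder interface; the chunk
fetching, buffer sizing and error handling are elided, and the variable
names are assumptions):

    // One slot per block in the stripe: k data + m parity. A null input
    // marks a chunk that could not be read; erasedIndexes says what to rebuild.
    ByteBuffer[] inputs = new ByteBuffer[k + m];     // fetched cells, null if lost
    int[] erasedIndexes = new int[] { failedIndex }; // slot(s) to reconstruct
    ByteBuffer[] outputs = new ByteBuffer[] { ByteBuffer.allocate(cellSize) };
    decoder.decode(inputs, erasedIndexes, outputs);  // rebuilds the lost cell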

Regards,
Rakesh

On Fri, Jul 22, 2016 at 5:43 PM, Rakesh Radhakrishnan 
wrote:

> Hi Roy,
>
> Thanks for your interest in the HDFS erasure coding feature, and for
> helping us make it more attractive to users by sharing performance
> improvement ideas.
>
> Presently, the reconstruction work is implemented in a centralized manner,
> in which the reconstruction task is given to one data node (the first in
> the pipeline). For example, with a (k, m) erasure code schema, assume one
> chunk (say c bytes) is lost because of a disk or server failure; then k * c
> bytes of data need to be retrieved from k servers to recover the lost data.
> The reconstructing data node will fetch k chunks (belonging to the same
> stripe as the failed chunk) from k different servers and perform decoding
> to rebuild the lost data chunk. Yes, this k-factor increases the network
> traffic and causes reconstruction to be very slow. IIUC, this point came up
> during implementation, but I think priority was given to supporting the
> basic functionality first. I can see quite a few JIRA tasks (HDFS-7717,
> HDFS-7344) that discuss distributing the coding work to data nodes,
> including converting a file to a striped layout, reconstruction, error
> handling, etc. But I feel there is still room for discussing/implementing
> new approaches to get better performance results.
>
> In the shared paper, it's mentioned that the Partial-Parallel-Repair
> technique was successfully implemented on top of the Quantcast File System
> (QFS), which supports RS-based erasure-coded storage, with promising
> results. That's really encouraging for us. I haven't gone through this
> paper deeply; it would be really great if you (or I, or some other folks)
> could come up with thoughts on discussing/implementing similar mechanisms
> in HDFS as well. Mostly, we will kick-start the performance improvement
> activities after the much-awaited 3.0.0-alpha release :)
>
> >>>> Also, I would like to know what others have done to sustain good
> >>>> performance even under failures (other than keeping fail-over replicas).
> I don't have much idea about this part; probably some other folks can
> pitch in and share their thoughts.
>
> Regards,
> Rakesh
>
> On Fri, Jul 22, 2016 at 2:03 PM, Roy Leonard 
> wrote:
>
>> Greetings!
>>
>> We are evaluating erasure coding on HDFS to reduce storage cost.
>> However, the degraded read latency seems like a crucial bottleneck for our
>> system.
>> After exploring some strategies for alleviating the pain of degraded read
>> latency,
>> I found a "tree-like recovery" technique might be useful, as described in
>> the following paper:
>> "Partial-parallel-repair (PPR): a distributed technique for repairing
>> erasure coded storage" (Eurosys-2016)
>> http://dl.acm.org/citation.cfm?id=2901328
>>
>> My question is:
>>
>> Do you already have such tree-like recovery implemented in HDFS-EC? If not,
>> do you have any plans to add a similar technique in the near future?
>>
>> Also, I would like to know what others have done to sustain good
>> performance even under failures (other than keeping fail-over replicas).
>>
>> Regards,
>> R.
>>
>
>


Re: [VOTE] Release Apache Hadoop 2.7.3 RC0

2016-07-26 Thread Rakesh Radhakrishnan
Thank you Vinod.

+1 (non-binding)

- downloaded and built from source
- deployed HDFS-HA cluster and tested few switching behaviors
- executed few hdfs commands from command line
- viewed basic UI
- ran HDFS/Common unit tests
- checked LICENSE and NOTICE files

Regards,
Rakesh
Intel

On Tue, Jul 26, 2016 at 11:36 AM, Zhihai Xu  wrote:

> Thanks Vinod.
>
> +1 (non-binding)
>
> * Downloaded and built from source
> * Checked LICENSE and NOTICE
> * Deployed a pseudo cluster
> * Ran through MR and HDFS tests
> * verified basic HDFS operations and Pi job.
>
> Zhihai
>
> On Fri, Jul 22, 2016 at 7:15 PM, Vinod Kumar Vavilapalli <
> vino...@apache.org
> > wrote:
>
> > Hi all,
> >
> > I've created a release candidate RC0 for Apache Hadoop 2.7.3.
> >
> > As discussed before, this is the next maintenance release to follow up
> > 2.7.2.
> >
> > The RC is available for validation at:
> > http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/
> >
> > The RC tag in git is: release-2.7.3-RC0
> >
> > The maven artifacts are available via repository.apache.org at
> > https://repository.apache.org/content/repositories/orgapachehadoop-1040/
> >
> > The release-notes are inside the tar-balls at location
> > hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I
> > hosted this at
> > http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/releasenotes.html for
> > your quick perusal.
> >
> > As you may have noted, a very long fix-cycle for the License & Notice
> > issues (HADOOP-12893) caused 2.7.3 (along with every other Hadoop release)
> > to slip by quite a bit. This release's related discussion thread is linked
> > below: [1].
> >
> > Please try the release and vote; the vote will run for the usual 5 days.
> >
> > Thanks,
> > Vinod
> >
> > [1]: 2.7.3 release plan:
> > https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/msg24439.html
>


Re: Improving recovery performance for degraded reads

2016-07-27 Thread Rakesh Radhakrishnan
Hi Roy,

>>>> (a) In your last email, I am sure you meant => "... submitting read
requests to fetch "any" (instead of all) the 'k' chunk (out of k+m-x
surviving chunks)  ?
>>>> Do you have any optimization in place to decide which data-nodes will
be part of those "k" ?

Answer:-
I hope you know the write path; I'm just adding a few details here to
support the read explanation. While writing to an EC file, the DFS client
writes data stripes (e.g. with a 64KB cell size) to multiple datanodes. For
a (k, m) schema, the client writes data blocks to the first k datanodes and
parity blocks to the remaining m datanodes. One stripe is (k * cellSize +
m * cellSize) bytes of data. While reading, the client fetches in the same
order, stripe by stripe. The datanodes with data blocks are fetched before
the datanodes with parity blocks, because that requires less EC block
reconstruction work. Internally, the DFS client reads the stripes one by
one, contacting k datanodes in parallel for each stripe. If there are any
failures, it will contact the parity datanodes and do the reconstruction on
the fly. 'DFSStripedInputStream' supports both positional reads and reading
an entire buffer (e.g. a file-sized buffer).

>>>> (b) Is any caching being done (as proposed for QFS in the
previously attached "PPR" paper) ?
Answer:-
There is an open JIRA, HDFS-9879, to discuss caching of striped blocks at
the datanode. Perhaps caching logic could be utilized similarly to QFS, and
reconstruction could choose those datanodes that have already cached the
data in memory. This is an open improvement task as of now.

>>>> (c) When you mentioned striping is being done, I assume it is
probably to reduce the chunk sizes and hence k*c ?
Answer:-
Yes, striping is done by dividing the block into several cells of size
cellSize (e.g. 64KB). (k * c + m * c) is one stripe, and a block group
comprises several stripes. I'd suggest you read the blog
http://blog.cloudera.com/blog/2015/09/introduction-to-hdfs-erasure-coding-in-apache-hadoop/
to understand more about the stripe, cell and block group terminology
before reading the answer below.
      blk_0        blk_1        blk_2
        |            |            |
        v            v            v
    +--------+   +--------+   +--------+
    | cell_0 |   | cell_1 |   | cell_2 |
    +--------+   +--------+   +--------+
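
For reference, the logical-offset-to-cell mapping under striping works
roughly like this (a sketch; the cellSize and k values match the example
above, and `offset` is an assumed logical read offset into the file):

    // Locate a logical file offset within a striped block group.
    long cellSize = 64 * 1024;                  // 64KB cell, as in the example
    int k = 6;                                  // RS-6-3 data blocks per group
    long cellIndex = offset / cellSize;         // global cell number
    int blockIndex = (int) (cellIndex % k);     // which data block holds it
    long stripeIndex = cellIndex / k;           // which stripe of the group
    long offsetInBlock = stripeIndex * cellSize + offset % cellSize;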

>>>> Now, if my object sizes are large (e.g. super HD images) where I would
have to get data from multiple stripes to rebuild the images before I can
display to the client, do you think striping would still help ?
>>>> Is there a possibility that since I know that all the segments of the
HD image would always be read together, by striping and distributing it on
different nodes, I am ignoring its spatial/temporal locality and further
increasing any associated delays ?

Answer:-
Since for each stripe the client contacts all k datanodes, if there are
slow or dead datanodes in a data block stripe, it will affect the read
performance. AFAIK, for a large file a contiguous layout is suitable; this
will be supported in phase 2, and design discussions are still going on,
please see the HDFS-8030 JIRA. On the other side, in theory I can say there
is a benefit to the striping layout, which enables the client to work with
multiple data nodes in parallel, greatly enhancing the aggregate throughput
(assuming that all datanodes are good servers). But this needs to be tested
in your cluster to understand the impact.


Thanks,
Rakesh
Intel

On Sun, Jul 24, 2016 at 12:00 PM, Roy Leonard 
wrote:

> Hi Rakesh,
>
> Thanks for sharing your thoughts and updates.
>
> (a) In your last email, I am sure you meant => "... submitting read
> requests to fetch "any" (instead of all) the 'k' chunk (out of k+m-x
> surviving chunks)  ?
> Do you have any optimization in place to decide which data-nodes will be
> part of those "k" ?
>
> (b) Is any caching being done (as proposed for QFS in the previously
> attached "PPR" paper) ?
>
> (c) When you mentioned striping is being done, I assume it is probably to
> reduce the chunk sizes and hence k*c ?
> Now, if my object sizes are large (e.g. super HD images) where I would have
> to get data from multiple stripes to rebuild the images before I can
> display to the client, do you think striping would still help ?
> Is there a possibility that since I know that all the segments of the HD
> image would always be read together, by stripping and distributing it on
> different nodes, I am ignoring its special/temporal locality and further
> increase any associated delays ?
>
> Just wanted to know your thoughts.
> I am looking forward to the future performance improvements in HDFS.
>
> Regards,
> R.
>

Re: [DISCUSS] Retire BKJM from trunk?

2016-07-27 Thread Rakesh Radhakrishnan
If I remember correctly, Huawei also adopted the QJM component. I hope
@Vinay discussed this internally at Huawei before starting this e-mail
discussion thread. I'm +1 for removing the BKJM contrib from the trunk
code.

Also, there are quite a few open sub-tasks under the HDFS-3399 umbrella
JIRA, which was used during the BKJM implementation. How about closing
these JIRAs by marking them as "Won't Fix"?

Thanks,
Rakesh
Intel

On Thu, Jul 28, 2016 at 1:53 AM, Sijie Guo  wrote:

> + Rakesh and Uma
>
> Rakesh and Uma might have a better idea on this. I think Huawei was using
> it when Rakesh and Uma worked there.
>
> - Sijie
>
> On Wed, Jul 27, 2016 at 12:06 PM, Chris Nauroth 
> wrote:
>
> > I recommend including the BookKeeper community in this discussion.  I’ve
> > added their user@ and dev@ lists to this thread.
> >
> > I do not see BKJM being used in practice. Removing it from trunk would be
> > attractive in terms of less code for Hadoop to maintain and build, but if
> > we find existing users that want to keep it, I wouldn’t object.
> >
> > --Chris Nauroth
> >
> > On 7/26/16, 11:14 PM, "Vinayakumar B"  wrote:
> >
> > Hi All,
> >
> >BKJM was active and was made quite stable when NameNode HA was
> > implemented and there was no QJM yet.
> >Now QJM is present and is much more stable, and it has been adopted
> > by many production environments.
> >I wonder whether it would be a good time to retire BKJM from trunk?
> >
> >Are there any existing users of BKJM?
> >
> > -Vinay
> >
> >
> >
>


Re: [VOTE] Release Apache Hadoop 3.0.0-alpha1 RC0

2016-08-31 Thread Rakesh Radhakrishnan
Thanks for getting this out.

+1 (non-binding)

- downloaded and built tarball from source
- deployed HDFS-HA cluster and tested few EC file operations
- executed few hdfs commands including EC commands
- viewed basic UI
- ran some of the sample jobs


Best Regards,
Rakesh
Intel

On Thu, Sep 1, 2016 at 6:19 AM, John Zhuge  wrote:

> +1 (non-binding)
>
> - Build source with Java 1.8.0_101 on Centos 6.6 without native
> - Verify license and notice using the shell script in HADOOP-13374
> - Deploy a pseudo cluster
> - Run basic dfs, distcp, ACL, webhdfs commands
> - Run MapReduce wordcount and pi examples
> - Run balancer
>
> Thanks,
> John
>
> John Zhuge
> Software Engineer, Cloudera
>
> On Wed, Aug 31, 2016 at 11:46 AM, Gangumalla, Uma <
> uma.ganguma...@intel.com>
> wrote:
>
> > +1 (binding).
> >
> > Overall it's a great effort, Andrew. Thank you for putting in all the
> > energy.
> >
> > Downloaded and built.
> > Ran some sample jobs.
> >
> > I would love to see all these efforts lead to a GA release of Hadoop
> > 3.x soon.
> >
> > Regards,
> > Uma
> >
> >
> > On 8/30/16, 8:51 AM, "Andrew Wang"  wrote:
> >
> > >Hi all,
> > >
> > >Thanks to the combined work of many, many contributors, here's an RC0
> for
> > >3.0.0-alpha1:
> > >
> > >http://home.apache.org/~wang/3.0.0-alpha1-RC0/
> > >
> > >alpha1 is the first in a series of planned alpha releases leading up to
> > >GA.
> > >The objective is to get an artifact out to downstreams for testing and
> to
> > >iterate quickly based on their feedback. So, please keep that in mind
> when
> > >voting; hopefully most issues can be addressed by future alphas rather
> > >than
> > >future RCs.
> > >
> > >Sorry for getting this out on a Tuesday, but I'd still like this vote to
> > >run the normal 5 days, thus ending Saturday (9/3) at 9AM PDT. I'll
> extend
> > >if we lack the votes.
> > >
> > >Please try it out and let me know what you think.
> > >
> > >Best,
> > >Andrew
> >
> >
> > -
> > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
> >
> >
>


Re: HDFS Balancer Stuck after 10 Minz

2016-09-08 Thread Rakesh Radhakrishnan
Have you taken multiple thread dumps (jstack) and observed which operations
are running during this period? Perhaps there is a high chance that it is
searching for data blocks which it can move around to balance the cluster.

Could you tell me the used-space and available-space values? Have you tried
changing the threshold to a lower value, maybe 10 or 5, and seeing what
happens with that value? Also, since there are no log messages during the
15-minute period, is there any possibility of enabling debug log priority
and trying to dig more into the problem?

Rakesh

On Thu, Sep 8, 2016 at 6:15 PM, Senthil Kumar 
wrote:

> Hi All, we need to balance the cluster data since the median usage reached
> 98%. I started the balancer as below:
>
> Hadoop Version: Hadoop 2.4.1
>
>
> /apache/hadoop/sbin/start-balancer.sh   -threshold  30
>
>
> Once I start the balancer it goes well for the first 8-10 minutes; the
> balancer moves blocks quickly at first. Not sure what's happening in the
> cluster after some time (say 10 mins), but the balancer is almost stuck.
>
> Log excerpts :
>
> 2016-09-08 04:58:15,653 INFO
> org.apache.hadoop.hdfs.server.balancer.Balancer: Successfully moved
> blk_-5830766563502877304_1279767737 with size=134217728 from
> 10.103.21.27:1004 to 10.142.21.56:1004 through 10.103.21.27:1004
>
> 2016-09-08 04:59:14,426 INFO
> org.apache.hadoop.hdfs.server.balancer.Balancer: Successfully moved
> blk_2601479900_1104500421142 with size=268435456 from 10.103.84.51:1004 to
> 10.142.18.27:1004 through 10.103.84.16:1004
>
> 2016-09-08 05:01:15,037 INFO
> org.apache.hadoop.hdfs.server.balancer.Balancer: Successfully moved
> blk_3073791211_1104972921837 with size=268435456 from 10.103.21.27:1004 to
> 10.142.21.56:1004 through 10.103.21.42:1004
>
>
>
> [05:16]:[hadoop@lvsaishdc3sn0002:~]$ date
>
> Thu Sep  8 05:16:53 GMT+7 2016
>
> [05:16]:[hadoop@lvsaishdc3sn0002:~]$ jps
>
> 1003 Balancer
>
> 20388 Jps
>
>
>
> Last Block Mover Timestamp : 05:01
>
> Current Timestamp: 05:16
>
>
> For almost 15 mins no blocks have been moved by the balancer. What could be
> the issue here? A restart would help it start moving again.
>
>
>
> It's not even passing iteration 1.
>
>
> I found one thread discussing the same issue:
>
> http://lucene.472066.n3.nabble.com/A-question-about-Balancer-in-HDFS-td4118505.html
>
>
> Please suggest how to balance the cluster.
>
>
> --Senthil
>


Re: How to setup local environment to run kerberos test cases.

2016-09-29 Thread Rakesh Radhakrishnan
I hope the following documents will help you; they contain details about
how to build and run Hadoop test cases. Please take a look at them.

https://github.com/apache/hadoop/blob/branch-2.7.3/BUILDING.txt
http://hadoop.apache.org/docs/r2.7.3/hadoop-auth/BuildingIt.html

Please give a few more details if you are facing any specific issue; that
would help us dig into it.

Thanks,
Rakesh

On Thu, Sep 29, 2016 at 12:24 PM, Yuanbo Liu  wrote:

> Hi, developers
> I'd like to run Kerberos test cases on my local machine, such as
> "TestSecureNameNode", but I can't make them work. Can anybody tell me how
> to set up a local environment so that those test cases run successfully?
> Any help will be appreciated; thanks in advance.
>


Re: How to setup local environment to run kerberos test cases.

2016-09-29 Thread Rakesh Radhakrishnan
Maybe it's due to file permission issues or something else. The test uses
MiniKdc, which is based on Apache Directory Server and is embedded in the
test cases. Could you share the complete logs of the failed test? I think
you can look at this location on your machine:

$HADOOP_HOME/hadoop-hdfs-project/hadoop-hdfs/target/surefire-reports/org.apache.hadoop.hdfs.server.namenode.TestSecureNameNode-output.txt
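
For reference, the secure tests spin up the KDC themselves via MiniKdc
(org.apache.hadoop.minikdc.MiniKdc from the hadoop-minikdc test module), so
no external Kerberos setup is needed; a minimal sketch (the paths and
principal names here are illustrative) looks like:

    // Embedded KDC as used by secure tests such as TestSecureNameNode.
    Properties conf = MiniKdc.createConf();
    MiniKdc kdc = new MiniKdc(conf, new File("target/test-kdc"));
    kdc.start();
    File keytab = new File("target/test-kdc", "hdfs.keytab");
    kdc.createPrincipal(keytab, "hdfs/localhost", "HTTP/localhost");
    // ... run the secured mini-cluster test, using kdc.getRealm() ...
    kdc.stop();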

Thanks,
Rakesh

On Thu, Sep 29, 2016 at 1:13 PM, Yuanbo Liu  wrote:

> Hi, Rakesh
> Thanks for your response. Those docs are helpful but not what I'm asking
> about. I was running test cases on my local machine, and some test cases
> threw exceptions.
> For example:
> mvn clean package -Dtest=TestSecureNameNode
> it threw:
>
> Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 50.024 sec
> <<< FAILURE! - in org.apache.hadoop.hdfs.server.
> namenode.TestSecureNameNode
> testName(org.apache.hadoop.hdfs.server.namenode.TestSecureNameNode)  Time
> elapsed: 48.552 sec  <<< ERROR!
> java.io.IOException: Failed on local exception: java.io.IOException:
> Couldn't setup connection for hdfs/localh...@example.com to
> localhost.localdomain/127.0.0.1:43815; Host Details : local host is:
> "localhost.localdomain/127.0.0.1"; destination host is:
> "localhost.localdomain":43815;
> at sun.security.krb5.KdcComm.send(KdcComm.java:242)
> at sun.security.krb5.KdcComm.send(KdcComm.java:200)
> at sun.security.krb5.KrbTgsReq.send(KrbTgsReq.java:254)
> at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:269)
> at
> sun.security.krb5.internal.CredentialsUtil.serviceCreds(
> CredentialsUtil.java:302)
> at
> sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(
> CredentialsUtil.java:120)
> at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:458)
> at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:693)
> at sun.security.jgss.GSSContextImpl.initSecContext(
> GSSContextImpl.java:248)
>
> This test case is related to Kerberos. I guess I need to set up something
> before I run it, but I don't know how to do it. Any thoughts?
>


Re: [VOTE] Release Apache Hadoop 2.8.0 (RC3)

2017-03-22 Thread Rakesh Radhakrishnan
Thanks Junping for getting this out.

+1 (non-binding)

* downloaded and built from source with jdk1.8.0_45
* deployed HDFS-HA cluster
* ran some sample jobs
* ran balancer
* executed basic dfs cmds


Rakesh

On Wed, Mar 22, 2017 at 8:30 PM, Jian He  wrote:

> +1 (binding)
>
> - built from source
> - deployed a pseudo cluster
> - ran basic example tests.
> - Navigate the UI a bit, looks good.
>
> Jian
>
> > On Mar 22, 2017, at 9:03 PM, larry mccay  wrote:
> >
> > +1 (non-binding)
> >
> > - verified signatures
> > - built from source and ran tests
> > - deployed pseudo cluster
> > - ran basic tests for hdfs, wordcount, credential provider API and
> related
> > commands
> > - tested webhdfs with knox
> >
> >
> > On Wed, Mar 22, 2017 at 7:21 AM, Ravi Prakash 
> wrote:
> >
> >> Thanks for all the effort Junping!
> >>
> >> +1 (binding)
> >> + Verified signature and MD5, SHA1, SHA256 checksum of tarball
> >> + Verified SHA ID in git corresponds to RC3 tag
> >> + Verified wordcount for one small text file produces same output as
> >> hadoop-2.7.3.
> >> + HDFS Namenode UI looks good.
> >>
> >> I agree none of the issues reported so far are blockers. Looking
> forward to
> >> another great release.
> >>
> >> Thanks
> >> Ravi
> >>
> >> On Tue, Mar 21, 2017 at 8:10 PM, Junping Du 
> wrote:
> >>
> >>> Thanks all for response with verification work and vote!
> >>>
> >>>
> >>> Sounds like we are hitting several issues here, although none seem to
> >>> be blockers so far. Given the large commit set (2000+ commits first
> >>> landing in a branch-2 release), we should perhaps follow the 2.7.0
> >>> practice of stating that this release is not for production clusters,
> >>> per Vinod's suggestion in a previous email. We should quickly come up
> >>> with a 2.8.1 release in the next 1 or 2 months for production
> >>> deployment.
> >>>
> >>>
> >>> We will close the vote in the next 24 hours. For people who haven't
> >>> voted, please keep up the verification work and report any issues
> >>> found; I will check whether another round of RC is needed based on
> >>> your findings. Thanks!
> >>>
> >>>
> >>> Thanks,
> >>>
> >>>
> >>> Junping
> >>>
> >>>
> >>> 
> >>> From: Kuhu Shukla 
> >>> Sent: Tuesday, March 21, 2017 3:17 PM
> >>> Cc: Junping Du; common-...@hadoop.apache.org;
> hdfs-dev@hadoop.apache.org
> >> ;
> >>> yarn-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org
> >>> Subject: Re: [VOTE] Release Apache Hadoop 2.8.0 (RC3)
> >>>
> >>>
> >>> +1 (non-binding)
> >>>
> >>> - Verified signatures.
> >>> - Downloaded and built from source tar.gz.
> >>> - Deployed a pseudo-distributed cluster on Mac Sierra.
> >>> - Ran example Sleep job successfully.
> >>> - Deployed latest Apache Tez 0.9 and ran sample Tez orderedwordcount
> >>> successfully.
> >>>
> >>> Thank you Junping and everyone else who worked on getting this release
> >> out.
> >>>
> >>> Warm Regards,
> >>> Kuhu
> >>> On Tuesday, March 21, 2017, 3:42:46 PM CDT, Eric Badger
> >>>  wrote:
> >>> +1 (non-binding)
> >>>
> >>> - Verified checksums and signatures of all files
> >>> - Built from source on MacOS Sierra via JDK 1.8.0 u65
> >>> - Deployed single-node cluster
> >>> - Successfully ran a few sample jobs
> >>>
> >>> Thanks,
> >>>
> >>> Eric
> >>>
> >>> On Tuesday, March 21, 2017 2:56 PM, John Zhuge 
> >>> wrote:
> >>>
> >>>
> >>>
> >>> +1. Thanks for the great effort, Junping!
> >>>
> >>>
> >>>  - Verified checksums and signatures of the tarballs
> >>>  - Built source code with Java 1.8.0_66-b17 on Mac OS X 10.12.3
> >>>  - Built source and native code with Java 1.8.0_111 on Centos 7.2.1511
> >>>  - Cloud connectors:
> >>>  - s3a: integration tests, basic fs commands
> >>>  - adl: live unit tests, basic fs commands. See notes below.
> >>>  - Deployed a pseudo cluster, passed the following sanity tests in
> >>>  both insecure and SSL mode:
> >>>  - HDFS: basic dfs, distcp, ACL commands
> >>>  - KMS and HttpFS: basic tests
> >>>  - MapReduce wordcount
> >>>  - balancer start/stop
> >>>
> >>>
> >>> Needs the following JIRAs to pass all ADL tests:
> >>>
> >>>  - HADOOP-14205. No FileSystem for scheme: adl. Contributed by John Zhuge.
> >>>  - HDFS-11132. Allow AccessControlException in contract tests when
> >>>    getFileStatus on subdirectory of existing files. Contributed by
> >>>    Vishwajeet Dusane
> >>>  - HADOOP-13928. TestAdlFileContextMainOperationsLive.testGetFileContext1
> >>>    runtime error. (John Zhuge via lei)
> >>>
> >>>
> >>> On Mon, Mar 20, 2017 at 10:31 AM, John Zhuge 
> >> wrote:
> >>>
>  Yes, it only affects ADL. There is a workaround of adding these 2
>  properties to core-site.xml:
> 
>  <property>
>    <name>fs.adl.impl</name>
>    <value>org.apache.hadoop.fs.adl.AdlFileSystem</value>
>  </property>
> 
>  <property>
>    <name>fs.AbstractFileSystem.adl.impl</name>
>    <value>org.apache.hadoop.fs.adl.Adl</value>
>  </property>
> 
>  I have the initial patch ready but hitting these live unit test
> >> failures:
> 
>  Failed test

Re: [VOTE] Moving Submarine to a separate Apache project proposal

2019-09-03 Thread Rakesh Radhakrishnan
+1, Thanks for the proposal.

I am interested in participating in this project. Please include me in the
project as well.

Thanks,
Rakesh

On Tue, Sep 3, 2019 at 11:59 AM zhankun tang  wrote:

> +1
>
> Thanks for Wangda's proposal.
>
> The submarine project was born within Hadoop, but is not limited to Hadoop.
> It began as a trainer on YARN, but we quickly realized that a trainer alone
> is not enough to meet AI platform requirements. Right now there is no
> user-friendly open-source solution that covers the whole AI pipeline: data
> engineering, training, and serving. And the underlying data infrastructure
> itself is also evolving; for instance, many people love k8s. Not to mention
> there are many AI domain problems in this area still to be solved.
> It's almost certain that building such an ML platform would utilize various
> other open-source components designed with ML in mind from the start.
>
> I see submarine growing rapidly towards an enterprise-grade ML platform
> which could potentially enable AI capability for data engineers and
> scientists. This is an exciting thing for both the community and the
> industry.
>
> BR,
> Zhankun
>
>
> On Tue, 3 Sep 2019 at 13:34, Xun Liu  wrote:
>
> > +1
> >
> > Hello everyone, I am a member of the submarine development team.
> > I have been contributing to submarine for more than a year,
> > and I have seen submarine development progress very fast.
> > In more than a year, 9 long-term developers from different companies have
> > been contributing; submarine has accumulated more than 200,000 lines of
> > code, is growing very fast, and is used in the production environments of
> > multiple companies.
> >
> > In the submarine development group, there are 5 PMC members and 7
> > committers from the Hadoop, Spark, and Zeppelin projects.
> > They are very familiar with the development processes and standards of
> > the Apache community,
> > and can manage the project's development progress and quality well.
> > So I recommend submarine become a TLP directly.
> >
> > We will continue to contribute to the submarine project. :-)
> >
> > Xun Liu
> > Regards
> >
> > On Tue, 3 Sep 2019 at 12:01, Devaraj K  wrote:
> >
> > > +1
> > >
> > > Thanks Wangda for the proposal.
> > > I would like to participate in this project; please add me to the
> > > project as well.
> > >
> > > Regards
> > > Devaraj K
> > >
> > > On Mon, Sep 2, 2019 at 8:50 PM zac yuan  wrote:
> > >
> > > > +1
> > > >
> > > > Submarine will be a complete solution for AI service development. It
> > > > can take advantage of the two best cluster systems, YARN and k8s,
> > > > which will help more and more people gain AI capability. Becoming a
> > > > separate Apache project will clearly accelerate development.
> > > >
> > > > Look forward to a big success in submarine project~
> > > >
> > > > 朱林浩 wrote on Tue, Sep 3, 2019 at 10:38 AM:
> > > >
> > > > > +1,
> > > > > Hopefully this will become a top-level project.
> > > > >
> > > > > I also hope to make more contributions to this project.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > >At 2019-09-03 09:26:53, "Naganarasimha Garla"
> > > > > ><naganarasimha...@apache.org> wrote:
> > > > > >+1,
> > > > > >I would also like to start participating in this project; hope to
> > > > > >get myself added to the project.
> > > > > >
> > > > > >Thanks and Regards,
> > > > > >+ Naga
> > > > > >
> > > > > >On Tue, Sep 3, 2019 at 8:35 AM Wangda Tan 
> > > wrote:
> > > > > >
> > > > > >> Hi Sree,
> > > > > >>
> > > > > >> I put it to the proposal, please let me know what you think:
> > > > > >>
> > > > > >> > The traditional path at Apache would have been to create an
> > > > > >> > incubator project, but the code is already being released by
> > > > > >> > Apache and most of the developers are familiar with Apache
> > > > > >> > rules and guidelines. In particular, the proposed PMC has 2
> > > > > >> > Apache TLP PMCs and the proposed initial committers have 4
> > > > > >> > Apache TLP PMCs (from 3 different companies). They will
> > > > > >> > provide oversight and guidance for the developers that are
> > > > > >> > less experienced in the Apache Way. Therefore, the Submarine
> > > > > >> > project would like to propose becoming a Top Level Project at
> > > > > >> > Apache.
> > > > > >> >
> > > > > >>
> > > > > >> To me, going straight to TLP has mostly pros; it is an easier
> > > > > >> process (same as the ORC spin-off from Hive), with much less
> > > > > >> overhead for both the dev community and the Apache side.
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Wangda
> > > > > >>
> > > > > >> On Sun, Sep 1, 2019 at 2:04 PM Sree Vaddi <sree_at_ch...@yahoo.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >> > +1 to move submarine to a separate Apache project.
> > > > > >> >
> > > > > >> > It is not clear in the proposal, if submarine majority 

Re: [DISCUSS] Separate Hadoop Core trunk and Hadoop Ozone trunk source tree

2019-09-19 Thread Rakesh Radhakrishnan
+1

Rakesh

On Fri, Sep 20, 2019 at 12:29 AM Aaron Fabbri  wrote:

> +1 (binding)
>
> Thanks to the Ozone folks for their efforts at maintaining good separation
> with HDFS and common. I took a lot of heat for the unpopular opinion that
> they should be separate, so I am glad the process has worked out well for
> both codebases. It looks like my concerns were addressed and I appreciate
> it.  It is cool to see the evolution here.
>
> Aaron
>
>
> On Thu, Sep 19, 2019 at 3:37 AM Steve Loughran  >
> wrote:
>
> > in that case,
> >
> > +1 from me (binding)
> >
> > On Wed, Sep 18, 2019 at 4:33 PM Elek, Marton  wrote:
> >
> > >  > One thing to consider here: you are giving up your ability to make
> > >  > changes in hadoop-* modules, including hadoop-common, and their
> > >  > dependencies, in sync with your own code. That goes for filesystem
> > >  > contract tests.
> > >  >
> > >  > Are you happy with that?
> > >
> > >
> > > Yes. I think we can live with it.
> > >
> > > Fortunately, the Hadoop parts which are used by Ozone (security + RPC)
> > > are stable enough; we haven't needed bigger changes until now (small
> > > patches are already included in 3.1/3.2).
> > >
> > > I think it's better to use released Hadoop bits in Ozone anyway, and
> > > worst (best?) case we can try to do more frequent patch releases from
> > > Hadoop (if required).
> > >
> > >
> > > m.
> > >
> > >
> > >
> >
>


[jira] [Created] (HDFS-16362) [FSO] Refactor isFileSystemOptimized usage in OzoneManagerUtils

2021-11-30 Thread Rakesh Radhakrishnan (Jira)
Rakesh Radhakrishnan created HDFS-16362:
---

 Summary: [FSO] Refactor isFileSystemOptimized usage in 
OzoneManagerUtils
 Key: HDFS-16362
 URL: https://issues.apache.org/jira/browse/HDFS-16362
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Rakesh Radhakrishnan


This task is to refactor the OM request instantiation based on the 
#isFileSystemOptimized() check in the OzoneManagerUtils class.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-15253) Set default throttle value on dfs.image.transfer.bandwidthPerSec

2020-10-07 Thread Rakesh Radhakrishnan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh Radhakrishnan resolved HDFS-15253.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

> Set default throttle value on dfs.image.transfer.bandwidthPerSec
> 
>
> Key: HDFS-15253
> URL: https://issues.apache.org/jira/browse/HDFS-15253
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The default value of dfs.image.transfer.bandwidthPerSec is 0, so fsimage 
> transfers during checkpointing can use the maximum available bandwidth. I 
> think we should throttle this. Many users have experienced namenode failover 
> when transferring a large image along with fsimage replication on 
> dfs.namenode.name.dir, e.g. >25 GB.
> We thought to set:
> dfs.image.transfer.bandwidthPerSec=52428800 (50 MB/s)
> dfs.namenode.checkpoint.txns=200 (the default is 1M; good to avoid frequent 
> checkpoints. However, the default checkpoint also runs once every 6 hours)
>  
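
As a back-of-the-envelope check on the proposed throttle value (the numbers
are taken from the description above):

    // Rough transfer time for a large fsimage at the proposed throttle rate.
    long bandwidthPerSec = 52428800L;           // 50 MB/s
    long imageSize = 25L * 1024 * 1024 * 1024;  // 25 GB image from the example
    long seconds = imageSize / bandwidthPerSec; // = 512 s, i.e. ~8.5 minutes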



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org