Re: Thinking ahead to hadoop-2.6

2014-09-26 Thread Arun Murthy
Sounds good. I'll branch this weekend and we can merge the jiras we
discussed in this thread as they get wrapped next week.

Thanks everyone.

Arun


> On Sep 24, 2014, at 7:39 PM, Vinod Kumar Vavilapalli  
> wrote:
>
> We can branch off in a week or two so that work on branch-2 itself can go
> ahead with other features that can't fit in 2.6. Independent of that, we
> can then decide on the timeline of the release candidates once branch-2.6
> is close to being done w.r.t the planned features.
>
> Branching it off can let us focus on specific features that we want in for
> 2.6 and then eventually blockers for the release, nothing else. There is a
> trivial pain of committing to one more branch, but it's worth it in this
> case IMO.
>
> A lot of efforts are happening in parallel on the YARN side from what I
> see. 2.6 is a little bulky, if only on the YARN side, and I'm afraid that if
> we don't branch off and selectively try to get stuff in, it is likely to be
> perpetually delayed.
>
> My 2 cents.
>
> +Vinod
>
> On Wed, Sep 24, 2014 at 3:28 PM, Suresh Srinivas 
> wrote:
>
>> Given that some of the features are in the final stages of stabilization,
>> Arun, should we hold off on creating the 2.6 branch or building an RC for a
>> week? All the features in flux are important ones and worth delaying the
>> release by a week.
>>
>> On Wed, Sep 24, 2014 at 11:36 AM, Andrew Wang 
>> wrote:
>>
>>> Hey Nicholas,
>>>
>>> My concern about Archival Storage isn't related to the code quality or the
>>> size of the feature. I think that you and Jing did good work. My concern is
>>> that once we ship, we're locked into that set of archival storage APIs, and
>>> these APIs are not yet finalized. Simply being able to turn off the feature
>>> does not change the compatibility story.
>>>
>>> I'm willing to devote time to help review these JIRAs and kick the tires on
>>> the APIs, but my point above was that I'm not sure it'd all be done by the
>>> end of the week. Testing might also reveal additional changes that need to
>>> be made, which also might not happen by end-of-week.
>>>
>>> I guess the question before us is if we're comfortable putting something in
>>> branch-2.6 and then potentially adding API changes after. I'm okay with
>>> that as long as we're all aware that this might happen.
>>>
>>> Arun, as RM is this cool with you? Again, I like this feature and I'm fine
>>> with its inclusion, just a heads up that we might need some extra time to
>>> finalize things before an RC can be cut.
>>>
>>> Thanks,
>>> Andrew
>>>
>>> On Tue, Sep 23, 2014 at 7:30 PM, Tsz Wo (Nicholas), Sze <
>>> s29752-hadoop...@yahoo.com.invalid> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am worried about KMS and transparent encryption since quite a few bugs
>>>> were discovered after it got merged to branch-2.  It gives us the
>>>> impression that the feature is not yet well tested.  Indeed, transparent
>>>> encryption is a complicated feature which changes the core part of HDFS.
>>>> It is not easy to get everything right.
>>>>
>>>> For HDFS-6584: Archival Storage, it is a relatively simple and low-risk
>>>> feature.  It introduces a new storage type ARCHIVE and the concept of
>>>> block storage policy to HDFS.  When a cluster is configured with ARCHIVE
>>>> storage, the blocks will be stored using the appropriate storage types
>>>> specified by storage policies assigned to the files/directories.  A
>>>> cluster admin could disable the feature by simply not configuring any
>>>> storage type and not setting any storage policy, as before.  As Suresh
>>>> mentioned, HDFS-6584 is in the final stages of being merged to branch-2.
>>>>
>>>> Regards,
>>>> Tsz-Wo



>>>> On Wednesday, September 24, 2014 7:00 AM, Suresh Srinivas <
>>>> sur...@hortonworks.com> wrote:
>>>>
>>>>> I actually would like to see both archival storage and single replica
>>>>> memory writes in the 2.6 release. Archival storage is in the final stages
>>>>> of getting ready for the branch-2 merge, as Nicholas has already indicated
>>>>> on the dev mailing list. Hopefully HDFS-6581 gets ready sooner. Both of
>>>>> these features have been in development for some time.
>>>>>
>>>>> On Tue, Sep 23, 2014 at 3:27 PM, Andrew Wang <andrew.w...@cloudera.com>
>>>>> wrote:
>>>>>
>>>>>> Hey Arun,
>>>>>>
>>>>>> Maybe we could do a quick run through of the Roadmap wiki and
>>>>>> add/retarget things accordingly?
>>>>>>
>>>>>> I think the KMS and transparent encryption are ready to go. We've got
>>>>>> very few further bug fixes pending, but that's it.
>>>>>>
>>>>>> Two HDFS things that I think probably won't make the end of the week are
>>>>>> archival storage (HDFS-6584) and single replica memory writes
>>>>>> (HDFS-6581), which I believe are under the HSM banner. HDFS-6484 was just
>>>>>> merged to trunk and I think needs a little more work before it goes into
>>>>>> branch-2.
>>>>>> HDFS

[jira] [Created] (HDFS-7151) DFSInputStream method seek works incorrectly on huge HDFS block size

2014-09-26 Thread Andrew Rewoonenco (JIRA)
Andrew Rewoonenco created HDFS-7151:
---

 Summary: DFSInputStream method seek works incorrectly on huge HDFS 
block size
 Key: HDFS-7151
 URL: https://issues.apache.org/jira/browse/HDFS-7151
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode, fuse-dfs, hdfs-client
Affects Versions: 2.5.1, 2.4.1, 2.5.0, 2.4.0, 2.3.0
 Environment: dfs.block.size > 2Gb
Reporter: Andrew Rewoonenco
Priority: Critical


Hadoop works incorrectly with block sizes larger than 2 GB.

The seek method of the DFSInputStream class uses an int (32-bit signed) 
internal value for seeking inside the current block. This causes seek errors 
when the block size is greater than 2 GB.

Found when using very large Parquet files (10 GB) in Impala on a Cloudera 
cluster with a block size of 10 GB.

Here is some log output:
W0924 08:27:15.920017 40026 DFSInputStream.java:1397] BlockReader failed to 
seek to 4390830898. Instead, it seeked to 95863602.
W0924 08:27:15.921295 40024 DFSInputStream.java:1397] BlockReader failed to 
seek to 5597521814. Instead, it seeked to 1302554518.

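BlockReader seeks using only 32-bit offsets: in both log lines the requested 
and actual positions differ by exactly 2^32 bytes (4 GB), since 
4390830898 - 95863602 = 5597521814 - 1302554518 = 4294967296.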

The code fragment producing that bug:
int diff = (int)(targetPos - pos);
  if (diff <= blockReader.available()) {

Similar errors may exist in other parts of HDFS.
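
The truncation described above can be reproduced in isolation. The following is a minimal sketch, not the DFSInputStream code: the class and method names are hypothetical, and the current position is taken as 0 purely for illustration. It shows that the narrowing cast keeps only the low 32 bits of the offset, and one possible way to do the comparison in 64-bit arithmetic instead.

```java
// Hypothetical demo class; it is not part of Hadoop.
final class SeekOffsetDemo {

    /** Mirrors the narrowing cast in the fragment above: only the low 32 bits survive. */
    static int narrowedDiff(long targetPos, long pos) {
        return (int) (targetPos - pos);
    }

    /**
     * One possible alternative (an assumption, not the committed fix):
     * keep the difference as a long and compare without casting.
     */
    static boolean fitsInBuffered(long targetPos, long pos, int available) {
        long diff = targetPos - pos;
        return diff >= 0 && diff <= available;
    }

    public static void main(String[] args) {
        // Target offsets taken from the two log lines above.
        System.out.println(narrowedDiff(4390830898L, 0L)); // prints 95863602
        System.out.println(narrowedDiff(5597521814L, 0L)); // prints 1302554518
        // With 64-bit arithmetic the large seek is not mistaken for a short skip.
        System.out.println(fitsInBuffered(4390830898L, 0L, 1 << 20)); // prints false
    }
}
```

The two printed values match the positions reported in the log, which is consistent with the offset being reduced modulo 2^32.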




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7152) add command-line or configuration options for balancer (tweak speed)

2014-09-26 Thread Andrew Rewoonenco (JIRA)
Andrew Rewoonenco created HDFS-7152:
---

 Summary: add command-line or configuration options for balancer 
(tweak speed)
 Key: HDFS-7152
 URL: https://issues.apache.org/jira/browse/HDFS-7152
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer
Affects Versions: 2.5.0, 2.3.0, 2.6.0
Reporter: Andrew Rewoonenco


Add command-line or configuration options for the balancer (hints to make it 
work faster):

1. Add an option to filter by minimum and maximum block size.
 Description: 
 a) When a datanode holds a lot of small files and a couple of big ones, the 
balancer does pointless balancing of the small files, which takes a lot of 
time and achieves nothing.
 b) When a datanode holds large and very large files, the balancer sometimes 
gets stuck moving very large files, which then fail with timeouts.
 So it would be good to limit such actions.

2. Add options for the block move timeout and the iteration timeout.
 Description: 
  - In versions 2.3.0 - 2.5.0 the socket uses a non-configurable timeout of 60 
seconds, which makes the balancer useless when the HDFS block size is greater 
than 2 GB.
  - In version 2.6.0 and later, hard-coded values are used for the balancer 
iteration time.
  These need to be replaced by configurable ones.
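
The size filter proposed in item 1 could look roughly like the sketch below. This is only an illustration of the idea: BlockSizeFilter and its bounds are hypothetical names that do not exist in the balancer, and how the limits would be supplied (command line or configuration) is left open.

```java
// Hypothetical sketch of a min/max block size filter; not balancer code.
final class BlockSizeFilter {
    private final long minBytes;
    private final long maxBytes;

    BlockSizeFilter(long minBytes, long maxBytes) {
        if (minBytes < 0 || maxBytes < minBytes) {
            throw new IllegalArgumentException("need 0 <= minBytes <= maxBytes");
        }
        this.minBytes = minBytes;
        this.maxBytes = maxBytes;
    }

    /** Returns true when a block of this size is worth scheduling for a move. */
    boolean shouldMove(long blockBytes) {
        return blockBytes >= minBytes && blockBytes <= maxBytes;
    }

    public static void main(String[] args) {
        // Example bounds only: skip blocks under 64 MB or over 2 GB.
        BlockSizeFilter filter = new BlockSizeFilter(64L << 20, 2L << 30);
        System.out.println(filter.shouldMove(1024L));      // tiny block
        System.out.println(filter.shouldMove(128L << 20)); // mid-sized block
        System.out.println(filter.shouldMove(10L << 30));  // very large block
    }
}
```

Skipping blocks below the lower bound addresses case a), and skipping blocks above the upper bound addresses case b).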






[jira] [Created] (HDFS-7153) Add storagePolicy to NN edit log during file creation

2014-09-26 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-7153:
---

 Summary: Add storagePolicy to NN edit log during file creation
 Key: HDFS-7153
 URL: https://issues.apache.org/jira/browse/HDFS-7153
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.6.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


Storage Policy ID is currently not logged in the NN edit log during file 
creation as part of {{AddOp}}. This is okay for now since we don't have an API 
to set storage policy during file creation.

However now that we have storage policies, for HDFS-6581 we are looking into 
using the feature instead of adding a new field to the INodeFile header. It 
would be useful to have the ability to save policy on file create.





[jira] [Created] (HDFS-7154) Fix returning value of starting reconfiguration task

2014-09-26 Thread Lei (Eddy) Xu (JIRA)
Lei (Eddy) Xu created HDFS-7154:
---

 Summary: Fix returning value of starting reconfiguration task
 Key: HDFS-7154
 URL: https://issues.apache.org/jira/browse/HDFS-7154
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 3.0.0, 2.6.0
Reporter: Lei (Eddy) Xu
Assignee: Lei (Eddy) Xu


Running {{hdfs dfsadmin -reconfig ... start}} mistakenly returns {{-1}} (255). 
This is because {{DFSAdmin#startReconfiguration()}} returns the wrong exit code. 
It is expected to return 0 to indicate success.
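
The pairing of {{-1}} with 255 above follows from how exit statuses are reported on POSIX systems, where only the low 8 bits of the value reach the parent process. A minimal sketch (ExitStatusDemo is a hypothetical name, not Hadoop code):

```java
// Hypothetical demo: how a Java return code maps to the status a POSIX shell sees.
final class ExitStatusDemo {

    /** Keeps only the low 8 bits, as a POSIX parent process observes them. */
    static int asShellStatus(int returnCode) {
        return returnCode & 0xFF;
    }

    public static void main(String[] args) {
        System.out.println(asShellStatus(-1)); // prints 255
        System.out.println(asShellStatus(0));  // prints 0
    }
}
```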





[jira] [Created] (HDFS-7155) Bugfix in createLocatedFileStatus caused by bad merge

2014-09-26 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-7155:
---

 Summary: Bugfix in createLocatedFileStatus caused by bad merge
 Key: HDFS-7155
 URL: https://issues.apache.org/jira/browse/HDFS-7155
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-6581
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


FsDirectory.createLocatedFileStatus fails to initialize the blockSize.

Likely caused by a bad merge.





[jira] [Resolved] (HDFS-7155) Bugfix in createLocatedFileStatus caused by bad merge

2014-09-26 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDFS-7155.
-
   Resolution: Fixed
Fix Version/s: HDFS-6581

Thanks for the quick review Chris. Committed to the feature branch.

> Bugfix in createLocatedFileStatus caused by bad merge
> -
>
> Key: HDFS-7155
> URL: https://issues.apache.org/jira/browse/HDFS-7155
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: HDFS-6581
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: HDFS-6581
>
> Attachments: HDFS-7155.01.patch
>
>
> FsDirectory.createLocatedFileStatus fails to initialize the blockSize.
> Likely caused by a bad merge.





[jira] [Created] (HDFS-7156) Fsck documentation is outdated.

2014-09-26 Thread Konstantin Shvachko (JIRA)
Konstantin Shvachko created HDFS-7156:
-

 Summary: Fsck documentation is outdated.
 Key: HDFS-7156
 URL: https://issues.apache.org/jira/browse/HDFS-7156
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.5.1
Reporter: Konstantin Shvachko


The fsck documentation is stale. It does not describe options like 
-includeSnapshots and -list-corruptfileblocks.
It should be brought in sync with the fsck USAGE.





[jira] [Created] (HDFS-7157) Using Time.now() for recording start/end time of reconfiguration tasks

2014-09-26 Thread Lei (Eddy) Xu (JIRA)
Lei (Eddy) Xu created HDFS-7157:
---

 Summary: Using Time.now() for recording start/end time of 
reconfiguration tasks
 Key: HDFS-7157
 URL: https://issues.apache.org/jira/browse/HDFS-7157
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 3.0.0, 2.6.0
Reporter: Lei (Eddy) Xu
Assignee: Lei (Eddy) Xu


Reconfiguration task {{startTime}} and {{endTime}} are wall-clock time 
concepts, which should be obtained from {{o.a.h.util.Time.now()}}.
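
As a rough illustration of recording wall-clock timestamps, the sketch below uses java.time.Clock as a stand-in. It assumes {{Time.now()}} is a thin wrapper over System.currentTimeMillis(); TaskTimestamps is a hypothetical class, not the reconfiguration task code. A monotonic source such as System.nanoTime() is suited to measuring durations, but its values are not dates and so cannot be shown to a user as start or end times.

```java
import java.time.Clock;

// Hypothetical holder for task timestamps; not Hadoop code.
final class TaskTimestamps {
    private final Clock clock;
    private long startTime; // milliseconds since the Unix epoch
    private long endTime;   // milliseconds since the Unix epoch

    TaskTimestamps(Clock clock) {
        this.clock = clock;
    }

    void markStart() {
        startTime = clock.millis();
    }

    void markEnd() {
        endTime = clock.millis();
    }

    long getStartTime() {
        return startTime;
    }

    long getEndTime() {
        return endTime;
    }

    public static void main(String[] args) {
        // Clock.systemUTC().millis() reads the same wall clock as System.currentTimeMillis().
        TaskTimestamps t = new TaskTimestamps(Clock.systemUTC());
        t.markStart();
        t.markEnd();
        System.out.println(new java.util.Date(t.getStartTime()));
        System.out.println(new java.util.Date(t.getEndTime()));
    }
}
```

Injecting the clock is a design choice made here so the timestamps can be tested with a fixed instant.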







[jira] [Created] (HDFS-7158) Reduce the memory usage of WebImageViewer

2014-09-26 Thread Haohui Mai (JIRA)
Haohui Mai created HDFS-7158:


 Summary: Reduce the memory usage of WebImageViewer
 Key: HDFS-7158
 URL: https://issues.apache.org/jira/browse/HDFS-7158
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai


Currently the webimageviewer can take up as much memory as the NN uses in order 
to serve the WebHDFS requests from the client.

This jira proposes to optimize the memory usage of webimageviewer.





[jira] [Created] (HDFS-7159) Use block storage policy to set lazy persist preference

2014-09-26 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-7159:
---

 Summary: Use block storage policy to set lazy persist preference
 Key: HDFS-7159
 URL: https://issues.apache.org/jira/browse/HDFS-7159
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


Now that the HDFS-6584 feature is ready and supports block storage policies on 
both files and directories, we can make use of Storage Policies to store the 
LAZY_PERSIST preference.

This only affects how the preference is persisted in the FsImage/Edit logs. 
There is no change to the client API or to NN-DN interaction.





[jira] [Created] (HDFS-7160) Fix incorrect layout version caused by bad merge

2014-09-26 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-7160:
---

 Summary: Fix incorrect layout version caused by bad merge
 Key: HDFS-7160
 URL: https://issues.apache.org/jira/browse/HDFS-7160
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: HDFS-6581
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


The layout version was not correctly updated while merging from trunk.


