Retrying socket connection failure times can be made as configurable
Key: HADOOP-7086
URL: https://issues.apache.org/jira/browse/HADOOP-7086
Project: Hadoop Common
Issue Type
You are mixing a few things up.
You're testing your I/O using C.
What do you see if you try testing your direct I/O from Java?
I'm guessing that you'll keep your I/O piece in place, wrap it within some
JNI code, and then rewrite the test in Java?
Also, are you testing large streams or random
On Tue, Jan 4, 2011 at 12:58 PM, Da Zheng wrote:
> The most important reason for me to use direct I/O is that the Atom
> processor is too weak. If I write a simple program to write data to the
> disk, the CPU is almost at 100% but the disk hasn't reached its maximum bandwidth.
> When I write data to SSD
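To make the JNI suggestion above concrete: a minimal sketch of such a wrapper, with hypothetical class, method, and library names (the C side, which would do the actual O_DIRECT work, is omitted):

public class DirectIoBench {
  static {
    System.loadLibrary("directio");  // libdirectio.so, built from the existing C test
  }

  // Implemented in C with open(..., O_DIRECT) and page-aligned buffers;
  // returns elapsed nanoseconds for the write.
  public static native long writeDirect(String path, long bytes);

  public static void main(String[] args) {
    long nanos = writeDirect(args[0], 1L << 30);  // write 1GB
    System.out.println("elapsed: " + (nanos / 1e9) + "s");
  }
}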
On 1/5/11 12:44 AM, Christopher Smith wrote:
> On Tue, Jan 4, 2011 at 9:11 PM, Da Zheng wrote:
>
>> On 1/4/11 5:17 PM, Christopher Smith wrote:
>>> If you use direct I/O to reduce CPU time, that means you are saving CPU via
>>> DMA. If you are using Java's heap though, you can kiss that goodbye
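On the heap point: even when the kernel DMAs data into the page cache, reading into a byte[] forces an extra copy onto the Java heap. Plain NIO cannot request O_DIRECT, but a direct ByteBuffer at least avoids that copy; a minimal sketch:

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class DirectBufferRead {
  public static void main(String[] args) throws IOException {
    // A direct buffer lives outside the Java heap, so the JVM can hand its
    // address to the native read() instead of copying into a heap byte[].
    ByteBuffer buf = ByteBuffer.allocateDirect(1 << 20);  // 1MB
    RandomAccessFile raf = new RandomAccessFile(args[0], "r");
    FileChannel ch = raf.getChannel();
    long total = 0;
    try {
      while (ch.read(buf) != -1) {
        total += buf.position();
        buf.clear();
      }
    } finally {
      raf.close();
    }
    System.out.println("read " + total + " bytes");
  }
}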
On 1/5/11 9:50 AM, Segel, Mike wrote:
> You are mixing a few things up.
>
> You're testing your I/O using C.
> What do you see if you try testing your direct I/O from Java?
> I'm guessing that you'll keep your I/O piece in place, wrap it within some
> JNI code, and then rewrite the test in Java?
Da Zheng wrote:
> I already did "ant compile-c++-libhdfs -Dlibhdfs=1", but it seems nothing gets
> compiled, as it only prints the following:
> check-c++-libhdfs:
> check-c++-makefile-libhdfs:
> create-c++-libhdfs-makefile:
> compile-c++-libhdfs:
> BUILD SUCCESSFUL
> Total time: 2 seconds
You may ne
I know many people use git, so I wanted to share a neat tip I figured out this
morning that lets you graft the pre-split history into the post-split
repositories. I'm using git 1.7.1; I'm not sure how new these features are. Here
are the steps:
1) Check out the git repos from git.apache.org into git/had
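Todd's steps are cut off above, so for the curious: the grafting mechanism that exists in git 1.7.1 is the .git/info/grafts file. A sketch of the general shape (the paths and SHAs below are placeholders, not Todd's actual steps):

cd common                              # a post-split checkout, e.g. hadoop-common
git fetch ../hadoop-pre-split master   # pull the pre-split objects into this repo
# Declare the last pre-split commit to be the parent of the first post-split commit:
echo "<first-post-split-sha> <last-pre-split-sha>" >> .git/info/grafts
git log --oneline | tail               # history now walks back into the pre-split era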
Hi,
I just submitted a patch for the feature I've been working on.
https://issues.apache.org/jira/browse/HADOOP-7076
This patch works fine on my system and passes all the unit tests.
Now, some 30 minutes later, it seems the build on Hudson has failed.
https://hudson.apache.org/hudson/job/PreCo
I found where to report this ... so I did:
https://issues.apache.org/jira/browse/INFRA-3340
2011/1/5 Niels Basjes :
> Hi,
>
> I just submitted a patch for the feature I've been working on.
> https://issues.apache.org/jira/browse/HADOOP-7076
>
> This patch works fine on my system and passes all the unit tests.
This is great. Thanks, Todd. -C
On Wed, Jan 5, 2011 at 12:36 PM, Todd Lipcon wrote:
> I know many people use git, so I wanted to share a neat tip I figured out this
> morning that lets you graft the pre-split history into the post-split
> repositories. I'm using git 1.7.1; I'm not sure how new these features are.
I agree with Jay B. Checksumming is usually the culprit for high CPU on clients
and datanodes. Plus, a checksum of 4 bytes for every 512 bytes means that for a
64MB block, the checksum data will be 512KB, i.e. 128 ext3 blocks (assuming 4KB
ext3 blocks). Changing it to generate 1 ext3 checksum block per DFS block will
speed up read/write wi
> On Jan 5, 2011, at 4:03 PM, Milind Bhandarkar wrote:
>> I agree with Jay B. Checksumming is usually the culprit for high CPU on
>> clients and datanodes. Plus, a checksum of 4 bytes for every 512 bytes means
>> that for a 64MB block, the checksum data will be 512KB, i.e. 128 ext3 blocks.
>> Changing it to generate
>
> Know thine usage scenarios.
Yup.
- milind
---
Milind Bhandarkar
(mbhandar...@linkedin.com)
(650-776-3236)
I'm not sure about that. I wrote a small checksum program for testing.
After the block size gets larger than 8192 bytes, I don't see
much performance improvement. See the code below. I don't think 64MB can
bring us any benefit.
I did change io.bytes.per.checksum to 131072 in Hadoop, and the
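The code Da refers to didn't make it into this digest; a minimal reconstruction of that kind of chunk-size test (my sketch, not his actual program):

import java.util.zip.CRC32;

public class ChecksumBench {
  public static void main(String[] args) {
    byte[] data = new byte[64 << 20];  // 64MB; contents don't matter for timing
    int[] chunkSizes = {512, 4096, 8192, 65536, 131072};
    for (int chunk : chunkSizes) {
      CRC32 crc = new CRC32();
      long start = System.nanoTime();
      for (int off = 0; off < data.length; off += chunk) {
        crc.update(data, off, chunk);
        crc.reset();  // one checksum per chunk, mirroring io.bytes.per.checksum
      }
      long ms = (System.nanoTime() - start) / 1000000;
      System.out.println(chunk + " bytes/checksum: " + ms + " ms");
    }
  }
}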
Is 20.3 a 'dead' release?
I haven't seen any discussion on the Apache lists about creating a 20.3
release, and it kind of goes against all the discussion we recently had with
StAck about creating an 'append' release on 0.20.
I'm not against 20.3, but I would like to see some discussion
Zhenhua Guo wrote:
> It seems that mapred.task.cache.levels is used by the JobTracker to create
> task caches for nodes at various levels. This makes data-locality
> scheduling possible.
> If I set mapred.task.cache.levels to 0 and use the default network
> topology, then the MapReduce job will stall forever
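For anyone wanting to poke at this: mapred.task.cache.levels is a plain integer property (default 2, one cache level for hosts and one for racks); a hypothetical snippet for setting it on a job:

import org.apache.hadoop.mapred.JobConf;

public class CacheLevels {
  public static void main(String[] args) {
    JobConf conf = new JobConf();
    // Default is 2 (host level + rack level). Per the report above, 0 leaves
    // the JobTracker with no task caches to schedule from.
    conf.setInt("mapred.task.cache.levels", 2);
    System.out.println(conf.getInt("mapred.task.cache.levels", 2));
  }
}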
Have you tried with org.apache.hadoop.util.DataChecksum and
org.apache.hadoop.util.PureJavaCrc32?
- Milind
On Jan 5, 2011, at 3:42 PM, Da Zheng wrote:
> I'm not sure about that. I wrote a small checksum program for testing. After the
> block size gets larger than 8192 bytes, I don't see much performance improvement.
SequenceFile.createWriter ignores FileSystem parameter
--
Key: HADOOP-7087
URL: https://issues.apache.org/jira/browse/HADOOP-7087
Project: Hadoop Common
Issue Type: Bug
Components
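To illustrate what the summary describes: SequenceFile.createWriter has overloads that take an explicit FileSystem, as in the hypothetical usage below (not a test case from the JIRA). Per the report, the passed-in fs may not actually be the file system the writer ends up on:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class CreateWriterDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem localFs = FileSystem.getLocal(conf);  // explicitly request the local FS
    Path out = new Path("/tmp/demo.seq");
    // Per the summary above, this overload may resolve the file system from
    // the path/conf rather than honoring the localFs argument.
    SequenceFile.Writer writer = SequenceFile.createWriter(
        localFs, conf, out, IntWritable.class, Text.class);
    try {
      writer.append(new IntWritable(1), new Text("hello"));
    } finally {
      writer.close();
    }
  }
}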
JMX Bean that exposes version and build information
---
Key: HADOOP-7088
URL: https://issues.apache.org/jira/browse/HADOOP-7088
Project: Hadoop Common
Issue Type: New Feature
Report
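A rough sketch of what such a bean could look like, wiring the existing org.apache.hadoop.util.VersionInfo accessors into JMX. The interface, class name, and ObjectName below are hypothetical, not the design from the JIRA:

import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import org.apache.hadoop.util.VersionInfo;

public class VersionMetrics implements VersionMetrics.VersionMXBean {
  // Hypothetical MXBean interface; the attributes HADOOP-7088 ends up
  // exposing may differ.
  public interface VersionMXBean {
    String getVersion();
    String getBuildVersion();
  }

  public String getVersion() {
    return VersionInfo.getVersion();
  }

  public String getBuildVersion() {
    // version + revision + build user + source checksum
    return VersionInfo.getBuildVersion();
  }

  public static void main(String[] args) throws Exception {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    server.registerMBean(new VersionMetrics(),
        new ObjectName("Hadoop:service=Common,name=VersionInfo"));
    System.out.println("Registered; inspect with jconsole.");
    Thread.sleep(Long.MAX_VALUE);
  }
}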
Use readlink to get absolute paths in the scripts
--
Key: HADOOP-7089
URL: https://issues.apache.org/jira/browse/HADOOP-7089
Project: Hadoop Common
Issue Type: Improvement
Components
Isn't DataChecksum just a wrapper around CRC32?
I'm still using Hadoop 0.20.2; there is no PureJavaCrc32 there.
Da
On 1/5/11 7:44 PM, Milind Bhandarkar wrote:
> Have you tried with org.apache.hadoop.util.DataChecksum and
> org.apache.hadoop.util.PureJavaCrc32?
>
> - Milind
>
> On Jan 5, 2011, at 3:42 PM
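For what it's worth, PureJavaCrc32 implements java.util.zip.Checksum (it came into trunk for 0.21, via HADOOP-6148 if I remember right), so on a newer tree the two are drop-in comparable through the common interface; a quick sketch:

import java.util.zip.CRC32;
import java.util.zip.Checksum;
import org.apache.hadoop.util.PureJavaCrc32;  // not in 0.20.2, as noted above

public class CrcCompare {
  // Time one Checksum implementation over `total` bytes, fed in chunk-sized updates.
  static double seconds(Checksum sum, byte[] chunk, long total) {
    long start = System.nanoTime();
    for (long done = 0; done < total; done += chunk.length) {
      sum.update(chunk, 0, chunk.length);
    }
    return (System.nanoTime() - start) / 1e9;
  }

  public static void main(String[] args) {
    byte[] chunk = new byte[65536];
    long total = 1L << 30;  // checksum 1GB
    System.out.println("CRC32:         " + seconds(new CRC32(), chunk, total) + "s");
    System.out.println("PureJavaCrc32: " + seconds(new PureJavaCrc32(), chunk, total) + "s");
  }
}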
Possible resource leaks in hadoop core code
---
Key: HADOOP-7090
URL: https://issues.apache.org/jira/browse/HADOOP-7090
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 0.21.0
R
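The issue text is truncated here, so the specific leaks aren't visible, but the usual fix pattern for this class of bug is close-in-finally with the null-safe IOUtils helper; a generic sketch:

import java.io.IOException;
import java.io.InputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class NoLeak {
  // Close in finally so an exception mid-read cannot leak the stream
  // (and the socket/file descriptor underneath it).
  static void readAll(FileSystem fs, Path p) throws IOException {
    InputStream in = null;
    try {
      in = fs.open(p);
      byte[] buf = new byte[4096];
      while (in.read(buf) != -1) {
        // consume the data
      }
    } finally {
      IOUtils.closeStream(in);  // null-safe, ignores errors from close()
    }
  }
}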
[
https://issues.apache.org/jira/browse/HADOOP-6872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Konstantin Shvachko resolved HADOOP-6872.
-
Resolution: Duplicate
Fixed as a part of HADOOP-6906.
> ChecksumFs#listStatus s
[
https://issues.apache.org/jira/browse/HADOOP-6718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Konstantin Shvachko resolved HADOOP-6718.
-
Resolution: Duplicate
Incorporated in HADOOP-6706 for 0.22.
> Client does not c