Dear members,

I am trying to copy data between two HDFS clusters. The first cluster (nn1) runs:

Hadoop 0.20.1+169.127
Subversion -r 2157de3c7179c7e244c907fb9c8804e1c076f050
Compiled by root on Sun Jan 16 19:29:48 UTC 2011
From source with checksum e28f0ec421b292b8d07210057a756bc8
The second cluster (nn2) runs:

Hadoop 2.0.0-cdh4.1.1
Subversion file:///data/1/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hadoop-2.0.0-cdh4.1.1/src/hadoop-common-project/hadoop-common -r 581959ba23e4af85afd8db98b7687662fe9c5f20
Compiled by jenkins on Tue Oct 16 11:19:12 PDT 2012
From source with checksum 95f5c7f30b4030f1f327758e7b2bd61f

When I try to copy the data I get the error below. I run this command on a server with Hadoop 2.0.0:

hadoop distcp -p -i hftp://nn1:50070/test/100m hdfs:///testdata/

12/10/29 12:03:25 INFO tools.DistCp: srcPaths=[hftp://css-st-heartbeat.scartel.dc:50070/test/100m]
12/10/29 12:03:25 INFO tools.DistCp: destPath=hdfs:/testdata
12/10/29 12:03:26 WARN conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
12/10/29 12:03:26 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
12/10/29 12:03:28 INFO tools.DistCp: sourcePathsCount=1
12/10/29 12:03:28 INFO tools.DistCp: filesToCopyCount=1
12/10/29 12:03:28 INFO tools.DistCp: bytesToCopyCount=100.0m
12/10/29 12:03:28 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
12/10/29 12:03:28 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
12/10/29 12:03:28 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/10/29 12:03:28 INFO mapred.LocalJobRunner: OutputCommitter set in config null
12/10/29 12:03:28 INFO mapred.JobClient: Running job: job_local_0001
12/10/29 12:03:28 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapred.FileOutputCommitter
12/10/29 12:03:28 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
12/10/29 12:03:28 INFO util.ProcessTree: setsid exited with exit code 0
12/10/29 12:03:28 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@6c8b058b
12/10/29 12:03:28 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and BYTES_READ as counter name instead
12/10/29 12:03:28 INFO mapred.MapTask: numReduceTasks: 0
12/10/29 12:03:28 INFO tools.DistCp: FAIL 100m : java.io.IOException: HTTP_OK expected, received 400
        at org.apache.hadoop.hdfs.HftpFileSystem$RangeHeaderUrlOpener.connect(HftpFileSystem.java:365)
        at org.apache.hadoop.hdfs.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:119)
        at org.apache.hadoop.hdfs.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:103)
        at org.apache.hadoop.hdfs.ByteRangeInputStream.read(ByteRangeInputStream.java:187)
        at java.io.DataInputStream.read(DataInputStream.java:83)
        at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:424)
        at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:547)
        at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:314)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:393)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:263)
12/10/29 12:03:28 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
12/10/29 12:03:28 INFO mapred.LocalJobRunner:
12/10/29 12:03:28 INFO mapred.Task: Task attempt_local_0001_m_000000_0 is allowed to commit now
12/10/29 12:03:28 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local_0001_m_000000_0' to hdfs://test2.video.scartel.dc:8020/testdata/_distcp_logs_svpizr
12/10/29 12:03:28 INFO mapred.LocalJobRunner: Copied: 0 Skipped: 0 Failed: 1
12/10/29 12:03:28 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
12/10/29 12:03:29 INFO mapred.JobClient: map 100% reduce 0%
12/10/29 12:03:29 INFO mapred.JobClient: Job complete: job_local_0001
12/10/29 12:03:29 INFO mapred.JobClient: Counters: 26
12/10/29 12:03:29 INFO mapred.JobClient:   File System Counters
12/10/29 12:03:29 INFO mapred.JobClient:     FILE: Number of bytes read=175990
12/10/29 12:03:29 INFO mapred.JobClient:     FILE: Number of bytes written=263692
12/10/29 12:03:29 INFO mapred.JobClient:     FILE: Number of read operations=0
12/10/29 12:03:29 INFO mapred.JobClient:     FILE: Number of large read operations=0
12/10/29 12:03:29 INFO mapred.JobClient:     FILE: Number of write operations=0
12/10/29 12:03:29 INFO mapred.JobClient:     HDFS: Number of bytes read=0
12/10/29 12:03:29 INFO mapred.JobClient:     HDFS: Number of bytes written=975
12/10/29 12:03:29 INFO mapred.JobClient:     HDFS: Number of read operations=7
12/10/29 12:03:29 INFO mapred.JobClient:     HDFS: Number of large read operations=0
12/10/29 12:03:29 INFO mapred.JobClient:     HDFS: Number of write operations=6
12/10/29 12:03:29 INFO mapred.JobClient:     HFTP: Number of bytes read=0
12/10/29 12:03:29 INFO mapred.JobClient:     HFTP: Number of bytes written=0
12/10/29 12:03:29 INFO mapred.JobClient:     HFTP: Number of read operations=0
12/10/29 12:03:29 INFO mapred.JobClient:     HFTP: Number of large read operations=0
12/10/29 12:03:29 INFO mapred.JobClient:     HFTP: Number of write operations=0
12/10/29 12:03:29 INFO mapred.JobClient:   Map-Reduce Framework
12/10/29 12:03:29 INFO mapred.JobClient:     Map input records=1
12/10/29 12:03:29 INFO mapred.JobClient:     Map output records=1
12/10/29 12:03:29 INFO mapred.JobClient:     Input split bytes=145
12/10/29 12:03:29 INFO mapred.JobClient:     Spilled Records=0
12/10/29 12:03:29 INFO mapred.JobClient:     CPU time spent (ms)=0
12/10/29 12:03:29 INFO mapred.JobClient:     Physical memory (bytes) snapshot=0
12/10/29 12:03:29 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=0
12/10/29 12:03:29 INFO mapred.JobClient:     Total committed heap usage (bytes)=250413056
12/10/29 12:03:29 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
12/10/29 12:03:29 INFO mapred.JobClient:     BYTES_READ=128
12/10/29 12:03:29 INFO mapred.JobClient:   distcp
12/10/29 12:03:29 INFO mapred.JobClient:     Bytes expected=104857600
12/10/29 12:03:29 INFO mapred.JobClient:     Files failed=1

I don't understand why this error occurs. Could it be a bug in Hadoop 0.20.1 or in Hadoop 2.0.0?
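To narrow it down, the hftp read path can be tested without distcp from the same CDH4 node. hadoop fs -cat goes through the same HftpFileSystem/ByteRangeInputStream code that failed above, so hitting the same 400 here would point at the hftp client/server pair rather than at distcp itself (host and path are the ones from the command above):

  # listing only talks to the namenode and may succeed even when data reads fail
  hadoop fs -ls hftp://nn1:50070/test/
  # cat exercises the byte-range data read that distcp failed on
  hadoop fs -cat hftp://nn1:50070/test/100m > /dev/null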
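The raw HTTP exchange can also be checked with curl against the 0.20 namenode. The servlet paths below (/listPaths and /data on port 50070) are the hftp defaults I would expect on a non-secure 0.20 cluster, and the ugi=user,group parameter format is only my assumption, so treat this as a sketch to adapt rather than an exact recipe:

  # metadata request: should come back 200 with an XML listing of /test
  curl -i "http://nn1:50070/listPaths/test?ugi=hdfs,hdfs"
  # data request: the namenode should answer with a redirect to a datanode streamFile URL;
  # a 400 here would reproduce the failure outside of Hadoop entirely
  curl -i "http://nn1:50070/data/test/100m?ugi=hdfs,hdfs"

If the data request already fails, capturing what the CDH4 client actually sends (for example with tcpdump on port 50070) and comparing it with these requests should show which parameter or header the old servlet rejects.

--
Best regards,
Evgeniy Selyavka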