Could you tell us the version number of Hive?

Thanks,
Navis
2014-10-15 2:00 GMT+09:00 Time Less <timelessn...@gmail.com>:

> I have found a work-around for this bug. After you issue the ALTER
> TABLE...CONCATENATE command, issue:
>
> ALTER TABLE T1 PARTITION (P1) SET LOCATION
> ".../apps/hive/warehouse/DB1/T1/P1";
>
> This will fix the metadata that CONCATENATE breaks.
>
> ––
> *Tim Ellis:* 510-761-6610
>
>
> On Mon, Oct 13, 2014 at 10:37 PM, Time Less <timelessn...@gmail.com> wrote:
>
>> Has anyone seen anything like this? Google searches turned up nothing, so
>> I thought I'd ask here, then file a JIRA if no one thinks I'm doing it
>> wrong.
>>
>> If I ALTER a particular table with three partitions once, it works. The
>> second time it works, too, but it reports it is moving a directory to the
>> Trash that doesn't exist (still, this doesn't kill it). The third time I
>> ALTER the table, it crashes, because the directory structure has been
>> modified to something invalid.
>>
>> Here's a nearly-full output of the 2nd and 3rd runs. The ALTER is exactly
>> the same both times (I just press UP ARROW):
>>
>> *HQL, 2nd Run:*
>> hive (analytics)> alter table bidtmp partition
>> (log_type='bidder',dt='2014-05-01',hour=11) concatenate ;
>>
>> *Output:*
>> Starting Job = job_1412894367814_0017, Tracking URL =
>> ....application_1412894367814_0017/
>> Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1412894367814_0017
>> Hadoop job information for null: number of mappers: 97; number of reducers: 0
>> 2014-10-13 20:28:23,143 null map = 0%, reduce = 0%
>> 2014-10-13 20:28:36,042 null map = 1%, reduce = 0%, Cumulative CPU 49.69 sec
>> ...
>> 2014-10-13 20:31:56,415 null map = 99%, reduce = 0%, Cumulative CPU 812.65 sec
>> 2014-10-13 20:31:57,458 null map = 100%, reduce = 0%, Cumulative CPU 813.88 sec
>> MapReduce Total cumulative CPU time: 13 minutes 33 seconds 880 msec
>> Ended Job = job_1412894367814_0017
>> Loading data to table analytics.bidtmp partition (log_type=bidder,
>> dt=2014-05-01, hour=11)
>> rmr: DEPRECATED: Please use 'rm -r' instead.
>> Moved: '.../apps/hive/warehouse/analytics.db/bidtmp/*dt=2014-05-01/hour=11/log_type=bidder*'
>> to trash at: .../user/hdfs/.Trash/Current
>> *(note the bold-faced path doesn't exist; the partition is specified as
>> log_type first, then dt, then hour)*
>> Partition analytics.bidtmp*{log_type=bidder, dt=2014-05-01, hour=11}*
>> stats: [numFiles=0, numRows=0, totalSize=0, rawDataSize=0]
>> *(here, the partition ordering is correct!)*
>> MapReduce Jobs Launched:
>> Job 0: Map: 97   Cumulative CPU: 813.88 sec   HDFS Read: 30298871932
>> HDFS Write: 28746848923 SUCCESS
>> Total MapReduce CPU Time Spent: 13 minutes 33 seconds 880 msec
>> OK
>> Time taken: 224.128 seconds
>>
>> *HQL, 3rd Run:*
>> hive (analytics)> alter table bidtmp partition
>> (log_type='bidder',dt='2014-05-01',hour=11) concatenate ;
>>
>> *Output:*
>> java.io.FileNotFoundException: File does not exist:
>> .../apps/hive/warehouse/analytics.db/bidtmp/dt=2014-05-01/hour=11/log_type=bidder
>> *(because it should be log_type=.../dt=.../hour=..., not this order)*
>>     at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
>>     at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
>>     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>     at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
>>     at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:419)
>>     at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:520)
>>     at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:512)
>>     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
>>     at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
>>     at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
>>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
>>     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>>     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
>>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>>     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>>     at org.apache.hadoop.hive.ql.io.rcfile.merge.BlockMergeTask.execute(BlockMergeTask.java:214)
>>     at org.apache.hadoop.hive.ql.exec.DDLTask.mergeFiles(DDLTask.java:511)
>>     at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:458)
>>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1508)
>>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1275)
>>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1093)
>>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:916)
>>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906)
>>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
>>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
>>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
>>     at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
>>     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
>>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>> Job Submission failed with exception 'java.io.FileNotFoundException(File
>> does not exist:
>> .../apps/hive/warehouse/analytics.db/bidtmp/dt=2014-05-01/hour=11/log_type=bidder)'
>> FAILED: Execution Error, return code 1 from
>> org.apache.hadoop.hive.ql.exec.DDLTask
>>
>> ––
>> *Tim Ellis:* 510-761-6610
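>
> P.S. To spell the work-around out end-to-end against the partition from
> my logs above, something like this (a sketch, not a tested recipe: the
> "..." prefix is elided exactly as in the output, so substitute your real
> warehouse path, and the directory order log_type/dt/hour is inferred from
> the FileNotFoundException, i.e. the declared partition-key order):
>
>     -- merge the small files in the partition; this is the step that
>     -- leaves the metadata pointing at the scrambled dt/hour/log_type path
>     ALTER TABLE bidtmp
>       PARTITION (log_type='bidder', dt='2014-05-01', hour=11)
>       CONCATENATE;
>
>     -- re-point the partition at the directory in declared key order
>     ALTER TABLE bidtmp
>       PARTITION (log_type='bidder', dt='2014-05-01', hour=11)
>       SET LOCATION
>       '.../apps/hive/warehouse/analytics.db/bidtmp/log_type=bidder/dt=2014-05-01/hour=11';
>
>     -- optional sanity check: confirm the location the metastore now records
>     DESCRIBE FORMATTED bidtmp
>       PARTITION (log_type='bidder', dt='2014-05-01', hour=11);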