Hi everyone,

I have a TSV file (around 4 GB). I have created a Hive table on it using the 
following command. Queries work fine without indexing. However, when I create 
an index on two columns and then query the table, I get the following error:

create table products (user_id String, session_id String, ordering_date Date, 
product_id String, reorder_date Date, ordering_mode int) ROW FORMAT DELIMITED 
FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;

CREATE INDEX user_id_ordering_mode_index ON TABLE 
products(user_id,ordering_mode) AS 
'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED 
REBUILD;
ALTER INDEX user_id_ordering_mode_index ON products REBUILD;
SET hive.index.compact.file=/user/hive/warehouse/default__ 
user_id_ordering_mode_index__;
SET hive.input.format=org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat;

hive> select count (*) from products where ordering_mode = 1;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
java.lang.NumberFormatException: For input string: "00000001218  1       hdfs://pzxdcc0250.cdbt.pldc.kp.org:8020/user/hive/warehouse/products/products_clean.tsv 16599219"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Long.parseLong(Long.java:441)
        at java.lang.Long.parseLong(Long.java:483)
        at org.apache.hadoop.hive.ql.index.HiveIndexResult.add(HiveIndexResult.java:174)
        at org.apache.hadoop.hive.ql.index.HiveIndexResult.<init>(HiveIndexResult.java:127)
        at org.apache.hadoop.hive.ql.index.HiveIndexedInputFormat.getSplits(HiveIndexedInputFormat.java:120)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:520)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:512)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1295)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1292)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1292)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:564)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:559)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:559)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:550)
        at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:425)
        at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1485)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1263)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1091)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:921)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.lang.NumberFormatException(For input string: "00000001218      1       hdfs://pzxdcc0250.cdbt.pldc.kp.org:8020/user/hive/warehouse/products/products.tsv        16599219")'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
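
A possible reading of the error, offered as a sketch and not a confirmed diagnosis: the string that fails to parse looks like a complete row of the index table (user_id, ordering_mode, file URI, offset), which suggests hive.index.compact.file may be pointing at the index table itself. The Hive indexing documentation describes an intermediate step in which only the `_bucketname` and `_offsets` columns, filtered by the query predicate, are written to a separate directory, and that directory is what hive.index.compact.file refers to. The output path /tmp/index_result below is hypothetical, and the index table name assumes Hive's usual <db>__<table>_<index>__ naming; SHOW TABLES should confirm the actual name.

```sql
-- Assumption: the index table follows Hive's default naming scheme.
-- /tmp/index_result is a hypothetical scratch directory.
INSERT OVERWRITE DIRECTORY '/tmp/index_result'
SELECT `_bucketname`, `_offsets`
FROM default__products_user_id_ordering_mode_index__
WHERE ordering_mode = 1;

-- Point the compact index input format at the filtered result,
-- not at the index table's warehouse directory.
SET hive.index.compact.file=/tmp/index_result;
SET hive.input.format=org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat;

SELECT count(*) FROM products WHERE ordering_mode = 1;
```

The WHERE clause in the first statement has to match the predicate of the final query, since the intermediate result only lists the file blocks that satisfy it.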
                                          
