The "data_table" has around 5k fields, all doubles. As for the "age_mean" table, here it is:
hive> desc age_mean;
OK
id          string
name        string
age_mean    double
Time taken: 0.127 seconds

Does this help?

Thanks!
Fernando

On Tue, Jan 15, 2013 at 4:35 PM, Mark Grover <grover.markgro...@gmail.com> wrote:

> Fernando,
> Could you share your table definitions as well please?
>
>
> On Tue, Jan 15, 2013 at 10:31 AM, Fernando Andrés Doglio Turissini <
> fernando.dog...@globant.com> wrote:
>
>> Hello everyone, I'm struggling with an exception I'm getting on a
>> particular query that's driving me crazy!
>>
>> Here is the exception I get:
>>
>> java.lang.RuntimeException:
>> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
>> processing writable org.apache.hadoop.hive.serde2.columnar.BytesRefArrayWritable@71412b61
>>         at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:441)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:377)
>>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
>>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime
>> Error while processing writable
>> org.apache.hadoop.hive.serde2.columnar.BytesRefArrayWritable@71412b61
>>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
>>         at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
>>         ... 8 more
>> Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
>>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:506)
>>         ... 9 more
>>
>>
>> Here is the query I'm running:
>>
>> INSERT INTO TABLE variance
>> SELECT id, collect_set(name)[0], SUM( POW(age - age_mean, 2) ) / count(1)
>> FROM age_mean join data_table on (age_mean.id = '01' AND data_table.q1 = 1)
>> where age is not null and age_mean is not null
>> GROUP BY id;
>>
>> It's probably relevant to mention that I'm doing this on an EMR cluster.
>>
>> Any idea what might be causing the exception?
>>
>> Thanks!
>> Fernando
>>
>
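
For readability, the variance computation in the quoted query can be restated with explicit table aliases. The sketch below assumes that age, name, and q1 are columns of data_table (their definitions are not shown in the thread, since the table has ~5k fields); it is only a clearer rendering of what the original query appears to intend, not a fix for the exception:

INSERT INTO TABLE variance
SELECT m.id,
       collect_set(d.name)[0],                        -- pick one name per group
       SUM(POW(d.age - m.age_mean, 2)) / COUNT(1)     -- sum of squared deviations / n
FROM age_mean m
JOIN data_table d ON (m.id = '01' AND d.q1 = 1)       -- constant-only ON clause, kept as in the original
WHERE d.age IS NOT NULL
  AND m.age_mean IS NOT NULL
GROUP BY m.id;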