Same here ... we have way more than 256 partitions in multiple tables. I suspect the issue has something to do with an empty string being passed to the substr function. Can you validate that the table has no NULL/empty string for user_name, or try running the query with length(user_name) > 0 to filter empty strings out (not sure of the exact syntax)?
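Something along these lines might help confirm it (untested sketch; the table, partition, and column names are copied from your query below, and I'm assuming Hive's length() builtin):

-- count NULL/empty user_name values in the failing partition
-- (u_s_h_b, dtpartition, user_name taken from the original query; untested)
select count(*)
from u_s_h_b
where dtpartition='2010-10-24'
  and (user_name is null or user_name = '');

-- re-run the original aggregation, skipping empty strings
select substr(user_name,1,1), count(*)
from u_s_h_b
where dtpartition='2010-10-24'
  and length(user_name) > 0
group by substr(user_name,1,1);

If the first query returns anything other than 0, that would point at the empty-string theory rather than a partition limit.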
On Tue, May 3, 2011 at 7:02 PM, Steven Wong <sw...@netflix.com> wrote:

> I have way more than 256 partitions per table. AFAIK, there is no partition
> limit.
>
> From your stack trace, you have some host name issue somewhere.
>
>
> *From:* Time Less [mailto:timelessn...@gmail.com]
> *Sent:* Tuesday, May 03, 2011 6:52 PM
> *To:* user@hive.apache.org
> *Subject:* Maximum Number of Hive Partitions = 256?
>
> I created a partitioned table, partitioned daily. If I query the earlier
> partitions, everything works. The later ones fail with error:
>
> hive> select substr(user_name,1,1),count(*) from u_s_h_b where
> dtpartition='2010-10-24' group by substr(user_name,1,1) ;
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks not specified. Estimated from input data size: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapred.reduce.tasks=<number>
> java.lang.ArrayIndexOutOfBoundsException: 0
>         at org.apache.hadoop.mapred.FileInputFormat.identifyHosts(FileInputFormat.java:556)
>         at org.apache.hadoop.mapred.FileInputFormat.getSplitHosts(FileInputFormat.java:524)
>         at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:235)
> ......snip.......
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> Job Submission failed with exception 'java.lang.ArrayIndexOutOfBoundsException(0)'
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
>
> It turns out that 2010-10-24 is 257 days from the very first partition in
> my dataset (2010-01-09):
>
> | date_sub('2010-10-24',interval 257 day) |
> +-----------------------------------------+
> | 2010-02-09                              |
>
> That seems like an interesting coincidence. But try as I might, the Great
> Googles will not show me a way to tune this, or even if it is tuneable, or
> expected. Has anyone else run into a 256-partition limit in Hive? How do you
> work around it? Why is that even the limit?! Shouldn't it be more like
> 32-bit maxint??!!
>
> Thanks!
>
> --
> Tim Ellis
> Riot Games