Shoot, I meant to reply to the group, not respond to Mark directly.  (Mark 
replied offline to me; I'm not sure of the etiquette of pasting that response 
in here as well!)


Hi Mark, thanks for the response!  I tried using the memory-intensive 
bootstrap action and got a different error; however, I'm not sure whether it 
represents progress in the right direction or a regression.  (I thought the 
memory-intensive script was for memory-intensive map-reduce jobs -- not 
table DDL -- so I am wondering if it made things even worse.)


java.lang.OutOfMemoryError: GC overhead limit exceeded

As for the other suggestion, I agree that 15k partitions (and growing) is 
unwieldy; but the files are not small!  Each is over one gigabyte and 
represents one hour of data from the past twenty months.  I imagine others 
must have similar setups and some way around this issue.  Also, since it 
worked on the older Hadoop/Hive stack, I suspect there is some configuration 
item I should be able to tweak.


In the meantime, I am tempted to drop the entire database and recreate it 
from scratch (since all of the tables are external anyway).  If no solution 
is found, we will probably look into some kind of hybrid system where older 
data is archived in other tables and a union is used in queries.
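
Roughly something like the following -- the table and view names are just 
placeholders for illustration, not our real schema:

  -- Recent data stays in the hourly-partitioned external table; older
  -- partitions get consolidated into a coarser archive table, and
  -- queries go through a view that unions the two.
  CREATE VIEW events_all AS
  SELECT * FROM (
    SELECT * FROM events_recent
    UNION ALL
    SELECT * FROM events_archive
  ) unioned;

That would keep the number of partitions the metastore has to enumerate 
small for the common case, while still letting a single query span the 
full history.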


Sincerely,
Matt
