Shoot, I meant to reply to the group, not respond to Mark directly. (Mark replied offline to me; I'm not sure of the etiquette of pasting that response in here as well!)
Hi Mark, thanks for the response! I tried using the memory-intensive bootstrap action and got a different error, though I'm not sure whether it represents progress or a regression. (I thought the memory-intensive script was for memory-intensive map-reduce jobs, not table DDL, so I am wondering if it made things even worse.)

    java.lang.OutOfMemoryError: GC overhead limit exceeded

As for the other suggestion: I agree that 15k partitions (and growing) is unruly, but the files are not small! Each is over one gigabyte and represents one hour from the past twenty months. I would imagine others have similar setups and some way around this issue. Also, since this worked on the older Hadoop/Hive stack, I suspect there is some configuration item I should be able to tweak.

In the meantime, I am tempted to drop the entire database and recreate it from scratch (since all tables are external anyway). If no solution is found, we will probably look into some kind of hybrid system where older data is archived in other tables and a union is used in queries (rough sketch below, after my sign-off).

Sincerely,
Matt
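
P.S. For the hybrid idea, this is roughly what I have in mind, purely as a sketch: the table and view names (events_recent, events_archive, events_all) are placeholders, and I'm assuming both tables share the same schema. If I remember right, our Hive version only accepts UNION ALL inside a subquery, so something like:

    -- View that stitches the current and archived tables back together,
    -- so existing queries don't have to change.
    CREATE VIEW events_all AS
    SELECT * FROM (
        SELECT * FROM events_recent
        UNION ALL
        SELECT * FROM events_archive
    ) unioned;

Queries would then go against events_all, and we'd periodically move older partitions out of events_recent and into events_archive.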