Re: Large Scale Table Reprocess

2013-07-26 Thread Alan Gates
I believe: alter table _tablename_ set fileformat orcfile; will do what you want. All future partitions that are added will be in orcfile format (assuming you use insert to create the partitions) or assumed to be in orcfile format if you do alter table add partition. As to whether orcfile wil
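
A minimal sketch of the table-level change described above, in HiveQL. The table page_views, its columns, the partition column dt, and the staging table page_views_staging are all hypothetical names chosen for illustration. The format keyword is written as ORC, the spelling listed in the Hive DDL documentation; whether orcfile is also accepted may depend on the Hive version, so that is worth checking on your install.

```sql
-- Hypothetical table name. This changes table metadata only;
-- files in existing partitions are not rewritten.
ALTER TABLE page_views SET FILEFORMAT ORC;

-- A partition created by INSERT after the change is written
-- in the table's current format (ORC in this sketch).
INSERT OVERWRITE TABLE page_views PARTITION (dt='2013-07-27')
SELECT user_id, url
FROM page_views_staging
WHERE dt = '2013-07-27';
```

Partitions that existed before the ALTER keep their own storage metadata, which is why old RCFile partitions should remain readable alongside new ORC ones.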

Re: Large Scale Table Reprocess

2013-07-26 Thread John Omernik
More specifically, we have a table that is currently defined as RCFile. To do this, I'd like to define all new partitions as ORC. With the advent of ORC, these types of problems are going to come up for many folks, so any guidance would be appreciated ... Also, based on the strategic goals of ORC fi

Re: Large Scale Table Reprocess

2013-07-26 Thread John Omernik
Can you give some examples of how to alter partitions for different input types? I'd appreciate it :)

On Fri, Jul 26, 2013 at 3:29 PM, Alan Gates wrote:
> A table can definitely have partitions with different input formats/serdes. We test this all the time.
>
> Assuming your old data doesn't

Re: Large Scale Table Reprocess

2013-07-26 Thread Alan Gates
A table can definitely have partitions with different input formats/serdes. We test this all the time. Assuming your old data doesn't stay forever and most of your queries are on more recent data (which is usually the case), I'd advise you not to reprocess any data, just alter the table to s
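
For the per-partition case, a hedged HiveQL sketch follows. The table page_views, the partition column dt, and the warehouse path are hypothetical; only the statement forms (ALTER TABLE ... ADD PARTITION, ALTER TABLE ... PARTITION ... SET FILEFORMAT, DESCRIBE FORMATTED) come from the Hive DDL.

```sql
-- Register an existing directory as a new partition. Hive assumes the
-- files there are already in the table's current format.
ALTER TABLE page_views ADD PARTITION (dt='2013-07-28')
LOCATION '/warehouse/page_views/dt=2013-07-28';

-- Set the format of a single partition explicitly, for example to mark
-- an older partition as RCFile. This is a metadata change only.
ALTER TABLE page_views PARTITION (dt='2013-07-01') SET FILEFORMAT RCFILE;

-- Inspect the input format, output format, and serde recorded
-- for that partition.
DESCRIBE FORMATTED page_views PARTITION (dt='2013-07-01');
```

Running DESCRIBE FORMATTED on one old and one new partition is a cheap way to confirm that the two really carry different formats before querying across them.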

Large Scale Table Reprocess

2013-07-25 Thread John Omernik
Just finishing up testing with Hive 11 and ORC. Thank you to Owen and all those who have put hard work into this. ORC files alone, when compared to RC files in Hive 9, 10, and 11, showed a huge increase in performance; it was amazing. That said, now we gotta reprocess. We have a large table with lots