Re: Oracle to HDFS through Sqoop and a Hive External Table

2013-11-03 Thread Venkat Ranganathan
If you are using Sqoop 1.4.4, you could use the HCatalog table option to bring the data into Hive. That makes the import agnostic of the Hive table format (you can use ORC, for example, or RCFile) and it handles the partitioning easily (including dynamic partitions). And you can directly export from a Hive table as well.
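A minimal sketch of such an import, assuming a hypothetical Oracle table CUSTOMERS, Hive database sales, and connection details; the --hcatalog-* options are the Sqoop 1.4.4 HCatalog integration Venkat refers to, and --hcatalog-storage-stanza lets the target table be created as ORC (needs a Hive version with ORC support):

  sqoop import \
    --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
    --username scott \
    --password-file /user/scott/.oracle.pwd \
    --table CUSTOMERS \
    --hcatalog-database sales \
    --hcatalog-table customers \
    --create-hcatalog-table \
    --hcatalog-storage-stanza 'stored as orc' \
    -m 8

The reverse direction works the same way: sqoop export with the --hcatalog-database/--hcatalog-table options reads from the Hive table regardless of its storage format.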

Re: Oracle to HDFS through Sqoop and a Hive External Table

2013-11-03 Thread Raj Hadoop
Manish, thanks for the reply. 1. Load to HDFS, beware of Sqoop error handling; as it is a MapReduce-based framework, if one mapper fails you might end up with partial data. So are you saying that, if I can handle errors in Sqoop, going with 100 HDFS folders/files is OK? 2. Create partition …
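One way to keep a failed mapper from leaving partial data behind is to import into a staging directory and publish it only when Sqoop exits cleanly. A rough sketch, with hypothetical paths and connection details:

  STAGE=/data/staging/customers/dt=2013-11-03/hr=10
  FINAL=/data/customers/dt=2013-11-03/hr=10

  # Sqoop returns a non-zero exit code if the MapReduce job fails,
  # so the move below only happens after a complete import.
  sqoop import \
    --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
    --username scott \
    --password-file /user/scott/.oracle.pwd \
    --table CUSTOMERS \
    --target-dir "$STAGE" \
    -m 4

  if [ $? -eq 0 ]; then
    # Publish under the external table's location only on success.
    hdfs dfs -mkdir -p "$(dirname "$FINAL")"
    hdfs dfs -mv "$STAGE" "$FINAL"
  else
    # Discard the partial output so it never becomes visible to Hive.
    hdfs dfs -rm -r -skipTrash "$STAGE"
  fi

With that pattern, 100 target folders/files are fine as long as each load is published only after it succeeds.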

RE: Oracle to HDFS through Sqoop and a Hive External Table

2013-11-03 Thread manish.hadoop.work
1. Load to HDFS: beware of Sqoop error handling; as it is a MapReduce-based framework, if one mapper fails you might end up with partial data.
2. Create partitions based on date and hour, if the customer table has a date or timestamp column.
3. Think about the file format also, as that will affect …
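To illustrate points 2 and 3 together, a sketch of a Hive external table partitioned by date and hour; the database, columns, paths, and the choice of ORC are only examples:

  hive -e "
  CREATE EXTERNAL TABLE IF NOT EXISTS sales.customers_ext (
    customer_id BIGINT,
    name        STRING,
    created_ts  TIMESTAMP
  )
  PARTITIONED BY (dt STRING, hr STRING)
  STORED AS ORC
  LOCATION '/data/customers';

  -- Register each hourly directory after it has been loaded.
  ALTER TABLE sales.customers_ext ADD IF NOT EXISTS
    PARTITION (dt='2013-11-03', hr='10')
    LOCATION '/data/customers/dt=2013-11-03/hr=10';
  "

The format declared here has to match what Sqoop actually writes: plain text files for a --target-dir import, or ORC/RCFile when loading through the HCatalog options.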