If you are using Sqoop 1.4.4, you could use the hcatalog table option to
bring the data into Hive. That makes the import agnostic of the Hive table
format (you can use ORC, for example, or RCFile) and it handles
partitioning easily (including dynamic partitions). And you can directly
export from a Hive table the same way.
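
For illustration, a minimal sketch of such an import (the JDBC URL,
credentials, and table names below are placeholders, not from this
thread; the --hcatalog-* options are the ones that shipped with 1.4.4):

# Import the customer table straight into an ORC-backed Hive table via
# HCatalog. --create-hcatalog-table creates the table if it does not
# exist, and the storage stanza picks the file format (ORC here; swap
# in "stored as rcfile" for RCFile).
sqoop import \
  --connect jdbc:mysql://dbhost/sales \
  --username sqoopuser \
  --password-file /user/sqoop/db.password \
  --table customer \
  --hcatalog-database default \
  --hcatalog-table customer \
  --create-hcatalog-table \
  --hcatalog-storage-stanza "stored as orcfile"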
Manish,
Thanks for the reply.
1. Load to HDFS, but beware of Sqoop's error handling: since it is a
MapReduce-based framework, if one mapper fails you can end up with
partial data.

So are you saying that, as long as I handle errors in Sqoop, going with
100 HDFS folders/files is OK? (One way to make that safe is sketched
below.)
2. Create partitions based on date and hour, if the customer table has a
date or timestamp column.
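
One way to guard against the partial-data risk in point 1 is to land the
import in a scratch directory and publish it only when the job succeeds.
A minimal sketch, with made-up paths (--target-dir, --delete-target-dir,
and --num-mappers are standard Sqoop options):

# Land the import in a scratch directory; --delete-target-dir makes a
# re-run after a failed attempt start from a clean slate.
sqoop import \
  --connect jdbc:mysql://dbhost/sales \
  --username sqoopuser \
  --password-file /user/sqoop/db.password \
  --table customer \
  --target-dir /tmp/customer_staging \
  --delete-target-dir \
  --num-mappers 8

# Publish only if every mapper succeeded (exit status 0), so readers of
# /data/customer never see partial data. The directory name stamps date
# and hour, which also lines up with the partitioning in point 2.
if [ $? -eq 0 ]; then
  hdfs dfs -mv /tmp/customer_staging /data/customer/$(date +%Y%m%d%H)
fi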
1. Load to HDFS, but beware of Sqoop's error handling: since it is a
MapReduce-based framework, if one mapper fails you can end up with
partial data.
2. Create partitions based on date and hour, if the customer table has a
date or timestamp column.
3. Think about the file format as well, as that will affect performance.
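
To make points 2 and 3 concrete, here is a sketch with assumed table and
column names (customer_orc, created_ts are illustrations, not from this
thread). It creates a date-partitioned ORC table and loads one day's
slice into a static partition; adding the hour as a second partition key
would need dynamic partitioning, so only the date key is shown:

# Point 3: choose a columnar format up front; ORC (or RCFile)
# compresses well and scans far faster than plain text files.
hive -e "
CREATE TABLE IF NOT EXISTS customer_orc (
  id BIGINT,
  name STRING,
  created_ts TIMESTAMP
)
PARTITIONED BY (dt STRING)
STORED AS ORC;"

# Point 2: pull one day's rows and drop them into the matching
# static partition via HCatalog.
sqoop import \
  --connect jdbc:mysql://dbhost/sales \
  --username sqoopuser \
  --password-file /user/sqoop/db.password \
  --table customer \
  --where "created_ts >= '2014-01-01' AND created_ts < '2014-01-02'" \
  --hcatalog-database default \
  --hcatalog-table customer_orc \
  --hive-partition-key dt \
  --hive-partition-value 2014-01-01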