Implement dynamic partitioning on a daily cadence. Example:

ParentDirectory/partition=Day1/Day1_n_files.gz
ParentDirectory/partition=Day2/Day2_n_files.gz
ParentDirectory/partition=Day30/Day30_n_files.gz

And so on...
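
As a rough sketch of how daily directories could be attached to one table: the Python snippet below only builds HiveQL `ALTER TABLE ... ADD PARTITION ... LOCATION` statements, one per daily import directory. Note that this registers existing directories as partitions explicitly, which is a different route from a dynamic-partition insert. The table name `daily_import`, the partition column `import_day`, and the absolute path `/hdfs_my_parent_directory` are assumptions made for illustration, and the sketch assumes an external table partitioned by that column already exists.

```python
def add_partition_statements(table, parent_dir, days, column="import_day"):
    """Build one ALTER TABLE ... ADD PARTITION statement per daily directory.

    Assumes the directories are named import_dir_day<N> under parent_dir,
    as in the layout described in this thread. The table and column names
    are hypothetical.
    """
    statements = []
    for day in days:
        location = f"{parent_dir}/import_dir_day{day}"
        statements.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS "
            f"PARTITION ({column}='day{day}') LOCATION '{location}';"
        )
    return statements


# Print the statements for the first three days (hypothetical names and path).
for stmt in add_partition_statements("daily_import", "/hdfs_my_parent_directory", range(1, 4)):
    print(stmt)
```

The printed statements could then be run from the Hive CLI after each daily import, so that a single table covers all the subdirectories and queries can filter on the partition column.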
You can also opt for monthly partitions rather than daily ones by comparing the file size with the HDFS block size: if each day's files are much smaller than a block, monthly partitions may be the better fit.

Let us know your comments on the same.

Thanks,
Saurabh

> On 16-May-2014, at 1:00 am, Matouk IFTISSEN <matouk.iftis...@ysance.com> wrote:
>
> The files are all in the same format (.gz), but they are in different subdirectories!
> My problem is this: I want to run a daily import from Oracle to HDFS, into directories like:
> hdfs_my_parent_directory/import_dir_day1/part_data_import.gz
> hdfs_my_parent_directory/import_dir_day2/part_data_import.gz
> ............
> hdfs_my_parent_directory/import_dir_day30/part_data_import.gz
>
> If I point the Hive table to the parent directory, hdfs_my_parent_directory, it didn't read (load) the data!
>
> How can I read all the files in the subdirectories using one (or two) Hive tables?
>
> Thanks in advance, and sorry for the grammar and spelling errors :)
>
> 2014-05-14 13:46 GMT+02:00 Joshua Fennessy <j...@joshuafennessy.com>:
>> If those files are all the same format, you would point the Hive table to
>> the parent directory. It will recurse through and find all of the files to
>> include in the table.
>>
>> You can filter files to use multiple tables, but recursion is designed in.
>>
>>> On May 14, 2014, at 7:31 AM, "Matouk IFTISSEN" <matouk.iftis...@ysance.com> wrote:
>>>
>>> Hey Geeks,
>>> Is there a best way to load or read data into a Hive table (normal or
>>> external) from multiple subdirectories?
>>> Example: I have a directory my_directory that contains many subdirectories, i.e.:
>>> my_directory --> my_subdirectory1, my_subdirectory2, ..., my_subdirectoryx
>>> and my data (files) are in those subdirectories. How can I read them
>>> with one or two Hive tables?
>>>
>>> Thanks in advance
>>> --
>>> Matouk IFTISSEN | Consultant BI & Big Data
>>> 24 rue du sentier - 75002 Paris - www.ysance.com
>>> Fax : +33 1 73 72 97 26