Hi,

On 25 Aug 2012, at 05:58, Ravi Shetye <ravi.she...@vizury.com> wrote:

> Thanks Richin and Pedro,
> So a final clarification
> Another way of doing apart from dynamic partition is if you can create
> your directories like below either manually or the ETL process you might be
> doing to get the table data it is pretty easy.
>
> s3://ravi/logs/adv_id=123/date=2012-01-01/log.gz
> s3://ravi/logs/adv_id=456/date=2012-01-02/log.gz
> s3://ravi/logs/adv_id=123/date=2012-01-03/log.gz
>
> 1)Since I have used PARTITIONED BY (adv_id STRING,date STRING) Hive system
> will read the bucket name adv_id=123 and understand that the data within this
> bucket can be accessed by a pseudo column adv_id?

Yes.

> 2) It would be wrong if I use PARTITIONED BY (date STRING,adv_id STRING) and
> keep the same bucket structure?

Yes, the order of the fields in PARTITIONED BY must match the directory structure.

> 3)Also it wont work if I store data in s3://ravi/logs/123/2012-01-01/log.gz ?

No, that won't work; each directory name needs the xxx= prefix.

Cheers,
Pedro
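
For illustration only, the three answers above can be sketched as a small Python check. This is not Hive's own code, just a rough model of the rule described in this thread: each directory under the table location must be named column=value, and the columns must appear in the same order as in PARTITIONED BY. The function names and the assumption that s3://ravi/logs is the table's LOCATION are hypothetical.

```python
# Rough model of the layout rule discussed above; NOT Hive's implementation.
# TABLE_ROOT is assumed to be the table's LOCATION (an assumption, not stated
# in the thread).
TABLE_ROOT = "s3://ravi/logs"


def partition_spec(path, table_root=TABLE_ROOT):
    """Return [(column, value), ...] parsed from the directories under table_root."""
    if not path.startswith(table_root):
        raise ValueError("path is not under the table root: " + path)
    relative = path[len(table_root):].strip("/")
    directories = relative.split("/")[:-1]  # drop the file name (log.gz)
    spec = []
    for directory in directories:
        column, sep, value = directory.partition("=")
        if not sep or not column:
            # Question 3: a bare "123" or "2012-01-01" carries no column name.
            raise ValueError("directory is not column=value: " + directory)
        spec.append((column, value))
    return spec


def matches_table(path, partition_columns, table_root=TABLE_ROOT):
    """True if the path's column=value directories follow the declared order."""
    try:
        spec = partition_spec(path, table_root)
    except ValueError:
        return False
    # Question 2: the order must match PARTITIONED BY exactly.
    return [column for column, _ in spec] == list(partition_columns)


if __name__ == "__main__":
    good = "s3://ravi/logs/adv_id=123/date=2012-01-01/log.gz"
    bare = "s3://ravi/logs/123/2012-01-01/log.gz"
    print(matches_table(good, ["adv_id", "date"]))  # question 1
    print(matches_table(good, ["date", "adv_id"]))  # question 2
    print(matches_table(bare, ["adv_id", "date"]))  # question 3
```

Run as a script, it checks the layout from question 1, the swapped column order from question 2, and the bare directory names from question 3.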