Nitin, I am still confused. Given the data below, should the file that sits in the folder country=USA/state=IL contain only the rows where country=USA and state=IL, or will it also contain rows for other countries? The reason I ask is that if we have a 250 GB file and want to create 10 partitions, we would end up with 2.5 TB * 3 = 7.5 TB. Is this expected?

Thanks
S
________________________________
From: Nitin Pawar <nitinpawar...@gmail.com>
To: user@hive.apache.org; Sai Sai <saigr...@yahoo.in>
Sent: Monday, 27 May 2013 2:08 PM
Subject: Re: Partitioning confusion

When you specify the LOAD DATA query with a specific partition, it will put the entire data into that partition.

On Mon, May 27, 2013 at 1:08 PM, Sai Sai <saigr...@yahoo.in> wrote:
>
>After creating a partition for a country (USA) and state (IL), when we go
>to the HDFS site to look at the partition in the browser, we are seeing all
>the records for all the countries and states rather than just those for the
>partition created for USA and IL given below. Is this correct behavior?
>********************
>Here are my commands:
>********************
>
>CREATE TABLE employees (name STRING, salary FLOAT, subordinates ARRAY<STRING>,
>deductions MAP<STRING, FLOAT>, address STRUCT<street:STRING, city:STRING,
>state:STRING, zip:INT, country:STRING> ) PARTITIONED BY (country STRING, state
>STRING);
>
>LOAD DATA LOCAL INPATH
>'/home/satish/data/employees/input/employees-country.txt' INTO TABLE employees
>PARTITION (country='USA',state='IL');
>
>********************
>
>Here is my original data file, where I have data for a few countries such as
>USA, INDIA, UK, AUS:
>********************
>
>John Doe100000.0Mary SmithTodd JonesFederal Taxes.2State Taxes.05Insurance.11 Michigan Ave.ChicagoIL60600USA
>Mary Smith80000.0Bill KingFederal Taxes.2State Taxes.05Insurance.1100 Ontario St.ChicagoIL60601USA
>Todd Jones70000.0Federal Taxes.15State Taxes.03Insurance.1200 Chicago Ave.Oak ParkIL60700USA
>Bill King60000.0Federal Taxes.15State Taxes.03Insurance.1300 Obscure Dr.ObscuriaIL60100USA
>Boss Man200000.0John DoeFred FinanceFederal Taxes.3State Taxes.07Insurance.051 Pretentious Drive.ChicagoIL60500USA
>Fred Finance150000.0Stacy AccountantFederal Taxes.3State Taxes.07Insurance.052 Pretentious Drive.ChicagoIL60500USA
>Stacy Accountant60000.0Federal Taxes.15State Taxes.03Insurance.1300 Main St.NapervilleIL60563USA
>John Doe 2100000.0Mary SmithTodd JonesFederal Taxes.2State Taxes.05Insurance.11 Michigan Ave.ChicagoIL60600INDIA
>Mary Smith 280000.0Bill KingFederal Taxes.2State Taxes.05Insurance.1100 Ontario St.ChicagoIL60601INDIA
>Todd Jones 270000.0Federal Taxes.15State Taxes.03Insurance.1200 Chicago Ave.Oak ParkIL60700AUSTRALIA
>Bill King 260000.0Federal Taxes.15State Taxes.03Insurance.1300 Obscure Dr.ObscuriaIL60100AUSTRALIA
>Boss Man2 200000.0John DoeFred FinanceFederal Taxes.3State Taxes.07Insurance.051 Pretentious Drive.ChicagoIL60500UK
>Fred Finance 2150000.0Stacy AccountantFederal Taxes.3State Taxes.07Insurance.052 Pretentious Drive.ChicagoIL60500UK
>Stacy Accountant 260000.0Federal Taxes.15State Taxes.03Insurance.1300 Main St.NapervilleIL60563UK
>********************
>
>Now when I navigate to:
>Contents of directory
>/user/hive/warehouse/db1.db/employees/country=USA/state=IL
>
>********************
>
>I see all the records and was wondering if it should have only USA & IL
>records.
>Please help.

--
Nitin Pawar
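
For reference, here is a minimal sketch of one way to end up with only the matching rows in each partition folder. LOAD DATA with a static PARTITION clause does not inspect the rows; it simply places the whole file in the named partition directory. A common alternative is to load the file into an unpartitioned staging table and let a dynamic-partition INSERT route each row to its own partition. The staging table name employees_staging is hypothetical, and the sketch assumes the partition values should be taken from address.country and address.state.

```sql
-- Hypothetical staging table: same columns as employees, but not partitioned.
CREATE TABLE employees_staging (
  name STRING,
  salary FLOAT,
  subordinates ARRAY<STRING>,
  deductions MAP<STRING, FLOAT>,
  address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT, country:STRING>
);

-- Load the mixed-country file as-is; no partition is named here.
LOAD DATA LOCAL INPATH '/home/satish/data/employees/input/employees-country.txt'
INTO TABLE employees_staging;

-- Allow Hive to derive every partition value from the query output.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The dynamic partition columns must come last in the SELECT list,
-- in the same order as in the PARTITION clause.
-- Assumption: partition values come from the address struct fields.
INSERT OVERWRITE TABLE employees PARTITION (country, state)
SELECT name, salary, subordinates, deductions, address,
       address.country AS country,
       address.state   AS state
FROM employees_staging;
```

With this approach each row is written to exactly one partition directory, so the partitioned table holds roughly one copy of the input (before HDFS replication) rather than one full copy per partition.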