Nitin,
I am still confused. Given the data below, should the file that sits in the 
folder for country=USA and state=IL contain only the rows where country=USA 
and state=IL, or will it also contain rows for other countries?
The reason I ask is that if we have a 250 GB file and want to create 10 
partitions, we would end up with 2.5 TB * 3 = 7.5 TB. Is this expected?
Thanks
S


________________________________
 From: Nitin Pawar <nitinpawar...@gmail.com>
To: user@hive.apache.org; Sai Sai <saigr...@yahoo.in> 
Sent: Monday, 27 May 2013 2:08 PM
Subject: Re: Partitioning confusion
 


When you run LOAD DATA with a specific PARTITION clause, Hive puts the entire 
file into that partition; it does not filter rows by the partition values.
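
One common way to get only the matching rows into each partition is to load 
the raw file into an unpartitioned table first and then copy it across with a 
dynamic-partition insert. The following is only a sketch under assumptions: 
the staging table name employees_staging is hypothetical, and it assumes the 
partition values should come from the country and state fields of the address 
struct.

```sql
-- Hypothetical staging table: same columns as employees, but not partitioned.
CREATE TABLE employees_staging (
  name STRING,
  salary FLOAT,
  subordinates ARRAY<STRING>,
  deductions MAP<STRING, FLOAT>,
  address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT, country:STRING>
);

-- LOAD DATA only copies the file into the table directory; no rows are filtered.
LOAD DATA LOCAL INPATH '/home/satish/data/employees/input/employees-country.txt'
INTO TABLE employees_staging;

-- Let Hive derive the partition values from the query output.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The last two SELECT columns feed the partition columns (country, state), in order.
INSERT OVERWRITE TABLE employees PARTITION (country, state)
SELECT name, salary, subordinates, deductions, address,
       address.country, address.state
FROM employees_staging;
```

With this approach each row is written to exactly one partition directory, so 
the data is split across partitions rather than copied into every one of 
them. Because INSERT OVERWRITE replaces the contents of each partition it 
writes, it would also replace the mixed data already sitting in 
country=USA/state=IL.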




On Mon, May 27, 2013 at 1:08 PM, Sai Sai <saigr...@yahoo.in> wrote:


>
>After creating a partition for a country (USA) and state (IL), when we go 
>to the HDFS site to look at the partition in the browser, we see all the 
>records for all countries and states rather than just those for the 
>partition created for USA and IL, given below. Is this correct behavior?
>********************
>Here are my commands:
>********************
>
>
>
>CREATE TABLE employees (
>  name STRING,
>  salary FLOAT,
>  subordinates ARRAY<STRING>,
>  deductions MAP<STRING, FLOAT>,
>  address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT, country:STRING>
>)
>PARTITIONED BY (country STRING, state STRING);
>
>
>LOAD DATA LOCAL INPATH '/home/satish/data/employees/input/employees-country.txt'
>INTO TABLE employees
>PARTITION (country='USA', state='IL');
>
>
>********************
>
>Here is my original data file, which has data for a few countries (USA, 
>INDIA, UK, AUSTRALIA):
>********************
>
>
>
>John Doe100000.0Mary SmithTodd JonesFederal Taxes.2State Taxes.05Insurance.11 Michigan Ave.ChicagoIL60600USA
>Mary Smith80000.0Bill KingFederal Taxes.2State Taxes.05Insurance.1100 Ontario St.ChicagoIL60601USA
>Todd Jones70000.0Federal Taxes.15State Taxes.03Insurance.1200 Chicago Ave.Oak ParkIL60700USA
>Bill King60000.0Federal Taxes.15State Taxes.03Insurance.1300 Obscure Dr.ObscuriaIL60100USA
>Boss Man200000.0John DoeFred FinanceFederal Taxes.3State Taxes.07Insurance.051 Pretentious Drive.ChicagoIL60500USA
>Fred Finance150000.0Stacy AccountantFederal Taxes.3State Taxes.07Insurance.052 Pretentious Drive.ChicagoIL60500USA
>Stacy Accountant60000.0Federal Taxes.15State Taxes.03Insurance.1300 Main St.NapervilleIL60563USA
>John Doe 2100000.0Mary SmithTodd JonesFederal Taxes.2State Taxes.05Insurance.11 Michigan Ave.ChicagoIL60600INDIA
>Mary Smith 280000.0Bill KingFederal Taxes.2State Taxes.05Insurance.1100 Ontario St.ChicagoIL60601INDIA
>Todd Jones 270000.0Federal Taxes.15State Taxes.03Insurance.1200 Chicago Ave.Oak ParkIL60700AUSTRALIA
>Bill King 260000.0Federal Taxes.15State Taxes.03Insurance.1300 Obscure Dr.ObscuriaIL60100AUSTRALIA
>Boss Man2 200000.0John DoeFred FinanceFederal Taxes.3State Taxes.07Insurance.051 Pretentious Drive.ChicagoIL60500UK
>Fred Finance 2150000.0Stacy AccountantFederal Taxes.3State Taxes.07Insurance.052 Pretentious Drive.ChicagoIL60500UK
>Stacy Accountant 260000.0Federal Taxes.15State Taxes.03Insurance.1300 Main St.NapervilleIL60563UK
>********************
>
>Now when I navigate to:
>Contents of directory 
>/user/hive/warehouse/db1.db/employees/country=USA/state=IL
>
>********************
>
>I see all the records and was wondering if it should have only USA & IL 
>records.
>Please help.


-- 
Nitin Pawar
