hi,i am using CDH 5.2 (hive 13).as for partitioned views, to my understanding 
they are more about abstracting a non partitioned back end. My questing is 
related to how and if the query optimizer can detect possible uses of 
partitioned column, even if it might be "masked" by a view (or multiple 
view).Empirical tests, by the show, suggest that indeed, partition optimization 
does take place.  regards, cobby.
      From: "Moore, Douglas" <douglas.mo...@thinkbiganalytics.com>
 To: "user@hive.apache.org" <user@hive.apache.org>; 'cobby cohen' 
<ququr...@yahoo.com> 
 Sent: Monday, March 16, 2015 4:19 PM
 Subject: Re: view over partitioned table
   
Mich,
What version of Hive are you running?Have you seen 
this?https://cwiki.apache.org/confluence/display/Hive/PartitionedViews
- Douglas

From: Mich Talebzadeh <m...@peridale.co.uk>
Reply-To: <user@hive.apache.org>
Date: Sun, 15 Mar 2015 19:01:57 +0000
To: <user@hive.apache.org>, 'cobby cohen' <ququr...@yahoo.com>
Subject: RE: view over partitioned table

#yiv1224423554 #yiv1224423554 -- _filtered #yiv1224423554 
{font-family:Helvetica;panose-1:2 11 6 4 2 2 2 2 2 4;} _filtered #yiv1224423554 
{panose-1:2 4 5 3 5 4 6 3 2 4;} _filtered #yiv1224423554 
{font-family:Calibri;panose-1:2 15 5 2 2 2 4 3 2 4;} _filtered #yiv1224423554 
{font-family:Tahoma;panose-1:2 11 6 4 3 5 4 4 2 4;}#yiv1224423554 
#yiv1224423554 p.yiv1224423554MsoNormal, #yiv1224423554 
li.yiv1224423554MsoNormal, #yiv1224423554 div.yiv1224423554MsoNormal 
{margin:0cm;margin-bottom:.0001pt;font-size:12.0pt;}#yiv1224423554 a:link, 
#yiv1224423554 span.yiv1224423554MsoHyperlink 
{color:blue;text-decoration:underline;}#yiv1224423554 a:visited, #yiv1224423554 
span.yiv1224423554MsoHyperlinkFollowed 
{color:purple;text-decoration:underline;}#yiv1224423554 
span.yiv1224423554EmailStyle17 {color:windowtext;}#yiv1224423554 
.yiv1224423554MsoChpDefault {font-size:10.0pt;} _filtered #yiv1224423554 
{margin:72.0pt 72.0pt 72.0pt 72.0pt;}#yiv1224423554 
div.yiv1224423554WordSection1 {}#yiv1224423554 OK,    This is the way I read 
it. Crearte table t1 .. partitioned by date will use horizontal partitioning as 
per common with any RDBMS say Oracle.    The view I will create it as follows:  
  hive> create view v1 as select * from t1; OK Time taken: 0.073 seconds    
hive> analyze table t1 partition (date) compute statistics; Query ID = 
hduser_20150315184646_af3890b2-4079-4f81-b38b-da9af67b2751 Total jobs = 1 
Launching Job 1 out of 1 ----- Stage-Stage-0: Map: 1   Cumulative CPU: 0.6 sec  
 HDFS Read: 299 HDFS Write: 0 SUCCESS Total MapReduce CPU Time Spent: 600 msec 
OK    hive> insert into table t1 partition (date)  values (1,'2015-04-04'); 
hive> insert into table t1 partition (date)  values (1,'2015-05-04');    Help 
to update stats    hive> analyze table t1 partition (date) compute statistics;  
  select from the view after exiting hive and back       hive> explain select 
id from v1 where date = '2015-04-04'; OK STAGE DEPENDENCIES:   Stage-0 is a 
root stage    STAGE PLANS:   Stage: Stage-0     Fetch Operator       limit: -1  
     Processor Tree:         TableScan           alias: t1           
Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE   
        Select Operator             expressions: id (type: string)             
outputColumnNames: _col0             Statistics: Num rows: 1 Data size: 1 Basic 
stats: COMPLETE Column stats: NONE             ListSink    Time taken: 0.627 
seconds, Fetched: 17 row(s)       Now select the same from base table    hive> 
explain select id from t1 where date = '2015-04-04'; OK STAGE DEPENDENCIES:   
Stage-0 is a root stage    STAGE PLANS:   Stage: Stage-0     Fetch Operator     
  limit: -1       Processor Tree:         TableScan           alias: t1         
  Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE 
          Select Operator             expressions: id (type: string)            
 outputColumnNames: _col0             Statistics: Num rows: 1 Data size: 1 
Basic stats: COMPLETE Column stats: NONE             ListSink    Time taken: 
0.597 seconds, Fetched: 17 row(s)    The plan looks the same but it is slower 
on the view    HTH    Mich    NOTE: The information in this email is 
proprietary and confidential. This message is for the designated recipient 
only, if you are not the intended recipient, you should destroy it immediately. 
Any information in this message shall not be understood as given or endorsed by 
Peridale Ltd, its subsidiaries or their employees, unless expressly so stated. 
It is the responsibility of the recipient to ensure that this email is virus 
free, therefore neither Peridale Ltd, its subsidiaries nor their employees 
accept any responsibility.    From: cobby cohen [mailto:ququr...@yahoo.com]
Sent: 15 March 2015 17:52
To: user@hive.apache.org
Subject: view over partitioned table    in the following case:    create table 
T1(id string) partitioned by (date date);    create view V1(id, date) select 
id, date from T1    if i do the following select id from V1 where date = 
'2015-04-04' will the query optimize on T1's partitioning?    and what if V1 is 
a bit more complex, further depending on other views (all of which expose the 
partitioned column as-is)?    thx! 

  

Reply via email to