Re: Re: Question about OVER clause

2018-09-27 Thread anci_...@yahoo.com
Thanks, but the article says too little to help.
Actually, it only told me that by using this code
we would accumulate all records whose v_date was less than or equal to that of
the current row.
But the question is, what will happen with the code below? (The field v_date is a
string value with the format '-MM-dd'.)
Thanks!



anci_...@yahoo.com
 
From: Alan Gates
Date: 2018-09-22 07:19
To: user; anci_sun
Subject: Re: Question about OVER clause
This article might be helpful.  It's for SQL Server, but the semantics should 
be similar.

https://www.sqlpassion.at/archive/2015/01/22/sql-server-windowing-functions-rows-vs-range/

Alan.

On Wed, Sep 19, 2018 at 6:47 AM 孙志禹  wrote:
Dears,
   What is the difference between ROWS BETWEEN and RANGE BETWEEN when using an
OVER clause? I found it difficult to get an answer about this for Hive.
   I hope a more detailed help article about the OVER clause can be added to
Confluence.
   Thanks!
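
As a rough illustration of the two frame types, here is a plain-Python model (not Hive itself). It assumes Hive follows the standard semantics for ORDER BY v_date with a frame from UNBOUNDED PRECEDING to CURRENT ROW: ROWS counts physical rows up to the current one, while RANGE takes every row whose v_date is less than or equal to the current row's, including ties. The data and function names are made up for the example.

```python
# Toy model of a running SUM over rows ordered by v_date.
# Frame in both cases: UNBOUNDED PRECEDING .. CURRENT ROW.
# The data and function names are hypothetical.

rows = [
    ("2018-09-01", 10),
    ("2018-09-02", 20),
    ("2018-09-02", 30),  # same v_date as the previous row (a "peer")
    ("2018-09-03", 40),
]


def running_sum_rows(data):
    """ROWS frame: sum the physical rows up to and including the current one."""
    ordered = sorted(data, key=lambda r: r[0])
    out, total = [], 0
    for v_date, amount in ordered:
        total += amount
        out.append((v_date, amount, total))
    return out


def running_sum_range(data):
    """RANGE frame: sum every row whose v_date <= the current row's v_date."""
    ordered = sorted(data, key=lambda r: r[0])
    return [
        (v_date, amount, sum(a for d, a in ordered if d <= v_date))
        for v_date, amount in ordered
    ]
```

The two functions differ only on the rows that share a v_date: the ROWS version gives each of them a different running total, while the RANGE version gives both the same total. Because v_date is compared as a string here, the ordering matches date order only when the values are zero-padded consistently.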


Question about INSERT OVERWRITE TABLE with dynamic partition

2018-10-23 Thread anci_...@yahoo.com
Dears,
I found something interesting.
When inserting a NULL (empty) result into a partition that already contains some
records, static partition INSERT and dynamic partition INSERT give different
results.
See the example below:
Partition '20180101' of table A contained 100 records.
By using
we can delete the records in partition '20180101'.
But by using
there would be no change to partition '20180101'.
In fact, if we run 'select * from A where partition_A = '20180101'',
we will still get 100 records from it.
I would appreciate an explanation.
Thanks!



孙志禹


Re: Re: Question about INSERT OVERWRITE TABLE with dynamic partition

2018-10-25 Thread anci_...@yahoo.com
Thanks, I think that's the right explanation. Because the result of the second
query is empty, no partition name is generated in the dynamic partition step,
so the system doesn't know which partition to overwrite.
Thanks very much!


Regards,
孙志禹
 
From: Tanvi Thacker
Date: 2018-10-25 08:34
To: user
Subject: Re: Question about INSERT OVERWRITE TABLE with dynamic partition
A logical explanation could be:
In the first query, you are telling Hive which partition to overwrite, so the
step that actually deletes the partition data and overwrites it with the query
result knows which partition to delete, and there is an empty result/file to
move.

But for the second query, the dynamic partition step needs to deduce the
partition name from the query result, and since your query produces no rows,
there is no partition information to act on.

Regards,
Tanvi Thacker
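
The explanation above can be sketched with a toy model in plain Python. This is not Hive's implementation. It assumes only what the explanation says: a static overwrite gets its target partition from the statement, while a dynamic overwrite derives its targets from the rows the query returns. The table layout and function names are hypothetical.

```python
# Toy model of INSERT OVERWRITE on a partitioned table.
# A table is a dict: partition value -> list of records.
# All names here are hypothetical; this is not Hive's implementation.


def overwrite_static(table, partition, result_rows):
    """Static partition: the target is named in the statement itself,
    so it is replaced even when the query returns no rows."""
    table[partition] = list(result_rows)


def overwrite_dynamic(table, result_rows):
    """Dynamic partition: targets are derived from the partition column
    (assumed here to be the last field) of each result row."""
    grouped = {}
    for row in result_rows:
        *record, partition = row
        grouped.setdefault(partition, []).append(tuple(record))
    # Only partitions that appear in the result are replaced.
    for partition, records in grouped.items():
        table[partition] = records


table_a = {"20180101": [("rec",)] * 100}

overwrite_dynamic(table_a, [])             # empty result: nothing to act on
count_after_dynamic = len(table_a["20180101"])

overwrite_static(table_a, "20180101", [])  # empty result: partition emptied
count_after_static = len(table_a["20180101"])
```

With an empty result, the dynamic version never enters its loop, so the existing 100 records stay in place. The static version replaces the named partition with the empty result.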



Efficiency of too many grouping sets

2018-11-05 Thread anci_...@yahoo.com
Dears,
I ran a SELECT script with 90 sets in one GROUPING SETS clause, and there was a
serious data skew problem.
Is it caused by having too many GROUPING SETS, and how can I solve it?
(I couldn't simply set hive.groupby.skewindata to true because the query
contains some COUNT(DISTINCT ...) expressions.)
Thanks!
Regards,
孙志禹
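
One thing that may be relevant, offered as an assumption and not as a diagnosis of this query: grouping sets are commonly evaluated by emitting one copy of each input row per grouping set, with the columns outside the set nulled out. Under that assumption, 90 sets would multiply the rows entering the aggregation by about 90, and sets with few columns would send many rows to the same key. The small Python sketch below only counts that expansion; the data and names are hypothetical.

```python
from itertools import combinations

# Hypothetical input: 3 rows with 3 grouping columns each.
rows = [("a", "x", 1), ("a", "y", 2), ("b", "x", 1)]
columns = (0, 1, 2)

# Every non-empty subset of the columns is used as one grouping set here.
grouping_sets = [
    subset
    for size in range(1, len(columns) + 1)
    for subset in combinations(columns, size)
]


def expand(data, sets):
    """Emit one copy of each row per grouping set, with the columns that
    are not in the set replaced by None, plus the set's index as a tag."""
    for row in data:
        for set_id, keep in enumerate(sets):
            key = tuple(v if i in keep else None for i, v in enumerate(row))
            yield set_id, key


expanded = list(expand(rows, grouping_sets))
```

Here 3 input rows and 7 grouping sets produce 21 expanded rows, so the intermediate row count grows linearly with the number of sets.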


Rlike '\s' couldn't get the space

2018-11-12 Thread anci_...@yahoo.com
Dears,
I see that '\s' matches a whitespace character in normal Java regular
expressions, but in Hive I found that it doesn't.
Why? And are there any other differences between regular expressions in
Java and Hive?




Regards,
孙志禹
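
A possible cause, stated as an assumption and not confirmed in this thread: the string literal goes through backslash-escape processing before the regex engine sees it, so '\s' may arrive as a plain 's', and '\\s' would be needed to pass a real \s through. The Python sketch below models that idea. The unescape rule is a hypothetical stand-in, and Python's re module is used in place of Java's regex engine.

```python
import re

# Hypothetical model of a SQL string-literal unescape step: a backslash
# followed by a character with no special meaning is assumed to be dropped.
KNOWN_ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\", "'": "'"}


def unescape_literal(text):
    out, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            out.append(KNOWN_ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def rlike(value, literal):
    """Unescape the literal first, then use it as a regex (search semantics)."""
    return re.search(unescape_literal(literal), value) is not None


single = unescape_literal(r"\s")    # what is typed as '\s'
double = unescape_literal(r"\\s")   # what is typed as '\\s'
```

In this model, rlike("a b", r"\s") is false because the pattern has become the letter s, while rlike("a b", r"\\s") is true. If the same thing happens in Hive, writing the pattern as '\\s' or using a character class such as '[ ]' would be worth trying.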