Hi Jan,
I also have dates in a different format, which is why I was thinking of
taking this approach. How can I make sure the query works on the selected
partition only and does not scan the entire table? I will add your
suggestion to my UDF and mark it as deterministic.
My simple question here
@kulkarni,
When I ran EXPLAIN on my query, I got the output below, and I am not sure how
to interpret it. Any help in judging whether my approach is right would be
appreciated.
hive> EXPLAIN SELECT * FROM PDS_ATTRIBUTE_DATA_REALTIME where
dt=yesterdaydate('MMdd', 2) LIMIT 5;
OK
ABSTRACT S
Oops, sorry, I made a copy&paste mistake :) The annotation should read
@UDFType(deterministic=true)
Jan
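For reference, here is a minimal sketch of the date logic such a UDF might wrap. The class name YesterdayDateSketch and the method yesterday are hypothetical and not taken from the thread; the Hive base class and the annotation are only mentioned in comments so the example compiles without Hive on the classpath.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

// Assumption: in a real Hive UDF this class would extend
// org.apache.hadoop.hive.ql.exec.UDF and carry the annotation
//   @UDFType(deterministic = true)
// Both are omitted here so the sketch runs without Hive.
public class YesterdayDateSketch {

    // Hypothetical core of evaluate(): format the day before 'now'
    // using the given SimpleDateFormat pattern.
    public static String yesterday(String pattern, Date now) {
        Calendar cal = Calendar.getInstance();
        cal.setTime(now);
        cal.add(Calendar.DAY_OF_MONTH, -1); // step back one calendar day
        return new SimpleDateFormat(pattern).format(cal.getTime());
    }

    public static void main(String[] args) {
        // Fixed "today" of Aug 6th, 2012, matching the example in the thread.
        Date today = new GregorianCalendar(2012, Calendar.AUGUST, 6).getTime();
        System.out.println(yesterday("yyyyMMdd", today)); // prints 20120805
        System.out.println(yesterday("MMdd", today));     // prints 0805
    }
}
```

Passing the current date in as a parameter keeps the logic testable with a fixed date. Whether the deterministic annotation is enough for Hive to prune partitions is something to confirm with EXPLAIN.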
On Tue, Aug 7, 2012 at 7:37 PM, Jan Dolinár wrote:
> I'm afraid that the query
>
> SELECT * FROM REALTIME where dt= yesterdaydate('MMdd') LIMIT 10;
>
> will scan entire table, because th
I'm afraid that the query
SELECT * FROM REALTIME where dt= yesterdaydate('MMdd') LIMIT 10;
will scan the entire table, because the function is evaluated at runtime, so
Hive doesn't know what the value is when it decides which files to scan. I
am not 100% sure though, so you should try it.
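If the runtime evaluation does turn out to prevent partition pruning, one possible workaround is to compute the date before the query is submitted, so that Hive only ever sees a string literal in the WHERE clause. The sketch below is an illustration under that assumption; the class LiteralDateQuery and its build method are hypothetical names, and the yyyyMMdd pattern is chosen to match the literal '20120805' used elsewhere in this thread.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

public class LiteralDateQuery {

    // Builds the query text with yesterday's date already substituted
    // as a literal, so no UDF has to be evaluated inside Hive.
    public static String build(String table, Date now) {
        Calendar cal = Calendar.getInstance();
        cal.setTime(now);
        cal.add(Calendar.DAY_OF_MONTH, -1); // yesterday relative to 'now'
        String dt = new SimpleDateFormat("yyyyMMdd").format(cal.getTime());
        return "SELECT * FROM " + table + " WHERE dt = '" + dt + "' LIMIT 10";
    }

    public static void main(String[] args) {
        // Print the query for the current date; the text could then be
        // handed to the Hive CLI, for example with hive -e.
        System.out.println(build("REALTIME", new Date()));
    }
}
```

The table name is concatenated directly into the string, so this is only suitable when the table name comes from trusted input.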
Also, yo
Have you tried using EXPLAIN[1] on your query? I usually use it to get a
better understanding of what my query is actually doing, and at other times
for debugging.
[1] https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain
On Tue, Aug 7, 2012 at 12:20 PM, Raihan Jamal wrote
Hi Jan,
I figured that out, and it is working fine for me now. The only question I
have is, if I run the query like this:
SELECT * FROM REALTIME where dt= yesterdaydate('MMdd') LIMIT 10;
Then the above query will be evaluated as below, right?
SELECT * FROM REALTIME where dt= '20120806' LIMIT 1
I tested the function using a main method and printing the result, and it
works fine for getting yesterday's date.
I need my query to look like this: today's date is Aug 6th, so the query
should be for Aug 5th. And this works fine for me:
SELECT * FROM REALTIME where dt= '20120805' LIMIT 10;
Hi Jamal,
Check whether the function really returns what it should and whether your
data are really in MMdd format. You can do this with a simple query like this:
SELECT dt, yesterdaydate('MMdd') FROM REALTIME LIMIT 1;
I don't see anything wrong with the function itself, it works well for me
(althou