Have you tried using EXPLAIN[1] on your query? I usually like to use that to get a better understanding of what my query is actually doing and debugging at other times.
[1] https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain On Tue, Aug 7, 2012 at 12:20 PM, Raihan Jamal <jamalrai...@gmail.com> wrote: > Hi Jan, > > > I figured that out, it is working fine for me now. The only question I > have is, if I am doing like this- > > > > SELECT * FROM REALTIME where dt= yesterdaydate('yyyyMMdd') LIMIT 10; > > > > Then the above query will be evaluated as below right? > > > > SELECT * FROM REALTIME where dt= ‘20120806’ LIMIT 10; > > > > So that means it will look for data in the corresponding dt partition > *(20120806) > *only right as above table is partitioned on dt column ? And it will not > scan the whole table right?** > > > > *Raihan Jamal* > > > > On Mon, Aug 6, 2012 at 10:56 PM, Jan Dolinár <dolik....@gmail.com> wrote: > >> Hi Jamal, >> >> Check if the function really returns what it should and that your data >> are really in yyyyMMdd format. You can do this by simple query like this: >> >> SELECT dt, yesterdaydate('yyyyMMdd') FROM REALTIME LIMIT 1; >> >> I don't see anything wrong with the function itself, it works well for me >> (although I tested it in hive 0.7.1). The only thing I would change about >> it would be to optimize it by calling 'new' only at the time of >> construction and reusing the object when the function is called, but that >> should not affect the functionality at all. >> >> Best regards, >> Jan >> >> >> >> >> On Tue, Aug 7, 2012 at 3:39 AM, Raihan Jamal <jamalrai...@gmail.com>wrote: >> >>> *Problem* >>> >>> I created the below UserDefinedFunction to get the yesterday's day in >>> the format I wanted as I will be passing the format into this below method >>> from the query. >>> >>> >>> >>> *public final class YesterdayDate extends UDF {* >>> >>> * * >>> >>> * public String evaluate(final String format) { * >>> >>> * DateFormat dateFormat = new >>> SimpleDateFormat(format); * >>> >>> * Calendar cal = Calendar.getInstance();* >>> >>> * cal.add(Calendar.DATE, -1); * >>> >>> * return >>> dateFormat.format(cal.getTime()).toString(); * >>> >>> * } * >>> >>> *}* >>> >>> >>> >>> >>> >>> So whenever I try to run the query like below by adding the jar to >>> classpath and creating the temporary function yesterdaydate, I always get >>> zero result back- >>> >>> >>> >>> hive> create temporary function *yesterdaydate* as >>> 'com.example.hive.udf.YesterdayDate'; >>> >>> OK >>> >>> Time taken: 0.512 seconds >>> >>> >>> >>> Below is the query I am running- >>> >>> >>> >>> *hive> SELECT * FROM REALTIME where dt= yesterdaydate('yyyyMMdd') LIMIT >>> 10;* >>> >>> *OK* >>> >>> * * >>> >>> And I always get zero result back but the data is there in that table >>> for Aug 5th.** >>> >>> >>> >>> What wrong I am doing? Any suggestions will be appreciated. >>> >>> >>> >>> >>> >>> NOTE:- As I am working with Hive 0.6 so it doesn’t support variable >>> substitution thing, so I cannot use hiveconf here and the above table has >>> been partitioned on dt(date) column.** >>> >> >> > -- Swarnim