Re: Filtering

2013-05-19 Thread Owen O'Malley
On Sun, May 19, 2013 at 3:11 PM, Peter Marron <peter.mar...@trilliumsoftware.com> wrote: > Hi Owen, > > Firstly I want to say a huge thank you. You have really helped me enormously. > You're welcome. > > OK. I think that I get it now. In my custom InputFormat I can read

RE: Filtering

2013-05-19 Thread Peter Marron
Hi Owen, Firstly I want to say a huge thank you. You have really helped me enormously. I realize that you have been busy with other things (like release 0.11.0) and so I can understand that it must have been a pain to take time out to help me. >The critical piece is in OpProcFactory where the set

RE: Filtering

2013-05-16 Thread Peter Marron
>> On Wed, May 15, 2013 at 3:38 AM, Peter Marron wrote: … > I've started doing similar work for the ORC reader. I guess that I’m glad that I’m not completely alone here. >> >> Firstly although that page mentions InputFormat there doesn’t seem to be any way (that I can find) to perform filte

Re: Filtering

2013-05-15 Thread Owen O'Malley
On Wed, May 15, 2013 at 3:38 AM, Peter Marron <peter.mar...@trilliumsoftware.com> wrote: > Hi, > > I’m using Hive 0.10.0 and Hadoop 1.0.4. > > I would like to create a normal table but have some of my code run so that I can remove filtering parts of the quer

RE: Filtering on TIMESTAMP data type

2012-06-04 Thread Ladda, Anand
Can anyone help out with the TIMESTAMP literals piece? So far, I've gotten Select day_timestamp from lu_day where day_timestamp > to_utc_timestamp('2012-06-04 00:00:00', 'GMT') to work ok and give me back timestamps greater than the one in the literal. Is this the best function to get this to wo

Re: filtering out crawlers

2011-02-09 Thread Wil -
Hi, There are quite a few databases online with known robots. http://www.robotstxt.org/db.html and http://www.botsvsbrowsers.com/category/1/index.html come to mind. The hardest part is figuring out the suspect robots which do not identify themselves. From: Ca

Re: Filtering out files in a bucket (update on HIVE-951)

2011-01-24 Thread Avram Aelony
hmmm, I've seen mention of SymLink but I don't yet grasp how it works/applies to selecting files to process. Also, I don't have much control over how the data gets to the bucket I end up reading from, hence the need to powerfully select. Could you point me to some SymLink documentation or an

Re: Filtering out files in a bucket (update on HIVE-951)

2011-01-24 Thread Edward Capriolo
On Mon, Jan 24, 2011 at 5:58 PM, Avram Aelony wrote: > Hi, > > I really like the virtual column feature in 0.7 that allows me to request INPUT__FILE__NAME and see the names of files that are being acted on. > > Because I can see the files that are being read, I see that I am spending time qu