RE: Soliciting Design Thoughts on Date Searching

2007-03-05 Thread Steven Parkes
But, letting it stay in the text stream and not putting it in a separate date field would give you some trouble with ranges because things that weren't dates could mess you up. This is why Chris suggested putting a prefix on the token. For example, leading underscor
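
A minimal sketch of the prefix idea, not Lucene code: it assumes terms are compared as plain strings, the way a term range is, and that dates are normalised to yyyymmdd. The `_d` prefix and the helper names are hypothetical, chosen only for illustration.

```python
from bisect import bisect_left, bisect_right

DATE_PREFIX = "_d"  # hypothetical marker that ordinary text tokens never start with


def terms_in_range(sorted_terms, start, end, prefix=""):
    """Return the terms that sort between prefix+start and prefix+end, inclusive."""
    lo = bisect_left(sorted_terms, prefix + start)
    hi = bisect_right(sorted_terms, prefix + end)
    return sorted_terms[lo:hi]


# Without a prefix, a stray token that merely looks like a date lands in the range.
plain_terms = sorted(["19991231", "2007", "20070228", "20070301x", "20070305"])
naive = terms_in_range(plain_terms, "20070101", "20071231")

# With a prefix, only tokens the date extractor emitted can match.
tagged_terms = sorted(
    ["19991231", "2007", "20070301x", "march",
     DATE_PREFIX + "20070228", DATE_PREFIX + "20070305", DATE_PREFIX + "20080101"]
)
matches = terms_in_range(tagged_terms, "20070101", "20071231", prefix=DATE_PREFIX)

print(naive)
print(matches)
```

The same range over the unprefixed terms picks up `20070301x`, which is the kind of interference the prefix is meant to rule out.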

Re: Soliciting Design Thoughts on Date Searching

2007-03-05 Thread Erick Erickson
See below... On 3/5/07, Walt Stoneburner <[EMAIL PROTECTED]> wrote: Erick / Steve, Thank you both (as well as everyone else who weighed in) on helping get to a far more optimal solution well before any code was ever slung. Since we all know that someone else is going to find this in the archi

Re: Soliciting Design Thoughts on Date Searching

2007-03-05 Thread Walt Stoneburner
Erick / Steve, Thank you both (as well as everyone else who weighed in) on helping get to a far more optimal solution well before any code was ever slung. Since we all know that someone else is going to find this in the archives some day, I'd like to unveil the rest of my ignorance and misconc

Re: Soliciting Design Thoughts on Date Searching

2007-03-04 Thread Erick Erickson
y, March 01, 2007 7:54 AM To: java-user@lucene.apache.org Subject: Re: Soliciting Design Thoughts on Date Searching Thank you all for the suggestions steering me down the right path. As an aside, the easy part, at least for me, is extracting the dates -- Peter was dead on about how to do that: heu

RE: Soliciting Design Thoughts on Date Searching

2007-03-01 Thread Steven Parkes
[mailto:[EMAIL PROTECTED] Sent: Thursday, March 01, 2007 7:54 AM To: java-user@lucene.apache.org Subject: Re: Soliciting Design Thoughts on Date Searching Thank you all for the suggestions steering me down the right path. As an aside, the easy part, at least for me, is extracting the dates -- Pe

Re: Soliciting Design Thoughts on Date Searching

2007-03-01 Thread Walt Stoneburner
Thank you all for the suggestions steering me down the right path. As an aside, the easy part, at least for me, is extracting the dates -- Peter was dead on about how to do that: heuristics, multiple regular expressions, and data structures. As Steve pointed out, this isn't as trivial as it soun
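
A rough sketch of the "multiple regular expressions" approach, under stated assumptions: only two date layouts are handled (ISO `yyyy-mm-dd` and "Month d, yyyy"), and `PATTERNS` and `extract_dates` are hypothetical names, not anything from the thread or from Lucene. Real free-form material would need many more patterns and heuristics.

```python
import re
from datetime import date

MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}

# Each entry pairs a pattern with a function turning a match into (year, month, day).
PATTERNS = [
    (
        re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b"),
        lambda m: (int(m[1]), int(m[2]), int(m[3])),
    ),
    (
        re.compile(r"\b(" + "|".join(MONTHS) + r")\s+(\d{1,2}),\s*(\d{4})\b", re.I),
        lambda m: (int(m[3]), MONTHS[m[1].lower()], int(m[2])),
    ),
]


def extract_dates(text):
    """Collect every valid calendar date matched by any pattern, sorted and de-duplicated."""
    found = set()
    for pattern, to_ymd in PATTERNS:
        for match in pattern.finditer(text):
            try:
                found.add(date(*to_ymd(match)))
            except ValueError:
                continue  # looks like a date but is not one, e.g. February 30
    return sorted(found)


print(extract_dates("Born March 1, 2007; filed 2007-02-28; bogus 2007-02-30."))
```

Validating each candidate through `date(...)` is one cheap heuristic for discarding strings that only resemble dates.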

Re: Soliciting Design Thoughts on Date Searching

2007-03-01 Thread mark harwood
Original Message From: Otis Gospodnetic <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Thursday, 1 March, 2007 10:25:24 AM Subject: Re: Soliciting Design Thoughts on Date Searching Ah, I once worked in a place where we did exactly that - recognition and extraction

Re: Soliciting Design Thoughts on Date Searching

2007-03-01 Thread Otis Gospodnetic
Original Message From: Steven Parkes <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Wednesday, February 28, 2007 6:56:02 PM Subject: RE: Soliciting Design Thoughts on Date Searching Yeah, date finding is a little like entity extraction, since dat

RE: Soliciting Design Thoughts on Date Searching

2007-02-28 Thread Steven Parkes
y, February 28, 2007 3:26 PM To: Lucene Users Subject: Re: Soliciting Design Thoughts on Date Searching : I have generic material that _contain_ dates: historic time lines, : certificates, news articles, forms, deeds, testimonies, and wildly : free form genealogical information. The dates hav

Re: Soliciting Design Thoughts on Date Searching

2007-02-28 Thread Chris Hostetter
: I have generic material that _contain_ dates: historic time lines, : certificates, news articles, forms, deeds, testimonies, and wildly : free form genealogical information. The dates have no specific : structure, obvious context, nor consistency. identifying and extracting dates from bulk text

Re: Soliciting Design Thoughts on Date Searching

2007-02-28 Thread Peter W.
expert, but it sounds like you need to associate many dates to a single record. ... Tom -Original Message- From: Walt Stoneburner [mailto:[EMAIL PROTECTED] Sent: Wednesday, February 28, 2007 2:13 PM To: java-user@lucene.apache.org Subject: Re: Soliciting Design Thoughts on Date Searching

RE: Soliciting Design Thoughts on Date Searching

2007-02-28 Thread Aigner, Thomas
would be treated as a synonym type (basically setPositionIncrement(0)?) Just thinking out loud... Tom -Original Message- From: Walt Stoneburner [mailto:[EMAIL PROTECTED] Sent: Wednesday, February 28, 2007 2:13 PM To: java-user@lucene.apache.org Subject: Re: Soliciting Design Thoughts

Re: Soliciting Design Thoughts on Date Searching

2007-02-28 Thread Walt Stoneburner
Been searching http://www.gossamer-threads.com/lists/lucene/java-user/ as Erick suggested; man, is there a wealth of information in the Lucene archives. I have found many examples of how to convert text to dates and back, how to search Date fields for various ranges, and so forth -- but I don't t

Re: Soliciting Design Thoughts on Date Searching

2007-02-27 Thread Erick Erickson
If you search the mailing list archive for 'date', you'll find a wealth of discussion on this topic. Also, try DateTools, DateRange, etc. http://www.gossamer-threads.com/lists/lucene/java-user/ Erick On 2/27/07, Walt Stoneburner <[EMAIL PROTECTED]> wrote: I've been asked if it's possible to

Soliciting Design Thoughts on Date Searching

2007-02-27 Thread Walt Stoneburner
I've been asked if it's possible to search on dates within a document. The high level goal is to index a number of documents which mention specific dates, and then perform a broad query for documents that mention dates within a certain time period. In thinking about how to go about solving this p