Re: Extracting contact data

Karl Wettin Wed, 13 Jan 2010 09:04:37 -0800

Lucene will probably only be helpful if you know what you are looking for, e.g. that you search for a given person, a given street and given time intervals.


Is this what you want to do?

If you are instead looking for a way to really extract any person, street and time interval that a document is associated with, you probably want to look for a natural language processing project that can do something like semantic part of speech tagging for you.



      karl

On 13 Jan 2010, at 17:39, Ortelli, Gian Luca wrote:

Hi community,
I have a general understanding of Lucene concepts, and I'm wondering if
it's the right tool for my job:
- I need to extract data such as time intervals ("8am - 12pm") and street
addresses from a set of files. The common issue with these data units is
that they contain spaces and are not always definable through regexes.



- the extraction must take "proximity" into consideration: for
example, a mail address which is close to the word "Contacts" will
receive a higher rank, since I'm looking for contact data.
Do you think I can get any advantage from building a solution on Lucene?
 Gianluca
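
For illustration only, the two requirements above can be sketched in plain Python without Lucene. This is a minimal sketch, not anything proposed in the thread: the regular expressions, the rank_by_proximity helper, the 1 / (1 + distance) score and the sample text are all assumptions, and the patterns cover only the simplest cases (the original message notes that such data is not always definable through regexes).

```python
import re

# Hypothetical patterns: they cover only simple cases such as "8am - 12pm"
# and plain e-mail addresses; real data will need broader coverage.
TIME_INTERVAL = re.compile(
    r"\b\d{1,2}(?::\d{2})?\s*(?:am|pm)\s*-\s*\d{1,2}(?::\d{2})?\s*(?:am|pm)\b",
    re.IGNORECASE,
)
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def rank_by_proximity(text, pattern, keyword="Contacts"):
    """Return (score, match) pairs, highest score first.

    The score is 1 / (1 + character distance to the nearest occurrence
    of the keyword), or 0.0 when the keyword does not occur at all.
    """
    keyword_positions = [
        m.start() for m in re.finditer(re.escape(keyword), text, re.IGNORECASE)
    ]
    ranked = []
    for m in pattern.finditer(text):
        if keyword_positions:
            distance = min(abs(m.start() - k) for k in keyword_positions)
            score = 1.0 / (1.0 + distance)
        else:
            score = 0.0
        ranked.append((score, m.group()))
    # Candidates closest to the keyword come first.
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked


# Made-up sample text, not taken from any real file.
sample = (
    "Newsletter sent from noreply@example.com on Monday. "
    "Opening hours: 8am - 12pm. "
    "Contacts: info@example.com"
)
emails = rank_by_proximity(sample, EMAIL)
intervals = [m.group() for m in TIME_INTERVAL.finditer(sample)]
```

Here the address next to "Contacts" is ranked above the one at the start of the text. Character distance is the simplest possible proximity measure; token distance or a distance-limited window would be equally reasonable choices.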



