Hi,
You understood my problem completely. The point you mentioned about how
much to extract after the word Location is something I'll have to figure
out. So let's say that the input to my system is:
"
Location : Montvale, NJ
Duration : 7 months
"
Now the problem is when the in
Hi,
Thanks for your replies, it really helped me a lot.
Thanks & Regards,
Abhishek
--
View this message in context:
http://www.nabble.com/How-to-tune-Analyzer-for-Text-Extraction-tp24926082p24938899.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---
Hi,
You should also have a look at GATE (http://gate.ac.uk), which comes with a
NER application called ANNIE. You could use it to analyse your docs before
indexing them with Lucene or Solr.
As Grant mentioned, UIMA can also be used for that, as there are a number of
NER annotators available for it.
On Aug 11, 2009, at 5:27 PM, xs2Abhishek wrote:
Hi,
I am trying to decide whether or not I can use Lucene for my
requirements, which mainly include data tagging. I have to be able to
parse or index a .txt file and then be able to extract text accordingly.
For example, if the inpu
If this file has a predefined construct, e.g.:
title: something
location: new york
then you can write a simple parser that extracts that information.
But I think otherwise this falls outside the scope of Lucene, unless I
misunderstood you.
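A simple parser of the kind described above could be sketched as follows.
This is only an illustration in plain Java, not anything Lucene provides:
the class name FieldExtractor and its parse method are made up for this
example, and it assumes every field sits on its own line in "key : value"
form, with the value running to the end of the line.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: extracts "key : value" lines.
final class FieldExtractor {

    // Splits the text into lines and, for each line containing a colon,
    // stores the trimmed text before the first colon (lower-cased) as the
    // key and the trimmed remainder of the line as the value.
    // Lines without a colon, or with an empty key, are skipped.
    static Map<String, String> parse(String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : text.split("\\r?\\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String key = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (!key.isEmpty()) {
                fields.put(key, value);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String input = "Location : Montvale, NJ\nDuration : 7 months\n";
        Map<String, String> fields = parse(input);
        System.out.println(fields.get("location")); // Montvale, NJ
        System.out.println(fields.get("duration")); // 7 months
    }
}
```

The extracted values could then be stored as separate fields when indexing.
If the value does not end at the line break, the "how much to extract"
question remains open and this sketch does not address it.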
If I had to give it a long shot though, I'd try to in
xs2Abhishek wrote:
Hi,
I am trying to decide whether or not I can use Lucene for my
requirements, which mainly include data tagging. I have to be able to parse
or index a .txt file and then be able to extract text accordingly. For example,
if the input document has some text like: "Loca