lucene-seme1 s wrote:
> Can you please share the custom Analyzer you have?
Unfortunately it's not mine to share but see the Lucene Token and
Analyzer classes - it's not particularly hard to code.
way of doing this which avoids this problem might be to look at
> the new payloads API.
> Anyone care to wade in with if this is feasible and the state of play with
> payloads?
>
> Cheers
> Mark
>
>
----- Original Message -----
From: Grant Ingersoll <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, 18 March, 2008 12:24:02 AM
Subject: Re: Indexing/Querying Annotations and Fields for a document
You would parse the XML (or whatever) into separate strings, and put
each piece into its own Field in a Lucene Document.
For instance:
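Document doc = new Document();
// getBody() and getPeople() stand for your own extraction code.
String bodyText = getBody(input);
String peopleText = getPeople(input);
// Store/Index flags here are one reasonable choice, not the only one.
Field body = new Field("body", bodyText, Field.Store.NO, Field.Index.TOKENIZED);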
Field people = new Field("p
I already have the document preprocessed and the annotations (i.e.
John) are already stored in an array with features attached
to some annotations (such as the root and lemma of the word). Can you please
elaborate some more on how to "index them as normally would"?
Regards,
JK
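
For annotations that are already held in an array, one way to read the "separate strings, one Field each" advice is to group the annotation texts by type first. The following is only a rough sketch in plain Java, with no Lucene calls: the Annotation class, the toFieldTexts method and the "_lemma" field-name suffix are all made up for illustration and would need to be mapped onto whatever the preprocessing step actually produces.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AnnotationFields {

    // Hypothetical stand-in for one preprocessed annotation.
    static final class Annotation {
        final String type;   // e.g. "person"
        final String text;   // surface text, e.g. "John"
        final String lemma;  // optional feature; null when absent

        Annotation(String type, String text, String lemma) {
            this.type = type;
            this.text = text;
            this.lemma = lemma;
        }
    }

    // Groups annotations by type and returns one space-separated string
    // per field name. Lemmas go into a parallel "<type>_lemma" field.
    static Map<String, String> toFieldTexts(List<Annotation> annotations) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (Annotation a : annotations) {
            grouped.computeIfAbsent(a.type, k -> new ArrayList<>()).add(a.text);
            if (a.lemma != null) {
                grouped.computeIfAbsent(a.type + "_lemma", k -> new ArrayList<>())
                       .add(a.lemma);
            }
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            fields.put(e.getKey(), String.join(" ", e.getValue()));
        }
        return fields;
    }

    public static void main(String[] args) {
        List<Annotation> annotations = new ArrayList<>();
        annotations.add(new Annotation("person", "John", null));
        annotations.add(new Annotation("verb", "ran", "run"));
        annotations.add(new Annotation("person", "Mary", null));
        // prints {person=John Mary, verb=ran, verb_lemma=run}
        System.out.println(toFieldTexts(annotations));
    }
}
```

Each entry of the returned map could then be added to the Document as its own Field, alongside the body text. Note that this flattening drops the position of each annotation within the body, so it would not by itself support queries that tie an annotation to a particular place in the text.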
On Mon, Mar 17, 2
I think there are a couple of ways you can approach this, although I
have never used GATE.
If these annotations are marked in line in your content, then you can
either preprocess the files to have them separately and index as you
normally would, or you can use the relatively new TeeTokenFil