Re: getting term offset information for fields with multiple value entries

2007-08-20 Thread duiduder
Hello Grant, dear community, I have written some lines of code to adapt the offset values from Lucene to the positions where the terms really appear in the concatenated field value entries. My tests are successful :) There are two additional methods inside

Re: getting term offset information for fields with multiple value entries

2007-08-20 Thread duiduder
This is the 2.2.0 release. Grant Ingersoll wrote: > What version of Lucene are you using? > > On Aug 17, 2007, at 12:44 PM, [EMAIL PROTECTED] wrote: > > Hello community, dear Grant > > I have built a JUnit test case that illustrates the prob

Re: getting term offset information for fields with multiple value entries

2007-08-17 Thread Grant Ingersoll
What version of Lucene are you using? On Aug 17, 2007, at 12:44 PM, [EMAIL PROTECTED] wrote: Hello community, dear Grant, I have built a JUnit test case that illustrates the problem: there, I try to cut out the right substring with the offset v

Re: getting term offset information for fields with multiple value entries

2007-08-17 Thread duiduder
Hello community, dear Grant, I have built a JUnit test case that illustrates the problem: there, I try to cut out the right substring with the offset values given by Lucene, and fail :( A few remarks: In this example, the 'é' from 'Bosé' makes t

Re: getting term offset information for fields with multiple value entries

2007-08-16 Thread Grant Ingersoll
Hi Christian, Is there any way you can post a complete, self-contained example, preferably as a JUnit test? I think it would be useful to know more about how you are indexing (i.e. what Analyzer, etc.). The offsets should be taken from whatever is set on the Token during analysis. I, too,

getting term offset information for fields with multiple value entries

2007-08-16 Thread duiduder
Hello, I have an index with an 'actor' field; for each actor there exists a single field value entry, e.g. stored/compressed,indexed,tokenized,termVector,termVectorOffsets,termVectorPosition movie_actors:Mayrata O'Wisiedo (as Mairata O'Wisiedo)
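
For readers following the thread, here is a minimal sketch of the kind of offset adaptation discussed above. It is not the code posted to the list (that message is cut off in this archive view) and it calls no Lucene API. It assumes the field values are concatenated with a fixed-length separator and that a term's start and end offsets refer to that concatenation. The class name ConcatenatedOffsets, the method toLocal and the sample values are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: maps an offset span in the
// concatenation of several field values back to a single value.
class ConcatenatedOffsets {

    /**
     * Returns {valueIndex, localStart, localEnd} for the span
     * [globalStart, globalEnd) of the concatenated values.
     * Assumption: values are joined with a separator of separatorLength chars.
     */
    static int[] toLocal(List<String> values, int separatorLength,
                         int globalStart, int globalEnd) {
        if (globalStart < 0 || globalEnd < globalStart) {
            throw new IllegalArgumentException("invalid span");
        }
        int base = 0; // position of the current value in the concatenation
        for (int i = 0; i < values.size(); i++) {
            int length = values.get(i).length();
            if (globalStart >= base && globalEnd <= base + length) {
                return new int[] { i, globalStart - base, globalEnd - base };
            }
            base += length + separatorLength;
        }
        throw new IllegalArgumentException("span does not lie inside one value");
    }

    public static void main(String[] args) {
        // Illustrative values only; \u00e9 is the 'é' mentioned in the thread.
        List<String> values = Arrays.asList("Bos\u00e9", "O'Wisiedo");
        int[] hit = toLocal(values, 1, 5, 14);
        String term = values.get(hit[0]).substring(hit[1], hit[2]);
        System.out.println("value " + hit[0] + ", local [" + hit[1] + ", "
                + hit[2] + ") -> " + term);
        // prints: value 1, local [0, 9) -> O'Wisiedo
    }
}
```

Whether the separator length is 0, 1 or something else depends on how the offsets were produced at indexing time, so that parameter would have to be checked against the actual index before relying on the result.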