Re: Spans, appended fields, and term positions

2005-11-22 Thread Erik Hatcher
On 22 Nov 2005, at 03:18, Paul Elschot wrote: You just about have me convinced :) For some reason adding a method to Analyzer seems much heavier to me - not quite sure why, just a gut feeling. In case this position increment gap is needed both at indexing time and at query/highlighting tim

Re: Spans, appended fields, and term positions

2005-11-22 Thread Paul Elschot
On Monday 21 November 2005 22:20, Erik Hatcher wrote: > > On 21 Nov 2005, at 16:09, Yonik Seeley wrote: > >> The Analyzer extensions seem fine, but much more general purpose > >> than my need. > > > > For your need (a global increment), isn't expanding analyzer > > actually easier? > > analyse

Re: Spans, appended fields, and term positions

2005-11-21 Thread Erik Hatcher
On 21 Nov 2005, at 16:09, Yonik Seeley wrote: The Analyzer extensions seem fine, but much more general purpose than my need. For your need (a global increment), isn't expanding analyzer actually easier? analyser = new OldAnalyzer() { public int getPositionIncrementGap(String field) {

Re: Spans, appended fields, and term positions

2005-11-21 Thread Yonik Seeley
> > For position increments, it doesn't have to be tracked. The patch to > > DocumentWriter could also be: > > > > int position = fieldPositions[fieldNumber]; > > + if (position>0) position+=analyzer.getPositionIncrementGap > > (fieldName) > > This could be thwarted with tokens using zer
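As an illustration of the DocumentWriter change quoted in this message, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class PositionGapSketch, its addField method, and the whitespace tokenizer are all hypothetical, and every token is assumed to advance the position by exactly one. It only shows the idea that the gap is added when a field name already has positions recorded, so no separate "seen before" bookkeeping is needed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (hypothetical, not Lucene's DocumentWriter): tracks the next free
// term position per field name and inserts a gap between appended parts.
final class PositionGapSketch {
    // Next position to assign, keyed by field name.
    private final Map<String, Integer> fieldPositions = new HashMap<>();
    // Stand-in for analyzer.getPositionIncrementGap(fieldName); fixed here.
    private final int gap;

    PositionGapSketch(int gap) {
        this.gap = gap;
    }

    // Adds one field instance and returns the positions given to its tokens.
    List<Integer> addField(String fieldName, String text) {
        int position = fieldPositions.getOrDefault(fieldName, 0);
        // Mirrors the quoted patch: only a field name that already has
        // positions gets the gap, so the first instance starts at 0.
        if (position > 0) {
            position += gap;
        }
        List<Integer> assigned = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // blank input yields a single empty token
            }
            assigned.add(position++); // simplification: increment of 1 per token
        }
        fieldPositions.put(fieldName, position);
        return assigned;
    }
}
```

With a gap of 10, adding "quick brown" and then "fox jumps" under the same field name gives positions 0, 1 and then 12, 13, so a phrase or span query with small slop would no longer match across the boundary between the two parts.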

Re: Spans, appended fields, and term positions

2005-11-21 Thread Erik Hatcher
On 21 Nov 2005, at 12:55, Yonik Seeley wrote: On 11/21/05, Erik Hatcher <[EMAIL PROTECTED]> wrote: Modifying Analyzer as you have suggested would require DocumentWriter additionally keep track of the field names and note when one is used again. For position increments, it doesn't have to be t

Re: Spans, appended fields, and term positions

2005-11-21 Thread Yonik Seeley
On 11/21/05, Erik Hatcher <[EMAIL PROTECTED]> wrote: > Modifying Analyzer as you have suggested would > require DocumentWriter additionally keep track of the field names > and note when one is used again. For position increments, it doesn't have to be tracked. The patch to DocumentWriter could als

Re: Spans, appended fields, and term positions

2005-11-21 Thread Erik Hatcher
On 21 Nov 2005, at 04:26, Erik Hatcher wrote: What about adding an offset to Field, setPositionOffset(int offset)? Looking at DocumentWriter, it looks like this would be the simplest thing that could work, without precluding the interesting option of modifying Analyzer to allow with flags

Re: Spans, appended fields, and term positions

2005-11-21 Thread Erik Hatcher
Yonik, Thanks for your carefully thought out and detailed reply. On 20 Nov 2005, at 12:00, Yonik Seeley wrote: Does it make sense to add an IndexWriter setting to specify a default position increment gap to use when multiple fields are added in this way? Per-field might be nice... The good

Re: Spans, appended fields, and term positions

2005-11-20 Thread Yonik Seeley
> It depends on > Document.fields() of a stored and retrieved document: does it return > all the appended field parts as separate Fields, or does it only > return one Field with all parts appended? Separate fields. Stored fields are returned back to you verbatim. -Yonik

Re: Spans, appended fields, and term positions

2005-11-20 Thread Paul Elschot
One more thing to consider: the field length in the index. Probably the added position increment between appended parts of a field should not be reflected in the total field size as indexed. This would also be a consideration for queries and for the field norms: when multiple fields are used they

Re: Spans, appended fields, and term positions

2005-11-20 Thread Yonik Seeley
> Does it make sense to add an IndexWriter setting to > specify a default position increment gap to use when multiple fields > are added in this way? Per-field might be nice... The good news is that Analyzer is an abstract class, and not an Interface, so we could add something to it without break

Spans, appended fields, and term positions

2005-11-20 Thread Erik Hatcher
I'm working on building a custom highlighter for a client, which may eventually be generalizable. In my work, I've come across some issues I'd like to discuss. One issue is of appended fields allowing querying across boundaries. For example, if I index two fields with the same name: