Re: Phrase Highlighting

2009-06-04 Thread Mark Miller
Yeah, the highlighter framework, as is, is certainly limiting. When I first did the SpanHighlighter without trying to fit it into the old Highlighter (an early incomplete prototype type thing anyway) I made them merge right off the bat because it was very easy. That was because I could just use t

Re: Phrase Highlighting

2009-06-04 Thread Michael McCandless
Mark, is this because the highlighter package doesn't include enough information as to why the fragmenter picked a given fragment? Because... the SpanScorer is in fact doing all the work to properly locate the full span for the phrase (I think?), so it's a shame that because there's no way for it t

Re: Phrase Highlighting

2009-06-03 Thread Max Lynch
On Wed, Jun 3, 2009 at 7:34 PM, Mark Miller wrote:
> Max Lynch wrote:
>> Well what happens is if I use a SpanScorer instead, and allocate it like such:
>> analyzer = StandardAnalyzer([])
>> tokenStream = analyzer.tokenStream("contents", luc

Re: Phrase Highlighting

2009-06-03 Thread Mark Miller
Max Lynch wrote:
> Well what happens is if I use a SpanScorer instead, and allocate it like such:
> analyzer = StandardAnalyzer([])
> tokenStream = analyzer.tokenStream("contents", lucene.StringReader(text))
> ctokenStream = lucene.CachingTokenFilter(tokenStre

Re: Phrase Highlighting

2009-06-02 Thread Max Lynch
> Well what happens is if I use a SpanScorer instead, and allocate it like such:
>
> analyzer = StandardAnalyzer([])
> tokenStream = analyzer.tokenStream("contents", lucene.StringReader(text))
> ctokenStream = lucene.CachingTokenFilter(tokenStream)

Re: Phrase Highlighting

2009-05-21 Thread Michael McCandless
On Thu, May 21, 2009 at 3:09 PM, Max Lynch wrote:
> Sorry, the following code is in python, but I can hack a Java thing together if necessary.
I'm a big Python fan :)
> HighlighterSpanScorer is the SpanScorer from the highlight package just renamed to avoid conflict with the other SpanScorer

Re: Phrase Highlighting

2009-05-21 Thread Max Lynch
On Thu, Apr 30, 2009 at 5:16 AM, Michael McCandless <luc...@mikemccandless.com> wrote:
> On Thu, Apr 30, 2009 at 12:15 AM, Max Lynch wrote:
>>> You should switch to the SpanScorer (in o.a.l.search.highlighter).
>>> That fragment scorer should only match true phrase matches.
>>>
>>> Mike

Re: Phrase Highlighting

2009-04-30 Thread Michael McCandless
On Thu, Apr 30, 2009 at 12:15 AM, Max Lynch wrote:
>> You should switch to the SpanScorer (in o.a.l.search.highlighter).
>> That fragment scorer should only match true phrase matches.
>>
>> Mike
>
> Thanks Mike. I gave it a try and it wasn't working how I expected. I am using pylucene right

Re: Phrase Highlighting

2009-04-29 Thread Max Lynch
> You should switch to the SpanScorer (in o.a.l.search.highlighter).
> That fragment scorer should only match true phrase matches.
>
> Mike
Thanks Mike. I gave it a try and it wasn't working how I expected. I am using pylucene right now so I can ask them if the implementation is different. I'm

Re: Phrase Highlighting

2009-04-29 Thread Michael McCandless
You should switch to the SpanScorer (in o.a.l.search.highlighter). That fragment scorer should only match true phrase matches.
Mike
On Tue, Apr 28, 2009 at 9:49 PM, Max Lynch wrote:
> Hi,
> I am trying to find out exactly when a word I'm looking for in a document is found. I've talked to a fe

Phrase Highlighting

2009-04-28 Thread Max Lynch
Hi, I am trying to find out exactly when a word I'm looking for in a document is found. I've talked to a few people on IRC and it seems like the best way is to use a highlighter. What I have right now is a system where I put each word the highlighter is called with into a list so I then know whic
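
For readers skimming the thread, the distinction it turns on (marking every occurrence of a query term versus marking only true phrase matches) can be sketched in plain Python. This is only an illustration under stated assumptions: it is not Lucene's SpanScorer or the highlighter package, it uses a naive regex tokenizer rather than an analyzer, and the helper names tokenize, term_matches, phrase_matches, and highlight are hypothetical.

```python
import re


def tokenize(text):
    """Return (lowercased word, start offset, end offset) for each word.

    Assumption: a simple \\w+ regex stands in for a real analyzer.
    """
    return [(m.group().lower(), m.start(), m.end())
            for m in re.finditer(r"\w+", text)]


def term_matches(text, phrase):
    """Offsets of every token equal to ANY word of the phrase (term-level)."""
    wanted = {word for word, _, _ in tokenize(phrase)}
    return [(start, end) for word, start, end in tokenize(text)
            if word in wanted]


def phrase_matches(text, phrase):
    """Offsets of spans where the phrase words occur consecutively, in order."""
    tokens = tokenize(text)
    words = [word for word, _, _ in tokenize(phrase)]
    n = len(words)
    if n == 0:
        return []
    spans = []
    for i in range(len(tokens) - n + 1):
        window = [word for word, _, _ in tokens[i:i + n]]
        if window == words:
            # Span runs from the first token's start to the last token's end.
            spans.append((tokens[i][1], tokens[i + n - 1][2]))
    return spans


def highlight(text, spans, pre="<B>", post="</B>"):
    """Wrap each span in pre/post markers, merging overlapping or touching spans."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    out, last = [], 0
    for start, end in merged:
        out.append(text[last:start])
        out.append(pre + text[start:end] + post)
        last = end
    out.append(text[last:])
    return "".join(out)


if __name__ == "__main__":
    sample = "The big apple fell. An apple a day is big."
    # Term-level: every "big" and every "apple" is marked.
    print(highlight(sample, term_matches(sample, "big apple")))
    # Phrase-level: only the adjacent "big apple" is marked.
    print(highlight(sample, phrase_matches(sample, "big apple")))
```

Collecting the offsets returned by phrase_matches, rather than each individually highlighted word, is one way to know where a phrase actually occurred; whether that maps onto what the highlighter exposes in a given Lucene or PyLucene version is something to check against that version's API.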