On Tue, Oct 15, 2013 at 10:57 AM, Michael McCandless
wrote:
> On Tue, Oct 15, 2013 at 10:11 AM, Robert Muir wrote:
>> On Tue, Oct 15, 2013 at 9:59 AM, Michael McCandless
>> wrote:
>>> Well, unfortunately, this is a trap that users do hit.
>>>
>>> By requiring the user to think about the limit on
On Tue, Oct 15, 2013 at 10:11 AM, Robert Muir wrote:
> On Tue, Oct 15, 2013 at 9:59 AM, Michael McCandless
> wrote:
>> Well, unfortunately, this is a trap that users do hit.
>>
>> By requiring the user to think about the limit on creating
>> PostingsHighlighter, he/she would think about it and re
I'm very grateful for the assistance. It'd be great to know the value
of DEFAULT_MAX_LENGTH in the documentation. I know the majority of
applications care more about precision than recall... but I know of a
lot of people using Lucene for high recall applications, too. Working
in high recall domains
On Tue, Oct 15, 2013 at 9:59 AM, Michael McCandless
wrote:
> Well, unfortunately, this is a trap that users do hit.
>
> By requiring the user to think about the limit on creating
> PostingsHighlighter, he/she would think about it and realize they are
> in fact setting a limit.
>
> Silent limits ar
Well, unfortunately, this is a trap that users do hit.
By requiring the user to think about the limit on creating
PostingsHighlighter, he/she would think about it and realize they are
in fact setting a limit.
Silent limits are dangerous because you don't offhand know what's
wrong / why you see no
I strongly disagree: there is no trap, it's a reasonable default for
good summarization, and the behavior is no different than the other
highlighters here.
Typically people *do* care about performance and it's important to have
a clean simple API too.
In my opinion increasing this limit is very eso
Maybe we should make the max length a required argument to
PostingsHighlighter ctor?
Because it's trappy now, since you don't realize offhand that it's
silently enforcing a limit ...
Mike McCandless
http://blog.mikemccandless.com
On Tue, Oct 15, 2013 at 9:31 AM, Robert Muir wrote:
> Thanks Jo
Thanks Jon. I'll add some stuff to the javadocs here to try to make it
more obvious.
On Tue, Oct 15, 2013 at 5:54 AM, Jon Stewart
wrote:
> Awesome, that did it! I didn't realize that DEFAULT_MAX_LENGTH was
> only 10,000. I've now upped it to 16MB (I'm not doing the usual thing
> and performance is
Awesome, that did it! I didn't realize that DEFAULT_MAX_LENGTH was
only 10,000. I've now upped it to 16MB (I'm not doing the usual thing
and performance is not a particular concern).
Thanks,
Jon
On Mon, Oct 14, 2013 at 9:58 PM, Robert Muir wrote:
> are your documents large?
>
> try PostingsHig
are your documents large?
try PostingsHighlighter(int) ctor with a larger value than DEFAULT_MAX_LENGTH.
sounds like the passages you see with matches are very deep into the
document and it's just hitting the default limit and returning the
default summarization (getEmptyHighlight())
otherwise, p
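
For readers following along, here is a rough sketch of the effect described above. It is a toy model in plain Java, not Lucene's implementation: the class MaxLengthSketch and its helpers highlight() and deepDocument() are made up for illustration, and the only detail taken from this thread is the 10,000 figure reported for DEFAULT_MAX_LENGTH. It assumes the highlighter only looks at the first maxLength characters and falls back to a plain summary when no match lies inside that window.

```java
// Toy model only: this is NOT Lucene's PostingsHighlighter, just an
// illustration of how a silent analysis limit can hide deep matches.
final class MaxLengthSketch {
    // Value reported in this thread for PostingsHighlighter.DEFAULT_MAX_LENGTH.
    static final int DEFAULT_MAX_LENGTH = 10_000;

    // Hypothetical helper: returns a snippet around the first occurrence of
    // `term` within the first `maxLength` characters, or a fallback summary
    // (the start of the document) when no match lies inside that window.
    static String highlight(String text, String term, int maxLength) {
        String window = text.substring(0, Math.min(text.length(), maxLength));
        int hit = window.indexOf(term);
        if (hit < 0) {
            // Analogous to the empty-highlight fallback: nothing is marked.
            return window.substring(0, Math.min(window.length(), 20));
        }
        int start = Math.max(0, hit - 10);
        int termEnd = hit + term.length();
        int end = Math.min(window.length(), termEnd + 10);
        return window.substring(start, hit)
                + "<b>" + window.substring(hit, termEnd) + "</b>"
                + window.substring(termEnd, end);
    }

    // Builds a document whose only match sits past the default limit.
    static String deepDocument() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 15_000; i++) {
            sb.append('x');
        }
        sb.append(" needle ");
        return sb.toString();
    }

    public static void main(String[] args) {
        String doc = deepDocument();
        // With the default limit the match is never seen.
        System.out.println(highlight(doc, "needle", DEFAULT_MAX_LENGTH));
        // With a 16MB limit the match is found and marked.
        System.out.println(highlight(doc, "needle", 16 * 1024 * 1024));
    }
}
```

In the real API the corresponding change is the one suggested above: construct the highlighter with the PostingsHighlighter(int) ctor and pass a value larger than DEFAULT_MAX_LENGTH.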
I upgraded to 4.5. Same results, unfortunately. Most docs in the
result set will have a Passage where numMatches() > 0, but some do
not. In these cases, the Passage array's length is greater than zero.
Jon
On Mon, Oct 14, 2013 at 5:24 PM, Robert Muir wrote:
> did you try the latest release? Th
did you try the latest release? There are some bugs fixed...
On Mon, Oct 14, 2013 at 2:11 PM, Jon Stewart
wrote:
> Hello,
>
> I've observed that when using PostingsHighlighter in Lucene 4.4,
> some of the responsive documents in TopDocs will have zero matches in
> the associated array of Pass