Hi,
Are there any notes on making the highlighter work consistently with a
shingle generated index?
I have a situation where complete matches highlight OK, but partial
matches do not - leading to a number of blank previews...
Our analyser looks like:
TokenStream result =
Steve,
Exactly the right question...
Prompted by your question, further investigation reveals that I need to
move the "access" part of my lucene query into a filter to prevent
non-matching documents getting scored.
In that situation of course the highlighter finds nothing to highlight -
that'
Hi folks,
Is there a reason why the setMaxDocCharsToAnalyze() method of
WeightedSpanTermExtractor() is protected?
The class is a perfect fit for my requirement (enumerating the list of
terms present in a document that match the current query for subsequent
highlighting in a PDF file) with th
Did you consider using shingles?
It solves the "to be or not to be" problem quite nicely.
Dawn
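For readers unfamiliar with the term, a shingle is a word n-gram indexed as a single token, so a phrase made up entirely of common words can still be matched as a unit. The sketch below is plain Java and does not use Lucene; `ShingleSketch` and `shingles` are hypothetical names chosen for illustration. It assumes the output should contain each single word plus the two-word shingle starting at it, which is my understanding of how Lucene's ShingleFilter behaves by default, not something confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShingleSketch {
    // Hypothetical helper, not part of Lucene: emits each token followed by
    // the shingles (word n-grams) of size 2..maxSize that start at that token.
    static List<String> shingles(List<String> tokens, int maxSize) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            StringBuilder sb = new StringBuilder(tokens.get(i));
            out.add(sb.toString()); // the single word itself
            for (int n = 2; n <= maxSize && i + n <= tokens.size(); n++) {
                sb.append(' ').append(tokens.get(i + n - 1));
                out.add(sb.toString()); // shingle of n words starting at i
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("to", "be", "or", "not", "to", "be");
        // prints [to, to be, be, be or, or, or not, not, not to, to, to be, be]
        System.out.println(shingles(tokens, 2));
    }
}
```

With pairs such as "to be" and "or not" available as index terms, a phrase built from stop-word-like terms no longer has to be matched one very common word at a time.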
On 24/07/2013 12:34, Ankit Murarka wrote:
I tried using Phrase Query with slops. Now since I am specifying the
slop I also need to specify the 2nd term.
In my case the 2nd term is not present. The w
On 18/01/2011 21:04, Grant Ingersoll wrote:
[X] ASF Mirrors (linked in our release announcements or via the Lucene website)
[] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.)
[] I/we build them from source via an SVN/Git checkout.
[] Other (someone in your company mirrors them
Removing redundant calls to rewrite was the key when I had this issue
moving from 2.3.x to 3.0.x...
Dawn
On 25/01/2011 20:04, Uwe Schindler wrote:
And: you don't need to rewrite queries before highlighting, highlighter does
this automatically internally if needed.
-
Uwe Schindler
We use the contrib package 'Highlighter' to do exactly that on our PDF
newspaper website.
Dawn
On 03/02/2011 17:31, Gong Li wrote:
Hi,
I am developing an advanced pdf search engine in java by using pdfbox and
lucene. And I must display the context of each keyword in the user
interface, but i
Hi Folks,
Before I run off and reinvent the wheel here - has anyone done any form
of result grouping with lucene?
My use case looks something like this:
Newspaper pages are stored as documents in the lucene index.
I need to list the newspapers that match my criteria in date order, so
that I ca
On 23/03/2011 17:55, Grant Ingersoll wrote:
Have you looked at Solr and date faceting capabilities? Also, it has result
grouping, but I think you are just describing faceting/filtering.
SOLR is not an option; we already have the index (>2 million pages
some with 100,000 terms).
What I'
On 12/05/2011 15:47, Wulf Berschin wrote:
I think support for highlighting documents would be a very welcome
feature. Highlighting HTML documents is already possible with the
org.apache.solr.analysis.HTMLStripCharFilter and a NullFragmenter, but
there seems to be nothing for highlighting PDF fi
Are you using QueryAnalyser...?
If so remember that NOT is a reserved word.
Dawn
On 26/07/2011 04:25, SBS wrote:
If I enter a query of just the word "not" I get no matches. If I run a
query with just the word "included" I get lots of matches. If I run the
query "not included" (without surroun
Hi folks,
I'm researching the best options to use for analysing/storing newspaper
pages in our online archive, and wondered if anyone has any good hints
or tips on good practice for this type of media?
I'm currently thinking along the lines of using a customised
StandardAnalyser (no stop wor
Hi Steve,
On 28/11/2011 19:43, Steven A Rowe wrote:
I assume that when you refer to "the impact of stop words," you're concerned
about query-time performance? You should consider the possibility that performance
without removing stop words is good enough that you won't have to take any steps