The Hits class was deprecated at some point and has been removed from
recent releases.
The 2.9.3 javadoc at
http://lucene.apache.org/java/2_9_3/api/core/org/apache/lucene/search/Hits.html
shows a little code sample:
TopDocs topDocs = searcher.search(query, numHits);
ScoreDoc[] hits = topDocs.sc
You'll have to call commit() on the IndexWriter to make the changes
externally visible.
Then call IndexReader.reopen() to get a reader that sees the committed
changes; the reopen will be efficient (it only opens "new" segments
relative to the old reader).
It's still best to use a near-real-time reader when possibl
We have a modified version of the Lucene StandardAnalyzer, which we use for
tokenizing music metadata such as artist names and song titles, so
typically only a few words. When tokenizing, it usually strips out
punctuation, which is correct; however, if the input text consists of
only punctuation
Hi,
I am having a weird experience. I made a few changes to the source code
(Lucene 3.3) and created a basic application to test them. First, I added the
Lucene 3.3 project to the basic project under "required projects on the build
path" so that I could debug. When everything was OK, I removed it from required
Hi Paul,
Since you have modified the StandardAnalyzer (I presume you mean
StandardFilter), why not do a check on the term.text() and, if it's all
punctuation, skip the analysis for that term? Something like this in
your StandardFilter:
public final boolean incrementToken() throws IOException {
Ch
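The "is it all punctuation" test itself does not depend on Lucene, so it can be sketched separately. The helper below is only an illustration: the class and method names are hypothetical, and it assumes "punctuation" means "contains no letters or digits", which may or may not be the rule you want for music metadata.

```java
// Hypothetical helper, not part of Lucene: decides whether a term
// contains no letters or digits at all (our assumed meaning of
// "all punctuation").
final class PunctuationCheckSketch {

    static boolean isAllPunctuation(CharSequence text) {
        if (text.length() == 0) {
            return false; // an empty term is not treated as punctuation
        }
        for (int i = 0; i < text.length(); ) {
            int cp = Character.codePointAt(text, i);
            if (Character.isLetterOrDigit(cp)) {
                return false; // found a real word character
            }
            i += Character.charCount(cp); // step over surrogate pairs safely
        }
        return true;
    }
}
```

A filter could call such a check on the term text and pass the token through untouched when it returns true. Note that whitespace also counts as "not a letter or digit" here, so adjust the test if that matters for your input.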
Hi Mead,
You may want to check out the permuterm index idea.
http://www-nlp.stanford.edu/IR-book/html/htmledition/permuterm-indexes-1.html
Basically you write a custom filter that takes a term and generates all
word permutations of it. On the query side, you convert your query so
it's always a p
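As a rough illustration of the idea in the linked chapter, here is a plain-Java sketch with no Lucene dependency. It assumes the scheme described there: append an end marker ('$') to the term, index every rotation of the result, and rotate a single-wildcard query so the wildcard ends up last, which leaves a plain prefix to look up. The class and method names are made up for this example; in a real analyzer the rotation step would live inside your custom filter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of permuterm rotation; not a Lucene API.
final class PermutermSketch {

    static final char END = '$'; // end-of-term marker, assumed absent from terms

    // Index side: all rotations of term + '$'.
    static List<String> rotations(String term) {
        String marked = term + END;
        List<String> out = new ArrayList<>();
        for (int i = 0; i < marked.length(); i++) {
            out.add(marked.substring(i) + marked.substring(0, i));
        }
        return out;
    }

    // Query side: turn "X*Y" into the prefix "Y$X".
    // Only queries with exactly one '*' are handled in this sketch.
    static String toPrefix(String wildcardQuery) {
        int star = wildcardQuery.indexOf('*');
        if (star < 0 || star != wildcardQuery.lastIndexOf('*')) {
            throw new IllegalArgumentException("expected exactly one '*'");
        }
        String before = wildcardQuery.substring(0, star);
        String after = wildcardQuery.substring(star + 1);
        return after + END + before;
    }
}
```

For example, the term "hello" yields the rotation "o$hell", and the query "hel*o" becomes the prefix "o$hel", so a prefix lookup finds it. The cost is index size: each term is stored once per rotation.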
Hi,
I would like to read each term and its frequency or score from an index. How
can I do this in Java?
Thanks!
Hi Paul,
You could add a rule to the StandardTokenizer JFlex grammar to handle this
case, bypassing its other rules.
Another option is to create a char filter that substitutes PUNCT-EXCLAMATION
for exclamation points, PUNCT-PERIOD for periods, etc., but only when the
entire input consists excl
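The substitution step of the char-filter option can be sketched on plain strings, outside Lucene. This is only an illustration under two assumptions: that the condition is "the entire input consists only of punctuation", and that unmapped characters mean the input should be left alone. PUNCT-EXCLAMATION and PUNCT-PERIOD come from the suggestion above; PUNCT-QUESTION and the class and method names are invented for the example.

```java
// Hypothetical sketch of the substitution logic; a real implementation
// would wrap this in a Lucene char filter.
final class PunctuationSubstitutionSketch {

    // Placeholder for one punctuation character, or null if unmapped.
    static String nameFor(char c) {
        switch (c) {
            case '!': return "PUNCT-EXCLAMATION";
            case '.': return "PUNCT-PERIOD";
            case '?': return "PUNCT-QUESTION"; // invented for this example
            default:  return null;
        }
    }

    // Replaces the input with placeholders only when every character
    // is a mapped punctuation mark; otherwise returns it unchanged.
    static String substitute(String input) {
        if (input.isEmpty()) {
            return input;
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            String name = nameFor(input.charAt(i));
            if (name == null) {
                return input; // mixed content: leave it to the tokenizer
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(name);
        }
        return out.toString();
    }
}
```

Whether the tokenizer keeps such placeholders as single tokens depends on its rules, so the placeholder spelling may need adjusting.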