Hi,
You can rule out within-a-word matching entirely by using WhitespaceTokenizer,
for example. It all comes down to how you tokenize/analyze your text.
Once you have decided, you can create two versions of a single field using
different analysers. This allows you to assign different weights to each version.
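A minimal Lucene sketch of that idea (the field names title_ws and title_std are made up for illustration): PerFieldAnalyzerWrapper lets the same text be indexed twice under different analysis.

import java.util.Map;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;

public class TwoFieldVersions {
    public static void main(String[] args) throws Exception {
        // title_ws keeps whole whitespace-separated tokens (no within-a-word matches);
        // everything else, including title_std, goes through StandardAnalyzer.
        Map<String, Analyzer> perField = Map.of("title_ws", new WhitespaceAnalyzer());
        Analyzer analyzer = new PerFieldAnalyzerWrapper(new StandardAnalyzer(), perField);

        try (IndexWriter writer = new IndexWriter(
                new ByteBuffersDirectory(), new IndexWriterConfig(analyzer))) {
            Document doc = new Document();
            String text = "quick-start guide";
            // Same text, two field versions; at query time you can boost one
            // over the other, e.g. title_ws^2 title_std with edismax in Solr.
            doc.add(new TextField("title_ws", text, Store.NO));
            doc.add(new TextField("title_std", text, Store.NO));
            writer.addDocument(doc);
        }
    }
}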
I'm seeing a randomly occurring index-corruption exception during a Solr
data ingest. It can happen anywhere during the 7-8 hours our ingests
take. I've filed a Solr bug in JIRA, since Solr is the environment
I'm using, but it does look as though the error is occurring in Lucene
code, so I t
OK, I see your concern now. Right: when docs are deleted they are only
marked as deleted; the actual data is _not_ purged (yet).
As you add more documents to your index, segments will get merged as
part of normal processing. When segments are merged, the deleted data
is expunged. So if you're contin
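To see or force this from plain Lucene, here is a minimal sketch (the index path is a placeholder): numDeletedDocs() counts docs that are flagged deleted but still on disk, and forceMergeDeletes() asks the merge policy to rewrite the segments that carry deletes.

import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;

public class DeletedDocsCheck {
    public static void main(String[] args) throws Exception {
        try (FSDirectory dir = FSDirectory.open(Paths.get("/path/to/index"))) {
            // Live documents vs. documents merely flagged as deleted.
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                System.out.printf("live=%d deleted=%d%n",
                        reader.numDocs(), reader.numDeletedDocs());
            }
            // Rewrites segments that contain deletes, reclaiming their space.
            try (IndexWriter writer = new IndexWriter(dir,
                    new IndexWriterConfig(new StandardAnalyzer()))) {
                writer.forceMergeDeletes();
            }
        }
    }
}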
Hi,
Apologies for repeating a question from the IRC room, but I am not sure
whether it is still active.
I have no idea how Lucene works, but I need to modify a part of the
rdf4j project that depends on it.
I need to use Lucene to create a mapping file based on text searching, and I
found there is a followi
Thanks Erick for your answer. We have a huge index: 700 GB, 350 million
documents.
We had a case of log flooding due to a bug in an application that generated
100,000,000 documents. We have deleted them, but there is no impact on the
index size without an optimize.
I think that's normal, right?
Thanks
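For reference, forcing the purge from SolrJ might look like the sketch below (the base URL and collection name are placeholders). Note that an optimize on a 700 GB index rewrites most of the data on disk, so it is heavy I/O; a commit with expungeDeletes=true only rewrites segments that actually contain deletes and is usually cheaper.

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class PurgeDeletes {
    public static void main(String[] args) throws Exception {
        // Placeholder Solr base URL and collection name.
        try (SolrClient client =
                new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
            // Merges segments down and drops documents marked as deleted.
            client.optimize("mycollection");
        }
    }
}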