On 31/10/2011 21:42, Petite Abeille wrote:
On Oct 31, 2011, at 9:32 PM, Andrzej Bialecki wrote:
similarity-preserving hash function was calculated on each sentence, and the
hash was added as a field. The property of the hash was that similar documents
(sentences) would produce a similar hash
yes: override that method idfExplain(java.util.Collection,
org.apache.lucene.search.Searcher)
On Mon, Oct 31, 2011 at 5:24 PM, David Ryan wrote:
> Thanks! Is there any way to extend the Similarity class to override the
> behavior (e.g., using the max idf instead of the sum of the individual term idfs)?
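For readers following this exchange, the difference between summing per-term idf values and taking their maximum can be sketched without any Lucene dependency. This is only an illustration under stated assumptions: the class name PhraseIdfSketch and its methods are hypothetical, and the idf formula is the classic 1 + ln(numDocs / (docFreq + 1)) that the default similarity is generally documented as using. In a real setup the combination logic would live inside an overridden idfExplain, as suggested in the reply.

```java
/** Hypothetical stand-alone sketch: combine per-term idf values by sum or by max. */
final class PhraseIdfSketch {

    /** Classic idf formula (assumed): 1 + ln(numDocs / (docFreq + 1)). */
    static double idf(int docFreq, int numDocs) {
        return Math.log((double) numDocs / (docFreq + 1)) + 1.0;
    }

    /** Sum of the idf of every term, i.e. the behavior the question wants to replace. */
    static double sumIdf(int[] docFreqs, int numDocs) {
        double total = 0.0;
        for (int docFreq : docFreqs) {
            total += idf(docFreq, numDocs);
        }
        return total;
    }

    /** Largest idf among the terms, i.e. the rarest term decides; 0.0 for no terms. */
    static double maxIdf(int[] docFreqs, int numDocs) {
        double best = 0.0;
        for (int docFreq : docFreqs) {
            best = Math.max(best, idf(docFreq, numDocs));
        }
        return best;
    }

    public static void main(String[] args) {
        int[] docFreqs = {9, 999};   // one rare term, one very common term (made-up numbers)
        int numDocs = 1000;
        System.out.println("sum idf = " + sumIdf(docFreqs, numDocs));
        System.out.println("max idf = " + maxIdf(docFreqs, numDocs));
    }
}
```

With the max variant, a very common term in the phrase adds nothing to the weight; only the rarest term contributes.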
Thanks! Is there any way to extend the Similarity class to override the
behavior (e.g., using the max idf instead of the sum of the individual term idfs)?
On Thu, Oct 27, 2011 at 5:41 AM, Robert Muir wrote:
> On Thu, Oct 20, 2011 at 3:11 PM, David Ryan wrote:
>
> >
> > However, in some case, when I
On Oct 31, 2011, at 9:32 PM, Andrzej Bialecki wrote:
> similarity-preserving hash function was calculated on each sentence, and the
> hash was added as a field. The property of the hash was that similar
> documents (sentences) would produce a similar hash, with only some bit-level
> perturbati
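The quoted message does not say which similarity-preserving hash was used. As one possible illustration of the idea, here is a minimal SimHash-style sketch in plain Java. Everything in it is an assumption made for the example: the class name SentenceSimHash, whitespace tokenization, the FNV-1a token hash, and the 64-bit fingerprint width. Each token votes on every bit of the fingerprint, so sentences sharing most of their tokens tend to differ in only a few bits.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Hypothetical stand-alone sketch of a SimHash-style sentence fingerprint. */
final class SentenceSimHash {

    /** 64-bit FNV-1a hash of a token's UTF-8 bytes. */
    static long fnv1a64(String token) {
        long hash = 0xcbf29ce484222325L;              // FNV-1a 64-bit offset basis
        for (byte b : token.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xff);
            hash *= 0x100000001b3L;                   // FNV 64-bit prime
        }
        return hash;
    }

    /** Builds a 64-bit fingerprint from lower-cased, whitespace-separated tokens. */
    static long simHash(String sentence) {
        int[] votes = new int[64];
        for (String token : sentence.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;                             // happens for an empty sentence
            }
            long h = fnv1a64(token);
            for (int bit = 0; bit < 64; bit++) {
                // A set bit votes +1 for that position, a clear bit votes -1.
                votes[bit] += ((h >>> bit) & 1L) == 1L ? 1 : -1;
            }
        }
        long fingerprint = 0L;
        for (int bit = 0; bit < 64; bit++) {
            if (votes[bit] > 0) {
                fingerprint |= (1L << bit);
            }
        }
        return fingerprint;
    }

    /** Number of bit positions in which two fingerprints differ. */
    static int hammingDistance(long a, long b) {
        return Long.bitCount(a ^ b);
    }

    public static void main(String[] args) {
        long a = simHash("lucene can index plain text documents");
        long b = simHash("lucene can index plain text files");
        System.out.println("bits differing: " + hammingDistance(a, b));
    }
}
```

A fingerprint like this could be stored as an extra field per sentence; how to query for "nearby" fingerprints efficiently is a separate design question that this sketch does not address.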
On 22/10/2011 11:11, Grant Ingersoll wrote:
Hi All,
I'm giving a talk at ApacheCon titled "Bet you didn't know Lucene can..."
(http://na11.apachecon.com/talks/18396). It's based on my observation that over the
years, a number of us in the community have done some pretty cool things using Luc
Nice not to have to worry about performance. You say there is another
question, but not what it is. The code you show looks like it should
do what you want.
For anything non-trivial I prefer to build the queries directly in
code rather than concatenating strings to be parsed, because I find it
h
Thanks, Ian, for your response. This is a one-time offline program, so I am not
bothered about performance (i.e. speed etc.).
One more question: there are some situations where I need to run an AND
clause (i.e. more than one phrase, such as "Apple" AND "Steve Jobs"). My
approach was something like:
That's a good idea, if your index is "large enough", and/or you make
heavy use of FieldCache (eg, sorting by field), regardless of whether
you use NRT or "normal" commit + reopen to reopen your reader.
Mike McCandless
http://blog.mikemccandless.com
On Sun, Oct 30, 2011 at 7:36 PM, Denis Bazhenov
Do the individual docs get bigger after 28 million? Can you try
loading the last few million docs, from when the size jumps, and see
what happens? Or load them in reverse order or something, again to
see what happens?
I don't have indexes with that many docs, but I believe that plenty of
people
Sounds custom made for boosting. Depending on how you are structuring
your fields and queries you could use either index or query time
boosts, or even both.
http://wiki.apache.org/lucene-java/LuceneFAQ#What_is_the_difference_between_field_.28or_document.29_boosting_and_query_boosting.3F
--
Ian.