I have a use case for which I'm trying to figure out the best way to use
Lucene and could use some guidance.
I have a set of documents representing products in a catalog (name,
description, etc.). I then pull down data from different sources such as
Ebay and Amazon and need to determine if the ite
Thank you, Trejkaz.
I was just about to post the fact that I /finally/ found that method by
looking at the source code for LUKE.
There is a night and day difference in performance.
Hi Mike,
If you just need the IDF you can run HighFreqTerms.java in contrib against
either your sample index or your index to get the N terms with the highest DF
values (i.e. lowest IDF.) If you have a large index, giving it lots of memory
seems to help.
Depending on your use case, you may inst
Hi Simon,
I guess in a sense we are interested in obtaining a list of the top N terms,
but they would be the top terms in the sense that they have the lowest IDF
values. These would be the terms that appear in all or almost all documents in
the document set. This is not a count of the number of
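As a rough illustration of ranking terms by lowest IDF, here is a minimal sketch in plain Java. It does not use the Lucene API: the class name LowIdfTerms, the method names, and the term-to-document-frequency map are all hypothetical, and in practice the document frequencies would have to be read from the index. The formula 1 + ln(numDocs / (docFreq + 1)) is assumed to match what DefaultSimilarity computes in the 3.x line; check that against the version in use.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LowIdfTerms {

    // Assumed IDF formula: 1 + ln(numDocs / (docFreq + 1)).
    // Terms that occur in nearly every document get the lowest values.
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Returns up to n terms with the lowest IDF (ties broken alphabetically).
    // docFreqs maps each term to the number of documents containing it.
    static List<String> lowestIdf(Map<String, Integer> docFreqs, int numDocs, int n) {
        List<String> terms = new ArrayList<>(docFreqs.keySet());
        terms.sort(Comparator
                .comparingDouble((String t) -> idf(docFreqs.get(t), numDocs))
                .thenComparing(Comparator.<String>naturalOrder()));
        return new ArrayList<>(terms.subList(0, Math.min(n, terms.size())));
    }

    public static void main(String[] args) {
        // Made-up document frequencies for a 100-document sample.
        Map<String, Integer> docFreqs = new HashMap<>();
        docFreqs.put("the", 100);
        docFreqs.put("lucene", 40);
        docFreqs.put("zebra", 1);
        for (String term : lowestIdf(docFreqs, 100, 2)) {
            System.out.println(term + " idf=" + idf(docFreqs.get(term), 100));
        }
    }
}
```

The terms at the front of the returned list are the stopword candidates. Since IDF decreases as document frequency rises, sorting by highest document frequency gives the same order.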
On Thu, Dec 15, 2011 at 6:33 PM, Mike O'Leary wrote:
> We have a large set of documents that we would like to index with a
> customized stopword list. We have run tests by indexing a random set of about
> 10% of the documents, and we'd like to generate a list of the terms in that
> smaller set
We have a large set of documents that we would like to index with a customized
stopword list. We have run tests by indexing a random set of about 10% of the
documents, and we'd like to generate a list of the terms in that smaller set
and their IDF values as a way to create a starter set of stopw
Hi. I'm trying to configure an analyzer to be somewhat forgiving of
spelling mistakes in longer words of a search query. So, for example, if a
word in the query matches at least five characters of an indexed word
(token), I want that to be a hit. NGramTokenFilter with a minimum gram size
of 5 seems
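To make the "at least five matching characters" idea concrete, here is a small sketch in plain Java. It is not Lucene's NGramTokenFilter, only an illustration of the matching rule; the class NGramOverlap and its methods are hypothetical names. It assumes the five characters must be contiguous, in which case comparing fixed-size 5-grams is enough, because any shared substring of five or more characters contains a shared 5-gram.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class NGramOverlap {

    // All contiguous substrings of length n; empty if the word is shorter than n.
    static Set<String> grams(String word, int n) {
        Set<String> out = new HashSet<>();
        for (int i = 0; i + n <= word.length(); i++) {
            out.add(word.substring(i, i + n));
        }
        return out;
    }

    // True if the query word and the indexed token share at least one n-gram,
    // compared case-insensitively.
    static boolean sharesGram(String query, String token, int n) {
        Set<String> common = grams(query.toLowerCase(Locale.ROOT), n);
        common.retainAll(grams(token.toLowerCase(Locale.ROOT), n));
        return !common.isEmpty();
    }
}
```

With n = 5, "accomodation" matches "accommodation" because both contain "accom". A query word shorter than five characters produces no grams and so never matches under this rule, and an error in the middle of a short word (for example "recieve" against "receive") leaves no shared 5-gram.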
Hi,
I have come across a problem with our code that is not scaling well and I'm
hoping there is a way I can tweak our existing code to run faster.
We are indexing on a Java object called "Node". A "Node" can have one or
more "Attributes". The "Attributes" consist of a key / value pair and the
I opened LUCENE-3649.
Shai
On Thu, Dec 15, 2011 at 2:50 PM, Shai Erera wrote:
> Sure, as soon as I'll be in front of a computer.
>
> Shai
> On Dec 15, 2011 2:48 PM, "Uwe Schindler" wrote:
>
>> Yes, I could attach the patch there! Will you open it?
>>
>> -
>> Uwe Schindler
>> H.-H.-Meier-Al
Sure, as soon as I'll be in front of a computer.
Shai
On Dec 15, 2011 2:48 PM, "Uwe Schindler" wrote:
> Yes, I could attach the patch there! Will you open it?
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -Original M
... issue for *it*, not 'other' :)
Shai
On Dec 15, 2011 2:47 PM, "Shai Erera" wrote:
> If you already did it, then a patch will be great. Perhaps we should open
> an issue for other?
>
> Shai
> On Dec 15, 2011 11:44 AM, "Uwe Schindler" wrote:
>
>> Alternatively in overview.html (which fits bett
Yes, I could attach the patch there! Will you open it?
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: Shai Erera [mailto:ser...@gmail.com]
> Sent: Thursday, December 15, 2011 1:47 PM
> To: java-user@luce
If you already did it, then a patch will be great. Perhaps we should open
an issue for other?
Shai
On Dec 15, 2011 11:44 AM, "Uwe Schindler" wrote:
> Alternatively in overview.html (which fits better).
>
> There is only one limitation according to docs: The first sentence is
> copied over to the
Alternatively in overview.html (which fits better).
There is only one limitation according to docs: The first sentence is copied
over to the package description and if the first sentence is formatted as
or whatever, it kills the whole Javascript formatting. So to make it perfect
(and it looks r
If you remove the useless CSS in the HTML it looks perfect in package.html!
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: Shai Erera [mailto:ser...@gmail.com]
> Sent: Thursday, December 15, 2011 8:39 A