I have been looking at using ParallelReader, as its documentation suggests, to
allow certain fields to be updated while most fields remain unchanged.
However, this does not seem possible. Let's say I have two indexes, A and B,
which are used in a ParallelReader. If I update a documen
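The obstacle hinted at above is that ParallelReader pairs its sub-readers purely by document number, and an update in Lucene is a delete plus a re-add that appends the new version at the end. A minimal stdlib model of that desynchronization, with made-up field values (no Lucene required):

```java
import java.util.ArrayList;
import java.util.List;

public class ParallelAlignment {
    public static void main(String[] args) {
        // Index A holds the mutable fields, index B the static ones.
        // A ParallelReader pairs documents purely by position (doc number).
        List<String> indexA = new ArrayList<>(List.of("title:foo", "title:bar"));
        List<String> indexB = new ArrayList<>(List.of("body:FOO",  "body:BAR"));

        // Before any update, positions line up: doc 0 is foo/FOO.
        System.out.println(indexA.get(0) + " | " + indexB.get(0));

        // An "update" is delete + re-add, which moves the new version
        // to the end of index A only.
        indexA.remove(0);
        indexA.add("title:foo-v2");

        // Doc 0 in A is now "title:bar", but doc 0 in B is still "body:FOO":
        // the position-based alignment is silently broken.
        System.out.println(indexA.get(0) + " | " + indexB.get(0));
    }
}
```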
Hi,
When I searched for Java projects that would be acceptable for GSoC 2011, I
found Lucene. It matches my interests nicely. Could you please tell me
whether you are going to take part this time? If so, what kind of ideas
will you present? I would like to start working on this quite early
I'm resurrecting this old thread because this issue is now reaching a
critical point for us and I'm going to have to modify the Lucene source code
for it to continue to work for us.
Just a quick refresher: we have one index with several hundred thousand
unique field names and found that opening an
Hi!
I would like to announce the release of RankingAlgorithm ver 1.1 and
would like to invite you to try it out. It needs no changes to your
existing indexes, but the way they are accessed, ranked, and scored
changes. This version has Score Boosting, enabling
Document
Hi Simon,
thanks for your answer. My comments below:
so you mean you would want to do that analysis on the client side and
only shoot the already tokenized values to the server?
What exactly is too slow? Can you provide more info on what the problem is?
After all I think you should ask on the sol
On Tue, Feb 22, 2011 at 2:58 PM, Erik Fäßler wrote:
> Hi there,
>
> I'd like to serialize some Lucene Documents I've built before. My goal is to
> send the documents over a http connection to a Solr server which then should
> add them to its index.
OK, so why do you build Lucene documents if you
Hi there,
I'd like to serialize some Lucene Documents I've built before. My goal
is to send the documents over a http connection to a Solr server which
then should add them to its index.
I thought this would work as the Document class implements Serializable
as do the Fields. Unfortunately,
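In Lucene 3.x, Document and Field do implement java.io.Serializable, so the round trip itself is plain Java serialization. A stdlib sketch with a stand-in class (the Doc class and its fields are hypothetical, not Lucene's API):

```java
import java.io.*;

public class SerializeDemo {
    // Stand-in for a Lucene Document: a serializable field/value holder.
    static class Doc implements Serializable {
        private static final long serialVersionUID = 1L;
        final String field;
        final String value;
        Doc(String field, String value) { this.field = field; this.value = value; }
    }

    // Serialize to bytes and back, as one would before/after an HTTP hop.
    static Doc roundTrip(Doc d) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(d);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Doc) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Doc copy = roundTrip(new Doc("title", "Hello Solr"));
        System.out.println(copy.field + "=" + copy.value);
    }
}
```

Note, though, that Solr's HTTP update endpoints expect XML or JSON documents (or javabin via SolrJ's SolrInputDocument), not Java-serialized Lucene Documents, which is likely where this approach runs into trouble.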
Well, actually it depends
If your suggestion terms correspond with the terms in your "main" index, then
you can use TermEnum#docFreq().
Otherwise, if you develop a separate index for the suggestions (one that does
not correspond with the terms in your main index), then you can just add a
calculat
On Tue, Feb 22, 2011 at 3:15 AM, wrote:
> Here is how long it took for each run :
> - default : run 1 = 55 minutes, run 2 = 59 minutes
> - balanced : run 1 = 145 minutes, run 2 = 121 minutes
>
> Is that an expected behavior?
Hmm BalancedSegmentMergePolicy was over 2X slower to optimize...?
Th
On Tue, Feb 22, 2011 at 11:23 AM, Clemens Wyss wrote:
> Fernando, Uwe thanks for your suggestions.
> Is it possible to get the number of "hits" per term?
> ferrari (125)
> lamborghini (34)
> ...
I think you can just call TermEnum#docFreq(), no?
simon
>
>> -Original Message-
>> From
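The docFreq suggestion above is exactly the per-term hit count being asked for: the number of documents in which a term occurs at least once. A stdlib sketch of that count over a toy corpus (in Lucene, TermEnum#docFreq() returns this directly):

```java
import java.util.*;

public class DocFreqDemo {
    // Document frequency: the number of documents in which the term
    // occurs at least once.
    static int docFreq(List<Set<String>> docs, String term) {
        int n = 0;
        for (Set<String> doc : docs) {
            if (doc.contains(term)) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<Set<String>> docs = List.of(
            Set.of("ferrari", "fast"),
            Set.of("ferrari", "red"),
            Set.of("lamborghini", "fast"));

        System.out.println("ferrari (" + docFreq(docs, "ferrari") + ")");
        System.out.println("lamborghini (" + docFreq(docs, "lamborghini") + ")");
    }
}
```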
Fernando, Uwe thanks for your suggestions.
Is it possible to get the number of "hits" per term?
ferrari (125)
lamborghini (34)
...
> -Original Message-
> From: Fernando Wasylyszyn [mailto:ferw...@yahoo.com.ar]
> Sent: Monday, 21 February 2011 21:11
> To: java-user@lucene.apache.
Hi,
As mentioned above, I am using a query like:
criteria = (sql OR sqlserver OR "sql server") AND java AND delphi
In the above scenario I need hits (documents) containing at least one occurrence
of (sql OR sqlserver OR "sql server"). Also, java and delphi must be present in
the document.
Still I have not g
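The intended semantics can be sketched in plain Java (in Lucene this would be a BooleanQuery with the OR group as one MUST clause and java/delphi as further MUST clauses; the substring matching below is a simplification, not an analyzer):

```java
public class CriteriaDemo {
    // (sql OR sqlserver OR "sql server") AND java AND delphi
    static boolean matches(String text) {
        String t = text.toLowerCase();
        boolean sqlGroup = t.contains("sqlserver")
                || t.contains("sql server")
                || t.contains("sql");
        return sqlGroup && t.contains("java") && t.contains("delphi");
    }

    public static void main(String[] args) {
        System.out.println(matches("Knows sql, java and delphi"));  // true
        System.out.println(matches("Knows java and delphi only"));  // false
        System.out.println(matches("Knows sql server and java"));   // false: no delphi
    }
}
```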
Hi,
I did some tests with the BalancedSegmentMergePolicy, looking specifically
at optimize. I have an index that is 70 GB and contains around
35 million documents.
I duplicated the index 4 times, and I ran 2 optimizes with the default
merge policy and 2 with the balanced policy.
He
Hey,
On Mon, Feb 21, 2011 at 8:56 PM, Ajay Anandan wrote:
> Hi
> I am trying to implement an Expectation Maximization algorithm for document
> clustering. I am planning to use Lucene Term Vectors for finding similarity
> between 2 documents. There are 2 kinds of EM algos using naive Bayes: the
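The term-vector similarity step mentioned above is typically cosine similarity over term-frequency vectors. A stdlib sketch with plain maps (in Lucene you would fill these maps from the index's term vectors; the document contents here are made up):

```java
import java.util.Map;

public class CosineDemo {
    // Cosine similarity between two term-frequency vectors:
    // dot(a, b) / (|a| * |b|).
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) dot += e.getValue() * other;
        }
        double normA = 0, normB = 0;
        for (int v : a.values()) normA += (double) v * v;
        for (int v : b.values()) normB += (double) v * v;
        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        Map<String, Integer> d1 = Map.of("lucene", 3, "search", 1);
        Map<String, Integer> d2 = Map.of("lucene", 1, "index", 2);
        System.out.printf("%.3f%n", cosine(d1, d2));
    }
}
```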