date:20110222

ParallelReader and updateDocument don't play nice?

2011-02-22 Thread Groose, Brian

I have been looking at using ParallelReader as its documentation indicates, to allow certain fields to be updated while most of the fields will not be updated. However, this does not seem possible. Let's say I have two indexes, A and B, which are used in a ParallelReader. If I update a documen
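One way to see why this combination is fragile: ParallelReader lines documents up by their internal document number, so both indexes must hold the same documents in the same order, while updateDocument deletes the old document and adds the new version at the end of the index. The sketch below is a toy model in plain Java, not Lucene code. The class `ParallelAlignmentSketch`, its methods, and the keys are made up for illustration, and the model collapses deletion and segment merging into a single step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only (NOT Lucene API): each "index" is a list of primary keys,
// and a key's position in the list stands in for its internal doc number.
class ParallelAlignmentSketch {

    // Mimics delete-then-add update semantics: the old entry is removed
    // and the new version is appended at the end.
    static void update(List<String> index, String key) {
        index.remove(key); // old version goes away, later entries shift down
        index.add(key);    // new version gets the highest position
    }

    // Two parallel indexes line up only if every position holds the same key.
    static boolean aligned(List<String> a, List<String> b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(Arrays.asList("doc1", "doc2", "doc3"));
        List<String> b = new ArrayList<>(Arrays.asList("doc1", "doc2", "doc3"));
        System.out.println("before update: aligned = " + aligned(a, b));
        update(b, "doc1"); // update only the frequently changing index
        System.out.println("index B order: " + b);
        System.out.println("after update: aligned = " + aligned(a, b));
    }
}
```

In this model, updating a single entry in B alone is enough to break the position-by-position match with A.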

Are you going to enter Google Summer of Code?

2011-02-22 Thread Lasantha Bandara

Hi, When I searched for Java projects that might be accepted for GSoC 2011, I found Lucene. It matches my interests nicely. Could you please tell me whether you are going to take part this time? If yes, what kind of ideas will you present? I would like to start working quite early on this

Re: IndexWriter.close() performance issue

2011-02-22 Thread Mark Kristensson

I'm resurrecting this old thread because this issue is now reaching a critical point for us and I'm going to have to modify the Lucene source code for it to continue to work for us. Just a quick refresher: we have one index with several hundred thousand unique field names and found that opening an

[Announce] RankingAlgorithm ver 1.1

2011-02-22 Thread Nagendra Nagarajayya

Hi! I would like to announce the release of RankingAlgorithm ver 1.1 and would like to invite you to try it out. It is very good and does not need any changes to your existing indexes but the way they are accessed, ranked and scored changes. This version has Score Boosting enabling Document

Re: Serialization of Lucene Document objects

2011-02-22 Thread Erik Fäßler

Hi Simon, thanks for your answer. My comments below: so you mean you would want to do that analysis on the client side and only shoot the already tokenized values to the server? What exactly is too slow? Can you provide more info what the problem is? After all I think you should ask on the sol

Re: Serialization of Lucene Document objects

2011-02-22 Thread Simon Willnauer

On Tue, Feb 22, 2011 at 2:58 PM, Erik Fäßler wrote:
> Hi there,
>
> I'd like to serialize some Lucene Documents I've built before. My goal is to
> send the documents over a http connection to a Solr server which then should
> add them to its index.
ok so why do you build lucene documents if you

Serialization of Lucene Document objects

2011-02-22 Thread Erik Fäßler

Hi there, I'd like to serialize some Lucene Documents I've built before. My goal is to send the documents over a http connection to a Solr server which then should add them to its index. I thought this would work as the Document class implements Serializable as do the Fields. Unfortunately,

Re: Suggest search terms

2011-02-22 Thread Fernando Wasylyszyn

Well, actually it depends. If your suggestion terms correspond to the terms in your "main" index, then you can use TermEnum#docFreq(). Otherwise, if you develop a separate index for the suggestions (that do not correspond with the terms in your main index), then you can just add a calculat

Re: recurrent IO/CPU peaks

2011-02-22 Thread Michael McCandless

On Tue, Feb 22, 2011 at 3:15 AM, wrote:
> Here is how long it took for each run :
> - default : run 1 = 55 minutes, run 2 = 59 minutes
> - balanced : run 1 = 145 minutes, run 2 = 121 minutes
>
> Is that an expected behavior?
Hmm BalancedSegmentMergePolicy was over 2X slower to optimize...? Th

Re: Suggest search terms

2011-02-22 Thread Simon Willnauer

On Tue, Feb 22, 2011 at 11:23 AM, Clemens Wyss wrote:
> Fernando, Uwe thanks for your suggestions.
> Is it possible to get the number of "hits" per term?
> ferrari (125)
> lamborghini (34)
> ...
I think you can just call TermEnum#docFreq(), no?
simon
>
>> -Original Message-
>> From
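For readers unfamiliar with the term, a document frequency is the number of documents that contain a term, which is the kind of per-term count shown in the "ferrari (125)" listing. The following is a plain-Java toy model of that number, not a use of the Lucene API. The class `DocFreqSketch`, the whitespace tokenization, and the sample documents are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model only (NOT Lucene API): counts, for each term, how many
// documents contain it at least once.
class DocFreqSketch {

    static Map<String, Integer> docFreq(List<String> docs) {
        Map<String, Integer> df = new TreeMap<>();
        for (String doc : docs) {
            // A set ensures a term repeated inside one document counts once.
            Set<String> unique = new HashSet<>(Arrays.asList(doc.toLowerCase().split("\\s+")));
            for (String term : unique) {
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "ferrari ferrari red",
                "ferrari blue",
                "lamborghini red");
        // Print each term in the "term (count)" style used in the thread.
        for (Map.Entry<String, Integer> e : docFreq(docs).entrySet()) {
            System.out.println(e.getKey() + " (" + e.getValue() + ")");
        }
    }
}
```

Note that this counts documents, not occurrences: "ferrari" appearing twice in one document still adds only one to its count.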

AW: Suggest search terms

2011-02-22 Thread Clemens Wyss

Fernando, Uwe thanks for your suggestions.
Is it possible to get the number of "hits" per term?
ferrari (125)
lamborghini (34)
...
> -Original Message-
> From: Fernando Wasylyszyn [mailto:ferw...@yahoo.com.ar]
> Sent: Monday, 21 February 2011 21:11
> To: java-user@lucene.apache.

Re: lucene3.0.3 | get correct document in case of multiple Boolean query in search criteria

2011-02-22 Thread Ranjit Kumar

Hi, As mentioned above, I am using a query like: criteria = (sql OR sqlserver OR "sql server") AND java AND delphi In the above scenario I need hits (documents) containing at least one occurrence of (sql OR sqlserver OR "sql server"). Also, java and delphi must be present in the document. Still I have not g
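The intended semantics of the criteria can be pinned down with a small predicate. This is a plain-Java sketch of the matching rule only, not Lucene query code. The class `CriteriaSketch`, the lowercasing, and the whitespace tokenization are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only (NOT Lucene API): evaluates
// (sql OR sqlserver OR "sql server") AND java AND delphi
// against a lowercased, whitespace-tokenized text.
class CriteriaSketch {

    // True if the two tokens occur next to each other, in this order.
    static boolean hasPhrase(List<String> tokens, String first, String second) {
        for (int i = 0; i + 1 < tokens.size(); i++) {
            if (tokens.get(i).equals(first) && tokens.get(i + 1).equals(second)) {
                return true;
            }
        }
        return false;
    }

    static boolean matches(String text) {
        List<String> tokens = Arrays.asList(text.toLowerCase().split("\\s+"));
        // At least one alternative from the OR group must be present...
        boolean sqlGroup = tokens.contains("sql")
                || tokens.contains("sqlserver")
                || hasPhrase(tokens, "sql", "server");
        // ...and both remaining terms are required.
        return sqlGroup && tokens.contains("java") && tokens.contains("delphi");
    }

    public static void main(String[] args) {
        System.out.println(matches("Java Delphi and SQL Server"));
        System.out.println(matches("java delphi"));
    }
}
```

Under this tokenization any text containing the phrase "sql server" also contains the token "sql", so the phrase alternative never changes the outcome here. It is kept only to mirror the criteria as written.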

Re: recurrent IO/CPU peaks

2011-02-22 Thread v . sevel

Hi, I did some tests with the BalancedSegmentMergePolicy, looking specifically at the optimize. I have an index that is 70 GB in size and contains around 35 million documents. I duplicated the index 4 times, and I ran 2 optimizes with the default merge policy, and 2 with the balanced policy. He

Re: Lucene TermVector

2011-02-22 Thread Simon Willnauer

Hey,
On Mon, Feb 21, 2011 at 8:56 PM, Ajay Anandan wrote:
> Hi
> I am trying to implement an Expectation Maximization algorithm for document
> clustering. I am planning to use Lucene Term Vectors for finding similarity
> between 2 documents. There are 2 kinds of EM algos using naive Bayes: the

14 matches
