How to safely close Index objects?

2009-11-10 Thread Jacob Rhoden
Hi Guys, Given a class with two static variables, is the following safe? I.e., if I call "close" while something else is using the objects, do the objects simply hold a flag saying they need to be destroyed once the objects are finished being used, or do they not track if anything is current

Re: Lucene index write performance optimization

2009-11-10 Thread Otis Gospodnetic
This is what we have in Lucene in Action 2: ~/lia2$ ff \*Thread\*java ./src/lia/admin/CreateThreadedIndexTask.java ./src/lia/admin/ThreadedIndexWriter.java Otis -- Sematext is hiring -- http://sematext.com/about/jobs.html?mls Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR -

Re: Multiple threads under tomcat

2009-11-10 Thread Erick Erickson
1) Should the "FSDirectory dir" object be shared as some sort of static variable? I don't think it matters, do the simplest thing. The overhead in creating an FSDirectory is so small in relation to the other operations that I don't think you'd ever notice. 2) Should the "IndexSearcher searcher" or

Multiple threads under tomcat

2009-11-10 Thread Jacob Rhoden
Apologies if this info is already somewhere, but Google can't find it (: I am assuming the following code is completely thread safe: // Called from a servlet when a user action results in the index needing to be updated public static void rebuildIndex() { FSDirectory dir = new NIOFSDirecto

Sorting and Pagination with Lucene 2.9

2009-11-10 Thread sbhatti
I noticed that this question has been asked, but I could not find a good answer, so I am posting again. Is there a good example of sorting and pagination with Lucene 2.9? I have looked at the Solr 1.4 source code for examples and put together some code for testing, but it's not quite working. I have defi

Re: Could one filed include more than one value?

2009-11-10 Thread Wenhao Xu
Got it! Thanks! On Tue, Nov 10, 2009 at 1:51 AM, Anshum wrote: > I haven't tried the multi-field query parser. As for adding > multiple values while indexing, here's a code snippet that might help. > --snip starts-- > IndexWriter iw = new IndexWriter(...); > Document doc = new Docum

Re: Lucene index write performance optimization

2009-11-10 Thread Yonik Seeley
On Tue, Nov 10, 2009 at 11:43 AM, Jamie Band wrote: > As an aside note, is there any way for Lucene to support simultaneous writes > to an index? The indexing process is highly parallelized... just use multiple threads to add documents to the same IndexWriter. -Yonik http://www.lucidimagination.
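The pattern described in the reply above (several threads adding documents to one shared writer) can be sketched roughly as follows. This is a minimal sketch, not Lucene code: `SharedWriter` is a hypothetical stand-in for a thread-safe writer such as `IndexWriter`, and `ParallelIndexingSketch`, `indexInParallel`, the thread count, and the timeout are all illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a thread-safe writer such as Lucene's IndexWriter.
// It is NOT the Lucene API; it only models "many threads, one writer".
final class SharedWriter {
    private final List<String> docs = new ArrayList<>();

    // Synchronization stands in for the writer's own internal thread safety.
    synchronized void addDocument(String doc) {
        docs.add(doc);
    }

    synchronized int numDocs() {
        return docs.size();
    }
}

final class ParallelIndexingSketch {
    // Adds docCount documents to ONE shared writer from threadCount worker
    // threads and returns how many documents the writer holds afterwards.
    static int indexInParallel(int threadCount, int docCount) {
        final SharedWriter writer = new SharedWriter();
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        for (int i = 0; i < docCount; i++) {
            final int id = i;
            // Every task writes to the same writer instance.
            pool.execute(() -> writer.addDocument("doc-" + id));
        }
        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("indexing did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return writer.numDocs();
    }

    public static void main(String[] args) {
        System.out.println(indexInParallel(4, 1000)); // prints 1000
    }
}
```

The point of the sketch is that the application does not open one writer per thread; it creates a single writer and hands it to all workers, then waits for the pool to drain before closing or committing.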

Re: Lucene index write performance optimization

2009-11-10 Thread Glen Newton
You might try re-implementing, using ThreadPoolExecutor http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/ThreadPoolExecutor.html glen 2009/11/10 Jamie Band : > Hi There > > Our app spends a lot of time waiting for Lucene to finish writing to the > index. I'd like to minimize this. If y

Lucene index write performance optimization

2009-11-10 Thread Jamie Band
Hi There Our app spends a lot of time waiting for Lucene to finish writing to the index. I'd like to minimize this. If you have a moment to spare, please let me know if my LuceneIndex class presented below can be improved upon. It is used in the following way: luceneIndex = new LuceneIndex(C

Scoring formula - Average number of terms in IDF

2009-11-10 Thread kdev
Hi, I want to change the default scoring formula of lucene and one of the changes I want to perform is on the idf term. What I want to do is to include the average number of terms of the documents indexed in the collection in the idf method of the Similarity class. In order to change the scoring

Re: remove duplicate when merging indexes

2009-11-10 Thread m.harig
Thanks Ian , it works , thanks a lot. Ian Lea wrote: > > Try updateDocument(new Term("id", ""+i), doc). > > See javadocs for Term constructors. > > > > -- > Ian. > > > On Tue, Nov 10, 2009 at 9:47 AM, m.harig wrote: >> >> Thanks again >> >> this is my code , >> >>  doc.add(new Field("id",

Re: Directory.list() deprecation

2009-11-10 Thread Michael McCandless
On Mon, Nov 9, 2009 at 7:53 PM, Daniel Noll wrote: > On Tue, Nov 10, 2009 at 00:44, Michael McCandless > wrote: >> Stepping back, since presumably your app knows what it's storing in >> the directory, can't you filter for files you know you've created? >> What's the larger use case here? > > The

Re: remove duplicate when merging indexes

2009-11-10 Thread Simon Willnauer
Ian got it :) simon On Tue, Nov 10, 2009 at 10:58 AM, Ian Lea wrote: > Try updateDocument(new Term("id", ""+i), doc). > > See javadocs for Term constructors. > > > > -- > Ian. > > > On Tue, Nov 10, 2009 at 9:47 AM, m.harig wrote: >> >> Thanks again >> >> this is my code , >> >>  doc.add(new Fie

Re: remove duplicate when merging indexes

2009-11-10 Thread Ian Lea
Try updateDocument(new Term("id", ""+i), doc). See javadocs for Term constructors. -- Ian. On Tue, Nov 10, 2009 at 9:47 AM, m.harig wrote: > > Thanks again > > this is my code , > >  doc.add(new Field("id",""+i,Field.Store.YES,Field.Index.NOT_ANALYZED)); > >  doc.add(new Field("title", index

Re: Could one filed include more than one value?

2009-11-10 Thread Anshum
I haven't tried the multi-field query parser. As for adding multiple values while indexing, here's a code snippet that might help. --snip starts-- IndexWriter iw = new IndexWriter(...); Document doc = new Document(); doc.add(new Field("field", new FileReader(f11))); doc.add(new Field("f

Re: remove duplicate when merging indexes

2009-11-10 Thread m.harig
Thanks Simon, this is my code: doc.add(new Field("id",""+i,Field.Store.YES,Field.Index.NOT_ANALYZED)); doc.add(new Field("title", indexForm.getTitle(), Field.Store.YES, Field.Index.ANALYZED)); doc.add(new Field("conte

Re: remove duplicate when merging indexes

2009-11-10 Thread m.harig
Thanks again, this is my code: doc.add(new Field("id",""+i,Field.Store.YES,Field.Index.NOT_ANALYZED)); doc.add(new Field("title", indexForm.getTitle(), Field.Store.YES, Field.Index.ANALYZED)); doc.add(new Field("contents",

Re: Change norm encoding

2009-11-10 Thread Michael McCandless
Well, assuming there are no objections to the current approach, and performance checks out, I'll try to get this into 3.1... Mike On Tue, Nov 10, 2009 at 4:33 AM, Benjamin Heilbrunn wrote: > Hi, > > I applied > http://issues.apache.org/jira/secure/attachment/12411342/Lucene-1260.patch > That's

Re: Change norm encoding

2009-11-10 Thread Benjamin Heilbrunn
Hi, I applied http://issues.apache.org/jira/secure/attachment/12411342/Lucene-1260.patch That's exactly what I was looking for. The problem is that from now on I'm on a patched version, and I'm not very happy about breaking compatibility with the "original" jars... So is there a chance that this p

Re: remove duplicate when merging indexes

2009-11-10 Thread Simon Willnauer
On Tue, Nov 10, 2009 at 10:22 AM, m.harig wrote: > > Thanks simon > > How do I get the unique ID? Will it be added to the index? There is no such thing built into Lucene. You need to generate your own unique ID. Make sure you do NOT use the document ID, as it is volatile and is likely to change

Re: remove duplicate when merging indexes

2009-11-10 Thread m.harig
Thanks Simon. How do I get the unique ID? Will it be added to the index? Simon Willnauer wrote: > > You need some kind of unique ID for your documents, like a primary key in an > RDB. > If you have something like that you can call > IndexWriter#updateDocument(uniqueIDTerm, document); this wil

Re: Could one filed include more than one value?

2009-11-10 Thread Wenhao Xu
Thanks. Yes, I mean multi-valued fields, but I am still confused about how to use them. BTW, I also found another class, MultiFieldQueryParser. If I don't use multi-valued fields, it seems I can use this class to compose a query over multiple fields. But is there a difference in the results returned with the

Re: remove duplicate when merging indexes

2009-11-10 Thread Simon Willnauer
You need some kind of unique ID for your documents, like a primary key in an RDB. If you have something like that, you can call IndexWriter#updateDocument(uniqueIDTerm, document); this will delete the old document and add the new one. simon On Tue, Nov 10, 2009 at 10:05 AM, m.harig wrote: > > hello a
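The "delete the old document and add the new one" behaviour described above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: `UpdateByIdSketch` and its methods are hypothetical stand-ins, not the Lucene API, and the `id` field and the two simulated nightly runs are invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory "index"; a hypothetical model, NOT Lucene.
final class UpdateByIdSketch {
    private final List<Map<String, String>> docs = new ArrayList<>();

    // Plain add: always appends, so re-adding the same record duplicates it.
    void addDocument(Map<String, String> doc) {
        docs.add(doc);
    }

    // Update by key: first delete every document whose field equals value,
    // then add the new document.
    void updateDocument(String field, String value, Map<String, String> doc) {
        docs.removeIf(d -> value.equals(d.get(field)));
        docs.add(doc);
    }

    int numDocs() {
        return docs.size();
    }

    private static Map<String, String> doc(String id, String title) {
        Map<String, String> d = new HashMap<>();
        d.put("id", id);
        d.put("title", title);
        return d;
    }

    // Simulates two nightly runs over the same three records and returns
    // the number of documents left in the toy index.
    static int countAfterReindex(boolean useUpdate) {
        UpdateByIdSketch index = new UpdateByIdSketch();
        for (int run = 0; run < 2; run++) {
            for (int i = 0; i < 3; i++) {
                String id = "" + i;
                Map<String, String> d = doc(id, "title " + i + " (run " + run + ")");
                if (useUpdate) {
                    index.updateDocument("id", id, d);
                } else {
                    index.addDocument(d);
                }
            }
        }
        return index.numDocs();
    }

    public static void main(String[] args) {
        System.out.println(countAfterReindex(false)); // prints 6 (duplicates)
        System.out.println(countAfterReindex(true));  // prints 3 (one per id)
    }
}
```

In this model, the ID has to be a value the application stores itself in each document, which is why an internally assigned document number would not work as the key.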

remove duplicate when merging indexes

2009-11-10 Thread m.harig
Hello all, this is my situation: I have multiple indexes, for example index1, index2, index3 ... I have to update the indexes every night. If I open my IndexWriter with create=false (since I want to update the existing index), I get duplicate documents appended to the existing indexes,

Re: Could one filed include more than one value?

2009-11-10 Thread Anshum
If you are talking about multi valued fields, the answer is yes. You may create multiple field objects with the same name and add to the same document. It would lead to adding the values to the same field name for the document. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts ex
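A rough way to picture the multi-valued field idea from the reply above: a document is an ordered list of fields, and nothing prevents two fields from sharing a name. The classes below are a toy model written for illustration only; `MultiValuedFieldSketch`, its nested `Field` and `Document` types, and the field names are assumptions, not the Lucene classes of the same name.

```java
import java.util.ArrayList;
import java.util.List;

final class MultiValuedFieldSketch {
    // One (name, value) pair; a toy stand-in for a real field object.
    static final class Field {
        final String name;
        final String value;

        Field(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    // Toy document: an ordered list of fields in which names may repeat.
    static final class Document {
        private final List<Field> fields = new ArrayList<>();

        void add(Field field) {
            fields.add(field);
        }

        // Returns every value stored under the given name, in insertion order.
        List<String> getValues(String name) {
            List<String> values = new ArrayList<>();
            for (Field f : fields) {
                if (f.name.equals(name)) {
                    values.add(f.value);
                }
            }
            return values;
        }
    }

    // Builds one document holding two values under the same field name.
    static List<String> demo() {
        Document doc = new Document();
        doc.add(new Field("contents", "text of the first file"));
        doc.add(new Field("contents", "text of the second file"));
        doc.add(new Field("id", "pair-1"));
        return doc.getValues("contents");
    }

    public static void main(String[] args) {
        System.out.println(demo().size()); // prints 2
    }
}
```

Under this model, a pair of related files becomes a single document with two values for the same field, so a keyword match in either file returns the pair as one hit.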

Could one filed include more than one value?

2009-11-10 Thread Wenhao Xu
Hi, guys, I have such a problem: I have lots of files. In these files, two of them are related to each other and should be deemed as a whole (There is a map file to map them together). So they are somewhat like a set of pairs of files: , , ... . For a keyword search, the result should a

Re: Index maintaining/updating

2009-11-10 Thread Wenhao Xu
Thanks, guys. It helps a lot! W. On Mon, Nov 9, 2009 at 11:35 PM, Anshum wrote: > Hi Wenhao, > It's generally better to incrementally build your index and at the same > time. > Considering by this time you'd be a little aware of implementing/using > the Lucene API, here is what you could do. > Open t