Re: IndexWriter memory leak?

2010-04-07 Thread Shai Erera
> I managed to get rid of the Reader's "memory leak" by actually setting the pointer to Tika's actual Reader to null in my wrapper when the wrapper is closed. But I still think that it would be nicer if IndexWriter wouldn't ma

RE: preserving markup of content ?

2010-04-07 Thread Uwe Schindler
The "simple" solution is very easy: index the markup-free document by adding it as a field with Field.Index.ANALYZED and Field.Store.NO, so it is searchable but not stored. Then add the same data again (this time with markup) with Field.Store.YES but Field.Index.NO, so it is stored but not searchable. If you like you can do this even with the

preserving markup of content ?

2010-04-07 Thread Sulman Sarwar
Hi All, I am working on some language data and I need to index/search it. I have used Lucene for indexing plain text documents before as well (no fancy tricks, just plain text indexing). The data that I have now is transcribed text and is heavily marked up. (It's mostly conversations and interviews

Re: IndexWriter memory leak?

2010-04-07 Thread Ruben Laguna
's Reader in my wrapper when the wrapper is closed. But I still think that it would be nicer if IndexWriter wouldn't maintain references to the Readers after indexing. [3] http://img.skitch.com/20100407-ntn2kg13fx49wx4q118bp9h1hb.jpg On Wed, Apr 7, 2010 at 10:35 PM, Ruben Laguna wrote: >
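
The workaround described in this message (dropping the wrapped Reader reference when the wrapper is closed) could look roughly like the sketch below. The class name ReleasingReader and the isReleased() helper are hypothetical and not taken from the thread, and the real wrapper around Tika's Reader may differ. The idea is only that once close() clears the field, anything still holding the wrapper no longer keeps the large underlying Reader reachable.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/**
 * Hypothetical wrapper: forwards reads to a delegate Reader and drops
 * its reference to the delegate when closed, so the delegate can be
 * garbage collected even if the wrapper itself is still referenced.
 */
class ReleasingReader extends Reader {
    private Reader delegate; // becomes null after close()

    ReleasingReader(Reader delegate) {
        this.delegate = delegate;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (delegate == null) {
            throw new IOException("Reader is closed");
        }
        return delegate.read(buf, off, len);
    }

    @Override
    public void close() throws IOException {
        Reader r = delegate;
        delegate = null; // release the reference before closing
        if (r != null) {
            r.close();
        }
    }

    /** Illustration-only helper: true once the delegate has been released. */
    boolean isReleased() {
        return delegate == null;
    }
}

class ReleasingReaderDemo {
    public static void main(String[] args) throws IOException {
        // StringReader stands in for the (assumed) Tika Reader.
        ReleasingReader reader = new ReleasingReader(new StringReader("some content"));
        char[] buf = new char[4];
        reader.read(buf, 0, buf.length);
        reader.close();
        System.out.println(reader.isReleased());
    }
}
```

This only shrinks what a lingering reference retains; it does not answer the question of why the references are kept after indexing.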

IndexWriter memory leak?

2010-04-07 Thread Ruben Laguna
osed after IndexWriter.updateDocument. Each one of those Readers retains 1MB. The question is why IndexWriter holds references to those Readers after the Documents have been indexed. [1] http://img.skitch.com/20100407-1183815yiausisg73u9wfgscsj.jpg [2] http://img.skitch.com/20100407-b86irkp7e4uif2wq1dd4t899qb.jpg -- /Rubén

Re: custom low-level indexer (to speed things up) when fields, terms and docids are in order

2010-04-07 Thread britske
Just to update and close this thread (I forgot about it): after investigation it turns out that 75% of the time of the custom async-indexer (see original email) was spent in FieldInfos.add(...). More specifically, in the part where the field name is interned using String.intern(). Copy/pasting and usi

Is it possible to have Lucene and Solr (or two Solr instances) pointing at the same index directory?

2010-04-07 Thread Paolo Castagna
Hi, (I know that this is probably not recommended and not a common scenario, but...) Is it possible to have an application using Lucene and a separate (i.e. different JVM) instance of Solr both pointing at the same index and read/write to the index from both applications? I am trying (separately