Re: Lucene Merge failing on Open Files

2011-04-06 Thread Paul Taylor
On 04/04/2011 21:06, Simon Willnauer wrote: On Mon, Apr 4, 2011 at 9:59 PM, Paul Taylor wrote: On 04/04/2011 20:13, Michael McCandless wrote: How are you merging these indices? (IW.addIndexes?). Are you changing any of IW's defaults, eg mergeFactor? Mike Hi Mike I have indexWriter.setMax

Re: Question about open files

2011-04-06 Thread Ian Lea
Yes, to the best of my knowledge and experience, closing readers and writers releases the file handles. -- Ian. On Wed, Apr 6, 2011 at 12:59 AM, Jean-Baptiste Reure wrote: > We are using version 3.0.3. So you can confirm that closing the writer (and > the reader created from that writer) shoul

Re: Help with delimited text

2011-04-06 Thread Ian Lea
You can add multiple values for a field to a single document. Document doc = new Document(); String[] paths = whatever.split(","); for (String p : paths) { doc.add(new Field("path", p, whatever ...)); } For searching, assuming you only want to be able to wildcard on path delimiters, you could i

Re: Highlighting a phrase with "Single"

2011-04-06 Thread Koji Sekiguchi
(11/04/06 14:01), shrinath.m wrote: If there is a phrase in search, the highlighter highlights every word separately.. Like this : I love Lucene Instead what I want is like this : I love Lucene Not sure whether it is my mailer's problem or not, but I don't see the difference between the above two. But reading t

Indexation takes a lot of time :(

2011-04-06 Thread ZYWALEWSKI, DANIEL (DANIEL)
Hello Champions !! I have a problem with indexation (or should I say its time). The elements to index are represented by my own class, DocumentToIndex, which consists of Fields (one Field is a fieldName and fieldValue). All DocumentToIndex objects are kept in an ArrayList. When I start indexing f

Re: Indexation takes a lot of time :(

2011-04-06 Thread Ian Lea
15 minutes for 28k docs does sound very slow. In my experience it's usually the reading of the raw data from database or network or wherever that turns out to be the problem. You could easily check that by commenting out the lucene calls in your code. See also http://wiki.apache.org/lucene-java/

Re: Re: Re: A likely bug of TermsPosition.nextPosition

2011-04-06 Thread 袁武 [GMail]
Dear Mike: I have run the CheckIndex of branch_3x, and the result report is listed below: [oracle@server bin]$ java -classpath ./ org.apache.lucene.index.CheckIndex /data/Index/URL/Generic/ -fix NOTE: testing will be more thorough if you run java with '-ea:org.apache.lucene...', so assertions

Re: Indexation takes a lot of time :(

2011-04-06 Thread findbestopensource
Hello Daniel, The code seems to be fine. I think you are measuring the time for the entire program, which may also read the data from an external source and prepare the array list. Measure the time for indexing only. Regards Aditya www.findbestopensource.com On Wed, Apr 6, 2011 at 2:38 PM, ZYWALEWSKI,
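
A minimal sketch of the suggestion above, timing the data-preparation phase and the indexing phase separately. PhaseTimer and both loop bodies are hypothetical stand-ins, not code from the thread; in a real program the second phase would be the loop that calls IndexWriter.addDocument.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical harness (not from the thread): times each phase on its own.
final class PhaseTimer {
    /** Returns the elapsed wall-clock nanoseconds taken by the given task. */
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();

        // Stand-in for reading the raw data from a database or network.
        long loadNanos = timeNanos(() -> {
            for (int i = 0; i < 28_000; i++) {
                docs.add("doc-" + i);
            }
        });

        // Stand-in for the indexing loop; real code would add each
        // document to the IndexWriter here.
        long indexNanos = timeNanos(() -> {
            for (String d : docs) {
                d.hashCode();
            }
        });

        System.out.printf("load: %d ms, index: %d ms%n",
                loadNanos / 1_000_000, indexNanos / 1_000_000);
    }
}
```

If the load figure dominates, the slowdown is probably outside Lucene, which matches the earlier advice in this thread to comment out the Lucene calls and compare.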

Re: Concurrent Issue

2011-04-06 Thread findbestopensource
You might have closed the IndexReader object but still be trying to access the search results. Regards Aditya www.findbestopensource.com On Tue, Apr 5, 2011 at 5:26 PM, Yogesh Dabhi wrote: > Hi > > > > My application is clustered on JBoss application servers & lucene > directory was shared. > > > > Conc

Re: Highlighting a phrase with "Single"

2011-04-06 Thread shrinath.m
That's right :) Thanks Koji :) On Wed, Apr 6, 2011 at 3:31 PM, Koji Sekiguchi [via Lucene] < ml-node+2784321-1329059645-376...@n3.nabble.com> wrote: > (11/04/06 14:01), shrinath.m wrote: > > > If there is a phrase in search, the highlighter highlights every word > > separately.. > > Like this : >

Re: Help with delimited text

2011-04-06 Thread Mark Wiltshire
Thanks Ian, I have managed to do that and through Luke I get my expected results. Here is my index code now. StringTokenizer st = buildSubjectArea(dbConnection, oid); int tokenCount = 0; while (st.hasMoreTokens()){ tokenCount++;

Re: Lucene Merge failing on Open Files

2011-04-06 Thread Michael McCandless
Can you turn on IndexWriter's infoStream, get the failure to happen, and post the resulting output? How are you adding the multiple indices together? Can you post the code that does that? The number of open file handles needed during indexing is a function of how many merges are running and how

Re: Question about open files

2011-04-06 Thread Erick Erickson
I suspect you're already aware of this, but I've overlooked the obvious so many times I thought I'd mention it... A classic mistake is to assign a reader with reopen and not close the old reader, see: http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/index/IndexReader.html#reopen() <
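
A sketch of the pattern described above: after a reopen, close the old reader only when a different instance came back. To stay self-contained it uses java.io.Closeable as a stand-in for IndexReader; ReaderSwap and refresh are hypothetical names, and the reopen step is passed in as a function rather than calling Lucene directly.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.function.UnaryOperator;

// Hypothetical helper (not part of Lucene): swap in a reopened reader
// and release the old one so its file handles are not leaked.
final class ReaderSwap {
    /**
     * Applies the reopen function to the current reader. If a different
     * instance is returned, the old one is closed. If the same instance is
     * returned (nothing changed), it is left open.
     */
    static <T extends Closeable> T refresh(T current, UnaryOperator<T> reopen)
            throws IOException {
        T fresh = reopen.apply(current);
        if (fresh != current) {
            current.close(); // forgetting this is the classic leak
        }
        return fresh;
    }
}
```

The caller assigns the result back to the field that holds the reader, so that only the newest instance stays referenced.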

Re: Help with delimited text

2011-04-06 Thread Erick Erickson
A TermQuery is really dumb. It doesn't do anything at all to the input; it assumes you've done all that up front. Try parsing a query rather than using TermQuery. And I suspect you'll have problems with casing, but that's another story. Best Erick On Wed, Apr 6, 2011 at 6:33 AM, Mark Wilts

Re: Concurrent Issue

2011-04-06 Thread Piotr Pezik
Only to second this explanation. I got the same exception in a web application with a single IndexReader, accessed by many threads. The index gets updated every half hour or so, so I closed the old IndexReader and opened a new one every now and then. Even though the method for obtaining the

Indexing Non-Textual Data

2011-04-06 Thread Chris Spencer
Hi, I'm new to Lucene, so forgive me if this is a newbie question. I have a dataset composed of several thousand lists of 128 integer features, each list associated with a class label. Would it be possible to use Lucene as a classifier, by indexing the label with respect to these integer features,

Re: Indexing Non-Textual Data

2011-04-06 Thread Otis Gospodnetic
Hi Chris, Yes, people have done classification with Lucene before. Have a look at http://search-lucene.com/?q=classifier&fc_project=Lucene for some discussions and actual code (in old JIRA issues) Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: ht

Re: Question about open files

2011-04-06 Thread Jean-Baptiste Reure
I found what the problem was: we were closing the IndexSearcher but not the underlying IndexReader (I wrongly assumed that closing one would close the other). Everything is working perfectly now, thanks for the help. JB On 6 April 2011 22:13, Erick Erickson wrote: > I suspect you're already awa

Re: Concurrent Issue

2011-04-06 Thread findbestopensource
You are trying to access a reader which has already been closed by some other thread. 1. Keep a reference count for the reader you create. 2. Have a common function through which all functions will retrieve Reader objects 3. Once the index has changed, create a new reader and warm it up 4. When the new re
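
The reference-counting idea in step 1 can be sketched as below. This is an illustration under assumptions, not the poster's code: RefCounted is a hypothetical wrapper, and java.io.Closeable stands in for the reader. IndexReader itself also offers incRef() and decRef(), which may make a wrapper like this unnecessary.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical wrapper: closes the resource when the last holder releases it. */
final class RefCounted<T extends Closeable> {
    private final T resource;
    // Starts at 1: the reference held by whoever created the wrapper.
    private final AtomicInteger refs = new AtomicInteger(1);

    RefCounted(T resource) {
        this.resource = resource;
    }

    /** Takes a reference; returns false if the resource is already closed. */
    boolean tryAcquire() {
        while (true) {
            int n = refs.get();
            if (n <= 0) {
                return false;
            }
            if (refs.compareAndSet(n, n + 1)) {
                return true;
            }
        }
    }

    /** Only valid between a successful tryAcquire() and the matching release(). */
    T get() {
        return resource;
    }

    /** Drops one reference; the last release closes the resource. */
    void release() throws IOException {
        int n = refs.decrementAndGet();
        if (n == 0) {
            resource.close();
        } else if (n < 0) {
            throw new IllegalStateException("released more often than acquired");
        }
    }
}
```

Each search thread would call tryAcquire() before using the reader and release() in a finally block. The thread that swaps in a new reader releases the creator's reference on the old one, so the old reader closes only after the last in-flight search finishes.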