Re: Is there a way to limit the size of an index?

2010-01-09 Thread Dvora
Well, this rule doesn't seem to be working... I tried to create an index of 90k documents with different merge factors. Somehow, the file sizes in the final index were either 1MB or 8MB, with nothing in between. Am I missing something? Is the best way to really control the file sizes to implement a custom Di

Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-09 Thread legrand thomas
Ok Michael, Thank you for your answers. I'll check the number of writers, as no other process uses the directory. I'll let you know how I make it work. Tom --- On Sat 9.1.10, Michael McCandless wrote: From: Michael McCandless Subject: Re: Concurrent access IndexReader / IndexWr

Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-09 Thread Michael McCandless
Can you double check that you're not creating 2 writers on the same directory, somehow? Or: is there any other process that removes files from this directory? Answering your original questions...: commit/read does not require any external synchronization or locking. You should generally keep you

Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-09 Thread legrand thomas
Here are two stack traces: add+remove a document: Tom --- Remove a document: java.io.FileNotFoundException: /home/ia/prod/current-deployment/indexes/advertisement/_0.cfs (No such file or directory) at java.io.RandomAccessFile.open(Native Method) at java.io.RandomAccessFile.<init>(R

Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-09 Thread Michael McCandless
Can you post the full FNFE stack trace? Mike On Fri, Jan 8, 2010 at 5:35 AM, legrand thomas wrote: > Hi, > > I often get a FileNotFoundException when my single IndexWriter commits while > the IndexReader also tries to read. My application is multithreaded (Tomcat > uses the business APIs); I f

Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-09 Thread legrand thomas
Michael, The exception only occurs when the writer commits, but the IndexReader can keep on reading. The searches are performed by the IndexSearcher using the IndexReader. My filesystem is ext2fs. I give a few details below about the way I use them; the FNF exception occurs in the "commit

Re: Search query problem

2010-01-09 Thread Ahmet Arslan
> Is there another stemmer we can use that is perhaps not as > aggressive as the Porter Stemmer. "KStem is an alternative to Porter for developers looking for a less aggressive stemmer. It was written by Bob Krovetz, ported to Lucene by Sergio Guzman-Lara (UMASS Amherst)." [1] [1]http://wiki.ap

Re: Search query problem

2010-01-09 Thread Shashi Kant
Couldn't you just modify the PorterStemmer class for your requirements? (We did, and provided it with a list of ignore words & phrases specific to our needs.) On Sat, Jan 9, 2010 at 4:00 AM, Jamie wrote: > Hi All > > Is there another stemmer we can use that is perhaps not as aggressive as the > Porter Stem

Re: a complete solution for building a website search with lucene

2010-01-09 Thread Simon Willnauer
I don't know that much about Nutch, but Hadoop shouldn't really run under Windows in production. If you use Windows for development this should not be a big issue. Oatis is right: you should use Cygwin together with Hadoop. Look at http://wiki.apache.org/hadoop/FAQ for initial info. simon On Sat, J

Re: Search query problem

2010-01-09 Thread Jamie
Hi All Is there another stemmer we can use that is perhaps not as aggressive as the Porter Stemmer? I.e., the stemming could remove ing's and er's, but not do something so significant as to convert "Lowe's" to "Low". Thanks Jamie Will Murnane wrote: On Fri, Jan 8, 2010 at 16:27, Jamie wrote: