Well, this rule doesn't seem to be working...
I tried to create an index of 90k documents with different merge factors.
Somehow, the file sizes in the final index were either 1MB or 8MB - nothing in
between. Am I missing something? Is the best way to really control the file
sizes to implement a custom Di
Ok Michael,
Thank you for your answers. I'll check the number of writers, as no other
process uses the directory. I'll let you know how I get it working.
Tom
--- On Sat 9.1.10, Michael McCandless
wrote:
From: Michael McCandless
Subject: Re: Concurrent access IndexReader / IndexWr
Can you double check that you're not creating 2 writers on the same
directory, somehow?
Or: is there any other process that removes files from this directory?
Answering your original questions...: commit/read does not require any
external synchronization or locking. You should generally keep you
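To help rule out the "two writers on the same directory" case, here is a minimal sketch of an in-process guard. It does not use any Lucene API: `SingleWriterGuard`, `acquire` and `release` are hypothetical names made up for illustration, and the guard only sees writers opened inside one JVM, so it cannot detect another process touching the directory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical diagnostic helper (not part of Lucene): remembers which index
// directories already have a writer open in this JVM.
class SingleWriterGuard {
    // Thread-safe set of normalized absolute directory paths with an open writer.
    private static final Set<Path> OPEN = ConcurrentHashMap.newKeySet();

    // Normalize so that "idx" and "idx/../idx" count as the same directory.
    private static Path key(String dir) {
        return Paths.get(dir).toAbsolutePath().normalize();
    }

    // Returns true if no writer was registered for this directory yet.
    static boolean acquire(String dir) {
        return OPEN.add(key(dir));
    }

    // Call when the writer for this directory is closed.
    static void release(String dir) {
        OPEN.remove(key(dir));
    }
}
```

One way to use it, under these assumptions, is to call `acquire` right before constructing the writer and to log a stack trace whenever it returns false; that points at the code path that opens the second writer.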
Here are two stack traces: add+remove a document:
Tom
---
Remove a document:
java.io.FileNotFoundException:
/home/ia/prod/current-deployment/indexes/advertisement/_0.cfs (No such file or
directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(R
Can you post the full FNFE stack trace?
Mike
On Fri, Jan 8, 2010 at 5:35 AM, legrand thomas wrote:
> Hi,
>
> I often get a FileNotFoundException when my single IndexWriter commits while
> the IndexReader also tries to read. My application is multithreaded (Tomcat
> uses the business APIs); I f
Michael,
The exception only occurs when the writer commits. But the IndexReader can
keep on reading.
The searches are performed by the IndexSearcher using the IndexReader.
My filesystem is ext2fs.
I give a few details below about the way I use them; the FNF exception occurs
in the "commit
> Is there another stemmer we can use that is perhaps not as
> aggressive as the Porter Stemmer.
"KStem is an alternative to Porter for developers looking for a less aggressive
stemmer. It was written by Bob Krovetz, ported to Lucene by Sergio Guzman-Lara
(UMASS Amherst)." [1]
[1]http://wiki.ap
Couldn't you just mod the PorterStemmer class for your requirements?
(we did and provided it a list of ignore words & phrases specific to
our needs)
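A rough sketch of the ignore-list idea, for illustration only. This is not the Porter algorithm and not the modified class mentioned above: `LightStemmer`, its two suffix rules ("ing", "er") and the minimum stem length of 3 are assumptions chosen to keep the example small. Words on the ignore list are returned unchanged apart from lowercasing.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy light stemmer (hypothetical, NOT Porter or KStem): strips a couple of
// suffixes and leaves any word on the ignore list untouched.
class LightStemmer {
    // Illustrative suffix rules only; a real stemmer has many more.
    private static final String[] SUFFIXES = {"ing", "er"};
    private static final int MIN_STEM_LENGTH = 3; // assumed threshold

    private final Set<String> ignoreWords = new HashSet<>();

    // Ignore words are stored lowercased so lookups are case-insensitive.
    LightStemmer(String... ignore) {
        for (String w : Arrays.asList(ignore)) {
            ignoreWords.add(w.toLowerCase(Locale.ROOT));
        }
    }

    String stem(String token) {
        String word = token.toLowerCase(Locale.ROOT);
        if (ignoreWords.contains(word)) {
            return word; // protected: never stemmed
        }
        for (String suffix : SUFFIXES) {
            int stemLength = word.length() - suffix.length();
            if (word.endsWith(suffix) && stemLength >= MIN_STEM_LENGTH) {
                return word.substring(0, stemLength);
            }
        }
        return word; // no rule applied
    }
}
```

With `new LightStemmer("lowe's", "browning")`, "walking" becomes "walk" and "painter" becomes "paint", while "Lowe's" and "Browning" come back as "lowe's" and "browning". The same check-the-list-first step could sit in front of a real stemmer instead of these toy rules.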
On Sat, Jan 9, 2010 at 4:00 AM, Jamie wrote:
> Hi All
>
> Is there another stemmer we can use that is perhaps not as aggressive as the
> Porter Stem
I don't know that much about Nutch, but Hadoop shouldn't really run
under Windows in production. If you use Windows for development, this
should not be a big issue.
Oatis is right: you should use Cygwin together with Hadoop. Look at
http://wiki.apache.org/hadoop/FAQ for initial info.
simon
On Sat, J
Hi All
Is there another stemmer we can use that is perhaps not as aggressive as
the Porter Stemmer? I.e., the stemming could remove ing's, er's, but not do
something so significant as to convert "Lowe's" to "Low".
Thanks
Jamie
Will Murnane wrote:
On Fri, Jan 8, 2010 at 16:27, Jamie wrote: