RE: HTMLStripReader, HTMLStripCharFilter

2010-04-26 Thread Uwe Schindler
To reset this token stream you have to wrap it with a CachingTokenFilter. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Justin [mailto:cry...@yahoo.com] > Sent: Tuesday, April 27, 2010 1:16 AM > To: ja

Re: HTMLStripReader, HTMLStripCharFilter

2010-04-26 Thread Justin
Thanks for the update! I appreciate the hard work. Perhaps someone can help me with the use of HTMLStripCharFilter... I get an exception (3.1-dev) similar to the one reported here (2.9): https://issues.apache.org/jira/browse/LUCENE-1695 With the following code: Analyzer htmlStripAnalyze

RE: IndexWriter and memory usage

2010-04-26 Thread Woolf, Ross
We are still plagued by this issue. I tried applying the patch mentioned but this did not resolve the issue. I once tried to attach images from the heap dump to send out to the group but the server removed them so I have posted the images on a public service with links this time. I would ap

Re: an analyzer map at hand?

2010-04-26 Thread Robert Muir
On Mon, Apr 26, 2010 at 11:04 AM, Paul Libbrecht wrote: > > is this something that is shared somewhere? > (I know everyone has its own favorites). > thanks in advance > Not really; in general it's a little bit more organized in Lucene trunk, though. Analyzers and snowball were merged, and analyzers

an analyzer map at hand?

2010-04-26 Thread Paul Libbrecht
Hello Luceners, I am sure I'm not the only one having such a snippet in my dedicated analyzer: m.put("en", new SnowballAnalyzer("English")); m.put("es", new SnowballAnalyzer("Spanish")); m.put("de", new SnowballAnalyzer("German")); m.put("dk", new SnowballAnal
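
A minimal sketch of the kind of language-to-analyzer map described above, using only the JDK. To stay self-contained it stores the Snowball stemmer names as plain strings instead of constructing SnowballAnalyzer instances, and it lists only the three fully visible entries from the snippet. The class StemmerNames, the method forLanguage, and the fallback parameter are hypothetical names, not part of Lucene.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps language codes to Snowball stemmer names.
// In real code each name would be handed to new SnowballAnalyzer(...),
// as in the snippet above; plain strings keep this sketch dependency-free.
final class StemmerNames {
    private static final Map<String, String> BY_LANGUAGE = new LinkedHashMap<>();
    static {
        BY_LANGUAGE.put("en", "English");
        BY_LANGUAGE.put("es", "Spanish");
        BY_LANGUAGE.put("de", "German");
    }

    private StemmerNames() {}

    // Returns the stemmer name for codes such as "en", "EN" or "de-AT".
    // Unknown or null codes yield the caller-supplied fallback.
    static String forLanguage(String code, String fallback) {
        if (code == null) {
            return fallback;
        }
        // Normalize case and drop any region suffix ("de-AT", "de_AT" -> "de").
        String key = code.toLowerCase(Locale.ROOT).split("[-_]", 2)[0];
        return BY_LANGUAGE.getOrDefault(key, fallback);
    }
}
```

Keeping the table in one place with an explicit fallback means a document in an unmapped language still gets some analyzer instead of a null.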

Re: Term offsets for highlighting

2010-04-26 Thread Koji Sekiguchi
Stephen Greene wrote: Hi Koji, Thank you. I implemented a solution based on the FieldTermStackTest.java and if I do a search like "iron ore" it matches iron or ore. The same is true if I specify iron AND ore. The termSetMap[0].value[0] = ore and termSetMap[0].value[1] = iron. What am I missing

RE: Term offsets for highlighting

2010-04-26 Thread Stephen Greene
Hi Koji, Thank you. I implemented a solution based on the FieldTermStackTest.java and if I do a search like "iron ore" it matches iron or ore. The same is true if I specify iron AND ore. The termSetMap[0].value[0] = ore and termSetMap[0].value[1] = iron. What am I missing in having a phrase matc

Re: merge results from physically separate hosts

2010-04-26 Thread Erik Hatcher
Solr's distributed search feature is about querying multiple indexes and merging the results. Different indexes, but same schema. Erik On Apr 25, 2010, at 6:02 AM, Shaun Senecal wrote: Is there currently a way to take a query, run it on multiple hosts containing different indexes, th
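
The idea of querying several indexes and merging the results can be sketched in plain Java. This is not Solr's implementation: Hit and ResultMerger are hypothetical names, each host is assumed to have already returned its own hits with scores, and the merge simply keeps the n highest-scoring hits overall. Raw scores from separate indexes are not strictly comparable, because term statistics differ per index, so treat the ordering as approximate.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical value type: one search hit as reported by one host.
final class Hit {
    final String host;
    final String docId;
    final float score;

    Hit(String host, String docId, float score) {
        this.host = host;
        this.docId = docId;
        this.score = score;
    }
}

// Hypothetical merger: combines per-host result lists into one ranking.
final class ResultMerger {
    private ResultMerger() {}

    // Returns at most n hits (n >= 0), highest score first.
    // Ties keep the order in which the hosts were supplied (stable sort).
    static List<Hit> mergeTopN(List<List<Hit>> perHost, int n) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perHost) {
            all.addAll(hits);
        }
        all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(all.subList(0, Math.min(n, all.size())));
    }
}
```

Each host only needs to send its own top n hits, since no hit ranked lower than that on its host can reach the merged top n.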

Re: VM options for faster lucene search

2010-04-26 Thread Harsh Srivastava
Hi, I am asking in terms of http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp#PerformanceTuning VM options. -- Er. Harsh Srivastava On Mon, Apr 26, 2010 at 12:40 PM, Anshum wrote: > There are a few things you could do, > 1. Run the JVM in server mode [-server] > 2. Assign more RAM

Re: VM options for faster lucene search

2010-04-26 Thread Anshum
There are a few things you could do, 1. Run the JVM in server mode [-server] 2. Assign more RAM (in case you're running a 64 bit architecture) (both initial and max limit) -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me.
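
To check that the two suggestions above took effect, a small sketch like the following can be run inside the search application. It assumes a launch line along the lines of `java -server -Xms512m -Xmx2g ...`, where the heap sizes are placeholders and not recommendations. JvmSettings and describe are hypothetical names; the JDK calls are standard.

```java
// Hypothetical helper: reports which VM is running and how much heap it may use.
final class JvmSettings {
    private JvmSettings() {}

    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        // java.vm.name typically contains "Server" when the server VM is in use.
        return "vm=" + System.getProperty("java.vm.name")
                // maxMemory() reflects the -Xmx limit (or the VM default if unset).
                + ", maxHeapMB=" + (rt.maxMemory() / mb)
                // totalMemory() is the heap currently reserved, starting near -Xms.
                + ", currentHeapMB=" + (rt.totalMemory() / mb);
    }
}
```

Logging `JvmSettings.describe()` once at startup makes it easy to spot a deployment where the flags were silently dropped.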