Re: lucene nicking my memory ?

2008-12-03 Thread Magnus Rundberget
Well... after various tests I downgraded to Lucene 1.9.1 to see if that had any effect... doesn't seem that way. I have set up a JMeter test with 5 concurrent users doing a search (a silly search for a two-letter word) every 3 seconds (with a random offset of +/- 500 ms). - With 512 MB Xms/Xmx

Re: NPE inside org.apache.lucene.index.SegmentReader.getNorms

2008-12-03 Thread Mark Miller
Sounds familiar. This may actually be in JIRA already. - Mark On Dec 3, 2008, at 6:25 PM, "Teruhiko Kurosaka" <[EMAIL PROTECTED]> wrote: Mike, You are right. There was an error on my part. I think I was, in effect, making a SpanNearQuery object of: new SpanNearQuery(new SpanQuery[0], 0,

RE: NPE inside org.apache.lucene.index.SegmentReader.getNorms

2008-12-03 Thread Teruhiko Kurosaka
Mike, You are right. There was an error on my part. I think I was, in effect, making a SpanNearQuery object of: new SpanNearQuery(new SpanQuery[0], 0, true); > -Original Message- > From: Michael McCandless [mailto:[EMAIL PROTECTED] > Sent: Wednesday, December 03, 2008 10:47 AM > To:
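The NPE in this thread traces back to constructing a SpanNearQuery from an empty clause array, which leaves the query's field null and blows up later inside SegmentReader.getNorms(). A minimal guard might look like this (a sketch against the Lucene 2.x span API; the helper class and method names are hypothetical):

```java
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;

public final class SpanQueryUtil {
    private SpanQueryUtil() {}

    // Hypothetical helper: refuse to build a SpanNearQuery from an empty
    // clause array. Such a query has no field, so field-dependent code
    // (e.g. norms lookup at search time) hits a NullPointerException.
    public static SpanNearQuery safeSpanNear(SpanQuery[] clauses,
                                             int slop, boolean inOrder) {
        if (clauses == null || clauses.length == 0) {
            throw new IllegalArgumentException(
                "SpanNearQuery requires at least one clause");
        }
        return new SpanNearQuery(clauses, slop, inOrder);
    }
}
```

Failing fast at construction time makes the error point at the real culprit instead of a Hashtable deep inside Lucene.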

Re: Indexing Names in Lucene -- Thomas = Tom, etc

2008-12-03 Thread Khawaja Shams
Hi, Yes, it is pretty obvious that I would have to index Tom, but I think you missed the point. I don't have a list of names with their nicknames, and this is pretty common: Mike being Michael, Richard being Rich or Dick, William could be Bill or Will, etc. I thought I would check if there was

Re: NPE inside org.apache.lucene.index.SegmentReader.getNorms

2008-12-03 Thread Michael McCandless
Actually I think something "outside" Lucene is probably setting that field. How did you create the Query that you are searching on? Mike Teruhiko Kurosaka wrote: Hello again, A debugging session shows that SpanWeight.query.field is null when SpanWeight.scorer() is being executed. API do

RE: NPE inside org.apache.lucene.index.SegmentReader.getNorms

2008-12-03 Thread Teruhiko Kurosaka
Hello again, A debugging session shows that SpanWeight.query.field is null when SpanWeight.scorer() is being executed. The API doc says getField() "Returns the name of the field matched by this query." Am I right to assume that this field is set by a search mechanism within Lucene, not by my code

NPE inside org.apache.lucene.index.SegmentReader.getNorms

2008-12-03 Thread Teruhiko Kurosaka
My application died throwing NPE inside SegmentReader.getNorms(). Exception in thread "main" java.lang.NullPointerException at java.util.Hashtable.get(Hashtable.java:336) at org.apache.lucene.index.SegmentReader.getNorms(SegmentReader.java:438) at org.apache.lucene.index.S

Re: Termfreq

2008-12-03 Thread Gustavo Corral
Yes, of course it makes sense. I was just confused about the documentation for the Similarity function. On Wed, Dec 3, 2008 at 9:52 AM, Erick Erickson <[EMAIL PROTECTED]>wrote: > I'm not much of an expert on term frequencies and scoring, > but would you really want the score calculated for a docu

Re: Termfreq

2008-12-03 Thread Erick Erickson
I'm not much of an expert on term frequencies and scoring, but would you really want the score calculated for a document to be affected by the occurrence of terms in a field you did NOT search on? I sure wouldn't. Best, Erick On Wed, Dec 3, 2008 at 10:44 AM, Gustavo Corral <[EMAIL PROTECTED]>wr

Termfreq

2008-12-03 Thread Gustavo Corral
Hi list, I hope this is not a silly question, but I should ask. I developed an IR system for XML documents with Lucene and I was checking the explain() output for some queries, but I don't understand this part: 0.121383816 = fieldWeight(title:efecto in 1), product of: 1.0 = tf(termFreq(title:efec
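For reference, the numbers in that explain() fragment follow Lucene's DefaultSimilarity: tf is the square root of the raw term frequency (so a single occurrence gives tf = 1.0), and fieldWeight is the product tf × idf × fieldNorm. A self-contained sketch of the arithmetic (the idf and field-length values below are illustrative, not taken from the poster's index):

```java
public class ExplainMath {
    // DefaultSimilarity-style components (Lucene 2.x):
    // tf is the square root of the raw term frequency in the field.
    static double tf(int termFreq) { return Math.sqrt(termFreq); }

    // lengthNorm is roughly 1/sqrt(number of terms in the field);
    // it is what explain() reports as fieldNorm (after encoding loss).
    static double fieldNorm(int termsInField) {
        return 1.0 / Math.sqrt(termsInField);
    }

    public static void main(String[] args) {
        double tf = tf(1);           // 1.0 for a single occurrence
        double idf = 1.0;            // illustrative; real idf depends on docFreq
        double norm = fieldNorm(4);  // 0.5 for an illustrative 4-term title
        double fieldWeight = tf * idf * norm;
        System.out.println(fieldWeight); // 0.5 with these illustrative inputs
    }
}
```

So a fieldWeight like 0.121383816 with tf = 1.0 is just idf × fieldNorm for that one-occurrence term.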

Re: lucene nicking my memory ?

2008-12-03 Thread Mark Miller
Careful here. Not only do you need to pass -server, but you need the ability to use it :) It will silently not work if it's not there, I believe. Oddly, the JRE doesn't seem to come with the server HotSpot implementation. The JDK always does appear to. Probably varies by OS to some degree. Some

Re: lucene nicking my memory ?

2008-12-03 Thread Eric Bowman
Are you not passing -server on the command line? You need to do that. In my experience with Sun JVM 1.6.x, the default gc strategy is really amazingly good, as long as you pass -server. If passing -server doesn't fix it, I would recommend enabling the various verbose GC logs and watching what hap
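A command line along the lines Eric suggests (the flags are standard HotSpot options for this era of JVM; the jar name is a placeholder):

```shell
# -server selects the server HotSpot compiler (note the caveat elsewhere
# in the thread: it is silently ignored if the JRE doesn't ship it).
# The GC flags make collection behaviour visible, so you can tell whether
# memory is leaking or just hasn't been collected yet.
java -server -Xms512m -Xmx512m \
     -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -jar your-app.jar
```

Watching the verbose GC log under the JMeter load shows whether full collections actually reclaim the heap.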

Re: lucene nicking my memory ?

2008-12-03 Thread Michael McCandless
Are you actually hitting OOME? Or, you're watching heap usage and it bothers you that the GC is taking a long time (allowing too much garbage to use up heap space) before sweeping? One thing to try (only for testing) might be a lower and lower -Xmx until you do hit OOME; then you'll know

Re: lucene nicking my memory ?

2008-12-03 Thread Magnus Rundberget
Sure, Tried with the following. Java version: build 1.5.0_16-b06-284 (dev), 1.5.0_12 (production). OS: Mac OS X Leopard (dev) and Windows XP (dev), Windows 2003 (production). Container: Jetty 6.1 and Tomcat 5.5 (latter is used both in dev and production). Current JVM options: -Xms512m -Xmx1024M

Re: lucene nicking all my memory

2008-12-03 Thread Magnus Rundberget
Cheers, In my scenario, I've made sure that the index does not get modified (so reopen shouldn't be necessary?). I've tried the scenario both with and without caching the IndexSearcher (and thereby the IndexReader it creates in its constructor). When not caching, I've made sure to close the indexs

Re: lucene nicking all my memory

2008-12-03 Thread Ganesh
You are opening and closing an IndexSearcher for every search. Try caching the IndexSearcher and reopening the IndexReader when the index gets modified. In your code below, how did you create the IndexSearcher? If it was created from an IndexReader, you need to close that too. This might be the cause of the mem
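Ganesh's advice — cache one IndexSearcher and reopen the underlying IndexReader only when the index changes — might look roughly like this (a sketch against the Lucene 2.4 API; real code would also need to reference-count readers still in use by in-flight searches):

```java
import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;

public class SearcherCache {
    private IndexReader reader;
    private IndexSearcher searcher;

    public SearcherCache(String indexDir) throws IOException {
        reader = IndexReader.open(indexDir);
        searcher = new IndexSearcher(reader);
    }

    // Call before searching: swap in a fresh reader only if the index
    // actually changed. reopen() shares unchanged segments with the old
    // reader, so this is far cheaper than open() from scratch.
    public synchronized IndexSearcher getSearcher() throws IOException {
        IndexReader newReader = reader.reopen();
        if (newReader != reader) {
            searcher.close();
            reader.close();   // the searcher does not close a reader it was given
            reader = newReader;
            searcher = new IndexSearcher(reader);
        }
        return searcher;
    }

    public synchronized void close() throws IOException {
        searcher.close();
        reader.close();
    }
}
```

If the index truly never changes during the test (as Magnus says), reopen() will always return the same reader and this reduces to a plain cached searcher.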

Re: lucene nicking my memory ?

2008-12-03 Thread Glen Newton
Hi Magnus, Could you post the OS, version, RAM size, swapsize, Java VM version, hardware, #cores, VM command line parameters, etc? This can be very relevant. Have you tried other garbage collectors and/or tuning as described in http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html?

Re: Problem with special characters

2008-12-03 Thread prabin meitei
I don't think there is a way to do that if you are using Lucene's StandardAnalyzer, because StandardAnalyzer is meant to tokenize by some standard token characters. For custom analyzing it is good to use your own analyzer. You can probably use SimpleAnalyzer. Prabin toostep.com On Wed, Dec 3, 2008

lucene nicking all my memory

2008-12-03 Thread Magnus Rundberget
Hi, We have an application using Tomcat, Spring etc and Lucene 2.4.0. Our index is about 100MB (in test) and has about 20 indexed fields. Performance is pretty good, but we are experiencing a very high usage of memory when searching. Looking at JConsole during a somewhat silly scenario (but

Re: Problem with special characters

2008-12-03 Thread Ravichandra
It worked out well. Thanks. Is there any way that we can use StandardAnalyzer and tell it not to generate tokens out of this? Thanks Ravichandra prabin meitei wrote: > > use your own analyzer. Write a class extending lucene analyzer. you can > override the tokenStream method to include whateve

lucene nicking my memory ?

2008-12-03 Thread Magnus Rundberget
Hi, We have an application using Tomcat, Spring etc and Lucene 2.4.0. Our index is about 100MB (in test) and has about 20 indexed fields. Performance is pretty good, but we are experiencing a very high usage of memory when searching. Looking at JConsole during a somewhat silly scenario (but

Re: Query time document group boosting

2008-12-03 Thread Toke Eskildsen
On Tue, 2008-12-02 at 23:42 +0100, Chris Hostetter wrote: > : A cosmetic remark, I would personally choose a single field for the boosts > and > : then one token per source. (groupboost:A^10 groupboost:B^1 > groupboost:C^0.1). > > that's a key improvement, as it helps keep the number of unique f
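Toke's single-field scheme (groupboost:A^10 groupboost:B^1 groupboost:C^0.1) can be attached to a user query as optional boost-only clauses, for example (a sketch against the Lucene 2.x API; the field and group names come from the thread):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class GroupBoost {
    // Wraps the user's query with SHOULD clauses that boost documents
    // by their source group without changing which documents match:
    // every doc has exactly one groupboost token, so exactly one of
    // the optional clauses fires per matching document.
    public static Query withGroupBoosts(Query userQuery) {
        BooleanQuery q = new BooleanQuery();
        q.add(userQuery, BooleanClause.Occur.MUST);
        q.add(boosted("A", 10f), BooleanClause.Occur.SHOULD);
        q.add(boosted("B", 1f), BooleanClause.Occur.SHOULD);
        q.add(boosted("C", 0.1f), BooleanClause.Occur.SHOULD);
        return q;
    }

    private static Query boosted(String group, float boost) {
        TermQuery tq = new TermQuery(new Term("groupboost", group));
        tq.setBoost(boost);
        return tq;
    }
}
```

Keeping all boosts in one field (rather than one field per group) limits the number of unique field names, which is the improvement Hoss points out.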

Re: # of fields, performance

2008-12-03 Thread Michael McCandless
Also, if you do some testing of this, please post back the results if you can. As you've noticed, this (how Lucene performs with a great many / variable fields per doc) isn't a well explored area yet... Mike Mark Miller wrote: There is not much impact as long as you turn off Norms for t

Re: Problem with special characters

2008-12-03 Thread prabin meitei
Use your own analyzer: write a class extending Lucene's Analyzer. You can override the tokenStream method to include whatever you want and exclude what you don't want. E.g. a tokenStream method which may work for you: public TokenStream tokenStream(String fieldName, Reader reader) { Tok
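Prabin's snippet is cut off by the archive; a completed analyzer in that spirit (assuming, as the thread suggests, that the intent was whitespace tokenization so that tokens like "ABC+S" survive intact — Lucene 2.x API, class name illustrative) might be:

```java
import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceTokenizer;

public class SpecialCharAnalyzer extends Analyzer {
    // Split on whitespace only, so "+" and other punctuation stay part
    // of the token, then lowercase for case-insensitive matching.
    public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream stream = new WhitespaceTokenizer(reader);
        stream = new LowerCaseFilter(stream);
        return stream;
    }
}
```

The same analyzer must be used at both index and query time, otherwise the query-side tokens won't match what was indexed.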

Re: Problem with special characters

2008-12-03 Thread Ravichandra
Hi I tried that approach, I did use escaping with "\", and the query has the special character, but I got no results that time. What I found out was when I use StandardAnalyzer on "ABC+S", the terms generated are "ABC" and "S" and '+' is getting lost. When I used WhitespaceAnalyzer or

Re: Indexing Names in Lucene -- Thomas = Tom, etc

2008-12-03 Thread Ganesh
If you want to query for Tom, then you need to index the value Tom. Create one more field for aliases, or add the alias as part of the name field. Regards Ganesh - Original Message - From: "Khawaja Shams" <[EMAIL PROTECTED]> To: Sent: Wednesday, December 03, 2008 11:46 AM Subject: Indexing
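Ganesh's suggestion — index the nickname alongside the formal name — is just an extra field (or extra values) on the document, e.g. (Lucene 2.4 API; class and field names illustrative):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class NameDoc {
    // Index both the formal name and any known aliases so that a
    // query for "tom" can match a document for "Thomas".
    public static Document forPerson(String name, String... aliases) {
        Document doc = new Document();
        doc.add(new Field("name", name,
                Field.Store.YES, Field.Index.ANALYZED));
        for (String alias : aliases) {
            doc.add(new Field("alias", alias,
                    Field.Store.NO, Field.Index.ANALYZED));
        }
        return doc;
    }
}
```

The remaining problem, as Khawaja notes, is obtaining the nickname list in the first place; a synonym dictionary (as Ian suggests) fills that gap.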

Re: Problem with special characters

2008-12-03 Thread prabin meitei
Try manually escaping the search string, adding "\" in front of the special characters (you can do this easily using string replace). This will make sure that your query contains the special characters. Prabin toostep.com On Wed, Dec 3, 2008 at 12:03 PM, Ravichandra < [EMAIL PROTECTED]> wrote:
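The manual escaping Prabin describes — prefixing each QueryParser special character with a backslash — can be done with plain string manipulation; Lucene also ships QueryParser.escape() for this, but a hand-rolled sketch is:

```java
public class QueryEscaper {
    // Characters QueryParser treats as syntax (Lucene 2.x; the two-char
    // operators && and || would need separate handling).
    private static final String SPECIALS = "\\+-!():^[]\"{}~*?";

    // Prefix every special character with a backslash so it is searched
    // literally instead of being parsed as query syntax.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIALS.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

Note the caveat from later in the thread: escaping only gets the character past the parser; if the analyzer (e.g. StandardAnalyzer) strips '+' during tokenization, the term still won't match, so a whitespace or custom analyzer is needed at both index and query time.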

Re: Indexing Names in Lucene -- Thomas = Tom, etc

2008-12-03 Thread Ian Lea
Hi To get from Thomas to Tom you'll need to use synonyms. For Thom you would have been able to use prefixes or wild cards. If you google for lucene synonyms you'll find loads of stuff. Also, I believe that Solr has built in support for synonyms. -- Ian. On Wed, Dec 3, 2008 at 6:16 AM, Khaw