Well...
after various tests I downgraded to Lucene 1.9.1 to see if that had
any effect... it doesn't seem that way.
I have set up a JMeter test with 5 concurrent users doing a search (a
silly search for a two-letter word) every 3 seconds (with a random
offset of +/- 500ms).
- With 512 MB xms/xmx
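(For anyone who wants to reproduce this without JMeter, here is a rough Java sketch of the same load; the index path and the field/term are placeholders, and it assumes one shared IndexSearcher, which is safe to use from multiple threads.)

import java.util.Random;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;

// Rough equivalent of the JMeter plan: 5 threads, each running the same
// silly search every ~3 seconds +/- 500ms against one shared searcher.
public class SearchLoadTest {
    public static void main(String[] args) throws Exception {
        final IndexSearcher searcher = new IndexSearcher("/path/to/index");
        for (int i = 0; i < 5; i++) {
            new Thread(new Runnable() {
                public void run() {
                    Random rnd = new Random();
                    try {
                        while (true) {
                            searcher.search(new TermQuery(new Term("body", "of")), null, 10);
                            Thread.sleep(2500 + rnd.nextInt(1000)); // 3s +/- 500ms
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }).start();
        }
    }
}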
Sounds familiar. This may actually be in JIRA already.
- Mark
On Dec 3, 2008, at 6:25 PM, "Teruhiko Kurosaka" <[EMAIL PROTECTED]>
wrote:
Mike,
You are right. There was an error on my part. I think
I was, in effect, making a SpanNearQuery object of:
new SpanNearQuery(new SpanQuery[0], 0, true);
> -----Original Message-----
> From: Michael McCandless [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, December 03, 2008 10:47 AM
> To:
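For anyone who hits the same thing: with an empty clause array the query is not tied to any field (hence the null field seen in SpanWeight), so make sure there is at least one clause before constructing it. A small sketch (the field and terms are placeholders):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

// Problematic: zero clauses, so the query has no field to report.
SpanNearQuery broken = new SpanNearQuery(new SpanQuery[0], 0, true);

// OK: at least one clause, and all clauses on the same field.
SpanQuery[] clauses = {
    new SpanTermQuery(new Term("contents", "hello")),
    new SpanTermQuery(new Term("contents", "world"))
};
SpanNearQuery near = new SpanNearQuery(clauses, 0, true);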
Hi, Yes, it is pretty obvious that I would have to index Tom, but I think
you missed the point. I don't have a list of names with their nicknames,
and this is pretty common: Mike being Michael, Richard being Rich or Dick,
William could be Bill or Will, etc. I thought I would check if there was
Actually I think something "outside" Lucene is probably setting that
field.
How did you create the Query that you are searching on?
Mike
Teruhiko Kurosaka wrote:
Hello again,
A debugging session shows that
SpanWeight.query.field is null when SpanWeight.scorer() is being executed.
The API doc says getField() "Returns the name of the field matched by this
query."
Am I right to assume that this field is set by a search mechanism within Lucene,
not by my code?
My application died throwing an NPE inside SegmentReader.getNorms().
Exception in thread "main" java.lang.NullPointerException
at java.util.Hashtable.get(Hashtable.java:336)
at org.apache.lucene.index.SegmentReader.getNorms(SegmentReader.java:438)
at org.apache.lucene.index.S
Yes, of course it makes sense. I was just confused about the documentation
for the Similarity function.
On Wed, Dec 3, 2008 at 9:52 AM, Erick Erickson <[EMAIL PROTECTED]>wrote:
I'm not much of an expert on term frequencies and scoring,
but would you really want the score calculated for a document
to be affected by the occurrence of terms in a field you did
NOT search on?
I sure wouldn't.
Best
Erick
On Wed, Dec 3, 2008 at 10:44 AM, Gustavo Corral <[EMAIL PROTECTED]> wrote:
Hi list,
I hope this is not a silly question, but I should ask.
I developed an IR system for XML documents with Lucene and I was checking the
explain() output for some queries, but I don't understand this part:
0.121383816 = fieldWeight(title:efecto in 1), product of:
1.0 = tf(termFreq(title:efecto
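For reference, that breakdown comes from asking the searcher to explain a single hit, roughly like this (the query and document id are whatever you searched with):

Explanation exp = searcher.explain(query, docId);
System.out.println(exp.toString());

The fieldWeight line is the contribution of that term in that field for that document; as Erick notes above, terms in fields you did not search on do not enter into it.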
Careful here. Not only do you need to pass -server, but you need the
ability to use it :) It will silently not work if it's not there, I
believe. Oddly, the JRE doesn't seem to come with the server HotSpot
implementation. The JDK always does appear to. Probably varies by OS to
some degree.
Some
Are you not passing -server on the command line? You need to do that.
In my experience with Sun JVM 1.6.x, the default GC strategy is really
amazingly good, as long as you pass -server.
If passing -server doesn't fix it, I would recommend enabling the
various verbose GC logs and watching what happens.
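Something along these lines, for example (the heap sizes and log file name are just examples; for Tomcat the same flags can go into JAVA_OPTS/CATALINA_OPTS):

java -server -Xms512m -Xmx1024m \
     -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -Xloggc:gc.log ...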
Are you actually hitting OOME?
Or, you're watching heap usage and it bothers you that the GC is
taking a long time (allowing too much garbage to use up heap space)
before sweeping?
One thing to try (only for testing) might be a lower and lower -Xmx
until you do hit OOME; then you'll know
Sure,
Tried with the following:
Java version: build 1.5.0_16-b06-284 (dev), 1.5.0_12 (production)
OS: Mac OS X Leopard (dev) and Windows XP (dev), Windows 2003
(production)
Container: Jetty 6.1 and Tomcat 5.5 (the latter is used both in dev and
production)
Current JVM options:
-Xms512m -Xmx1024M
Cheers,
In my scenario, I've made sure that the index does not get modified
(so reopen shouldn't be necessary?).
I've tried the scenario both with and without caching the IndexSearcher
(and thereby the IndexReader it creates in its constructor).
When not caching, I've made sure to close the IndexSearcher
You are opening and closing the IndexSearcher for every search. Try caching
the IndexSearcher and reopening the IndexReader when the index gets modified.
In your code below, how did you create the IndexSearcher? If it is using an
IndexReader, you need to close that too. This might be the cause of the
memory usage.
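In code, the cached-searcher approach looks roughly like this (the path is a placeholder, IndexReader.reopen() is assumed to be available in your Lucene version, and in a webapp the refresh would also need synchronization):

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;

// Open once and keep these around between requests.
IndexReader reader = IndexReader.open("/path/to/index");
IndexSearcher searcher = new IndexSearcher(reader);

// Before searching, refresh only if the index changed on disk.
if (!reader.isCurrent()) {
    IndexReader newReader = reader.reopen();
    if (newReader != reader) {
        reader.close();                 // release the old reader
        reader = newReader;
        searcher = new IndexSearcher(reader);
    }
}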
Hi Magnus,
Could you post the OS, version, RAM size, swap size, Java VM version,
hardware, #cores, VM command-line parameters, etc.? This can be very
relevant.
Have you tried other garbage collectors and/or tuning as described in
http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html?
I don't think there is a way to do that if you are using Lucene's
StandardAnalyzer, because StandardAnalyzer is meant to tokenize by some
standard token characters. For custom analyzing it is good to use your own
analyzer. You can probably use SimpleAnalyzer
Prabin
toostep.com
On Wed, Dec 3, 2008
Hi,
We have an application using Tomcat, Spring, etc., and Lucene 2.4.0.
Our index is about 100MB (in test) and has about 20 indexed fields.
Performance is pretty good, but we are experiencing very high memory
usage when searching.
Looking at JConsole during a somewhat silly scenario (but
It worked out well.
Thanks
Is there any way that we can use StandardAnalyzer and tell it not to
generate tokens out of this?
Thanks
Ravichandra
prabin meitei wrote:
On Tue, 2008-12-02 at 23:42 +0100, Chris Hostetter wrote:
> : A cosmetic remark, I would personally choose a single field for the
> : boosts and then one token per source. (groupboost:A^10 groupboost:B^1
> : groupboost:C^0.1).
>
> that's a key improvement, as it helps keep the number of unique f
Also, if you do some testing of this, please post back the results if
you can.
As you've noticed, this (how Lucene performs with a great many /
variable fields per doc) isn't a well-explored area yet...
Mike
Mark Miller wrote:
There is not much impact as long as you turn off Norms for t
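For reference, turning norms off on a per-field basis looks roughly like this (the field name and value are placeholders, and Field.Index.ANALYZED assumes a Lucene 2.4-style API):

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

Document doc = new Document();
Field f = new Field("somefield", "some tokens", Field.Store.NO,
                    Field.Index.ANALYZED);
f.setOmitNorms(true);   // no norms array will be stored for this field
doc.add(f);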
Use your own analyzer. Write a class extending Lucene's Analyzer. You can
override the tokenStream method to include whatever you want and exclude
what you don't want.
E.g. a tokenStream method which may work for you:
public TokenStream tokenStream(String fieldName, Reader reader) {
    // for example (one option): whitespace tokenization plus lowercasing
    return new LowerCaseFilter(new WhitespaceTokenizer(reader));
}
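If it helps, this is roughly how such an analyzer gets wired in once it is wrapped in an Analyzer subclass (MyAnalyzer is a placeholder name for that class, and the index path is made up); the important part is to use the same analyzer at index time and at query time:

import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

public class WireUpExample {
    public static void main(String[] args) throws Exception {
        // Index time: give the writer the custom analyzer.
        IndexWriter writer = new IndexWriter("/path/to/index", new MyAnalyzer(), true);
        // ... add documents here ...
        writer.close();

        // Query time: parse with the same analyzer so "ABC+S" stays one token.
        QueryParser parser = new QueryParser("contents", new MyAnalyzer());
        Query q = parser.parse("ABC\\+S");
        System.out.println(q);
    }
}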
Hi
I tried that approach. I did use escaping with "\", and the query has
the special character, but I got no results that time.
What I found out was that when I use StandardAnalyzer on "ABC+S", the terms
generated are "ABC" and "S", and the '+' is getting lost.
When I used WhitespaceAnalyzer or KeywordAnalyzer
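For anyone who wants to see this directly, here is a quick sketch using the 2.x TokenStream API (the class and field names are just for illustration):

import java.io.StringReader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

// Quick check of what each analyzer does with "ABC+S".
public class AnalyzerCheck {
    public static void main(String[] args) throws Exception {
        dump(new StandardAnalyzer());    // prints: abc, s   ('+' is dropped)
        dump(new WhitespaceAnalyzer());  // prints: ABC+S    ('+' is kept)
    }
    static void dump(Analyzer a) throws Exception {
        TokenStream ts = a.tokenStream("f", new StringReader("ABC+S"));
        for (Token t = ts.next(); t != null; t = ts.next()) {
            System.out.println(t.termText());
        }
    }
}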
If you want to query for Tom, then you need to index the value Tom. Create
one more field, Alias, or add the alias name as part of the name field.
Regards
Ganesh
----- Original Message -----
From: "Khawaja Shams" <[EMAIL PROTECTED]>
To:
Sent: Wednesday, December 03, 2008 11:46 AM
Subject: Indexing
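At index time that could look something like this (the field names are placeholders, it assumes you have some source of nicknames to put into the alias field, and Field.Index.ANALYZED is the Lucene 2.4-style constant):

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

Document doc = new Document();
doc.add(new Field("name", "Thomas", Field.Store.YES, Field.Index.ANALYZED));
// Searchable aliases/nicknames for the same person.
doc.add(new Field("alias", "Tom Thom", Field.Store.NO, Field.Index.ANALYZED));

A search could then go against both the name and alias fields (for example with MultiFieldQueryParser).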
Try manually escaping the search string by adding "\" in front of the special
characters (you can do this easily using a string replace).
This will make sure that your query contains the special characters.
Prabin
toostep.com
On Wed, Dec 3, 2008 at 12:03 PM, Ravichandra <
[EMAIL PROTECTED]> wrote:
Hi
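Regarding the escaping suggested above: QueryParser already ships a static escape() helper that does that replace, e.g.:

import org.apache.lucene.queryParser.QueryParser;

String escaped = QueryParser.escape("ABC+S");   // gives ABC\+S
// Note: escaping only protects the query syntax; StandardAnalyzer will still
// drop the '+' when the escaped text is analyzed.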
To get from Thomas to Tom you'll need to use synonyms. For Thom you
would have been able to use prefixes or wildcards.
If you Google for "lucene synonyms" you'll find loads of stuff. Also, I
believe that Solr has built-in support for synonyms.
--
Ian.
On Wed, Dec 3, 2008 at 6:16 AM, Khaw