Hi,
Doron, thanks for the advice.
regards,
Wooi Meng
--
View this message in context:
http://www.nabble.com/search-within-search-tf2558237.html#a7171019
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
--
: It seems that there is no simple function to ask the weight for a term
: in a document directly. So I decide not to iterate the documents of a
As I said: it depends on what you mean by "term weight" ...
: term or the terms of a document. I'm iterating the terms of the index,
: searching for th
Chris Hostetter wrote:
I don't really know what a "term matrix" is, but when you ask about
"weight" is it possible you are just looking for the TermDocs.freq() of the
term/doc pair?
Thank you Chris,
that was also my first idea. I wanted to get the document frequency
indexreader.docFreq(
Good suggestion, I tried watching the GCs in YourKit while testing but
unfortunately they don't seem to line up with the searches that take
forever. They also don't last long enough to make up that kind of
time. I have our heap limited to 1GB right now and it's using around
768MB of that.
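
One way to cross-check what the profiler shows is to read the JVM's own cumulative GC counters before and after a slow search. The following is only a minimal sketch: the class `GcTimer` and its methods are hypothetical names, and the `Runnable` stands in for the real search call.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: measures wall-clock time and cumulative GC time around a task.
class GcTimer {
    // Sum of collection time (ms) across all collectors.
    // getCollectionTime() returns -1 when undefined, so non-positive values are skipped.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Returns {elapsedMillis, gcMillis} for one run of the task.
    static long[] time(Runnable task) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        task.run();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
        return new long[] { elapsedMillis, totalGcMillis() - gcBefore };
    }

    public static void main(String[] args) {
        // Stand-in workload; replace with the actual search call.
        long[] result = time(() -> {
            byte[][] junk = new byte[1000][];
            for (int i = 0; i < junk.length; i++) {
                junk[i] = new byte[10_000];
            }
        });
        System.out.println("elapsed ms = " + result[0] + ", of which GC ms = " + result[1]);
    }
}
```

If the GC share stays near zero for the 15-20 second searches, that would support the observation that collections do not explain the delay.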
On 11/3/
On 11/3/06, Ben Dotte <[EMAIL PROTECTED]> wrote:
I'm trying to figure out a way to troubleshoot a performance problem
we're seeing when searching against a memory-based index. What happens
is we will run a search against the index and it generally returns in
1 second or less. But every once in a
Hi All,
I also need to resolve this issue. What is the best way to catch this exception?
Thanks
Mathews
-Original Message-
From: Eric Louvard [mailto:[EMAIL PROTECTED]
Sent: Friday, November 03, 2006 8:36 AM
To: java-user@lucene.apache.org
Subject: TooManyClauses with MultiTermQueries
He
I personally like your effort, but technically I would disagree.
The SOLR project, and the project I am working on, DBSight, have a
detached approach which is implementation agnostic, no matter if it's
java, ruby, php, .net. The returned results can be rendered HTML,
JSON, or XML. I don't think yo
: When I enter the query: "Table AND NOT Chair" I get one hit, doc3
: When I enter the query: "Table AND (NOT Chair)" I get 0 hits.
:
: I had thought that both queries would return the same results. Is this a
: bug, or, am I not understanding the query language correctly?
it's a confusing eccen
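
A note on the two queries above: a Lucene BooleanQuery that contains only prohibited clauses matches no documents, and the parenthesized form wraps "NOT Chair" in exactly such a sub-query. The sketch below models this with plain Java sets rather than Lucene itself; the class name, the `eval` helper, and the document ids are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Set-based model of a boolean clause group; not Lucene code.
class BooleanClauseSketch {
    // Intersect all required sets, then subtract all prohibited sets.
    // A group with no required clause has nothing to start from, so it matches nothing.
    static Set<Integer> eval(List<Set<Integer>> required, List<Set<Integer>> prohibited) {
        if (required.isEmpty()) {
            return new TreeSet<>();
        }
        Set<Integer> result = new TreeSet<>(required.get(0));
        for (Set<Integer> r : required) {
            result.retainAll(r);
        }
        for (Set<Integer> p : prohibited) {
            result.removeAll(p);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical postings: which doc ids contain each term.
        Set<Integer> table = new TreeSet<>(Arrays.asList(1, 3));
        Set<Integer> chair = new TreeSet<>(Arrays.asList(1, 2));

        // "Table AND NOT Chair" behaves like: +table -chair
        System.out.println(eval(Arrays.asList(table), Arrays.asList(chair)));

        // "Table AND (NOT Chair)" behaves like: +table +(-chair)
        Set<Integer> inner = eval(Collections.<Set<Integer>>emptyList(), Arrays.asList(chair));
        System.out.println(eval(Arrays.asList(table, inner), Collections.<Set<Integer>>emptyList()));
    }
}
```

In this model the first query keeps document 3, while the second returns nothing because the inner group is empty before it is intersected with "table".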
Hi,
What exactly are you concerned about with the "non-detached" approach (see
below)?
Chris Lu wrote:
I would prefer a detached approach instead of Hibernate or EJB's
approach, which is kind of too tightly coupled with any system. How to
it is probably going to be coupled with yours ;-)
rebuild
On 11/3/06, Patrick Turcotte <[EMAIL PROTECTED]> wrote:
>
> It will make the mailing list easier to read (I am using gmail and I do
> not have client-side filters).
That is not true.
You can have labels, and, if you look at the top of the page, right beside
the "Search the Web" button, you have
Hi
No, he is talking about
http://www.hibernate.org/hib_docs/annotations/reference/en/html/lucene.html
Also note that I'm about to release a new, much more flexible version
http://www.mail-archive.com/hibernate-dev%40lists.jboss.org/msg00392.html
and for the future (but flexible)
http://www.mail-a
Hi,
I'm trying to figure out a way to troubleshoot a performance problem
we're seeing when searching against a memory-based index. What happens
is we will run a search against the index and it generally returns in
1 second or less. But every once in a while it takes 15-20 seconds for
the exact sa
I don't really know what a "term matrix" is, but when you ask about
"weight" is it possible you are just looking for the TermDocs.freq() of the
term/doc pair?
: Date: Thu, 02 Nov 2006 12:45:30 +0100
: From: Soeren Pekrul <[EMAIL PROTECTED]>
: Reply-To: java-user@lucene.apache.org
: To: java-user@
Daniel,
Yes, this is correct if you happen to be doing a radius search and sorting
by mileage.
Peter
On 11/3/06, Daniel Rosher <[EMAIL PROTECTED]> wrote:
Hi Peter,
Does this mean you are calculating the Euclidean distance twice ... once for
the HitCollector to filter 'out of range' documents,
spinergywmy <[EMAIL PROTECTED]> wrote on 03/11/2006 00:40:42:
> I have another problem: the way I have coded it, I do not perform a real
> search within search, because for the second search I actually go back
> to the index directory to search the ent
Paramasivam,
Take a look at Solr, in particular the DocSetHitCollector class. The
collector simply sets a bit in a BitSet, or saves the docIds in an array
(for low hit counts). Solr's BitSet was optimized (by Yonik, I believe) to
be faster than Java's BitSet, so this HitCollector is very fast. Th
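
The array-then-BitSet idea described above can be sketched in plain Java. This is not Solr's DocSetHitCollector: the class name, the `smallLimit` threshold, and the use of `java.util.BitSet` are assumptions made for illustration.

```java
import java.util.BitSet;

// Illustrative collector: keeps doc ids in a small int[] for low hit counts
// and switches to a BitSet once that array would overflow.
class DocIdCollectorSketch {
    private final int maxDoc;
    private int[] small;
    private int count = 0;
    private BitSet bits; // null until the small array overflows

    DocIdCollectorSketch(int smallLimit, int maxDoc) {
        this.small = new int[smallLimit];
        this.maxDoc = maxDoc;
    }

    // Same shape as a hit-collector callback; the score is ignored here.
    void collect(int doc, float score) {
        if (bits != null) {
            bits.set(doc);
            return;
        }
        if (count < small.length) {
            small[count++] = doc;
            return;
        }
        // Overflow: copy the ids gathered so far into a BitSet, then add the new one.
        bits = new BitSet(maxDoc);
        for (int i = 0; i < count; i++) {
            bits.set(small[i]);
        }
        bits.set(doc);
        small = null;
    }

    int size() {
        return bits != null ? bits.cardinality() : count;
    }

    boolean usesBitSet() {
        return bits != null;
    }

    public static void main(String[] args) {
        DocIdCollectorSketch c = new DocIdCollectorSketch(2, 100);
        c.collect(5, 1.0f);
        c.collect(9, 0.5f);
        System.out.println(c.size() + " hits, bitset=" + c.usesBitSet());
        c.collect(42, 0.1f);
        System.out.println(c.size() + " hits, bitset=" + c.usesBitSet());
    }
}
```

The per-hit work is a single array store or bit set, which is why a collector of this shape adds little overhead to the search itself.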
Hi all,
Our company has a set of assets and we use meta-data (XML files) to
describe each asset. My job is to index and search over the meta-data
associated with the assets. The interesting aspect of my problem is that
an asset can have more than one meta-data file associated with it,
depending
Hello, I have been working with Lucene for several years.
One of my biggest problems was the inability of Lucene to search with
wildcards, so I developed my own MultiTermQueries.
Now there's a standard class for this, but you'll always get an
exception if your search is too generic, 'a*' for ex
Hi Peter,
Does this mean you are calculating the Euclidean distance twice ... once for
the HitCollector to filter 'out of range' documents, and then again for the
custom Comparator to sort the returned documents, especially since the
filtering is done outside Lucene?
Regards,
Dan
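
If the double calculation ever matters, one option is to cache each distance when the document is filtered and reuse it when sorting. The sketch below is plain Java, not the code discussed in this thread; the class name, the coordinate array, and the method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative radius search: each distance is computed once, then reused for sorting.
class RadiusSearchSketch {
    static double distance(double x1, double y1, double x2, double y2) {
        return Math.hypot(x2 - x1, y2 - y1);
    }

    // docCoords[doc] = {x, y}. Returns the ids of docs within the radius, nearest first.
    static List<Integer> withinRadiusSorted(double[][] docCoords, double cx, double cy, double radius) {
        final Map<Integer, Double> cached = new HashMap<>();
        for (int doc = 0; doc < docCoords.length; doc++) {
            double d = distance(cx, cy, docCoords[doc][0], docCoords[doc][1]);
            if (d <= radius) {
                cached.put(doc, d); // filter step: keep the distance for later
            }
        }
        List<Integer> docs = new ArrayList<>(cached.keySet());
        // Sort step: look up the cached distance instead of recomputing it.
        docs.sort((a, b) -> Double.compare(cached.get(a), cached.get(b)));
        return docs;
    }

    public static void main(String[] args) {
        double[][] coords = { {0, 0}, {3, 4}, {1, 0}, {10, 10} };
        System.out.println(withinRadiusSorted(coords, 0, 0, 5));
    }
}
```

The trade-off is one map entry per in-range document, which is usually small compared with the cost of a second distance computation per comparison.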
Joe,
Fields
Haven't used them, but had a look at them some time ago. Seems like a
nice set of helper factory classes to manage Lucene engine through
Spring IoC. Can't do much wrong in here I guess... If you'd be using
Spring in your app, you'd have to come up with similar factories either
way, so probably it'd
> You need to increase the memory for java. I think 32-bit java is limited
> to a 1.3 gig heap but could be wrong. No heuristics at the tip of my
> fingers.
32-bit JVM under Linux/Windows. Solaris runs OK. Limit on the heap is
~1.7 - 1.8Gb.
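
To see what heap limit a given JVM actually accepted (for example after starting it with the standard `-Xmx` option), the runtime can be queried directly. A minimal sketch; the class name `HeapLimitSketch` is hypothetical.

```java
// Prints the maximum heap the running JVM will try to use and the amount currently in use.
class HeapLimitSketch {
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L);
        System.out.println("max heap MB = " + maxHeapMb() + ", used MB = " + usedMb);
    }
}
```

Running it as `java -Xmx1500m HeapLimitSketch` on the target machine shows whether that platform's JVM starts with the requested limit at all.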
-Original Message-
From: Breck Baldwin [mailto:[EM
Martin Braun wrote:
Hi Breck,
I have tried your tutorial and built (hopefully) a successful
SpellCheck.model file of 49M.
My Lucene index directory is 2.4G. When I try to read the model with the
readmodel function,
I get an "Exception in thread "main" java.lang.OutOfMemoryError: Java
heap sp
It will make the mailing list easier to read (I am using gmail and I do
not have client-side filters).
That is not true.
You can have labels, and, if you look at the top of the page, right beside
the "Search the Web" button, you have a "create filter" link.
Patrick
One thing it took me a while to grasp, and is not automatic for folks with
significant database backgrounds is that the fields in a Lucene document are
only related to those of any other document by the meaning you, as a
programmer, understand. That is, document 1 may have fields a, b, c.
Document
Hi,
I recently stumbled across what I think might be a bug in the QueryParser.
Before I enter it as a bug, I wanted to run it by this group to see if I'm
just not looking at the boolean expression correctly.
Here's the issue:
I created an index with 5 documents, all have one field: "text", with
Yes! I modified the example to be compliant with the 2.1 API, and I added
the hits.score() call for each discovered result.
It works!
[java] Hits for "freedom" were found in quotes by:
[java] 1. Mohandas Gandhi with score = 0.53033006
[java] 2. Ayn Rand with score = 0.25
[java]
On Nov 3, 2006, at 3:20 AM, Michele Amoretti wrote:
Why not put a [LUCENE USER] automatic tag at the beginning of
e-mail subjects?
Because the To and Reply-to headers indicate the list. All Apache e-mail
lists operate the same, and we are not going to change this
behavior.
E
http://javatechniques.com/public/java/docs/basics/lucene-memory-search.html
Is this good? It seems to be good.
On 11/3/06, Michele Amoretti <[EMAIL PROTECTED]> wrote:
Ok, sorry I did not read it in depth.
Now, where can I find an example of:
- building the RAMDirectory
- scoring all document
Ok, sorry I did not read it in depth.
Now, where can I find an example of:
- building the RAMDirectory
- scoring all documents against the query?
thanks
On 11/3/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: I have a question: is the score for a document different if I have
: only that doc
Hi Peter
When I use the CustomHitCollector, it affects the application's performance.
How do you accomplish grouping the results without affecting
performance? Also, if possible, please give a code snippet for a custom
HitCollector.
TIA
Sri
"Peter Keegan" <[EMAIL PROTECTED]> wrote in message
n
Hi,
Doron, good call, thanks.
I have another problem: the way I have coded it, I do not perform a real
search within search, because for the second search I actually go back to
the index directory to search the entire indices again rather than cache
Hi,
Why not put a [LUCENE USER] automatic tag at the beginning of
e-mail subjects?
It will make the mailing list easier to read (I am using gmail and I do
not have client-side filters).
--
Michele Amoretti, Ph.D.
Distributed Systems Group
Dipartimento di Ingegneria dell'Informazione
Università
Michele,
On Friday 03 November 2006 07:07, Michele Amoretti wrote:
> I have a question: is the score for a document different if I have
> only that document in my index, or if I have N documents?
> If the answer is yes, I will put all N documents together, otherwise I
> will evaluate them one by o
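
One reason the index size can matter: under Lucene's default similarity of that era, the inverse document frequency factor is computed from the number of documents in the index. The sketch below reproduces only that one factor, assuming the classic formula `log(numDocs / (docFreq + 1)) + 1`; the class name is hypothetical, and the full score also involves other factors not shown here.

```java
// Reproduces the classic idf factor in isolation; not a full scoring implementation.
class IdfSketch {
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        // The same term, found in one document, in indexes of different sizes.
        System.out.println("index of 1 doc:    idf = " + idf(1, 1));
        System.out.println("index of 100 docs: idf = " + idf(1, 100));
    }
}
```

Because the factor grows with the index size for a fixed document frequency, scoring documents one by one in single-document indexes would not generally give the same numbers as scoring them together.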