Doron, thanks!
But the Lucene API docs say: "For performance reasons it is recommended to open only one
IndexSearcher and use it for all of your searches."
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/IndexSearcher.html
That is, a searcher will remain open unless you update the index. So
Yeah sorry about that, I hit the wrong one :( Posting at 3am is never a
good thing!
To bed!
Doron Cohen wrote:
I believe this should go to the solr-user@lucene.apache.org ?
Michael Imbeault <[EMAIL PROTECTED]> wrote on 05/09/2006
23:26:55:
--
Michael Imbeault
CHUL Research Center (CHUQ)
2
Thanks, Cohen and Lin.
2006/9/6, Doron Cohen <[EMAIL PROTECTED]>:
I think that Nutch would crawl and search all 3 of these types. I'm not sure that
Nutch would provide the framework you seem to be looking for, but perhaps it is
worth taking a look - http://lucene.apache.org/nutch/
"James liu" <[EMAIL PROT
I believe this should go to the solr-user@lucene.apache.org ?
Michael Imbeault <[EMAIL PROTECTED]> wrote on 05/09/2006
23:26:55:
> Old issue (see
> http://www.mail-archive.com/solr-user@lucene.apache.org/msg00651.html),
> but I'm experiencing the same exact thing on windows xp, latest tomcat.
> I
I think that Nutch would crawl and search all 3 of these types. I'm not sure that
Nutch would provide the framework you seem to be looking for, but perhaps it is
worth taking a look - http://lucene.apache.org/nutch/
"James liu" <[EMAIL PROTECTED]> wrote on 05/09/2006 23:10:16:
> I want to find a framework that can
Old issue (see
http://www.mail-archive.com/solr-user@lucene.apache.org/msg00651.html),
but I'm experiencing the same exact thing on windows xp, latest tomcat.
I noticed that the tomcat process gobbles memory (10 megs a second
maybe) and then jams at 125 megs. Can't find a fix yet. I'm using a p
I want to find a framework that can index XML, Word, Excel, PDF, and so on, not just one format.
I just want to know whether anyone knows of a framework like that.
2006/9/6, yueyu lin <[EMAIL PROTECTED]>:
First, Lucene is just an index toolkit; you have to USE it to implement your
application.
If you want to index something, you must
Hits is not really a simple container - it references a certain searcher -
that same searcher that was used to find these hits. When a request for a
result document is made, the Hits object delegates this request to the
searcher. So in order to "page through" the results using an existing Hits
obje
First, Lucene is just an index toolkit; you have to USE it to implement your
application.
If you want to index something, you must know how to extract
information from it and which keys (fields) need to be set.
Then you can do what you want.
On 9/5/06, James liu <[EMAIL PROTECTE
I want to find a framework that can index XML, Word, Excel, PDF, and so on, not just one format.
2006/9/6, Doron Cohen <[EMAIL PROTECTED]>:
Lucene FAQ - http://wiki.apache.org/jakarta-lucene/LuceneFAQ - has a few
entries just for this:
How can I index HTML documents?
How can I index XML documents?
How can I index Open
hi,
The following words are quoted from "lucene in action":
"There are a couple of implementation approaches:
1. Keep the original Hits and IndexSearcher instances available while the
user is navigating the search results.
2. Requery each time the user navigates to a new page.
It turns out th
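The second approach (requery each time the user moves to a new page) can be sketched in plain Java. This is a minimal illustration under stated assumptions: `runQuery` is a hypothetical stand-in for re-running the search and returning hits in rank order, and `page` just slices out the requested window. It does not use Lucene's Hits or IndexSearcher.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RequeryPaging {
    // Hypothetical stand-in for re-running the search; returns hits in rank order.
    static List<String> runQuery(String query) {
        return Arrays.asList("doc0", "doc1", "doc2", "doc3", "doc4", "doc5", "doc6");
    }

    // Return the hits for the zero-based page index, or an empty list past the end.
    static <T> List<T> page(List<T> rankedHits, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        long from = (long) pageIndex * pageSize;
        if (from >= rankedHits.size()) {
            return Collections.<T>emptyList();
        }
        int to = (int) Math.min(from + pageSize, (long) rankedHits.size());
        // Copy the window so the caller does not hold on to the full result list.
        return new ArrayList<>(rankedHits.subList((int) from, to));
    }

    public static void main(String[] args) {
        // Each page request runs the query again and keeps only its own window.
        System.out.println(page(runQuery("anything"), 1, 3));
    }
}
```

Nothing has to be kept alive between requests with this approach, at the cost of running the search again for every page.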
Lucene FAQ - http://wiki.apache.org/jakarta-lucene/LuceneFAQ - has a few
entries just for this:
How can I index HTML documents?
How can I index XML documents?
How can I index OpenOffice.org files?
How can I index MS-Word documents?
How can I index MS-Excel documents?
How can I index MS
I found that LIUS has many problems, so I want to give it up and find something new.
Can anyone recommend one?
: Sorry for the confusion and thanks for taking the time to educate me. So, if
: I am just indexing literal values, what is the best way to do that (what
: analyzer)? Sounds like this approach, even though it works, is not the
: preferred method.
if you truly want just the literal values then
Sorry for the confusion and thanks for taking the time to educate me. So, if
I am just indexing literal values, what is the best way to do that (what
analyzer)? Sounds like this approach, even though it works, is not the
preferred method.
analyzer = new PerFieldAnalyzerW
1) consider using JUnit tests .. it makes it a lot easier for other people
to understand your expectations, and if it winds up demonstrating a genuine
bug in Lucene, it's easy to add to the test tree.
2) as i said before, your fields must be TOKENIZED, or your analyzer is
irrelevant at index time.
QueryParser.setDefaultOperator(Operator op)
Chris Salem wrote:
With all the parsers I have tried, a space in a query, such as in a search for
"sales manager", is interpreted as an OR. Is there a way to change it so
that a space is interpreted as an AND?
Chris Salem
440.946.5214 x5458
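To make the difference concrete, here is a toy sketch in plain Java of what the default operator changes. It does not use Lucene: the `Operator` enum, the `matches` helper and the whitespace splitting are all assumptions made for illustration, standing in for what `QueryParser.setDefaultOperator` controls in the real parser.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class DefaultOperatorDemo {
    // Hypothetical stand-in for the parser's default operator setting.
    enum Operator { AND, OR }

    // Returns true if the document's terms satisfy the space-separated query
    // under the given default operator.
    static boolean matches(Set<String> docTerms, String query, Operator op) {
        String[] queryTerms = query.trim().toLowerCase(Locale.ROOT).split("\\s+");
        int found = 0;
        for (String term : queryTerms) {
            if (docTerms.contains(term)) {
                found++;
            }
        }
        // AND requires every query term; OR requires at least one.
        return op == Operator.AND ? found == queryTerms.length : found > 0;
    }

    public static void main(String[] args) {
        Set<String> doc = new HashSet<>(Arrays.asList("sales", "assistant"));
        System.out.println("OR:  " + matches(doc, "sales manager", Operator.OR));
        System.out.println("AND: " + matches(doc, "sales manager", Operator.AND));
    }
}
```

With OR as the default, a document containing only "sales" still matches "sales manager"; with AND it does not, which is the behaviour being asked for.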
With all the parsers I have tried, a space in a query, such as in a search
for "sales manager", is interpreted as an OR. Is there a way to change it
so that a space is interpreted as an AND?
Chris Salem
440.946.5214 x5458
[EMAIL PROTECTED]
(The following links were included with this e
Here's a little sample program (borrowed some code from Erick Erickson :)).
Whether I add as TOKENIZED or UN_TOKENIZED seems to make no difference in
the output. Is this what you'd expect?
- Philip
package com.test;
import java.io.IOException;
import java.util.HashSet;
import java.util.regex.
[discussion moved here from dev-list]
Could it be an out-of-mem error?
Can you run it with a debugger, to see what really happens?
JVMs usually create a javacore file, and in case of an out-of-mem also a
heapdump file - these give more info on the problem. In case this file was
not created in thi
Some info to help you on your journey :)
1. If you add a field as untokenized then it will not be analyzed when added
to the index. However, QueryParser will not know that this happened and will
tokenize queries on that field.
2. The solution that Hoss has explained to you is to leave the defa
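Point 1 can be shown with a toy model in plain Java. This is only a sketch: the `analyze` method (lowercase plus whitespace split) is a simplified assumption, not what any particular Lucene analyzer does, and the sets stand in for the terms stored in an index.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class UntokenizedMismatch {
    // Toy analyzer (assumption): lowercase the text and split it on whitespace.
    static List<String> analyze(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return Collections.<String>emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Untokenized field: the whole value is stored as one term, unchanged.
    static Set<String> indexUntokenized(String value) {
        return Collections.singleton(value);
    }

    // Tokenized field: the value goes through the analyzer before being stored.
    static Set<String> indexTokenized(String value) {
        return new HashSet<>(analyze(value));
    }

    // The query side always analyzes the query text, as described above.
    static boolean anyTermMatches(Set<String> indexedTerms, String query) {
        for (String term : analyze(query)) {
            if (indexedTerms.contains(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String value = "Sales Manager";
        System.out.println("untokenized: " + anyTermMatches(indexUntokenized(value), value));
        System.out.println("tokenized:   " + anyTermMatches(indexTokenized(value), value));
    }
}
```

The untokenized field holds the single term "Sales Manager", while the analyzed query produces "sales" and "manager", so nothing lines up unless both sides are treated the same way.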
I would rather use this
BitSet bits = new BitSet(reader.maxDocs()); //Not sure of exact method, lucene
is not on this PC...
instead of = new BitSet(reader.maxDocs())
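For readers who have not written a filter before, the idea of a wildcard filter that fills a BitSet sized to the index can be sketched in plain Java. Everything here is an assumption made for illustration: the `postings` map stands in for the index reader, `maxDoc` for its document count, and the wildcard translation only handles `*` and `?`. It is not the WildcardFilter posted in this thread.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

class WildcardBitsSketch {
    // Translate a simple wildcard (* and ?) into a regex, quoting all other characters.
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Hypothetical toy index: term -> ids of the documents containing it.
    static Map<String, int[]> samplePostings() {
        Map<String, int[]> postings = new LinkedHashMap<>();
        postings.put("manager", new int[] {0, 2});
        postings.put("manual", new int[] {1});
        postings.put("sales", new int[] {3});
        return postings;
    }

    // Set one bit per document that contains any term matching the wildcard.
    static BitSet bits(Map<String, int[]> postings, String wildcard, int maxDoc) {
        BitSet bits = new BitSet(maxDoc); // sized up front to the number of documents
        Pattern pattern = toPattern(wildcard);
        for (Map.Entry<String, int[]> entry : postings.entrySet()) {
            if (pattern.matcher(entry.getKey()).matches()) {
                for (int doc : entry.getValue()) {
                    bits.set(doc);
                }
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        System.out.println(bits(samplePostings(), "man*", 4));
    }
}
```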
- Original Message
From: Mark Miller <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, 5 September, 200
: Could someone with some experience spot-check this WildcardFilter...it seems
: to work fine in simple testing, but I'd like to know if there are any
: glaring deficiencies. Have not had much to do with filters before.
It looks fine to me.
-Hoss
--
: the contents, but also two numerical values in other document fields. For
: example, let's assume that the normal score for Document A is 0.33 (as
: calculated by Lucene). What I need is that its true score is 0.33 * (value
: of field A) * (value of field B). What is the best way to accomplish
Could someone with some experience spot-check this WildcardFilter...it seems
to work fine in simple testing, but I'd like to know if there are any
glaring deficiencies. Have not had much to do with filters before.
public class WildcardFilter extends Filter {
private Term term;
public Wild
: So, if I do as you suggest below (using PerFieldAnalyzerWrapper with
: StandardAnalyzer) then I still need to enclose in quotes the phrases
: (keywords with spaces) when I issue the search, and they are only returned
Yes, quotes will be necessary to tell the QueryParser "this
is one chunk of t
Stanislav Jordanov wrote:
Suppose I have a bunch of valid .cfs files while the
segments/segments.new file is missing or invalid.
The task is to 'recover' the present .cfs files into a valid index.
I think it will be necessary and sufficient to create a segments file
that references the .cfs file
On Tuesday 05 September 2006 15:59, Mark Miller wrote:
> Okay, more realistically, anyone have any experience with Randy Puttnick's
> modification of WildcardQuery and FuzzyQuery? Any ideas on getting something
> like those in a SpanQuery?
You can use the IndexSearcher method that searches a query
Not for now, but I'd like to contribute span support soon.
Karel
An alternative highlighter implementation was recently contributed here:
http://issues.apache.org/jira/browse/LUCENE-644?page=all
I've not had the time to study this alternative in detail (I hope to soon) so I can't say if it wi
See here for a thread reviewing the challenges and possible solutions
associated with this problem:
http://www.mail-archive.com/java-user@lucene.apache.org/msg02543.html
An alternative highlighter implementation was recently contributed here:
http://issues.apache.org/jira/browse/LUCENE-644?
Okay, more realistically, anyone have any experience with Randy Puttnick's
modification of WildcardQuery and FuzzyQuery? Any ideas on getting something
like those in a SpanQuery?
- Mark
Anybody experimented with a filter in a spanquery? Pipedream?
thanks,
Mark
Hello,
After a search, I need to highlight only the terms that do "really"
correspond to the query.
For instance :
1/ I search docs with toto and titi in the SAME sentence (using
SpanNotQuery(spanNearQuery({"toto","titi"},9)),".") )
2/ Then I try to highlight "toto" and "titi" found (I use the
Suppose I have a bunch of valid .cfs files while the
segments/segments.new file is missing or invalid.
The task is to 'recover' the present .cfs files into a valid index.
I think it will be necessary and sufficient to create a segments file
that references the .cfs files.
The only problem I've en
Oh, that is great! I didn't notice this javadoc. Maybe I need to
update my Lucene library.
I had thought that when one user submits a query, other users' queries might
affect the result, since a single IndexSearcher is shared. Please disregard those mails.
Thanks a lot.
On 9/5/06, karl wettin <[EMAIL PROTECTED]>
On Tue, 2006-09-05 at 13:32 +0100, Gonçalo Gaiolas wrote:
> should this boosting occur at index time or at query time? I'm a
> bit confused as to where I should apply this boost in order to affect
> the results of a search query.
You boost at index time.
-
Hi Karl,
Thanks for the super quick response!
One question - should this boosting occur at index time or at query
time? I'm a bit confused as to where I should apply this boost in order to
affect the results of a search query.
Once again thanks a lot!
Gonçalo
-Original Message-
Fro
On Tue, 2006-09-05 at 17:57 +0800, jacky wrote:
> 1. I wonder if concurrent users can get the right results with
> different queries since the class has only one IndexSearcher instance.
>
> 2. As we know, a new IndexSearcher can be created when a user submits
> a query. If first method gets the r
On Tue, 2006-09-05 at 11:54 +0100, Gonçalo Gaiolas wrote:
> - Scoring should take in consideration not only the relevance of
> the contents, but also two numerical values in other document fields. For
> example, let’s assume that the normal score for Document A is 0.33 (as
> calculated by
On Tue, 2006-09-05 at 02:38 -0700, Venkateshprasanna wrote:
> I saw these classes and want to use them for my implementation as well.
> But I am not getting the source code for the specified package:
> org.apache.commons.collections
http://jakarta.apache.org/commons/collections/
---
Hi there,
I need to make two changes to Lucene :
- Scoring should take in consideration not only the relevance of
the contents, but also two numerical values in other document fields. For
example, let's assume that the normal score for Document A is 0.33 (as
calculated by Lucene).
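The arithmetic being asked for can be written down in plain Java as a sanity check. This sketch only shows the multiplication itself; the field values 2.0 and 1.5 are made up for illustration, and it says nothing about where such a calculation would be hooked into Lucene's scoring.

```java
class CombinedScore {
    // Multiply the relevance score by the two numeric field values.
    static double combine(double relevanceScore, double fieldA, double fieldB) {
        return relevanceScore * fieldA * fieldB;
    }

    public static void main(String[] args) {
        // 0.33 is the example score from the message; 2.0 and 1.5 are hypothetical field values.
        System.out.println(combine(0.33, 2.0, 1.5));
    }
}
```

So a document with a normal score of 0.33 and field values of 2.0 and 1.5 would end up with a combined score of roughly 0.99, and documents would then be ranked by that combined value.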
hi,
The source code at the end is the class used for searching.
1. I wonder if concurrent users can get the right results with different
queries since the class has only one IndexSearcher instance.
2. As we know, a new IndexSearcher can be created when a user submits a
query. If first metho
Why not add a single Field to each Document, like
d.add(new Field("doctype","document", Field.Store.YES,
Field.Index.TOKENIZED));
Then searching for "doctype:document" returns all documents
-Laurent
lude wrote:
Why would you want to do this?
This is a 'feature-request' of our searcheng
You could define your own query syntax (for example an empty string) for
a query matching all docs, examine the query string before passing it to
QueryParser, and instead create a MatchAllDocsQuery when you have a match.
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/MatchAll
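The "examine the query string first" step can be sketched in plain Java. The names `Kind` and `classify` are hypothetical, and the choice of an empty or blank string as the match-all syntax is just the example given above. In a real application the MATCH_ALL branch would create a MatchAllDocsQuery and the PARSE branch would hand the string to QueryParser; neither is called here, so the sketch stays self-contained.

```java
class MatchAllDispatch {
    // Which path the application should take for a given user query.
    enum Kind { MATCH_ALL, PARSE }

    // Treat a null, empty or blank query string as "return all documents".
    static Kind classify(String userQuery) {
        if (userQuery == null || userQuery.trim().isEmpty()) {
            return Kind.MATCH_ALL;
        }
        return Kind.PARSE;
    }

    public static void main(String[] args) {
        System.out.println(classify(""));
        System.out.println(classify("doctype:document"));
    }
}
```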
Why would you want to do this?
This is a 'feature-request' of our searchengine.
The user should have the possibility to query for all(!) documents.
This would allow him to see all available documents listed.
Is there a simple way to define a query that returns all documents of an
index?
Thanks l
I saw these classes and want to use them for my implementation as well. But I
am not getting the source code for the specified package:
org.apache.commons.collections
Is there any other way of implementing the same?
Why do only classes from that package have to be used?
Regards,
Venkateshprasanna