Well Philip... bad news. I should have thought of this before... I think
the query parser is the problem. You are tokenizing "all in the quotes" to
one token... but when QueryParser sees that, it doesn't matter what
analyzer you use, it's going to see the quotes and strip them right
off. Then it pas
I am out of ideas. If I'm feeling perky I'll build you one in the morning.
No, I've never used Luke. Is there an easy way to examine my RAMDirectory
index? I can create the index with no quoted keywords, and when I search
for a keyword, I get back the expected results (just can't search for a
phrase that has whitespace in it). If I create the index with phrases in
quo
OK, I've gotta ask. Have you examined your index with Luke to see if what
you *think* is in the index actually *is*???
Erick
On 9/1/06, Philip Brown <[EMAIL PROTECTED]> wrote:
Interesting...just ran a test where I put double quotes around everything
(including single keywords) of source text and then ran searches for a known
keyword with and without double quotes -- doesn't find either time.
Mark Miller-5 wrote:
>
> Sorry to hear you're having trouble. You indeed nee
Added the to the other section and reran JavaCC and imported the
new files... but I still get the same result -- no results. (Quotes are in
the source text and query string.) Anything else I might be missing?
Philip
Mark Miller-5 wrote:
>
> Sorry to hear you're having trouble. You indee
Sorry to hear you're having trouble. You indeed need the double quotes in
the source text. You will also need them in the query string. Make sure they
are in both places. My machine is hosed right now or I would do it for you
real quick. My guess is that I forgot to mention... not only do you need t
Well, I tried that, and it doesn't seem to work still. I would be happy to
zip up the new files, so you can see what I'm using -- maybe you can get it
to work. The first time, I tried building the documents without quotes
surrounding each phrase. Then, I retried by enclosing every phrase within
On Friday 01 September 2006 19:46, Mark Miller wrote:
> Eric also gave me the idea of using a SpanNear with maximum slop as a
> boolean to connect spans. Using this and SpanOr seems to make my time spent
> on the distribution of proximity clauses a little foolish :) Is that true?
There is practice
That is a good point. I was just thinking that it would be a pain for
searchers to have to include the quotes when searching, but I guess
there is little way around it. The best you could do is have an option
that specified a quoted search...and you might as well make that option
be to put the
Thanks, but I don't "think" I need that. But curious, how will it know it's
a phrase if it's not enclosed in quotes? Won't all its terms be treated
separately then?
Philip
Mark Miller-5 wrote:
>
> One more tip...if you would like to be able to search phrases without
> putting in the quotes
: But why do I have to retrieve at least 1 document when I'm using the TopDocs?
: (If I set nDoc to 0 it will throw an exception).
i didn't say you had to, i just said "maybe" ... i don't know what the
behavior is if you use 0 -- ideally it would work fine, but in practice i
don't know if anyone ha
Eric also gave me the idea of using a SpanNear with maximum slop as a
boolean to connect spans. Using this and SpanOr seems to make my time spent
on the distribution of proximity clauses a little foolish :) Is that true?
Is there any disadvantage to the max slop Spannear, SpanOr solution? Any
adva
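The idea being discussed -- that a SpanNear with maximum slop degenerates into a boolean AND, while SpanOr is just the union of its clauses' matches -- can be sketched without Lucene at all. The following is a hypothetical plain-Java illustration (the class and method names are my own, not the Lucene API), modelling each span match as a [start, end) interval of term positions:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not the Lucene API): span matches as [start, end)
// term-position intervals, to show why a SpanNear with a huge slop behaves
// like a boolean AND, and why SpanOr is the union of its clauses' matches.
public class SpanSketch {

    // True when two matches are within `slop` positions of each other.
    // With slop = Integer.MAX_VALUE the distance test always passes, so
    // the pair matches whenever both clauses match at all -- an AND.
    static boolean nearWithSlop(int[] a, int[] b, int slop) {
        int gap = Math.max(a[0], b[0]) - Math.min(a[1], b[1]);
        return gap <= slop;
    }

    // SpanOr analogue: simply the union of the clauses' match lists.
    static List<int[]> or(List<int[]> a, List<int[]> b) {
        List<int[]> all = new ArrayList<>(a);
        all.addAll(b);
        return all;
    }

    public static void main(String[] args) {
        int[] a = {2, 3};    // a one-term match at position 2
        int[] b = {40, 41};  // a one-term match at position 40
        System.out.println(nearWithSlop(a, b, 1));                  // false: 37 positions apart
        System.out.println(nearWithSlop(a, b, Integer.MAX_VALUE));  // true: acts as boolean AND
    }
}
```

In real Lucene code the same shape would be a SpanNearQuery with a very large slop (and unordered clauses) wrapping the sub-spans, or a SpanOrQuery for the union.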
Thanks for the tip Paul. It is embarrassing, but I only realized how OrSpan
queries worked a day or two ago based on a tip from Eric. The way I assumed
it would create the spans before was just wrong and I never had researched
further. Now I see that it would be a nice optimization for what I
have
On Friday 01 September 2006 12:54, Mark Miller wrote:
> Hi Paul,
>
> I also have to treat things differently depending on if I am in a
> proximity clause or boolean clause. A wildcard in a boolean is mapped to
> a wildcard query. A wildcard in a proximity is mapped to a regex span
> that has b
Collect searched results in your own HitCollector and return results
however you like.
:)
Jelda
> -Original Message-
> From: Rupinder Singh Mazara [mailto:[EMAIL PROTECTED]
> Sent: Friday, September 01, 2006 5:13 PM
> To: java-user@lucene.apache.org
> Subject: retrieving LowestDoc
>
>
hi all
the search implementation that i have requires not the top 1000
documents but the lowest 1000 documents to be returned.
I do not want to store the entire result set in memory and walk to
the last 1000 -- is there any implementation / suggestion on how to
achieve this?
thanks
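One way to sketch the "lowest N without buffering everything" idea: keep a bounded max-heap of size N, so the largest retained entry is always the next to be evicted and memory stays O(N). This is a hypothetical plain-Java illustration (the class name and the choice of ordering by doc id are my assumptions, not a Lucene class):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: a collector that keeps only the N "lowest" documents
// without buffering the whole result set. A bounded max-heap holds the
// current candidates; anything larger than the heap's top is discarded
// immediately, so memory stays O(N) no matter how many hits there are.
public class LowestNCollector {
    private final int n;
    // Max-heap: the largest retained id sits on top, ready to be evicted.
    private final PriorityQueue<Integer> heap =
            new PriorityQueue<>(Comparator.reverseOrder());

    public LowestNCollector(int n) { this.n = n; }

    // Would be called once per hit, e.g. from HitCollector.collect(doc, score).
    public void collect(int docId) {
        if (heap.size() < n) {
            heap.add(docId);
        } else if (docId < heap.peek()) {
            heap.poll();      // evict the current largest candidate
            heap.add(docId);
        }
    }

    // The surviving ids, in ascending order.
    public List<Integer> lowest() {
        List<Integer> out = new ArrayList<>(heap);
        Collections.sort(out);
        return out;
    }
}
```

The same structure works if "lowest" means lowest score rather than lowest doc id; only the comparison in collect() changes.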
Erick Erickson wrote:
OK, a not very helpful answer, but "of course they're slower, they do
more work" (the span versions). But that's fairly useless, since the
question is really "is it enough slower in my situation that I need to
find an alternative?". And the only way I know of to answer tha
One more tip... if you would like to be able to search phrases without
putting in the quotes you must strip them with the analyzer. In
StandardFilter (in the standard analyzer code) add this:
private static final String QUOTED_TYPE = tokenImage[QUOTED];
- you'll see where to put that
and you'll s
OK, a not very helpful answer, but "of course they're slower, they do more
work" (the span versions). But that's fairly useless, since the question is
really "is it enough slower in my situation that I need to find an
alternative?". And the only way I know of to answer that question is to make
som
So this will recognize anything in quotes as a single token, and '_' and
'-' will not break up words. There may be some repercussions for the NUM
token, but nothing I'd worry about. Maybe you want to use Unicode for '-'
and '_' as well... I wouldn't worry about it myself.
- Mark
TOKEN : {
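The behavior the modified grammar is after can be sketched in plain Java (this is a hypothetical illustration of the tokenization rules, not the JavaCC grammar itself): anything inside double quotes comes out as a single token, unquoted text splits on whitespace, and hyphens or underscores inside a word are left alone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the desired tokenization: a quoted span becomes
// one token, unquoted runs split on whitespace, '-' and '_' do not break
// up a word. Not the JavaCC-generated tokenizer, just the same rules.
public class QuoteAwareTokenizer {
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            char ch = text.charAt(i);
            if (Character.isWhitespace(ch)) {
                i++;                                       // skip separators
            } else if (ch == '"') {
                int close = text.indexOf('"', i + 1);
                if (close < 0) close = text.length();      // unterminated quote
                tokens.add(text.substring(i + 1, close));  // quoted span = one token
                i = close + 1;
            } else {
                int j = i;                                 // plain word: run to
                while (j < text.length()                   // whitespace or quote
                        && !Character.isWhitespace(text.charAt(j))
                        && text.charAt(j) != '"') j++;
                tokens.add(text.substring(i, j));
                i = j;
            }
        }
        return tokens;
    }
}
```

For example, tokenize("foo \"bar baz\" qux-1") yields the three tokens foo, bar baz, and qux-1.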
Do you mean StandardTokenizer.jj (org.apache.lucene.analysis.standard)? I'm
not seeing StandardAnalyzer.jj in the Lucene source download.
Mark Miller-5 wrote:
>
> Philip Brown wrote:
>> Hi,
>>
>
Hi Andrzej,
Thanks for the tip, it does what I want. You are right, though, it's of limited
use for helping the user access data. But I'm sure it will come in handy for my
own analysis.
Best,
Ariel
-Original Message-
From: Andrzej Bialecki [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 31
Erick Erickson wrote:
Let me chime in here on a different note: before you get happy with
wildcard queries, take a look at the thread "I just don't get wildcards
at all". There is lots of good info that Erik, Chris and Otis provided
me. The danger with PrefixQuery and WildcardQuery is that
Philip Brown wrote:
Hi,
After running some tests using the StandardAnalyzer, and getting 0 results
from the search, I believe I need a special Tokenizer/Analyzer. Does
anybody have something that parses like the following:
- doesn't parse apart phrases (in quotes)
- doesn't parse/separate hyph
Yes, I am sure only one writer at a time is accessing the index.
No, I am not getting any other exception,
and there is no problem with disk space either.
Right now I have a backup copy of the indexes, so whenever one index gets
corrupted I'm replacing it with the backup one and starting the indexer
again from that durat
You probably forgot to close an IndexWriter?
Well, I wish it were that easy...I open one IndexWriter to write the
documents to the index after it is created, and then call writer.optimize()
and writer.close(). Your suggestion is a good one in that, from what I've
read, the writer needs to be clo
Paul Elschot wrote:
Mark,
On Thursday 31 August 2006 23:18, Mark Miller wrote:
I am not a huge fan of the queryparser's syntax so I have started an
open source project to create a viable alternative. I could really use
some help testing it out. The more I can get it tested the better
ch
On Thu, 2006-08-31 at 19:34 -0700, Philip Brown wrote:
karl wettin-3 wrote:
> >
> > On Thu, 2006-08-31 at 15:24 -0700, Philip Brown wrote:
> >>
> >> I'm getting the following error trying to instantiate an IndexModifier
> >> on a RAMDirectory index:
> >>
> >> java.io.IOException: Lock obtain tim
Hi
I am trying to build an application which uses JMS objects and Lucene. I
am creating Lucene Documents and sending them through JMS objects to a
queue (I am using IBM MQ Series). There is a listener which listens to
this queue and indexes these documents. The problem I am facing is there
i
Thx Hoss.
But why do I have to retrieve at least 1 document when I'm using the TopDocs?
(If I set nDoc to 0 it will throw an exception).
/Marcus
-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED]
Sent: August 31, 2006 20:09
To: java-user@lucene.apache.org
Mark,
On Thursday 31 August 2006 23:18, Mark Miller wrote:
> I am not a huge fan of the queryparser's syntax so I have started an
> open source project to create a viable alternative. I could really use
> some help testing it out. The more I can get it tested the better
> chance it has of se