Re: Basic searching doubt

2009-10-31 Thread Raf
If you search "A B" (with quotes), that is correct. If you search only A B (without quotes), it is not, because by default the query parser creates "OR" queries. So searching A B finds all documents that contain A or B, while searching only A or only B you will normally find le
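
The difference between the quoted and unquoted forms can be sketched with a toy in-memory index. This is only an illustration in Python, not Lucene code: the five documents and the helper names (term_hits, or_hits, phrase_hits) are made up for the example, and the unquoted case assumes the default OR operator described above.

```python
# Toy model of the two query forms; this is NOT Lucene code.
# The documents and helper names are hypothetical.
docs = {
    1: "a b",
    2: "b a",
    3: "a c",
    4: "b d",
    5: "c d",
}


def tokens(text):
    """Lowercase and split on whitespace, as a stand-in for an analyzer."""
    return text.lower().split()


def term_hits(term):
    """Ids of the documents that contain the single term."""
    return {doc_id for doc_id, text in docs.items() if term in tokens(text)}


def or_hits(*terms):
    """Unquoted query under a default OR operator: any term may match."""
    hits = set()
    for term in terms:
        hits |= term_hits(term)
    return hits


def phrase_hits(*terms):
    """Quoted query: the terms must appear adjacent and in order."""
    wanted = list(terms)
    n = len(wanted)
    hits = set()
    for doc_id, text in docs.items():
        toks = tokens(text)
        if any(toks[i:i + n] == wanted for i in range(len(toks) - n + 1)):
            hits.add(doc_id)
    return hits


print(sorted(phrase_hits("a", "b")))  # phrase "a b"
print(sorted(term_hits("a")))         # single term a
print(sorted(or_hits("a", "b")))      # unquoted a b (a OR b)
```

In this model every phrase hit is also a hit for each single term, so the single-term count can never be lower than the phrase count. The OR query goes the other way: it returns at least as many hits as either single term.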

Basic searching doubt

2009-10-30 Thread Hrishikesh Agashe
Hi, If I search for the string "A B" (i.e. A followed by a space followed by B) and I get 20 hits, is it correct to expect that if I search for "A" (i.e. only A), I will get at least 20 hits? Similarly, if I search for B, will I get 20 hits or more? --Hrishi

Re: Searching doubt

2009-08-04 Thread m.harig
Thanks all, but how does Nutch handle this problem? I am aware of Nutch, but not in depth. If I search the keyword "about us", Nutch gives me exactly what I want. Are there any scoring techniques? Please let me know. -- View this message in context: http://www.nabble.com/Searc

Re: Searching doubt

2009-08-04 Thread Phil Whelan
(sorry, tangent. I'll be quick) On Tue, Aug 4, 2009 at 8:42 AM, Shai Erera wrote: > Interesting ... I don't have access to a Japanese dictionary, so I just > extract bi-grams. Shai - if you're interested in parsing Japanese, check out Kakasi. It can split into words and convert Kanji->Katakana/Hi

Re: Searching doubt

2009-08-04 Thread Shai Erera
may hurt recall severely. Shai On Tue, Aug 4, 2009 at 7:34 PM, N Hira wrote: > > Good summary, Shai. > > I've missed some of this thread as well, but does anyone know what happened > to the suggestion about query manipulation? > > e.g., query (about us) => query("abo

Re: Searching doubt

2009-08-04 Thread N Hira
"creditcard") Regards, -h - Original Message From: Shai Erera To: java-user@lucene.apache.org Sent: Tuesday, August 4, 2009 10:31:46 AM Subject: Re: Searching doubt Hi Darren, The question was, how given a string "aboutus" in a document, you can return that document a

Re: Searching doubt

2009-08-04 Thread Matthew Hall
Well.. search on both anyhow. "about us" OR "aboutus" should hit the spot I think. Matt Ian Lea wrote: The question was, how given a string "aboutus" in a document, you can return that document as a result to the query "about us" (note the space). So we're mostly discussing how to detect and t
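
The "search on both" idea can be sketched as a small query-rewriting step. This is a Python illustration under assumptions, not anything from the thread: expand_phrase is a hypothetical helper, and it only builds a query string in the usual quoted-phrase and OR syntax. Whether the run-together form actually matches still depends on how the documents were analyzed at index time.

```python
def expand_phrase(query):
    """Return a query string that matches the phrase or its run-together form.

    Hypothetical helper: 'about us' becomes '"about us" OR "aboutus"'.
    A single-word query is returned unchanged.
    """
    terms = query.split()
    if len(terms) < 2:
        return query
    phrase = " ".join(terms)   # normalizes repeated whitespace
    joined = "".join(terms)    # the words glued together
    return f'"{phrase}" OR "{joined}"'


print(expand_phrase("about us"))
print(expand_phrase("credit cards"))
```

The rewriting is done on the user's input before parsing, so the index itself does not need to change.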

Re: Searching doubt

2009-08-04 Thread Ian Lea
> The question was, how given a string "aboutus" in a document, you can return > that document as a result to the query "about us" (note the space). So we're > mostly discussing how to detect and then break the word "aboutus" to two > words. I haven't really been following this thread so apologies

Re: Searching doubt

2009-08-04 Thread Shai Erera
Interesting ... I don't have access to a Japanese dictionary, so I just extract bi-grams. But I guess that in this case, if one can access an English dictionary (are you aware of an "open-source" one, or free one BTW?), one can use the method you mention. But still, doing this for every Token you

Re: Searching doubt

2009-08-04 Thread Phil Whelan
On Tue, Aug 4, 2009 at 8:31 AM, Shai Erera wrote: > Hi Darren, > > The question was, how given a string "aboutus" in a document, you can return > that document as a result to the query "about us" (note the space). So we're > mostly discussing how to detect and then break the word "aboutus" to two >

Re: Searching doubt

2009-08-04 Thread darren
Ah, ok. Interesting problem there as well. I'll think on that one some too! cheers. > Hi Darren, > > The question was, how given a string "aboutus" in a document, you can > return > that document as a result to the query "about us" (note the space). So > we're > mostly discussing how to detec

Re: Searching doubt

2009-08-04 Thread Shai Erera
Hi Darren, The question was, how given a string "aboutus" in a document, you can return that document as a result to the query "about us" (note the space). So we're mostly discussing how to detect and then break the word "aboutus" to two words. What you wrote though seems interesting as well, onl

Re: Searching doubt

2009-08-04 Thread darren
Just catching this thread, but if I understand what is being asked I can share how I do multi-word phrase matching. If that's not what's wanted, pardons! Ok, I load an entire dictionary into a lucene index, phrases and all. When I'm scanning some text, I do lookups in this dictionary index using

Re: Searching doubt

2009-08-04 Thread Phil Whelan
On Tue, Aug 4, 2009 at 3:56 AM, Shai Erera wrote: > 2) Use a dictionary (real dictionary), and search it for every substring, > e.g. "a", "ab", "abo" ... "about" etc. If you find a match, split it there. > This needs some fine tuning, like checking if the rest is also a word and if > the full strin
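
The dictionary approach quoted above can be sketched roughly as follows. This is a Python illustration, not Lucene code. The word list is a tiny made-up stand-in for a real dictionary, split_compound is a hypothetical name, and the rule of leaving a token alone when the whole string is already a word is an assumption about the "fine tuning" the quoted message mentions.

```python
# Hypothetical word list; a real dictionary would be far larger.
WORDS = {"about", "us", "credit", "card", "cards", "book", "mark", "marks"}


def split_compound(token, words=WORDS):
    """Return (head, tail) if token splits into two dictionary words, else None.

    Prefixes are tried from shortest to longest ("a", "ab", "abo", ...).
    A split is accepted only when the remainder is also a dictionary word.
    """
    if token in words:
        return None  # assumption: a token that is itself a word is not split
    for i in range(1, len(token)):
        head, tail = token[:i], token[i:]
        if head in words and tail in words:
            return head, tail
    return None


print(split_compound("aboutus"))
print(split_compound("creditcards"))
print(split_compound("about"))
```

Checking every prefix costs one lookup per character, which is the per-token overhead that comes up elsewhere in this thread.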

Re: Searching doubt

2009-08-04 Thread Shai Erera
'll index it. Is there any technique to use while indexing > ? am using lucene 2.4.0 version. Please suggest me. > -- > View this message in context: > http://www.nabble.com/Searching-doubt-tp24802552p24805609.html > Sent from the Lucene - Java Users mailing list archive at Nabble.com.

Re: Searching doubt

2009-08-04 Thread m.harig
ease suggest me. -- View this message in context: http://www.nabble.com/Searching-doubt-tp24802552p24805609.html

Re: Searching doubt

2009-08-04 Thread Shai Erera
> if i use this code for my search , it gives me the unwanted search hits , > > as i mentioned , if i search for "about us" , this is an example , > there may be more number of urls like this , for example , "credit cards" > , > "book marks" ,

Re: Searching doubt

2009-08-04 Thread m.harig
is an example; there may be more URLs like this, for example "credit cards" and "book marks". How do I handle it? -- View this message in context: http://www.nabble.com/Searching-doubt-tp24802552p24803560.html

Re: Searching doubt

2009-08-03 Thread Shai Erera
e out > of this. > > -- > View this message in context: > http://www.nabble.com/Searching-doubt-tp24802552p24803073.html

Re: Searching doubt

2009-08-03 Thread m.harig
(query); but it's returning 0 hits. Where am I going wrong? Please help me out with this. -- View this message in context: http://www.nabble.com/Searching-doubt-tp24802552p24803073.html

Re: Searching doubt

2009-08-03 Thread Shai Erera
http://www./aboutus/.xyz/ > >http://www./aboutus/.....def/ > > > > if i search "aboutus" , the results coming up correctly.

Re: Searching doubt

2009-08-03 Thread Anshum
if i search "aboutus" , the results coming up correctly. Please > any1 suggest me how to handle this situation. > -- > View this message in context: > http://www.nabble.com/Searching-doubt-tp24802552p24802552.ht