Re: Phrase search with ComplexPhraseQueryParser/SpanQueryParser.

2014-03-06 Thread Modassar Ather
Hi Ahmet, As per your suggestion I have posted the request with example on Lucene-5205 jira ticket. Thanks, Modassar On Wed, Mar 5, 2014 at 8:44 PM, Ahmet Arslan wrote: > Hi Modassar, > > Can you post your request (with an example if possible) to lucene-5205 > jira ticket too? If you don't ha

Re: Phrase search with ComplexPhraseQueryParser/SpanQueryParser.

2014-03-05 Thread Ahmet Arslan
Hi Modassar, Can you post your request (with an example if possible) to lucene-5205 jira ticket too? If you don't have a jira account, anyone can create one. Thanks, Ahmet On Wednesday, March 5, 2014 9:40 AM, Modassar Ather wrote: Hi, Phrases with stop words in them are not getting searc

Re: phrase search in a particular case

2010-06-19 Thread Lance Norskog
SpanFirstQuery is the clean option. Another option is to add a "start token" to each title. Then, search for "startToken oil spill". This will be faster than SpanFirstQuery. But it also requires doing something weird to the field. Lance On Thu, Jun 17, 2010 at 3:19 PM, Michael McCandless wrote:
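The "start token" idea above can be illustrated without Lucene at all. The following is only a minimal sketch in plain Java, not Lucene code: the sentinel `_start_`, the class name and the helper names are all hypothetical, and whitespace splitting plus lowercasing stands in for a real analyzer. It shows why a phrase search for "startToken oil spill" only matches titles that begin with the phrase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model of the "start token" trick (hypothetical names, no Lucene involved).
class StartTokenSketch {
    // Assumed sentinel; any token that cannot occur in real titles would do.
    static final String START = "_start_";

    // Stand-in for analysis: lowercase, split on whitespace, prepend the sentinel.
    static List<String> tokensWithStart(String text) {
        List<String> tokens = new ArrayList<>();
        tokens.add(START);
        tokens.addAll(Arrays.asList(text.trim().toLowerCase().split("\\s+")));
        return tokens;
    }

    // An exact phrase match is modeled as "the phrase tokens occur consecutively".
    static boolean containsPhrase(List<String> docTokens, List<String> phraseTokens) {
        return Collections.indexOfSubList(docTokens, phraseTokens) >= 0;
    }

    // Searching for "_start_ oil spill" can only match at the very beginning of a title.
    static boolean startsWithPhrase(String title, String query) {
        return containsPhrase(tokensWithStart(title), tokensWithStart(query));
    }

    public static void main(String[] args) {
        System.out.println(startsWithPhrase("Oil spill cleanup continues", "Oil spill"));
        System.out.println(startsWithPhrase("Gulf oil spill widens", "Oil spill"));
    }
}
```

The trade-off Lance mentions is visible here: the match is an ordinary phrase lookup, but every indexed title now carries an extra artificial token.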

Re: phrase search in a particular case

2010-06-17 Thread Michael McCandless
SpanFirstQuery? Mike On Thu, Jun 17, 2010 at 3:23 PM, rakesh rakesh wrote: > Hi, > > I have thousands of article titles in lucene index. So for a query "Oil > spill" I want to return all the article titles that start with "Oil spill". I do > not want those titles which have this phrase but do not star

RE: Phrase search on NOT_ANALYZED content

2010-03-04 Thread Murdoch, Paul
, March 04, 2010 8:54 AM To: java-user@lucene.apache.org Subject: Re: Phrase search on NOT_ANALYZED content I'm still struggling with your overall goal here, but... It sounds like what you're looking for is an exact match in some cases but not others? In which case you could think about in

Re: Phrase search on NOT_ANALYZED content

2010-03-04 Thread Erick Erickson
Message- > From: java-user-return-45278-paul.b.murdoch=saic@lucene.apache.org > [mailto:java-user-return-45278-paul.b.murdoch=saic@lucene.apache.org > ] On Behalf Of Erick Erickson > Sent: Wednesday, March 03, 2010 4:30 PM > To: java-user@lucene.apache.org > Subject

RE: Phrase search on NOT_ANALYZED content

2010-03-04 Thread Murdoch, Paul
em. Thanks, Paul -Original Message- From: java-user-return-45278-paul.b.murdoch=saic@lucene.apache.org [mailto:java-user-return-45278-paul.b.murdoch=saic@lucene.apache.org ] On Behalf Of Erick Erickson Sent: Wednesday, March 03, 2010 4:30 PM To: java-user@lucene.apache

Re: Phrase search on NOT_ANALYZED content

2010-03-03 Thread Erick Erickson
NOT_ANALYZED is probably not what you want. NOT_ANALYZED stores the entire input as a *single* token, so you can never match on anything except the entire input. What did you hope to accomplish by indexing NOT_ANALYZED? That's actually a pretty specialized thing to do, perhaps there's a better way

RE: Phrase Search and NOT_ANALYZED

2010-02-25 Thread Murdoch, Paul
ssage- From: java-user-return-45156-paul.b.murdoch=saic@lucene.apache.org [mailto:java-user-return-45156-paul.b.murdoch=saic@lucene.apache.org] On Behalf Of Murdoch, Paul Sent: Wednesday, February 24, 2010 5:11 PM To: java-user@lucene.apache.org Subject: RE: Phrase Search and NOT_ANA

RE: Phrase Search and NOT_ANALYZED

2010-02-24 Thread Murdoch, Paul
:01 PM To: java-user@lucene.apache.org Subject: RE: Phrase Search and NOT_ANALYZED Thanks, I've been looking at that one too. I'm trying to make it happen with the StandardAnalyzer. Unfortunately, I think I see some redesign for more robustness in the future. Cheers, Paul ---

RE: Phrase Search and NOT_ANALYZED

2010-02-24 Thread Murdoch, Paul
apache.org [mailto:java-user-return-45154-paul.b.murdoch=saic@lucene.apache.org] On Behalf Of Robert Muir Sent: Wednesday, February 24, 2010 4:55 PM To: java-user@lucene.apache.org Subject: Re: Phrase Search and NOT_ANALYZED check out KeywordAnalyzer! On Wed, Feb 24, 2010 at 4:51 PM, Mur

Re: Phrase Search and NOT_ANALYZED

2010-02-24 Thread Robert Muir
ead. > > Thanks, > > Paul > > > -Original Message- > From: java-user-return-45149-paul.b.murdoch=saic@lucene.apache.org > [mailto:java-user-return-45149-paul.b.murdoch=saic@lucene.apache.org > ] On Behalf Of Erick Erickson > Sent: Wednesday, February 24, 20

RE: Phrase Search and NOT_ANALYZED

2010-02-24 Thread Murdoch, Paul
=saic@lucene.apache.org ] On Behalf Of Digy Sent: Wednesday, February 24, 2010 4:45 PM To: java-user@lucene.apache.org Subject: RE: Phrase Search and NOT_ANALYZED Since it is not analyzed, your text is stored as a single term in the index [something in the index]. But the query name:"someth

RE: Phrase Search and NOT_ANALYZED

2010-02-24 Thread Murdoch, Paul
aul.b.murdoch=saic@lucene.apache.org ] On Behalf Of Erick Erickson Sent: Wednesday, February 24, 2010 4:23 PM To: java-user@lucene.apache.org Subject: Re: Phrase Search and NOT_ANALYZED What does Luke's explain show you? That'll show you a lot about how the query gets transformed

RE: Phrase Search and NOT_ANALYZED

2010-02-24 Thread Digy
Since it is not analyzed, your text is stored as a single term in the index [something in the index]. But the query name:"something in the index" is translated as: find 4 consecutive terms which have the values "something", "in", "the" and "index" respectively. or if stop words are removed
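The mismatch Digy describes can be sketched with a toy model. This is plain Java, not Lucene: the class and method names are hypothetical, and the "analyzer" is just lowercasing plus whitespace splitting with no stop-word handling. The point is only that a field stored as one term cannot satisfy a query that looks for four consecutive terms.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model of NOT_ANALYZED indexing versus an analyzed phrase query (hypothetical names).
class NotAnalyzedSketch {
    // NOT_ANALYZED: the whole field value becomes exactly one term.
    static List<String> indexNotAnalyzed(String value) {
        return Collections.singletonList(value);
    }

    // Stand-in for an analyzer: lowercase and split on whitespace.
    static List<String> analyze(String text) {
        return Arrays.asList(text.trim().toLowerCase().split("\\s+"));
    }

    // A phrase query is modeled as "these analyzed terms occur consecutively".
    static boolean phraseMatches(List<String> indexedTerms, String phrase) {
        return Collections.indexOfSubList(indexedTerms, analyze(phrase)) >= 0;
    }

    // A single-term query is modeled as an exact lookup of one term.
    static boolean termMatches(List<String> indexedTerms, String term) {
        return indexedTerms.contains(term);
    }

    public static void main(String[] args) {
        List<String> single = indexNotAnalyzed("something in the index");
        System.out.println(phraseMatches(single, "something in the index")); // phrase query misses
        System.out.println(termMatches(single, "something in the index"));   // whole-value term hits
    }
}
```

In this model the only query that finds the unanalyzed value is one that supplies the entire value as a single term, which is the behavior the replies in this thread point at.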

Re: Phrase Search and NOT_ANALYZED

2010-02-24 Thread Erick Erickson
What does Luke's explain show you? That'll show you a lot about how the query gets transformed. My first guess is that stop words are messing you up. Erick On Wed, Feb 24, 2010 at 3:51 PM, Murdoch, Paul wrote: > Hi, > > > > I'm indexing a field using the StandardAnalyzer 2.9. > > > > fi

Re: Phrase search

2009-06-11 Thread Savvas-Andreas Moysidis
Hello, You could use a PhraseQuery with the terms "cool" and "gaming" and "computer" and set the slop factor you reckon is right. Then you could assign a boost to this query only, which will make it bubble up the list. I don't think you can get away without specifying a slop factor though (like in the
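To get a feel for what a slop factor has to absorb, here is a simplified model in plain Java. It is an assumption-laden sketch, not Lucene's sloppy phrase scorer: the class and method names are hypothetical, each term is assumed to occur at most once per document, and analysis is reduced to lowercasing and whitespace splitting. Each term's document position is shifted by its offset in the query, and the spread of the shifted positions is taken as the slop needed. The example queries use "laptop" rather than "computer" so that every term is present in the indexed strings.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of the slop a phrase would need (hypothetical names, not Lucene's algorithm).
class SlopSketch {
    // Returns the spread of (document position - query offset) over all query terms,
    // or -1 if some query term does not occur in the document at all.
    static int requiredSlop(String doc, String query) {
        List<String> docTokens = Arrays.asList(doc.trim().toLowerCase().split("\\s+"));
        String[] queryTerms = query.trim().toLowerCase().split("\\s+");
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < queryTerms.length; i++) {
            int pos = docTokens.indexOf(queryTerms[i]); // first occurrence only (assumption)
            if (pos < 0) {
                return -1; // a missing term cannot be fixed by any slop in this model
            }
            int shifted = pos - i;
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        return max - min;
    }

    public static void main(String[] args) {
        System.out.println(requiredSlop("cool gaming laptop", "cool gaming laptop"));
        System.out.println(requiredSlop("gaming laptop cool", "cool gaming laptop"));
        System.out.println(requiredSlop("cool gaming laptop", "cool gaming computer"));
    }
}
```

Under this model an exact phrase needs slop 0 and a swap of two adjacent terms needs slop 2, which agrees with the PhraseQuery documentation's note that reordering two words takes two moves. A term that is missing entirely, such as "computer" in the original question, is not something slop can compensate for here.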

Re: Phrase search

2009-06-10 Thread Daniel Noll
On Fri, Jun 5, 2009 at 21:31, Abhi wrote: > Say I have indexed the following strings: > > 1. "cool gaming laptop" > 2. "cool gaming lappy" > 3. "gaming laptop cool" > > Now when I search with a query say "cool gaming computer", I want string 1 > and 2 to appear on top (where search terms are closer

Re: phrase search with custom TokenFilter

2008-03-18 Thread Chris Hostetter
You're going to want to change your TokenFilter so that it emits the split pieces tokens immediately after the original token and with a positionIncrement of "0" .. don't buffer them up and wait for the entire stream to finish first. it true order of the tokens in the tokenstream and the posit
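The advice above can be pictured with a toy token stream. This is a plain-Java sketch with hypothetical names (`Tok`, `splitFilter`, `positions`), not a real Lucene TokenFilter: the original token is emitted with a position increment of 1, its split pieces follow immediately with an increment of 0, and absolute positions are then derived by summing the increments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of emitting split pieces right after the original token with increment 0.
class PositionIncrementSketch {
    // Minimal token: text plus position increment (hypothetical type, not Lucene's).
    static final class Tok {
        final String text;
        final int posInc;

        Tok(String text, int posInc) {
            this.text = text;
            this.posInc = posInc;
        }
    }

    // Emit each original token (increment 1), then its pieces immediately (increment 0).
    static List<Tok> splitFilter(List<String> input, String delimiterRegex) {
        List<Tok> out = new ArrayList<>();
        for (String term : input) {
            out.add(new Tok(term, 1));
            String[] pieces = term.split(delimiterRegex);
            if (pieces.length > 1) {
                for (String piece : pieces) {
                    if (!piece.isEmpty()) {
                        out.add(new Tok(piece, 0)); // stacked on the original's position
                    }
                }
            }
        }
        return out;
    }

    // Absolute positions: start before the first token and add each increment in turn.
    static List<Integer> positions(List<Tok> tokens) {
        List<Integer> result = new ArrayList<>();
        int pos = -1;
        for (Tok t : tokens) {
            pos += t.posInc;
            result.add(pos);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Tok> toks = splitFilter(Arrays.asList("wi-fi", "router"), "-");
        System.out.println(positions(toks));
    }
}
```

Because the pieces are emitted in stream order rather than buffered to the end, the token that follows the original ("router" here) still lands on the next position, so phrases spanning it keep working in this model.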

Re: Phrase Search not returning results

2007-08-23 Thread Spencer Tickner
Hi Grant, Thanks for the advice.. It turns out it was all my own stupidity, I had commented out (for whatever reason) setPositionIncrement(0) on my synonym analyzer.. Cheers, Spence On 8/23/07, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > I would suggest starting with: http://wiki.apache.org/l

Re: Phrase Search not returning results

2007-08-23 Thread Grant Ingersoll
I would suggest starting with: http://wiki.apache.org/lucene-java/LuceneFAQ#head-3558e5121806fb4fce80fc022d889484a9248b71 especially the part on Luke. Luke will let you try out the various queries and show you what they look like before being submitted. Cheers, Grant On Aug 23, 2007, at

Re: Phrase Search

2007-06-18 Thread Laxmilal Menaria
Ok, thanks. I have tried to index the address field as UN_TOKENIZED and search using the above query, but it returns nothing. How can I specify "NOT tokenize" in the query? --Thanks, On 6/18/07, Erick Erickson <[EMAIL PROTECTED]> wrote: Phrase queries won't help you here. Your particular issue can be

Re: Phrase Search

2007-06-18 Thread Chris Hostetter
: Another good old trick is to index field values (tokenized) with : appended special starting and ending tokens, e.g. instead of "Hiran : Magri" use "_start_ Hiran Magri _end_". Then you can query for fields : that are exactly equal to a phrase, while still retaining the : possibility to search b

Re: Phrase Search

2007-06-18 Thread Andrzej Bialecki
Erick Erickson wrote: Phrase queries won't help you here. Your particular issue can be addressed, but I'm not sure it's a reasonable long-term solution. If you indexed your address field as UN_TOKENIZED, and did NOT tokenize your query, it should give you what you want. What's happening i

Re: Phrase Search

2007-06-18 Thread Erick Erickson
Phrase queries won't help you here. Your particular issue can be addressed, but I'm not sure it's a reasonable long-term solution. If you indexed your address field as UN_TOKENIZED, and did NOT tokenize your query, it should give you what you want. What's happening is that StandardAnalyzer

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Chris Hostetter
: Sorry for the confusion and thanks for taking the time to educate me. So, if : I am just indexing literal values, what is the best way to do that (what : analyzer)? Sounds like this approach, even though it works, is not the : preferred method. if you truly want just the literal values then

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Philip Brown
D it will still work. > > > do you have an example of something that *isn't* working the way you want? > ... if not i don't see what your problem is, all your tests are passing :) > > > : Date: Tue, 5 Sep 2006 14:06:13 -0700 (PDT) > : From: Philip Brown <[EMAIL

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Chris Hostetter
rom: Philip Brown <[EMAIL PROTECTED]> : Reply-To: java-user@lucene.apache.org : To: java-user@lucene.apache.org : Subject: Re: Phrase search using quotes -- special Tokenizer : : : Here's a little sample program (borrowed some code from Erick Erickson :)). : Whether I add as TOKENIZED or UN_

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Philip Brown
Here's a little sample program (borrowed some code from Erick Erickson :)). Whether I add as TOKENIZED or UN_TOKENIZED seems to make no difference in the output. Is this what you'd expect? - Philip package com.test; import java.io.IOException; import java.util.HashSet; import java.util.regex.

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Mark Miller
Some info to help you on your journey :) 1. If you add a field as untokenized then it will not be analyzed when added to the index. However, QueryParser will not know that this happened and will tokenize queries on that field. 2. The solution that Hoss has explained to you is to leave the defa

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Chris Hostetter
: So, if I do as you suggest below (using PerFieldAnalyzerWrapper with : StandardAnalyzer) then I still need to enclose in quotes the phrases : (keywords with spaces) when I issue the search, and they are only returned Yes, quotes will be necessary to tell the QueryParser "this is one chunk of t

Re: Phrase search using quotes -- special Tokenizer

2006-09-04 Thread Philip Brown
So, if I do as you suggest below (using PerFieldAnalyzerWrapper with StandardAnalyzer) then I still need to enclose in quotes the phrases (keywords with spaces) when I issue the search, and they are only returned in the results if the case is identical to how it was added? (This seems to be what

Re: Phrase search using quotes -- special Tokenizer

2006-09-04 Thread Chris Hostetter
: Yeah, they are more complex than the "exactish" match -- basically, there are : more fields involved -- combined sometimes with AND and sometimes with OR, : and sometimes negated field values, sometimes groupings, etc. These other : field values are all single words (no spaces), and a search mi

Re: Phrase search using quotes -- special Tokenizer

2006-09-04 Thread Mark Miller
More to consider: perhaps there is some way to get what you want by overriding getFieldQuery(String, String) instead. I have not been able to come up with anything simple off the top of my head, but overriding getFieldQuery would free you from having to make that line change on every Lucene up

Re: Phrase search using quotes -- special Tokenizer

2006-09-04 Thread Philip Brown
Yeah, they are more complex than the "exactish" match -- basically, there are more fields involved -- combined sometimes with AND and sometimes with OR, and sometimes negated field values, sometimes groupings, etc. These other field values are all single words (no spaces), and a search might invo

Re: Phrase search using quotes -- special Tokenizer

2006-09-04 Thread Mark Miller
Keeping in mind that Hoss's input is much more valuable than mine... It sounds like you want what I originally gave you. You want to be able to perform complex queries with the QueryParser and you want '-' and '_' to not break words, and you want quoted words to be tokenized as one token with

Re: Phrase search using quotes -- special Tokenizer

2006-09-03 Thread Chris Hostetter
: Thanks for your input. I'm sure I could do as you suggest (and maybe that : will end up being my best option), but I had hoped to use a string for : creating the query object, particularly as some of my queries are a bit : complex. you have to clarify what you mean by "use a string for creatin

Re: Phrase search using quotes -- special Tokenizer

2006-09-03 Thread Philip Brown
Thanks for your input. I'm sure I could do as you suggest (and maybe that will end up being my best option), but I had hoped to use a string for creating the query object, particularly as some of my queries are a bit complex. Thanks. Chris Hostetter wrote: > > > I haven't really been followi

Re: Phrase search using quotes -- special Tokenizer

2006-09-03 Thread Erick Erickson
Yeah, what he said. On 9/3/06, Chris Hostetter <[EMAIL PROTECTED]> wrote: I haven't really been following this thread, but it's gotten so long i got interested. from what i can tell skimming the discussion so far, it seems like the biggest confusion is about the definition of a "phrase" a

Re: Phrase search using quotes -- special Tokenizer

2006-09-03 Thread Chris Hostetter
I haven't really been following this thread, but it's gotten so long i got interested. from what i can tell skimming the discussion so far, it seems like the biggest confusion is about the definition of a "phrase" and what analyzers do with "quote" characters and what the QueryParser does with "q

Re: Phrase search using quotes -- special Tokenizer

2006-09-03 Thread Philip Brown
Just as you, I would PREFER not to change any of the base Lucene code -- and I imagine there is still some way to do what I want (possibly by extending some other existing class) with what is already available. Regarding point 0) -- You are right in that if I add "test phrase" to index as UN_TO

Re: Phrase search using quotes -- special Tokenizer

2006-09-03 Thread Erick Erickson
Disclaimer: Of course I'm not as familiar with your problem space as you are, so I may be way out in left field, but... I *still* think you're making waay too much work for yourself and need to examine your assumptions. 0> But when you index something UN_TOKENIZED as in your example, I don't

Re: Phrase search using quotes -- special Tokenizer

2006-09-02 Thread Philip Brown
I tend to agree with Mark. I tried a query as so... TermQuery query = new TermQuery(new Term("keywordField", "phrase test")); IndexSearcher searcher = new IndexSearcher(activeIdx); Hits hits = searcher.search(query); And this produced the expected results. Whe

Re: Phrase search using quotes -- special Tokenizer

2006-09-02 Thread Mark Miller
I think if he wants to use the queryparser to parse his search strings that he has no choice but to modify it. It will eat any pair of quotes going through it no matter what analyzer is used. - Mark Well, you're flying blind. Is the behavior rooted in the indexing or querying? Since you can't

Re: Phrase search using quotes -- special Tokenizer

2006-09-02 Thread Erick Erickson
Well, you're flying blind. Is the behavior rooted in the indexing or querying? Since you can't answer that, you're reduced to trying random things hoping that one of them works. A little like voodoo. I've wasted far too much time trying to solve what I was *sure* was the problem only to f

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Mark Miller
Well Philip...bad news. I should have thought of this before...I think the query parser is the problem. You are tokenizing "all in the quotes" to one token...but when QueryParser sees that, it doesn't matter what analyzer you use, it's going to see the quotes and strip them right off. Then it pas

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Mark Miller
I am out of ideas. If I'm feeling perky I'll build you one in the morning. No, I've never used Luke. Is there an easy way to examine my RAMDirectory index? I can create the index with no quoted keywords, and when I search for a keyword, I get back the expected results (just can't search for a p

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Philip Brown
No, I've never used Luke. Is there an easy way to examine my RAMDirectory index? I can create the index with no quoted keywords, and when I search for a keyword, I get back the expected results (just can't search for a phrase that has whitespace in it). If I create the index with phrases in quo

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Erick Erickson
OK, I've gotta ask. Have you examined your index with Luke to see if what you *think* is in the index actually *is*??? Erick On 9/1/06, Philip Brown <[EMAIL PROTECTED]> wrote: Interesting...just ran a test where I put double quotes around everything (including single keywords) of source text

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Philip Brown
Interesting...just ran a test where I put double quotes around everything (including single keywords) of source text and then ran searches for a known keyword with and without double quotes -- doesn't find either time. Mark Miller-5 wrote: > > Sorry to hear you're having trouble. You indeed nee

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Philip Brown
Added the to the other section and reran the javacc and imported the new files...but, I still get the same result -- no results. (Quotes are in the source text and query string.) Anything else I might be missing? Philip Mark Miller-5 wrote: > > Sorry to hear you're having trouble. You indee

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Mark Miller
Sorry to hear you're having trouble. You indeed need the double quotes in the source text. You will also need them in the query string. Make sure they are in both places. My machine is hosed right now or I would do it for you real quick. My guess is that I forgot to mention...not only do you need t

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Philip Brown
Well, I tried that, and it doesn't seem to work still. I would be happy to zip up the new files, so you can see what I'm using -- maybe you can get it to work. The first time, I tried building the documents without quotes surrounding each phrase. Then, I retried by enclosing every phrase within

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Mark Miller
That is a good point. I was just thinking that it would be a pain for searchers to have to include the quotes when searching, but I guess there is little way around it. The best you could do is have an option that specified a quoted search...and you might as well make that option be to put the

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Philip Brown
Thanks, but I don't "think" I need that. But curious, how will it know it's a phrase if it's not enclosed in quotes? Won't all its terms be treated separately then? Philip Mark Miller-5 wrote: > > One more tip...if you would like to be able to search phrases without > putting in the quotes

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Mark Miller
One more tip...if you would like to be able to search phrases without putting in the quotes you must strip them with the analyzer. In standardfilter (in the standard analyzer code) add this: private static final String QUOTED_TYPE = tokenImage[QUOTED]; - you'll see where to put that and you'll s

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Mark Miller
So this will recognize anything in quotes as a single token and '_' and '-' will not break up words. There may be some repercussions for the NUM token but nothing I'd worry about. maybe you want to use Unicode for '-' and '_' as well...I wouldn't worry about it myself. - Mark TOKEN : {

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Mark Miller
Philip Brown wrote: Do you mean StandardTokenizer.jj (org.apache.lucene.analysis.standard)? I'm not seeing StandardAnalyzer.jj in the Lucene source download. Mark Miller-5 wrote: Philip Brow

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Philip Brown
Do you mean StandardTokenizer.jj (org.apache.lucene.analysis.standard)? I'm not seeing StandardAnalyzer.jj in the Lucene source download. Mark Miller-5 wrote: > > Philip Brown wrote: >> Hi, >> >

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Mark Miller
Philip Brown wrote: Hi, After running some tests using the StandardAnalyzer, and getting 0 results from the search, I believe I need a special Tokenizer/Analyzer. Does anybody have something that parses like the following: - doesn't parse apart phrases (in quotes) - doesn't parse/separate hyph