Highlighter issue for Exact Phrase search

2019-05-31 Thread Naaser
Hello, I am searching for an exact phrase, say "Jane Doe"; there are two instances of this in the text. My highlighter is only outputting the first instance and not the second one. Can someone please help me understand the issue and how to fix it? Any help would be highly appreciated. Part of my code is

Re: Exact Phrase Search returning in correct results

2014-06-11 Thread Scott Selvia
o be able to search stop words consider adding > CharArraySet.EMPTY_SET to the StandardAnalyzer's initializer. > > > > -Original Message- > From: Scott Selvia [mailto:ssel...@gmail.com] > Sent: Wednesday, June 11, 2014 12:48 PM > To: java-user@lucene.apache

RE: Exact Phrase Search returning in correct results

2014-06-11 Thread Allison, Timothy B.
's initializer. -Original Message- From: Scott Selvia [mailto:ssel...@gmail.com] Sent: Wednesday, June 11, 2014 12:48 PM To: java-user@lucene.apache.org Subject: Exact Phrase Search returning in correct results I'm having an issue searching for an exact phrase with Lucene 4.7. My

Exact Phrase Search returning in correct results

2014-06-11 Thread Scott Selvia
I’m having an issue searching for an exact phrase with Lucene 4.7. My use case loaded the Declaration of Independence into a Lucene search database. I search for “it becomes” and I get two hits; one for “it, becomes” and another for a line that just has “becomes” at the end of the line. Expec

Re: Phrase search with ComplexPhraseQueryParser/SpanQueryParser.

2014-03-06 Thread Modassar Ather
Hi Ahmet, As per your suggestion I have posted the request with example on Lucene-5205 jira ticket. Thanks, Modassar On Wed, Mar 5, 2014 at 8:44 PM, Ahmet Arslan wrote: > Hi Modassar, > > Can you post your request (with an example if possible) to lucene-5205 > jira ticket too? If you don't ha

Re: Phrase search with ComplexPhraseQueryParser/SpanQueryParser.

2014-03-05 Thread Ahmet Arslan
Hi Modassar, Can you post your request (with an example if possible) to lucene-5205 jira ticket too? If you don't have a jira account, anyone can create one. Thanks, Ahmet On Wednesday, March 5, 2014 9:40 AM, Modassar Ather wrote: Hi, Phrases with stop words in them are not getting searc

Phrase search with ComplexPhraseQueryParser/SpanQueryParser.

2014-03-04 Thread Modassar Ather
Hi, Phrases with stop words in them are not getting searched whereas a phrase without it gets searched using ComplexPhraseQueryParser/SpanQueryParser. SpanQueryParser reference: https://issues.apache.org/jira/browse/LUCENE-5205 The similar search works fine with classic parser which uses PhraseQ

How to Phrase Search Query with wild card

2013-08-07 Thread TechRay
rms = termList.toArray(new Term[0]); multiPhrasequery.add(firstTerm); multiPhrasequery.add(secondTerm); org.hibernate.Query hibQuery = fullTextSession.createFullTextQuery( multiPhrasequery, this.type);

Re: How to implement fuzzy phrase search with Lucene?

2012-06-01 Thread Jack Krupansky
You'll have to be more specific about what you mean by "fuzzy phrase search". Even in the classic Lucene query parser "sloppy phrase search is supported" - variable spacing between terms. LUCENE-2754 added support for all multi-term queries (which includes Fuzz

How to implement fuzzy phrase search with Lucene?

2012-06-01 Thread harish.bn
Did you find any solution for this? I am looking for a similar solution; please let me know if you found any useful info regarding fuzzy phrase search in Lucene. Thanks & Regards, Harish B.N.

Re: multiple phrase search for topic

2011-11-02 Thread deb.lucene
arity" function in Lucene. Regards, d -- View this message in context: http://lucene.472066.n3.nabble.com/multiple-phrase-search-for-topic-tp3461423p3474768.html Sent from the Lucene - Java Users mailing list archive at Nabble.com.

Re: multiple phrase search for topic

2011-10-31 Thread Ian Lea
r.SHOULD); > > ** > thanks for the carrot2 pointer. > > -d > > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/multiple-phrase-search-for-topic-tp3461423p3468005.html > Sent from the Lucene - Java

Re: multiple phrase search for topic

2011-10-31 Thread deb.lucene
on.LUCENE_33)); Query query = queryParser.parse(searchString); bQuery.add(query,BooleanClause.Occur.SHOULD); ** thanks for the carrot2 pointer. -d

Re: multiple phrase search for topic

2011-10-28 Thread Ian Lea
My questions are : > > 1) is there anything wrong in this usage of the phrase/boolean query? > 2) how I can guarantee to retrieve the most suitable news documents (i.e. > document which contains a lot of the related phrases) in the top searched > results? I utilized the BooleanClause.Occur.SHO

multiple phrase search for topic

2011-10-28 Thread deb.lucene
all of the 10k phrases, but using the SHOULD feature I surmise the best results will be which contains at least a few of the phrases. thanks in advance, --d

Fuzzy Phrase Search

2010-10-27 Thread Andrew Scott
Hi Guys, I am wondering how I can go about doing a Fuzzy Phrase search using Lucene.NET 2.9.2 - I've tried looking around everywhere but there don't really seem to be any resources related to this anywhere. I found this stackoverflow link<http://stackoverflow.com/questions/2589086

How to implement fuzzy phrase search with Lucene?

2010-07-12 Thread a peng
Hi, I recently have a requirement to implement fuzzy phrase search. For example, in the indexed document there is a sentence "I like lucene very much". When I search for "I do like lucene very much" or "I like lucene much", I want to get the search result in both cases. Can someone guide me on how to implement this f

Re: phrase search in a particular case

2010-06-19 Thread Lance Norskog
SpanFirstQuery is the clean option. Another option is to add a "start token" to each title. Then, search for "startToken oil spill". This will be faster than SpanFirstQuery. But it also requires doing something weird to the field. Lance On Thu, Jun 17, 2010 at 3:19 PM, Michael McCandless wrote:

Re: phrase search in a particular case

2010-06-17 Thread Michael McCandless
SpanFirstQuery? Mike On Thu, Jun 17, 2010 at 3:23 PM, rakesh rakesh wrote: > Hi, > > I have thousands of article titles in lucene index. So for a query "Oil > spill" I want to return all the article title starts with "Oil spill". I do > not want those titles which has this phrase but do not star

phrase search in a particular case

2010-06-17 Thread rakesh rakesh
Hi, I have thousands of article titles in a lucene index. For a query "Oil spill" I want to return all the article titles that start with "Oil spill". I do not want those titles which contain this phrase but do not start with it. Can anyone help me? Thanks in advance Thanks rakesh

RE: Phrase search on NOT_ANALYZED content

2010-03-04 Thread Murdoch, Paul
, March 04, 2010 8:54 AM To: java-user@lucene.apache.org Subject: Re: Phrase search on NOT_ANALYZED content I'm still struggling with your overall goal here, but... It sounds like what you're looking for is an exact match in some cases but not others? In which case you could think about in

Re: Phrase search on NOT_ANALYZED content

2010-03-04 Thread Erick Erickson
Message- > From: java-user-return-45278-paul.b.murdoch=saic@lucene.apache.org > [mailto:java-user-return-45278-paul.b.murdoch=saic@lucene.apache.org > ] On Behalf Of Erick Erickson > Sent: Wednesday, March 03, 2010 4:30 PM > To: java-user@lucene.apache.org > Subject

RE: Phrase search on NOT_ANALYZED content

2010-03-04 Thread Murdoch, Paul
em. Thanks, Paul -Original Message- From: java-user-return-45278-paul.b.murdoch=saic@lucene.apache.org [mailto:java-user-return-45278-paul.b.murdoch=saic@lucene.apache.org ] On Behalf Of Erick Erickson Sent: Wednesday, March 03, 2010 4:30 PM To: java-user@lucene.apache

Re: Phrase search on NOT_ANALYZED content

2010-03-03 Thread Erick Erickson
s a better way to accomplish your goal. Best Erick On Wed, Mar 3, 2010 at 4:11 PM, Murdoch, Paul wrote: > If I have indexed some content that contains some words and a single > whitespace between each word as NOT_ANALYZED, is it possible to perform > a phrase search on that a portion of

Phrase search on NOT_ANALYZED content

2010-03-03 Thread Murdoch, Paul
If I have indexed some content that contains some words and a single whitespace between each word as NOT_ANALYZED, is it possible to perform a phrase search on that a portion of that content? I'm indexing and searching with the StandardAnalyzer 2.9. Using the KeywordAnalyzer works, but I ha

RE: Phrase Search and NOT_ANALYZED

2010-02-25 Thread Murdoch, Paul
ssage- From: java-user-return-45156-paul.b.murdoch=saic@lucene.apache.org [mailto:java-user-return-45156-paul.b.murdoch=saic@lucene.apache.org] On Behalf Of Murdoch, Paul Sent: Wednesday, February 24, 2010 5:11 PM To: java-user@lucene.apache.org Subject: RE: Phrase Search and NOT_ANA

RE: Phrase Search and NOT_ANALYZED

2010-02-24 Thread Murdoch, Paul
:01 PM To: java-user@lucene.apache.org Subject: RE: Phrase Search and NOT_ANALYZED Thanks, I've been looking at that one too. I'm trying to make it happen with the StandardAnalyzer. Unfortunately, I think I see some redesign for more robustness in the future. Cheers, Paul ---

RE: Phrase Search and NOT_ANALYZED

2010-02-24 Thread Murdoch, Paul
apache.org [mailto:java-user-return-45154-paul.b.murdoch=saic@lucene.apache.org] On Behalf Of Robert Muir Sent: Wednesday, February 24, 2010 4:55 PM To: java-user@lucene.apache.org Subject: Re: Phrase Search and NOT_ANALYZED check out KeywordAnalyzer! On Wed, Feb 24, 2010 at 4:51 PM, Mur

Re: Phrase Search and NOT_ANALYZED

2010-02-24 Thread Robert Muir
ead. > > Thanks, > > Paul > > > -Original Message- > From: java-user-return-45149-paul.b.murdoch=saic@lucene.apache.org > [mailto:java-user-return-45149-paul.b.murdoch=saic@lucene.apache.org > ] On Behalf Of Erick Erickson > Sent: Wednesday, February 24, 20

RE: Phrase Search and NOT_ANALYZED

2010-02-24 Thread Murdoch, Paul
=saic@lucene.apache.org ] On Behalf Of Digy Sent: Wednesday, February 24, 2010 4:45 PM To: java-user@lucene.apache.org Subject: RE: Phrase Search and NOT_ANALYZED Since it is not analyzed, your text is stored as a single term in the index [something in the index]. But the query name:"someth
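Editor's sketch of the point in the snippet above, that a NOT_ANALYZED value is stored as one term while the query text is analyzed into several. This is plain Java with no Lucene dependency. Every name in it (NotAnalyzedSketch, analyze, indexNotAnalyzed, indexAnalyzed, allTermsIndexed) is hypothetical, and analyze is only a crude stand-in for an analyzer such as StandardAnalyzer, with an assumed two-word stop list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class NotAnalyzedSketch {
    // Assumed stop list for illustration only; not any analyzer's real list.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("in", "the"));

    // Crude stand-in for an analyzer: lowercase, split on whitespace, drop stop words.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase().trim().split("\\s+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Models a NOT_ANALYZED field: the whole value becomes a single indexed term.
    static Set<String> indexNotAnalyzed(String value) {
        return Collections.singleton(value);
    }

    // Models an analyzed field: each analyzed token becomes an indexed term.
    static Set<String> indexAnalyzed(String value) {
        return new HashSet<>(analyze(value));
    }

    // Simplified matching rule: every query term must exist as an indexed term.
    static boolean allTermsIndexed(Set<String> indexedTerms, List<String> queryTerms) {
        return !queryTerms.isEmpty() && indexedTerms.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        String value = "something in the index";
        List<String> analyzedQuery = analyze(value);
        // Analyzed query terms against the single un-analyzed term.
        System.out.println(allTermsIndexed(indexNotAnalyzed(value), analyzedQuery));
        // The un-analyzed value used verbatim as the only query term.
        System.out.println(allTermsIndexed(indexNotAnalyzed(value), Collections.singletonList(value)));
        // Analyzed query terms against an analyzed field.
        System.out.println(allTermsIndexed(indexAnalyzed(value), analyzedQuery));
    }
}
```

In this model the mismatch comes from indexing and querying with different tokenization, which is why the thread turns to KeywordAnalyzer, which keeps the whole value as one token on the query side as well.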

RE: Phrase Search and NOT_ANALYZED

2010-02-24 Thread Murdoch, Paul
aul.b.murdoch=saic@lucene.apache.org ] On Behalf Of Erick Erickson Sent: Wednesday, February 24, 2010 4:23 PM To: java-user@lucene.apache.org Subject: Re: Phrase Search and NOT_ANALYZED What does Luke's explain show you? That'll show you a lot about how the query gets transformed

RE: Phrase Search and NOT_ANALYZED

2010-02-24 Thread Digy
.@saic.com] Sent: Wednesday, February 24, 2010 10:51 PM To: java-user@lucene.apache.org Subject: Phrase Search and NOT_ANALYZED Hi, I'm indexing a field using the StandardAnalyzer 2.9. field = new Field(fieldName, fieldValue, Field.Store.YES, Field.Index.NOT_ANALYZED); Let's say fieldName

Re: Phrase Search and NOT_ANALYZED

2010-02-24 Thread Erick Erickson
What does Luke's explain show you? That'll show you a lot about how the query gets transformed.. My first guess is that stop words are messing you up Erick On Wed, Feb 24, 2010 at 3:51 PM, Murdoch, Paul wrote: > Hi, > > > > I'm indexing a field using the StandardAnalyzer 2.9. > > > > fi

Phrase Search and NOT_ANALYZED

2010-02-24 Thread Murdoch, Paul
Hi, I'm indexing a field using the StandardAnalyzer 2.9. field = new Field(fieldName, fieldValue, Field.Store.YES, Field.Index.NOT_ANALYZED); Let's say fieldName is "name" and fieldValue is "something in the index". When I perform the query... name:"something in the index" ...

Re: Phrase search

2009-06-11 Thread Savvas-Andreas Moysidis
Hello, You could use a PhraseQuery with the terms "cool" and "gaming" and "computer" and set the slop factor you reckon is right. Then could assign a boost to this query only, which will make it bubble up the list. I don't think you can get away without specifying a slop factor though(like in the
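Editor's sketch of what a slop factor measures, using the strings from this thread. It is plain Java, not Lucene code, and rests on two stated assumptions: each term occurs at most once per document, and the slop a document needs is the spread between the largest and smallest value of (position in document minus position in phrase). The names SlopSketch, requiredSlop and matches are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class SlopSketch {
    // Smallest slop at which the phrase would match in this simplified model,
    // or -1 if the phrase is empty or any phrase term is absent from the document.
    static int requiredSlop(List<String> docTokens, List<String> phrase) {
        if (phrase.isEmpty()) {
            return -1;
        }
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < phrase.size(); i++) {
            int pos = docTokens.indexOf(phrase.get(i));
            if (pos < 0) {
                return -1; // every phrase term must be present
            }
            int shifted = pos - i; // where the phrase would start if this term were in place
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        return max - min;
    }

    // True if the document matches the phrase within the given slop.
    static boolean matches(List<String> docTokens, List<String> phrase, int slop) {
        int required = requiredSlop(docTokens, phrase);
        return required >= 0 && required <= slop;
    }

    public static void main(String[] args) {
        List<String> doc1 = Arrays.asList("cool", "gaming", "laptop");
        List<String> doc3 = Arrays.asList("gaming", "laptop", "cool");
        List<String> phrase = Arrays.asList("cool", "gaming");
        System.out.println(requiredSlop(doc1, phrase));
        System.out.println(requiredSlop(doc3, phrase));
        System.out.println(requiredSlop(doc1, Arrays.asList("cool", "gaming", "computer")));
    }
}
```

Under this model a smaller required slop means the terms sit closer to the phrase order. The three-word query from the original question reports -1 for every string, because "computer" appears in none of them.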

Re: Phrase search

2009-06-10 Thread Daniel Noll
On Fri, Jun 5, 2009 at 21:31, Abhi wrote: > Say I have indexed the following strings: > > 1. "cool gaming laptop" > 2. "cool gaming lappy" > 3. "gaming laptop cool" > > Now when I search with a query say "cool gaming computer", I want string 1 > and 2 to appear on top (where search terms are closer

Phrase search

2009-06-05 Thread Abhi
Say I have indexed the following strings: 1. "cool gaming laptop" 2. "cool gaming lappy" 3. "gaming laptop cool" Now when I search with a query say "cool gaming computer", I want string 1 and 2 to appear on top (where search terms are closer to each other) followed by 3. I can use a Term query t

RE: Contrib Highlighter and Phrase search

2008-03-19 Thread Itamar Syn-Hershko
(since I'm inflating the query). Does this make sense? Itamar. -Original Message- From: Daniel Noll [mailto:[EMAIL PROTECTED] Sent: Thursday, March 20, 2008 12:44 AM To: java-user@lucene.apache.org Subject: Re: Contrib Highlighter and Phrase search On Wednesday 19 March 2008 18:28:15 Ita

Re: Contrib Highlighter and Phrase search

2008-03-19 Thread Daniel Noll
On Wednesday 19 March 2008 18:28:15 Itamar Syn-Hershko wrote: > 1. Build a Radix tree (PATRICIA) and populate it with all search terms. > Phrase queries will be considered as one big string, regardless their > spaces. > > 2. Iterate through your text ignoring spaces and punctuation marks, and for >

RE: Contrib Highlighter and Phrase search

2008-03-19 Thread Itamar Syn-Hershko
t color. This allows for fast and exact highlighting of large texts as well as smaller ones. I would love to hear any comments on the above. Itamar. -Original Message- From: Mark Miller [mailto:[EMAIL PROTECTED] Sent: Tuesday, March 18, 2008 10:51 PM To: java-user@lucene.apache.org Subject: R

Re: Contrib Highlighter and Phrase search

2008-03-18 Thread markharw00d
See https://issues.apache.org/jira/browse/LUCENE-794 Spencer Tickner wrote: Hi List, Thanks in advance for any help. I'm working with the contrib highlighting class and am having issues when doing searches with a phrase. I've been able to duplicate this behaviour in the HighlighterTest class.

Re: phrase search with custom TokenFilter

2008-03-18 Thread Chris Hostetter
You're going to want to change your TokenFilter so that it emits the split pieces tokens immediately after the original token and with a positionIncrement of "0" .. don't buffer them up and wait for the entire stream to finish first. it true order of the tokens in the tokenstream and the posit
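Editor's sketch of the suggestion above: emit each split piece right after the original token, with a position increment of 0, so the pieces share the original token's position. It is plain Java and does not use Lucene's TokenFilter API; DotSplitSketch, Token, expand and positions are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DotSplitSketch {
    // Minimal token model: a term plus its increment relative to the previous token.
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    // Emits each word with increment 1; for words with internal dots, emits the
    // pieces immediately afterwards with increment 0 (no buffering to end of stream).
    static List<Token> expand(List<String> words) {
        List<Token> out = new ArrayList<>();
        for (String w : words) {
            out.add(new Token(w, 1));
            if (w.contains(".")) {
                for (String piece : w.split("\\.")) {
                    if (!piece.isEmpty()) {
                        out.add(new Token(piece, 0));
                    }
                }
            }
        }
        return out;
    }

    // Converts increments into absolute positions, starting at 0 for the first token.
    static List<Integer> positions(List<Token> tokens) {
        List<Integer> result = new ArrayList<>();
        int pos = -1;
        for (Token t : tokens) {
            pos += t.positionIncrement;
            result.add(pos);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Token> tokens = expand(Arrays.asList("see", "foo.bar", "now"));
        List<Integer> pos = positions(tokens);
        for (int i = 0; i < tokens.size(); i++) {
            System.out.println(tokens.get(i).term + " @ " + pos.get(i));
        }
    }
}
```

Because the neighbouring words keep consecutive positions in this model, a phrase that spans the dotted word can still line up, which is the property the reply is after.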

Re: Contrib Highlighter and Phrase search

2008-03-18 Thread Spencer Tickner
Thanks, I'll give that a try. Cheers, Spencer On Tue, Mar 18, 2008 at 1:50 PM, Mark Miller <[EMAIL PROTECTED]> wrote: > The contrib Highlighter is not position sensitive. You can try out the > patch I have been working here if you are interested: > https://issues.apache.org/jira/browse/LUCENE-

Re: Contrib Highlighter and Phrase search

2008-03-18 Thread Mark Miller
The contrib Highlighter is not position sensitive. You can try out the patch I have been working here if you are interested: https://issues.apache.org/jira/browse/LUCENE-794 Spencer Tickner wrote: Hi List, Thanks in advance for any help. I'm working with the contrib highlighting class and am

Contrib Highlighter and Phrase search

2008-03-18 Thread Spencer Tickner
Hi List, Thanks in advance for any help. I'm working with the contrib highlighting class and am having issues when doing searches with a phrase. I've been able to duplicate this behaviour in the HighlighterTest class. When calling the testGetBestFragmentsPhrase() method I get the correct: John K

phrase search with custom TokenFilter

2008-03-10 Thread Embry, Clay
Hi, I have written a TokenFilter which breaks up words with internal dot characters and adds the whole word plus the pieces as tokens in the stream. I am using that TokenFilter with the StandardAnalyzer to index my documents. Then I do searches using the StandardAnalyzer. Everything is working g

Re: Phrase Search not returning results

2007-08-23 Thread Spencer Tickner
M, Spencer Tickner wrote: > > > Hi List, > > > > Thanks in advance for the help. I'm creating a simple searching test > > based on Query Parser and from what I've read it should have no > > problems with a Phrase Search. However I can't seem to get an

Re: Phrase Search not returning results

2007-08-23 Thread Grant Ingersoll
3:04 PM, Spencer Tickner wrote: Hi List, Thanks in advance for the help. I'm creating a simple searching test based on Query Parser and from what I've read it should have no problems with a Phrase Search. However I can't seem to get any results back. I'm doing a si

Phrase Search not returning results

2007-08-23 Thread Spencer Tickner
Hi List, Thanks in advance for the help. I'm creating a simple searching test based on Query Parser and from what I've read it should have no problems with a Phrase Search. However I can't seem to get any results back. I'm doing a simple index using the StandardAnalyzer. Outp

Re: Phrase Search

2007-06-18 Thread Laxmilal Menaria
Ok.. thanks, I have tried to index the address field as UN_TOKENIZED and search using the above query; it returns nothing. How can I specify "NOT tokenize" in the query? --Thanks, On 6/18/07, Erick Erickson <[EMAIL PROTECTED]> wrote: Phrase queries won't help you here Your particular issue can be

Re: Phrase Search

2007-06-18 Thread Chris Hostetter
: Another good old trick is to index field values (tokenized) with : appended special starting and ending tokens, e.g. instead of "Hiran : Magri" use "_start_ Hiran Magri _end_". Then you can query for fields : that are exactly equal to a phrase, while still retaining the : possibility to search b
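Editor's sketch of the start/end token trick quoted above, using the address values from this thread. It is plain Java with whitespace tokenization standing in for a real analyzer; SentinelSketch and its methods are hypothetical names, and only the `_start_` and `_end_` markers come from the quoted message.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SentinelSketch {
    static final String START = "_start_";
    static final String END = "_end_";

    // Whitespace tokenization as a stand-in for an analyzer.
    static List<String> tokenize(String value) {
        List<String> tokens = new ArrayList<>();
        for (String t : value.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Wraps the tokens of a value in the start and end markers.
    static List<String> withSentinels(String value) {
        List<String> tokens = new ArrayList<>();
        tokens.add(START);
        tokens.addAll(tokenize(value));
        tokens.add(END);
        return tokens;
    }

    // Ordinary phrase search: the query tokens appear contiguously in the field.
    static boolean phraseMatch(String fieldValue, String query) {
        return Collections.indexOfSubList(withSentinels(fieldValue), tokenize(query)) >= 0;
    }

    // Exact-field search: the query is wrapped in the markers too, so the phrase
    // must cover the whole field value.
    static boolean exactMatch(String fieldValue, String query) {
        return Collections.indexOfSubList(withSentinels(fieldValue), withSentinels(query)) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(exactMatch("Hiran Magri", "Hiran Magri"));
        System.out.println(exactMatch("Hiran Magri Sec 10", "Hiran Magri"));
        System.out.println(phraseMatch("Hiran Magri Sec 10", "Hiran Magri"));
    }
}
```

The field stays tokenized, so both kinds of query work against the same indexed value. The cost, in this sketch as in the thread, is that the markers must be added consistently at index time and must not collide with real content.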

Re: Phrase Search

2007-06-18 Thread Andrzej Bialecki
Erick Erickson wrote: Phrase queries won't help you here Your particular issue can be addressed, but I'm not sure it's a reasonable long-term solution If you indexed your address field as UN_TOKENIZED, and did NOT tokenize your query, it should give you what you want. What's happening i

Re: Phrase Search

2007-06-18 Thread Erick Erickson
Phrase queries won't help you here Your particular issue can be addressed, but I'm not sure it's a reasonable long-term solution If you indexed your address field as UN_TOKENIZED, and did NOT tokenize your query, it should give you what you want. What's happening is that StandardAnalyzer

Phrase Search

2007-06-18 Thread Laxmilal Menaria
Hello everyone, I am a Lucene user and tried to implement a phrase query, but I am now getting some logical problems in searching. My index has 4 fields: Name, Address & City and 6 docs. i.e 1. "Laxmilal Menaria", "Hiran Magri", "Udaipur", 2. "Mohan Sharma", "Hiran Magri Sec 10", "Udaipur"

Re: Problem using wildcardsearch in phrase search

2007-05-18 Thread Paul Taylor
I've proposed a simple improvement in issue https://issues.apache.org/jira/browse/LUCENE-884 thanks Paul Chris Hostetter wrote: But as i said: if you have suggestions for clarifying the docs, please submit them as a patch. just saying the docs need to be improved without providing a specific

Re: Problem using wildcardsearch in phrase search

2007-05-14 Thread Chris Hostetter
: queryparsersyntax page which is where I expect most novices (such as : myself) start with lucene seems to indicate that wildcards can be used : in, and this page is : as far as one should need to go to understand basic query syntax, this : page should be corrected. if you have a suggestion for

Re: Problem using wildcardsearch in phrase search

2007-05-13 Thread Paul Taylor
Chris Hostetter wrote: : > You can't use a wildcard within double quotes. The Lucene syntax : > grammar does not look for such things. : This is the bit I don't get (I have got round the problem), why can't : you use wildcards within double quotes, this isnt mentioned anywhere in : http://lucene

Re: Problem using wildcardsearch in phrase search

2007-05-13 Thread Chris Hostetter
: > You can't use a wildcard within double quotes. The Lucene syntax : > grammar does not look for such things. : This is the bit I don't get (I have got round the problem), why can't : you use wildcards within double quotes, this isnt mentioned anywhere in : http://lucene.apache.org/java/docs/qu

Re: Problem using wildcardsearch in phrase search

2007-05-13 Thread Mark Miller
I do not know enough about PhraseQuery to say how hard it would be to add support for wildcards, but I am sure there is some method of doing it -- it has just not been done. From what I can tell it would be easier to stop using PhraseQuery and use SpanQuery's if you wanted to do this. Maybe som

Re: Problem using wildcardsearch in phrase search

2007-05-13 Thread Paul Taylor
Mark Miller wrote: You cannot use wildcards in quotes simply because the QueryParser syntax does not look for such things...at the top level it is either looking for a Wildcard token OR a Quoted token. There is good reason for this: a phrase query does not support wildcards. OK thanks for all t

Re: Problem using wildcardsearch in phrase search

2007-05-13 Thread Mark Miller
You cannot use wildcards in quotes simply because the QueryParser syntax does not look for such things...at the top level it is either looking for a Wildcard token OR a Quoted token. There is good reason for this: a phrase query does not support wildcards. The hack that I suggested (looking for

Re: Problem using wildcardsearch in phrase search

2007-05-13 Thread Paul Taylor
Mark Miller wrote: You can't use a wildcard within double quotes. The Lucene syntax grammar does not look for such things. This is the bit I don't get (I have got round the problem), why can't you use wildcards within double quotes, this isnt mentioned anywhere in http://lucene.apache.org/java

Re: Problem using wildcardsearch in phrase search

2007-05-13 Thread Mark Miller
I think the KeywordAnlyser bit is maybe a red herring, the problem seems to be that you cant use * within double quotes, I made some changes to my data and index to remove the space character You can't use a wildcard within double quotes. The Lucene syntax grammar does not look for such thin

RE: Problem using wildcardsearch in phrase search

2007-05-13 Thread Max Metral
@lucene.apache.org Subject: Re: Problem using wildcardsearch in phrase search I think the KeywordAnlyser bit is maybe a red herring, the problem seems to be that you cant use * within double quotes, I made some changes to my data and index to remove the space character If I fed 54:puid* to my code

Re: Problem using wildcardsearch in phrase search

2007-05-13 Thread Paul Taylor
I think the KeywordAnalyzer bit is maybe a red herring, the problem seems to be that you can't use * within double quotes, I made some changes to my data and index to remove the space character If I fed 54:puid* to my code it generates a Prefix Query and works as required Search Query Is54:puid

Re: Problem using wildcardsearch in phrase search

2007-05-12 Thread Mark Miller
Perhaps not like whitespaceanalyzer does in all cases, but this code QueryParser qp = new QueryParser("field", new WhitespaceAnalyzer()); Query q = qp.parse("Does this tokenize*"); System.out.println(q.toString()); produces field:Does field:this field:token

Re: Problem using wildcardsearch in phrase search

2007-05-12 Thread Erick Erickson
See below On 5/12/07, Mark Miller <[EMAIL PROTECTED]> wrote: Paul Taylor wrote: > I seem to be having problems using a * in a phrase term query > > This is my search String, its not finding any matches > 54:"MusicIP PUID*" > > If I match on a particular record it works ok > 54:"MusicIP PU

Re: Problem using wildcardsearch in phrase search

2007-05-12 Thread Mark Miller
This just keeps running around in my head... I was wrong on one point...if you use the KeywordAnalyzer and you put your search in quotes then you will not generate a phrase query because a PhraseQuery is only generated if the analyzer produces more than one token. The problem is that, instead

Re: Problem using wildcardsearch in phrase search

2007-05-12 Thread Mark Miller
Well I am confused so I suppose I'll let someone else give it a shot. Just in case though...if you are using the query: fieldname:"MusicIP Puid*" Then you should not...you need to leave out the quotes...quotes create a phrasequery, and a phrasequery will not match what is in your index. This may

Re: Problem using wildcardsearch in phrase search

2007-05-12 Thread Paul Taylor
Mark Miller wrote: Didn't you say you where using a phrasequery? If you are, things will not work as expected. You need to leave the quotes out of your search as a phrasequery will not match what you are putting in your index. If you are not using a phrasequery then things should work as you wo

Re: Problem using wildcardsearch in phrase search

2007-05-12 Thread Mark Miller
Paul Taylor wrote: Mark Miller wrote: "MusicIP PUID*" means to search for MusicIP within one of PUID* Sorry I dont understand, can you give me a further reference ...I am pretty sure that KeywordAnalyzer does not split on whitespace like WhiteSpaceAnalyzer does...which means that MusicIP is

Re: Problem using wildcardsearch in phrase search

2007-05-12 Thread Paul Taylor
Mark Miller wrote: "MusicIP PUID*" means to search for MusicIP within one of PUID* Sorry I dont understand, can you give me a further reference ...I am pretty sure that KeywordAnalyzer does not split on whitespace like WhiteSpaceAnalyzer does...which means that MusicIP is never within one of

Re: Problem using wildcardsearch in phrase search

2007-05-12 Thread Mark Miller
Paul Taylor wrote: I seem to be having problems using a * in a phrase term query This is my search String, its not finding any matches 54:"MusicIP PUID*" If I match on a particular record it works ok 54:"MusicIP PUIDa39494bf-927e-1638-fb06-782ec55ac22d" "MusicIP PUID*" means to search for Mu

Re: Problem using wildcardsearch in phrase search

2007-05-12 Thread Erick Erickson
Somewhere in the list, I remember one of the guys who know what they're talking about mentions something about KeywordAnalyzer being "subject to the meta-semantics of the QueryParser". So try looking at query.toString() in your example. What I think you'll find is that KeywordAnalyzer doesn't qui

Problem using wildcardsearch in phrase search

2007-05-12 Thread Paul Taylor
I seem to be having problems using a * in a phrase term query This is my search String, it's not finding any matches 54:"MusicIP PUID*" If I match on a particular record it works ok 54:"MusicIP PUIDa39494bf-927e-1638-fb06-782ec55ac22d" The problem appears to be the space character, because I hav

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Chris Hostetter
: Sorry for the confusion and thanks for taking the time to educate me. So, if : I am just indexing literal values, what is the best way to do that (what : analyzer)? Sounds like this approach, even though it works, is not the : preferred method. if you truly want just the literal values then

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Philip Brown
as seperate values > : > intead of trying to find one big vlaue containing "my brown-cow red > fox" > : > > : > : in the results if the case is identical to how it was added? (This > : > seems to > : > : be what I observe anyway. And whether I add as TOKENI

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Chris Hostetter
rom: Philip Brown <[EMAIL PROTECTED]> : Reply-To: java-user@lucene.apache.org : To: java-user@lucene.apache.org : Subject: Re: Phrase search using quotes -- special Tokenizer : : : Here's a little sample program (borrowed some code from Erick Erickson :)). : Whether I add as TOKENIZED or UN_

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Philip Brown
> 1) whether case matters is determined entirely by your analyzer, if it > produces different tokens for "Blue" and "BLUE" then case matters > 2) use TOKENIZED or your Analyzer will be completely irrelevant > 3) if you observe something working differently then you

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Mark Miller
okens indicate that you are going for a phrase search. A phrase search is generated. A phrase search with stopwords removed has interesting sloppy matching. A phrase search can also match out of order given enough slop. This is normally fine behavior for most applications I can think of. You need to con

Re: Phrase search using quotes -- special Tokenizer

2006-09-05 Thread Chris Hostetter
: So, if I do as you suggest below (using PerFieldAnalyzerWrapper with : StandardAnalyzer) then I still need to enclose in quotes the phrases : (keywords with spaces) when I issue the search, and they are only returned Yes, quotes will be necessary to tell the QueryParser "this is one chunk of t

Re: Phrase search using quotes -- special Tokenizer

2006-09-04 Thread Philip Brown
-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > > -- View this message in context: http://www.nabble.com/Phrase-search-using-quotesspecial-Tokenizer-tf2200760.html#a6145591 Sent from the Lucene - Java Users forum at Nabble.com.

Re: Phrase search using quotes -- special Tokenizer

2006-09-04 Thread Chris Hostetter
: Yeah, they are more complex than the "exactish" match -- basically, there are : more fields involved -- combined sometimes with AND and sometimes with OR, : and sometimes negated field values, sometimes groupings, etc. These other : field values are all single words (no spaces), and a search mi

Re: Phrase search using quotes -- special Tokenizer

2006-09-04 Thread Mark Miller
TermQuery out of it for the neccessary field. : > : > ...that's it. that's all she wrote -- don't even look in QueryParser's : > general direction, at all. : > : > : > : > -Hoss : > : > : >

Re: Phrase search using quotes -- special Tokenizer

2006-09-04 Thread Philip Brown
want. > : > b) use this Analyzer when you add the fields to your documents, even > : > though you don't want *real* tokenization, add make the field type > : > TOKENIZED so your analyzer gets used. > : > c) when you get some text input to serach on, pass it to the same > : > Analyzer, take the Token you get back and manualy

Re: Phrase search using quotes -- special Tokenizer

2006-09-04 Thread Mark Miller
e text input to serach on, pass it to the same : > Analyzer, take the Token you get back and manualy construct a : > TermQuery out of it for the neccessary field. : > : > ...that's it. that's all she wrote -- don't even look in QueryParser's : > genera

Re: Phrase search using quotes -- special Tokenizer

2006-09-03 Thread Chris Hostetter
ZED so your analyzer gets used. : > c) when you get some text input to serach on, pass it to the same : > Analyzer, take the Token you get back and manualy construct a : > TermQuery out of it for the neccessary field. : > : > ...that's it. that's all she wrote -- don't even

Re: Phrase search using quotes -- special Tokenizer

2006-09-03 Thread Philip Brown
> c) when you get some text input to serach on, pass it to the same > Analyzer, take the Token you get back and manualy construct a > TermQuery out of it for the neccessary field. > > ...that's it. that's all she wrote -- don't even look in QueryParser's &

Re: Phrase search using quotes -- special Tokenizer

2006-09-03 Thread Erick Erickson
Yeah, what he said On 9/3/06, Chris Hostetter <[EMAIL PROTECTED]> wrote: I haven't really been following this thread, but it's gotten so long i got interested. from whta i can tell skimming the discussion so far, it seems like the biggest confusion is about the definition of a "phrase" a

Re: Phrase search using quotes -- special Tokenizer

2006-09-03 Thread Chris Hostetter
I haven't really been following this thread, but it's gotten so long i got interested. from what i can tell skimming the discussion so far, it seems like the biggest confusion is about the definition of a "phrase" and what analyzers do with "quote" characters and what the QueryParser does with "q

Re: Phrase search using quotes -- special Tokenizer

2006-09-03 Thread Philip Brown
> >> > - Mark >> >>> >> > >> >>> >> > On 9/1/06, Philip Brown <[EMAIL PROTECTED]> wrote: >> >>> >> >> >> >>> >> >> >> >>> >> >> Well, I tried that, and it doesn't seem

Re: Phrase search using quotes -- special Tokenizer

2006-09-03 Thread Erick Erickson
t;>> >> >> phrases. From >>> http://lucene.apache.org/java/docs/api/index.html --> >>> A >>> >> >> Phrase is a group of words surrounded by double quotes such as >>> "hello >>> >> >> dolly". So, this should be easy, righ

Re: Phrase search using quotes -- special Tokenizer

2006-09-02 Thread Philip Brown
t; the >>> >> >> NUM >>> >> >> > token but nothing I'd worry about. maybe you want to use Unicode >>> for >>> >> >> &#

Re: Phrase search using quotes -- special Tokenizer

2006-09-02 Thread Mark Miller
| "." ( ".")+ > >> >> > >> >> > // company names like AT&T and [EMAIL PROTECTED] >> >> > | ("&"|"@") > >> >> > >> >> > // email addresses >> >> > |

Re: Phrase search using quotes -- special Tokenizer

2006-09-02 Thread Erick Erickson
7;" )+ > >> >> > >> >> > // acronyms: U.S.A., I.B.M., etc. >> >> > // use a post-filter to remove dots >> >> > | "." ( ".")+ > >> >> > >> >> > // company names like AT&T and [EMAIL PROTECTED] >> >> > | ("&"|"@") > >> >> > >> >> > //

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Mark Miller
t;_"|"-"|"/"|"."|",") > | <#HAS_DIGIT: // at least one digit (|)* (|)* > | < #ALPHA: ()+> | < #LETTER: // unicode letters [ "\u0041"-"\u005a", "\u0061"-"\u007a", "\u00c0"-"

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Mark Miller
ode letters [ "\u0041"-"\u005a", "\u0061"-"\u007a", "\u00c0"-"\u00d6", "\u00d8"-"\u00f6", "\u00f8"-"\u00ff", "\u0100"-"\u1fff", &qu

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Philip Brown
, etc. >> >> > // use a post-filter to remove dots >> >> > | "." ( ".")+ > >> >> > >> >> > // company names like AT&T and [EMAIL PROTECTED] >> >> > | ("&"|"@") > >> >> > >> >> > // email addresses >> >> > | (("."|"-"|"_") )* "@

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Erick Erickson
| >> > | ( )+ >> >| ( )+ >> >|( )+ >> >|( )+ >> > ) >> > > >> > | <#P: ("_"|"-"|"/"|"."|",") > >> > | &

Re: Phrase search using quotes -- special Tokenizer

2006-09-01 Thread Philip Brown
s, etc. >> > // every other segment must have at least one digit >> > | >> >| >> > | ( )+ >> >| ( )+ >> >|( )+ >> >|( )+ >> >
