Re: Poor performances with Shingle and Phrase query

2016-01-21 Thread Bertil Chapuis
chmark is using a lot of memory (~40-50%) and, according to the javadoc, the benchmark script I run is single-threaded, and the CPU usage reflects that (~100%). Are there some other parameters I should check? Thank you very much. On 21 January 2016 at 21:14, Michael McCandless wrote: > Shingles sh

Re: Poor performances with Shingle and Phrase query

2016-01-21 Thread Michael McCandless
Shingles should make a huge difference in phrase query performance if 1) the phrase queries involve high frequency terms and 2) you have a substantial number of documents in the index (so that time-to-visit-postings dominates over time-to-lookup-terms). 118 rec/sec is already very fast for a long

Re: Poor performances with Shingle and Phrase query

2016-01-21 Thread Jack Krupansky
Be sure to check and see if your app is compute or I/O bound during this process - whether too little of your index is cached in system memory and each query requires I/O, lots of it. -- Jack Krupansky On Thu, Jan 21, 2016 at 1:52 PM, Doug Turnbull < dturnb...@opensourceconnections.com> wrote: >

Re: Poor performances with Shingle and Phrase query

2016-01-21 Thread Doug Turnbull
In my experience, shingles can hurt query performance because the term dictionary grows quite a bit. There are far more unique bigrams than there are words. While the lookup time doesn't grow linearly with the number of terms, it still grows. I haven't specifically compared performance numbers shing
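
The point about dictionary growth can be seen on a toy input. The following is a minimal sketch in plain Java that only counts distinct words and distinct adjacent word pairs; it does not use Lucene, and the class name, method names, and sample sentence are all made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene: compares vocabulary sizes
// for single words versus two-word shingles (adjacent word pairs).
final class ShingleVocabularyDemo {

    /** Counts distinct single-word terms in whitespace-separated text. */
    static int uniqueUnigrams(String text) {
        Set<String> terms = new HashSet<>();
        for (String word : text.toLowerCase().trim().split("\\s+")) {
            terms.add(word);
        }
        return terms.size();
    }

    /** Counts distinct two-word shingles built from adjacent words. */
    static int uniqueBigrams(String text) {
        String[] words = text.toLowerCase().trim().split("\\s+");
        Set<String> shingles = new HashSet<>();
        for (int i = 0; i + 1 < words.length; i++) {
            shingles.add(words[i] + " " + words[i + 1]);
        }
        return shingles.size();
    }

    public static void main(String[] args) {
        String text = "the cat sat on the mat and the dog sat on the cat";
        System.out.println("unique unigrams: " + uniqueUnigrams(text));
        System.out.println("unique bigrams:  " + uniqueBigrams(text));
    }
}
```

Even on a 13-word sentence the pair vocabulary is already larger than the word vocabulary; on a real corpus the gap is presumably much wider, which is the cost being weighed against faster phrase matching.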

Poor performances with Shingle and Phrase query

2016-01-21 Thread Bertil Chapuis
Hello, I'm trying to improve the speed of an index when searching for long phrases. I performed some tests with the benchmark module. With a simple analyzer and PhraseQueries I get a throughput of 118 rec/sec. My test dataset is the latest dump of Wikipedia. Here are the filters I use at indexation

Re: Using phrase query in Termfilters

2015-11-25 Thread Kumaran Ramasubramanian
com] > > Sent: Wednesday, November 25, 2015 9:13 AM > > To: java-user@lucene.apache.org > > Subject: Using phrase query in Termfilters > > > > Hi All, > > > >Am using lucene 4.10.4. Is it right to add analyzed multi valued > fields > > & phrase quer

RE: Using phrase query in Termfilters

2015-11-25 Thread Uwe Schindler
nal Message- > From: Kumaran Ramasubramanian [mailto:kums@gmail.com] > Sent: Wednesday, November 25, 2015 9:13 AM > To: java-user@lucene.apache.org > Subject: Using phrase query in Termfilters > > Hi All, > >Am using lucene 4.10.4. Is it right to add a

Using phrase query in Termfilters

2015-11-25 Thread Kumaran Ramasubramanian
Hi All, I am using Lucene 4.10.4. Is it right to add analyzed multi-valued fields and a phrase query for the same field in a boolean filter? I believe we cannot apply analyzers to values in filters, so I am not getting results for those filters' matches. String phraseTerm = "hello worl

Phrase Query - Increment Position

2015-06-01 Thread patel mrugesh
Hi, I am facing an issue with phrase queries and position increments. I tried the following queries and, although matching data exists, 0 results were returned. 2) Search Query --> name:"at&t inc" Parsed Query --> +name:"at&t inc" Result return 

Phrase Query With Special Character

2015-05-27 Thread patel mrugesh
Hi, I am facing an issue with phrase queries containing special characters (such as &, dot, comma, colon, etc.). I tried the following queries and, although matching data exists, 0 results were returned. 1) Search Query --> name:"Pep:Trans vaccines, GSK" Parsed Query --> +name:"pep:trans v

Re: Phrase query given a word

2015-04-23 Thread Ahmet Arslan
Hi, Maybe LUCENE-5317 is relevant? Ahmet On Thursday, April 23, 2015 8:33 PM, Shashidhar Rao wrote: Hi, I have a large text and from that I need to calculate the top frequencies of words, say 'Driving' occurs the most. Now, I need to find phrases containing 'Driving' in the given text and th

Phrase query given a word

2015-04-23 Thread Shashidhar Rao
Hi, I have a large text and from that I need to calculate the top frequencies of words, say 'Driving' occurs the most. Now, I need to find phrases containing 'Driving' in the given text and the frequency count of each phrase. The phrase could be three words where driving could be in the middle

phrase query, stop words, and highlighting?

2014-09-22 Thread Rob Nikander
Hi, I just noticed that a search like "rooms to go" is failing to highlight. (I use FastVectorHighlighter). I know it's caused by the stop word (to). Is there a recommended way to fix this? I may just re-index without stop words, and see if that causes any problems. thanks, Rob

Re: How to not span fields with phrase query?

2014-08-28 Thread craiglang44
`getPositionIncrementGap` Sent from my BlackBerry® smartphone -Original Message- From: Rob Nikander Date: Thu, 28 Aug 2014 10:26:00 To: Reply-To: java-user@lucene.apache.org Subject: Re: How to not span fields with phrase query? Thank you for the explanation. I subclassed Analyzer

Re: How to not span fields with phrase query?

2014-08-28 Thread Rob Nikander
Thank you for the explanation. I subclassed Analyzer and overrode `getPositionIncrementGap` for this field. It appears to have worked. Rob On Thu, Aug 28, 2014 at 10:21 AM, Michael Sokolov < msoko...@safaribooksonline.com> wrote: > Usually that's referred to as multiple "values" for the same f

Re: How to not span fields with phrase query?

2014-08-28 Thread Michael Sokolov
Usually that's referred to as multiple "values" for the same field; in the index there is no distinction between title:C and title:X as far as which field they are in -- they're in the same field. If you want to prevent phrase queries from matching B C X, insert a position gap between C and X;
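
The effect of a position gap between values can be shown with a toy positional index. This is a minimal sketch in plain Java, assuming a whitespace tokenizer and an exact (slop 0) phrase match; it is not Lucene's implementation, and `PositionGapDemo`, `index`, and `matchesPhrase` are names invented here. In Lucene itself the gap would come from the analyzer's `getPositionIncrementGap`, as mentioned elsewhere in this thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy index, not Lucene: models how a gap between the
// values of a multi-valued field keeps phrases from spanning values.
final class PositionGapDemo {

    /** Assigns token positions across all values; `gap` extra positions separate consecutive values. */
    static Map<String, List<Integer>> index(List<String> values, int gap) {
        Map<String, List<Integer>> postings = new HashMap<>();
        int position = 0;
        for (String value : values) {
            for (String token : value.trim().split("\\s+")) {
                postings.computeIfAbsent(token, k -> new ArrayList<>()).add(position);
                position++;
            }
            position += gap; // leave a hole before the next value starts
        }
        return postings;
    }

    /** Exact phrase match: term i must occur at position start + i. */
    static boolean matchesPhrase(Map<String, List<Integer>> postings, String... terms) {
        for (int start : postings.getOrDefault(terms[0], List.of())) {
            boolean ok = true;
            for (int i = 1; i < terms.length && ok; i++) {
                ok = postings.getOrDefault(terms[i], List.of()).contains(start + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> titles = List.of("A B C", "X Y Z");
        System.out.println(matchesPhrase(index(titles, 0), "B", "C", "X"));   // values run together
        System.out.println(matchesPhrase(index(titles, 100), "B", "C", "X")); // values kept apart
    }
}
```

With a gap of 0 the phrase "B C X" matches because C and X get adjacent positions; with a large gap it no longer does, while phrases inside a single value still match. A sloppy phrase could still bridge the gap if its slop were as large as the gap, so the gap is usually chosen larger than any slop in use.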

How to not span fields with phrase query?

2014-08-28 Thread Rob Nikander
Hi, If I have a document with multiple "title" field values, e.g. title: A B C and title: X Y Z, a phrase search for title:"B C X" matches this document. Can I prevent that? thanks, Rob

Re: Example phrase query with lucene version 4

2013-03-12 Thread Arlei Ferreira Farnetani Junior
rect_hits.3F > . > > > -- > Ian. > > > On Tue, Mar 12, 2013 at 12:45 AM, Arlei Ferreira Farnetani Junior > wrote: > > Hello, could someone give me an example of how to conduct a search in an > > already built index with Lucene 4 mode phrase query using a specific

Re: Example phrase query with lucene version 4

2013-03-12 Thread Ian Lea
-- Ian. On Tue, Mar 12, 2013 at 12:45 AM, Arlei Ferreira Farnetani Junior wrote: > Hello, could someone give me an example of how to conduct a search in an > already built index with Lucene 4 mode phrase query using a specific > analyzer. I tested here with the phrase the search qu

Re: using phrase query with wildcard

2012-07-23 Thread Ahmet Arslan
> I'm trying to create a phrase query with wildcard, from the > forums it seems that the solution is not trivial. > I'm trying to create the following queries: "this is a > phrase*"  OR  "*This is a phrase" and > Get hits on every possibility where the

Re: using phrase query with wildcard

2012-07-22 Thread Jack Krupansky
, 2012 4:51 AM To: java-user@lucene.apache.org Subject: RE: using phrase query with wildcard It can be both. -Original Message- From: Doron Yaacoby [mailto:dor...@gingersoftware.com] Sent: Sunday, 22 July 2012 11:48 To: java-user@lucene.apache.org Subject: RE: using phrase query with wildcard Is

RE: using phrase query with wildcard

2012-07-22 Thread Levin, Ilya
It can be both. -Original Message- From: Doron Yaacoby [mailto:dor...@gingersoftware.com] Sent: Sunday, 22 July 2012 11:48 To: java-user@lucene.apache.org Subject: RE: using phrase query with wildcard Is * a placeholder for a term or a part of a term? -Original Message- From

RE: using phrase query with wildcard

2012-07-22 Thread Doron Yaacoby
Is * a placeholder for a term or a part of a term? -Original Message- From: Levin, Ilya [mailto:ilya.le...@hp.com] Sent: 22 July 2012 11:29 To: java-user@lucene.apache.org Subject: using phrase query with wildcard Hi, I'm trying to create a phrase query with wildcard, from the f

using phrase query with wildcard

2012-07-22 Thread Levin, Ilya
Hi, I'm trying to create a phrase query with wildcard, from the forums it seems that the solution is not trivial. I'm trying to create the following queries: "this is a phrase*" OR "*This is a phrase" and Get hits on every possibility where the * resides. What i

Re: phrase query highlighter spans matching

2011-03-08 Thread shrinath.m
How do I use it? Could you give an example, please? Regards - Shrinath. M -- View this message in context: http://lucene.472066.n3.nabble.com/phrase-query-highlighter-spans-matching-tp828257p2653941.html Sent from the Lucene - Java Users mailing list archive at Nabble.com.

Phrase query with boolean matches

2011-02-14 Thread Christopher Condit
nts in the index. I know that "and" is a stop word, but I'm curious why it's translated into ? instead of a * during this parsing (or just left alone because it's a phrase query)... Can I escape boolean keywords somehow? Here

RE: lucene 3.0.3 | phrase query problem

2011-02-11 Thread Zhang, Lisheng
bruary 10, 2011 10:41 PM To: java-user-h...@lucene.apache.org; java-user@lucene.apache.org Subject: lucene 3.0.3 | phrase query problem Hi Anshum, Thanks for your reply. Yes, I agree with you. Right now I am using StandardAnalyzer; it removes stop words, puts text in lowercase and does not crea

lucene 3.0.3 | phrase query problem

2011-02-10 Thread Ranjit Kumar
Hi Anshum, Thanks for your reply. Yes, I agree with you. Right now I am using StandardAnalyzer; it removes stop words, puts text in lowercase and does not create index entries for the most common words in English. Searching on index created by StandardAnalyzer it gives result as discussed

Re: lucene 3.0.3 | phrase query problem

2011-02-10 Thread Anshum
10, 2011 at 7:59 PM, Ranjit Kumar wrote: > searchString = "i am using sql. server setting is easy task."; > > > > while i am searching for phrase query "Sql Server" in above string it gives > result which is not correct. As In the above string sql and server

lucene 3.0.3 | phrase query problem

2011-02-10 Thread Ranjit Kumar
searchString = "i am using sql. server setting is easy task."; When I search for the phrase query "Sql Server" in the above string, it returns a result, which is not correct, since in the above string sql and server are separated by a dot (.). Using both PhraseQuery and SpanQuery giv

RE: lucene 3.0.3 | phrase query problem

2011-02-09 Thread Zhang, Lisheng
"sql. server" we should not get result? Best regards, Lisheng -Original Message- From: Ranjit Kumar [mailto:ranjit.ku...@otssolutions.com] Sent: Wednesday, February 09, 2011 9:39 PM To: java-user-h...@lucene.apache.org; java-user@lucene.apache.org Subject: lucene 3.0.3 | phrase que

lucene 3.0.3 | phrase query problem

2011-02-09 Thread Ranjit Kumar
Hi, I am using SpanQuery and SpanNearQuery to run a phrase query like "Sql Server". In the text file I am searching, it appears as (sql. server), meaning 'sql dot server', which is not a span like "Sql Server". While searching for phrase query "Sq

Re: Phrase query on multiple fields

2011-01-22 Thread amg qas
query > yourself.  The latter is quite straightforward: > >  BooleanQuery bq = new BooleanQuery(); >  PhraseQuery pq1 = ...; >  PhraseQuery pq2 = ...; >  bq.add(pq1, ...); >  bq.add(pq2, ...); >  etc. > > > -- > Ian. > > > On Thu, Jan 20, 2011 at 3:13 AM, amg

Re: Phrase query on multiple fields

2011-01-20 Thread Ian Lea
bq.add(pq1, ...); bq.add(pq2, ...); etc. -- Ian. On Thu, Jan 20, 2011 at 3:13 AM, amg qas wrote: > Hi, > > I have two question regarding phrase query : > > 1) How can I execute a phrase query over multiple fields ? I can only > get PhraseQuery to work over a single field -

Phrase query on multiple fields

2011-01-19 Thread amg qas
Hi, I have two questions regarding phrase queries: 1) How can I execute a phrase query over multiple fields? I can only get PhraseQuery to work over a single field, e.g. something like this: PhraseQuery query = new PhraseQuery(); query.setSlop

Re: phrase query highlighter spans matching

2010-05-31 Thread Koji Sekiguchi
(10/05/19 13:58), Li Li wrote: hi all, I read lucene in action 2nd Ed. It says SimpleSpanFragmenter will "make fragments that always include the spans matching each document". And also a SpanScorer existed for this use. But I can't find any class named SpanScorer in lucene 3.0.1. And the res

phrase query highlighter spans matching

2010-05-18 Thread Li Li
Hi all, I read Lucene in Action, 2nd Ed. It says SimpleSpanFragmenter will "make fragments that always include the spans matching each document", and that a SpanScorer exists for this use. But I can't find any class named SpanScorer in Lucene 3.0.1. And the result of HighlighterTest class in c

Re: Phrase query with terms at same location

2009-11-19 Thread Erick Erickson
Hmmm, you're beyond what I've tried to do, so all I can do is speculate. But I don't believe that two terms on top of each other are considered when calculating slop. But I really don't know for sure, so I'd create a couple of unit tests to verify. You're right, the combinatorial explosion with pu

Re: Phrase query with terms at same location

2009-11-19 Thread Christopher Tignor
Thanks again for this. I would like to be able to do several things with this data if possible. As per Mark's post, I'd like to be able to query for phrases like "He _v"~1 (where _v is my verb part of speech token) to recover strings like: "He later apologized". This already in fact seems to be worki

Re: Phrase query with terms at same location

2009-11-19 Thread Erick Erickson
Ahhh, I should have followed the link. I was interpreting your first note as emitting two tokens NOT at the same offset. My mistake, ignore my nonsense about unexpected consequences. Your original assumption is correct, zero offsets are pretty transparent. What do you really want to do here? Mark'

Re: Phrase query with terms at same location

2009-11-19 Thread Christopher Tignor
Thanks, Erick - Indeed every word will have a part of speech token, but is this how the slop actually works? My understanding was that if I have two tokens in the same location then each will not affect searches involving the other in terms of the slop, as slop indicates the number of words *between* s

Re: Phrase query with terms at same location

2009-11-19 Thread Erick Erickson
If I'm reading this right, your tokenizer creates two tokens. One "report" and one "_n"... I suspect if so that this will create some "interesting" behaviors. For instance, if you put two tokens in place, are you going to double the slop when you don't care about part of speech? Is every word going

Phrase query with terms at same location

2009-11-18 Thread Christopher Tignor
Hello, I have indexed words in my documents with part of speech tags at the same location as these words using a custom Tokenizer as described, very helpfully, here: http://mail-archives.apache.org/mod_mbox/lucene-java-user/200607.mbox/%3c20060712115026.38897.qm...@web26002.mail.ukl.yahoo.com%3e

Re: term position in phrase query using queryparser

2009-03-02 Thread Matt Ronge
, but I don't know how to search for basically two tokens at the same term position using the queryparser syntax. I don't think this is available from the QueryParser. You could make a subclass that does this for the phrase query syntax. So if you have something like "term1 term

term position in phrase query using queryparser

2009-02-25 Thread Tim Williams
Is there a syntax to set the term position in a query built with queryparser? For example, I would like something like: PhraseQuery q = new PhraseQuery(); q.add(t1, 0); q.add(t2, 0); q.setSlop(0); As I understand it, the slop defaults to 0, but I don't know how to search for basically two tokens

Re: Phrase query-like query that doesn't require all the terms?

2008-11-14 Thread Yonik Seeley
On Fri, Nov 14, 2008 at 12:05 PM, Teruhiko Kurosaka <[EMAIL PROTECTED]> wrote: > My problem with Phrase Query is that it requires > existence of all the terms in documents. I want them more > permissible. I want it to match with lower score. > Does dismax also requires a

RE: Phrase query-like query that doesn't require all the terms?

2008-11-14 Thread Teruhiko Kurosaka
Yonik, Thank you for your reply. My problem with Phrase Query is that it requires existence of all the terms in documents. I want it to be more permissive: I want it to match, with a lower score. Does dismax also require all the terms? > Solr's dismax parser can generate queries that do

Re: Phrase query-like query that doesn't require all the terms?

2008-11-14 Thread Yonik Seeley
Solr's dismax parser can generate queries that do most of this... it's a combination of term queries and sloppy phrase queries. Simplest example: +(DEF GHI) "DEF GHI"~10^5 The only thing that it doesn't work for is the terms out of order (they will still be matched). You could use span queries i

Phrase query-like query that doesn't require all the terms?

2008-11-14 Thread Teruhiko Kurosaka
PhraseQuery requires that all the terms in the phrase exist in the field being searched. I am looking for a more permissive version of PhraseQuery which is sensitive to the order of the terms but allows missing terms, which would lower the score but still match. For example, query "DEF GHI" would

Re: Exact Phrase Query

2008-11-02 Thread semelak ss
g forward to your response.) --- On Sun, 11/2/08, Erick Erickson <[EMAIL PROTECTED]> wrote: > From: Erick Erickson <[EMAIL PROTECTED]> > Subject: Re: Exact Phrase Query > To: java-user@lucene.apache.org, [EMAIL PROTECTED] > Date: Sunday, November 2, 2008, 12:11 PM &

Re: Exact Phrase Query

2008-11-02 Thread Erick Erickson
: score and words > and that no tokenization is needed, what would be the most efficient way for > implementing this index using Lucene ? > > > --- On Sun, 11/2/08, semelak ss <[EMAIL PROTECTED]> wrote: > > > From: semelak ss <[EMAIL PROTECTED]> > > Subject

Re: Exact Phrase Query

2008-11-02 Thread semelak ss
ting this index using Lucene ? --- On Sun, 11/2/08, semelak ss <[EMAIL PROTECTED]> wrote: > From: semelak ss <[EMAIL PROTECTED]> > Subject: Re: Exact Phrase Query > To: java-user@lucene.apache.org > Date: Sunday, November 2, 2008, 7:26 AM > I was in a hurry when copy

Re: Exact Phrase Query

2008-11-02 Thread semelak ss
t way of handling things. I would appreciate any input in this regard on how to improve the efficiency. --- On Sat, 11/1/08, Erick Erickson <[EMAIL PROTECTED]> wrote: > From: Erick Erickson <[EMAIL PROTECTED]> > Subject: Re: Exact Phrase Query > To: java-user@lucene.ap

Re: Exact Phrase Query

2008-11-01 Thread Erick Erickson
th the > indexWriter, but I can not pinpoint the exact cause of the problem. > > > --- On Sat, 11/1/08, semelak ss <[EMAIL PROTECTED]> wrote: > > > From: semelak ss <[EMAIL PROTECTED]> > > Subject: Re: Exact Phrase Query > > To: java-user@lucene.apache.org

Re: Exact Phrase Query

2008-11-01 Thread semelak ss
thing to do with the indexWriter, but I can not pinpoint the exact cause of the problem. --- On Sat, 11/1/08, semelak ss <[EMAIL PROTECTED]> wrote: > From: semelak ss <[EMAIL PROTECTED]> > Subject: Re: Exact Phrase Query > To: java-user@lucene.apache.org > Date: Saturda

Re: Exact Phrase Query

2008-11-01 Thread semelak ss
s (their size is 0) --- On Fri, 10/31/08, semelak ss <[EMAIL PROTECTED]> wrote: > From: semelak ss <[EMAIL PROTECTED]> > Subject: Re: Exact Phrase Query > To: java-user@lucene.apache.org > Date: Friday, October 31, 2008, 9:41 AM > For indexing, I use

Re: Exact Phrase Query

2008-10-31 Thread semelak ss
08, Erick Erickson <[EMAIL PROTECTED]> wrote: > From: Erick Erickson <[EMAIL PROTECTED]> > Subject: Re: Exact Phrase Query > To: java-user@lucene.apache.org, [EMAIL PROTECTED] > Date: Friday, October 31, 2008, 5:57 AM > You need to give us more information for meaningful repli

Re: Exact Phrase Query

2008-10-31 Thread Erick Erickson
You need to give us more information for meaningful replies, like the analyzers you use when indexing and searching, the exact query you use, perhaps the snippets of the code, etc. That said, things to check: Get a copy of Luke and examine your index. You can even run queries through that tool and

Re: Exact Phrase Query

2008-10-31 Thread semelak ss
semelak ss <[EMAIL PROTECTED]> wrote: > From: semelak ss <[EMAIL PROTECTED]> > Subject: Exact Phrase Query > To: java-user@lucene.apache.org > Date: Friday, October 31, 2008, 5:44 AM > I have documents containing multiple words in the the field > "word" > f

Exact Phrase Query

2008-10-31 Thread semelak ss
I have documents containing multiple words in the field "word". For example, one of the documents contains the following in the field "word": homeowners work. When searching for single words (e.g. homeowners) I get hits. However, searching for the exact phrase "homeowners work" gives me no hits

Re: Phrase Query

2008-09-17 Thread Cam Bazz
" - so I want a tokenizer to take those two, and make two tokens as keyword analyzer would do. This way one can search bidirectionally edges of a graph with one phrase query of slop 2. Is it possible to do such a manipulation? Best. On Tue, Sep 16, 2008 at 7:15 PM, Otis Gospodnetic <[EMAIL PROTE

Re: Phrase Query

2008-09-16 Thread Otis Gospodnetic
Are the terms stopwords? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Cam Bazz <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Sent: Tuesday, September 16, 2008 1:33:48 AM > Subject: Phrase Query > >

Re: Phrase Query

2008-09-16 Thread Antony Bowesman
Is it possible to write a document with different analyzers in different fields? PerFieldAnalyzerWrapper

Phrase Query

2008-09-15 Thread Cam Bazz
Hello, Let's say I have two documents, both containing field F. document 0 has the string "a b" as F document 1 has the string "b a" as F I am trying to make a phrase query like: PhraseQuery pq = new PhraseQuery(); pq.add(new Term("F", "a")); pq.add(new Term("F", "b"));

Re: Phrase Query

2008-09-15 Thread Cam Bazz
I noticed this was because I was using a KeywordAnalyzer. Is it possible to write a document with different analyzers in different fields? Best. On Tue, Sep 16, 2008 at 8:33 AM, Cam Bazz <[EMAIL PROTECTED]> wrote: > Hello, > > Lets say I have two documents, both containing field F. > > document

Re: combine wildcard and phrase query

2008-03-07 Thread Chris Hostetter
: No, as far as I know you can't combine wildcards in phrases. This would The QueryParser doesn't support it, and there is no native query type for it, but if you are willing to do the query expansion yourself, you can build a MultiPhraseQuery (where you generate the terms using a WildcardTerm
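
The "do the query expansion yourself" idea can be sketched without Lucene. The toy below, in plain Java, expands a trailing-`*` slot into every indexed term with that prefix and then requires one alternative per slot at consecutive positions, which is roughly the shape of query a MultiPhraseQuery represents. It is an illustrative model only: `PrefixPhraseDemo`, `expand`, and `matches` are hypothetical names, the term dictionary is built from a single document rather than a real index, and matching is case-sensitive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical toy, not Lucene: a phrase whose slots may each be a
// set of alternative terms, produced by prefix expansion.
final class PrefixPhraseDemo {

    /** Expands one query slot; a trailing '*' means "every indexed term with this prefix". */
    static List<String> expand(String slot, TreeSet<String> dictionary) {
        List<String> alternatives = new ArrayList<>();
        if (slot.endsWith("*")) {
            String prefix = slot.substring(0, slot.length() - 1);
            // Terms sharing a prefix are contiguous in sorted order.
            for (String term : dictionary.tailSet(prefix)) {
                if (!term.startsWith(prefix)) {
                    break;
                }
                alternatives.add(term);
            }
        } else {
            alternatives.add(slot);
        }
        return alternatives;
    }

    /** True if some run of consecutive tokens satisfies every slot in order. */
    static boolean matches(String document, String phrase) {
        String[] tokens = document.trim().split("\\s+");
        TreeSet<String> dictionary = new TreeSet<>(Arrays.asList(tokens));
        String[] slots = phrase.trim().split("\\s+");
        List<List<String>> alternatives = new ArrayList<>();
        for (String slot : slots) {
            alternatives.add(expand(slot, dictionary));
        }
        for (int start = 0; start + slots.length <= tokens.length; start++) {
            boolean ok = true;
            for (int i = 0; i < slots.length && ok; i++) {
                ok = alternatives.get(i).contains(tokens[start + i]);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matches("LA A 201", "LA A 20*"));
        System.out.println(matches("LC B 300", "LA A 20*"));
    }
}
```

Using the signature values from this thread, "LA A 20*" matches the value "LA A 201" but not "LC B 300". A broad prefix can expand to a very large number of terms, which is one reason the expansion step is left to the caller.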

Re: combine wildcard and phrase query

2008-03-07 Thread Erick Erickson
201 > signature: LA A 202 > signature: LA B 200 > signature: LC B 300 > Now i use getFields and search them. > Let's assume i'm searching for a signature like "LA B 200". If i use a > phrase query, no problem. I search all the fields and the only if the > field

Re: combine wildcard and phrase query

2008-03-07 Thread JensBurkhardt
assume I'm searching for a signature like "LA B 200". If I use a phrase query, no problem. I search all the fields and only if the field value and query exactly match do I get a hit. But what if you want to use wildcards and search for something like LA A 20*? Now all the LA signatures w

Re: combine wildcard and phrase query

2008-03-06 Thread JensBurkhardt
has several signature numbers i want to save them in a field >> signature and when i search for such a number i want the search hit every >> single field and not all fields together. >> Right now i separate the string using an unique separator (in this case >> just >> $

Re: combine wildcard and phrase query

2008-03-06 Thread Erick Erickson
; Right now i separate the string using an unique separator (in this case > just > $$$) so i can split the string into the numbers but i think this is kinda > the worst form doing it. > > > > > JensBurkhardt wrote: > > > > hey everybody, > > > >

Re: combine wildcard and phrase query

2008-03-06 Thread JensBurkhardt
: > > hey everybody, > > I'm wondering if it's possible to combine wildcards and phrase query. > > For example "term1 term*" > > I know that the documentation says "Lucene supports single and multiple > character wildcard searches within single

combine wildcard and phrase query

2008-03-06 Thread JensBurkhardt
hey everybody, I'm wondering if it's possible to combine wildcards and phrase query. For example "term1 term*" I know that the documentation says "Lucene supports single and multiple character wildcard searches within single terms (not within phrase queries)"

Re: Phrase Query Problem

2007-12-18 Thread Erick Erickson
ing is to make sure the exact query requirement, > then picking up analyzer. > > Best regards, Lisheng > > -- > View this message in context: > http://www.nabble.com/Phrase-Query-Problem-tp14373945p14404143.html > Sent from the Lucene - Java Users mailing list archive at Nabble.com. > > > - > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > >

RE: Phrase Query Problem

2007-12-18 Thread Sirish Vadala
hing is to make sure the exact query requirement, then picking up analyzer. Best regards, Lisheng -- View this message in context: http://www.nabble.com/Phrase-Query-Problem-tp14373945p14404143.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. ---

RE: Phrase Query Problem

2007-12-18 Thread Zhang, Lisheng
ish Vadala [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 18, 2007 10:26 AM To: java-user@lucene.apache.org Subject: RE: Phrase Query Problem ok, thnx... I will implement using the WhiteSpaceAnalyzer... Let me check the indexing speed... I mean time taken to index my data set... If that takes t

RE: Phrase Query Problem

2007-12-18 Thread Sirish Vadala
-Original Message- > From: mark harwood [mailto:[EMAIL PROTECTED] > Sent: Tuesday, December 18, 2007 9:42 AM > To: java-user@lucene.apache.org > Subject: Re: Phrase Query Problem > > > You could write a custom analyzer that drops stopwords but adds an extra 1 > to the "

RE: Phrase Query Problem

2007-12-18 Thread Zhang, Lisheng
- From: mark harwood [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 18, 2007 9:42 AM To: java-user@lucene.apache.org Subject: Re: Phrase Query Problem You could write a custom analyzer that drops stopwords but adds an extra 1 to the "positionIncrement" property for the next v

Re: Phrase Query Problem

2007-12-18 Thread mark harwood
ecause the remaining words are not recorded as being directly next to each other) Cheers Mark - Original Message From: Sirish Vadala <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Tuesday, 18 December, 2007 5:10:19 PM Subject: RE: Phrase Query Problem Yes... If my
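
The suggestion of dropping stop words while still advancing the position counter can be modelled in a few lines. This is a minimal sketch in plain Java under stated assumptions: a whitespace tokenizer, a tiny made-up stop list, and an exact (slop 0) phrase match whose query terms are taken to be adjacent. It is not Lucene code; `StopwordPositionDemo` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy index, not Lucene: shows the difference between
// removing stop words outright and leaving a position hole for them.
final class StopwordPositionDemo {

    // Assumed stop list, for illustration only.
    static final Set<String> STOP_WORDS = Set.of("and", "or", "in", "the");

    /** Drops stop words; when keepHoles is true a dropped word still consumes a position. */
    static Map<String, List<Integer>> index(String text, boolean keepHoles) {
        Map<String, List<Integer>> postings = new HashMap<>();
        int position = 0;
        for (String token : text.toLowerCase().trim().split("\\s+")) {
            if (STOP_WORDS.contains(token)) {
                if (keepHoles) {
                    position++; // the "extra 1" added to the next token's increment
                }
                continue;
            }
            postings.computeIfAbsent(token, k -> new ArrayList<>()).add(position);
            position++;
        }
        return postings;
    }

    /** Exact phrase match: term i must occur at position start + i. */
    static boolean matchesExactPhrase(Map<String, List<Integer>> postings, String... terms) {
        for (int start : postings.getOrDefault(terms[0], List.of())) {
            boolean ok = true;
            for (int i = 1; i < terms.length && ok; i++) {
                ok = postings.getOrDefault(terms[i], List.of()).contains(start + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "Health and Safety";
        System.out.println(matchesExactPhrase(index(text, false), "health", "safety"));
        System.out.println(matchesExactPhrase(index(text, true), "health", "safety"));
    }
}
```

Without holes, "health" and "safety" end up at adjacent positions and the phrase "health safety" matches "Health and Safety"; with holes they are two positions apart and the exact phrase no longer matches, which is the behaviour being asked for in this thread.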

RE: Phrase Query Problem

2007-12-18 Thread Sirish Vadala
ers out "and" (also "or", "in" and others) as stop > words during indexing, and the QueryParser filters those > words out also. > > Best regards, Lisheng > > -Original Message- > From: Sirish Vadala [mailto:[EMAIL PROTECTED] > Sent: Monday,

RE: Phrase Query Problem

2007-12-17 Thread Zhang, Lisheng
Hi Sirish, A few hours ago I sent a reply to your message, if my understanding is correct, you indexed a doc with text as Health and Safety and you used phrase Health Safety to create a phrase query. If that is the case, this is normal since you used StandardAnalyzer to tokenize the input

RE: Phrase Query Problem

2007-12-17 Thread Zhang, Lisheng
eryParser filters those words out also. Best regards, Lisheng -Original Message- From: Sirish Vadala [mailto:[EMAIL PROTECTED] Sent: Monday, December 17, 2007 9:49 AM To: java-user@lucene.apache.org Subject: Phrase Query Problem I have the following code for search: BooleanQuery b

Phrase Query Problem

2007-12-17 Thread Sirish Vadala
archer.search(bQuery, sort); Now my problem is: if I search for the phrase Health Safety, it fetches all the records in which the text is Health and/or/in Safety. It fetches these records even after I set the slop of the phrase query to zero for an exact match. I am u

Re: Maximum phrase query?

2007-07-30 Thread Erick Erickson
not that I know of Erick On 7/30/07, Max Metral <[EMAIL PROTECTED]> wrote: > > I have a set of tags associated with content in my corpus. I also have > normal text. Our system tries to figure out which "words" are tags and > which are text, and falls back on text when tags fail. I'm wonder

Maximum phrase query?

2007-07-30 Thread Max Metral
I have a set of tags associated with content in my corpus. I also have normal text. Our system tries to figure out which "words" are tags and which are text, and falls back on text when tags fail. I'm wondering, is there anything in Lucene which might help disambiguate multi-word tags from text?

Re: highlighting phrase query

2007-07-03 Thread Mark Miller
Has anyone used Lucene-794? How stable is it? Is it widely used in industry? I have used it extensively and I would say it is extremely stable. As I said, much of the code from it is literally the same compiled code from Contrib Highlighter (It is really just a new Scorer class for the

Re: highlighting phrase query

2007-07-02 Thread sandeep chawla
to my manager. :) --Renaud -Original Message- From: Mark Miller [mailto:[EMAIL PROTECTED] Sent: Monday, July 02, 2007 2:11 PM To: java-user@lucene.apache.org Subject: Re: highlighting phrase query There has been a lot of Highlighter discussion lately, but just to try and sum up the st

RE: highlighting phrase query

2007-07-02 Thread Renaud Waldura
Mark: Thanks a million for this comprehensive analysis. This is going straight to my manager. :) --Renaud -Original Message- From: Mark Miller [mailto:[EMAIL PROTECTED] Sent: Monday, July 02, 2007 2:11 PM To: java-user@lucene.apache.org Subject: Re: highlighting phrase query There

Re: highlighting phrase query

2007-07-02 Thread Mark Miller
There has been a lot of Highlighter discussion lately, but just to try and sum up the state of Highlighting in the Lucene world: There are four Highlighter implementations that I know of. From what I can tell, only the original Contrib Highlighter has received sustained active development by m

highlighting phrase query

2007-07-02 Thread sandeep chawla
Hi All, I am developing a search tool using Lucene 2.1. I have a requirement to highlight query words in the results. Lucene-highlighter 2.1 doesn't work well when highlighting phrase queries. For example, if I have a query string "lucene Java", it highlights not only occurrence

Re: I have a question about phrase query with stop words

2007-04-13 Thread Paul Elschot
is a stop word, so matching "find an answer" is as expected, but > > there > > is no stop word between "you" and "find" in the original input string. I > > do > > not see why "you find an answer" matches. > > > > What am

Re: I have a question about phrase query with stop words

2007-04-12 Thread Erick Erickson
As I understand it, there really is no "space indicator". I think of it as replacing the stop word with a space, which is then discarded. so, you're indexing 'you find answer', and both your searches are looking for 'you find answer', the stop words are just gone as though they never were. So bo

I have a question about phrase query with stop words

2007-04-12 Thread Bill Taylor
I found some discussions of this question from back in 2003, but that was many updates ago. I have built an index using the standard stop analyzer, which uses the standard list of stop words. "will" and "the" are stop words. As I understand analyzers and phrase queries, when I search for you wi

Re: Lucene 2.1, inconsistent phrase query results with slop

2007-03-08 Thread Erick Erickson
Sorry about that. I think I found the diagram you're talking about on page 89. It even addresses the exact problem I'm talking about. It's not the first time I've looked like a fool, you'd think I'd be getting used to it by now . So, it seems like the most reasonable solution to this issue woul

Re: Lucene 2.1, inconsistent phrase query results with slop

2007-03-08 Thread Chris Hostetter
: I think that's "working as designed". Although I could understand : someone wanting it to work differently. The slop is sort of like the : edit distance from the current given phrase, hence the order of terms : in the phrase matters. correct ... LIA has a great diagram explaining this ... th

Re: Lucene 2.1, inconsistent phrase query results with slop

2007-03-08 Thread Yonik Seeley
On 3/8/07, Erick Erickson <[EMAIL PROTECTED]> wrote: In a nutshell, reversing the order of the terms in a phrase query can result in different hit counts. That is, "person place"~3 may return different results from "place person"~3, depending on the number of interveni

Lucene 2.1, inconsistent phrase query results with slop

2007-03-08 Thread Erick Erickson
In a nutshell, reversing the order of the terms in a phrase query can result in different hit counts. That is, "person place"~3 may return different results from "place person"~3, depending on the number of intervening terms. There's a self-contained program below
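
The asymmetry can be reproduced with a simplified model of slop for a two-term phrase. The sketch below, in plain Java, treats slop as the distance between the two term positions after each has been shifted by its offset in the query (0 for the first term, 1 for the second). That is an assumption made for illustration, limited to two distinct terms, and not Lucene's actual sloppy phrase scorer; `SlopOrderDemo` and `matches` are hypothetical names.

```java
// Hypothetical toy, not Lucene: models why "person place"~3 and
// "place person"~3 can return different hits.
final class SlopOrderDemo {

    /** Sloppy match for the two-term phrase "first second" with the given slop. */
    static boolean matches(String document, String first, String second, int slop) {
        String[] tokens = document.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (!tokens[i].equals(first)) {
                continue;
            }
            for (int j = 0; j < tokens.length; j++) {
                if (i == j || !tokens[j].equals(second)) {
                    continue;
                }
                // Shift each position by the term's offset in the query (0 and 1).
                int shiftedFirst = i;
                int shiftedSecond = j - 1;
                if (Math.abs(shiftedFirst - shiftedSecond) <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String doc = "person a b place";
        System.out.println(matches(doc, "person", "place", 3));
        System.out.println(matches(doc, "place", "person", 3));
        System.out.println(matches(doc, "place", "person", 4));
    }
}
```

Under this model, a document with two words between "person" and "place" needs a slop of 2 when the query is in document order but a slop of 4 when the query is reversed, so a slop of 3 matches one ordering and not the other.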

wildcard in phrase query: problem with idf / scoring; QueryParser; MultiPhraseQuery

2006-07-03 Thread W.H. van Atteveldt
Assuming it is, I subclassed the QueryParser to handle phrase prefixes, replacing getFieldQuery by a version that calls the super, and if the result is a phrase query and the text contains an asterisk tries to create a MultiPhraseQuery. This seems to work ok, returning a MultiPhraseQuery "micr

Re: Phrase Query query

2006-03-28 Thread Richard Gunderson
Hi Otis Thanks for the information. I'm actually writing something to search files containing code (such as JSP files) so I do expect there will be a few problems like this because I guess Lucene's out-of-the box analyzers are really suited to natural languages. But, I was wondering if you could

Re: Phrase Query query

2006-03-27 Thread Otis Gospodnetic
!Character.isWhitespace(c); } Otis - Original Message From: Richard Gunderson <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Monday, March 27, 2006 10:56:18 AM Subject: Phrase Query query Hi I'm using PhraseQuery in conjunction with WhiteSpaceAnalyzer but it's giving me
