Hi Steve, I went through the article. Thanks for the link. The span query mentions the position i.e n positions from the terms. The problem was like this :
Lucene was <some more text> made by Doug <some more text> cutting If Doug is found between the words Lucene and cutting then it is a hit. [there can be any number of positions which is unknown]. If the number of positions are known then the spans could be used. I came across qsol where in the paragraphseperator and sentence seperator can be specified and string can be searched within the paragraph. Can you give your comments. - Rgds Ba3 Steven A Rowe wrote: > > Hi ba3, > > Did you read the Lucid Imagination article I linked to?: > > http://www.lucidimagination.com/blog/2009/07/18/the-spanquery/ > > > It has examples, including specifying the term indicating the end of the > span. > > If the article doesn't do it for you, I need more information to be able > to help. Can you give an example of what you want to do? > > Thanks, > Steve > >> -----Original Message----- >> From: ba3 [mailto:sbadhrin...@gmail.com] >> Sent: Tuesday, July 28, 2009 10:39 PM >> To: java-user@lucene.apache.org >> Subject: RE: Multiline Regex with Lucene >> >> >> Hi Steve, >> >> In case of span queries, the span first query can specify the start of >> the >> span, is it possible to specify the term [not the position] indicating >> the >> end of the span ? >> >> -- Regards >> Ba3 >> >> >> Steven A Rowe wrote: >> > >> > Hi ba3, >> > >> > Check out the list of "Direct Known Subclasses" from the SpanQuery >> > javadocs to see what's available: >> > >> > >> http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/search/spans/ >> SpanQuery.html >> > >> > SpanRegexQuery may be what you're looking for: >> > >> > >> http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/search/regex/ >> SpanRegexQuery.html >> > >> > >> > Steve >> > >> >> -----Original Message----- >> >> From: ba3 [mailto:sbadhrin...@gmail.com] >> >> Sent: Tuesday, July 28, 2009 12:53 PM >> >> To: java-user@lucene.apache.org >> >> Subject: Re: Multiline Regex with Lucene >> >> >> >> >> >> Hi, >> >> >> >> Thanks for the pointers. I will try the span queries. >> >> But can span query support regexp as a term ? >> >> >> >> Also for more details in the problem : >> >> The problem is like this: >> >> find a search string inside a block of statements. >> >> The block starts with a string and ends with a character. >> >> >> >> -- Regards >> >> Ba3 >> >> >> >> >> >> >> >> Erick Erickson wrote: >> >> > >> >> > I doubt you're thinking in terms of tokens. Your inputstream is >> broken >> >> up >> >> > into tokens (think of them as words, >> >> > depending upon the analyzer) and regex searchers are >> >> > confined to those *tokens*. So the concept of a multi-line >> >> > regex in a search is kind of ...odd... >> >> > >> >> > You could possibly index your input as UN_TOKENIZED, but >> >> > I really have no clue what Lucene would do with that. I think >> >> > you're off in uncharted territory here. >> >> > >> >> > Perhaps a better thing would be for you to explain *why* you >> >> > want to do this and perhaps folks can come up with some >> >> > suggestions, I suspect this may be an XY problem, see >> >> > http://www.perlmonks.org/index.pl?node_id=542341 >> >> > >> >> > Best >> >> > Erick >> >> > >> >> > On Sun, Jul 26, 2009 at 9:52 AM, ba3 <sbadhrin...@gmail.com> >> wrote: >> >> > >> >> >> >> >> >> I was trying to do a regex search with the lucene and >> >> >> JavaUtilRegexCapabilities. >> >> >> The code used is : >> >> >> RegexQuery query = new RegexQuery(new >> >> >> Term("contents","(?m)hello.*(\r[^#]*)This is to be >> >> >> searched.*(\r[^#]*)#")); >> >> >> query.setRegexImplementation(new JavaUtilRegexCapabilities()); >> >> >> >> >> >> I verified the regex in : http://www.gskinner.com/RegExr/ [with >> the >> >> >> multi >> >> >> line checked] >> >> >> In lucene though there are no hits. Can you please point me in >> the >> >> right >> >> >> direction >> >> >> >> >> >> -- Rgds >> >> >> Ba3 >> >> >> >> >> >> Regex : >> >> >> hello.*(\r[^#]*)This is to be searched.*(\r[^#]*)# >> >> >> >> >> >> Content : >> >> >> hello world >> >> >> This is to be searched >> >> >> # >> >> >> Test line should not be selected >> >> >> hello >> >> >> This should not work >> >> >> some other lines >> >> >> # >> >> >> Not to be selected >> >> >> hello world >> >> >> Some lines >> >> >> This is to be searched >> >> >> Some lines >> >> >> # >> >> >> hello earth >> >> >> some lines >> >> >> # >> >> >> -- >> >> >> View this message in context: >> >> >> >> >> http://www.nabble.com/Multiline-Regex-with-Lucene- >> tp24667109p24667109.html >> >> >> Sent from the Lucene - Java Users mailing list archive at >> Nabble.com. >> >> >> >> >> >> >> >> >> ----------------------------------------------------------------- >> ---- >> >> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> >> >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> >> >> >> >> >> >> > >> >> > >> >> >> >> -- >> >> View this message in context: http://www.nabble.com/Multiline-Regex- >> with- >> >> Lucene-tp24667109p24703547.html >> >> Sent from the Lucene - Java Users mailing list archive at >> Nabble.com. >> >> >> >> >> >> -------------------------------------------------------------------- >> - >> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> > >> > >> > --------------------------------------------------------------------- >> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> > For additional commands, e-mail: java-user-h...@lucene.apache.org >> > >> > >> > >> >> -- >> View this message in context: http://www.nabble.com/Multiline-Regex- >> with-Lucene-tp24667109p24711445.html >> Sent from the Lucene - Java Users mailing list archive at Nabble.com. >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > > -- View this message in context: http://www.nabble.com/Multiline-Regex-with-Lucene-tp24667109p24723404.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org