The RegexQuery class uses that package, and for that reason the expression matches.
If my records contained only one word each, this code would work, but I need
to apply that regular expression to a phrase...


Ian Lea wrote:
>
> The default regex package is java.util.regex and I can't see anywhere
> that you tell it to use the Jakarta regexp package.  So I don't think
> that ".in" will match.  Also, you are storing your contents field as
> NOT_ANALYZED so you will need to be wary of case sensitivity.  Maybe
> this is what you want, but maybe not.
>
> --
> Ian.
>
> On Mon, May 11, 2009 at 9:00 AM, Huntsman84 <tpgarci...@gmail.com> wrote:
>>
>> This is the code for searching:
>>
>>     String index = "index";
>>     String field = "contents";
>>     IndexReader reader = IndexReader.open(index);
>>     Searcher searcher = new IndexSearcher(reader);
>>
>>     System.out.println("Enter query: ");
>>     String line = ".IN."; // in jakarta regexp this is like * IN *
>>     RegexQuery rxquery = new RegexQuery(new Term(field, line));
>>     Hits hits = searcher.search(rxquery);
>>
>>     if (hits != null) {
>>         for (int k = 0; k < 100 && k < hits.length(); k++) {
>>             if (hits.doc(k) != null)
>>                 System.out.println(hits.doc(k).getField("contents").stringValue());
>>         }
>>     }
>>
>> And this is the part that creates the index:
>>
>>     File directory = new File("index");
>>     IndexWriter writer = new IndexWriter(directory, new StandardAnalyzer(),
>>             true, IndexWriter.MaxFieldLength.LIMITED);
>>     // returns a list of record values from database, all of them are phrases
>>     List<String> records = getRecords();
>>     Iterator<String> i = records.iterator();
>>     while (i.hasNext()) {
>>         Document doc = new Document();
>>         doc.add(new Field(field, i.next(), Field.Store.YES,
>>                 Field.Index.NOT_ANALYZED));
>>         writer.addDocument(doc);
>>     }
>>     writer.optimize();
>>     writer.close();
>>
>> This code works as I want, but it only matches the first word of the
>> phrase. I think the problem is in how the index is built, but I don't
>> know how to fix it...
>>
>> Any ideas?
>>
>> Thank you so much!!
>>
>> Steven A Rowe wrote:
>>>
>>> On 5/8/2009 at 9:13 AM, Ian Lea wrote:
>>>> I'm surprised that it matches either - don't you need ".*in" where .*
>>>> means match any character zero or more times?  See the javadoc for
>>>> java.util.regex.Pattern, or for Jakarta Regexp if you are using that
>>>> package.
>>>>
>>>> Unless you're an expert in regexps it is probably worth playing with
>>>> them outside your lucene code to start with e.g. with simple
>>>> String.matches(regexp) calls.  They can take some getting used to.
>>>> And try to avoid anything with backslashes if you can!
>>>
>>> The java.util.regex.Pattern implementation (the default RegexQuery
>>> implementation) actually uses Matcher.lookingAt(), which is equivalent
>>> to prepending a "^" anchor to the beginning of the pattern, so if
>>> Huntsman84 is using the default implementation, then I agree with Ian:
>>> I'm surprised it matches either.
>>>
>>> However, the Jakarta Regexp implementation uses RE.match(), which does
>>> *not* require a beginning-of-string match.
>>>
>>> Huntsman84, are you using the Jakarta Regexp implementation?  If so,
>>> then like you, I'm surprised it's not matching both :).
>>>
>>> It would be useful to see some real code, including how you index your
>>> records.
>>>
>>> Steve
>>>
>>>> On Fri, May 8, 2009 at 1:42 PM, Huntsman84 <tpgarci...@gmail.com>
>>>> wrote:
>>>> >
>>>> > Hi,
>>>> >
>>>> > I am using RegexQuery for searching in a set of records which are
>>>> > phrases of several words each. My aim is to find any phrase that
>>>> > contains the given group of letters (e.g. "in"). For that case,
>>>> > I am building the query with the regular expression ".in.", so it
>>>> > should return all phrases that contain "in", but the search only
>>>> > matches with the first word of the phrase.
>>>> >
>>>> > For example, if my records are "Knowing yourself" and "Old
>>>> > clinic", the correct search would return 2 matches, but it only
>>>> > matches with "Knowing yourself".
>>>> >
>>>> > How could I fix this?
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/RegexQuery-Incomplete-Results-tp23445235p23478720.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>

--
View this message in context: http://www.nabble.com/RegexQuery-Incomplete-Results-tp23445235p23482532.html
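
For anyone following the thread, Ian's suggestion to try the patterns outside
Lucene, and Steve's point about Matcher.lookingAt(), can be checked with plain
java.util.regex. The sketch below is only an illustration under assumptions: it
does not call RegexQuery or Jakarta Regexp at all, the class and helper names
(RegexAnchoringDemo, prefixMatch, containsMatch) are made up for this example,
and it treats each whole phrase as the string being matched, which is what
indexing with NOT_ANALYZED would give.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Lucene: compares anchored and
// unanchored matching in java.util.regex on the two records from the thread.
class RegexAnchoringDemo {

    // Prefix match: Matcher.lookingAt() requires the match to start at index 0.
    static boolean prefixMatch(String regex, String text) {
        return Pattern.compile(regex).matcher(text).lookingAt();
    }

    // Unanchored match: Matcher.find() accepts a match anywhere in the text.
    static boolean containsMatch(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String[] records = {"Knowing yourself", "Old clinic"};
        // ".IN." is included to show that matching is case sensitive by default.
        String[] patterns = {".in.", ".*in", ".IN."};
        for (String regex : patterns) {
            for (String record : records) {
                System.out.printf("%-5s on \"%s\": lookingAt=%b find=%b%n",
                        regex, record,
                        prefixMatch(regex, record), containsMatch(regex, record));
            }
        }
    }
}
```

With java.util.regex, ".in." fails lookingAt() on both records but succeeds
with find() on both, ".*in" succeeds with lookingAt() on both, and the
upper-case ".IN." matches neither. This only illustrates the anchoring and
case-sensitivity points raised above; it does not by itself explain why only
one record is returned by the Jakarta Regexp implementation.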