Thanks
it works

On 5/12/09, Mark Miller <markrmil...@gmail.com> wrote:
> Use JavaUtilRegexCapabilities or put the Jakarata RegEx jar on your
> classpath: http://jakarta.apache.org/regexp/index.html
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
> Seid Mohammed wrote:
>> I need it similar functionality, but while running the above code it
>> breaks after outputing the following
>> ========================================================================
>> Added Knowing yourself
>> Added Old clinic
>> Added INSIDE
>> Added Not INSIDE
>>
>> Default
>> regexcapabilities=org.apache.lucene.search.regex.javautilregexcapabilit...@0
>>
>> org.apache.lucene.search.regex.javautilregexcapabilit...@0
>> 0 hits for text:.in
>> 2 hits for text:.*in
>> 0 hits for text:.IN
>> 2 hits for text:.*IN
>> org.apache.lucene.search.regex.jakartaregexpcapabilit...@0
>> Exception in thread "main" java.lang.NoClassDefFoundError:
>> org/apache/regexp/RE
>>      at
>> org.apache.lucene.search.regex.JakartaRegexpCapabilities.compile(JakartaRegexpCapabilities.java:32)
>>      at
>> org.apache.lucene.search.regex.RegexTermEnum.<init>(RegexTermEnum.java:47)
>>      at org.apache.lucene.search.regex.RegexQuery.getEnum(RegexQuery.java:59)
>>      at
>> org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:55)
>>      at 
>> org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:162)
>>      at org.apache.lucene.search.Query.weight(Query.java:94)
>>      at org.apache.lucene.search.Hits.<init>(Hits.java:76)
>>      at org.apache.lucene.search.Searcher.search(Searcher.java:50)
>>      at org.apache.lucene.search.Searcher.search(Searcher.java:40)
>>      at Regex2.main(Regex2.java:43)
>> Caused by: java.lang.ClassNotFoundException: org.apache.regexp.RE
>>      at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
>>      at java.security.AccessController.doPrivileged(Native Method)
>>      at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>>      at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>>      at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:276)
>>      at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
>>      at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
>>      ... 10 more
>> ===================================================================
>>
>> thanks a lot
>>
>> On 5/11/09, Huntsman84 <tpgarci...@gmail.com> wrote:
>>
>>> That's it!!!
>>>
>>> The problem was with the regular expression, the one I need is ".*IN"!!
>>>
>>> Thank you so much, I was turning mad... =)
>>>
>>>
>>> Ian Lea wrote:
>>>
>>>> The little self-contained program below runs regex queries for a few
>>>> regexps against a few phrases for both the java.util and jakarta
>>>> regexp packages.
>>>>
>>>> Output when run with lucene 2.4.1 and jakarta-regexp 1.5 is
>>>>
>>>> Added Knowing yourself
>>>> Added Old clinic
>>>> Added INSIDE
>>>> Added Not INSIDE
>>>>
>>>> Default
>>>> regexcapabilities=org.apache.lucene.search.regex.javautilregexcapabilit...@0
>>>>
>>>> org.apache.lucene.search.regex.javautilregexcapabilit...@0
>>>> 0 hits for text:.in
>>>> 2 hits for text:.*in
>>>> 0 hits for text:.IN
>>>> 2 hits for text:.*IN
>>>> org.apache.lucene.search.regex.jakartaregexpcapabilit...@0
>>>> 2 hits for text:.in
>>>> 2 hits for text:.*in
>>>> 1 hits for text:.IN
>>>> 2 hits for text:.*IN
>>>>
>>>> Hope that helps.
>>>>
>>>> --
>>>> Ian.
>>>>
>>>>
>>>> import org.apache.lucene.index.*;
>>>> import org.apache.lucene.store.*;
>>>> import org.apache.lucene.document.*;
>>>> import org.apache.lucene.analysis.*;
>>>> import org.apache.lucene.analysis.standard.*;
>>>> import org.apache.lucene.search.*;
>>>> import org.apache.lucene.search.regex.*;
>>>>
>>>> public class luctest {
>>>>
>>>>     public static void main(String[] _args) throws Exception {
>>>>    RAMDirectory rdir = new RAMDirectory();
>>>>    IndexWriter writer = new IndexWriter(rdir, new StandardAnalyzer(),
>>>> true);
>>>>    String[] docterms = { "Knowing yourself",
>>>>                          "Old clinic",
>>>>                          "INSIDE",
>>>>                          "Not INSIDE" };
>>>>
>>>>    for (String s : docterms) {
>>>>        Document d = new Document();
>>>>        d.add(new Field("text",
>>>>                        s,
>>>>                        Field.Store.YES,
>>>>                        Field.Index.NOT_ANALYZED));
>>>>        writer.addDocument(d);
>>>>        System.out.printf("Added %s\n", s);
>>>>    }
>>>>    writer.close();
>>>>
>>>>    IndexSearcher searcher = new IndexSearcher(rdir);
>>>>    String[] queries = { ".in", ".*in", ".IN", ".*IN" };
>>>>    RegexCapabilities[] rcaps = { new JavaUtilRegexCapabilities(),
>>>>                                  new JakartaRegexpCapabilities() };
>>>>    RegexQuery qx = new RegexQuery(new Term("x", "x"));
>>>>    System.out.printf("\nDefault RegexCapabilities=%s\n\n",
>>>>                      qx.getRegexImplementation());
>>>>    for (RegexCapabilities rcap : rcaps) {
>>>>        System.out.println(rcap);
>>>>        for (String s : queries) {
>>>>            Term t = new Term("text", s);
>>>>            RegexQuery q = new RegexQuery(t);
>>>>            q.setRegexImplementation(rcap);
>>>>            Hits h = searcher.search(q);
>>>>            System.out.printf("%s hits for %s\n",
>>>>                              h.length(),
>>>>                              q.toString());
>>>>        }
>>>>    }
>>>>     }
>>>> }
>>>>
>>>>
>>>> On Mon, May 11, 2009 at 1:39 PM, Huntsman84 <tpgarci...@gmail.com>
>>>> wrote:
>>>>
>>>>> The RegexQuery class uses that package, and for that reason the
>>>>> expression
>>>>> matches.
>>>>>
>>>>> If my records contained only one word each, this code would work, but I
>>>>> need
>>>>> to apply that regular expression to a phrase...
>>>>>
>>>>>
>>>>> Ian Lea wrote:
>>>>>
>>>>>> The default regex package is java.util.regex and I can't see anywhere
>>>>>> that you tell it to use the Jakarta regexp package.  So I don't think
>>>>>> that ".in" will match.  Also, you are storing your contents field as
>>>>>> NOT_ANALYZED so you will need to be wary of case sensitivity.  Maybe
>>>>>> this is what you want, but maybe not.
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Ian.
>>>>>>
>>>>>>
>>>>>> On Mon, May 11, 2009 at 9:00 AM, Huntsman84 <tpgarci...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> This is the code for searching:
>>>>>>>
>>>>>>> String index = "index";
>>>>>>> String field = "contents";
>>>>>>> IndexReader reader = IndexReader.open(index);
>>>>>>> Searcher searcher = new IndexSearcher(reader);
>>>>>>>
>>>>>>> System.out.println("Enter query: ");
>>>>>>> String line = ".IN.";//in jakarta regexp this is like * IN *
>>>>>>> RegexQuery rxquery = new RegexQuery(new Term(field,line));
>>>>>>> Hits hits = searcher.search(rxquery);
>>>>>>>
>>>>>>> if(hits!=null){
>>>>>>>    for(int k = 0; k<100 && k<hits.length(); k++){
>>>>>>>        if(hits.doc(k)!=null)
>>>>>>>
>>>>>>>  System.out.println(hits.doc(k).getField("contents").stringValue());
>>>>>>>    }
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> And this is the part of creating the index:
>>>>>>>
>>>>>>>
>>>>>>> File directory = new File("index");
>>>>>>> IndexWriter writer = new IndexWriter(directory, new
>>>>>>> StandardAnalyzer(),
>>>>>>> true,
>>>>>>>                            IndexWriter.MaxFieldLength.LIMITED);
>>>>>>> List<String> records = getRecords();//returns a list of record values
>>>>>>> from
>>>>>>> database, all of them are phrases
>>>>>>> Iterator<String> i = records.iterator();
>>>>>>> while(i.hasNext()){
>>>>>>>           Document doc = new Document();
>>>>>>>           doc.add(new Field(field, i.next(), Field.Store.YES,
>>>>>>> Field.Index.NOT_ANALYZED));
>>>>>>>        writer.addDocument(doc);
>>>>>>> }
>>>>>>> writer.optimize();
>>>>>>> writer.close();
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> This code works as I want but just matching with the first word of
>>>>>>> the
>>>>>>> phrase. I think the problem is the index building, but I don't know
>>>>>>> how
>>>>>>> to
>>>>>>> fix it...
>>>>>>>
>>>>>>> Any ideas?
>>>>>>>
>>>>>>> Thank you so much!!
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Steven A Rowe wrote:
>>>>>>>
>>>>>>>> On 5/8/2009 at 9:13 AM, Ian Lee wrote:
>>>>>>>>
>>>>>>>>> I'm surprised that it matches either - don't you need ".*in" where
>>>>>>>>> .*
>>>>>>>>> means match any character zero or more times?  See the javadoc for
>>>>>>>>> java.util.regex.Pattern, or for Jakarta Regexp if you are using
>>>>>>>>> that
>>>>>>>>> package.
>>>>>>>>>
>>>>>>>>> Unless you're an expert in regexps it is probably worth playing
>>>>>>>>> with
>>>>>>>>> them outside your lucene code to start with e.g. with simple
>>>>>>>>> String.matches(regexp) calls.  They can take some getting used to.
>>>>>>>>> And try to avoid anything with backslashes if you can!
>>>>>>>>>
>>>>>>>> The java.util.regex.Pattern implementation (the default RegexQuery
>>>>>>>> implementation) actually uses Matcher.lookingAt(), which is
>>>>>>>> equivalent
>>>>>>>> to
>>>>>>>> prepending a "^" anchor to the beginning of the pattern, so if
>>>>>>>> Huntsman84
>>>>>>>> is using the default implementation, then I agree with Ian: I'm
>>>>>>>> surprised
>>>>>>>> it matches either.
>>>>>>>>
>>>>>>>> However, the Jakarta Regexp implementation uses RE.match(), which
>>>>>>>> does
>>>>>>>> *not* require a beginning-of-string match.
>>>>>>>>
>>>>>>>> Hunstman84, are you using the Jakarta Regexp implementation?  If so,
>>>>>>>> then
>>>>>>>> like you, I'm surprised it's not matching both :).
>>>>>>>>
>>>>>>>> It would be useful to see some real code, including how you index
>>>>>>>> your
>>>>>>>> records.
>>>>>>>>
>>>>>>>> Steve
>>>>>>>>
>>>>>>>>
>>>>>>>>> On Fri, May 8, 2009 at 1:42 PM, Huntsman84 <tpgarci...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I am using RegexQuery for searching in a set of records wich are
>>>>>>>>>> phrases of several words each. My aim is to find any phrase that
>>>>>>>>>> contains the given group of letters (e.g. "in"). For that case,
>>>>>>>>>> I am building the query with the regular expression ".in.", so it
>>>>>>>>>> should return all phrases with contain "in", but the search only
>>>>>>>>>> matches with the first word of the phrase.
>>>>>>>>>>
>>>>>>>>>> For example, if my records are "Knowing yourself" and "Old
>>>>>>>>>> clinic", the correct search would return 2 matches, but it only
>>>>>>>>>> matches with "Knowing yourself".
>>>>>>>>>>
>>>>>>>>>> How could I fix this?
>>>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> http://www.nabble.com/RegexQuery-Incomplete-Results-tp23445235p23478720.html
>>>>>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>>>>>>
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>>> View this message in context:
>>>>> http://www.nabble.com/RegexQuery-Incomplete-Results-tp23445235p23482532.html
>>>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>>>
>>>>>
>>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>>
>>>>
>>>>
>>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/RegexQuery-Incomplete-Results-tp23445235p23486350.html
>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>
>>>
>>>
>>
>>
>>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>


-- 
"RABI ZIDNI ILMA"

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to