Hi Ian,

Is there a default analyzer in Lucene that ignores only whitespace? It should preserve everything else: numbers, punctuation, dates and so on.
I created my own analyzer with a tokenizer that accepts a character cn when Character.isDefined(cn) && (!Character.isWhitespace(cn)). The analyzer applies a lowercase filter on top of the tokenizer. This works perfectly in 3.6. In 4.3 it is creating problems with the offsets of tokens.

On Mon, Sep 30, 2013 at 8:21 PM, Ian Lea <ian....@gmail.com> wrote:

> Whenever someone says they are using a custom analyzer that has to be
> a suspect. Does it work if you use one of the core lucene analyzers
> instead? Have you used Luke to verify that the index holds what you
> think it does?
>
> --
> Ian.
>
> On Mon, Sep 30, 2013 at 3:21 PM, VIGNESH S <vigneshkln...@gmail.com> wrote:
> > Hi,
> >
> > It is not a problem with case, because I am using LowerCaseFilter.
> >
> > My analyzer is a custom analyzer which ignores only whitespace. It keeps
> > numbers, dates and all other special characters. The same analyzer works
> > for Lucene 3.6.
> >
> > When I do a single-term query for "Geoffrey" it gives hits. But when it is
> > given as part of a multi-phrase query, it is not found. When the code below
> > is executed with, say, word = "Geoffrey", it does not find the word itself.
> >
> > if (TermsEnum.SeekStatus.FOUND == trm.seekCeil(new BytesRef(word))) {
> >   do {
> >     String s = trm.term().utf8ToString();
> >     if (s.equals(word)) {
> >       termsWithPrefix.add(new Term("content", s));
> >     } else {
> >       break;
> >     }
> >   } while (trm.next() != null);
> > }
> >
> > On Mon, Sep 30, 2013 at 3:01 PM, Ian Lea <ian....@gmail.com> wrote:
> >
> >> Whenever someone says something along the lines of a search for
> >> "geoffrey" not matching "Geoffrey" the case difference springs out.
> >> Can't recall what if anything you said about the analysis side of
> >> things but that could be the cause.
> >> See
> >> http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2F_incorrect_hits.3F
> >>
> >> If on the other hand the problem is more obscure, and only related to
> >> the multi phrase stuff, I suggest you build a tiny but complete
> >> RAMDirectory based program or test case that shows the problem and
> >> post it here.
> >>
> >> --
> >> Ian.
> >>
> >> On Mon, Sep 30, 2013 at 6:46 AM, VIGNESH S <vigneshkln...@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > Thanks for your reply. The problem I face is that there is a name,
> >> > Geoffrey Romer, in my field.
> >> >
> >> > I am forming the MultiPhraseQuery object properly, as "Geoffrey Romer",
> >> > but when I search it returns no hits. This problem does not occur for
> >> > all phrases; it happens for only a few.
> >> >
> >> > When I run a single-term query for Geoffrey it gives a hit. But when I
> >> > do it in a MultiPhraseQuery it is not able to find "geoffrey". I confirmed
> >> > this by calling trm.seekCeil(new BytesRef("Geoffrey")) and then
> >> > String s = trm.term().utf8ToString(). It points to a different word
> >> > instead of geoffrey. seekCeil works properly for many phrases, though.
> >> >
> >> > What could be the problem? Please kindly suggest.
> >> >
> >> > On Fri, Sep 27, 2013 at 6:58 PM, Allison, Timothy B.
> >> > <talli...@mitre.org> wrote:
> >> >
> >> >> 1) An alternate method to your original question would be to do something
> >> >> like this (I haven't compiled or tested this!):
> >> >>
> >> >> Query q = new PrefixQuery(new Term("field", "app"));
> >> >>
> >> >> q = q.rewrite(indexReader);
> >> >> Set<Term> terms = new HashSet<Term>();
> >> >> q.extractTerms(terms);
> >> >> Term[] arr = terms.toArray(new Term[terms.size()]);
> >> >> MultiPhraseQuery mpq = new MultiPhraseQuery();
> >> >> mpq.add(new Term("field", "microsoft"));
> >> >> mpq.add(arr);
> >> >>
> >> >> 2) At a higher level, do you need to generate your query programmatically?
> >> >> Here are three parsers that could handle this:
> >> >> a) ComplexPhraseQueryParser
> >> >> b) SurroundQueryParser: oal.queryparser.surround.parser.QueryParser
> >> >> c) experimental: <self_promotion degree="shameless">
> >> >> http://issues.apache.org/jira/browse/LUCENE-5205</self_promotion>
> >> >>
> >> >> -----Original Message-----
> >> >> From: VIGNESH S [mailto:vigneshkln...@gmail.com]
> >> >> Sent: Friday, September 27, 2013 3:33 AM
> >> >> To: java-user@lucene.apache.org
> >> >> Subject: Re: Multiphrase Query in Lucene 4.3
> >> >>
> >> >> Hi,
> >> >>
> >> >> The phrase I am giving is "Romer Geoffrey". The word is in the field.
> >> >>
> >> >> I call trm.seekCeil(new BytesRef("Geoffrey")) and then
> >> >> String s = trm.term().utf8ToString();
> >> >>
> >> >> It gives a different word. I think this is why my MultiPhraseQuery is
> >> >> not giving the desired results.
> >> >>
> >> >> What may be the reason?
> >> >>
> >> >> On Fri, Sep 27, 2013 at 11:49 AM, VIGNESH S <vigneshkln...@gmail.com> wrote:
> >> >>
> >> >> > Hi Ian,
> >> >> >
> >> >> > Thanks for your reply.
> >> >> >
> >> >> > I am doing something similar to this. The actual phrase goes into the
> >> >> > MultiPhraseQuery object properly, but it is not returning any hits.
> >> >> >
> >> >> > In Lucene 3.6, I implemented the same logic and it is working.
> >> >> >
> >> >> > In Lucene 4.3, I built the index for that using
> >> >> >
> >> >> > FieldType offsetsType = new FieldType(TextField.TYPE_STORED);
> >> >> > offsetsType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
> >> >> >
> >> >> > For MultiPhraseQuery, do I need to set any other parameter in
> >> >> > addition to this while indexing?
> >> >> >
> >> >> > Is there a MultiPhraseQueryTest java file for Lucene 4.3? I checked the
> >> >> > Lucene branch and was not able to find one. Please kindly help.
> >> >> >
> >> >> > On Thu, Sep 26, 2013 at 2:55 PM, Ian Lea <ian....@gmail.com> wrote:
> >> >> >
> >> >> >> I use the code below to do something like this. Not exactly what you
> >> >> >> want but should be easy to adapt.
> >> >> >>
> >> >> >> public List<String> findTerms(IndexReader _reader,
> >> >> >>                               String _field) throws IOException {
> >> >> >>   List<String> l = new ArrayList<String>();
> >> >> >>   Fields ff = MultiFields.getFields(_reader);
> >> >> >>   Terms trms = ff.terms(_field);
> >> >> >>   TermsEnum te = trms.iterator(null);
> >> >> >>   BytesRef br;
> >> >> >>   while ((br = te.next()) != null) {
> >> >> >>     l.add(br.utf8ToString());
> >> >> >>   }
> >> >> >>   return l;
> >> >> >> }
> >> >> >>
> >> >> >> --
> >> >> >> Ian.
> >> >> >>
> >> >> >> On Wed, Sep 25, 2013 at 3:04 PM, VIGNESH S <vigneshkln...@gmail.com> wrote:
> >> >> >> > Hi,
> >> >> >> >
> >> >> >> > In the example for MultiPhraseQuery it is mentioned:
> >> >> >> >
> >> >> >> > "To use this class, to search for the phrase "Microsoft app*" first use
> >> >> >> > add(Term) on the term "Microsoft", then find all terms that have "app" as
> >> >> >> > prefix using IndexReader.terms(Term), and use MultiPhraseQuery.add(Term[]
> >> >> >> > terms) to add them to the query"
> >> >> >> >
> >> >> >> > How can I replicate the same in Lucene 4.3, since IndexReader.terms(Term)
> >> >> >> > is no longer used?
> >> >> >> >
> >> >> >> > --
> >> >> >> > Thanks and Regards
> >> >> >> > Vignesh Srinivasan
> >> >> >>
> >> >> >> ---------------------------------------------------------------------
> >> >> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> >> >> >> For additional commands, e-mail: java-user-h...@lucene.apache.org
> >> >> >
> >> >> > --
> >> >> > Thanks and Regards
> >> >> > Vignesh Srinivasan
> >> >> > 9739135640
> >> >>
> >> >> --
> >> >> Thanks and Regards
> >> >> Vignesh Srinivasan
> >> >> 9739135640
> >> >
> >> > --
> >> > Thanks and Regards
> >> > Vignesh Srinivasan
> >> > 9739135640
> >
> > --
> > Thanks and Regards
> > Vignesh Srinivasan
> > 9739135640

--
Thanks and Regards
Vignesh Srinivasan
9739135640
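
One possible explanation for seekCeil landing on a different word, in line with Ian's remark about case: if the lowercase filter runs at index time, the term dictionary holds "geoffrey", while the quoted loop passes the raw "Geoffrey" to seekCeil. Terms are ordered by byte value, so an uppercase probe sorts before every lowercase term, and seekCeil positions on the smallest term at or after the probe instead of the one intended. The sketch below imitates this in plain Java with no Lucene dependency. It is an illustration under assumptions, not the analyzer from this thread: analyze() is a hypothetical stand-in for a whitespace-only, lowercasing analyzer (it works on chars rather than code points, a simplification), and a TreeSet stands in for the term dictionary, whose ordering agrees with byte order for ASCII text. None of these names are Lucene API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

// Plain-Java stand-in for the analyzer and term dictionary discussed in the thread.
// All names here are hypothetical; nothing in this class is Lucene API.
class SeekCeilSketch {

    /** A token together with the character offsets it occupied in the input. */
    static final class Token {
        final String text;
        final int start; // inclusive
        final int end;   // exclusive

        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /** Splits on whitespace only, lowercases each token, and records its offsets. */
    static List<Token> analyze(String input) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            // Skip the run of whitespace before the next token.
            while (i < input.length() && Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume every non-whitespace character: digits, punctuation, letters.
            while (i < input.length() && !Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            if (i > start) {
                String text = input.substring(start, i).toLowerCase(Locale.ROOT);
                tokens.add(new Token(text, start, i));
            }
        }
        return tokens;
    }

    /** Smallest stored term that is >= probe, or null if there is none. */
    static String ceilingTerm(TreeSet<String> terms, String probe) {
        return terms.ceiling(probe);
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>();
        for (Token t : analyze("Apps by  Geoffrey Romer, 2013-09-30")) {
            terms.add(t.text);
            System.out.println(t.text + " [" + t.start + "," + t.end + ")");
        }
        // Uppercase 'G' sorts before every lowercase letter, so the raw probe
        // lands on the first lowercase term rather than on "geoffrey".
        System.out.println(ceilingTerm(terms, "Geoffrey")); // apps
        // Lowercasing the probe the same way the index was built finds the term.
        System.out.println(ceilingTerm(terms, "geoffrey")); // geoffrey
    }
}
```

If this is the cause, lowercasing the word (or running it through the same analyzer) before calling seekCeil should land on the expected term. It would not by itself explain the offset differences reported between 3.6 and 4.3.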