Then I suggest you start a new thread, posting all relevant details and preferably a cut-down but complete program, with all relevant code and no irrelevant code, and with simple examples, input and output, of what does and doesn't work.
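As a rough idea of the size of example meant here, the punctuation case from the quoted mail below can be mimicked in a few lines. This is only a sketch in plain Java: it uses no Lucene classes, the class and method names are made up, and it assumes the analyzer behaves like "split on whitespace, then lowercase", which may not match the custom analyzer exactly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration only: no Lucene classes are used here.
class WhitespaceLowercaseSketch {

    // Split on whitespace only and lowercase each token; punctuation is kept.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> indexed = tokenize(",Andrey Gubarev,JingGoogle,Inc.");
        List<String> query = tokenize("Andrey Gubarev");
        System.out.println("indexed terms: " + indexed);
        System.out.println("query terms:   " + query);
        // Punctuation stays glued to the indexed tokens, so the query terms are absent.
        System.out.println("all query terms indexed? " + indexed.containsAll(query));
    }
}
```

If the real analyzer really does keep punctuation attached to words, a phrase search for the bare words would be expected to miss for the same reason, whatever the Lucene version.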
--
Ian.


On Thu, Oct 3, 2013 at 12:28 PM, VIGNESH S <vigneshkln...@gmail.com> wrote:
> Ian,
> Thanks for your reply.
> I am facing the same problem if I use WhitespaceTokenizer also.
> My analyzer works perfectly in Lucene 3.6.
>
> Thanks and Regards
> Vignesh Srinivasan
>
> On Thu, Oct 3, 2013 at 3:23 PM, Ian Lea <ian....@gmail.com> wrote:
>
>> Certainly sounds like a bug in your analyzer. You could start a new
>> thread if you need help with that. But from your previous email it
>> sounds like you could use WhitespaceTokenizer chained with
>> LowerCaseFilter.
>>
>>
>> --
>> Ian.
>>
>>
>> On Thu, Oct 3, 2013 at 7:16 AM, VIGNESH S <vigneshkln...@gmail.com> wrote:
>> > Hi,
>> >
>> > In my Analyzer, the problem actually occurs for words which are preceded by
>> > punctuation marks.
>> >
>> > For example:
>> > If I am indexing the content ",Andrey Gubarev,JingGoogle,Inc."
>> >
>> > If I search "Andrew Gubarev", it is not working properly since the word Andrew
>> > is preceded by punctuation ",".
>> >
>> >
>> > On Thu, Oct 3, 2013 at 11:23 AM, VIGNESH S <vigneshkln...@gmail.com> wrote:
>> >
>> >> Hi Ian,
>> >>
>> >> In Lucene, is there any default Analyzer we can use which will ignore only
>> >> spaces?
>> >> All other numbers, punctuation, dates, everything it should preserve.
>> >>
>> >> I created my analyzer with a tokenizer which returns
>> >> Character.isDefined(cn) && (!Character.isWhitespace(cn)).
>> >> My analyzer uses a lowercase filter on top of the tokenizer. This works
>> >> perfectly in 3.6.
>> >> In 4.3 it is creating problems in the offsets of tokens.
>> >>
>> >>
>> >> On Mon, Sep 30, 2013 at 8:21 PM, Ian Lea <ian....@gmail.com> wrote:
>> >>
>> >>> Whenever someone says they are using a custom analyzer that has to be
>> >>> a suspect. Does it work if you use one of the core lucene analyzers
>> >>> instead? Have you used Luke to verify that the index holds what you
>> >>> think it does?
>> >>>
>> >>>
>> >>> --
>> >>> Ian.
>> >>>
>> >>>
>> >>> On Mon, Sep 30, 2013 at 3:21 PM, VIGNESH S <vigneshkln...@gmail.com> wrote:
>> >>> > Hi,
>> >>> >
>> >>> > It is not a problem with case, because I am using LowerCaseFilter.
>> >>> >
>> >>> > My Analyzer is a custom analyzer which ignores just white spaces. All
>> >>> > other numbers, dates and other special characters it will consider. The same
>> >>> > analyzer works for Lucene 3.6.
>> >>> >
>> >>> >
>> >>> > When I do a single term query for "Geoffrey" it gives hits. But when
>> >>> > given as part of a multiphrase query, it is not able to find it. When the
>> >>> > code below is executed with, say, word = "Geoffrey", it does not find the word
>> >>> > itself.
>> >>> >
>> >>> > if (TermsEnum.SeekStatus.FOUND == trm.seekCeil(new BytesRef(word))) {
>> >>> >     do {
>> >>> >         String s = trm.term().utf8ToString();
>> >>> >         if (s.equals(word)) {
>> >>> >             termsWithPrefix.add(new Term("content", s));
>> >>> >         } else {
>> >>> >             break;
>> >>> >         }
>> >>> >     } while (trm.next() != null);
>> >>> > }
>> >>> >
>> >>> >
>> >>> >
>> >>> > On Mon, Sep 30, 2013 at 3:01 PM, Ian Lea <ian....@gmail.com> wrote:
>> >>> >
>> >>> >> Whenever someone says something along the lines of a search for
>> >>> >> "geoffrey" not matching "Geoffrey" the case difference springs out.
>> >>> >> Can't recall what if anything you said about the analysis side of
>> >>> >> things but that could be the cause. See
>> >>> >>
>> >>> >> http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2F_incorrect_hits.3F
>> >>> >>
>> >>> >> If on the other hand the problem is more obscure, and only related to
>> >>> >> the multi phrase stuff, I suggest you build a tiny but complete
>> >>> >> RAMDirectory based program or test case that shows the problem and
>> >>> >> post it here.
>> >>> >>
>> >>> >>
>> >>> >> --
>> >>> >> Ian.
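
On the seekCeil point in the quoted mail above: in Lucene 4.x, seekCeil returns NOT_FOUND and leaves the enum on the next larger term when the exact term is absent, which may be why a different word shows up after seeking "Geoffrey" in an index whose terms were lowercased. The sketch below is only an analogy in plain Java, with a TreeSet standing in for the term dictionary; the class and method names are made up and none of this is Lucene API. Java string order matches Lucene's byte order for ASCII terms like these, but not necessarily in general.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Plain-Java stand-in for a sorted term dictionary; not Lucene code.
class SortedTermsSketch {

    // Analogue of seekCeil: the smallest term >= target, or null if there is none.
    static String ceilingTerm(TreeSet<String> terms, String target) {
        return terms.ceiling(target);
    }

    // Collect every term starting with prefix by seeking to it and walking forward.
    static List<String> termsWithPrefix(TreeSet<String> terms, String prefix) {
        List<String> out = new ArrayList<String>();
        for (String t : terms.tailSet(prefix, true)) {
            if (!t.startsWith(prefix)) {
                break; // terms are sorted, so no later term can match
            }
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<String>(Arrays.asList(
                ",geoffrey", "app", "apple", "application", "geoffrey", "romer"));
        String target = "Geoffrey";
        String hit = ceilingTerm(terms, target);
        System.out.println("ceiling of " + target + ": " + hit
                + ", exact match: " + target.equals(hit));
        System.out.println("terms with prefix app: " + termsWithPrefix(terms, "app"));
    }
}
```

The same seek-then-walk shape is what a prefix expansion for MultiPhraseQuery would need: seek to the prefix, then keep terms while they still start with it, rather than comparing each term for equality with the search word.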
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> On Mon, Sep 30, 2013 at 6:46 AM, VIGNESH S <vigneshkln...@gmail.com> wrote:
>> >>> >> > Hi,
>> >>> >> >
>> >>> >> > Thanks for your reply. The problem I face is that there is a name,
>> >>> >> > Geoffrey Romer, in my field.
>> >>> >> >
>> >>> >> > I am forming a MultiPhraseQuery object properly, like "Geoffrey Romer". But
>> >>> >> > when I do a search, it is not returning hits. This problem does not occur
>> >>> >> > for all phrases.
>> >>> >> > It happens for only a few phrases.
>> >>> >> >
>> >>> >> > When I do a single query like Geoffrey it gives a hit. But when I do it
>> >>> >> > in a MultiPhraseQuery it is not able to find "geoffrey". I confirmed this by
>> >>> >> > doing trm.seekCeil(new BytesRef("Geoffrey")) and then when I
>> >>> >> > do String s = trm.term().utf8ToString(), it is pointing to a different word
>> >>> >> > instead of geoffrey. seekCeil is working properly for many phrases though.
>> >>> >> >
>> >>> >> > What could be the problem? Please kindly suggest.
>> >>> >> >
>> >>> >> >
>> >>> >> >
>> >>> >> > On Fri, Sep 27, 2013 at 6:58 PM, Allison, Timothy B. <talli...@mitre.org> wrote:
>> >>> >> >
>> >>> >> >> 1) An alternate method to your original question would be to do something
>> >>> >> >> like this (I haven't compiled or tested this!):
>> >>> >> >>
>> >>> >> >> Query q = new PrefixQuery(new Term("field", "app"));
>> >>> >> >>
>> >>> >> >> q = q.rewrite(indexReader);
>> >>> >> >> Set<Term> terms = new HashSet<Term>();
>> >>> >> >> q.extractTerms(terms);
>> >>> >> >> Term[] arr = terms.toArray(new Term[terms.size()]);
>> >>> >> >> MultiPhraseQuery mpq = new MultiPhraseQuery();
>> >>> >> >> mpq.add(new Term("field", "microsoft"));
>> >>> >> >> mpq.add(arr);
>> >>> >> >>
>> >>> >> >>
>> >>> >> >> 2) At a higher level, do you need to generate your query programmatically?
>> >>> >> >> Here are three parsers that could handle this:
>> >>> >> >> a) ComplexPhraseQueryParser
>> >>> >> >> b) SurroundQueryParser: oal.queryparser.surround.parser.QueryParser
>> >>> >> >> c) experimental: <self_promotion degree="shameless">
>> >>> >> >> http://issues.apache.org/jira/browse/LUCENE-5205 </self_promotion>
>> >>> >> >>
>> >>> >> >>
>> >>> >> >> -----Original Message-----
>> >>> >> >> From: VIGNESH S [mailto:vigneshkln...@gmail.com]
>> >>> >> >> Sent: Friday, September 27, 2013 3:33 AM
>> >>> >> >> To: java-user@lucene.apache.org
>> >>> >> >> Subject: Re: Multiphrase Query in Lucene 4.3
>> >>> >> >>
>> >>> >> >> Hi,
>> >>> >> >>
>> >>> >> >> The phrase I am giving is "Romer Geoffrey". The phrase is in the field.
>> >>> >> >>
>> >>> >> >> When I do trm.seekCeil(new BytesRef("Geoffrey")) and then
>> >>> >> >> String s = trm.term().utf8ToString();
>> >>> >> >> it is giving a different word. I think this is why my MultiPhraseQuery is
>> >>> >> >> not giving the desired results.
>> >>> >> >>
>> >>> >> >> What may be the reason?
>> >>> >> >>
>> >>> >> >>
>> >>> >> >>
>> >>> >> >> On Fri, Sep 27, 2013 at 11:49 AM, VIGNESH S <vigneshkln...@gmail.com> wrote:
>> >>> >> >>
>> >>> >> >> > Hi Ian,
>> >>> >> >> >
>> >>> >> >> > Thanks for your reply.
>> >>> >> >> >
>> >>> >> >> > I am doing something similar to this. In the MultiPhraseQuery object the
>> >>> >> >> > actual phrase is going in properly but it is not returning any hits.
>> >>> >> >> >
>> >>> >> >> > In Lucene 3.6, I implemented the same logic and it is working.
>> >>> >> >> >
>> >>> >> >> > In Lucene 4.3, I implemented the index for that using
>> >>> >> >> >
>> >>> >> >> > FieldType offsetsType = new FieldType(TextField.TYPE_STORED);
>> >>> >> >> >
>> >>> >> >> > offsetsType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
>> >>> >> >> >
>> >>> >> >> > For MultiPhraseQuery, do I need to add any other parameter in
>> >>> >> >> > addition to this while indexing?
>> >>> >> >> >
>> >>> >> >> > Is there any MultiPhraseQueryTest java file for Lucene 4.3? I checked in the
>> >>> >> >> > Lucene branch and I was not able to find it. Please kindly help.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > On Thu, Sep 26, 2013 at 2:55 PM, Ian Lea <ian....@gmail.com> wrote:
>> >>> >> >> >
>> >>> >> >> >> I use the code below to do something like this. Not exactly what you
>> >>> >> >> >> want but should be easy to adapt.
>> >>> >> >> >>
>> >>> >> >> >>
>> >>> >> >> >> public List<String> findTerms(IndexReader _reader,
>> >>> >> >> >>                               String _field) throws IOException {
>> >>> >> >> >>     List<String> l = new ArrayList<String>();
>> >>> >> >> >>     Fields ff = MultiFields.getFields(_reader);
>> >>> >> >> >>     Terms trms = ff.terms(_field);
>> >>> >> >> >>     TermsEnum te = trms.iterator(null);
>> >>> >> >> >>     BytesRef br;
>> >>> >> >> >>     while ((br = te.next()) != null) {
>> >>> >> >> >>         l.add(br.utf8ToString());
>> >>> >> >> >>     }
>> >>> >> >> >>     return l;
>> >>> >> >> >> }
>> >>> >> >> >>
>> >>> >> >> >> --
>> >>> >> >> >> Ian.
>> >>> >> >> >>
>> >>> >> >> >> On Wed, Sep 25, 2013 at 3:04 PM, VIGNESH S <vigneshkln...@gmail.com> wrote:
>> >>> >> >> >> > Hi,
>> >>> >> >> >> >
>> >>> >> >> >> > In the example for MultiPhraseQuery it is mentioned:
>> >>> >> >> >> >
>> >>> >> >> >> > "To use this class, to search for the phrase "Microsoft app*" first use
>> >>> >> >> >> > add(Term) on the term "Microsoft", then find all terms that have "app" as
>> >>> >> >> >> > prefix using IndexReader.terms(Term), and use MultiPhraseQuery.add(Term[]
>> >>> >> >> >> > terms) to add them to the query"
>> >>> >> >> >> >
>> >>> >> >> >> >
>> >>> >> >> >> > How can I replicate the same in Lucene 4.3, since IndexReader.terms(Term) is
>> >>> >> >> >> > no longer available?
>> >>> >> >> >> >
>> >>> >> >> >> > --
>> >>> >> >> >> > Thanks and Regards
>> >>> >> >> >> > Vignesh Srinivasan
>> >>> >> >> >>
>> >>> >> >> >> ---------------------------------------------------------------------
>> >>> >> >> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> >>> >> >> >> For additional commands, e-mail: java-user-h...@lucene.apache.org
>> >>> >> >> >>
>> >>> >> >> >>
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > --
>> >>> >> >> > Thanks and Regards
>> >>> >> >> > Vignesh Srinivasan