I did the following:

highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE);

which works.

On Thu, Mar 12, 2009 at 6:41 PM, Amin Mohammed-Coleman <ami...@gmail.com> wrote:

> JIRA updated. Includes a new test case which shows the highlighter not
> working as expected.
>
> On Thu, Mar 12, 2009 at 5:56 PM, Amin Mohammed-Coleman
> <ami...@gmail.com> wrote:
>
>> Hi
>>
>> I have found that it is not an issue with POI. I extracted the text using
>> POI, but differently, and the term is extracted properly. When I store the
>> text and retrieve it, the term exists. However, running the text through
>> the highlighter doesn't work.
>>
>> I will post a test case with a plain text file on JIRA. Currently on a
>> cramped train!
>>
>> Cheers
>>
>> On 11 Mar 2009, at 18:11, markharw00d <markharw...@yahoo.co.uk> wrote:
>>
>>> If you can supply a JUnit test that recreates the problem I think we can
>>> start to make progress on this.
>>>
>>> Amin Mohammed-Coleman wrote:
>>>
>>>> Hi
>>>>
>>>> Apologies for re-sending this mail. Just wondering if anyone has
>>>> experienced the below. I'm not sure if this could happen due to the
>>>> nature of the document. It does seem strange that one term search
>>>> returns a summary while another does not, even though the same document
>>>> is being returned.
>>>>
>>>> I'm asking this so I can code around it if this is normal.
>>>>
>>>> Apologies again for re-sending this mail.
>>>>
>>>> Cheers
>>>>
>>>> Amin
>>>>
>>>> On 9 Mar 2009, at 07:50, Amin Mohammed-Coleman <ami...@gmail.com> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> I am seeing some strange behaviour with the highlighter and I'm
>>>>> wondering if anyone else is experiencing this. In certain instances I
>>>>> don't get a summary being generated. I perform the search and the
>>>>> search returns the correct document. I can see that the Lucene document
>>>>> contains the text in the field.
>>>>> However, after doing:
>>>>>
>>>>>     SimpleHTMLFormatter simpleHTMLFormatter = new SimpleHTMLFormatter(
>>>>>             "<span class=\"highlight\"><b>", "</b></span>");
>>>>>     // required for highlighting
>>>>>     Query query2 = multiSearcher.rewrite(query);
>>>>>     Highlighter highlighter = new Highlighter(simpleHTMLFormatter,
>>>>>             new QueryScorer(query2));
>>>>>     ...
>>>>>
>>>>>     String text = doc.get(FieldNameEnum.BODY.getDescription());
>>>>>     TokenStream tokenStream = analyzer.tokenStream(
>>>>>             FieldNameEnum.BODY.getDescription(), new StringReader(text));
>>>>>     String result = highlighter.getBestFragments(tokenStream, text, 3, "...");
>>>>>
>>>>> the string result is empty. This is very strange: if I try a different
>>>>> term that exists in the document then I get a summary. For example, I
>>>>> have a Word document that contains the terms "document" and "aspectj".
>>>>> If I search for "document" I get the correct document but no
>>>>> highlighted summary. However, if I search using "aspectj" I get the
>>>>> same document with a highlighted summary.
>>>>>
>>>>> Just to mention, I do rewrite the original query before performing the
>>>>> highlighting.
>>>>>
>>>>> I'm not sure what I'm missing here. Any help would be appreciated.
>>>>>
>>>>> Cheers
>>>>> Amin
>>>>>
>>>>> On Sat, Mar 7, 2009 at 4:32 PM, Amin Mohammed-Coleman
>>>>> <ami...@gmail.com> wrote:
>>>>> Hi
>>>>>
>>>>> Got it working! Thanks again for your help!
>>>>>
>>>>> Amin
>>>>>
>>>>> On Sat, Mar 7, 2009 at 12:25 PM, Amin Mohammed-Coleman
>>>>> <ami...@gmail.com> wrote:
>>>>> Thanks! The final piece that I needed to do for the project!
>>>>>
>>>>> Cheers
>>>>>
>>>>> Amin
>>>>>
>>>>> On Sat, Mar 7, 2009 at 12:21 PM, Uwe Schindler <u...@thetaphi.de> wrote:
>>>>> > cool. i will use compression and store in index. is there anything
>>>>> > special i need to do for decompressing the text? i presume i can
>>>>> > just do doc.get("content")?
>>>>> > thanks for your advice all!
>>>>>
>>>>> No, just use Field.Store.COMPRESS when adding to the index and
>>>>> Document.get() when fetching. The decompression is done automatically.
>>>>>
>>>>> You may think, why not enable compression for all fields? The reason is
>>>>> that it adds overhead for very small and short fields, so you should
>>>>> only use it for large contents (it's the same as compressing very small
>>>>> files with ZIP/GZIP: these files mostly get larger than without
>>>>> compression).
>>>>>
>>>>> Uwe
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
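
A rough illustration of why raising the limit with setMaxDocCharsToAnalyze() can turn an empty summary into a highlighted one. This is a toy model in plain Java, not Lucene code: the class AnalysisLimitSketch and its methods are hypothetical names. It assumes the highlighter ignores any token whose start offset is at or beyond the limit, and it uses 50 * 1024 as the cap, which is believed to be the Lucene 2.4 default but should be checked against the javadoc.

```java
final class AnalysisLimitSketch {

    /**
     * Toy stand-in for a highlighter (hypothetical, not the Lucene API).
     * Wraps the first occurrence of term in <b> tags with a little context,
     * but only if that occurrence starts before maxCharsToAnalyze.
     * Returns "" when nothing is found inside the analyzed window.
     */
    static String highlight(String text, String term, int maxCharsToAnalyze) {
        int start = text.indexOf(term);
        // Mimic the cap: a match starting at or past the limit is never seen.
        if (start < 0 || start >= maxCharsToAnalyze) {
            return "";
        }
        int end = start + term.length();
        int from = Math.max(0, start - 20);
        int to = Math.min(text.length(), end + 20);
        return text.substring(from, start) + "<b>" + term + "</b>" + text.substring(end, to);
    }

    /** Sample text: "aspectj" near the start, "document" only after about 60,000 characters. */
    static String sampleText() {
        StringBuilder sb = new StringBuilder("aspectj intro ");
        for (int i = 0; i < 30000; i++) {
            sb.append("x ");
        }
        sb.append("document");
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = sampleText();
        int assumedDefaultCap = 50 * 1024; // assumption, see the note above

        // Term inside the analyzed window: a summary comes back.
        System.out.println("aspectj, capped:    [" + highlight(text, "aspectj", assumedDefaultCap) + "]");
        // Term only beyond the window: the summary is empty.
        System.out.println("document, capped:   [" + highlight(text, "document", assumedDefaultCap) + "]");
        // Same term with the cap effectively removed.
        System.out.println("document, uncapped: [" + highlight(text, "document", Integer.MAX_VALUE) + "]");
    }
}
```

In this sketch the search still "finds" the document for both terms, yet only the term inside the analyzed window produces a summary. That matches the symptom described in the thread only if the missing term sat past the limit in the stored text, which the thread does not confirm. Setting the limit to Integer.MAX_VALUE means the whole field is analyzed on every highlight call, so it may be worth picking a smaller bound for very large documents.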