Lucene Highlighting mergeContiguous

2020-07-21 Thread Patricia Reddy
Hello All, I am trying to highlight the phrase "John Doe" using the Lucene highlighter, but the output highlights each term separately. Contiguous terms are not merged together. For example: JohnDoe is returned instead of John Doe I have set the mergeContiguous parameter on the getBestTextFragments method to tru
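
For readers with the same symptom, here is a minimal sketch of one possible workaround: post-processing the highlighted string so that adjacent highlight tags collapse into one. It is not part of the Lucene API and is not taken from this thread. It assumes the formatter wraps each term in <B>...</B>; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not a Lucene class: merges adjacent highlight tags
// in an already-highlighted string.
final class MergeAdjacentHighlights {
    // A closing tag, optional whitespace, then an opening tag.
    // Assumes <B>...</B> markup; adjust the pattern for other tags.
    private static final Pattern ADJACENT = Pattern.compile("</B>(\\s*)<B>");

    static String merge(String highlighted) {
        // Keep the captured whitespace, drop the two tags around it.
        return ADJACENT.matcher(highlighted).replaceAll("$1");
    }

    public static void main(String[] args) {
        // prints: Ask <B>John Doe</B> about it
        System.out.println(merge("Ask <B>John</B> <B>Doe</B> about it"));
    }
}
```

This only touches the markup, so it also merges neighbouring terms that matched separate query clauses; whether that is acceptable depends on the application.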

Re: Using Sentence Information for Lucene Highlighting

2014-04-12 Thread Furkan KAMACI
Hi; I found a way to achieve it when I debugged the source code. I've shared the same information on the Solr mailing list too. Defining a delimiter and indexing it as an individual token is the first step. Writing a regex that "matches" the given delimiter is the next step. The last step is defining th
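
The delimiter-plus-regex idea can be sketched outside Lucene with plain java.util.regex. This is only an illustration of the matching step, assuming the *|* delimiter mentioned in the original question; the class and method names are hypothetical and no highlighter or fragmenter configuration is shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not a Lucene class: splits text into sentences
// on a literal delimiter and finds the sentence containing a term.
final class SentenceFragments {
    // Pattern.quote makes the '*' and '|' characters literal.
    private static final Pattern DELIMITER =
            Pattern.compile("\\s*" + Pattern.quote("*|*") + "\\s*");

    static List<String> sentences(String text) {
        List<String> out = new ArrayList<>();
        for (String part : DELIMITER.split(text)) {
            if (!part.isEmpty()) {
                out.add(part);
            }
        }
        return out;
    }

    // Returns the first sentence containing the term, or null if none does.
    static String sentenceContaining(String text, String term) {
        for (String sentence : sentences(text)) {
            if (sentence.contains(term)) {
                return sentence;
            }
        }
        return null;
    }
}
```

For example, sentenceContaining("first part *|* second part", "second") returns "second part".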

Using Sentence Information for Lucene Highlighting

2014-04-08 Thread Furkan KAMACI
Hi; I could not get an answer to my question on the Solr list, so I wanted to ask it here because I think it is a more Lucene-specific question. I have indexed my documents and there is a special character sequence that marks the end of a string. It is: *|* For example: The quick brown fox jum

Re: Regarding Lucene Highlighting feature.

2013-07-10 Thread VIGNESH S
Hi Robert, Thanks for the reply. My actual use case is to highlight the first occurrence of the search word in the sentence in which it occurred. In my case, I do not have access to the original documents. I am looking for the best way to reduce the index disk space. I tried SimpleHighlighter an

Re: Regarding Lucene Highlighting feature.

2013-07-05 Thread Roberto Ragusa
On 07/05/2013 01:27 PM, VIGNESH S wrote: > Hi, > > I think using CompressingStoredFieldsFormat Feature introduced in Lucene > 4.1 may help reduce index size. > > Any other comments and suggestions are welcome in this topic.. > Do you have access to the original documents, outside Lucene? If so,

Re: Regarding Lucene Highlighting feature.

2013-07-05 Thread VIGNESH S
Hi, I think using CompressingStoredFieldsFormat Feature introduced in Lucene 4.1 may help reduce index size. Any other comments and suggestions are welcome in this topic.. Thanks and Regards Vignesh Srinivasan 9739135640 On Thu, Jul 4, 2013 at 6:38 PM, VIGNESH S wrote: > Hi, > > Is it mandat

Regarding Lucene Highlighting feature.

2013-07-04 Thread VIGNESH S
Hi, Is it mandatory to use "Store.YES" when using the highlighting feature? Is it possible to use the highlighting feature without "Store.YES" while indexing? It almost doubles the index size. Please kindly help. -- Thanks and Regards Vignesh Srinivasan 9739135640

Re: HTML tags and Lucene highlighting

2012-04-05 Thread Koji Sekiguchi
(12/04/06 2:34), okayndc wrote: Hello, I currently use Lucene version 3.0...probably need to upgrade to a more current version soon. The problem that I have is when I test search for an HTML tag (ex. ), Lucene returns the highlighted HTML tag ~ which is what I DO NOT want. Is there a way to "

Re: HTML tags and Lucene highlighting

2012-04-05 Thread okayndc
ripCharFilter should do what you want. > > Steve > > From: okayndc [mailto:bodymo...@gmail.com] > Sent: Thursday, April 05, 2012 3:36 PM > To: Steven A Rowe > Cc: java-user@lucene.apache.org > Subject: Re: HTML tags and Lucene highlighting > > Hello, > > I

RE: HTML tags and Lucene highlighting

2012-04-05 Thread Steven A Rowe
tags (in the field configured to use HTMLStripCharFilter, anyway). So HTMLStripCharFilter should do what you want. Steve From: okayndc [mailto:bodymo...@gmail.com] Sent: Thursday, April 05, 2012 3:36 PM To: Steven A Rowe Cc: java-user@lucene.apache.org Subject: Re: HTML tags and Lucene highlig

Re: HTML tags and Lucene highlighting

2012-04-05 Thread okayndc
> > Steve > > -Original Message- > From: okayndc [mailto:bodymo...@gmail.com] > Sent: Thursday, April 05, 2012 1:34 PM > To: java-user@lucene.apache.org > Subject: HTML tags and Lucene highlighting > > Hello, > > I currently use Lucene version 3.0...probably need to up

RE: HTML tags and Lucene highlighting

2012-04-05 Thread Steven A Rowe
Hi okayndc, What *do* you want? Steve -Original Message- From: okayndc [mailto:bodymo...@gmail.com] Sent: Thursday, April 05, 2012 1:34 PM To: java-user@lucene.apache.org Subject: HTML tags and Lucene highlighting Hello, I currently use Lucene version 3.0...probably need to upgrade

HTML tags and Lucene highlighting

2012-04-05 Thread okayndc
Hello, I currently use Lucene version 3.0...probably need to upgrade to a more current version soon. The problem that I have is when I test search for an HTML tag (ex. ), Lucene returns the highlighted HTML tag ~ which is what I DO NOT want. Is there a way to "filter" HTML tags? I have read up

Re: Lucene Highlighting and Dynamic Summaries

2009-03-13 Thread Amin Mohammed-Coleman
Ok. I tried to apply the patch(s) and completely messed it up (user error). Is there a full example of the highlighter that is available that I can apply and test? Cheers Amin On Fri, Mar 13, 2009 at 12:09 PM, Amin Mohammed-Coleman wrote: > Absolutely! I have received considerable help from

Re: Lucene Highlighting and Dynamic Summaries

2009-03-13 Thread Amin Mohammed-Coleman
Absolutely! I have received considerable help from the community and there is so much more I want to ask! Cheers! Amin On Fri, Mar 13, 2009 at 10:41 AM, Michael McCandless < luc...@mikemccandless.com> wrote: > > Well, it's not yet committed. > > You can use it now by pulling the patch a

Re: Lucene Highlighting and Dynamic Summaries

2009-03-13 Thread Michael McCandless
Well, it's not yet committed. You can use it now by pulling the patch attached to the issue & testing it yourself. If you do so, please report back! This is how Lucene improves. I'm hoping we can include it in 2.9... Mike On Mar 13, 2009, at 6:35 AM, Amin Mohammed-Coleman wrote: Swee

Re: Lucene Highlighting and Dynamic Summaries

2009-03-13 Thread Amin Mohammed-Coleman
Sweet! When will this highlighter be available? Can I use this now? Cheers! On Fri, Mar 13, 2009 at 10:10 AM, Michael McCandless < luc...@mikemccandless.com> wrote: > > Amin Mohammed-Coleman wrote: > > I think that would be good. >> > > I'll open an issue. > > Probably a silly thing to ask

Re: Lucene Highlighting and Dynamic Summaries

2009-03-13 Thread Michael McCandless
Amin Mohammed-Coleman wrote: I think that would be good. I'll open an issue. Probably a silly thing to ask but I guess there is a performance implication by setting it to max value. Right. And it's tough choosing a default in situations like this -- performance vs losing stuff. Howe

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread Amin Mohammed-Coleman
Hi I think that would be good. Probably a silly thing to ask but I guess there is a performance implication by setting it to max value. Is there a general setting that other developers use? Cheers Amin On 12 Mar 2009, at 22:03, Michael McCandless wrote: IndexWriter has such behavi

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread Michael McCandless
IndexWriter has such behavior too, and because it was such a common trap (developers could not understand why their content was being truncated), we made that setting explicit, up front so you were aware of it. I think this in general is a reasonable approach for settings that "lose" stuff

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread Amin Mohammed-Coleman
I did the following: highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE); which works. On Thu, Mar 12, 2009 at 6:41 PM, Amin Mohammed-Coleman wrote: > JIRA updated. Includes new testcase which shows highlighter not working as > expected. > > > On Thu, Mar 12, 2009 at 5:56 PM, Amin Mohammed
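
Why raising the limit helps can be shown with a toy model: if only a prefix of the text is analyzed, a term that occurs past that prefix is never seen. The sketch below uses plain string operations and is not the highlighter's own code; the class and method names are hypothetical.

```java
// Hypothetical illustration, not a Lucene class: models an analysis cap
// that only looks at the first maxCharsToAnalyze characters of the text.
final class AnalysisCap {
    static boolean foundWithinCap(String text, String term, int maxCharsToAnalyze) {
        int limit = Math.min(text.length(), maxCharsToAnalyze);
        // Anything beyond the limit is invisible to the search.
        return text.substring(0, limit).contains(term);
    }
}
```

With a term that first appears after character 100, a cap of 50 reports no match, while Integer.MAX_VALUE finds it. As the replies in this thread point out, an unlimited setting trades performance for completeness on large documents.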

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread Amin Mohammed-Coleman
JIRA updated. Includes new testcase which shows highlighter not working as expected. On Thu, Mar 12, 2009 at 5:56 PM, Amin Mohammed-Coleman wrote: > Hi > > I have found that it is not an issue with POI. I extracted text using POI but > differently and the term is extracted properly. When I store t

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread Amin Mohammed-Coleman
Hi I have found that it is not an issue with POI. I extracted the text using POI, but differently, and the term is extracted properly. When I store the text and retrieve it, the term exists. However, running the text through the highlighter doesn't work. I will post a test case with a plain text file on JIR

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread Amin Mohammed-Coleman
>> The attachment didn't make it through here. Can you add it as an >> attachment to a new JIRA issue? >> >> Thanks, >> Mark >> >> >> >> >> >> >> From: Amin Mohammed-Coleman >> To: java

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread Amin Mohammed-Coleman
> From: Amin Mohammed-Coleman > To: java-user@lucene.apache.org > Sent: Thursday, 12 March, 2009 7:47:20 > Subject: Re: Lucene Highlighting and Dynamic Summaries > > Hi > > Please find attached a test case plus a document. Just to mention this > occurs

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread mark harwood
The attachment didn't make it through here. Can you add it as an attachment to a new JIRA issue? Thanks, Mark From: Amin Mohammed-Coleman To: java-user@lucene.apache.org Sent: Thursday, 12 March, 2009 7:47:20 Subject: Re: Lucene Highlighting and Dy

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread Amin Mohammed-Coleman
Hi Please find attached a test case plus a document. Just to mention this occurs sometimes for other files. Cheers Amin On Wed, Mar 11, 2009 at 6:11 PM, markharw00d wrote: > If you can supply a JUnit test that recreates the problem I think we can > start to make progress on this. > > > > Amin

Re: Lucene Highlighting and Dynamic Summaries

2009-03-11 Thread markharw00d
If you can supply a JUnit test that recreates the problem I think we can start to make progress on this. Amin Mohammed-Coleman wrote: Hi Apologies for resending this mail. Just wondering if anyone has experienced the below. I'm not sure if this could happen due to the nature of the document. It does

Re: Lucene Highlighting and Dynamic Summaries

2009-03-11 Thread Amin Mohammed-Coleman
Hi Apologies for resending this mail. Just wondering if anyone has experienced the below. I'm not sure if this could happen due to the nature of the document. It does seem strange that one term search returns a summary while another does not, even though the same document is being returned. I'm asking this so

Re: Lucene Highlighting and Dynamic Summaries

2009-03-09 Thread Amin Mohammed-Coleman
Hi I am seeing some strange behaviour with the highlighter and I'm wondering if anyone else is experiencing this. In certain instances I don't get a summary being generated. I perform the search and the search returns the correct document. I can see that the lucene document contains the text in

Re: Lucene Highlighting and Dynamic Summaries

2009-03-07 Thread Amin Mohammed-Coleman
Hi Got it working! Thanks again for your help! Amin On Sat, Mar 7, 2009 at 12:25 PM, Amin Mohammed-Coleman wrote: > Thanks! The final piece that I needed to do for the project! > Cheers > > Amin > > On Sat, Mar 7, 2009 at 12:21 PM, Uwe Schindler wrote: > >> > cool. i will use compression an

Re: Lucene Highlighting and Dynamic Summaries

2009-03-07 Thread Amin Mohammed-Coleman
Thanks! The final piece that I needed to do for the project! Cheers Amin On Sat, Mar 7, 2009 at 12:21 PM, Uwe Schindler wrote: > > cool. i will use compression and store in index. is there anything > > special > > i need to for decompressing the text? i presume i can just do > > doc.get("cont

RE: Lucene Highlighting and Dynamic Summaries

2009-03-07 Thread Uwe Schindler
> cool. i will use compression and store in index. is there anything > special > i need to for decompressing the text? i presume i can just do > doc.get("content")? > thanks for your advice all! No just use Field.Store.COMPRESS when adding to index and Document.get() when fetching. The decompress
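
To give a feel for what transparently compressed storage involves, here is a standalone round trip using java.util.zip. It is only an assumption-laden illustration of compress-on-store and decompress-on-fetch, not Lucene's implementation; the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not a Lucene class: compresses text before storing
// and restores it when fetched.
final class StoredTextCompression {
    static byte[] compress(String text) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return out.toByteArray();
    }

    static String decompress(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && inflater.needsInput()) {
                    // Input ended before the stream was complete.
                    throw new IllegalArgumentException("truncated compressed data");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The caller sees the same string it stored, which mirrors the point made in the reply: the text comes back through the ordinary fetch call with no extra decompression step in application code.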

Re: Lucene Highlighting and Dynamic Summaries

2009-03-07 Thread Amin Mohammed-Coleman
turday, March 07, 2009 12:46 PM > > To: java-user@lucene.apache.org > > Subject: Re: Lucene Highlighting and Dynamic Summaries > > > > It depends :) > > > > It's a trade-off. If storing is not prohibitive, I recommend that as > > it makes life easier

RE: Lucene Highlighting and Dynamic Summaries

2009-03-07 Thread Uwe Schindler
o: java-user@lucene.apache.org > Subject: Re: Lucene Highlighting and Dynamic Summaries > > It depends :) > > It's a trade-off. If storing is not prohibitive, I recommend that as > it makes life easier for highlighting. > > Erik > > On Mar 7, 2009, at

Re: Lucene Highlighting and Dynamic Summaries

2009-03-07 Thread Erik Hatcher
It depends :) It's a trade-off. If storing is not prohibitive, I recommend that as it makes life easier for highlighting. Erik On Mar 7, 2009, at 6:37 AM, Amin Mohammed-Coleman wrote: hi that's what i was thinking about. i would need to get the file and extract the text again a

Re: Lucene Highlighting and Dynamic Summaries

2009-03-07 Thread Amin Mohammed-Coleman
hi that's what i was thinking about. i would need to get the file and extract the text again and then pass through the highlighter. The other option is storing the content in the index the downside being index is going to be large. Which would be the recommended approach? Cheers Amin On Sat,

Re: Lucene Highlighting and Dynamic Summaries

2009-03-07 Thread Erik Hatcher
With the caveat that if you're not storing the text you want highlighted, you'll have to retrieve it somehow and send it into the Highlighter yourself. Erik On Mar 7, 2009, at 5:40 AM, Michael McCandless wrote: You should look at contrib/highlighter, which does exactly this. Mike

Re: Lucene Highlighting and Dynamic Summaries

2009-03-07 Thread Michael McCandless
You should look at contrib/highlighter, which does exactly this. Mike Amin Mohammed-Coleman wrote: Hi I am currently indexing documents (pdf, ms word, etc) that are uploaded, these documents can be searched and what the search returns to the user are summaries of the documents. Currently

Lucene Highlighting and Dynamic Summaries

2009-03-07 Thread Amin Mohammed-Coleman
Hi I am currently indexing documents (pdf, ms word, etc) that are uploaded, these documents can be searched and what the search returns to the user are summaries of the documents. Currently the summaries are extracted when indexing the file (summary constructed by taking the first 10 lines of the

RE: Lucene highlighting

2007-11-28 Thread Scott Smith
From: Matthijs Bierman [mailto:[EMAIL PROTECTED] Sent: Wed 11/28/2007 5:58 AM To: java-user@lucene.apache.org Subject: Re: Lucene highlighting This would only highlight plaintext though, not in the original document as I suspect the TS would like. Matthijs markharw00d wrote: > >> I nee

RE: Lucene highlighting

2007-11-28 Thread Scott Smith
xml with embedded xhtml From: Matthijs Bierman [mailto:[EMAIL PROTECTED] Sent: Wed 11/28/2007 3:26 AM To: java-user@lucene.apache.org Subject: Re: Lucene highlighting Hi Scott, The highlighter code does not do this. You need to implement your own highlighter

Re: Lucene highlighting

2007-11-28 Thread Matthijs Bierman
This would only highlight plaintext though, not in the original document as I suspect the TS would like. Matthijs markharw00d wrote: I need to highlight an entire document as it is displayed See NullFragmenter

Re: Lucene highlighting

2007-11-28 Thread markharw00d
I need to highlight an entire document as it is displayed See NullFragmenter
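
The goal of marking every keyword in the full text, rather than in selected fragments, can be illustrated with plain java.util.regex. This sketch does not use NullFragmenter or any Lucene class and works on plain text only, which is the limitation raised in the follow-up reply. The <B> tags and all names are assumptions for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not a Lucene class: wraps every whole-word,
// case-insensitive occurrence of a term in <B>...</B> across the full text.
final class WholeTextHighlighter {
    static String highlight(String text, String term) {
        Pattern p = Pattern.compile(
                "\\b" + Pattern.quote(term) + "\\b", Pattern.CASE_INSENSITIVE);
        // $0 is the matched text, so the original casing is preserved.
        return p.matcher(text).replaceAll("<B>$0</B>");
    }

    public static void main(String[] args) {
        // prints: <B>Lucene</B> is fast. I like <B>lucene</B>.
        System.out.println(highlight("Lucene is fast. I like lucene.", "lucene"));
    }
}
```

Unlike a real highlighter, this does no analysis (stemming, stop words, phrase handling), so it only matches the literal term.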

Re: Lucene highlighting

2007-11-28 Thread Matthijs Bierman
Hi Scott, The highlighter code does not do this. You need to implement your own highlighter. What kind of documents are you indexing? Matthijs Scott Smith wrote: I've been looking at the highlighter examples. All of them seem to deal with fragments. I need to highlight an entire document

Lucene highlighting

2007-11-27 Thread Scott Smith
I've been looking at the highlighter examples. All of them seem to deal with fragments. I need to highlight an entire document as it is displayed (i.e., highlight all of the keywords in it). Can someone point me to some examples of this or does the highlighter code not do this? Thanks Sco