RE: Highlighting query results, my method is too crude, but how to improve it?

2023-02-21 Thread Trevor Nicholls
Thank you David, very useful cheers T -Original Message- From: Dawid Weiss Sent: Tuesday, February 21, 2023 7:17 PM To: java-user@lucene.apache.org Subject: Re: Highlighting query results, my method is too crude, but how to improve it? You can use two different queries - the query is

Re: Highlighting query results, my method is too crude, but how to improve it?

2023-02-20 Thread Dawid Weiss
> missing something. > > cheers > T > > -Original Message- > From: Mikhail Khludnev > Sent: Tuesday, February 21, 2023 3:22 AM > To: java-user@lucene.apache.org > Subject: Re: Highlighting query results, my method is too crude, but how > to improve it?

RE: Highlighting query results, my method is too crude, but how to improve it?

2023-02-20 Thread Trevor Nicholls
I thought. But I might be missing something. cheers T -Original Message- From: Mikhail Khludnev Sent: Tuesday, February 21, 2023 3:22 AM To: java-user@lucene.apache.org Subject: Re: Highlighting query results, my method is too crude, but how to improve it? Hello, Maybe I'm mi

Re: Highlighting query results, my method is too crude, but how to improve it?

2023-02-20 Thread Mikhail Khludnev
Hello, Maybe I'm missing some point. But, can you highlight another query than one you search for? On Mon, Feb 20, 2023 at 5:07 PM Trevor Nicholls wrote: > Sorry I apologize for this being a bit long and for explaining the problem > at the very bottom after all the background, rather than starti

Re: Highlighting and delineating Passages (fragmenting)

2017-05-30 Thread Dawid Weiss
https://issues.apache.org/jira/browse/SOLR-1105 Yes, this is spot-on what I need with regard to copyTo fields, thanks for the link! > Or are the overlaps coming from passage offset ranges from separate queries > to the same content? The overlaps are caused by the fact that we have multiple sour

Re: Highlighting and delineating Passages (fragmenting)

2017-05-30 Thread David Smiley
On Tue, May 30, 2017 at 9:25 AM Dawid Weiss wrote: > > #2 & #3 is the same requirement; you elaborate on #2 with more detail in > #3. > > The UH can't currently do this; but with the OH (original Highlighter) > you > > can but it appears somewhat awkward. See SimpleSpanFragmenter. I had > said

Re: Highlighting and delineating Passages (fragmenting)

2017-05-30 Thread Dawid Weiss
> #2 & #3 is the same requirement; you elaborate on #2 with more detail in #3. > The UH can't currently do this; but with the OH (original Highlighter) you > can but it appears somewhat awkward. See SimpleSpanFragmenter. I had said > it was easy but I was mistaken; I'm getting rustier on the OH.

Re: Highlighting and delineating Passages (fragmenting)

2017-05-30 Thread David Smiley
Looks like you should use the original Highlighter until requirement #2,3 can be done with the UnifiedHighlighter. Other than #2,3, the UH can handle all these requirements, and the OH can do all. On Sat, May 27, 2017 at 6:08 AM Dawid Weiss wrote: > Thanks for your explanation, David. > > I act

Re: Highlighting and delineating Passages (fragmenting)

2017-05-27 Thread Evert Wagenaar
I always assumed this was the default behaviour of the Lucene TermHighlighter but I could be mistaken with an older version. I found out that there are major differences between Lucene and Solr though, with which I have similar problems. Best regards, Evert Wagenaar http://www.evertwagenaar.com/

Re: Highlighting and delineating Passages (fragmenting)

2017-05-27 Thread Dawid Weiss
Thanks for your explanation, David. I actually found working with all Lucene highlighters pretty difficult. I have a few requirements which seemed deceptively simple: 1) highlight query hit regions (phrase, fuzzy, terms); 2) try to organise the resulting snippets to visually "center" the hit regi

Re: highlighting with best text fragment from multi-value field

2016-12-14 Thread Tech Behemoth
Hi all Any idea of best practice for getting fragmented highlighted string ( Lucene 5.3.2) of multi-value field? Thanks On Mon, Dec 12, 2016 at 12:11 AM, Tech Behemoth wrote: > Hi all > > How to provide highlighting for fragmented string which is created from > multi-value field using Lucene

RE: Highlighting deprecation?

2015-12-02 Thread Allison, Timothy B.
01511.mbox/%3CBY2PR09MB112F5004CD269FA936812DFC72B0%40BY2PR09MB112.namprd09.prod.outlook.com%3E -Original Message- From: scott cote [mailto:scottcc...@gmail.com] Sent: Tuesday, December 01, 2015 5:26 PM To: java-user@lucene.apache.org Subject: Re: Highlighting deprecation? checkout the

Re: Highlighting deprecation?

2015-12-01 Thread scott cote
checkout the highlight package … https://lucene.apache.org/core/5_3_0/highlighter/org/apache/lucene/search/highlight/package-summary.html SCott > On Dec 1, 2015, at 4:16 PM, Kunzman, Dougl

Re: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-06 Thread Michael Sokolov
On 2/6/2014 12:53 AM, Earl Hood wrote: On Tue, Feb 4, 2014 at 6:05 PM, Michael Sokolov wrote: Thanks for the feedback. I think it's difficult to know what to do about attribute value highlighting in the general case - do you have any suggestions? That is a challenging one since one has to kno

Re: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-05 Thread Earl Hood
On Tue, Feb 4, 2014 at 6:05 PM, Michael Sokolov wrote: > Thanks for the feedback. I think it's difficult to know what to do about > attribute value highlighting in the general case - do you have any > suggestions? That is a challenging one since one has to know how attribute data will be transfo

Re: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-05 Thread Olivier Binda
On 02/05/2014 01:05 AM, Michael Sokolov wrote: On 2/4/2014 2:50 PM, Earl Hood wrote: On Tue, Feb 4, 2014 at 1:16 PM, Michael Sokolov wrote: You might be interested in looking at Lux, which layers XML services like XQuery on top of Lucene and Solr, and includes an XML-aware highlighter: https:

Re: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-05 Thread Trejkaz
On Wed, Feb 5, 2014 at 4:16 AM, Earl Hood wrote: > Our current solution is to do highlighting on the client-side. When > search happens, the search results from the server includes the parsed > query terms so the client has an idea of which terms to highlight vs > trying to reimplement a complete

Re: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-04 Thread Michael Sokolov
On 2/4/2014 2:50 PM, Earl Hood wrote: On Tue, Feb 4, 2014 at 1:16 PM, Michael Sokolov wrote: You might be interested in looking at Lux, which layers XML services like XQuery on top of Lucene and Solr, and includes an XML-aware highlighter: https://github.com/msokolov/lux/blob/master/src/main/ja

Re: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-04 Thread Earl Hood
On Tue, Feb 4, 2014 at 1:16 PM, Michael Sokolov wrote: > You might be interested in looking at Lux, which layers XML services like > XQuery on top of Lucene and Solr, and includes an XML-aware highlighter: > https://github.com/msokolov/lux/blob/master/src/main/java/lux/search/highlight/XmlHighligh

Re: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-04 Thread Michael Sokolov
On 2/4/14 12:16 PM, Earl Hood wrote: On Tue, Feb 4, 2014 at 12:20 AM, Trejkaz wrote: I'm trying to find a precise and reasonably efficient way to highlight all occurrences of terms in the query, only highlighting fields which ... [snip] I am in a similar situation with a web-based applica

Re: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-04 Thread Earl Hood
On Tue, Feb 4, 2014 at 12:20 AM, Trejkaz wrote: > I'm trying to find a precise and reasonably efficient way to highlight > all occurrences of terms in the query, only highlighting fields which > match the corresponding fields used in the query. This seems like it > would be a fairly common require

RE: Highlighting text, do I seriously have to reimplement this from scratch?

2014-02-04 Thread Allison, Timothy B.
This will be of no immediate help, but in the next iteration of LUCENE-5317, which I'll post in a few weeks (if I can find the time), I'll have an option to pull concordance windows from character offsets which can be stored at index time (so you wouldn't have to re-analyze). The current versio

RE: Highlighting phrases

2013-11-27 Thread Scott Smith
Never mind. I figured it out. Thanks anyway. -Original Message- From: Scott Smith [mailto:ssm...@mainstreamdata.com] Sent: Wednesday, November 27, 2013 9:27 AM To: java-user@lucene.apache.org Subject: Highlighting phrases I'm doing some highlighting with the following code fragment:

Re: highlighting component to searchComponent

2013-07-01 Thread Jack Krupansky
Try asking your question on the “Solr user” email list – this is the Lucene user list! -- Jack Krupansky From: Adrien RUFFIE Sent: Monday, July 01, 2013 4:36 AM To: java-user@lucene.apache.org Subject: highlighting component to searchComponent Hello all I had the following configuration in my

Re: Highlighting search words in full document

2013-04-08 Thread Darren Hoffman
Thanks, Erick. I'll try that. Darren On 2013-04-07 3:25 PM, "Erick Erickson" wrote: >Well, at that point you have a doc ID presumably. When you format your >responses to the initial query, the link you provide for each verse is >something like > >yourserver/solr/collection1/select?q=id:chapter_

Re: Highlighting search words in full document

2013-04-07 Thread Erick Erickson
Well, at that point you have a doc ID presumably. When you format your responses to the initial query, the link you provide for each verse is something like yourserver/solr/collection1/select?q=id:chapter_id&hl=true&hl.fl=fullchaptertext&hl.q=. So when the user clicks on it, you get a response wi
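The link described in this message can be assembled roughly as follows. This is a minimal sketch, not code from the thread: the server, collection and `fullchaptertext` field come from the message, `hl`, `hl.fl` and `hl.q` are Solr's highlighting request parameters, and the `ChapterLinkBuilder` class, its method and the example values are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class ChapterLinkBuilder {
    // Hypothetical helper: selects one chapter by id and asks Solr to highlight
    // the user's original search terms (hl.q) in the full chapter text (hl.fl).
    static String buildChapterLink(String baseUrl, String chapterId, String userQuery) {
        String q = URLEncoder.encode("id:" + chapterId, StandardCharsets.UTF_8);
        String hlq = URLEncoder.encode(userQuery, StandardCharsets.UTF_8);
        return baseUrl + "/select?q=" + q
                + "&hl=true"
                + "&hl.fl=fullchaptertext"
                + "&hl.q=" + hlq;
    }

    public static void main(String[] args) {
        // Example values are made up for illustration.
        System.out.println(buildChapterLink(
                "http://yourserver/solr/collection1", "genesis_1", "light darkness"));
    }
}
```

The point of `hl.q` is that the selecting query (`q=id:...`) and the highlighted query can differ, so the chapter is fetched by id while the verse-level search terms are what get marked.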

Re: Highlighting search words in full document

2013-04-07 Thread Darren Hoffman
Thanks for the response, Erick. I have implemented just what you have described. The question I have is how to highlight the searched words in the entire chapter that were highlighted in the selected verse. Thanks! Sent from my iPhone On Apr 7, 2013, at 5:38 AM, Erick Erickson wrote: > Soun

Re: Highlighting search words in full document

2013-04-07 Thread Erick Erickson
Sounds like what you want to do is 1> with each verse, store the chapter ID. This could be the ID of another document. There's no requirement that all docs in an index have the same structure. In this case, you could have a "type" field in each doc with values like "verse" and "chapter". For your v

Re: Highlighting and InvalidTokenOffsetsException in Lucene 4.0

2012-11-28 Thread nbuso
Scott Smith mainstreamdata.com> writes: > > I'm migrating code from Lucene 3.5 to 4.0. I have the following code which is supposed to highlight text. I > get the exception InvalidTokenOffsetsException. I have no idea what that means. I am using a custom > analyzer which seems to work for sea

Re: Highlighting html pages

2012-11-06 Thread Michael Sokolov
On 11/6/2012 3:29 AM, Steve Rowe wrote: Hi Scott, HTMLStripCharFilter doesn't require that its input be valid HTML - there is no assumption of balanced tags. Also, highlighted sections could span tags, e.g. if you highlight "this phrase", and the original HTML looks like: … thisphras

Re: Highlighting html pages

2012-11-06 Thread Steve Rowe
> properly nested. > > Cheers > > Scott > > -Original Message- > From: Scott Smith [mailto:ssm...@mainstreamdata.com] > Sent: Thursday, November 01, 2012 7:16 PM > To: Michael Sokolov; java-user@lucene.apache.org > Subject: RE: Highlighting

RE: Highlighting html pages

2012-11-05 Thread Scott Smith
tags being properly nested. Cheers Scott -Original Message- From: Scott Smith [mailto:ssm...@mainstreamdata.com] Sent: Thursday, November 01, 2012 7:16 PM To: Michael Sokolov; java-user@lucene.apache.org Subject: RE: Highlighting html pages I was trying to play with this. Am I correct in

Re: Highlighting html pages

2012-11-05 Thread Michael Sokolov
ve me after I've stripped the HTML. Suggestions? Scott -Original Message- From: Michael Sokolov [mailto:soko...@ifactory.com] Sent: Tuesday, October 23, 2012 9:04 PM To: java-user@lucene.apache.org Cc: Scott Smith Subject: Re: Highlighting html pages If you use HTMLStripCharFilter, i

RE: Highlighting html pages

2012-11-01 Thread Scott Smith
actory.com] Sent: Tuesday, October 23, 2012 9:04 PM To: java-user@lucene.apache.org Cc: Scott Smith Subject: Re: Highlighting html pages If you use HTMLStripCharFilter, it extracts the text only, leaving tags out, and remembering the word positions so that highlighting works properly. Should do ex

Re: Highlighting html pages

2012-10-23 Thread Michael Sokolov
If you use HTMLStripCharFilter, it extracts the text only, leaving tags out, and remembering the word positions so that highlighting works properly. Should do exactly what you want out of the box... On 10/23/2012 8:00 PM, Scott Smith wrote: I need to take an html page that I retrieve from m
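To show why remembering positions matters, here is a minimal sketch of stripping tags while recording where each kept character came from, so that a match found in the stripped text can be marked in the original HTML. It is not Lucene's `HTMLStripCharFilter`; the class and method names are hypothetical, and entities, comments and scripts are ignored.

```java
import java.util.ArrayList;
import java.util.List;

final class OffsetTrackingStripper {
    final String text;          // the input with tags removed
    final int[] originalOffset; // originalOffset[i] = index in the HTML of text.charAt(i)

    OffsetTrackingStripper(String html) {
        StringBuilder sb = new StringBuilder();
        List<Integer> offsets = new ArrayList<>();
        boolean inTag = false;
        for (int i = 0; i < html.length(); i++) {
            char c = html.charAt(i);
            if (c == '<') {
                inTag = true;
            } else if (c == '>' && inTag) {
                inTag = false;
            } else if (!inTag) {
                sb.append(c);   // keep the character...
                offsets.add(i); // ...and remember where it was in the HTML
            }
        }
        text = sb.toString();
        originalOffset = offsets.stream().mapToInt(Integer::intValue).toArray();
    }

    // Finds the first occurrence of term in the stripped text and wraps the
    // corresponding region of the original HTML in <em> tags.
    static String highlightFirst(String html, String term) {
        if (term.isEmpty()) return html;
        OffsetTrackingStripper s = new OffsetTrackingStripper(html);
        int start = s.text.indexOf(term);
        if (start < 0) return html;
        int from = s.originalOffset[start];
        int to = s.originalOffset[start + term.length() - 1] + 1;
        return html.substring(0, from) + "<em>" + html.substring(from, to) + "</em>"
                + html.substring(to);
    }
}
```

If the match spans a tag boundary, the wrapped region would contain markup and the result may no longer nest properly, which is the concern raised in the later replies of this thread.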

Re: highlighting

2011-08-03 Thread govind bhardwaj
Hi Sabeer, I used Lucene 3.3.0 for testing your code. (I doubt that Lucene 4.0 has been released as version 3.3.0 was released recently in July). In the second case, due to exact-matching there is no output i.e. there is no "transport" (no exact match) , but "transportation" in sourceText. One c

Re: highlighting

2011-07-18 Thread Sabeer Hussain
I am using Lucene 4.0 and trying to use its highlighting feature. I am not getting the desired result due to some mistake that I am not able to identify. My source code looks like String sourceText = "liver disease kidney transplant"; String termString ="\"transplant\"";

Re: highlighting performance

2011-06-22 Thread Itamar Syn-Hershko
I'm not intimately familiar with FVH myself, but that sounds reasonable. Tests usually don't lie. I'd definitely like to see a patched version that avoids that! Itamar. On 22/06/2011 05:29, Michael Sokolov wrote: OK - it seems as if there is a blow-up in FieldPhraseList if a document has a la

Re: highlighting performance

2011-06-21 Thread Michael Sokolov
OK - it seems as if there is a blow-up in FieldPhraseList if a document has a large number of occurrences of a term that is in the query. In one example, I searched for "1", and this occurs just under 2000 times in one of my test documents (as the value of HTML attributes). Admittedly a weird

Re: highlighting performance

2011-06-21 Thread Michael Sokolov
I did that, and the benchmark indicates FVH is 10x faster than Highlighter now. I ran with a subset of the wikipedia data since I didn't want to deal with the whole thing. I'm trying to reconcile these weirdly varying results. One difference is that the benchmark doesn't use PhraseQueries -

Re: highlighting performance

2011-06-20 Thread Michael Sokolov
Koji- I'm not familiar with the benchmarking system, but maybe I'll see if I can run that benchmark on my test data as a point of comparison - thanks for the pointer! -Mike On 6/20/2011 8:21 PM, Koji Sekiguchi wrote: Mike, FVH used to be faster for large docs. I wrote FVH section for Lucene

Re: highlighting performance

2011-06-20 Thread Koji Sekiguchi
Mike, FVH used to be faster for large docs. I wrote FVH section for Lucene in Action and it said: In contrib/benchmark (covered in appendix C), there’s an algorithm file called highlight-vs-vector-highlight.alg that lets you see the difference between two highlighters in processing time. As of

Re: Highlighting a phrase with "Single"

2011-04-06 Thread shrinath.m
Thats right :) Thanks Koji :) On Wed, Apr 6, 2011 at 3:31 PM, Koji Sekiguchi [via Lucene] < ml-node+2784321-1329059645-376...@n3.nabble.com> wrote: > (11/04/06 14:01), shrinath.m wrote: > > > If there is a phrase in search, the highlighter highlights every word > > separately.. > > Like this : >

Re: Highlighting a phrase with "Single"

2011-04-06 Thread Koji Sekiguchi
(11/04/06 14:01), shrinath.m wrote: If there is a phrase in search, the highlighter highlights every word separately.. Like this : I love Lucene Instead what I want is like this : I love Lucene Not sure my mailer problem or not, I don't see the difference between above two. But reading t

Re: Highlighting large documents (Lucene 3.0.0)

2010-03-01 Thread Koji Sekiguchi
-Arne- wrote: Hi Koji, thanks for your answer. Can you help me once again? What exactly am I supposed to do? The concrete program in my mind here: public class TestHighlightTruncatedSearchQuery { static Directory dir = new RAMDirectory(); static Analyzer analyzer = new BiGramAnalyzer();

Re: Highlighting large documents (Lucene 3.0.0)

2010-03-01 Thread -Arne-
Hi Koji, thanks for your answer. Can you help me once again? What exactly am I supposed to do? Koji Sekiguchi-2 wrote: > > -Arne- wrote: >> Hi, >> >> I'm using Lucene 3.0.0 and have large documents to search (logfiles >> 0,5-20MB). For better search results the query tokens are truncated left

Re: Highlighting large documents (Lucene 3.0.0)

2010-03-01 Thread Koji Sekiguchi
-Arne- wrote: Hi, I'm using Lucene 3.0.0 and have large documents to search (logfiles 0,5-20MB). For better search results the query tokens are truncated left and right. A search for "user" is made to "*user*". The performance of searching even complex queries with more than one searchterm is qu

Re: Highlighting phrases in 2.9

2009-09-30 Thread Mark Miller
Scott Smith wrote: > I've been looking at the changes I have to make in my code to go from > 2.4.1 to 2.9. One of the features I have is to highlight query hits in > documents which meet the search criteria. If the query has a phrase, > then I need to highlight the phrase, but not isolated words

Re: highlighting the result within a file

2009-07-04 Thread Jaison Sunny
hi Ritu, > this is jaison. i am new to such search. can you just help me out. i want some body who can guide me in lucene > Thanks in advance. > > > - >

Re: highlighting searched results in document

2009-05-28 Thread Ritu choudhary
No i am not indexing the html tags here. I just want to highlight the searched word in the html or xml file(the file from which the index was created) ,can't i trace that? Is there any function to trace the positions of the term stored in lucene index to find where it actually is in the file? Can o

Re: highlighting searched results in document

2009-05-28 Thread KK
As I know, you extract the text out of html pages, I dont think you want to index the tags as well, right? So what gets indexed by lucene is just the text and what you get as search result is what you've indexed. I'm repeating myself, once you have the search result, its upon you to do what you wan

Re: highlighting searched results in document

2009-05-28 Thread Ritu choudhary
Is this possible through lucene or has anybody tried such thing? On 28/05/2009, Ritu choudhary wrote: > well friend let me explain the whole thing to you then: > > i created lucene index out of some .xml and .html files and i also > checked this index through luke and its pretty alright till here

Re: highlighting searched results in document

2009-05-28 Thread Ritu choudhary
well friend let me explain the whole thing to you then: i created lucene index out of some .xml and .html files and i also checked this index through luke and its pretty alright till here . I searched the terms and can find them too but how do i use this result . I want to open the document ,the

Re: highlighting searched results in document

2009-05-27 Thread KK
Yes, the getBestFragment() returns the matched fragment "fragmentcount" numbers each separated with the "fragmentseparator". what exactly you mean by "highlight the searched word in the document." what is this document??? first let us know what exactly you want to with the search results. --KK On

Re: highlighting searched results in document

2009-05-27 Thread Ritu choudhary
The output i want is to show the text i searched let's say "search" in the document where it occured. Like this: the text i want to search is this and it should be highlighted as this . Search currently is giving just the word "search". The result i want is to highlight the searched word in the d

Re: highlighting searched results in document

2009-05-27 Thread KK
what exactly is your requirement? Displaying the final search results in a webpage? or anything else. The results that you are getting is correct. Now you have to decide what you want to do with that. I thought you are trying to show the results in a webpage. --KK On Thu, May 28, 2009 at 11:54 AM

Re: highlighting searched results in document

2009-05-27 Thread Ritu choudhary
no i am doing it on eclipse ganymede On 28/05/2009, KK wrote: > Forgot: > Are you trying all this from command line? Because that's when you get the > output as unprocessed html , those span tags, when you pass the same to > display the content as a webpage they will be processed by the browser and

Re: highlighting searched results in document

2009-05-27 Thread KK
Forgot: Are you trying all this from command line? Because that's when you get the output as unprocessed html , those span tags, when you pass the same to display the content as a webpage they will be processed by the browser and you will see the colored matches. --KK On Thu, May 28, 2009 at 11:49

Re: highlighting searched results in document

2009-05-27 Thread KK
Yes, that's the expected output. Now put that full content [whatever the searcher returned] in the html page along with the styling for the same, and you will see the matches in yellow [you chose yellow as color for highlighting]. --KK On Thu, May 28, 2009 at 11:42 AM, Ritu choudhary wrote: > I h

Re: highlighting searched results in document

2009-05-27 Thread Ritu choudhary
I have added the lines you suggested and now its giving the following output , still can't get what's wrong... THE CHANGES I HAVE DONE: SimpleHTMLFormatter formatter = new SimpleHTMLFormatter("", ""); Highlighter highlighter = new Highlighter(formatter, new QueryScorer(query

Re: highlighting searched results in document

2009-05-27 Thread KK
Yes, your code is wrong! Where is the highlighter span/formatter, because from your code what I can see is that you are just passing the score to Queryscorer, instead you should pass both queryscore as well as formatter. From my previous mail you can see the following code and mimic the same and i

Re: highlighting searched results in document

2009-05-27 Thread Ritu choudhary
Am i coding it wrongly ...please reply.

Re: highlighting searched results in document

2009-05-27 Thread Ritu choudhary
Thank you so much for your patience and support but i am still not getting the correct result. Here is my code can you please tell me what wrong have i done in it? (I don't want to use org.apache.search.hit so i have used terms in place of that) package highlighted; import java.io.FileWriter; i

Re: highlighting searched results in document

2009-05-27 Thread KK
@Ritu Wouter's reply must have fixed the problem, right? Or still stuck? --KK On Wed, May 27, 2009 at 1:46 PM, Wouter Heijke wrote: > Hi, > It sounds to me that you are highlighting the query string and not the > document. You will have to pass the document's content to > getBestFragments() and

Re: highlighting searched results in document

2009-05-27 Thread Wouter Heijke
Hi, It sounds to me that you are highlighting the query string and not the document. You will have to pass the document's content to getBestFragments() and it will work I think. Wouter > hi there, > I am using lucene highlighter to highlight the searched result > but it shows only the query s

Re: highlighting searched results in document

2009-05-27 Thread Ritu choudhary
I want to confirm the output of the below statement , what i get into "result" is just the word i am searching (let's say d word is registered). How can i get the whole fragment in which the word is found and show the highlighted word in that fragment or document. String result = highlighte

Re: highlighting searched results in document

2009-05-26 Thread KK
Hi , AFAIK, the default option is to bold the matched text. If you want to do something else, say highlight it with some color then you have to do that instead of doing the default bolding. The following is a working example from LIA2ndEdn, [verbatim copy] for hit highlighting. import java.io.*; i

RE: Highlighting phrases

2008-04-21 Thread Scott Smith
And a well deserved beer it would be... Thanks Scott -Original Message- From: Mark Miller [mailto:[EMAIL PROTECTED] Sent: Sunday, April 20, 2008 7:50 PM To: java-user@lucene.apache.org Subject: Re: Highlighting phrases https://issues.apache.org/jira/browse/LUCENE-794 Because its for

Re: Highlighting phrases

2008-04-20 Thread Mark Miller
https://issues.apache.org/jira/browse/LUCENE-794 Because its for a customer, that will be 1 beer... On Sun, 2008-04-20 at 17:12 -0600, Scott Smith wrote: > I've written some code to highlight items from a search using the standard > Highlighter class, QueryScorer, and NullFragmenter. Everything

Re: Highlighting with wildcards?

2008-01-18 Thread John Byrne
I think the way to do this is to run the 'rewrite()' method on the wildcard query; this turns it into a boolean collection of term queries, with a term for each match for the wildcard. That way, you're just highlighting a normal term query. I think that would also work for fuzzy queries. Hope th
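The idea can be illustrated without Lucene. In this minimal sketch a wildcard pattern is expanded against a known vocabulary into plain terms, which are then highlighted like any term query. The vocabulary list stands in for the index that `rewrite()` would consult, and the class, its methods and the `<b>` markup are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class WildcardExpansionSketch {
    // Expands a pattern such as "trans*" into the vocabulary terms it matches.
    static List<String> expand(String wildcard, List<String> vocabulary) {
        StringBuilder regex = new StringBuilder();
        boolean first = true;
        for (String part : wildcard.split("\\*", -1)) {
            if (!first) regex.append(".*"); // each '*' matches any run of characters
            regex.append(Pattern.quote(part));
            first = false;
        }
        Pattern p = Pattern.compile(regex.toString());
        List<String> terms = new ArrayList<>();
        for (String term : vocabulary) {
            if (p.matcher(term).matches()) terms.add(term);
        }
        return terms;
    }

    // Highlights whole-word occurrences of any of the expanded terms.
    static String highlight(String text, List<String> terms) {
        if (terms.isEmpty()) return text;
        StringBuilder alternatives = new StringBuilder();
        for (String t : terms) {
            if (alternatives.length() > 0) alternatives.append('|');
            alternatives.append(Pattern.quote(t));
        }
        return Pattern.compile("\\b(?:" + alternatives + ")\\b")
                .matcher(text)
                .replaceAll("<b>$0</b>");
    }
}
```

With the vocabulary `liver, disease, kidney, transplant, transportation`, expanding `trans*` yields `transplant` and `transportation`, and highlighting then proceeds exactly as it would for two ordinary terms.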

Re: Highlighting + phrase queries

2008-01-10 Thread Mark Miller
I don't think you would see much of gain. Shoving the TokenStream into the MemoryIndex is actually pretty fast and I wouldn't be surprised if it was much faster than reading from disk. Most of the computational time is spent in reconstructing the TokenStream, whether you use term-vectors or re-

Re: Highlighting + phrase queries

2008-01-10 Thread Marjan Celikik
Mark Miller wrote: That is why the original contrib does not work with PhraseQuery's. It simply matches Tokens from the query with those in the TokenStream. LUCENE-794 takes the TokenStream and shoves it into a MemoryIndex. Then, after converting the query to a SpanQuery approximation, getSp

Re: Highlighting + phrase queries

2008-01-10 Thread Marjan Celikik
Marjan Celikik wrote: Mark Miller wrote: The Highlighter works by comparing the TokenStream of the document with the Tokens in the query. The TokenStream can be rebuilt from the index if you use TermVectors with TokenSources or you can get it by reanalyzing the document. Each Token from the T

Re: Highlighting + phrase queries

2008-01-10 Thread Marjan Celikik
Mark Miller wrote: The Highlighter works by comparing the TokenStream of the document with the Tokens in the query. The TokenStream can be rebuilt from the index if you use TermVectors with TokenSources or you can get it by reanalyzing the document. Each Token from the TokenStream is checked

Re: Highlighting + phrase queries

2008-01-10 Thread Mark Miller
The Highlighter works by comparing the TokenStream of the document with the Tokens in the query. The TokenStream can be rebuilt from the index if you use TermVectors with TokenSources or you can get it by reanalyzing the document. Each Token from the TokenStream is checked against Tokens in th
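A minimal sketch of that term-by-term comparison, and of why it marks stray words when the query is a phrase. This is not the contrib Highlighter's code: whitespace tokenisation, the class name and the `[...]` markers are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class TokenMatchSketch {
    // Term-by-term: mark every document token that also occurs in the query.
    static String highlightTerms(String doc, String query) {
        Set<String> queryTokens = new HashSet<>(Arrays.asList(query.trim().split("\\s+")));
        String[] tokens = doc.trim().split("\\s+");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) out.append(' ');
            out.append(queryTokens.contains(tokens[i]) ? "[" + tokens[i] + "]" : tokens[i]);
        }
        return out.toString();
    }

    // Position-sensitive: mark tokens only where the whole phrase occurs in order.
    static String highlightPhrase(String doc, String phrase) {
        List<String> d = Arrays.asList(doc.trim().split("\\s+"));
        List<String> p = Arrays.asList(phrase.trim().split("\\s+"));
        boolean[] hit = new boolean[d.size()];
        for (int i = 0; i + p.size() <= d.size(); i++) {
            if (d.subList(i, i + p.size()).equals(p)) {
                Arrays.fill(hit, i, i + p.size(), true);
            }
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < d.size(); i++) {
            if (i > 0) out.append(' ');
            out.append(hit[i] ? "[" + d.get(i) + "]" : d.get(i));
        }
        return out.toString();
    }
}
```

For the document `I love Lucene because Lucene is fast` and the phrase `love Lucene`, the term-by-term version also marks the second, isolated `Lucene`, while the position-sensitive version marks only the two adjacent words.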

Re: Highlighting + phrase queries

2008-01-10 Thread Marjan Celikik
Mark Miller wrote: Oh yeah...something that you may not have seen is that this has a dependency on MemoryIndex from contrib. You need that jar as well. - Mark Hm, I need the source code. How do I download the files from https://issues.apache.org/jira/browse/LUCENE-794 (all I see are some .pat

Re: Highlighting + phrase queries

2008-01-10 Thread Mark Miller
Oh yeah...something that you may not have seen is that this has a dependency on MemoryIndex from contrib. You need that jar as well. - Mark Marjan Celikik wrote: Mark Miller wrote: The contrib Highlighter doesn't know and highlights them all. Check out my patch here for position sensitive hi

Re: Highlighting + phrase queries

2008-01-10 Thread Mark Miller
It should work no problem with 2.2. What are the compile errors you are getting? If you send me a note directly I will send you a jar. - Mark Marjan Celikik wrote: Mark Miller wrote: The contrib Highlighter doesn't know and highlights them all. Check out my patch here for position sensitive

Re: Highlighting + phrase queries

2008-01-10 Thread Marjan Celikik
Mark Miller wrote: The contrib Highlighter doesn't know and highlights them all. Check out my patch here for position sensitive highlighting: https://issues.apache.org/jira/browse/LUCENE-794 It seems that the patch does not work with Lucene 2.2 as I get some compile errors. Is this really the

Re: Highlighting + phrase queries

2008-01-09 Thread Mark Miller
It works exactly the same as the standard contrib Highlighter except that it tries not to highlight spurious results for a positional query. This is exact with Span queries, but more approximate for phrase queries. The approximation is pretty darn good, but let me know if you find a case that d

Re: Highlighting + phrase queries

2008-01-09 Thread Marjan Celikik
Mark Miller wrote: The contrib Highlighter doesn't know and highlights them all. Check out my patch here for position sensitive highlighting: https://issues.apache.org/jira/browse/LUCENE-794 OK, before trying it out, I would like to know does the patch work for mixed queries, e.g. "a b" +c -d "

Re: Highlighting + phrase queries

2008-01-09 Thread Mark Miller
The contrib Highlighter doesn't know and highlights them all. Check out my patch here for position sensitive highlighting: https://issues.apache.org/jira/browse/LUCENE-794 Marjan Celikik wrote: Dear all, Let's assume I have a phrase query and a document which contain the phrase but also it co

Re: highlighting and fragments

2007-09-21 Thread Michael J. Prichard
One index is around 3,000,000 items. I think around 10 fields. I store some and don't others. I index email content and attachment content. I store some smaller fields and not the content fields. That current index is around 10GB but that is nothing that is about to come down the pike. Ma

Re: highlighting and fragments

2007-09-21 Thread Erick Erickson
Out of curiosity, how big is huge? And how many documents and fields? And a silly question, are you storing your fields or not (i.e. Field.Store.NO Erick On 9/20/07, Michael J. Prichard <[EMAIL PROTECTED]> wrote: > > Hello Folks, > > I wanted to stay away from storing text in the indexes in

Re: highlighting and fragments

2007-09-20 Thread Mark Miller
Lucene's storing functionality is just a simple storage mechanism. You can certainly and easily use your own storage mechanism. When you get your user created id back from Lucene due to a hit, just pass that id to your storage system to get the original text and then feed that to the Highlighte
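A minimal sketch of that flow, assuming an in-memory map stands in for the external storage system and a naive whole-word marker stands in for the Highlighter. Both stand-ins, and every name below, are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class ExternalStoreSketch {
    // Stand-in for a database or file store, keyed by the id kept in the index.
    private final Map<String, String> store = new HashMap<>();

    void put(String id, String text) {
        store.put(id, text);
    }

    // Called with the id returned by a search hit: load the original text from
    // the external store, then highlight the term in it.
    String highlightHit(String id, String term) {
        String text = store.get(id);
        if (text == null) return null; // nothing stored under this id
        return text.replaceAll("(?i)\\b" + Pattern.quote(term) + "\\b", "<b>$0</b>");
    }
}
```

Only the id needs to be stored in the index; the content fields can stay unstored, which keeps the index smaller at the cost of one extra lookup per displayed hit.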

Re: highlighting phrase query

2007-07-03 Thread Mark Miller
has any one used Lucene-794? how stable is it? is it widely used in industry? I have used it extensively and I would say it is extremely stable. As I said, much of the code from it is literally the same compiled code from Contrib Highlighter (It is really just a new Scorer class for the

Re: highlighting phrase query

2007-07-02 Thread sandeep chawla
to my manager. :) --Renaud -Original Message- From: Mark Miller [mailto:[EMAIL PROTECTED] Sent: Monday, July 02, 2007 2:11 PM To: java-user@lucene.apache.org Subject: Re: highlighting phrase query There has been a lot of Highlighter discussion lately, but just to try and sum up the st

RE: highlighting phrase query

2007-07-02 Thread Renaud Waldura
Mark: Thanks a million for this comprehensive analysis. This is going straight to my manager. :) --Renaud -Original Message- From: Mark Miller [mailto:[EMAIL PROTECTED] Sent: Monday, July 02, 2007 2:11 PM To: java-user@lucene.apache.org Subject: Re: highlighting phrase query There

Re: highlighting phrase query

2007-07-02 Thread Mark Miller
There has been a lot of Highlighter discussion lately, but just to try and sum up the state of Highlighting in the Lucene world: There are four Highlighter implementations that I know of. From what I can tell, only the original Contrib Highlighter has received sustained active development by m

Re: Highlighting of original documents

2007-03-13 Thread Oystein Reigem
Mark Miller wrote: Depends on the work you want to do. If you want to highlight a simple XML doc the approach would be to extract all of the text elements and run them through the highlighter and then correctly update them. That would be mostly simple DOM manipulation. OK. I guess there wil

Re: Highlighting of original documents

2007-03-13 Thread Mark Miller
Depends on the work you want to do. If you want to highlight a simple XML doc the approach would be to extract all of the text elements and run them through the highlighter and then correctly update them. That would be mostly simple DOM manipulation. The same approach should work with any forma
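A rough sketch of the DOM approach described above, using only the JDK's XML classes: collect the text nodes, find matches in each, and update the tree by wrapping each match in an element. The class name `XmlHighlightSketch`, the `<hit>` wrapper element, and the plain case-insensitive substring match are assumptions made for illustration; a real setup would run each text node through the Lucene Highlighter instead of `indexOfIgnoreCase`.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

public class XmlHighlightSketch {
    // Collect every plain text node under the given node, in document order.
    static void collectTextNodes(Node node, List<Text> out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.add((Text) node);
            return;
        }
        NodeList kids = node.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            collectTextNodes(kids.item(i), out);
        }
    }

    // Case-insensitive substring search (stand-in for a real highlighter).
    static int indexOfIgnoreCase(String data, String term) {
        for (int i = 0; i + term.length() <= data.length(); i++) {
            if (data.regionMatches(true, i, term, 0, term.length())) {
                return i;
            }
        }
        return -1;
    }

    // Split the text node around each match and wrap the match in <hit>.
    static void highlightTextNode(Text text, String term) {
        if (term.isEmpty()) {
            return;
        }
        int idx = indexOfIgnoreCase(text.getData(), term);
        while (idx >= 0) {
            Text match = text.splitText(idx);            // text keeps the prefix
            Text rest = match.splitText(term.length());  // match keeps only the term
            Element hit = text.getOwnerDocument().createElement("hit");
            match.getParentNode().replaceChild(hit, match);
            hit.appendChild(match);
            text = rest;                                 // continue after the match
            idx = indexOfIgnoreCase(text.getData(), term);
        }
    }

    // Parse, highlight all text nodes, and serialize back to a string.
    public static String highlightXml(String xml, String term) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<Text> texts = new ArrayList<>();
        collectTextNodes(doc.getDocumentElement(), texts);
        for (Text t : texts) {
            highlightTextNode(t, term);
        }
        Transformer tf = TransformerFactory.newInstance().newTransformer();
        tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        tf.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(
                highlightXml("<doc><p>fred is one of the people</p></doc>", "people"));
    }
}
```

The text nodes are collected into a list before any of them is modified, so splitting nodes does not disturb the traversal. Matches that span element boundaries are not handled in this sketch.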

Re: Highlighting issues

2007-02-27 Thread mark harwood
This snippet from the Highlighter JUnit test should reveal the solution: public void testFieldSpecificHighlighting() throws IOException, ParseException { String docMainText="fred is one of the people"; QueryParser parser=new QueryParser(FIELD_NAME,analyzer); Query

Re: Highlighting brackets bug ?

2007-01-15 Thread Steven Rowe
heikki doeleman wrote: > One question though .. is there an easy way to download the sources > from the svn repository, in one go ? I did it now by right-clicking > links to files The "Source Code" section of the Lucene Java Developer Resources page

Re: Highlighting brackets bug ?

2007-01-15 Thread Mark Miller
Depends on your environment...I think you need an SVN client to efficiently check out many files...ever try tortoise SVN? I use eclipse and the subclipse plugin...working with the latest version of Lucene is as easy as adding the Lucene svn browse url and then checking Lucene out as a new proje

Re: Highlighting brackets bug ?

2007-01-15 Thread heikki doeleman
yes, thank you! I built a highlighter jar from here http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/highlighter/src/java/org/apache/lucene/search/highlight/ (revision 496265) and it is working fine! One question though .. is there an easy way to download the sources from the svn reposi

Re: Highlighting brackets bug ?

2007-01-13 Thread Mark Miller
Which version are you using? I believe that this is a bug that was fixed last August...but that the fix is only in the 2.1 Highlighter version. Try grabbing the latest highlighter code from the trunk. - Mark heikki doeleman wrote: Hi there, I'm having some strange behaviour using the highlig

Re: Highlighting span for Phrase Queries

2006-11-14 Thread Erik Hatcher
On Nov 14, 2006, at 4:08 AM, Heikki Doeleman wrote: thanks for pointing out these, however neither seems to do exactly what I want, i.e. highlight a phrase when a phrase search was done. A technique I've employed for a client is to convert a general Query object into a SpanQuery, and creat

Re: Highlighting span for Phrase Queries

2006-11-14 Thread Heikki Doeleman
To java-user@lucene.apache.org cc bcc Subject Re: Highlighting span for Phrase Queries mark harwood <[EMAIL PROTECTED]> Please respond to java-user@lucene.apache.org 10/11/2006 17:46 There have been a couple of alternative Highlighter contributions recently, I can't recall whi

Re: Highlighting span for Phrase Queries

2006-11-10 Thread mark harwood
There have been a couple of alternative Highlighter contributions recently, I can't recall which claim to support "proper" highlighting of phrases but you might want to give them a try. http://issues.apache.org/jira/browse/LUCENE-644 http://issues.apache.org/jira/browse/LUCENE-663 Ultimately

Re: Highlighting "really" found terms

2006-10-27 Thread Shane
Is your objective to avoid highlighting matching tokens which are not in a phrase? I recently received the request to avoid highlighting single tokens which appear in the hit (vs. sequences of matched tokens). I have just completed a partial re-write of the getBestTextFragments to allow this.
