^5 ;)

On Mon, Mar 25, 2013 at 11:02 PM, Bushman, Lamont <bus08...@byui.edu> wrote:
> Thank you very much for the help Simon.  I am amazed I was able to accomplish 
> what I wanted.  I didn't store the body in the Index.  And I used Highlighter 
> to return the best fragments by parsing my original document.
> ________________________________________
> From: Simon Willnauer [simon.willna...@gmail.com]
> Sent: Monday, March 25, 2013 4:07 AM
> To: java-user@lucene.apache.org
> Subject: Re: Compression and Highlighter
>
> On Mon, Mar 25, 2013 at 8:13 AM, Bushman, Lamont <bus08...@byui.edu> wrote:
>>     I have a project where I need to index documents using Lucene 4.1.0.
>> One of the fields for the stored Document is the actual text from the
>> document (.pdf, .docx, etc.).  I want to be able to highlight text from the
>> documents in the search results.  I was looking at some older tutorials
>> about storing the field with TermVectors and also storing it in the index
>> with Store.COMPRESS.  However, Lucene 4.1 has done away with
>> Store.COMPRESS.  Is there still a way to compress the field?
>
> Lucene 4.1 uses a compressed stored-fields format under the hood.
> The compression is completely transparent and enabled by
> default. Here is some background:
> http://blog.jpountz.net/post/33247161884/efficient-compressed-stored-fields-with-lucene
>
>>     I am worried about the amount of space the index will take up
>> if I have to store the "body" Field uncompressed.
>>     Are there ways around having to store the whole Field in its original
>> form?
>>     Since I am already going to be storing the actual documents on the
>> server, would it be feasible (time-wise) to not store TermVectors or store
>> the field at all until the user searches for a document?  Then at runtime I
>> could re-index the top docs from the original documents in RAM and use
>> Highlighter to return fragments.
>
> This is what the Highlighter does if you are not using the
> FastVectorHighlighter. You can just pass in the string value you want to
> highlight, whether or not it is stored in Lucene. You just need
> to check whether the performance works for you without storing term vectors.
>
> simon
>>
>> Thanks
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
>
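For anyone finding this thread later, the approach described above (no stored
body, no term vectors, highlight from the original text at search time) can be
sketched roughly as follows. This is only a minimal sketch, not the code used
in the thread. It assumes the Lucene 4.1 core, analyzers-common, queryparser
and highlighter jars are on the classpath, and that the document text has
already been extracted from the .pdf/.docx file into a String. The class name
HighlightSketch, the method bestFragments and the field name "body" are made
up for illustration.

```java
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.util.Version;

// Hypothetical helper: highlights text that is NOT stored in the index.
public class HighlightSketch {

    // Assumed field name; use whatever field the body was indexed under.
    static final String FIELD = "body";

    static String[] bestFragments(String userQuery, String text, int maxFragments)
            throws ParseException, IOException, InvalidTokenOffsetsException {
        // Should be the same kind of analyzer that was used at index time.
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_41);
        Query query = new QueryParser(Version.LUCENE_41, FIELD, analyzer).parse(userQuery);

        // QueryScorer ranks fragments by the query terms they contain;
        // SimpleHTMLFormatter wraps each matching term in the given tags.
        Highlighter highlighter = new Highlighter(
                new SimpleHTMLFormatter("<b>", "</b>"),
                new QueryScorer(query, FIELD));

        // The text is re-analyzed here, at search time, so neither a stored
        // field nor term vectors are needed for highlighting.
        return highlighter.getBestFragments(analyzer, FIELD, text, maxFragments);
    }

    public static void main(String[] args) throws Exception {
        // In a real application this string would come from the original file.
        String text = "The quick brown fox jumps over the lazy dog.";
        for (String fragment : bestFragments("quick fox", text, 3)) {
            System.out.println(fragment);
        }
    }
}
```

Since the text is re-analyzed on every call, the cost grows with document
length, which is the performance trade-off Simon mentions. By default the
Highlighter only analyzes a limited prefix of the text; if matches deep in
long documents matter, setMaxDocCharsToAnalyze on the Highlighter is probably
the knob to look at.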
