There are two answers:

It's often a good idea if you mostly need the full representation in one
call. For example, we keep the complete XML representation in a single
stored field and use it for display with XSLT and so on. The other fields
are for indexing only and are not stored.

BUT:

If you only need parts of the document and want to use FieldSelectors for
faster fetching of stored fields, you should store the fields separately.
The same goes for highlighting: if you use the old highlighter, you also
need separate stored fields for highlighting queries on the same field. The
fast vector highlighter only uses term vectors for highlighting, but you
must then make sure that the indexed offsets are correct for the full field.

Often a mix is also good: we have one stored field with the full XML
representation for the single-document view, a short fragment for the result
list in another stored field, and so on. It would even be possible to store
the whole PDF file as a binary stored field if you like.
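
To make the trade-off concrete, here is a minimal plain-Java sketch of the
"mix" described above. It only models the idea and does not use Lucene
classes: the class StoredFieldsSketch, its methods and the field names "xml"
and "fragment" are all made up for illustration, and the character count is
just a rough stand-in for the cost of fetching stored fields.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Plain-Java model of one document's stored fields. Hypothetical; not Lucene's API. */
final class StoredFieldsSketch {

    /** The "mix": full XML for the single-document view plus a short fragment for result lists. */
    static Map<String, String> storeMixed(String fullXml, String fragment) {
        Map<String, String> stored = new LinkedHashMap<>();
        stored.put("xml", fullXml);
        stored.put("fragment", fragment);
        return stored;
    }

    /** Returns only the requested stored fields, which is the effect a field selector has. */
    static Map<String, String> load(Map<String, String> stored, Set<String> wanted) {
        Map<String, String> loaded = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : stored.entrySet()) {
            if (wanted.contains(e.getKey())) {
                loaded.put(e.getKey(), e.getValue());
            }
        }
        return loaded;
    }

    /** Total characters handed back to the caller; a rough stand-in for fetch cost. */
    static int charsLoaded(Map<String, String> loaded) {
        int total = 0;
        for (String value : loaded.values()) {
            total += value.length();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, String> doc = storeMixed(
                "<release><title>Abbey Road</title><artist>The Beatles</artist></release>",
                "Abbey Road");
        // Result list: only the short fragment is needed.
        Map<String, String> hit = load(doc, Collections.singleton("fragment"));
        // Single-document view: the full XML is needed.
        Map<String, String> view = load(doc, Collections.singleton("xml"));
        System.out.println("result list loads " + charsLoaded(hit) + " chars");
        System.out.println("document view loads " + charsLoaded(view) + " chars");
    }
}
```

With everything in one stored field, the result list would have to load the
full XML for every hit. With the fragment stored separately, it loads only
the short text and leaves the XML for the single-document view.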

So it depends on what you want to do.

I always tell my customers to differentiate strictly between stored and
indexed fields. The revision of the document API will enforce this in the
future.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -----Original Message-----
> From: Paul Taylor [mailto:paul_t...@fastmail.fm]
> Sent: Monday, November 30, 2009 7:01 PM
> To: java-user@lucene.apache.org
> Subject: Deciding on strategy for storing indexed fields
> 
> Currently in our Lucene Search we have a number of distinct fields that
> are indexed and stored, so that the fields can be searched and we can
> then construct an xml representation of the match
> (http://wiki.musicbrainz.org/Next_Generation_Schema/SearchServerXML) but
> on further reading it appears that the field is stored twice quite
> independently, and sometimes we store '-' fields just so that we can
> match up pairs of fields
> (http://wiki.musicbrainz.org/User:Murdos/NGS_Search_Server). I'm
> wondering if it would make sense (and actually use less space) to have
> the same indexed fields but only have one stored field which could be a
> serialized representation of the JAXB class we use to output the xml for
> the matches. This could have the additional benefit that on doing a
> search the JAXB classes for returning the matching results are already
> created so the search would be quicker.
> 
> Is this a good idea?
> 
> thanks Paul
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org


