[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField

Andrzej Bialecki (JIRA) Thu, 31 May 2012 13:38:25 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286909#comment-13286909
 ]


Andrzej Bialecki  commented on LUCENE-3312:
-------------------------------------------

Comments to patch 04:

* index.Document is an interface. I think that, for better extensibility in the 
future, it could be an abstract class. Who knows what we will want to put there 
in addition to the iterators...
* as noted on IRC, this strong decoupling of stored and indexed content poses 
some interesting questions:
** since you can add multiple fields with the same name, you can now add an 
arbitrary sequence of Stored and Indexed fields (all with the same name). This 
means that you can now store parts of a field that are not indexed, and parts 
of a field that are indexed but not stored.
** previously, if a field was flagged as indexed but didn't have a tokenStream, 
its String or Reader value would be used to create a token stream. Now if you 
want a value to be stored and indexed you have to add two fields with the same 
name - one StoredField and the other an IndexedField for which you create a 
token stream from the value. My assumption is that StoredField-s will no longer 
be used as potential sources of token streams?
* maybe this is a good moment to change all getters that return arrays of 
fields or values to return List-s, since all the code is doing underneath is 
collecting them into lists and then converting to arrays?
* previously we allowed one to remove fields from a document by name. Are we 
going to allow this now separately for indexed and stored fields?
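
To make the points above concrete, here is a minimal, self-contained sketch. All class and method names in it (SketchDocument, SketchStoredField, SketchIndexedField, removeStored, removeIndexed and so on) are hypothetical stand-ins invented for illustration; they are not the types or signatures from the patch. It assumes a document modelled as an abstract class exposing two iterators, getters that return List-s rather than arrays, and removal by name handled separately for stored and indexed fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-ins for illustration only; NOT the types from the patch.
final class SketchStoredField {
    final String name;
    final String value;
    SketchStoredField(String name, String value) { this.name = name; this.value = value; }
}

final class SketchIndexedField {
    final String name;
    final List<String> tokens;
    SketchIndexedField(String name, String text) {
        this.name = name;
        // Toy "analysis": lowercase and split on whitespace.
        this.tokens = Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\s+"));
    }
}

// Abstract class rather than interface: concrete helpers such as the
// List-returning getters below can be added later without touching subclasses.
abstract class SketchDocument {
    abstract Iterable<SketchStoredField> storedFields();
    abstract Iterable<SketchIndexedField> indexedFields();

    // Returns a List directly instead of collecting into a list and copying to an array.
    List<String> getStoredValues(String name) {
        List<String> values = new ArrayList<>();
        for (SketchStoredField f : storedFields()) {
            if (f.name.equals(name)) values.add(f.value);
        }
        return values;
    }

    List<String> getIndexedTokens(String name) {
        List<String> tokens = new ArrayList<>();
        for (SketchIndexedField f : indexedFields()) {
            if (f.name.equals(name)) tokens.addAll(f.tokens);
        }
        return tokens;
    }
}

final class SimpleSketchDocument extends SketchDocument {
    private final List<SketchStoredField> stored = new ArrayList<>();
    private final List<SketchIndexedField> indexed = new ArrayList<>();

    void add(SketchStoredField f) { stored.add(f); }
    void add(SketchIndexedField f) { indexed.add(f); }

    // Removal by name, handled separately for each kind of field.
    void removeStored(String name) { stored.removeIf(f -> f.name.equals(name)); }
    void removeIndexed(String name) { indexed.removeIf(f -> f.name.equals(name)); }

    @Override Iterable<SketchStoredField> storedFields() { return stored; }
    @Override Iterable<SketchIndexedField> indexedFields() { return indexed; }
}

class SketchDemo {
    public static void main(String[] args) {
        SimpleSketchDocument doc = new SimpleSketchDocument();
        // Same field name, mixed kinds: one part stored and indexed,
        // another part indexed but not stored.
        doc.add(new SketchStoredField("body", "Hello World"));
        doc.add(new SketchIndexedField("body", "Hello World"));
        doc.add(new SketchIndexedField("body", "extra terms"));
        System.out.println("stored:  " + doc.getStoredValues("body"));
        System.out.println("indexed: " + doc.getIndexedTokens("body"));
        doc.removeIndexed("body"); // the stored part is left in place
        System.out.println("stored after removeIndexed: " + doc.getStoredValues("body"));
    }
}
```

Note that in this sketch a value that should be both stored and indexed has to be added twice under the same name, which is exactly the situation the questions above are about.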

* minor nit: there's a grammar mistake in Field.setTokenStream(..): 
"TokenStream fields tokenized".
                
> Break out StorableField from IndexableField
> -------------------------------------------
>
>                 Key: LUCENE-3312
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3312
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Michael McCandless
>            Assignee: Nikola Tankovic
>              Labels: gsoc2012, lucene-gsoc-12
>             Fix For: Field Type branch
>
>         Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, 
> lucene-3312-patch-03.patch, lucene-3312-patch-04.patch
>
>
> In the field type branch we have strongly decoupled
> Document/Field/FieldType impl from the indexer, by having only a
> narrow API (IndexableField) passed to IndexWriter.  This frees apps up
> to use their own "documents" instead of the "user-space" impls we provide
> in oal.document.
> Similarly, with LUCENE-3309, we've done the same thing on the
> doc/field retrieval side (from IndexReader), with the
> StoredFieldsVisitor.
> But, maybe we should break out StorableField from IndexableField,
> such that when you index a doc you provide two Iterables -- one for the
> IndexableFields and one for the StorableFields.  Either can be null.
> One downside is possible perf hit for fields that are both indexed &
> stored (ie, we visit them twice, lookup their name in a hash twice,
> etc.).  But the upside is a cleaner separation of concerns in API....
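
The two-Iterable shape described in the issue could look roughly like the sketch below. Everything here is an assumption made for illustration: ToyWriter, ToyIndexableField, ToyStorableField, ToyField and the nameLookups counter are hypothetical and are not IndexWriter or the interfaces from the patch. The sketch only shows that either Iterable may be null, and that a field which is both indexed and stored is visited (and its name looked up in a hash) twice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical, simplified field views; not the interfaces from the patch.
interface ToyIndexableField { String name(); List<String> tokens(); }
interface ToyStorableField { String name(); String value(); }

// A field that is both indexed and stored implements both views.
final class ToyField implements ToyIndexableField, ToyStorableField {
    private final String name;
    private final String value;
    ToyField(String name, String value) { this.name = name; this.value = value; }
    public String name() { return name; }
    public String value() { return value; }
    public List<String> tokens() {
        // Toy "analysis": lowercase and split on whitespace.
        return Arrays.asList(value.toLowerCase(Locale.ROOT).split("\\s+"));
    }
}

// Toy stand-in for the indexer: keeps everything in in-memory maps.
final class ToyWriter {
    final Map<String, List<String>> postings = new HashMap<>(); // field name -> tokens
    final Map<String, List<String>> stored = new HashMap<>();   // field name -> stored values
    int nameLookups = 0; // counts hash lookups by field name

    // Either argument may be null, as proposed in the issue description.
    void addDocument(Iterable<? extends ToyIndexableField> indexable,
                     Iterable<? extends ToyStorableField> storable) {
        if (indexable != null) {
            for (ToyIndexableField f : indexable) {
                nameLookups++;
                postings.computeIfAbsent(f.name(), k -> new ArrayList<>()).addAll(f.tokens());
            }
        }
        if (storable != null) {
            for (ToyStorableField f : storable) {
                nameLookups++;
                stored.computeIfAbsent(f.name(), k -> new ArrayList<>()).add(f.value());
            }
        }
    }
}

class TwoIterablesDemo {
    public static void main(String[] args) {
        ToyWriter writer = new ToyWriter();
        ToyField title = new ToyField("title", "Lucene in Action");
        // Indexed and stored: the same field is passed in both Iterables.
        writer.addDocument(Arrays.asList(title), Arrays.asList(title));
        // Stored only: no indexable fields at all.
        writer.addDocument(null, Arrays.asList(title));
        System.out.println("postings:     " + writer.postings);
        System.out.println("stored:       " + writer.stored);
        System.out.println("name lookups: " + writer.nameLookups);
    }
}
```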
