[ 
https://issues.apache.org/jira/browse/LUCENE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655529#comment-13655529
 ] 

Michael McCandless commented on LUCENE-4583:
--------------------------------------------

bq. I have to admit that I still don't have a 100% handle on the use case(s) 
for docvalues vs. stored fields, even though I've asked on the list. I mean, 
sometimes the chatter seems to suggest that dv is the successor to stored 
values. Hmmm... in that case, I should be able to store the full text of a 24 
MB PDF file in a dv. Now, I know that isn't true.

The big difference is that DV fields are stored column stride, so you
can decide on a field-by-field basis whether each one will be in RAM,
on disk, etc., and you get faster access if you know you need to work
with just one or two fields.

With stored fields, by contrast, all fields for one document are
stored "together".

Each has different tradeoffs, so it's really up to the app to decide
which is best. If you know you need 12 fields loaded for each
document you are presenting on the current page, stored fields are
probably best.

But if you need one field to use as a scoring factor (e.g. maybe you
are boosting by recency), then column-stride is better.
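
As a rough illustration of the two layouts, here is a minimal sketch in
plain Java. It deliberately does not use the Lucene API: RowStore,
LongColumn, recencyBoost and boostedScores are hypothetical names made up
for this example, and the 30-day decay is an arbitrary assumption. It only
shows why reading one value per document is cheap when that field sits in
its own column, while a row store hands back every field of the document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StorageLayoutSketch {

    /** Row-oriented ("stored fields" style): all fields of one document live together. */
    static final class RowStore {
        private final List<Map<String, Object>> docs = new ArrayList<>();

        /** Adds a document and returns its doc id. */
        int add(Map<String, Object> fields) {
            docs.add(new LinkedHashMap<>(fields));
            return docs.size() - 1;
        }

        /** Loading a document yields every one of its fields at once. */
        Map<String, Object> document(int docId) {
            return docs.get(docId);
        }
    }

    /** Column-stride ("doc values" style): one growable array per field, indexed by doc id. */
    static final class LongColumn {
        private long[] values = new long[4];
        private int size = 0;

        void add(long value) {
            if (size == values.length) {
                values = Arrays.copyOf(values, size * 2);
            }
            values[size++] = value;
        }

        long get(int docId) {
            if (docId < 0 || docId >= size) {
                throw new IndexOutOfBoundsException("docId " + docId);
            }
            return values[docId];
        }
    }

    /** Hypothetical boost: 1.0 for a brand-new document, decaying with age in days. */
    static double recencyBoost(long docDay, long nowDay) {
        long ageDays = Math.max(0, nowDay - docDay);
        return 1.0 / (1.0 + ageDays / 30.0);
    }

    /** Scores every document while touching only the single timestamp column. */
    static double[] boostedScores(double[] baseScores, LongColumn timestamps, long nowDay) {
        double[] boosted = new double[baseScores.length];
        for (int docId = 0; docId < baseScores.length; docId++) {
            boosted[docId] = baseScores[docId] * recencyBoost(timestamps.get(docId), nowDay);
        }
        return boosted;
    }

    public static void main(String[] args) {
        // Presenting a result page: the row store returns all fields of a hit together.
        RowStore rows = new RowStore();
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("title", "hello");
        fields.put("author", "someone");
        int docId = rows.add(fields);
        System.out.println(rows.document(docId));

        // Scoring: only the timestamp column is read, one long per document.
        LongColumn timestamps = new LongColumn();
        timestamps.add(100);
        timestamps.add(70);
        System.out.println(Arrays.toString(
            boostedScores(new double[] {2.0, 2.0}, timestamps, 100)));
    }
}
```

In the scoring loop nothing but the timestamp column is touched, which is
the access pattern where column-stride storage pays off; the row store is
the better fit when a whole document is needed for display.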

                
> StraightBytesDocValuesField fails if bytes > 32k
> ------------------------------------------------
>
>                 Key: LUCENE-4583
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4583
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 4.0, 4.1, 5.0
>            Reporter: David Smiley
>            Priority: Critical
>             Fix For: 4.4
>
>         Attachments: LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch
>
>
> I didn't observe any limitations on the size of a bytes based DocValues field 
> value in the docs.  It appears that the limit is 32k, although I didn't get 
> any friendly error telling me that was the limit.  32k is kind of small IMO; 
> I suspect this limit is unintended and as such is a bug.    The following 
> test fails:
> {code:java}
>   public void testBigDocValue() throws IOException {
>     Directory dir = newDirectory();
>     IndexWriter writer = new IndexWriter(dir, writerConfig(false));
>     Document doc = new Document();
>     BytesRef bytes = new BytesRef((4+4)*4097);//4096 works
>     bytes.length = bytes.bytes.length;//byte data doesn't matter
>     doc.add(new StraightBytesDocValuesField("dvField", bytes));
>     writer.addDocument(doc);
>     writer.commit();
>     writer.close();
>     DirectoryReader reader = DirectoryReader.open(dir);
>     DocValues docValues = MultiDocValues.getDocValues(reader, "dvField");
>     //FAILS IF BYTES IS BIG!
>     docValues.getSource().getBytes(0, bytes);
>     reader.close();
>     dir.close();
>   }
> {code}

