[ 
https://issues.apache.org/jira/browse/LUCENE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-4583:
---------------------------------------

    Attachment: LUCENE-4583.patch

bq. I do not think this should change. 

OK I just removed that nocommit.

bq. I just had a quick discussion about this with Robert, and since 
AppendingLongBuffer stores deltas from the minimum value of the block (and not 
0), AppendingLongBuffer is better (ie. faster and more compact) than 
MonotonicAppendingLongBuffer to store lengths. This means that if all lengths 
are 7, 8 or 9 in a block, it will only require 2 bits per value instead of 4.

Ahh, OK: I switched back to AppendingLongBuffer.
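
To make the arithmetic in the quoted comment concrete, here is a minimal sketch of how many bits per value a block needs when values are stored as deltas from the block minimum versus as plain values relative to 0. It is not the AppendingLongBuffer implementation and does not model MonotonicAppendingLongBuffer's encoding; the class and method names (DeltaBits, bitsRequired, bitsFromZero, bitsFromMin) are made up for illustration, and it assumes a non-empty block of non-negative values.

```java
public class DeltaBits {
    /** Bits needed to store an unsigned value in [0, maxValue]; at least 1. */
    static int bitsRequired(long maxValue) {
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }

    /** Bits per value when each value is stored as-is (relative to 0). */
    static int bitsFromZero(long[] block) {
        long max = 0;
        for (long v : block) {
            max = Math.max(max, v);
        }
        return bitsRequired(max);
    }

    /** Bits per value when each value is stored as (value - block minimum). */
    static int bitsFromMin(long[] block) {
        // Assumes block is non-empty and all values are non-negative.
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long v : block) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return bitsRequired(max - min);
    }

    public static void main(String[] args) {
        long[] lengths = {7, 8, 9, 8, 7, 9};
        System.out.println("from 0:   " + bitsFromZero(lengths)); // 9 = 0b1001
        System.out.println("from min: " + bitsFromMin(lengths));  // 9 - 7 = 2 = 0b10
    }
}
```

For lengths of 7, 8 and 9 the largest delta from the minimum is 2, which fits in 2 bits, whereas the raw value 9 needs 4.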

New patch.  I factored testHugeBinaryValues up into the
BaseDocValuesFormatTestCase base class, and added a protected method so
that codecs that don't accept huge binary values can say so.  I also
added a test case for Facet42DVFormat, and cut back to
AppendingLongBuffer.
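
As a rough illustration of the "protected method so codecs can say so" pattern, the sketch below uses plain Java with no Lucene or JUnit dependency. Everything in it is hypothetical: the names (HugeValueCheck, Base, acceptsHugeBinaryValues, store), the 32 * 1024 limit, and the use of IllegalArgumentException are assumptions made for the example and are not taken from the patch.

```java
public class HugeValueCheck {
    /** Shared check; subclasses describe one (hypothetical) format each. */
    abstract static class Base {
        // Hypothetical hook: a format that rejects huge values overrides this.
        protected boolean acceptsHugeBinaryValues() {
            return true;
        }

        // Hypothetical stand-in for writing one binary value with the format.
        protected abstract void store(byte[] value);

        /** Stores a value of the given length and checks the declared behavior. */
        public String check(int length) {
            byte[] value = new byte[length]; // contents do not matter
            try {
                store(value);
            } catch (IllegalArgumentException e) {
                if (acceptsHugeBinaryValues()) {
                    throw new AssertionError("format claims to accept huge values", e);
                }
                return "rejected as expected";
            }
            if (!acceptsHugeBinaryValues()) {
                throw new AssertionError("format should have rejected the value");
            }
            return "accepted";
        }
    }

    /** A format with no size limit: keeps the default hook. */
    static class Unlimited extends Base {
        @Override
        protected void store(byte[] value) {
            // Accepts any length.
        }
    }

    /** A format with an illustrative size limit: opts out via the hook. */
    static class Limited extends Base {
        static final int MAX_LENGTH = 32 * 1024; // assumed limit, for illustration

        @Override
        protected boolean acceptsHugeBinaryValues() {
            return false;
        }

        @Override
        protected void store(byte[] value) {
            if (value.length > MAX_LENGTH) {
                throw new IllegalArgumentException("value too large: " + value.length);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(new Unlimited().check(100_000));
        System.out.println(new Limited().check(100_000));
    }
}
```

The point of the design is that the shared test lives in one place, and a format with a size limit only has to override a single method rather than copy or disable the test.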

I downgraded the nocommit about the spooky unused PagedBytes.blockEnds
to a TODO ... this class is somewhat dangerous because e.g. you can
use the copyUsingLengthPrefix method, then call getDataOutput, and get
corrupted bytes out.

                
> StraightBytesDocValuesField fails if bytes > 32k
> ------------------------------------------------
>
>                 Key: LUCENE-4583
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4583
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 4.0, 4.1, 5.0
>            Reporter: David Smiley
>            Priority: Critical
>             Fix For: 4.4
>
>         Attachments: LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, 
> LUCENE-4583.patch, LUCENE-4583.patch
>
>
> I didn't observe any limitations on the size of a bytes based DocValues field 
> value in the docs.  It appears that the limit is 32k, although I didn't get 
> any friendly error telling me that was the limit.  32k is kind of small IMO; 
> I suspect this limit is unintended and as such is a bug.    The following 
> test fails:
> {code:java}
>   public void testBigDocValue() throws IOException {
>     Directory dir = newDirectory();
>     IndexWriter writer = new IndexWriter(dir, writerConfig(false));
>     Document doc = new Document();
>     BytesRef bytes = new BytesRef((4+4)*4097);//4096 works
>     bytes.length = bytes.bytes.length;//byte data doesn't matter
>     doc.add(new StraightBytesDocValuesField("dvField", bytes));
>     writer.addDocument(doc);
>     writer.commit();
>     writer.close();
>     DirectoryReader reader = DirectoryReader.open(dir);
>     DocValues docValues = MultiDocValues.getDocValues(reader, "dvField");
>     //FAILS IF BYTES IS BIG!
>     docValues.getSource().getBytes(0, bytes);
>     reader.close();
>     dir.close();
>   }
> {code}
