[ https://issues.apache.org/jira/browse/LUCENE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-4583:
---------------------------------------
Attachment: LUCENE-4583.patch
Another iteration on the patch:
* I added a MAX_BINARY_FIELD_LENGTH constant to
Lucene4{0,2}DocValuesFormat, and now reference it in the
Writer/Consumer to catch too-big values.
* I added another test to BaseDocValuesFormatTestCase, to test the
exact maximum length value.
* I fixed that test failure by passing the String field to the
codecAcceptsHugeBinaryValues method, and by adding a _TestUtil helper
method to check this.
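The length check from the first bullet could look roughly like the
sketch below. The class name BinaryLengthCheck, the method checkLength,
the message text, and the value 32766 are all assumptions made for
illustration; the real constant lives in Lucene4{0,2}DocValuesFormat and
the real check is in the patch.

```java
// Hypothetical stand-alone sketch; not the code from the patch.
final class BinaryLengthCheck {
    // Assumed value, for illustration only.
    static final int MAX_BINARY_FIELD_LENGTH = (1 << 15) - 2;

    private BinaryLengthCheck() {}

    // Throws a clean IllegalArgumentException for too-big values,
    // instead of failing later with an unfriendly error.
    static void checkLength(String field, int length) {
        if (length > MAX_BINARY_FIELD_LENGTH) {
            throw new IllegalArgumentException(
                "DocValues field \"" + field + "\" is too large: " + length
                + " bytes, must be <= " + MAX_BINARY_FIELD_LENGTH);
        }
    }
}
```

A value of exactly MAX_BINARY_FIELD_LENGTH bytes passes, which is the
boundary the new exact-maximum-length test exercises.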
An alternative to the protected method would be to have two separate
tests in the base class: one verifying that a clean
IllegalArgumentException is thrown when the value is too big, and
another verifying that huge binary values can be indexed successfully.
I'd then fix each DVFormat's test to subclass and @Ignore whichever
base test is not appropriate. But I don't think this would simplify
things much? I.e., TestDocValuesFormat would still need logic to check
depending on the default codec.
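For comparison, the protected-method approach can be sketched in plain
Java as below. Only the hook name codecAcceptsHugeBinaryValues comes
from the patch description; its return type, the other class and method
names, and the 32766 limit are hypothetical, and the sketch uses no
Lucene or JUnit classes.

```java
// Hypothetical stand-in for the base test: one check, two outcomes,
// selected by a hook that each subclass overrides.
abstract class HugeBinaryCheck {
    // Hook named after the method in the patch; the signature is assumed.
    protected abstract boolean codecAcceptsHugeBinaryValues(String field);

    // Hypothetical stand-in for indexing one binary value.
    protected abstract void addBinaryValue(String field, byte[] value);

    // Call with a length well above any codec limit. Returns "accepted"
    // or "rejected"; throws AssertionError when the codec's behavior
    // disagrees with what the hook promised.
    final String checkHugeValue(String field, int hugeLength) {
        boolean expectAccept = codecAcceptsHugeBinaryValues(field);
        try {
            addBinaryValue(field, new byte[hugeLength]);
        } catch (IllegalArgumentException e) {
            if (expectAccept) {
                throw new AssertionError("unexpected rejection", e);
            }
            return "rejected";
        }
        if (!expectAccept) {
            throw new AssertionError("expected IllegalArgumentException");
        }
        return "accepted";
    }
}

// Hypothetical codec with a hard cap on binary value length.
class LimitedCodecCheck extends HugeBinaryCheck {
    static final int LIMIT = 32766; // assumed value, for illustration only

    @Override
    protected boolean codecAcceptsHugeBinaryValues(String field) {
        return false;
    }

    @Override
    protected void addBinaryValue(String field, byte[] value) {
        if (value.length > LIMIT) {
            throw new IllegalArgumentException(field + " is too large");
        }
    }
}

// Hypothetical codec that accepts values of any length.
class UnlimitedCodecCheck extends HugeBinaryCheck {
    @Override
    protected boolean codecAcceptsHugeBinaryValues(String field) {
        return true;
    }

    @Override
    protected void addBinaryValue(String field, byte[] value) {
        // Nothing to reject.
    }
}
```

With this shape, each format's test only overrides the hook, and the
base class keeps a single test covering both outcomes.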
> StraightBytesDocValuesField fails if bytes > 32k
> ------------------------------------------------
>
> Key: LUCENE-4583
> URL: https://issues.apache.org/jira/browse/LUCENE-4583
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/index
> Affects Versions: 4.0, 4.1, 5.0
> Reporter: David Smiley
> Priority: Critical
> Fix For: 4.4
>
> Attachments: LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch,
> LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch
>
>
> I didn't observe any limitations on the size of a bytes based DocValues field
> value in the docs. It appears that the limit is 32k, although I didn't get
> any friendly error telling me that was the limit. 32k is kind of small IMO;
> I suspect this limit is unintended and as such is a bug. The following
> test fails:
> {code:java}
> public void testBigDocValue() throws IOException {
>   Directory dir = newDirectory();
>   IndexWriter writer = new IndexWriter(dir, writerConfig(false));
>   Document doc = new Document();
>   BytesRef bytes = new BytesRef((4+4)*4097); // 4096 works
>   bytes.length = bytes.bytes.length; // byte data doesn't matter
>   doc.add(new StraightBytesDocValuesField("dvField", bytes));
>   writer.addDocument(doc);
>   writer.commit();
>   writer.close();
>   DirectoryReader reader = DirectoryReader.open(dir);
>   DocValues docValues = MultiDocValues.getDocValues(reader, "dvField");
>   // FAILS IF BYTES IS BIG!
>   docValues.getSource().getBytes(0, bytes);
>   reader.close();
>   dir.close();
> }
> {code}