Jonathan Vexler created HUDI-9607:
-------------------------------------

             Summary: Flink VARBINARY in array and map field index oob read issue
                 Key: HUDI-9607
                 URL: https://issues.apache.org/jira/browse/HUDI-9607
             Project: Apache Hudi
          Issue Type: Bug
          Components: flink, reader-core
    Affects Versions: 1.0.2
            Reporter: Jonathan Vexler
             Fix For: 1.1.0


{code:java}
java.lang.RuntimeException: java.lang.IllegalArgumentException: 72 > 36
    at org.apache.hudi.common.table.read.TestHoodieFileGroupReaderBase.lambda$readRecordsFromFileGroup$9(TestHoodieFileGroupReaderBase.java:698)
    at java.util.ArrayList.forEach(ArrayList.java:1259)
    at org.apache.hudi.common.table.read.TestHoodieFileGroupReaderBase.readRecordsFromFileGroup(TestHoodieFileGroupReaderBase.java:691)
    at org.apache.hudi.common.table.read.TestHoodieFileGroupReaderBase.validateOutputFromFileGroupReaderWithNativeRecords(TestHoodieFileGroupReaderBase.java:560)
    at org.apache.hudi.common.table.read.TestHoodieFileGroupReaderBase.testSchemaEvolutionWhenBaseFilesWithDifferentSchema(TestHoodieFileGroupReaderBase.java:244)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.util.ArrayList.forEach(ArrayList.java:1259)
    at java.util.ArrayList.forEach(ArrayList.java:1259)
Caused by: java.lang.IllegalArgumentException: 72 > 36
    at java.util.Arrays.copyOfRange(Arrays.java:3519)
    at org.apache.flink.table.data.columnar.ColumnarArrayData.getBinary(ColumnarArrayData.java:138)
    at org.apache.hudi.table.format.cow.vector.ColumnarGroupRowData.getBinary(ColumnarGroupRowData.java:121)
    at org.apache.flink.table.data.RowData.lambda$createFieldGetter$245ca7d1$3(RowData.java:228)
    at org.apache.flink.table.runtime.typeutils.RowDataSerializer.toBinaryRow(RowDataSerializer.java:207)
    at org.apache.flink.table.data.writer.AbstractBinaryWriter.writeRow(AbstractBinaryWriter.java:147)
    at org.apache.flink.table.data.writer.BinaryArrayWriter.writeRow(BinaryArrayWriter.java:30)
    at org.apache.flink.table.data.writer.BinaryWriter.write(BinaryWriter.java:155)
{code}
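
For context on the message itself: `java.util.Arrays.copyOfRange(original, from, to)` takes an exclusive end index as its third argument, not a length, and throws `IllegalArgumentException` with the text `from > to` when `from` exceeds `to`. The sketch below reproduces the `72 > 36` message with plain JDK calls. It assumes, without confirmation from the trace, that the failing call received an offset of 72 and a length of 36 where an end index was expected. The class `CopyOfRangeDemo` and the helper `slice` are hypothetical names used only for illustration and are not part of Flink or Hudi.

```java
import java.util.Arrays;

class CopyOfRangeDemo {

    // Hypothetical helper: copies `len` bytes starting at `offset`.
    // The end index handed to copyOfRange is offset + len, not len.
    static byte[] slice(byte[] data, int offset, int len) {
        return Arrays.copyOfRange(data, offset, offset + len);
    }

    public static void main(String[] args) {
        byte[] data = new byte[128]; // assumed shared backing buffer
        int offset = 72;             // assumed start of the value
        int len = 36;                // assumed length of the value

        try {
            // Passing the length as the end index: from (72) > to (36).
            Arrays.copyOfRange(data, offset, len);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints 72 > 36
        }

        // Passing offset + len as the end index copies the intended bytes.
        System.out.println(slice(data, offset, len).length); // prints 36
    }
}
```

If that assumption holds, any value whose offset into the shared buffer is larger than its own length would fail this way, which would be consistent with the error appearing only for bytes fields nested in arrays and maps.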


Schema of the offending field:
{code:java}
{
  "type" : "map",
  "values" : {
    "type" : "record",
    "name" : "customMapRecord",
    "doc" : "",
    "fields" : [ {
      "name" : "customFieldMap0",
      "type" : "int",
      "doc" : ""
    }, {
      "name" : "customFieldMap1",
      "type" : "int",
      "doc" : ""
    }, {
      "name" : "customFieldMap2",
      "type" : "int",
      "doc" : ""
    }, {
      "name" : "customFieldMap3",
      "type" : "int",
      "doc" : ""
    }, {
      "name" : "customFieldMap4",
      "type" : "int",
      "doc" : ""
    }, {
      "name" : "customFieldMap5",
      "type" : "long",
      "doc" : ""
    }, {
      "name" : "customFieldMap6",
      "type" : "long",
      "doc" : ""
    }, {
      "name" : "customFieldMap7",
      "type" : "long",
      "doc" : ""
    }, {
      "name" : "customFieldMap8",
      "type" : "long",
      "doc" : ""
    }, {
      "name" : "customFieldMap9",
      "type" : "float",
      "doc" : ""
    }, {
      "name" : "customFieldMap10",
      "type" : "float",
      "doc" : ""
    }, {
      "name" : "customFieldMap11",
      "type" : "float",
      "doc" : ""
    }, {
      "name" : "customFieldMap12",
      "type" : "double",
      "doc" : ""
    }, {
      "name" : "customFieldMap13",
      "type" : "double",
      "doc" : ""
    }, {
      "name" : "customFieldMap14",
      "type" : "string",
      "doc" : ""
    }, {
      "name" : "customFieldMap15",
      "type" : "string",
      "doc" : ""
    }, {
      "name" : "customFieldMap16",
      "type" : "bytes",
      "doc" : ""
    }, {
      "name" : "customFieldMap17",
      "type" : "bytes",
      "doc" : ""
    } ]
  }
}
{code}
Schema generated with the flag set to prevent bytes fields, so customFieldMap16 and customFieldMap17 are string instead of bytes (this schema does not cause the failure):
{code:java}
{
  "type" : "map",
  "values" : {
    "type" : "record",
    "name" : "customMapRecord",
    "doc" : "",
    "fields" : [ {
      "name" : "customFieldMap0",
      "type" : "int",
      "doc" : ""
    }, {
      "name" : "customFieldMap1",
      "type" : "int",
      "doc" : ""
    }, {
      "name" : "customFieldMap2",
      "type" : "int",
      "doc" : ""
    }, {
      "name" : "customFieldMap3",
      "type" : "int",
      "doc" : ""
    }, {
      "name" : "customFieldMap4",
      "type" : "int",
      "doc" : ""
    }, {
      "name" : "customFieldMap5",
      "type" : "long",
      "doc" : ""
    }, {
      "name" : "customFieldMap6",
      "type" : "long",
      "doc" : ""
    }, {
      "name" : "customFieldMap7",
      "type" : "long",
      "doc" : ""
    }, {
      "name" : "customFieldMap8",
      "type" : "long",
      "doc" : ""
    }, {
      "name" : "customFieldMap9",
      "type" : "float",
      "doc" : ""
    }, {
      "name" : "customFieldMap10",
      "type" : "float",
      "doc" : ""
    }, {
      "name" : "customFieldMap11",
      "type" : "float",
      "doc" : ""
    }, {
      "name" : "customFieldMap12",
      "type" : "double",
      "doc" : ""
    }, {
      "name" : "customFieldMap13",
      "type" : "double",
      "doc" : ""
    }, {
      "name" : "customFieldMap14",
      "type" : "string",
      "doc" : ""
    }, {
      "name" : "customFieldMap15",
      "type" : "string",
      "doc" : ""
    }, {
      "name" : "customFieldMap16",
      "type" : "string",
      "doc" : ""
    }, {
      "name" : "customFieldMap17",
      "type" : "string",
      "doc" : ""
    } ]
  }
}
{code}
The test flag `supportBytesInArrayMap` can be used to expose the error. There 
are also TODOs marking code to remove once this is fixed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)