Pierre Villard created NIFI-13843:
-------------------------------------

             Summary: Unknown fields not dropped by JSON Writer as expected by 
specified schema
                 Key: NIFI-13843
                 URL: https://issues.apache.org/jira/browse/NIFI-13843
             Project: Apache NiFi
          Issue Type: Bug
          Components: Extensions
    Affects Versions: 2.0.0-M4, 1.27.0
            Reporter: Pierre Villard
            Assignee: Pierre Villard


Consider the following use case:
 * GenerateFlowFile (GFF) processor, generating a JSON document with 3 fields: a, b, and c
 * ConvertRecord with JSON Reader / JSON Writer
 ** Both reader and writer are configured with a schema only specifying fields 
a and b

The expected result is a JSON document that contains only fields a and b. 
Instead, field c is written as well.
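
To make the expectation concrete, here is a minimal standalone sketch of 
projecting a record's values onto the fields named by the schema. It does not 
use NiFi classes; the class name, the method projectToSchema() and the sample 
values 1, 2, 3 are assumptions made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ExpectedOutputSketch {

    // Keep only the entries whose names appear in the schema's field list.
    // Hypothetical helper, not part of the NiFi API.
    static Map<String, Object> projectToSchema(Map<String, Object> values, List<String> schemaFields) {
        Map<String, Object> projected = new LinkedHashMap<>();
        for (String field : schemaFields) {
            if (values.containsKey(field)) {
                projected.put(field, values.get(field));
            }
        }
        return projected;
    }

    public static void main(String[] args) {
        // Input as generated upstream: three fields (sample values are made up).
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("a", 1);
        input.put("b", 2);
        input.put("c", 3);

        // The schema only names a and b, so c should not survive the write.
        System.out.println(projectToSchema(input, List.of("a", "b")));
    }
}
```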

We're following the below path in the code:
 * AbstractRecordProcessor (L131)

{code:java}
Record firstRecord = reader.nextRecord(); {code}
In this case, the default method for nextRecord() is defined in RecordReader 
(L50)
{code:java}
default Record nextRecord() throws IOException, MalformedRecordException {
    return nextRecord(true, false);
} {code}
where we are NOT dropping the unknown fields, since the second argument 
(dropUnknownFields) is false. The Javadoc needs fixing here, as it says the 
opposite.

We get to 
{code:java}
writer.write(firstRecord); {code}
which gets us to
 * WriteJsonResult (L206)

Here, we do a check
{code:java}
isUseSerializeForm(record, writeSchema) {code}
which currently returns true when it should not. Because of this, we write the 
serialised form, which ignores the writer schema.

Inside isUseSerializeForm(), we check
{code:java}
record.getSchema().equals(writeSchema) {code}
But at this point record.getSchema() returns the schema defined in the reader, 
which is equal to the one defined in the writer, even though the record 
carries additional fields compared to the defined schema.

The suggested fix is to also add a check on
{code:java}
record.isDropUnknownFields() {code}
If dropUnknownFields is false, then we do not use the serialised form.
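
A rough sketch of the proposed condition, using a simplified stand-in rather 
than the real NiFi types. SketchRecord and its fields are hypothetical, the 
schema is reduced to a list of field names, and any other conditions the real 
isUseSerializeForm() applies are left out.

```java
import java.util.List;
import java.util.Optional;

class SerializedFormCheckSketch {

    /** Hypothetical stand-in for the parts of a record that the check inspects. */
    static final class SketchRecord {
        final List<String> schemaFields;       // field names of the schema attached to the record
        final Optional<String> serializedForm; // raw text kept from the reader, if any
        final boolean dropUnknownFields;       // whether unknown fields were dropped at read time

        SketchRecord(List<String> schemaFields, Optional<String> serializedForm, boolean dropUnknownFields) {
            this.schemaFields = schemaFields;
            this.serializedForm = serializedForm;
            this.dropUnknownFields = dropUnknownFields;
        }
    }

    static boolean isUseSerializeForm(SketchRecord record, List<String> writeSchemaFields) {
        if (!record.serializedForm.isPresent()) {
            return false;
        }
        // Existing condition: the record's schema must equal the writer schema.
        if (!record.schemaFields.equals(writeSchemaFields)) {
            return false;
        }
        // Proposed extra condition: if unknown fields were kept, the raw text
        // may hold fields outside the schema, so it must not be reused.
        return record.dropUnknownFields;
    }

    public static void main(String[] args) {
        List<String> schema = List.of("a", "b");
        SketchRecord kept = new SketchRecord(schema, Optional.of("{\"a\":1,\"b\":2,\"c\":3}"), false);
        SketchRecord dropped = new SketchRecord(schema, Optional.of("{\"a\":1,\"b\":2}"), true);

        System.out.println(isUseSerializeForm(kept, schema));    // false: fall back to field-by-field writing
        System.out.println(isUseSerializeForm(dropped, schema)); // true: serialised form can be reused
    }
}
```

With this condition, the record from the use case above (read with 
dropUnknownFields set to false) would be written field by field against the 
writer schema, so field c would no longer appear in the output.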

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
