[ 
https://issues.apache.org/jira/browse/NIFI-13843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pierre Villard updated NIFI-13843:
----------------------------------
    Status: Patch Available  (was: Open)

> Unknown fields not dropped by JSON Writer as expected by specified schema
> -------------------------------------------------------------------------
>
>                 Key: NIFI-13843
>                 URL: https://issues.apache.org/jira/browse/NIFI-13843
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: 2.0.0-M4, 1.27.0
>            Reporter: Pierre Villard
>            Assignee: Pierre Villard
>            Priority: Major
>
> Consider the following use case:
>  * GFF Processor, generating a JSON with 3 fields: a, b, and c
>  * ConvertRecord with JSON Reader / JSON Writer
>  ** Both reader and writer are configured with a schema only specifying 
> fields a and b
> The expected result is a JSON that only contains fields a and b.
> We're following the below path in the code:
>  * AbstractRecordProcessor (L131)
> {code:java}
> Record firstRecord = reader.nextRecord(); {code}
> In this case, the default method for nextRecord() is defined in RecordReader 
> (L50)
> {code:java}
> default Record nextRecord() throws IOException, MalformedRecordException {
>     return nextRecord(true, false);
> } {code}
> where we are NOT dropping the unknown fields (Java doc needs some fixing here 
> as it is saying the opposite)
> We get to 
> {code:java}
> writer.write(firstRecord); {code}
> which gets us to
>  * WriteJsonResult (L206)
> Here, we do a check
> {code:java}
> isUseSerializeForm(record, writeSchema) {code}
> which currently returns true when it should not. Because of this we write the 
> serialised form which ignores the writer schema.
> In this method isUseSerializeForm(), we do check
> {code:java}
> record.getSchema().equals(writeSchema) {code}
> But at this point record.getSchema() returns the schema defined in the reader 
> which is equal to the one defined in the writer - even though the record has 
> additional fields compared to the defined schema.
> The suggested fix is check is to also add a check on
> {code:java}
> record.isDropUnknownFields() {code}
> If dropUnknownFields is false, then we do not use the serialised form.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to