[ https://issues.apache.org/jira/browse/NIFI-14324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Samar closed NIFI-14324.
------------------------
> ExtractText Processor Adds Matched Values many Times Incorrectly
> ----------------------------------------------------------------
>
> Key: NIFI-14324
> URL: https://issues.apache.org/jira/browse/NIFI-14324
> Project: Apache NiFi
> Issue Type: Bug
> Components: NiFi API
> Affects Versions: 1.15.0, 1.25.0
> Reporter: Samar
> Priority: Major
> Attachments: Screenshot from 2025-03-05 12-57-41.png
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> The *ExtractText processor* is unexpectedly adding matched values *three
> times* when processing FlowFiles with a regex pattern. Even with a simple
> pattern like {{(?s)(.*)}}, the extracted value is stored *exactly three
> times* in the FlowFile attributes.
> h4. *Steps to Reproduce*
> # Create an *ExtractText* processor in NiFi.
> # Set the *Enable repeating capture group* property to {{true}}.
> # Use a simple regex pattern, such as {{(?s)(.*)}}.
> # Process a FlowFile containing any text.
> # Observe the extracted attributes.
> h4. *Expected Behavior*
> * The extracted text should be stored *once* in the attributes when matching
> the entire content.
> h4. *Actual Behavior*
> * The extracted text is added *three times*, leading to duplicate
> entries.
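
For anyone trying to narrow this down, the regex side of the report can be checked outside NiFi. The following is a minimal standalone sketch using only java.util.regex; it is not ExtractText's code, and the class and method names (RepeatingCaptureSketch, collectGroups) are made up for illustration. It lists every group of every successive match, which is roughly what a repeating-capture mode has to iterate over. How ExtractText then maps these groups to attribute names is not shown in this issue; my understanding is that it writes group 0, group 1, and a bare-name copy of the first group, but treat that as an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of NiFi.
class RepeatingCaptureSketch {

    // Returns one "groupIndex=value" entry per group (including group 0)
    // for every successive match that Matcher.find() reports.
    static List<String> collectGroups(String regex, String content) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(content);
        while (m.find()) {
            for (int g = 0; g <= m.groupCount(); g++) {
                out.add(g + "=" + m.group(g));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The pattern from the report, applied to a small sample payload.
        for (String entry : collectGroups("(?s)(.*)", "hello")) {
            System.out.println(entry);
        }
    }
}
```

With the pattern from the report, the whole content already appears twice in a single match, because group 0 (the entire match) and group 1 (the only capture group) cover the same text. Repeated find() calls also report a second, empty match at the end of the input, since {{.*}} can match zero characters. Either effect may contribute to the duplicate attributes described above, depending on the processor's settings.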