[ 
https://issues.apache.org/jira/browse/NIFI-15025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Handermann resolved NIFI-15025.
-------------------------------------
    Resolution: Fixed

> LookupFailureException with large HTTP responses due to BufferedInputStream 
> mark/reset limitation
> -------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-15025
>                 URL: https://issues.apache.org/jira/browse/NIFI-15025
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: RAVINARAYAN SINGH
>            Assignee: RAVINARAYAN SINGH
>            Priority: Major
>              Labels: ControllerService
>             Fix For: 2.7.0
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> When RestLookupService wraps a large HTTP response body stream in a 
> BufferedInputStream, NiFi record readers can fail with the following error:
> {code:java}
> org.apache.nifi.lookup.LookupFailureException: java.io.IOException: Resetting 
> to invalid mark
> Caused by: java.io.IOException: Resetting to invalid mark {code}
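> This happens because BufferedInputStream guarantees reset() only up to the 
> read limit passed to mark(). Beyond that, the mark survives only while the 
> data read since mark() still fits in the internal buffer (default 8 KB). Once 
> the reader consumes more than that before calling reset(), the mark is 
> invalidated and reset() throws the IOException shown above.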
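
The limitation described above can be reproduced outside NiFi. The following is a minimal sketch, assuming Java 11 or later (for readNBytes); the class name InvalidMarkDemo, the helper canReset, and the payload sizes are illustrative and are not taken from NiFi.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustrative demo class; not part of NiFi.
public class InvalidMarkDemo {

    // Marks the stream, reads bytesToRead bytes, then attempts reset().
    // Returns true if reset() succeeded, false if the mark had been invalidated.
    static boolean canReset(final int payloadSize, final int readLimit, final int bytesToRead)
            throws IOException {
        final byte[] payload = new byte[payloadSize];
        // Default BufferedInputStream buffer size is 8192 bytes.
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(payload))) {
            in.mark(readLimit);
            in.readNBytes(bytesToRead);
            try {
                in.reset();
                return true;
            } catch (final IOException e) {
                // BufferedInputStream reports "Resetting to invalid mark" here.
                return false;
            }
        }
    }

    public static void main(final String[] args) throws IOException {
        // 4 KB read after mark(1024): still inside the 8 KB buffer, reset() works.
        System.out.println(canReset(64 * 1024, 1024, 4 * 1024));   // true
        // 32 KB read after mark(1024): the buffer is refilled and the mark is dropped.
        System.out.println(canReset(64 * 1024, 1024, 32 * 1024));  // false
    }
}
```

The first call succeeds even though more than the read limit was consumed, because the data still sits in the 8 KB buffer. That may explain why the problem only shows up with larger responses.
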
> Code from 
> [RestLookupService.java|https://github.com/apache/nifi/blob/60330769f668abf963f8a32202841cafa10f1885/nifi-extension-bundles/nifi-standard-services/nifi-lookup-services-bundle/nifi-lookup-services/src/main/java/org/apache/nifi/lookup/RestLookupService.java#L383-L388]
> {code:java}
> final Record record;
> try (final InputStream is = responseBody.byteStream();
>      final InputStream bufferedIn = new BufferedInputStream(is)) {
>     record = handleResponse(bufferedIn, responseBody.contentLength(), context);
> } {code}
> h3. *Proposed Fix / Solutions*
>  # *Remove BufferedInputStream*
> Use responseBody.byteStream() directly if mark/reset is not required by the 
> reader.
>  # *Configurable Buffer Size*
> Introduce a NiFi property to configure the buffer size for streams that 
> require buffering. Default could remain 8 KB, but users may increase it for 
> larger payloads.
>  # *Spooling InputStream Wrapper (Recommended)*
> Provide a robust InputStream wrapper that preserves streaming while 
> supporting unlimited mark/reset via spooling to disk:
>  * mark() records the current absolute position.
>  * reset() replays bytes starting from the marked position.
>  * Additional bytes beyond the replay window are streamed and spooled 
> transparently.
>  * The temporary spool file is automatically deleted on stream close.
> This ensures NiFi processors handle large HTTP payloads correctly without 
> running into mark/reset limits or heap issues.
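
The spooling wrapper from option 3 could look roughly like the sketch below. This is an assumption-laden illustration, not the change that was merged for this issue: the class name SpoolingInputStream is hypothetical, every byte is spooled from the start of the stream rather than only after mark() (a simplification), the class is not thread-safe, and Java 9 or later is assumed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Objects;

// Hypothetical wrapper for illustration; not part of NiFi.
public class SpoolingInputStream extends InputStream {
    private final InputStream source;
    private final Path spoolFile;
    private final RandomAccessFile spool;
    private long position;          // absolute offset of the next byte returned to the caller
    private long spooled;           // number of source bytes written to the spool file so far
    private long markPosition = -1; // absolute offset recorded by mark(), or -1 if unmarked

    public SpoolingInputStream(final InputStream source) throws IOException {
        this.source = source;
        this.spoolFile = Files.createTempFile("spool-", ".tmp");
        this.spool = new RandomAccessFile(spoolFile.toFile(), "rw");
    }

    @Override
    public int read() throws IOException {
        final byte[] one = new byte[1];
        return read(one, 0, 1) == -1 ? -1 : one[0] & 0xFF;
    }

    @Override
    public int read(final byte[] b, final int off, final int len) throws IOException {
        Objects.checkFromIndexSize(off, len, b.length);
        if (len == 0) {
            return 0;
        }
        final int n;
        if (position < spooled) {
            // Replay window: serve bytes that were already spooled to disk.
            spool.seek(position);
            n = spool.read(b, off, (int) Math.min(len, spooled - position));
        } else {
            // Past the replay window: pull from the source and append to the spool.
            n = source.read(b, off, len);
            if (n > 0) {
                spool.seek(spooled);
                spool.write(b, off, n);
                spooled += n;
            }
        }
        if (n > 0) {
            position += n;
        }
        return n;
    }

    @Override
    public boolean markSupported() {
        return true;
    }

    @Override
    public void mark(final int readLimit) {
        // readLimit is ignored: the spool file imposes no fixed replay window.
        markPosition = position;
    }

    @Override
    public void reset() throws IOException {
        if (markPosition < 0) {
            throw new IOException("Stream has not been marked");
        }
        position = markPosition;
    }

    @Override
    public void close() throws IOException {
        try {
            source.close();
        } finally {
            try {
                spool.close();
            } finally {
                Files.deleteIfExists(spoolFile); // remove the temporary spool file
            }
        }
    }

    public static void main(final String[] args) throws IOException {
        final byte[] payload = new byte[64 * 1024];
        try (InputStream in = new SpoolingInputStream(new ByteArrayInputStream(payload))) {
            in.mark(1024);
            in.readNBytes(32 * 1024); // well past the 8 KB BufferedInputStream default
            in.reset();
            System.out.println(Arrays.equals(payload, in.readAllBytes()));
        }
    }
}
```

In the RestLookupService snippet quoted above, such a wrapper would take the place of new BufferedInputStream(is). The trade-off is disk I/O for every response, so a real implementation might keep a small in-memory window and spool only once it overflows.
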



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
