[jira] [Resolved] (NIFI-15681) Enhance PutElasticsearchJson to support NDJSON, JSON Array, and Single JSON input formats with size-based batching

Pierre Villard (Jira) Tue, 07 Apr 2026 07:16:09 -0700


     [ 
https://issues.apache.org/jira/browse/NIFI-15681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Pierre Villard resolved NIFI-15681.
-----------------------------------
    Fix Version/s: 2.9.0
         Assignee: Adam Turley
       Resolution: Fixed

> Enhance PutElasticsearchJson to support NDJSON, JSON Array, and Single JSON 
> input formats with size-based batching
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-15681
>                 URL: https://issues.apache.org/jira/browse/NIFI-15681
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>    Affects Versions: 2.8.0
>         Environment: Containerized NiFi 2.8.0 on RHEL 9
>            Reporter: Adam Turley
>            Assignee: Adam Turley
>            Priority: Major
>             Fix For: 2.9.0
>
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> The existing PutElasticsearchJson processor indexes only one JSON document 
> per FlowFile. In high-volume ingest scenarios, upstream flow logic must 
> therefore reshape the data before it can be sent to Elasticsearch, and large 
> datasets produce one FlowFile per document. This creates excessive NiFi 
> session overhead and makes it impractical to send pre-aggregated NDJSON or 
> JSON array payloads directly.
>
> This improvement enhances PutElasticsearchJson in place while remaining fully 
> backwards compatible with existing flows. No schema, Record Reader, or schema 
> registry is required; JSON is passed through directly, which makes the 
> processor suitable for dynamic or schema-less documents.
>
> Why not PutElasticsearchRecord?
>
> PutElasticsearchRecord is the right choice when data arrives in a structured, 
> well-known format (Avro, CSV, Parquet, etc.) and field-level type mapping, 
> schema enforcement, or schema evolution is needed. However, it introduces 
> significant overhead that is unnecessary in many JSON ingest pipelines:
>  * Schema requirement — a Record Reader and schema (via schema registry, 
> inferred, or embedded) must be defined and maintained. For JSON data with 
> dynamic fields, deeply nested structures, or schema-less designs, this is a 
> configuration burden with no benefit.
>  * Deserialization cost — PutElasticsearchRecord fully deserializes the input 
> into NiFi's internal Record object model and then re-serializes it to JSON 
> for the _bulk request. This is a two-way type conversion for data that is 
> already valid JSON, adding CPU and memory overhead on every document.
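
A minimal sketch of the pass-through idea described above, written in Python for brevity. It is not the processor's actual implementation. The function names, the format labels ("ndjson", "json_array", "single_json"), the index name, and the byte limit are all hypothetical. Only the _bulk layout, an action line followed by a source line with each line newline-terminated, is taken from Elasticsearch.

```python
import json
from typing import Iterable, Iterator, List


def iter_documents(payload: str, input_format: str) -> Iterator[str]:
    """Yield each document as a single-line JSON string.

    The format labels are hypothetical names chosen for this sketch.
    """
    if input_format == "ndjson":
        # Each non-empty line is already one document: pass it through
        # without parsing it into an object model.
        for line in payload.split("\n"):
            line = line.strip()
            if line:
                yield line
    elif input_format == "json_array":
        # Splitting an array into elements requires parsing it here;
        # a streaming parser could avoid holding the whole array in memory.
        for doc in json.loads(payload):
            yield json.dumps(doc, separators=(",", ":"))
    elif input_format == "single_json":
        # Re-serialize so a pretty-printed document fits on one line,
        # which the _bulk format requires.
        yield json.dumps(json.loads(payload), separators=(",", ":"))
    else:
        raise ValueError(f"unknown input format: {input_format}")


def build_bulk_batches(docs: Iterable[str], index: str,
                       max_batch_bytes: int) -> List[str]:
    """Group documents into _bulk request bodies bounded by size in bytes.

    A single entry larger than the limit is still sent, in its own batch.
    """
    action = json.dumps({"index": {"_index": index}})
    batches: List[str] = []
    current: List[str] = []
    current_size = 0
    for doc in docs:
        entry = f"{action}\n{doc}\n"
        entry_size = len(entry.encode("utf-8"))
        # Flush before the batch would exceed the limit.
        if current and current_size + entry_size > max_batch_bytes:
            batches.append("".join(current))
            current, current_size = [], 0
        current.append(entry)
        current_size += entry_size
    if current:
        batches.append("".join(current))
    return batches
```

For example, build_bulk_batches(iter_documents(text, "ndjson"), "logs", 5_000_000) would return a list of request bodies, each of which could be posted to the _bulk endpoint in turn. In this sketch the NDJSON path never parses the documents, which is the saving the description attributes to skipping the Record object model.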



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
