[ 
https://issues.apache.org/jira/browse/NIFI-14335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17935387#comment-17935387
 ] 

Bob Paulin commented on NIFI-14335:
-----------------------------------

Hi, I think the type of batching being added in this issue and the type of 
batching currently supported in the Put processors are a bit different.  The 
current batching in PutElasticsearchJson and PutElasticsearchRecord is based 
on the number of records written to Elasticsearch at once.  Batching there 
means fewer calls are made to Elasticsearch, which can increase performance 
because the number of HTTP calls can be a bottleneck.  This is not true of 
SupportsBatching.  The SupportsBatching annotation is meant to enable the run 
duration slider, as mentioned in 
[https://stackoverflow.com/questions/75654570/what-does-supportsbatching-exactly-do-on-top-of-nifis-processor-class].
So adding this annotation will result in the same number of calls to 
Elasticsearch.  It will, however, reduce the number of times NiFi writes to 
the repository.  It could still be a benefit, but I'm curious whether you have 
a use case that demonstrates an improvement for you.
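
To make the distinction concrete, here is a rough sketch in plain Java.  It 
does not use the NiFi API; it only counts HTTP calls and repository commits 
under the two styles.  The class and method names (BatchingSketch, 
processorBatching, frameworkBatching) are hypothetical, and the real run 
duration is a time window rather than a fixed number of triggers, so 
triggersPerCommit is a simplifying assumption.

```java
/** Toy model only: counts HTTP calls and repository commits, no NiFi classes involved. */
final class BatchingSketch {

    /** Result holder for one simulated run. */
    static final class Counts {
        final int httpCalls;
        final int repositoryCommits;

        Counts(int httpCalls, int repositoryCommits) {
            this.httpCalls = httpCalls;
            this.repositoryCommits = repositoryCommits;
        }

        @Override
        public String toString() {
            return "httpCalls=" + httpCalls + ", repositoryCommits=" + repositoryCommits;
        }
    }

    /**
     * Processor-level batching (the style used by the Put processors):
     * each trigger takes up to batchSize items and sends them in one request.
     */
    static Counts processorBatching(int items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int httpCalls = 0;
        int commits = 0;
        int remaining = items;
        while (remaining > 0) {
            remaining -= Math.min(batchSize, remaining);
            httpCalls++; // one bulk request for the whole batch
            commits++;   // one commit per trigger
        }
        return new Counts(httpCalls, commits);
    }

    /**
     * Framework batching (the run duration slider), modelled as an assumption:
     * each trigger still handles one item with one request, but the framework
     * commits only once per triggersPerCommit triggers.
     */
    static Counts frameworkBatching(int items, int triggersPerCommit) {
        if (triggersPerCommit <= 0) {
            throw new IllegalArgumentException("triggersPerCommit must be positive");
        }
        int httpCalls = 0;
        int commits = 0;
        int pending = 0;
        for (int i = 0; i < items; i++) {
            httpCalls++; // the number of requests is unchanged by framework batching
            pending++;
            if (pending == triggersPerCommit) {
                commits++;
                pending = 0;
            }
        }
        if (pending > 0) {
            commits++; // commit whatever is left in the last partial window
        }
        return new Counts(httpCalls, commits);
    }

    public static void main(String[] args) {
        System.out.println("processor batching: " + processorBatching(1000, 100));
        System.out.println("framework batching: " + frameworkBatching(1000, 100));
        System.out.println("no batching:        " + frameworkBatching(1000, 1));
    }
}
```

With 1000 items and a batch of 100, the processor-level model makes 10 HTTP 
calls and 10 commits, while the framework model makes 1000 HTTP calls and 10 
commits.  Setting triggersPerCommit to 1 stands in for no batching: 1000 calls 
and 1000 commits.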

> Support NiFi framework batching in Elasticsearch processors
> -----------------------------------------------------------
>
>                 Key: NIFI-14335
>                 URL: https://issues.apache.org/jira/browse/NIFI-14335
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>    Affects Versions: 2.2.0
>            Reporter: Vijaya Gorla
>            Priority: Minor
>              Labels: elasticsearch
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Elasticsearch processors currently do not support NiFi framework batching 
> (using the {{SupportsBatching}} annotation). The {{PutElasticsearchJson}} 
> and {{PutElasticsearchRecord}} processors do support batching, but it is 
> implemented in the processor, not through the {{SupportsBatching}} annotation.
> The following processors could benefit from framework batching where high 
> throughput is required:
>  * {{GetElasticsearch}}
>  * {{JsonQueryElasticsearch}}
>  * {{UpdateByQueryElasticsearch}}
>  * {{DeleteByQueryElasticsearch}}
> Adding {{SupportsBatching}} with {{DefaultRunDuration.NO_BATCHING}} would 
> preserve the existing behaviour by default and enable batching if required.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
