[ https://issues.apache.org/jira/browse/NIFI-14301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dariusz Seweryn updated NIFI-14301:
-----------------------------------
Description:
ConsumeKinesisStream currently uses the processor's "Max Concurrent Tasks" value
as the maxConcurrency setting of the underlying NettyNioAsyncHttpClient.
On the surface this seems reasonable from the framework's perspective, but it is
wasteful and suboptimal: HTTP work is generally I/O-bound, whereas the framework
assumes that tasks are compute-bound.
The ConsumeKinesisStream onTrigger method does one of two things:
* if no Scheduler exists yet, it creates a new thread for an AWS Scheduler
instance that does all the heavy lifting for the Kinesis interaction
* once the Scheduler has been created, it yields
Additional Concurrent Tasks that only yield are not a productive use of
framework threads.
Proposal: set the NettyNioAsyncHttpClient maxConcurrency to the minimum of
`Runtime.getRuntime().availableProcessors()` and the processor's "Max Concurrent
Tasks" value.
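A minimal sketch of the proposed calculation, not the actual patch. The class and method names (`HttpClientConcurrency`, `resolveMaxConcurrency`) are hypothetical and exist only for illustration; the real change would presumably compute the value wherever the processor builds its HTTP client.

```java
// Hypothetical helper illustrating the proposal; not taken from the NiFi code base.
final class HttpClientConcurrency {

    private HttpClientConcurrency() {
    }

    // Caps the HTTP client concurrency at the number of available processors,
    // while never exceeding the processor's "Max Concurrent Tasks" setting.
    static int resolveMaxConcurrency(final int maxConcurrentTasks, final int availableProcessors) {
        if (maxConcurrentTasks < 1 || availableProcessors < 1) {
            throw new IllegalArgumentException("Both values must be at least 1");
        }
        return Math.min(availableProcessors, maxConcurrentTasks);
    }

    // Convenience overload that reads the processor count from the running JVM.
    static int resolveMaxConcurrency(final int maxConcurrentTasks) {
        return resolveMaxConcurrency(maxConcurrentTasks, Runtime.getRuntime().availableProcessors());
    }
}
```

Assuming the AWS SDK v2 builder is used, the result would then be passed to `NettyNioAsyncHttpClient.builder().maxConcurrency(...)` in place of the raw "Max Concurrent Tasks" value.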
was:
ConsumeKinesisStream currently uses the processor's "Max Concurrent Tasks" value
as the maxConcurrency setting of the underlying NettyNioAsyncHttpClient.
On the surface this seems reasonable from the framework's perspective, but it is
wasteful and suboptimal: HTTP work is generally I/O-bound, whereas the framework
assumes that tasks are compute-bound.
The ConsumeKinesisStream onTrigger method does one of two things:
* if no Scheduler exists yet, it creates a new thread for an AWS Scheduler
instance that does all the heavy lifting for the Kinesis interaction
* once the Scheduler has been created, it yields
Additional Concurrent Tasks that only yield are not a productive use of
framework threads.
Proposal: set the NettyNioAsyncHttpClient maxConcurrency to
`Runtime.getRuntime().availableProcessors()`
> ConsumeKinesisStream decouple HttpClient maxConcurrency from "Max Concurrent
> Tasks"
> -----------------------------------------------------------------------------------
>
> Key: NIFI-14301
> URL: https://issues.apache.org/jira/browse/NIFI-14301
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Dariusz Seweryn
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h