[ 
https://issues.apache.org/jira/browse/MINIFICPP-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324910#comment-17324910
 ] 

Adam Debreceni commented on MINIFICPP-1538:
-------------------------------------------

Current progress is available 
[here|https://github.com/adamdebreceni/nifi-minifi-cpp/tree/MINIFICPP-1538].

Based on measurements on a Windows machine, there is no discernible benefit 
from using per-process-group column families during flow operations when 2 
process groups run in parallel, each using a [ConsumeWindowsEventLog -> 
LogAttribute] setup.

!Screenshot 2021-04-14 at 12.50.05.png|width=797,height=535!

!Screenshot 2021-04-14 at 12.50.14.png|width=807,height=538!

On synthetic benchmarks that continuously read and write in 2 parallel "flows", 
each with 4 readers and 4 writers, the benefit is measurable:


Windows:
 * with column families: ~470k
 * without: ~300k

Mac:
 * with column families: ~710k
 * without: ~620k
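
To put the figures above in relative terms, here is a minimal sketch of the arithmetic. The helper name `relative_gain_percent` is hypothetical and not part of the MiNiFi codebase, and the inputs are the approximate values quoted above in thousands, in whatever unit the benchmark reports (the comment does not state it).

```cpp
#include <cassert>

// Hypothetical helper (not part of MiNiFi): relative throughput gain, in
// percent, of the run with column families over the run without them.
double relative_gain_percent(double with_cf, double without_cf) {
  // A baseline of zero or less would make the ratio meaningless.
  assert(without_cf > 0.0);
  return (with_cf - without_cf) / without_cf * 100.0;
}
```

Calling `relative_gain_percent(470.0, 300.0)` for the Windows figures gives roughly 57%, and `relative_gain_percent(710.0, 620.0)` for the Mac figures gives roughly 15%. Since the inputs are rounded, these percentages are only indicative.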

Based on these measurements, the complexity and amount of code required for 
this feature outweigh the benefits it provides.

> Investigate per process group column families
> ---------------------------------------------
>
>                 Key: MINIFICPP-1538
>                 URL: https://issues.apache.org/jira/browse/MINIFICPP-1538
>             Project: Apache NiFi MiNiFi C++
>          Issue Type: New Feature
>            Reporter: Adam Debreceni
>            Assignee: Adam Debreceni
>            Priority: Major
>         Attachments: Screenshot 2021-04-14 at 12.50.05.png, Screenshot 
> 2021-04-14 at 12.50.14.png
>
>
> Currently all operations in FlowFileRepository go through a single column 
> family (the default). We should investigate the performance benefit, if any, 
> of using different column families for different process groups (special care 
> must be taken for RPGs, the processors of which logically belong to their 
> parent groups).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)