pvillard31 opened a new pull request, #10186:
URL: https://github.com/apache/nifi/pull/10186

   # Summary
   
   NIFI-14837 - Performance improvement GitHub Registry Client
   
   I'm using the GitHub Registry Client in my NiFi instance. I have about 50 
process groups that are versioned. Every process group matches a versioned flow 
that may have tens of commits.
   
   When I want to change version, the current implementation lists all 
commits and, for each commit, makes an API call to GitHub to retrieve 
specific information (commit message, commit date, etc.).
   
   This is extremely inefficient, and changing the version of a flow ends up 
taking a very long time. In some cases with many commits, I cannot change the 
version because the call in the NiFi UI times out before the backend has sent 
back the full list of commits with all of the information.
   
   This makes the client unfriendly and barely usable. It also consumes a 
large share of the API rate limit.
   
   This change introduces multiple improvements that make all of this MUCH 
better.
   
   - The GitHub client being used is initialized with an optional OkHttp client 
cache (see https://hub4j.github.io/github-api/)
    > This library comes with a pluggable connector to use different HTTP 
client implementations through HttpConnector. In particular, this means you can 
use [OkHttp](https://square.github.io/okhttp/), so we can make use of its HTTP 
response cache. Making a conditional request against the GitHub API and 
receiving a 304 response [does not count against the rate 
limit](https://docs.github.com/en/rest/overview/resources-in-the-rest-api?apiVersion=2022-11-28#conditional-requests).
   - Adding an LRU cache to the client, with a fixed maximum size of 1000 
commits, to keep an internal mapping of commit SHA to commit details.
   - Expose a property to limit the number of commits retrieved. The client 
does not ensure a chronological order but guarantees a topological ordering, so 
the result should be chronological except in some specific edge cases such as 
rebases, merge commits, cherry-picks, commits with manual dates, etc. These are 
very unlikely to happen with normal usage of the client. Regardless, the 
default is to retrieve all commits, as is done right now.
   - Adding a Rate Abuse Limit Handler to log an error when abusing the API 
limits.
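
   To illustrate the LRU cache idea from the list above, here is a minimal 
sketch built on `java.util.LinkedHashMap` in access order. The class name 
`CommitLruCache`, its methods, and the use of plain strings for commit details 
are hypothetical and chosen for illustration; the actual implementation in this 
pull request may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a bounded cache mapping commit SHA to commit details.
// Names are illustrative assumptions, not the classes used in the pull request.
final class CommitLruCache<V> {
    private final Map<String, V> entries;

    CommitLruCache(final int maxEntries) {
        // accessOrder=true moves an entry to the end on every get/put,
        // so the eldest entry is always the least recently used one.
        this.entries = new LinkedHashMap<String, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(final Map.Entry<String, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Synchronized because get() also reorders an access-ordered map.
    synchronized V get(final String sha) {
        return entries.get(sha);
    }

    synchronized void put(final String sha, final V details) {
        entries.put(sha, details);
    }

    synchronized int size() {
        return entries.size();
    }

    public static void main(final String[] args) {
        final CommitLruCache<String> cache = new CommitLruCache<>(2);
        cache.put("a1", "first commit");
        cache.put("b2", "second commit");
        cache.get("a1");                 // touch a1, so b2 becomes least recently used
        cache.put("c3", "third commit"); // exceeds the limit and evicts b2
        System.out.println(cache.get("b2")); // prints "null"
        System.out.println(cache.size());    // prints "2"
    }
}
```

   With a limit of 1000 entries, a lookup by SHA that hits the cache avoids 
the per-commit API call described earlier; only commits not yet seen require a 
request.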
   
   # Tracking
   
   Please complete the following tracking steps prior to pull request creation.
   
   ### Issue Tracking
   
   - [ ] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue 
created
   
   ### Pull Request Tracking
   
   - [ ] Pull Request title starts with Apache NiFi Jira issue number, such as 
`NIFI-00000`
   - [ ] Pull Request commit message starts with Apache NiFi Jira issue number, 
such as `NIFI-00000`
   
   ### Pull Request Formatting
   
   - [ ] Pull Request based on current revision of the `main` branch
   - [ ] Pull Request refers to a feature branch with one commit containing 
changes
   
   # Verification
   
   Please indicate the verification steps performed prior to pull request 
creation.
   
   ### Build
   
   - [ ] Build completed using `mvn clean install -P contrib-check`
     - [ ] JDK 21
   
   ### Licensing
   
   - [ ] New dependencies are compatible with the [Apache License 
2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License 
Policy](https://www.apache.org/legal/resolved.html)
   - [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` 
files
   
   ### Documentation
   
   - [ ] Documentation formatting appears as expected in rendered files
   

