featzhang opened a new pull request, #27714:
URL: https://github.com/apache/flink/pull/27714

   ## What is the purpose of the change
   
   This PR adds configuration support for the Management Blocklist feature 
(PR-4 in the FLINK-39176 series). It lets administrators control node 
quarantine behavior through Flink's configuration system: enabling or 
disabling the feature, setting the default block duration, and setting the 
maximum allowed duration.
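
To make the interplay of the three settings concrete, here is a minimal, self-contained sketch of how a requested block duration might be resolved against a default and a maximum. It does not use Flink classes; `BlockDurationResolver` and its methods are hypothetical names, and rejecting (rather than clamping) an over-limit request is an assumption, not something this PR description states.

```java
import java.time.Duration;
import java.util.Optional;

/**
 * Hypothetical helper (not part of the PR): resolves the duration for a block
 * request from three settings analogous to blocklist.enabled,
 * blocklist.default-duration and blocklist.max-duration.
 */
final class BlockDurationResolver {
    private final boolean enabled;
    private final Duration defaultDuration;
    private final Duration maxDuration;

    BlockDurationResolver(boolean enabled, Duration defaultDuration, Duration maxDuration) {
        if (defaultDuration.isZero() || defaultDuration.isNegative()) {
            throw new IllegalArgumentException("default duration must be positive");
        }
        if (defaultDuration.compareTo(maxDuration) > 0) {
            throw new IllegalArgumentException("default duration exceeds max duration");
        }
        this.enabled = enabled;
        this.defaultDuration = defaultDuration;
        this.maxDuration = maxDuration;
    }

    /** Returns the requested duration, or the default when none was requested. */
    Duration resolve(Optional<Duration> requested) {
        if (!enabled) {
            throw new IllegalStateException("management blocklist is disabled");
        }
        Duration duration = requested.orElse(defaultDuration);
        if (duration.isZero() || duration.isNegative()) {
            throw new IllegalArgumentException("block duration must be positive");
        }
        // Assumption: over-limit requests are rejected instead of being clamped.
        if (duration.compareTo(maxDuration) > 0) {
            throw new IllegalArgumentException(
                    "block duration " + duration + " exceeds maximum " + maxDuration);
        }
        return duration;
    }
}
```

With a 30-minute default and a 24-hour maximum, `resolve(Optional.empty())` yields 30 minutes, while a 48-hour request is rejected.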
   
   This is part of the Node Health Management & Quarantine Framework (Phase 1), 
which provides a Resource Manager-level mechanism for manually blocking 
problematic nodes from receiving new slot allocations.
   
   ## Brief change log
   
   - Add `ManagementOptions` configuration class with `blocklist.enabled`, 
`blocklist.default-duration`, `blocklist.max-duration`
   - Add `DefaultManagementBlocklistHandler` with automatic expiration cleanup 
via `ScheduledExecutor`
   - Add `ManagementBlocklistHandlerFactory` for configuration-based handler 
creation
   - Add REST API endpoints: `POST/DELETE/GET /cluster/blocklist` with proper 
request/response bodies
   - Extend `ResourceManagerGateway` with management blocklist methods
   - Add comprehensive test suite (42 tests covering core functionality, edge 
cases, performance, and end-to-end scenarios)
   - Add documentation in `docs/content/docs/ops/management_blocklist.md`
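
As a rough illustration of the "automatic expiration cleanup" idea in the change log, the sketch below keeps a map of blocked nodes with expiry timestamps and prunes it periodically. It is not the code of `DefaultManagementBlocklistHandler`: the class and method names are hypothetical, the JDK's `ScheduledExecutorService` stands in for Flink's `ScheduledExecutor`, and the injectable clock is an assumption added to keep the example deterministic.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a blocklist whose entries expire automatically. */
final class ExpiringBlocklistSketch implements AutoCloseable {
    // nodeId -> expiry timestamp, in the nanosecond units of the injected clock
    private final Map<String, Long> expiryNanosByNode = new ConcurrentHashMap<>();
    private final LongSupplier nanoClock;
    private final ScheduledExecutorService cleaner =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "blocklist-cleaner");
                thread.setDaemon(true); // do not keep the JVM alive for cleanup
                return thread;
            });

    ExpiringBlocklistSketch(Duration cleanupInterval, LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        long periodMs = Math.max(1L, cleanupInterval.toMillis());
        cleaner.scheduleAtFixedRate(this::removeExpired, periodMs, periodMs, TimeUnit.MILLISECONDS);
    }

    /** Blocks a node; re-adding an already blocked node replaces its expiry. */
    void addBlockedNode(String nodeId, Duration duration) {
        if (nodeId == null || nodeId.isEmpty()) {
            throw new IllegalArgumentException("nodeId must be non-empty");
        }
        if (duration == null || duration.isZero() || duration.isNegative()) {
            throw new IllegalArgumentException("duration must be positive");
        }
        expiryNanosByNode.put(nodeId, nanoClock.getAsLong() + duration.toNanos());
    }

    /** Unblocks a node; returns true if an entry was present. */
    boolean removeBlockedNode(String nodeId) {
        return nodeId != null && expiryNanosByNode.remove(nodeId) != null;
    }

    /** An entry past its expiry counts as unblocked even before cleanup runs. */
    boolean isBlocked(String nodeId) {
        if (nodeId == null) {
            return false;
        }
        Long expiry = expiryNanosByNode.get(nodeId);
        return expiry != null && expiry - nanoClock.getAsLong() > 0;
    }

    /** Drops expired entries; called periodically by the cleaner thread. */
    void removeExpired() {
        long now = nanoClock.getAsLong();
        expiryNanosByNode.values().removeIf(expiry -> expiry - now <= 0);
    }

    int size() {
        return expiryNanosByNode.size();
    }

    @Override
    public void close() {
        cleaner.shutdownNow();
    }
}
```

In production use the clock would typically be `System::nanoTime`. Checking the expiry in `isBlocked` as well as in the periodic task means a stale entry never blocks a node between cleanup runs.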
   
   ## Verifying this change
   
   This change is verified by 42 new unit and integration tests:
   
   ```bash
   mvn test -Dtest="*ManagementBlocklist*,SimpleBlocklistHandlerTest" -pl flink-runtime
   ```
   
   Test coverage includes:
   - Core functionality validation (add/remove/check operations)
   - Edge cases (null parameters, empty strings, special characters, duplicate 
nodes)
   - Boundary conditions (very short/long durations, large node counts)
   - Concurrent operations and thread safety
   - Automatic expiration and cleanup mechanisms
   - ResourceManager gateway integration
   - Performance benchmarks (~0.0003 ms per add, ~0.0002 ms per check)
   
   ## Does this pull request potentially affect one of the following parts
   
   - Dependencies (does it add or upgrade a dependency): no
   - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
   - The serializers: no
   - The runtime per-record code paths (performance sensitive): no
   - Anything that affects deployment or recovery: no
   - The (de)serialization that stored state depends on: no
   
   ## Documentation
   
   - Does this pull request introduce a new feature? yes
   - If yes, how is the feature documented? docs + JavaDocs
   

