DanielCarter-stack commented on PR #10594:
URL: https://github.com/apache/seatunnel/pull/10594#issuecomment-4045126196
<!-- code-pr-reviewer -->
<!-- cpr:pr_reply_v2_parts {"group": "apache/seatunnel#10594", "part": 1,
"total": 1} -->
### Issue 1: Missing Comment Documentation
**Location**:
`seatunnel-e2e/seatunnel-e2e-common/src/test/java/org/apache/seatunnel/e2e/common/container/seatunnel/SeaTunnelContainer.java:403`
```java
|| s.startsWith("pending-job-schedule-runner")
```
**Related Context**:
- Thread creation: `CoordinatorService.java:230`
- Similar comment: `SeaTunnelContainer.java:422` has `// Add heartbeat
threads as system threads`
**Problem Description**:
The code does not explain why `pending-job-schedule-runner` is a system
thread, nor the purpose of this thread. Although the comment in
`CoordinatorService` states "Start pending job schedule thread," there is no
corresponding comment in the E2E test code.
**Potential Risks**:
- **Risk 1**: Future maintainers may not understand why this thread needs to
be filtered
- **Risk 2**: If the thread name changes, maintainers may not know to update
this filter condition
**Scope of Impact**:
- **Direct Impact**: `SeaTunnelContainer.isSystemThread` method
- **Indirect Impact**: None
- **Impact Area**: E2E test framework
**Severity**: MINOR
**Improvement Suggestion**:
```java
|| s.startsWith("pending-job-schedule-runner") // CoordinatorService
pending job scheduler thread
```
**Rationale**:
- Maintains consistency with the comment style at line 422 in the same file
- Documents the thread source and purpose for easier future maintenance
---
### Issue 2: Maintenance Risk from Hardcoded Thread Names
**Location**:
-
`seatunnel-engine/seatunnel-engine-server/src/main/java/org/apache/seatunnel/engine/server/CoordinatorService.java:230`
-
`seatunnel-e2e/seatunnel-e2e-common/src/test/java/org/apache/seatunnel/e2e/common/container/seatunnel/SeaTunnelContainer.java:403`
```java
// CoordinatorService.java:230
Thread.currentThread().setName("pending-job-schedule-runner");
// SeaTunnelContainer.java:403
|| s.startsWith("pending-job-schedule-runner")
```
**Related Context**:
- Similar pattern: `seatunnel-coordinator-service` uses
`ThreadFactoryBuilder.setNameFormat`
- Similar pattern: `http-report-event-scheduler` uses
`ThreadFactoryBuilder.setNameFormat`
**Problem Description**:
Thread names are hardcoded as strings in two places:
1. Setting the name when creating the thread in `CoordinatorService`
2. Checking the name when filtering threads in `SeaTunnelContainer`
This dual hardcoding creates maintenance burden: if the thread needs to be
renamed in the future, both code locations must be modified.
**Potential Risks**:
- **Risk 1**: If only one location is modified (e.g., renaming the thread
but forgetting to update the E2E filter), it will cause test failures
- **Risk 2**: New contributors may not be aware of the thread name
dependencies
**Scope of Impact**:
- **Direct Impact**: `CoordinatorService` and `SeaTunnelContainer`
- **Indirect Impact**: All Connectors using E2E tests
- **Impact Area**: Core framework + E2E test framework
**Severity**: MAJOR
**Improvement Suggestion**:
```java
// Define constants in CoordinatorService
public class CoordinatorService {
public static final String PENDING_JOB_SCHEDULE_THREAD_NAME =
"pending-job-schedule-runner";
private void startPendingJobScheduleThread() {
Runnable pendingJobScheduleTask = () -> {
Thread.currentThread().setName(PENDING_JOB_SCHEDULE_THREAD_NAME);
// ...
};
}
}
// Reference constants in SeaTunnelContainer
private static boolean isSystemThread(String s) {
return s.startsWith("hz.main")
|| s.startsWith("seatunnel-coordinator-service")
||
s.startsWith(CoordinatorService.PENDING_JOB_SCHEDULE_THREAD_NAME)
// ...
}
```
**Rationale**:
- Eliminates dual hardcoding by using a single source of truth
- IDE refactoring tools can automatically assist with renaming
- Note: This introduces a dependency from test code to production code,
which may increase module coupling
---
### Issue 3: Not Considering Future Possible Thread Numbering
**Location**:
`seatunnel-e2e/seatunnel-e2e-common/src/test/java/org/apache/seatunnel/e2e/common/container/seatunnel/SeaTunnelContainer.java:403`
```java
|| s.startsWith("pending-job-schedule-runner")
```
**Related Context**:
- Similar pattern: `seatunnel-coordinator-service-*` uses
`ThreadFactoryBuilder.setNameFormat("seatunnel-coordinator-service-%d")`
- Similar pattern: `http-report-event-scheduler-%d` uses numbering
**Problem Description**:
Currently using `startsWith("pending-job-schedule-runner")` for matching,
but in `CoordinatorService` this thread is submitted through
`executorService.submit()`, which uses `ThreadPoolExecutor`, with the thread
name format being `"seatunnel-coordinator-service-%d"`.
This means the actual thread name might be
`seatunnel-coordinator-service-1`, `seatunnel-coordinator-service-2`, etc.,
while `pending-job-schedule-runner` is reset through
`Thread.currentThread().setName()` inside the task execution.
**Potential Risks**:
- **Risk 1**: If multiple pending job schedulers are introduced in the
future (e.g., high availability scenarios), numbering might be used
- **Risk 2**: Current implementation assumes only one
`pending-job-schedule-runner`; if the implementation changes, this filter needs
to be updated accordingly
**Scope of Impact**:
- **Direct Impact**: `SeaTunnelContainer.isSystemThread` method
- **Indirect Impact**: If multiple schedulers are introduced, this filter
needs modification
- **Impact Area**: E2E test framework
**Severity**: MINOR
**Improvement Suggestion**:
```java
// The current implementation is sufficient, but if there is a numbering
requirement in the future, consider:
|| s.startsWith("pending-job-schedule-runner")
|| s.matches("pending-job-schedule-runner-\\d+")
```
**Rationale**:
- Current implementation uses `setName()` rather than `setNameFormat()`, so
numbering is unlikely
- However, adding regex matching can improve future extensibility
- However, following the YAGNI principle, the current implementation is
sufficient
---
### Issue 4: Similar System Threads May Not Be Recognized
**Location**:
`seatunnel-engine/seatunnel-engine-server/src/main/java/org/apache/seatunnel/engine/server/CoordinatorService.java:217`
```java
masterActiveListener = Executors.newSingleThreadScheduledExecutor();
masterActiveListener.scheduleAtFixedRate(
this::checkNewActiveMaster, 0, 100, TimeUnit.MILLISECONDS);
```
**Related Context**:
- `isSystemThread` method in `SeaTunnelContainer.java:400-430`
- Other `ScheduledExecutorService` threads: `http-report-event-scheduler`
has been filtered
**Problem Description**:
There is also a `masterActiveListener` thread in `CoordinatorService`,
created using `Executors.newSingleThreadScheduledExecutor()`, but without a
custom thread name set. The default name for this thread is in
`pool-[N]-thread-[M]` format, while `aqsThread.matcher(s).matches()` already
exists in `isSystemThread` to match this pattern.
**Potential Risks**:
- **Risk 1**: If the default thread name of `masterActiveListener` does not
match the `pool-[0-9]-thread-[0-9]` pattern, it may cause E2E test failures
- **Risk 2**: Other threads using the default thread pool may have similar
issues
**Scope of Impact**:
- **Direct Impact**: `CoordinatorService.masterActiveListener` thread
- **Indirect Impact**: Thread leak detection in E2E tests
- **Impact Area**: Core framework
**Severity**: MAJOR (Requires verification)
**Improvement Suggestion**:
```java
// Explicitly name the masterActiveListener thread in CoordinatorService
masterActiveListener = Executors.newSingleThreadScheduledExecutor(
new ThreadFactoryBuilder()
.setNameFormat("coordinator-master-active-listener")
.build());
```
**Rationale**:
- Explicitly names the thread for easier debugging and monitoring
- If this thread needs to be filtered by E2E, an explicit filter condition
can be added
- Maintains consistency with other `ThreadFactoryBuilder` usage in the
project
---
### Issue 5: Not Verified Whether `masterActiveListener` Thread Needs
Filtering
**Location**:
`seatunnel-e2e/seatunnel-e2e-common/src/test/java/org/apache/seatunnel/e2e/common/container/seatunnel/SeaTunnelContainer.java:400`
```java
private static boolean isSystemThread(String s) {
Pattern aqsThread = Pattern.compile("pool-[0-9]-thread-[0-9]");
return s.startsWith("hz.main")
|| s.startsWith("seatunnel-coordinator-service")
|| s.startsWith("pending-job-schedule-runner")
// ...
|| aqsThread.matcher(s).matches()
}
```
**Related Context**:
- `CoordinatorService.java:217-219` creates `masterActiveListener`
- `aqsThread` regex matching in `SeaTunnelContainer.java:400`
**Problem Description**:
The `masterActiveListener` thread uses the default
`Executors.newSingleThreadScheduledExecutor()`, with a thread name format of
`pool-[N]-thread-1` (single thread). The `aqsThread` regex
`pool-[0-9]-thread-[0-9]` in `isSystemThread` can match this thread.
**Verification Required**:
Need to verify whether the `masterActiveListener` thread is already
correctly filtered by `aqsThread`, and whether it will cause E2E test failures.
**Potential Risks**:
- **Risk 1**: If `N` of `pool-[N]-thread-1` exceeds 9 (i.e., two digits),
`pool-[0-9]-thread-[0-9]` cannot match
- **Risk 2**: If there are multiple `ScheduledExecutorService`,
`pool-1-thread-1`, `pool-2-thread-1`, etc. may be generated; need to confirm
whether the regex covers them
**Scope of Impact**:
- **Direct Impact**: `CoordinatorService.masterActiveListener` thread
- **Indirect Impact**: Stability of E2E tests
- **Impact Area**: Core framework + E2E test framework
**Severity**: MAJOR (Requires verification)
**Improvement Suggestion**:
```java
// Improve regular expression to support multiple thread pool numbering
Pattern aqsThread = Pattern.compile("pool-\\d+-thread-\\d+");
```
**Rationale**:
- Current regex `[0-9]` only matches single digits, not supporting two
digits or larger numbers
- Changing to `\d+` can match thread pools and thread numbers of any digit
length
- This is a potential bug that may cause E2E tests to pass unnoticed
(filtering fails but is not detected)
---
### Issue 6: Timing Issue of `seatunnel-metrics-fetch-` Thread
**Location**:
-
`seatunnel-e2e/seatunnel-e2e-common/src/test/java/org/apache/seatunnel/e2e/common/container/seatunnel/SeaTunnelContainer.java:403`
(current PR)
-
`seatunnel-e2e/seatunnel-e2e-common/src/test/java/org/apache/seatunnel/e2e/common/container/seatunnel/SeaTunnelContainer.java:403`
(`1f5ce49f4` commit)
```java
// Commit 1f5ce49f4 added (2026-03-06)
|| s.startsWith("seatunnel-metrics-fetch-")
// Commit bc8e63d60 added (2026-03-12)
|| s.startsWith("pending-job-schedule-runner")
```
**Related Context**:
- Commit `1f5ce49f4` [Improve] Add metrics fetch thread prefix to log
filtering in SeaTunnelContainer (2026-03-06)
- Commit `bc8e63d60` [Test][E2E] Ignore coordinator scheduler thread in E2E
cleanup (2026-03-12)
**Problem Description**:
Two consecutive commits both fix the same issue: system threads not in the
E2E whitelist. This indicates:
1. `1f5ce49f4` added `seatunnel-metrics-fetch-` thread filtering (for
realtime observability functionality)
2. `bc8e63d60` added `pending-job-schedule-runner` thread filtering (for
pending job rescheduling functionality)
Both features were introduced recently (January-March 2026), indicating that
it's easy to forget to update the E2E whitelist synchronously when introducing
new features.
**Potential Risks**:
- **Risk 1**: When introducing new system threads in the future, the E2E
whitelist may again be forgotten to be updated
- **Risk 2**: This is a systemic issue that requires establishing processes
or mechanisms to prevent recurrence
**Scope of Impact**:
- **Direct Impact**: All newly introduced system threads
- **Indirect Impact**: E2E test stability and development efficiency
- **Impact Area**: Entire project
**Severity**: MAJOR (Systemic issue)
**Improvement Suggestion**:
```java
// Add documentation comments in SeaTunnelContainer
/**
* Checks if a thread is a system thread that should be ignored during E2E
cleanup.
*
* <p><b>IMPORTANT:</b> When adding new long-lived threads to
CoordinatorService or other
* core services, remember to add the thread name prefix to this method's
whitelist.
*
* <p>Examples of system threads:
* <ul>
* <li>seatunnel-coordinator-service-* - CoordinatorService thread
pool</li>
* <li>pending-job-schedule-runner - Pending job scheduler thread</li>
* <li>seatunnel-metrics-fetch-* - Realtime metrics fetcher threads</li>
* </ul>
*
* @param s the thread name to check
* @return true if the thread is a system thread, false otherwise
*/
private static boolean isSystemThread(String s) {
// ...
}
```
**Rationale**:
- Increase visibility through documentation to remind developers to update
synchronously
- Recommend adding a checklist item to the development workflow: new system
threads require updating the E2E whitelist
- Long-term solution: Consider using an automatic discovery mechanism (e.g.,
thread naming conventions) to reduce manual maintenance
---
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]