cryptoe commented on code in PR #18254:
URL: https://github.com/apache/druid/pull/18254#discussion_r2214906895
##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java:
##########
@@ -1379,13 +1377,20 @@ private void contactWorkersForStage(
final boolean retryOnFailure
)
{
- // Sorted copy of target worker numbers to ensure consistent iteration order.
+ // Sorted copy of target worker numbers to ensure a consistent iteration order.
final List<Integer> workersCopy = Ordering.natural().sortedCopy(workers);
final List<String> workerIds = getWorkerIds();
final List<ListenableFuture<Void>> workerFutures = new ArrayList<>(workersCopy.size());
try {
- workerManager.waitForWorkers(workers);
+ workerManager.waitForWorkers(
+ workers,
+ (workerTask, fault) -> {
+ throwIfNonRetriableFault(fault);
+ // no need to add it to the kernel manipulation queue since this is the main controller thread calling this function.
+ addToRetryQueue(queryKernel, workerTask.getWorkerNumber(), fault);
Review Comment:
There are two blocking calls in the workerManager, both made by the main
controller thread:
> workerManager.launchWorkersIfNeeded(
> workerManager.waitForWorkers(
If a spawned worker fails for any reason, the worker manager thread calls the
failure listener:
https://github.com/apache/druid/blob/aa29d43eac515905836a9641eaa83f074e2cc8c0/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/MSQWorkerTaskLauncher.java#L615
which adds the retry event to the controller kernel queue.
Handling the retry needs access to the work orders, and that information is
only reachable from the controller thread because the query kernel is
single-threaded.
Since the main controller thread is blocked in launchWorkersIfNeeded or
waitForWorkers, it never gets to process the queued retry event, so the worker
is never actually relaunched.
Let me try to find a way to pass the failure listener as part of the
constructor.
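
To make the difference concrete, here is a minimal, single-threaded sketch of the two listener styles. Everything in it is hypothetical: `RetryCallbackSketch`, `kernelQueue`, `retryQueue`, and this `waitForWorkers` are illustrative stand-ins, not the actual Druid classes or signatures. It only shows that a listener which enqueues an event leaves the retry pending until the controller thread drains the queue, while a listener invoked on the controller thread can update the retry state directly.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.BiConsumer;

// Hypothetical sketch: these names are illustrative, not Druid APIs.
final class RetryCallbackSketch
{
  // Stand-in for the kernel manipulation queue; only the controller thread drains it.
  final Queue<Runnable> kernelQueue = new ArrayDeque<>();
  // Stand-in for the retry state kept by the query kernel.
  final List<Integer> retryQueue = new ArrayList<>();

  // Stand-in for a blocking wait: reports each failed worker to the listener
  // on the calling thread before returning.
  void waitForWorkers(
      List<Integer> workers,
      List<Integer> failed,
      BiConsumer<Integer, String> onFailure
  )
  {
    for (Integer worker : workers) {
      if (failed.contains(worker)) {
        onFailure.accept(worker, "worker failed to start");
      }
    }
  }

  // Listener that only enqueues an event: the retry stays pending until
  // someone drains kernelQueue, which a blocked controller thread cannot do.
  static RetryCallbackSketch enqueueOnly()
  {
    RetryCallbackSketch c = new RetryCallbackSketch();
    c.waitForWorkers(
        List.of(0, 1, 2),
        List.of(1),
        (worker, fault) -> c.kernelQueue.add(() -> c.retryQueue.add(worker))
    );
    return c;
  }

  // Listener that updates the retry state directly. This is safe only
  // because the caller is assumed to be the controller thread itself.
  static RetryCallbackSketch handleInline()
  {
    RetryCallbackSketch c = new RetryCallbackSketch();
    c.waitForWorkers(
        List.of(0, 1, 2),
        List.of(1),
        (worker, fault) -> c.retryQueue.add(worker)
    );
    return c;
  }

  public static void main(String[] args)
  {
    RetryCallbackSketch queued = enqueueOnly();
    System.out.println("enqueue-only: pending=" + queued.kernelQueue.size() + " retries=" + queued.retryQueue);

    RetryCallbackSketch inline = handleInline();
    System.out.println("inline: pending=" + inline.kernelQueue.size() + " retries=" + inline.retryQueue);
  }
}
```

In the enqueue-only case the failure is recorded as one pending event and the retry list stays empty; in the inline case the retry is recorded immediately and nothing is left pending. The sketch does not model real threads or blocking, so it illustrates the ordering problem only, not the actual hang.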
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]