Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22280 )

Change subject: IMPALA-13566: Expose query cancellation status to UDFs
......................................................................


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/22280/3/be/src/exprs/utility-functions-ir.cc
File be/src/exprs/utility-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/22280/3/be/src/exprs/utility-functions-ir.cc@172
PS3, Line 172: SLEEP_UNINTERRUPTIBLE_INTERVAL_MS
One concern I have about the current implementation is that it may decrease the 
accuracy of sleep(), which is actually important for some tests. SleepForMs can 
take longer than the specified time, and these time differences can add up in 
multiple calls. This is probably not critical with the interval of 1s, as we 
rarely sleep much more than that in tests. An improvement could be to check the 
actual clock to sleep as precisely as possible instead of assuming that the 
requested time has passed.


http://gerrit.cloudera.org:8080/#/c/22280/3/be/src/exprs/utility-functions-ir.cc@171
PS3, Line 171:   for (int remaining_sleep_duration = milliseconds.val; 
remaining_sleep_duration > 0;
             :        remaining_sleep_duration -= 
SLEEP_UNINTERRUPTIBLE_INTERVAL_MS) {
             :     if (ctx->IsQueryCancelled()) {
             :       return BooleanVal(true);
             :     }
             :     SleepForMs(min(remaining_sleep_duration, 
SLEEP_UNINTERRUPTIBLE_INTERVAL_MS));
             :   }
> Maybe a simpler approach would help here. Check cancel first, then break in
Commenting on someone else's comment :)

I don't think that performance is important here, sleep() in SQL is only useful 
for testing. Also, compared to actually sleeping the other operations seem 
negligible.

The "// no need to check cancel" part looks problematic because these functions 
are often run in batches and query cancellation may not be checked between 
invocations. So for select sleep(500) from t; the cancellation would not be 
checked at all during the batch (e.g. for 1024 sleep calls).


http://gerrit.cloudera.org:8080/#/c/22280/3/tests/common/impala_connection.py
File tests/common/impala_connection.py:

http://gerrit.cloudera.org:8080/#/c/22280/3/tests/common/impala_connection.py@618
PS3, Line 618:     while True:
             :       operation_state = self.get_state(operation_handle)
             :       if operation_state not in ('PENDING_STATE', 
'INITIALIZED_STATE', 'RUNNING_STATE'):
             :         break
This will ping the coordinator very frequently for the state. Some small sleep 
could be added + it would be nice to have a timeout to avoid hanging the test 
very long if there is some error.


http://gerrit.cloudera.org:8080/#/c/22280/3/tests/query_test/test_cancellation.py
File tests/query_test/test_cancellation.py:

http://gerrit.cloudera.org:8080/#/c/22280/3/tests/query_test/test_cancellation.py@212
PS3, Line 212: execute_serially
Is it really needed to run this serially?
Note that I also don't get why test_misformatted_profile_text is marked as 
serial


http://gerrit.cloudera.org:8080/#/c/22280/3/tests/query_test/test_cancellation.py@228
PS3, Line 228: 1.1
This timeout looks a bit strict to me and I am worried about the test being 
flaky. It seems safe to me to bump timeout (e.g. to 3) because if the fetch 
thread finishes earlier we won't need to wait this long, so the test won't 
become very slow.



--
To view, visit http://gerrit.cloudera.org:8080/22280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9430167f7e46bbdf66153abb4645541cd8cf0142
Gerrit-Change-Number: 22280
Gerrit-PatchSet: 3
Gerrit-Owner: Xuebin Su <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Surya Hebbar <[email protected]>
Gerrit-Comment-Date: Thu, 02 Jan 2025 18:39:52 +0000
Gerrit-HasComments: Yes

Reply via email to