pepijnve commented on PR #16398: URL: https://github.com/apache/datafusion/pull/16398#issuecomment-2987865677
I added some output to be able to see what the coop logic was doing. Ignore the times; this is dev profile. What you can see is that the PR hardly forces yields, while main does. This is explained by the yield frequency being 64 on main and 128 in the PR (to match Tokio's budget). So it doesn't look like increased yielding is the culprit.

**Yield output:**
<details>

```
Main (same for datafusion_coop="per_stream" and datafusion_coop="tokio_fallback")
Polled 1146 times; 20 pending, 1118 ready, 8 forced yields
Polled 1262 times; 20 pending, 1230 ready, 12 forced yields
Polled 1076 times; 20 pending, 1048 ready, 8 forced yields
Polled 1124 times; 24 pending, 1095 ready, 5 forced yields
Polled 1183 times; 25 pending, 1153 ready, 5 forced yields
Polled 1209 times; 25 pending, 1179 ready, 5 forced yields
Polled 1157 times; 28 pending, 1127 ready, 2 forced yields
Polled 1218 times; 26 pending, 1187 ready, 5 forced yields
Polled 1572 times; 28 pending, 1535 ready, 9 forced yields
Polled 1594 times; 29 pending, 1556 ready, 9 forced yields
Query 4 iteration 2 took 6382.8 ms and returned 1 rows
```

vs

```
PR
Polled 1138 times; 20 pending, 1118 ready, 0 forced yields
Polled 1251 times; 20 pending, 1230 ready, 1 forced yields
Polled 1068 times; 19 pending, 1048 ready, 1 forced yields
Polled 1118 times; 23 pending, 1095 ready, 0 forced yields
Polled 1178 times; 25 pending, 1153 ready, 0 forced yields
Polled 1154 times; 27 pending, 1127 ready, 0 forced yields
Polled 1204 times; 25 pending, 1179 ready, 0 forced yields
Polled 1213 times; 26 pending, 1187 ready, 0 forced yields
Polled 1584 times; 28 pending, 1556 ready, 0 forced yields
Polled 1563 times; 28 pending, 1535 ready, 0 forced yields
Query 4 iteration 6 took 6362.7 ms and returned 1 rows
```

</details>

The next thing I can think of is the access to the budget thread-local. But that doesn't explain why we got a very similar benchmark delta with the non-thread-local counter.
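To make the counters in the yield output above concrete, here is a minimal sketch of how a budgeted wrapper could tally polls. It is a simulation only, not DataFusion's or Tokio's actual implementation: `Inner`, `PollStats`, and `simulate` are hypothetical names, and it assumes the budget resets whenever the task really yields (either because the inner stream was pending or because a yield was forced).

```rust
/// Outcome of polling the (hypothetical) inner stream once.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Inner {
    Ready,
    Pending,
}

/// Counters matching the columns of the debug output above.
#[derive(Debug, Default, PartialEq)]
struct PollStats {
    polled: u64,
    pending: u64,
    ready: u64,
    forced_yields: u64,
}

/// Replays a fixed sequence of inner poll outcomes through a budgeted wrapper.
///
/// Assumption: every `Ready` consumes one unit of budget; once the budget is
/// exhausted, the next poll is answered with a forced yield instead of polling
/// the inner stream, and the budget is refilled. A natural `Pending` also
/// refills the budget, since the task gives control back to the scheduler.
fn simulate(inner: &[Inner], budget: u32) -> PollStats {
    assert!(budget > 0, "budget must be positive");
    let mut stats = PollStats::default();
    let mut remaining = budget;

    for &outcome in inner {
        if remaining == 0 {
            // Forced yield: counted as a poll, but the inner stream is not touched.
            stats.polled += 1;
            stats.forced_yields += 1;
            remaining = budget;
        }

        stats.polled += 1;
        match outcome {
            Inner::Ready => {
                stats.ready += 1;
                remaining -= 1;
            }
            Inner::Pending => {
                stats.pending += 1;
                remaining = budget;
            }
        }
    }

    stats
}
```

Under these assumptions, 256 consecutive `Ready` results produce 3 forced yields with a budget of 64 and only 1 with a budget of 128. As soon as natural `Pending` results occur more often than once per 128 ready polls, the larger budget is never exhausted, which is the pattern the PR output shows.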
Comparing the clickbench1 runs: we had this in https://github.com/apache/datafusion/pull/16398#issuecomment-2980961066 with per_stream

```
Total Time (HEAD)          │ 56209.26ms
Total Time (task_budget)   │ 57381.15ms
Average Time (HEAD)        │  1307.19ms
Average Time (task_budget) │  1334.45ms
```

vs the last result in https://github.com/apache/datafusion/pull/16398#issuecomment-2981793916 with tokio_fallback

```
Total Time (HEAD)          │ 55925.88ms
Total Time (task_budget)   │ 56847.01ms
Average Time (HEAD)        │  1300.60ms
Average Time (task_budget) │  1322.02ms
```

In total time that is roughly a 2.1% slowdown with per_stream and a 1.6% slowdown with tokio_fallback.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org