pepijnve commented on PR #16398: URL: https://github.com/apache/datafusion/pull/16398#issuecomment-2991004224
@alamb @Dandandan Another update. Tokio is not P/E-core aware out of the box, and macOS gives you little to no explicit control over thread affinity (QoS remains a hint). Realizing this led me to look for another machine to run the perf tests on. We still had some Intel desktop machines lying around in our server room, so I installed a fresh Ubuntu Server 24.04 on one of them and ran the tests there. Results below. They strengthen my belief that I'm measuring noise.

Additionally, I ran some tests with Instruments yesterday, comparing the total number of retired instructions, branch mispredictions, etc. There's some variability across runs, but it's more or less the same (as you would expect).

The hardware specs this ran on are:
- i9-9900 @ 3.10GHz (no P vs E cores)
- 32GiB memory
- 1TB 970 EVO/PRO SSD

<details>

Comparing baseline and branch

Benchmark clickbench_extended.json

| Query | baseline | branch | Change |
|---|---|---|---|
| QQuery 0 | 1816.81 / 1846.02 ±16.82 / 1864.51 ms | 1487.24 / 1592.59 ±105.23 / 1766.04 ms | +1.16x faster |
| QQuery 1 | 1154.89 / 1168.67 ±9.90 / 1180.63 ms | 1170.36 / 1176.54 ±5.75 / 1185.78 ms | no change |
| QQuery 2 | 2149.19 / 2178.05 ±15.43 / 2194.42 ms | 2167.67 / 2205.78 ±30.69 / 2249.29 ms | no change |
| QQuery 3 | 950.54 / 965.78 ±11.70 / 979.71 ms | 929.35 / 942.41 ±10.50 / 956.59 ms | no change |
| QQuery 4 | 2318.32 / 2380.99 ±49.47 / 2426.34 ms | 2175.43 / 2251.66 ±47.23 / 2323.00 ms | +1.06x faster |
| QQuery 5 | 20654.23 / 20842.04 ±135.24 / 21072.72 ms | 20474.42 / 20693.12 ±278.58 / 21236.04 ms | no change |
| QQuery 6 | 3365.89 / 3380.75 ±14.32 / 3407.26 ms | 3303.32 / 3470.30 ±286.78 / 4042.51 ms | no change |
| QQuery 7 | 3667.60 / 3734.69 ±78.34 / 3883.45 ms | 3720.23 / 3752.53 ±20.55 / 3776.15 ms | no change |
| QQuery 8 | 1189.93 / 1214.20 ±21.58 / 1254.50 ms | 1167.36 / 1189.68 ±21.82 / 1226.90 ms | no change |

| Benchmark Summary | |
|---|---|
| Total Time (baseline) | 37711.20ms |
| Total Time (branch) | 37274.61ms |
| Average Time (baseline) | 4190.13ms |
| Average Time (branch) | 4141.62ms |
| Queries Faster | 2 |
| Queries Slower | 0 |
| Queries with No Change | 7 |
| Queries with Failure | 0 |

Benchmark clickbench_partitioned.json

| Query | baseline | branch | Change |
|---|---|---|---|
| QQuery 0 | 16.50 / 20.72 ±8.17 / 37.06 ms | 16.10 / 16.77 ±0.61 / 17.75 ms | +1.24x faster |
| QQuery 1 | 37.95 / 39.92 ±2.24 / 44.24 ms | 36.54 / 37.35 ±0.70 / 38.59 ms | +1.07x faster |
| QQuery 2 | 114.76 / 119.67 ±3.62 / 123.85 ms | 93.35 / 96.31 ±1.79 / 98.98 ms | +1.24x faster |
| QQuery 3 | 100.46 / 143.26 ±68.08 / 279.05 ms | 95.18 / 97.89 ±2.08 / 101.42 ms | +1.46x faster |
| QQuery 4 | 1019.10 / 1046.67 ±31.71 / 1102.74 ms | 967.72 / 998.89 ±17.07 / 1013.90 ms | no change |
| QQuery 5 | 1099.85 / 1126.37 ±22.00 / 1164.14 ms | 1075.93 / 1084.52 ±8.80 / 1096.82 ms | no change |
| QQuery 6 | 28.26 / 32.57 ±4.41 / 40.21 ms | 26.20 / 29.64 ±2.27 / 33.33 ms | +1.10x faster |
| QQuery 7 | 48.48 / 51.67 ±1.66 / 52.95 ms | 45.85 / 47.43 ±1.14 / 48.65 ms | +1.09x faster |
| QQuery 8 | 1301.94 / 1310.48 ±11.55 / 1333.12 ms | 1241.74 / 1251.86 ±7.81 / 1262.60 ms | no change |
| QQuery 9 | 1381.11 / 1391.08 ±7.87 / 1400.07 ms | 1334.99 / 1342.58 ±4.51 / 1348.99 ms | no change |
| QQuery 10 | 350.32 / 355.63 ±5.80 / 365.56 ms | 348.43 / 351.11 ±1.43 / 352.37 ms | no change |
| QQuery 11 | 386.57 / 395.36 ±5.94 / 403.46 ms | 393.27 / 396.59 ±2.43 / 399.68 ms | no change |
| QQuery 12 | 1155.30 / 1158.69 ±2.15 / 1161.21 ms | 1104.36 / 1107.24 ±2.54 / 1111.15 ms | no change |
| QQuery 13 | 1988.21 / 2007.19 ±9.94 / 2015.68 ms | 1898.72 / 1929.14 ±16.27 / 1946.54 ms | no change |
| QQuery 14 | 1212.31 / 1218.28 ±6.63 / 1230.82 ms | 1160.51 / 1168.44 ±6.35 / 1176.61 ms | no change |
| QQuery 15 | 1167.31 / 1173.50 ±9.25 / 1191.88 ms | 1101.69 / 1108.11 ±7.11 / 1121.69 ms | +1.06x faster |
| QQuery 16 | 2588.75 / 2734.29 ±138.85 / 2908.22 ms | 2494.71 / 2516.10 ±19.90 / 2549.44 ms | +1.09x faster |
| QQuery 17 | 2567.55 / 2584.06 ±19.09 / 2620.85 ms | 2473.06 / 2479.84 ±4.24 / 2483.78 ms | no change |
| QQuery 18 | 5061.62 / 5150.17 ±119.14 / 5382.84 ms | 4893.81 / 5094.60 ±259.69 / 5574.29 ms | no change |
| QQuery 19 | 104.53 / 111.22 ±7.69 / 126.24 ms | 101.36 / 105.48 ±3.19 / 109.25 ms | +1.05x faster |
| QQuery 20 | 1608.60 / 1624.36 ±13.22 / 1643.68 ms | 1604.68 / 1622.16 ±9.65 / 1632.41 ms | no change |
| QQuery 21 | 1886.35 / 1908.10 ±17.39 / 1939.48 ms | 1883.89 / 1910.36 ±30.98 / 1969.64 ms | no change |
| QQuery 22 | 3230.90 / 3291.20 ±96.36 / 3483.50 ms | 3215.98 / 3269.94 ±73.17 / 3413.69 ms | no change |
| QQuery 23 | 12742.62 / 12986.18 ±234.60 / 13407.25 ms | 12887.51 / 13036.66 ±120.87 / 13246.23 ms | no change |
| QQuery 24 | 641.82 / 659.54 ±19.55 / 696.41 ms | 643.40 / 655.68 ±8.45 / 664.67 ms | no change |
| QQuery 25 | 425.36 / 433.39 ±6.10 / 441.01 ms | 429.37 / 435.14 ±5.45 / 445.48 ms | no change |
| QQuery 26 | 635.17 / 647.46 ±7.49 / 654.76 ms | 649.43 / 654.46 ±4.11 / 659.08 ms | no change |
| QQuery 27 | 2362.95 / 2379.85 ±13.64 / 2395.36 ms | 2392.53 / 2399.95 ±6.61 / 2410.38 ms | no change |
| QQuery 28 | 18378.22 / 18533.23 ±95.66 / 18668.42 ms | 18441.65 / 18594.61 ±125.29 / 18821.72 ms | no change |
| QQuery 29 | 1150.49 / 1168.63 ±13.63 / 1192.78 ms | 1148.00 / 1156.96 ±8.18 / 1171.97 ms | no change |
| QQuery 30 | 1100.90 / 1105.88 ±4.96 / 1113.19 ms | 1064.98 / 1069.64 ±3.15 / 1072.83 ms | no change |
| QQuery 31 | 1254.63 / 1264.53 ±9.24 / 1277.99 ms | 1226.02 / 1241.47 ±15.83 / 1270.33 ms | no change |
| QQuery 32 | 4979.64 / 5108.61 ±189.24 / 5481.98 ms | 4980.73 / 5114.22 ±222.24 / 5557.95 ms | no change |
| QQuery 33 | 5183.53 / 5233.23 ±45.34 / 5307.24 ms | 5203.84 / 5271.23 ±71.45 / 5373.36 ms | no change |
| QQuery 34 | 5178.24 / 5257.93 ±110.05 / 5474.69 ms | 5275.85 / 5359.43 ±79.80 / 5463.29 ms | no change |
| QQuery 35 | 1862.37 / 1876.24 ±13.10 / 1896.66 ms | 1866.96 / 1881.38 ±17.30 / 1915.02 ms | no change |
| QQuery 36 | 99.14 / 111.55 ±22.24 / 155.97 ms | 98.30 / 111.96 ±23.17 / 158.19 ms | no change |
| QQuery 37 | 50.94 / 55.96 ±8.58 / 73.09 ms | 51.04 / 57.79 ±9.18 / 75.97 ms | no change |
| QQuery 38 | 98.17 / 99.93 ±1.36 / 102.03 ms | 100.21 / 103.01 ±2.28 / 106.51 ms | no change |
| QQuery 39 | 156.90 / 164.82 ±13.32 / 191.37 ms | 154.78 / 166.31 ±13.91 / 193.63 ms | no change |
| QQuery 40 | 43.33 / 50.75 ±10.62 / 71.86 ms | 46.58 / 53.38 ±9.48 / 72.08 ms | 1.05x slower |
| QQuery 41 | 42.01 / 45.40 ±2.33 / 48.16 ms | 38.83 / 42.60 ±2.58 / 46.57 ms | +1.07x faster |
| QQuery 42 | 35.95 / 40.04 ±3.71 / 46.25 ms | 35.48 / 40.16 ±2.77 / 43.74 ms | no change |

| Benchmark Summary | |
|---|---|
| Total Time (baseline) | 86217.62ms |
| Total Time (branch) | 85508.40ms |
| Average Time (baseline) | 2005.06ms |
| Average Time (branch) | 1988.57ms |
| Queries Faster | 10 |
| Queries Slower | 1 |
| Queries with No Change | 32 |
| Queries with Failure | 0 |

Benchmark tpch_mem_sf1.json

| Query | baseline | branch | Change |
|---|---|---|---|
| QQuery 1 | 184.98 / 189.83 ±2.53 / 191.91 ms | 186.28 / 191.11 ±3.04 / 194.56 ms | no change |
| QQuery 2 | 30.88 / 33.57 ±2.75 / 38.50 ms | 25.66 / 29.63 ±3.12 / 33.86 ms | +1.13x faster |
| QQuery 3 | 57.19 / 58.34 ±0.91 / 59.31 ms | 57.45 / 58.47 ±0.81 / 59.45 ms | no change |
| QQuery 4 | 19.17 / 21.17 ±1.30 / 22.81 ms | 22.04 / 23.09 ±0.83 / 24.30 ms | 1.09x slower |
| QQuery 5 | 95.47 / 96.71 ±1.12 / 98.56 ms | 94.46 / 95.11 ±0.47 / 95.73 ms | no change |
| QQuery 6 | 16.97 / 17.28 ±0.28 / 17.82 ms | 16.97 / 17.30 ±0.23 / 17.56 ms | no change |
| QQuery 7 | 178.92 / 183.10 ±6.38 / 195.80 ms | 177.82 / 181.92 ±7.13 / 196.16 ms | no change |
| QQuery 8 | 30.31 / 31.15 ±0.75 / 32.39 ms | 26.75 / 28.40 ±1.13 / 29.71 ms | +1.10x faster |
| QQuery 9 | 80.96 / 105.44 ±43.82 / 192.90 ms | 84.11 / 85.20 ±0.72 / 86.30 ms | +1.24x faster |
| QQuery 10 | 69.22 / 69.53 ±0.31 / 70.07 ms | 69.66 / 72.36 ±1.50 / 73.73 ms | no change |
| QQuery 11 | 12.78 / 14.23 ±1.42 / 16.07 ms | 14.70 / 15.91 ±1.02 / 17.44 ms | 1.12x slower |
| QQuery 12 | 46.39 / 49.56 ±2.85 / 53.70 ms | 46.59 / 49.78 ±2.35 / 52.93 ms | no change |
| QQuery 13 | 38.61 / 39.09 ±0.35 / 39.52 ms | 37.73 / 38.36 ±0.47 / 39.17 ms | no change |
| QQuery 14 | 14.05 / 14.55 ±0.43 / 15.22 ms | 13.30 / 13.88 ±0.53 / 14.79 ms | no change |
| QQuery 15 | 22.79 / 23.77 ±0.62 / 24.72 ms | 21.87 / 21.93 ±0.05 / 22.00 ms | +1.08x faster |
| QQuery 16 | 27.54 / 28.36 ±0.89 / 29.44 ms | 28.28 / 28.62 ±0.37 / 29.34 ms | no change |
| QQuery 17 | 161.75 / 162.75 ±0.75 / 163.87 ms | 161.42 / 162.39 ±0.82 / 163.86 ms | no change |
| QQuery 18 | 359.51 / 371.92 ±20.52 / 412.89 ms | 359.23 / 371.58 ±22.78 / 417.12 ms | no change |
| QQuery 19 | 35.48 / 37.25 ±1.68 / 40.40 ms | 35.82 / 38.29 ±3.46 / 45.14 ms | no change |
| QQuery 20 | 52.43 / 53.09 ±0.69 / 54.27 ms | 51.79 / 52.69 ±0.50 / 53.09 ms | no change |
| QQuery 21 | 237.65 / 240.66 ±1.77 / 242.82 ms | 236.26 / 240.45 ±6.19 / 252.67 ms | no change |
| QQuery 22 | 38.43 / 54.37 ±13.22 / 74.36 ms | 49.39 / 53.63 ±5.71 / 64.71 ms | no change |

| Benchmark Summary | |
|---|---|
| Total Time (baseline) | 1895.71ms |
| Total Time (branch) | 1870.10ms |
| Average Time (baseline) | 86.17ms |
| Average Time (branch) | 85.00ms |
| Queries Faster | 4 |
| Queries Slower | 2 |
| Queries with No Change | 16 |
| Queries with Failure | 0 |

</details>
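
One rough way to sanity-check the "measuring noise" reading is to compare the difference in mean times against the reported spread. The sketch below is an illustration only and is not part of the benchmark tooling. It assumes each timing cell reads as `min / mean ±stddev / max ms`; the helper names `parse_cell` and `within_noise` and the factor `k=2.0` are hypothetical choices. It is a heuristic, not a significance test.

```python
import math
import re

# Assumed cell layout: "min / mean ±stddev / max ms".
CELL = re.compile(r"([\d.]+) / ([\d.]+) ±([\d.]+) / ([\d.]+)\s*ms")


def parse_cell(cell):
    """Split one timing cell into its four numbers (all in milliseconds)."""
    match = CELL.search(cell)
    if match is None:
        raise ValueError(f"unrecognized timing cell: {cell!r}")
    lo, mean, stddev, hi = map(float, match.groups())
    return {"min": lo, "mean": mean, "stddev": stddev, "max": hi}


def within_noise(baseline, branch, k=2.0):
    """Return True if the means differ by at most k combined standard deviations.

    k is an arbitrary threshold chosen for illustration.
    """
    b = parse_cell(baseline)
    c = parse_cell(branch)
    combined = math.hypot(b["stddev"], c["stddev"])
    return abs(b["mean"] - c["mean"]) <= k * combined


# clickbench_partitioned Query 3 is reported as "+1.46x faster", but the
# baseline stddev (68.08 ms) is larger than the 45.37 ms difference in means.
print(within_noise("100.46 / 143.26 ±68.08 / 279.05 ms",
                   "95.18 / 97.89 ±2.08 / 101.42 ms"))   # True

# clickbench_partitioned Query 2 has tight spreads on both sides, so the
# same heuristic does not classify its difference as noise.
print(within_noise("114.76 / 119.67 ±3.62 / 123.85 ms",
                   "93.35 / 96.31 ±1.79 / 98.98 ms"))    # False
```

The per-query standard deviation only reflects variation between iterations within one run, so a check like this says nothing about run-to-run or machine-to-machine variation.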