xiangfu0 commented on PR #17247:
URL: https://github.com/apache/pinot/pull/17247#issuecomment-3817847209
> Please run a benchmark to quantify the performance overhead of enabling tracking

I ran one round of the perf benchmark and saw no significant change:
# Distinct MSQE Tracking Overhead
## Run summary
- Date: 2026-01-16 14:45:53
- Benchmark: org.apache.pinot.perf.BenchmarkDistinctQueriesMSQE
- Dataset rows per segment: 1500000 (2 segments)
- Scenarios: EXP(0.001), EXP(0.5), EXP(0.999)
- JMH: 1.37
- JDK: OpenJDK 17.0.15
- Warmup: 5 x 1s, Measurement: 10 x 1s, Forks: 3
## Command
```bash
JAVA_HOME=/opt/homebrew/Cellar/openjdk@17/17.0.15/libexec/openjdk.jdk/Contents/Home \
PATH=/opt/homebrew/Cellar/openjdk@17/17.0.15/libexec/openjdk.jdk/Contents/Home/bin:$PATH \
JAVA_TOOL_OPTIONS="--add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED" \
java -jar pinot-perf/target/benchmarks.jar BenchmarkDistinctQueriesMSQE \
-p _scenario='EXP(0.001),EXP(0.5),EXP(0.999)' \
-p _trackingMode=enabled,disabled -wi 5 -i 10 -f 3 \
-rf json -rff /tmp/jmh-distinct-msqe.json
```
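The command above writes its results to `/tmp/jmh-distinct-msqe.json`. As a rough sketch (not the script used to produce the table below), the enabled and disabled runs could be paired up as follows. It assumes the standard JMH JSON layout, where each entry carries `benchmark`, `params`, and `primaryMetric` with `score` and `scoreError`. The helper names `pair_by_tracking_mode` and `load_pairs` are hypothetical, and entries are keyed on every parameter except `_trackingMode` so that no other parameter name has to be guessed.

```python
import json
from collections import defaultdict


def pair_by_tracking_mode(results):
    """Group JMH entries as {key: {"enabled": (score, err), "disabled": (score, err)}}.

    The key is the benchmark name plus all parameters other than
    _trackingMode, so the enabled/disabled runs of one query and
    scenario end up under the same key.
    """
    pairs = defaultdict(dict)
    for entry in results:
        params = dict(entry.get("params", {}))
        mode = params.pop("_trackingMode")  # "enabled" or "disabled"
        key = (entry["benchmark"], tuple(sorted(params.items())))
        metric = entry["primaryMetric"]
        pairs[key][mode] = (metric["score"], metric["scoreError"])
    return dict(pairs)


def load_pairs(path):
    """Read a JMH JSON result file (-rf json -rff <path>) and pair its entries."""
    with open(path) as f:
        return pair_by_tracking_mode(json.load(f))
```

Calling `load_pairs("/tmp/jmh-distinct-msqe.json")` would then give one entry per scenario/query pair, each holding the score and error for both tracking modes.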
## Results (avg ms/op)
| Scenario | Query | Enabled (ms ± err) | Disabled (ms ± err) | Delta (ms) | Delta (%) | Within combined err? |
|---|---|---:|---:|---:|---:|:---:|
| EXP(0.001) | SELECT DISTINCT INT_COL FROM MyTable LIMIT 100000 | 4.401 ± 0.313 | 4.505 ± 0.372 | -0.104 | -2.3% | yes |
| EXP(0.001) | SELECT DISTINCT INT_COL FROM MyTable ORDER BY INT_COL DESC LIMIT 100000 | 5.897 ± 0.419 | 5.730 ± 0.331 | 0.167 | 2.9% | yes |
| EXP(0.001) | SELECT DISTINCT INT_COL, LOW_CARDINALITY_STRING_COL FROM MyTable LIMIT 100000 | 115.513 ± 6.820 | 109.707 ± 2.066 | 5.806 | 5.3% | yes |
| EXP(0.001) | SELECT DISTINCT LOW_CARDINALITY_STRING_COL FROM MyTable LIMIT 1000 | 2.513 ± 0.315 | 2.396 ± 0.356 | 0.117 | 4.9% | yes |
| EXP(0.001) | SELECT DISTINCT RAW_STRING_COL FROM MyTable LIMIT 100000 | 99.181 ± 1.644 | 97.187 ± 1.633 | 1.995 | 2.1% | yes |
| EXP(0.001) | SELECT DISTINCT RAW_STRING_COL FROM MyTable WHERE LOW_CARDINALITY_STRING_COL = 'value1' LIMIT 100000 | 32.749 ± 0.623 | 33.785 ± 1.160 | -1.036 | -3.1% | yes |
| EXP(0.5) | SELECT DISTINCT INT_COL FROM MyTable LIMIT 100000 | 2.546 ± 0.316 | 2.562 ± 0.365 | -0.015 | -0.6% | yes |
| EXP(0.5) | SELECT DISTINCT INT_COL FROM MyTable ORDER BY INT_COL DESC LIMIT 100000 | 2.527 ± 0.408 | 2.483 ± 0.363 | 0.045 | 1.8% | yes |
| EXP(0.5) | SELECT DISTINCT INT_COL, LOW_CARDINALITY_STRING_COL FROM MyTable LIMIT 100000 | 42.006 ± 0.785 | 44.234 ± 3.330 | -2.229 | -5.0% | yes |
| EXP(0.5) | SELECT DISTINCT LOW_CARDINALITY_STRING_COL FROM MyTable LIMIT 1000 | 2.458 ± 0.348 | 2.584 ± 0.261 | -0.126 | -4.9% | yes |
| EXP(0.5) | SELECT DISTINCT RAW_STRING_COL FROM MyTable LIMIT 100000 | 69.407 ± 2.116 | 70.811 ± 6.542 | -1.404 | -2.0% | yes |
| EXP(0.5) | SELECT DISTINCT RAW_STRING_COL FROM MyTable WHERE LOW_CARDINALITY_STRING_COL = 'value1' LIMIT 100000 | 17.463 ± 0.472 | 17.431 ± 0.384 | 0.033 | 0.2% | yes |
| EXP(0.999) | SELECT DISTINCT INT_COL FROM MyTable LIMIT 100000 | 2.407 ± 0.402 | 2.434 ± 0.312 | -0.027 | -1.1% | yes |
| EXP(0.999) | SELECT DISTINCT INT_COL FROM MyTable ORDER BY INT_COL DESC LIMIT 100000 | 2.446 ± 0.426 | 2.690 ± 0.425 | -0.243 | -9.0% | yes |
| EXP(0.999) | SELECT DISTINCT INT_COL, LOW_CARDINALITY_STRING_COL FROM MyTable LIMIT 100000 | 41.789 ± 0.519 | 43.100 ± 1.493 | -1.311 | -3.0% | yes |
| EXP(0.999) | SELECT DISTINCT LOW_CARDINALITY_STRING_COL FROM MyTable LIMIT 1000 | 2.534 ± 0.360 | 2.474 ± 0.353 | 0.060 | 2.4% | yes |
| EXP(0.999) | SELECT DISTINCT RAW_STRING_COL FROM MyTable LIMIT 100000 | 71.486 ± 2.025 | 70.870 ± 1.141 | 0.616 | 0.9% | yes |
| EXP(0.999) | SELECT DISTINCT RAW_STRING_COL FROM MyTable WHERE LOW_CARDINALITY_STRING_COL = 'value1' LIMIT 100000 | 18.834 ± 0.500 | 18.970 ± 0.660 | -0.137 | -0.7% | yes |
## Overhead evaluation
- Positive delta means tracking enabled is slower (overhead).
- Negative delta means tracking enabled is faster.
- For all scenario/query pairs, deltas remain within the combined 99.9% CI error bounds, so no significant difference is established.
## Notes
- JMH reported lingering Netty/async threads after completion; forks were force-terminated after the shutdown timeout.
- The run logged warnings about direct reserved memory; consider running with more direct memory if noise persists.