terrymanu commented on issue #35625:
URL:
https://github.com/apache/shardingsphere/issues/35625#issuecomment-3642817232
## Issue Understanding
In an hourly INTERVAL sharding setup (~7 months × 31 days × 24 hours ≈
5,208 tables), TPS drops from ~10k to single digits when routing with the
sharding key. The suspected cause is DataNode creation in routeTables().
## Root Cause
The heavy cost is not DataNode construction but the INTERVAL sharding
algorithm’s time-slice iteration combined with full table-suffix matching.
getMatchedTables iterates from datetime-lower to datetime-upper and, for each
time slice, filters all available table names with endsWith (see
features/sharding/core/src/main/java/org/apache/shardingsphere/sharding/algorithm/sharding/datetime/IntervalShardingAlgorithm.java
lines 139-158). Time-slice count × table count yields ~5,208 × 5,208 ≈ 27M
string matches for this configuration. If datetime-upper is left at its
default of now(), or the configured window is wider, the iteration count
grows further. DataNode is just a value object; its cost is negligible
compared to the string matching.
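
To make the cost concrete, here is a minimal sketch of the "time slices × all tables" pattern described above. It is not the ShardingSphere source: the class name, the `t_order_` prefix, the `yyyyMMddHH` suffix format and the counter field are all assumptions made for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified illustration only; names and formats here are hypothetical.
final class IntervalScanSketch {

    // Assumed suffix format for hourly tables, e.g. t_order_2025010100.
    static final DateTimeFormatter SUFFIX = DateTimeFormatter.ofPattern("yyyyMMddHH");

    // Number of endsWith calls made by scan() so far.
    long comparisons;

    // Walks every hourly slice in [lower, upper] and checks every table name per slice.
    Set<String> scan(List<String> tables, LocalDateTime lower, LocalDateTime upper) {
        Set<String> matched = new LinkedHashSet<>();
        for (LocalDateTime slice = lower; !slice.isAfter(upper); slice = slice.plusHours(1)) {
            String suffix = slice.format(SUFFIX);
            for (String table : tables) {
                comparisons++;
                if (table.endsWith(suffix)) {
                    matched.add(table);
                }
            }
        }
        return matched;
    }

    // Generates `count` consecutive hourly table names starting at `first`.
    static List<String> hourlyTables(String prefix, LocalDateTime first, int count) {
        List<String> tables = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            tables.add(prefix + first.plusHours(i).format(SUFFIX));
        }
        return tables;
    }

    public static void main(String[] args) {
        LocalDateTime lower = LocalDateTime.of(2025, 1, 1, 0, 0);
        int tableCount = 5208;
        List<String> tables = hourlyTables("t_order_", lower, tableCount);
        IntervalScanSketch sketch = new IntervalScanSketch();
        Set<String> matched = sketch.scan(tables, lower, lower.plusHours(tableCount - 1));
        System.out.println(matched.size() + " tables matched, "
                + sketch.comparisons + " endsWith calls");
    }
}
```

The number of endsWith calls is the product of the slice count and the table count, independent of how many tables actually match.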
## Analysis
- Official docs (INTERVAL algorithm:
docs/document/content/user-manual/common-config/builtin-algorithm/sharding.en.md#interval-sharding-algorithm;
algorithm list: docs/document/content/dev-manual/sharding.en.md) show range
routing scans the time window and does not
index table suffixes.
- If the SQL lacks a sharding key or the sharding value's format does not
match, tableShardingValues is empty and the statement routes to all tables,
which inflates the number of matches.
- Changing the step unit to DAYS only reduces the number of time slices; the
“time slices × all tables” complexity remains.
- Generic optimization ideas (design-level) include building a suffix
index over available tables, formatting the suffix directly for precise
sharding, and pruning time slices/table coverage for range sharding. These
are design evolutions, not current defects.
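
The suffix-index idea in the last bullet could look roughly like the sketch below. It is a hypothetical design, not existing ShardingSphere code: it assumes a fixed-length `yyyyMMddHH` suffix, hourly steps, and at most one table per suffix, and every name in it is invented for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical design sketch; assumes fixed-length suffixes and one table per suffix.
final class SuffixIndexSketch {

    static final DateTimeFormatter SUFFIX = DateTimeFormatter.ofPattern("yyyyMMddHH");
    static final int SUFFIX_LENGTH = 10; // length of a formatted yyyyMMddHH value

    // Built once per set of available tables: suffix -> table name.
    static Map<String, String> buildIndex(Collection<String> tables) {
        Map<String, String> index = new HashMap<>();
        for (String table : tables) {
            if (table.length() >= SUFFIX_LENGTH) {
                index.put(table.substring(table.length() - SUFFIX_LENGTH), table);
            }
        }
        return index;
    }

    // Precise sharding: format the suffix directly and do a single lookup.
    static Optional<String> routePrecise(Map<String, String> index, LocalDateTime value) {
        return Optional.ofNullable(index.get(value.format(SUFFIX)));
    }

    // Range sharding: one lookup per slice instead of one scan of all tables per slice.
    static List<String> routeRange(Map<String, String> index, LocalDateTime lower, LocalDateTime upper) {
        List<String> result = new ArrayList<>();
        LocalDateTime slice = lower.truncatedTo(ChronoUnit.HOURS);
        for (; !slice.isAfter(upper); slice = slice.plusHours(1)) {
            String table = index.get(slice.format(SUFFIX));
            if (table != null) {
                result.add(table);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> index = buildIndex(
                List.of("t_order_2025010100", "t_order_2025010101", "t_order_2025010102"));
        System.out.println(routePrecise(index, LocalDateTime.of(2025, 1, 1, 1, 30)));
        System.out.println(routeRange(index,
                LocalDateTime.of(2025, 1, 1, 0, 15), LocalDateTime.of(2025, 1, 1, 1, 45)));
    }
}
```

Under these assumptions a precise route costs one map lookup, and a range route costs one lookup per slice rather than one pass over every table name per slice.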
## Conclusion
This is the intended behavior of the INTERVAL algorithm, not a DataNode
bug. Recommended usage: explicitly narrow datetime-lower/datetime-upper to the
actual table coverage, use a coarser interval or a shorter time span to fit
throughput needs, and ensure the SQL carries a sharding key that matches
datetime-pattern to avoid full-table routing.
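
As a rough way to size the effect of these recommendations, the following sketch estimates the slice count for a given window and step, and the resulting number of suffix checks. The class and method names are hypothetical, and the formula assumes an inclusive loop from lower to upper, as described in the root cause.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Back-of-the-envelope estimate; names are hypothetical, not a ShardingSphere API.
final class SliceCostEstimate {

    // Slices visited when stepping from lower to upper (inclusive) by `amount` units.
    static long sliceCount(LocalDateTime lower, LocalDateTime upper, ChronoUnit unit, long amount) {
        if (upper.isBefore(lower)) {
            return 0;
        }
        return unit.between(lower, upper) / amount + 1;
    }

    // Every slice is checked against every available table name.
    static long suffixChecks(long slices, long tables) {
        return slices * tables;
    }

    public static void main(String[] args) {
        LocalDateTime lower = LocalDateTime.of(2025, 1, 1, 0, 0);
        LocalDateTime upper = lower.plusHours(5207); // 5,208 hourly slices in total
        long tables = 5208;

        long hourly = sliceCount(lower, upper, ChronoUnit.HOURS, 1);
        long daily = sliceCount(lower, upper, ChronoUnit.DAYS, 1);
        long oneDay = sliceCount(lower, lower.plusHours(23), ChronoUnit.HOURS, 1);

        System.out.println("hourly, full window: " + suffixChecks(hourly, tables));
        System.out.println("daily, full window:  " + suffixChecks(daily, tables));
        System.out.println("hourly, one day:     " + suffixChecks(oneDay, tables));
    }
}
```

Narrowing the window or coarsening the step shrinks only the slice factor; the table factor stays the same unless the set of available tables also shrinks.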