terrymanu commented on issue #28182:
URL:
https://github.com/apache/shardingsphere/issues/28182#issuecomment-3677691466
Issue Understanding
- Environment: ShardingSphere-Proxy 5.4.0. Database sharding by user_id
(ds_${user_id % 2}); table sharding by web_id (website_${web_id % 2},
ntable_${web_id % 2}); website/ntable are bound tables.
- Symptom: After running the sample DDL/DML, SELECT DISTINCT t1.id FROM
website t1 LEFT JOIN ntable t2 ON t1.id = t2.id ORDER BY id; returns 1..8 on
native MySQL single table, but only 1..4 via Proxy.
Root Cause
- Column id uses MySQL local AUTO_INCREMENT per physical shard and is not
globally unique.
- Sharding routes to ds_1 (all user_id = 1), and table shards split by
web_id, producing two physical tables. Each table generates id = 1..4; after
merge, DISTINCT collapses duplicates, leaving 1..4.
- The right table ntable is empty; the LEFT JOIN does not filter rows, so
the discrepancy is solely due to duplicate id values across shards.
Problem Analysis
- The config does not assign a distributed key to id; the Snowflake
generator is on web_id and inserts explicitly set it, so id remains physical
auto-increment.
- At the logical layer, id repeats across shards; any de-duplication,
ordering, or joins relying on id uniqueness diverge from single-database
behavior.
- This stems from sharding design: ShardingSphere does not merge
auto-increment sequences into a global unique stream; consistent results
require an explicit globally unique key or using the sharding key in
identifiers.
Conclusion
- This is expected behavior from the current sharding design, not a LEFT
JOIN bug. To align with single-database results, use a globally distributed key
for the primary identifier (e.g., apply Snowflake to the primary key and let
the engine generate it) or join/query using an identifier that includes the
sharding key, avoiding reliance on per-shard auto-increment id.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]