terrymanu commented on issue #36109:
URL:
https://github.com/apache/shardingsphere/issues/36109#issuecomment-3503268817
You are seeing duplicate rows when performing a LEFT JOIN between tableA
(a single table with 1000 records) and the sharded table tableB (split by year
into tableB_2021, tableB_2022, etc.): the result grows from 1000 to 2000
records.
Root Cause Analysis Requires More Information
Based on the information provided, we cannot accurately determine the root
cause because the database deployment architecture has a decisive impact on the
nature of the problem:
Case A: If all tables are in the same database instance
- May be a routing or result merging logic issue
Case B: If tables are distributed across different database instances
- The query will be processed by the SQL federation query engine
- The issue may be in the distributed JOIN algorithm of the federation
query engine
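As a hypothetical illustration of how a routing or merging problem could double the row count, the sketch below runs the LEFT JOIN against each shard separately and concatenates the results. It uses an in-memory SQLite database, and the column names (id, a_id, val) and the data distribution are assumptions, not taken from the issue. It is not a claim about what ShardingSphere actually does in this case.

```python
import sqlite3

# In-memory stand-in for a single database instance holding all tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE tableB_2021 (a_id INTEGER, val TEXT)")
conn.execute("CREATE TABLE tableB_2022 (a_id INTEGER, val TEXT)")

# Assumed data: 1000 rows in tableA, each matching exactly one shard row.
conn.executemany("INSERT INTO tableA VALUES (?)", [(i,) for i in range(1, 1001)])
conn.executemany("INSERT INTO tableB_2021 VALUES (?, ?)",
                 [(i, "2021") for i in range(1, 501)])
conn.executemany("INSERT INTO tableB_2022 VALUES (?, ?)",
                 [(i, "2022") for i in range(501, 1001)])


def left_join(shard):
    """LEFT JOIN tableA against one physical shard table."""
    sql = f"SELECT a.id, b.val FROM tableA a LEFT JOIN {shard} b ON a.id = b.a_id"
    return conn.execute(sql).fetchall()


# Naive merge: join per shard, then concatenate. Every tableA row appears
# once per shard, padded with NULL where that shard has no match.
naive_merge = left_join("tableB_2021") + left_join("tableB_2022")

# Logical result: join against the union of all shards.
logical = conn.execute(
    "SELECT a.id, b.val FROM tableA a LEFT JOIN "
    "(SELECT * FROM tableB_2021 UNION ALL SELECT * FROM tableB_2022) b "
    "ON a.id = b.a_id"
).fetchall()

null_rows = [row for row in naive_merge if row[1] is None]
print(len(naive_merge), len(logical), len(null_rows))  # 2000 1000 1000
```

If the extra rows in your result all have NULL values in the tableB columns, that pattern would point toward per-shard execution of the outer join rather than genuinely duplicated data.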
Key Information Required
Most Critical Configuration Information
```yaml
# Please provide your data source configuration
dataSources:
  # Which data sources are tableA, tableB_2021, tableB_2022 deployed on?

rules:
- !SHARDING
  tables:
    tableA:
      # tableA configuration
    tableB:
      actualDataNodes: ???
      tableStrategy:
        # What is the sharding strategy and sharding column?
```
Other Important Information
1. Database Architecture: Are tableA, tableB_2021, tableB_2022 in the same
database instance?
2. Federation Query Configuration: Have you enabled SQL federation query
functionality?
3. Sharding Rules: What is the sharding column for tableB?
4. Business Logic: Can records from tableA potentially match multiple
tableB shards simultaneously?
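To answer the last question, one option is to count how often each join key occurs across the shards. The sketch below does this against an in-memory SQLite database; the join column name a_id and the sample values are assumptions for illustration only. On a real deployment, the same GROUP BY query would be run directly against the physical tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tableB_2021 (a_id INTEGER);
CREATE TABLE tableB_2022 (a_id INTEGER);
-- Assumed sample data: keys 2 and 3 exist in both shards.
INSERT INTO tableB_2021 VALUES (1), (2), (3);
INSERT INTO tableB_2022 VALUES (2), (3), (4);
""")

# Keys with more than one row across all shards. A tableA row joined on such
# a key legitimately yields several result rows, even with a correct join.
multi_shard_keys = conn.execute("""
SELECT a_id, COUNT(*) AS matches
FROM (SELECT a_id FROM tableB_2021 UNION ALL SELECT a_id FROM tableB_2022)
GROUP BY a_id
HAVING COUNT(*) > 1
ORDER BY a_id
""").fetchall()

print(multi_shard_keys)  # [(2, 2), (3, 2)]
```

An empty result here would rule out multiple matches in the data as the source of the extra rows.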
Next Steps
1. Provide Configuration Information: Please supply the configuration and
architecture information requested above
2. Confirm Deployment Architecture: Clearly specify the deployment
location of each table (same database or cross-database)
3. Enable SQL Logging: Check the actual SQL statements and the execution
path in ShardingSphere
Special Note
Diagnosing this issue requires the details listed above, especially the data
source deployment architecture and the federation query configuration. Once we
have them, we can determine the root cause and suggest a solution.