Fottas commented on issue #36454: URL: https://github.com/apache/shardingsphere/issues/36454#issuecomment-3609814922
### 1. Parameter filtering modeling Current tokens can only rewrite SQL via `toString(RouteUnit)`, but cannot tell the rewrite engine which parameters to remove. For IN optimization, I need both capabilities: rewrite SQL per route AND filter parameters per route. That's why I added `ParameterFilterable` interface in `infra.rewrite.core` - it's an extension point that lets tokens communicate their parameter filtering needs to the engine. Currently designed for IN optimization, but the mechanism is generic enough for other scenarios if needed. ### 2. Compatibility with existing token generators The interface is optional. Existing tokens don't implement it, so they're completely unaffected. When no filterable tokens exist, the engine returns original parameters unchanged. ### 3. Custom algorithm compatibility I don't modify any SPI interfaces. For each IN value, I call the algorithm's existing `doSharding()` method (same as routing does) to determine target tables, then map them to route units. If the algorithm returns null or empty, the value is marked as orphan and excluded from all routes. Since I'm using the same SPI that routing uses, any algorithm that works with routing will work with this optimization. ### 4. Complex SQL support The optimization works at the parameter level, not SQL structure level. It applies to any query using `StandardParameterBuilder` (all non-batch queries including subqueries and JOINs), regardless of SQL complexity. In the token generator, I currently support `StandardShardingAlgorithm` and `ComplexKeysShardingAlgorithm`. For each IN value, I invoke the algorithm to determine target tables. For other strategy types, the value is included in all routes (original behavior), ensuring compatibility without requiring changes. --- ## PR Split Plan ### PR 1: Add ParameterFilterable interface Introduce the parameter filtering extension point in `infra.rewrite.core`. This interface allows tokens to declare which parameters should be removed for each route, enabling route-aware parameter filtering in the rewrite engine. --- ### PR 2: Integrate parameter filtering into RouteSQLRewriteEngine Implement the parameter filtering mechanism in the rewrite engine. When processing each route, the engine checks for filterable tokens and applies parameter filtering accordingly. When no filterable tokens exist, it maintains original behavior with zero overhead. --- ### PR 3: Add data structures (ShardingInPredicateValue and Token) Introduce the data model for IN predicate optimization. `ShardingInPredicateValue` represents a single IN value with its target route information. `ShardingInPredicateToken` generates optimized SQL per route and implements parameter filtering, with single-value optimization (IN → =) for better database performance. --- ### PR 4: Simple optimization - literal values only First end-to-end working version. The token generator detects IN predicates on sharding keys, calls sharding algorithms to determine target routes for each value, and generates tokens with route distribution information. This PR handles only literal values to keep the change manageable. --- ### PR 5: Add parameter marker support Extend the generator to support parameter markers (`?`) in IN predicates. Implement `ParametersAware` interface to access bound parameters, enabling optimization for prepared statements. --- ### PR 6: Complex algorithm support Extend support to `ComplexKeysShardingAlgorithm` for multi-column sharding keys. The generator builds appropriate sharding values and invokes the algorithm to determine target routes. --- ### PR 7: Nested expression support Support IN predicates nested in AND/OR expressions. The generator recursively extracts IN predicates from binary operations, handling arbitrary nesting depth and multiple IN predicates in the same query. --- @terrymanu Thanks again for the detailed feedback. I've tried to address the architectural concerns you raised and broken down the work into smaller PRs. I'm not entirely sure if this is what you're looking for - if I've misunderstood anything or if you'd like me to approach it differently, please let me know. Happy to adjust the plan based on your guidance. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
