Fottas commented on issue #36454:
URL: 
https://github.com/apache/shardingsphere/issues/36454#issuecomment-3609814922

   ### 1. Parameter filtering modeling
   
   Current tokens can only rewrite SQL via `toString(RouteUnit)`, but cannot 
tell the rewrite engine which parameters to remove. For IN optimization, I need 
both capabilities: rewrite SQL per route AND filter parameters per route.
   
   That's why I added `ParameterFilterable` interface in `infra.rewrite.core` - 
it's an extension point that lets tokens communicate their parameter filtering 
needs to the engine. Currently designed for IN optimization, but the mechanism 
is generic enough for other scenarios if needed.
   
   ### 2. Compatibility with existing token generators
   
   The interface is optional. Existing tokens don't implement it, so they're 
completely unaffected. When no filterable tokens exist, the engine returns 
original parameters unchanged.
   
   ### 3. Custom algorithm compatibility
   
   I don't modify any SPI interfaces. For each IN value, I call the algorithm's 
existing `doSharding()` method (same as routing does) to determine target 
tables, then map them to route units. If the algorithm returns null or empty, 
the value is marked as orphan and excluded from all routes.
   
   Since I'm using the same SPI that routing uses, any algorithm that works 
with routing will work with this optimization.
   
   ### 4. Complex SQL support
   
   The optimization works at the parameter level, not SQL structure level. It 
applies to any query using `StandardParameterBuilder` (all non-batch queries 
including subqueries and JOINs), regardless of SQL complexity.
   
   In the token generator, I currently support `StandardShardingAlgorithm` and 
`ComplexKeysShardingAlgorithm`. For each IN value, I invoke the algorithm to 
determine target tables. For other strategy types, the value is included in all 
routes (original behavior), ensuring compatibility without requiring changes.
   
   ---
   
   ## PR Split Plan
   
   ### PR 1: Add ParameterFilterable interface
   
   Introduce the parameter filtering extension point in `infra.rewrite.core`. 
This interface allows tokens to declare which parameters should be removed for 
each route, enabling route-aware parameter filtering in the rewrite engine.
   
   ---
   
   ### PR 2: Integrate parameter filtering into RouteSQLRewriteEngine
   
   Implement the parameter filtering mechanism in the rewrite engine. When 
processing each route, the engine checks for filterable tokens and applies 
parameter filtering accordingly. When no filterable tokens exist, it maintains 
original behavior with zero overhead.
   
   ---
   
   ### PR 3: Add data structures (ShardingInPredicateValue and Token)
   
   Introduce the data model for IN predicate optimization. 
`ShardingInPredicateValue` represents a single IN value with its target route 
information. `ShardingInPredicateToken` generates optimized SQL per route and 
implements parameter filtering, with single-value optimization (IN → =) for 
better database performance.
   
   ---
   
   ### PR 4: Simple optimization - literal values only
   
   First end-to-end working version. The token generator detects IN predicates 
on sharding keys, calls sharding algorithms to determine target routes for each 
value, and generates tokens with route distribution information. This PR 
handles only literal values to keep the change manageable.
   
   ---
   
   ### PR 5: Add parameter marker support
   
   Extend the generator to support parameter markers (`?`) in IN predicates. 
Implement `ParametersAware` interface to access bound parameters, enabling 
optimization for prepared statements.
   
   ---
   
   ### PR 6: Complex algorithm support
   
   Extend support to `ComplexKeysShardingAlgorithm` for multi-column sharding 
keys. The generator builds appropriate sharding values and invokes the 
algorithm to determine target routes.
   
   ---
   
   ### PR 7: Nested expression support
   
   Support IN predicates nested in AND/OR expressions. The generator 
recursively extracts IN predicates from binary operations, handling arbitrary 
nesting depth and multiple IN predicates in the same query.
   
   ---
   
   @terrymanu  Thanks again for the detailed feedback. I've tried to address 
the architectural concerns you raised and broken down the work into smaller 
PRs. 
   
   I'm not entirely sure if this is what you're looking for - if I've 
misunderstood anything or if you'd like me to approach it differently, please 
let me know. Happy to adjust the plan based on your guidance.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to