TeslaCN commented on code in PR #19791: URL: https://github.com/apache/shardingsphere/pull/19791#discussion_r935324284
########## docs/blog/content/material/2022_04_13_A_Practical_Guide_to_Apache_ShardingSphere's_HINT.en.md: ########## @@ -0,0 +1,382 @@ ++++ +title = "A Practical Guide to Apache ShardingSphere’s HINT" +weight = 49 +chapter = true ++++ + +## Background +[Apache ShardingSphere](https://shardingsphere.apache.org/) has gradually introduced various features based on practical user requirements, such as data sharding and read/write splitting. + +The data sharding feature contains many practical sharding strategies such as Standard Sharding Strategy and Complex Sharding Strategy, and users can easily configure the corresponding sharding algorithms. + +When it comes to Read/Write Splitting, Apache ShardingSphere provides users with two types called Static and Dynamic, and abundant load balancing algorithms. + +Sharding and Read/Write Splitting functions of ShardingSphere are already very useful, but scenarios are ever-changing. + +Take a multi-tenant case as an example: a user expects to shard data according to the tenant to which the login account belongs, but the tenant information does not exist in every business SQL. In this case, the algorithm for extracting sharding fields from SQL is not feasible. + +Additionally, in most read/write splitting scenarios, users want to route queries to the secondary database for execution, but in some scenarios with a requirement for real-time operations, users want to route SQL to the primary database for execution. Currently, read/write splitting cannot meet business requirements. + +Considering the above-mentioned pain points, Apache ShardingSphere created the `Hint` function to allow users to utilize different logic rather than SQL to implement forced routing or sharding. + +Currently, ShardingSphere provides users with two `Hint` methods. One is a manual programming method with Java API and uses `HintManager` for forced routing and sharding. This method is very friendly to applications programmed with JDBC because developers don’t need to write too much code and can easily implement SQL-independent sharding or forced routing functions. + +Based on distributed SQL ([DistSQL](https://opensource.com/article/21/9/distsql), ShardingSphere designed `SQL HINT` and `DistSQL HINT` to provide users with sharding and forced routing functions that can be implemented without coding. The method is more friendly to database administrators (DBAs). + +Next, let’s take a close look at the two methods. + +## Manual Programming Based on HintManager + +ShardingSphere can implement the functions of forced route and sharding via the `HintManager` objects. With `HintManager`, users can complete data sharding without SQL. It also allows users to shard data or force routing more flexibly, greatly expanding user scenarios. + +At the moment, with the help of `HintManager`, users can utilize ShardingSphere’s built-in or custom `Hint` algorithms to implement the sharding function, and can set specified data source or force primary database to do read/write splitting to implement the forced routing function. + +I’d like to explain its basic implementation principle first to help you gain a better understanding of `HintManager`. + +- **The Implementation of HintManager** +The code snippet below can help you quickly understand the principle of `HintManager`. + +``` +@NoArgsConstructor(access = AccessLevel.PRIVATE) +public final class HintManager implements AutoCloseable { + + private static final ThreadLocal<HintManager> HINT_MANAGER_HOLDER = new ThreadLocal<>(); +} +``` + +As shown above, ShardingSphere implements the function of `HintManager` with `ThreadLocal`: as long as they are in the same thread, user’s sharding settings are preserved. Therefore, the user only needs to call relevant `HintManager` functions before executing SQL statements, and then ShardingSphere can obtain the sharding or mandatory routing conditions set by the user in the current thread so as to perform sharding or routing operations. + +Next, let’s learn how to use it. + +- **How to Use HitManager** + +1. Use `HINT` for Sharding +To use the `Hint Sharding Algorithm`, users are required to implement the interface `org.apache.shardingsphere.sharding.api.sharding.hint.HintShardingAlgorithm`. When Apache ShardingSphere performs routing, it will obtain shard values from `HintManager` for routing operations. + +> **The configuration is as follows:** + +``` +rules: +- !SHARDING + tables: + t_order: + actualDataNodes: demo_ds_${0..1}.t_order_${0..1} + databaseStrategy: + hint: + algorithmClassName: xxx.xxx.xxx.HintXXXAlgorithm + tableStrategy: + hint: + algorithmClassName: xxx.xxx.xxx.HintXXXAlgorithm + defaultTableStrategy: + none: + defaultKeyGenerateStrategy: + type: SNOWFLAKE + column: order_id + +props: + sql-show: true +``` + +> **Get the HintManager instance:** + +``` +HintManager hintManager = HintManager.getInstance(); +``` + +> **Add shard key:** + +- Use `hintManager.addDatabaseShardingValue` to add data source shard key +- `hintManager.addTableShardingValue` is used to add table shard key + +Note: In the case of database sharding without table sharding, when using HINT to force routing to a database shard, you can use `hintManager.setDatabaseShardingValue` to add `Shard`. + +> **Delete shard key:** + +`Shard Key` is stored in `ThreadLocal` so you need to call `hintManager.close()` at the end of the operation to clear the content in `ThreadLocal` + +> **The complete code snippet example is as follows:** + +``` +String sql = "SELECT * FROM t_order"; +try (HintManager hintManager = HintManager.getInstance(); + Connection conn = dataSource.getConnection(); + PreparedStatement preparedStatement = conn.prepareStatement(sql)) { + hintManager.addDatabaseShardingValue("t_order", 1); + hintManager.addTableShardingValue("t_order", 2); + try (ResultSet rs = preparedStatement.executeQuery()) { + while (rs.next()) { + // ... + } + } +} + +String sql = "SELECT * FROM t_order"; +try (HintManager hintManager = HintManager.getInstance(); + Connection conn = dataSource.getConnection(); + PreparedStatement preparedStatement = conn.prepareStatement(sql)) { + hintManager.setDatabaseShardingValue(3); + try (ResultSet rs = preparedStatement.executeQuery()) { + while (rs.next()) { + // ... + } + } +} +``` + +2. Use `HINT` to `Force Primary Database Route` + +> Obtain HintManager + +It is the same as HINT-based Data Sharding described above. + +> Set Primary Database Route + +Use hintManager.setWriteRouteOnly to complete setting. + +> Clear Shard Key Value + +It is the same as HINT-based Data Sharding described above. + +> The complete code snippet example is as follows: + +``` +String sql = "SELECT * FROM t_order"; +try (HintManager hintManager = HintManager.getInstance(); + Connection conn = dataSource.getConnection(); + PreparedStatement preparedStatement = conn.prepareStatement(sql)) { + hintManager.setWriteRouteOnly(); + try (ResultSet rs = preparedStatement.executeQuery()) { + while (rs.next()) { + // ... + } + } +} +``` + +3. Use `HINT` to Implement Route to Specified Database + +> Obtain HintManager + +It is the same as HINT-based Data Sharding described above. + +> Set Route to Specified Database + +Use `hintManager.setWriteRouteOnly` to set database name. + +> The complete code snippet example is as follows: + +``` +String sql = "SELECT * FROM t_order"; +try (HintManager hintManager = HintManager.getInstance(); + Connection conn = dataSource.getConnection(); + PreparedStatement preparedStatement = conn.prepareStatement(sql)) { + hintManager.setDataSourceName("ds_0"); + try (ResultSet rs = preparedStatement.executeQuery()) { + while (rs.next()) { + // ... + } + } +} +``` + +> Delete Forced Route Value + +It is the same as HINT-based Data Sharding described above. + +After understanding the manual programming method based on `HintManager`, let’s take a look at another HINT solution provided by ShardingSphere based on distributed SQL. + +## DistSQL Based HINT +DistSQL HINT provided by Apache ShardingSphere is composed of two functions: one is called `SQL HINT` that is based on SQL annotations, and the other is the function that acts on `HintManager` through DistSQL implementation. + +## SQL HINT + +`SQL HINT` is a HINT method to implement forced routing by adding annotations to SQL statements, reducing the cost of code modification for users. This means that it is not subjected to the limitations of Java API, and is available in both ShardingSphere-JDBC and ShardingSphere-Proxy. + +Take the following SQL statement as an example. Even if the user configures the relevant sharding algorithm for t_order, the SQL statement will be directly executed on the database ds_0 and the execution result will be returned. +``` +/* ShardingSphere hint: dataSourceName=ds_0 */ +SELECT * FROM t_order; +``` + +By means of annotations, we can easily send the SQL statement directly to the specified database for execution with no need for considering other sharding logic. + +Taking the multi-tenant scenario as an example. Users do not need to configure complex database sharding logic or modify business logic any more, but only need to add the specified database to the annotation. + +Next, I’d like to explain the implementation principle of `SQL HINT`. + +- **The Implementation of SQL HINT** + +If you’ve heard of Apache ShardingSphere before, you must be familiar with its SQL Parser engine. The first step to implementing SQL HINT is to extract SQL annotations. + +With Access Channel in ANTLR4, SQL annotation can be sent to a specific hidden channel separately. ShardingSphere also uses this function to extract annotations in the hidden channel while generating the parsing result. + +The specific implementation is shown in the following code snippet: + +- Feed SQL comments into the hidden channel: +``` +lexer grammar Comments; + +import Symbol; + +BLOCK_COMMENT: '/*' .*? '*/' -> channel(HIDDEN); +INLINE_COMMENT: (('-- ' | '#') ~[\r\n]* ('\r'? '\n' | EOF) | '--' ('\r'? '\n' | EOF)) -> channel(HIDDEN); +``` + +- Access the syntax tree and add the extraction of the annotation + +``` +public <T> T visit(final ParseContext parseContext) { + ParseTreeVisitor<T> visitor = SQLVisitorFactory.newInstance(databaseType, visitorType, SQLVisitorRule.valueOf(parseContext.getParseTree().getClass()), props); + T result = parseContext.getParseTree().accept(visitor); + appendSQLComments(parseContext, result); + return result; +} + +private <T> void appendSQLComments(final ParseContext parseContext, final T visitResult) { + if (!parseContext.getHiddenTokens().isEmpty() && visitResult instanceof AbstractSQLStatement) { + Collection<CommentSegment> commentSegments = parseContext.getHiddenTokens().stream().map(each -> new CommentSegment(each.getText(), each.getStartIndex(), each.getStopIndex())) + .collect(Collectors.toList()); + ((AbstractSQLStatement) visitResult).getCommentSegments().addAll(commentSegments); + } +} +``` + +After extracting the SQL annotation information, we need to perform related mandatory routing based on the information. For routing, it is normal to use Apache ShardingSphere’s Router engine. + +We have made some modifications for HINT on the Router engine. + +``` +public RouteContext route(final LogicSQL logicSQL, final ShardingSphereMetaData metaData) { + RouteContext result = new RouteContext(); + Optional<String> dataSourceName = findDataSourceByHint(logicSQL.getSqlStatementContext(), metaData.getResource().getDataSources()); + if (dataSourceName.isPresent()) { + result.getRouteUnits().add(new RouteUnit(new RouteMapper(dataSourceName.get(), dataSourceName.get()), Collections.emptyList())); + return result; + } + for (Entry<ShardingSphereRule, SQLRouter> entry : routers.entrySet()) { + if (result.getRouteUnits().isEmpty()) { + result = entry.getValue().createRouteContext(logicSQL, metaData, entry.getKey(), props); + } else { + entry.getValue().decorateRouteContext(result, logicSQL, metaData, entry.getKey(), props); + } + } + if (result.getRouteUnits().isEmpty() && 1 == metaData.getResource().getDataSources().size()) { + String singleDataSourceName = metaData.getResource().getDataSources().keySet().iterator().next(); + result.getRouteUnits().add(new RouteUnit(new RouteMapper(singleDataSourceName, singleDataSourceName), Collections.emptyList())); + } + return result; +} +``` + + +ShardingSphere first finds SQL annotations that meet the definition, and after verification, it will directly return the routing result specified by the user, thus implementing the forced routing function. + +Next, I’d like to showcase how to use `SQL HINT`. + +- **How to Use SQL HINT** +`SQL HINT` is easy to use with ShardingSphere-JDBC and ShardingSphere-Proxy. + +Step 1: Turn on the annotation parser and set `sqlCommentParseEnabled` to true. + +Step 2: Add SQL comments. Currently `SQL HINT` supports specifying data source routing and primary database routing. + +- Data source-specified Routing: currently only supports routing to one data source. The comment format only supports `/* */` for the time being and starts with `ShardingSphere hint`: with the attribute name `dataSourceName`. + +``` Review Comment: ```suggestion ```sql ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
