[GitHub] [shardingsphere] TeslaCN commented on a diff in pull request #19791: feat(blog): add new blogs

GitBox Tue, 02 Aug 2022 02:26:03 -0700


TeslaCN commented on code in PR #19791:
URL: https://github.com/apache/shardingsphere/pull/19791#discussion_r935324284



##########
docs/blog/content/material/2022_04_13_A_Practical_Guide_to_Apache_ShardingSphere's_HINT.en.md:
##########
@@ -0,0 +1,382 @@
++++ 
+title = "A Practical Guide to Apache ShardingSphere’s HINT"
+weight = 49
+chapter = true 
++++
+
+## Background
+[Apache ShardingSphere](https://shardingsphere.apache.org/) has gradually 
introduced various features based on practical user requirements, such as data 
sharding and read/write splitting.
+
+The data sharding feature contains many practical sharding strategies such as 
Standard Sharding Strategy and Complex Sharding Strategy, and users can easily 
configure the corresponding sharding algorithms.
+
+When it comes to Read/Write Splitting, Apache ShardingSphere provides users 
with two types called Static and Dynamic, and abundant load balancing 
algorithms.
+
+Sharding and Read/Write Splitting functions of ShardingSphere are already very 
useful, but scenarios are ever-changing.
+
+Take a multi-tenant case as an example: a user expects to shard data according 
to the tenant to which the login account belongs, but the tenant information 
does not exist in every business SQL. In this case, the algorithm for 
extracting sharding fields from SQL is not feasible.
+
+Additionally, in most read/write splitting scenarios, users want to route 
queries to the secondary database for execution, but in some scenarios with a 
requirement for real-time operations, users want to route SQL to the primary 
database for execution. In such cases, read/write splitting alone cannot meet 
business requirements.
+
+Considering the above-mentioned pain points, Apache ShardingSphere created the 
`Hint` function to allow users to utilize different logic rather than SQL to 
implement forced routing or sharding.
+
+Currently, ShardingSphere provides users with two `Hint` methods. One is a 
manual programming method with Java API and uses `HintManager` for forced 
routing and sharding. This method is very friendly to applications programmed 
with JDBC because developers don’t need to write too much code and can easily 
implement SQL-independent sharding or forced routing functions.
+
+Based on distributed SQL 
([DistSQL](https://opensource.com/article/21/9/distsql)), ShardingSphere 
designed `SQL HINT` and `DistSQL HINT` to provide users with sharding and 
forced routing functions that can be implemented without coding. This method is 
more friendly to database administrators (DBAs).
+
+Next, let’s take a close look at the two methods.
+
+## Manual Programming Based on HintManager
+
+ShardingSphere can implement the functions of forced route and sharding via 
the `HintManager` objects. With `HintManager`, users can complete data sharding 
without SQL. It also allows users to shard data or force routing more flexibly, 
greatly expanding user scenarios.
+
+At the moment, with the help of `HintManager`, users can utilize 
ShardingSphere’s built-in or custom `Hint` algorithms to implement the sharding 
function, and can specify a data source or force read/write splitting to use 
the primary database to implement the forced routing function.
+
+I’d like to explain its basic implementation principle first to help you gain 
a better understanding of `HintManager`.
+
+- **The Implementation of HintManager**
+The code snippet below can help you quickly understand the principle of 
`HintManager`.
+
+```
+@NoArgsConstructor(access = AccessLevel.PRIVATE)
+public final class HintManager implements AutoCloseable {
+
+    private static final ThreadLocal<HintManager> HINT_MANAGER_HOLDER = new 
ThreadLocal<>();
+}
+```
+
+As shown above, ShardingSphere implements the function of `HintManager` with 
`ThreadLocal`: as long as they are in the same thread, user’s sharding settings 
are preserved. Therefore, the user only needs to call relevant `HintManager` 
functions before executing SQL statements, and then ShardingSphere can obtain 
the sharding or mandatory routing conditions set by the user in the current 
thread so as to perform sharding or routing operations.
+
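The thread-bound behavior described above can be sketched in a few lines. This is a simplified stand-in written for illustration, not ShardingSphere's actual `HintManager`: the class name `ThreadLocalHintSketch`, its fields, and the "one holder per thread" check are all assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in; NOT ShardingSphere's HintManager.
final class ThreadLocalHintSketch implements AutoCloseable {

    // One holder per thread: values set here are invisible to other threads.
    private static final ThreadLocal<ThreadLocalHintSketch> HOLDER = new ThreadLocal<>();

    private final List<Comparable<?>> databaseShardingValues = new ArrayList<>();

    private ThreadLocalHintSketch() {
    }

    // Binds a fresh holder to the current thread (assumed design: fail if one is already bound).
    static ThreadLocalHintSketch getInstance() {
        if (HOLDER.get() != null) {
            throw new IllegalStateException("A hint is already bound to this thread; close it first.");
        }
        ThreadLocalHintSketch result = new ThreadLocalHintSketch();
        HOLDER.set(result);
        return result;
    }

    void addDatabaseShardingValue(final Comparable<?> value) {
        databaseShardingValues.add(value);
    }

    // What a router running later on the same thread would read.
    static List<Comparable<?>> currentDatabaseShardingValues() {
        ThreadLocalHintSketch current = HOLDER.get();
        return current == null
                ? Collections.emptyList()
                : Collections.unmodifiableList(current.databaseShardingValues);
    }

    // Clears the thread-local state so later statements on this thread are unaffected.
    @Override
    public void close() {
        HOLDER.remove();
    }
}
```

Because the sketch implements `AutoCloseable`, wrapping it in try-with-resources clears the thread-local state automatically, which is the same reason the examples later in this post open `HintManager` in a `try (...)` header.
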
+Next, let’s learn how to use it.
+
+- **How to Use HintManager**
+
+1. Use `HINT` for Sharding
+To use the `Hint Sharding Algorithm`, users are required to implement the 
interface 
`org.apache.shardingsphere.sharding.api.sharding.hint.HintShardingAlgorithm`. 
When Apache ShardingSphere performs routing, it will obtain shard values from 
`HintManager` for routing operations.
+
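To make the idea of a hint algorithm concrete, here is a minimal, self-contained sketch of the kind of logic such an algorithm might contain. It deliberately does not implement the real `HintShardingAlgorithm` interface; the class `ModuloHintSketch`, its `doSharding` signature, and the modulo rule are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical sketch of hint-based target selection; not the ShardingSphere interface.
final class ModuloHintSketch {

    private ModuloHintSketch() {
    }

    // Maps each hint value to one of the available targets by taking it modulo the target count.
    static Collection<String> doSharding(final Collection<String> availableTargetNames,
                                         final Collection<? extends Number> hintValues) {
        List<String> targets = new ArrayList<>(availableTargetNames);
        Collection<String> result = new LinkedHashSet<>();
        if (targets.isEmpty()) {
            return result;
        }
        for (Number each : hintValues) {
            // floorMod keeps the index non-negative even for negative hint values.
            int index = (int) Math.floorMod(each.longValue(), (long) targets.size());
            result.add(targets.get(index));
        }
        return result;
    }
}
```

Under this assumed rule, with targets `demo_ds_0` and `demo_ds_1`, a hint value of 1 selects `demo_ds_1`, and with targets `t_order_0` and `t_order_1`, a hint value of 2 selects `t_order_0`. The key point is that the input comes from the hint, not from any column in the SQL.
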
+> **The configuration is as follows:**
+
+```
+rules:
+- !SHARDING
+  tables:
+    t_order:
+      actualDataNodes: demo_ds_${0..1}.t_order_${0..1}
+      databaseStrategy:
+        hint:
+          algorithmClassName: xxx.xxx.xxx.HintXXXAlgorithm
+      tableStrategy:
+        hint:
+          algorithmClassName: xxx.xxx.xxx.HintXXXAlgorithm
+  defaultTableStrategy:
+    none:
+  defaultKeyGenerateStrategy:
+    type: SNOWFLAKE
+    column: order_id
+
+props:
+    sql-show: true
+```
+
+> **Get the HintManager instance:**
+
+```
+HintManager hintManager = HintManager.getInstance();
+```
+
+> **Add shard key:**
+
+- Use `hintManager.addDatabaseShardingValue` to add a data source shard key
+- Use `hintManager.addTableShardingValue` to add a table shard key
+
+Note: In the case of database sharding without table sharding, when using HINT 
to force routing to a database shard, you can use 
`hintManager.setDatabaseShardingValue` to set the shard value.
+
+> **Delete shard key:**
+
+`Shard Key` is stored in `ThreadLocal`, so you need to call 
`hintManager.close()` at the end of the operation to clear the content in 
`ThreadLocal`.
+
+> **The complete code snippet example is as follows:**
+
+```
+String sql = "SELECT * FROM t_order";
+try (HintManager hintManager = HintManager.getInstance();
+     Connection conn = dataSource.getConnection();
+     PreparedStatement preparedStatement = conn.prepareStatement(sql)) {
+    hintManager.addDatabaseShardingValue("t_order", 1);
+    hintManager.addTableShardingValue("t_order", 2);
+    try (ResultSet rs = preparedStatement.executeQuery()) {
+        while (rs.next()) {
+            // ...
+        }
+    }
+}
+
+// Separate example: database sharding only, without table sharding
+String sql = "SELECT * FROM t_order";
+try (HintManager hintManager = HintManager.getInstance();
+     Connection conn = dataSource.getConnection();
+     PreparedStatement preparedStatement = conn.prepareStatement(sql)) {
+    hintManager.setDatabaseShardingValue(3);
+    try (ResultSet rs = preparedStatement.executeQuery()) {
+        while (rs.next()) {
+            // ...
+        }
+    }
+}
+```
+
+2. Use `HINT` to `Force Primary Database Route`
+
+> Obtain HintManager
+
+It is the same as HINT-based Data Sharding described above.
+
+> Set Primary Database Route
+
+Use `hintManager.setWriteRouteOnly` to complete the setting.
+
+> Clear Shard Key Value
+
+It is the same as HINT-based Data Sharding described above.
+
+> The complete code snippet example is as follows:
+
+```
+String sql = "SELECT * FROM t_order";
+try (HintManager hintManager = HintManager.getInstance();
+     Connection conn = dataSource.getConnection();
+     PreparedStatement preparedStatement = conn.prepareStatement(sql)) {
+    hintManager.setWriteRouteOnly();
+    try (ResultSet rs = preparedStatement.executeQuery()) {
+        while (rs.next()) {
+            // ...
+        }
+    }
+}
+```
+
+3. Use `HINT` to Implement Route to Specified Database
+
+> Obtain HintManager
+
+It is the same as HINT-based Data Sharding described above.
+
+> Set Route to Specified Database
+
+Use `hintManager.setDataSourceName` to set the database name.
+
+> The complete code snippet example is as follows:
+
+```
+String sql = "SELECT * FROM t_order";
+try (HintManager hintManager = HintManager.getInstance();
+     Connection conn = dataSource.getConnection();
+     PreparedStatement preparedStatement = conn.prepareStatement(sql)) {
+    hintManager.setDataSourceName("ds_0");
+    try (ResultSet rs = preparedStatement.executeQuery()) {
+        while (rs.next()) {
+            // ...
+        }
+    }
+}
+```
+
+> Delete Forced Route Value
+
+It is the same as HINT-based Data Sharding described above.
+
+After understanding the manual programming method based on `HintManager`, 
let’s take a look at another HINT solution provided by ShardingSphere based on 
distributed SQL.
+
+## DistSQL Based HINT
+DistSQL HINT provided by Apache ShardingSphere is composed of two functions: 
one is called `SQL HINT` that is based on SQL annotations, and the other is the 
function that acts on `HintManager` through DistSQL implementation.
+
+## SQL HINT
+
+`SQL HINT` is a HINT method to implement forced routing by adding annotations 
to SQL statements, reducing the cost of code modification for users. This means 
that it is not subjected to the limitations of Java API, and is available in 
both ShardingSphere-JDBC and ShardingSphere-Proxy.
+
+Take the following SQL statement as an example. Even if the user configures 
the relevant sharding algorithm for t_order, the SQL statement will be directly 
executed on the database ds_0 and the execution result will be returned.
+```
+/* ShardingSphere hint: dataSourceName=ds_0 */
+SELECT * FROM t_order;
+```
+
+By means of annotations, we can easily send the SQL statement directly to the 
specified database for execution with no need for considering other sharding 
logic.
+
+Take the multi-tenant scenario as an example: users no longer need to 
configure complex database sharding logic or modify business logic, but only 
need to add the specified database to the annotation.
+
+Next, I’d like to explain the implementation principle of `SQL HINT`.
+
+- **The Implementation of SQL HINT**
+
+If you’ve heard of Apache ShardingSphere before, you must be familiar with its 
SQL Parser engine. The first step to implementing SQL HINT is to extract SQL 
annotations.
+
+With the lexer channel feature in ANTLR4, SQL annotations can be sent to a 
separate hidden channel. ShardingSphere uses this feature to extract 
annotations from the hidden channel while generating the parsing result.
+
+The specific implementation is shown in the following code snippet:
+
+- Feed SQL comments into the hidden channel:
+```
+lexer grammar Comments;
+
+import Symbol;
+
+BLOCK_COMMENT:  '/*' .*? '*/' -> channel(HIDDEN);
+INLINE_COMMENT: (('-- ' | '#') ~[\r\n]* ('\r'? '\n' | EOF) | '--' ('\r'? '\n' 
| EOF)) -> channel(HIDDEN);
+```
+
+- Access the syntax tree and add the extraction of the annotation
+
+```
+public <T> T visit(final ParseContext parseContext) {
+    ParseTreeVisitor<T> visitor = SQLVisitorFactory.newInstance(databaseType, 
visitorType, SQLVisitorRule.valueOf(parseContext.getParseTree().getClass()), 
props);
+    T result = parseContext.getParseTree().accept(visitor);
+    appendSQLComments(parseContext, result);
+    return result;
+}
+
+private <T> void appendSQLComments(final ParseContext parseContext, final T 
visitResult) {
+    if (!parseContext.getHiddenTokens().isEmpty() && visitResult instanceof 
AbstractSQLStatement) {
+        Collection<CommentSegment> commentSegments = 
parseContext.getHiddenTokens().stream().map(each -> new 
CommentSegment(each.getText(), each.getStartIndex(), each.getStopIndex()))
+                .collect(Collectors.toList());
+        ((AbstractSQLStatement) 
visitResult).getCommentSegments().addAll(commentSegments);
+    }
+}
+```
+
+After extracting the SQL annotation information, we need to perform related 
mandatory routing based on the information. For routing, it is normal to use 
Apache ShardingSphere’s Router engine.
+
+We have made some modifications for HINT on the Router engine.
+
+```
+public RouteContext route(final LogicSQL logicSQL, final 
ShardingSphereMetaData metaData) {
+    RouteContext result = new RouteContext();
+    Optional<String> dataSourceName = 
findDataSourceByHint(logicSQL.getSqlStatementContext(), 
metaData.getResource().getDataSources());
+    if (dataSourceName.isPresent()) {
+        result.getRouteUnits().add(new RouteUnit(new 
RouteMapper(dataSourceName.get(), dataSourceName.get()), 
Collections.emptyList()));
+        return result;
+    }
+    for (Entry<ShardingSphereRule, SQLRouter> entry : routers.entrySet()) {
+        if (result.getRouteUnits().isEmpty()) {
+            result = entry.getValue().createRouteContext(logicSQL, metaData, 
entry.getKey(), props);
+        } else {
+            entry.getValue().decorateRouteContext(result, logicSQL, metaData, 
entry.getKey(), props);
+        }
+    }
+    if (result.getRouteUnits().isEmpty() && 1 == 
metaData.getResource().getDataSources().size()) {
+        String singleDataSourceName = 
metaData.getResource().getDataSources().keySet().iterator().next();
+        result.getRouteUnits().add(new RouteUnit(new 
RouteMapper(singleDataSourceName, singleDataSourceName), 
Collections.emptyList()));
+    }
+    return result;
+}
+```
+
+
+ShardingSphere first finds SQL annotations that meet the definition, and after 
verification, it will directly return the routing result specified by the user, 
thus implementing the forced routing function.
+
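The "find the hint, verify it, return early" flow can be illustrated with a small self-contained sketch. It is an assumption-laden simplification rather than the project's code: it works on the raw SQL string with a regular expression instead of parsed comment segments, and the class `SqlHintSketch`, both method signatures, and the `fallback` parameter are hypothetical.

```java
import java.util.Collection;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the real router reads comment segments from the parsed statement.
final class SqlHintSketch {

    // Matches a block comment such as: /* ShardingSphere hint: dataSourceName=ds_0 */
    private static final Pattern HINT_PATTERN =
            Pattern.compile("/\\*\\s*ShardingSphere hint:\\s*dataSourceName=(\\w+)\\s*\\*/");

    private SqlHintSketch() {
    }

    // Returns the hinted data source only if it is one of the configured names.
    static Optional<String> findDataSourceByHint(final String sql, final Collection<String> dataSourceNames) {
        Matcher matcher = HINT_PATTERN.matcher(sql);
        if (matcher.find() && dataSourceNames.contains(matcher.group(1))) {
            return Optional.of(matcher.group(1));
        }
        return Optional.empty();
    }

    // A valid hint short-circuits routing; otherwise the normal result (passed in as fallback) is used.
    static String route(final String sql, final Collection<String> dataSourceNames, final String fallback) {
        return findDataSourceByHint(sql, dataSourceNames).orElse(fallback);
    }
}
```

In this sketch, a hint naming an unknown data source is simply ignored and routing falls through to the fallback. How the real router treats an invalid hint is not shown in the snippet above, so that behavior is an assumption.
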
+Next, I’d like to showcase how to use `SQL HINT`.
+
+- **How to Use SQL HINT**
+`SQL HINT` is easy to use with ShardingSphere-JDBC and ShardingSphere-Proxy.
+
+Step 1: Turn on the annotation parser and set `sqlCommentParseEnabled` to true.
+
+Step 2: Add SQL comments. Currently `SQL HINT` supports specifying data source 
routing and primary database routing.
+
+- Data source-specified routing: currently only supports routing to one data 
source. For the time being, the comment must use the `/* */` format and start 
with `ShardingSphere hint:`, followed by the attribute name `dataSourceName`.
+
+```

Review Comment:
   ```suggestion
   ```sql
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
