This is an automated email from the ASF dual-hosted git repository.
duanzhengqiang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/shardingsphere.git
The following commit(s) were added to refs/heads/master by this push:
new b6fa71fc0b2 update (#19663)
b6fa71fc0b2 is described below
commit b6fa71fc0b202be70f347b164e0c5bc22bec7c0a
Author: Mike0601 <[email protected]>
AuthorDate: Fri Jul 29 10:52:59 2022 +0800
update (#19663)
* Refactor document
Remove concepts document
* refactor
* update feature sharding
* update according new module
* update according to new module
* update according to new module
* update according to new module
* modify according to new module
* update according new module
* update
* update
* update according new module
* update according to new module
* update
* update
* update according to new module
* update according new module
* for conflict
* update
* update
---
.../content/reference/encrypt/_index.en.md | 239 ++++++++-------------
1 file changed, 89 insertions(+), 150 deletions(-)
diff --git a/docs/document/content/reference/encrypt/_index.en.md
b/docs/document/content/reference/encrypt/_index.en.md
index 5e2b9cc0832..80c4eb5b7a3 100644
--- a/docs/document/content/reference/encrypt/_index.en.md
+++ b/docs/document/content/reference/encrypt/_index.en.md
@@ -1,99 +1,64 @@
+++
pre = "<b>7.6. </b>"
title = "Encryption"
-weight = 6
+weight = 7
+++
-## Process Details
+Apache ShardingSphere parses the SQL entered by users and rewrites the SQL
according to the encryption rules provided by users, to encrypt the source data
and store the source data (optional) and ciphertext data in the underlying
database.
-Apache ShardingSphere can encrypt the plaintext by parsing and rewriting SQL
according to the encryption rule,
-and store the plaintext (optional) and ciphertext data to the database at the
same time.
-Queries data only extracts the ciphertext data from database and decrypts it,
and finally returns the plaintext to user.
-Apache ShardingSphere transparently process of data encryption, so that users
do not need to know to the implementation details of it, use encrypted data
just like as regular data.
-In addition, Apache ShardingSphere can provide a relatively complete set of
solutions whether the online business system has been encrypted or the new
online business system uses the encryption function.
+When a user queries data, it only retrieves ciphertext data from the database,
decrypts it, and finally returns the decrypted source data to the user. Apache
ShardingSphere achieves a transparent and automatic data encryption process.
Users can use encrypted data as normal data without paying attention to the
implementation details of data encryption.
### Overall Architecture

-Encrypt module intercepts SQL initiated by user, analyzes and understands SQL
behavior through the SQL syntax parser.
-According to the encryption rules passed by the user, find out the fields that
need to be encrypted/decrypted and the encryptor/decryptor used to
encrypt/decrypt the target fields,
-and then interact with the underlying database.
-ShardingSphere will encrypt the plaintext requested by the user and store it
in the underlying database;
-and when the user queries, the ciphertext will be taken out of the database
for decryption and returned to the end user.
-ShardingSphere shields the encryption of data, so that users do not need to
perceive the process of parsing SQL, data encryption, and data decryption,
-just like using ordinary data.
+The encryption module intercepts the SQL initiated by the user and analyzes the
SQL behavior through the SQL syntax parser. Then, according to the encryption
rules provided by the user, it finds the fields to be encrypted and the
encryption and decryption algorithms to use, and interacts with the underlying
database.
-### Encryption Rule
+Apache ShardingSphere encrypts the plaintext submitted by users and stores
it in the underlying database. When the user queries, the ciphertext is
extracted from the database, decrypted, and returned to the end user. Because
the encryption process is shielded, users do not need to be aware of SQL
parsing, data encryption, or data decryption.
-Before explaining the whole process in detail, we need to understand the
encryption rules and configuration, which is the basis of understanding the
whole process.
-The encryption configuration is mainly divided into four parts: data source
configuration, encrypt algorithm configuration, encryption table rule
configuration, and query attribute configuration.
- The details are shown in the following figure:
+### Encryption Rules
+
+Before explaining the whole process, we need to understand the encryption
rules and configuration. Encryption configuration is mainly divided into four
parts: data source configuration, encryptor configuration, encryption table
configuration, and query attribute configuration, as shown in the figure below:

-**Data Source Configuration**:The configuration of DataSource.
+Data source configuration: the configuration of the data source.
-**Encrypt Algorithm Configuration**:What kind of encryption strategy to use
for encryption and decryption.
-Currently ShardingSphere has five built-in encryption/decryption strategies:
AES, MD5, RC4, SM3, SM4.
-Users can also implement a set of encryption/decryption algorithms by
implementing the interface provided by Apache ShardingSphere.
+Encryptor configuration: refers to the encryption algorithm used for
encryption and decryption. Currently, ShardingSphere has three built-in
encryption and decryption algorithms: AES, MD5, and RC4. Users can also
implement a set of encryption and decryption algorithms by implementing the
interfaces provided by ShardingSphere.
-**Encryption Table Configuration**:Show the ShardingSphere data table which
column is used to store cipher column data (cipherColumn),
-what algorithm is used to encryption/decryption (encryptorName), which column
is used to store assisted query data (assistedQueryColumn),
-what algorithm is used to encrypt/decrypt assisted query data
(assistedQueryEncryptorName), which column is used to store plain text data
(plainColumn),
-and which column users want to use for SQL writing (logicColumn)
+Encryption table configuration: it is used to tell ShardingSphere which column
in the data table is used to store ciphertext data (`cipherColumn`), which
column is used to store plaintext data (`plainColumn`), and which column the
user would like to use for SQL writing (`logicColumn`).
-> How to understand `Which column do users want to use to write SQL
(logicColumn)`?
->
-> We can understand according to the meaning of Apache ShardingSphere.
-The ultimate goal of Apache ShardingSphere is to shield the encryption of the
underlying data, that is, we do not want users to know how the data is
encrypted/decrypted,
-how to store plaintext data in plainColumn, ciphertext data in cipherColumn,
and store assisted query data to the assistedQueryColumn.
-In other words, we do not even want users to know the existence and use of
plainColumn, cipherColumn and assistedQueryColumn.
-Therefore, we need to provide users with a column in conceptual. This column
can be separated from the real column of the underlying database.
-It can be a real column in the database table or not, so that the user can
freely change the plainColumn and The column name of cipherColumn,
assistedQueryColumn.
-Or delete plainColumn and choose to never store plain text and only store
cipher text.
-As long as the user's SQL is written according to this logical column, and the
correct mapping relationship between logicColumn and plainColumn, cipherColumn,
assistedQueryColumn is given in the encryption rule.
+> What is meant by "which column the user would like to use for SQL
writing (`logicColumn`)"?
+We first have to understand why the encryption module exists. The goal of the
encryption module is to shield the underlying data encryption process, which
means we don't want users to know how data is encrypted and decrypted, and how
to store plaintext data into `plainColumn` and ciphertext data into
`cipherColumn`. In other words, we don't want users to know there is a
`plainColumn` and `cipherColumn` or how they are used. Therefore, we need to
provide the user with a conceptual column that can [...]
>
-> Why do you do this? The answer is at the end of the article, that is, to
enable the online services to seamlessly, transparently, and safely carry out
data encryption migration.
-**Query Attribute configuration**:When the plaintext data and ciphertext data
are stored in the underlying database table at the same time,
-this attribute switch is used to decide whether to directly query the
plaintext data in the database table to return,
-or to query the ciphertext data and decrypt it through Apache ShardingSphere
to return. This switch supports table level and whole rule level configuration,
and table level has the highest priority.
+Query attribute configuration: if both plaintext and ciphertext data are
stored in the underlying database table, this attribute can be used to
determine whether to query the plaintext data in the database table and return
it directly, or query the ciphertext data and return it after decryption
through Apache ShardingSphere. This attribute can be configured at the table
level and the entire rule level. The table-level has the highest priority.
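The priority rule described above can be illustrated with a minimal Python sketch. The function name and the use of `None` to mean "not configured at table level" are assumptions made for illustration, not ShardingSphere's actual implementation.

```python
from typing import Optional


def effective_query_with_cipher_column(rule_level: bool,
                                       table_level: Optional[bool]) -> bool:
    """Resolve the query switch for one table (hypothetical helper).

    A table-level value, when present, takes priority over the rule-level one.
    """
    return rule_level if table_level is None else table_level


# The table-level setting overrides the rule-level setting.
overridden = effective_query_with_cipher_column(rule_level=True, table_level=False)
# Without a table-level setting, the rule-level value applies.
inherited = effective_query_with_cipher_column(rule_level=True, table_level=None)
```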
### Encryption Process
-For example, if there is a table in the database called t_user, there are
actually two fields pwd_plain in this table, used to store plain text data,
pwd_cipher, used to store cipher text data, pwd_assisted_query, used to store
the auxiliary query data, and define logicColumn as pwd.
-Then, when writing SQL, users should write to logicColumn, that is, `INSERT
INTO t_user SET pwd = '123'`.
-Apache ShardingSphere receives the SQL, and through the encryption
configuration provided by the user, finds that pwd is a logicColumn, so it
decrypt the logical column and its corresponding plaintext data.
-As can be seen that ** Apache ShardingSphere has carried out the
column-sensitive and data-sensitive mapping conversion of the logical column
facing the user and the plaintext and ciphertext columns facing the underlying
database.
-As shown below:
+For example, if there is a table named `t_user` in the database, and there are
two fields in the table: `pwd_plain` for storing plaintext data and
`pwd_cipher` for storing ciphertext data, and logicColumn is defined as `pwd`,
then users should write SQL for `logicColumn`, that is `INSERT INTO t_user SET
pwd = '123'`. Apache ShardingSphere receives the SQL and finds that the `pwd`
is the `logicColumn` based on the encryption configuration provided by the
user. Therefore, it encrypts the log [...]
+
+Apache ShardingSphere converts column names and data between the logical
columns facing users and the plaintext and ciphertext columns facing the
underlying database, as shown in the figure below:

-This is also the core meaning of Apache ShardingSphere, which is to separate
user SQL from the underlying data table structure according to the encryption
rules provided by the user,
-so that the SQL writer by user no longer depends on the actual database table
structure.
-The connection, mapping, and conversion between the user and the underlying
database are handled by Apache ShardingSphere.
-Why should we do this?
-It is still the same : in order to enable the online business to seamlessly,
transparently and safely perform data encryption migration.
+The user's SQL is separated from the underlying data table structure according
to the encryption rules provided by the user so that the user's SQL writing
does not depend on the real database table structure.
+
+The connection, mapping, and transformation between the user and the
underlying database are handled by Apache ShardingSphere.
-In order to make the reader more clearly understand the core processing flow
of Apache ShardingSphere,
-the following picture shows the processing flow and conversion logic when
using Apache ShardingSphere to add, delete, modify and check, as shown in the
following figure.
+The figure below shows the processing flow and conversion logic when the
encryption module is used to insert, delete, update, and query data.
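As a rough illustration of the column mapping described in this section, the following Python sketch rewrites the values of an `INSERT` on the logical column `pwd` into the physical columns `pwd_cipher` and `pwd_plain`. It does not parse SQL. The `RULE` dictionary and all function names are hypothetical, and base64 is only a stand-in for a real algorithm such as AES (base64 is an encoding, not encryption).

```python
import base64

# Hypothetical rule for t_user: users write SQL against `pwd`, while the
# table really contains `pwd_cipher` and (optionally) `pwd_plain`.
RULE = {"pwd": {"cipherColumn": "pwd_cipher", "plainColumn": "pwd_plain"}}


def toy_encrypt(plaintext: str) -> str:
    # Stand-in for a real encryption algorithm; NOT secure.
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def toy_decrypt(ciphertext: str) -> str:
    return base64.b64decode(ciphertext.encode("ascii")).decode("utf-8")


def rewrite_insert(values: dict) -> dict:
    """Map logical columns of an INSERT to the physical columns."""
    physical = {}
    for column, value in values.items():
        rule = RULE.get(column)
        if rule is None:
            physical[column] = value  # not an encrypted column, pass through
            continue
        physical[rule["cipherColumn"]] = toy_encrypt(value)
        if "plainColumn" in rule:
            physical[rule["plainColumn"]] = value  # optional plaintext copy
    return physical


def read_logical(row: dict, column: str) -> str:
    """Return the logical value by decrypting the cipher column."""
    return toy_decrypt(row[RULE[column]["cipherColumn"]])


# Corresponds to: INSERT INTO t_user SET pwd = '123'
row = rewrite_insert({"pwd": "123"})
```

The caller only ever names `pwd`; which physical columns exist is decided entirely by the rule, which is what allows `plainColumn` to be dropped later without touching the SQL.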

## Detailed Solution
-After understanding the Apache ShardingSphere encryption process, you can
combine the encryption configuration and encryption process with the actual
scenario.
-All design and development are to solve the problems encountered in business
scenarios. So for the business scenario requirements mentioned earlier,
-how should ShardingSphere be used to achieve business requirements?
+After understanding Apache ShardingSphere's encryption process, you can
combine the encryption configuration and encryption process according to your
scenario. The entire design and development was conceived to address the pain
points encountered in business scenarios. So how can Apache ShardingSphere be
used to meet the business requirements mentioned before?
### New Business
-Business scenario analysis: The newly launched business is relatively simple
because everything starts from scratch and there is no historical data cleaning
problem.
+Business scenario analysis: the newly launched business is relatively simple
because it starts from scratch and there's no need to clean up historical data.
-Solution description: After selecting the appropriate encrypt algorithm, such
as AES,
-you only need to configure the logical column (write SQL for users) and the
ciphertext column (the data table stores the ciphertext data).
-It can also be different **. The recommended configuration is as follows
(shown in Yaml format):
+Solution description: after selecting the appropriate encryption algorithm,
such as AES, you only need to configure the logical column (the column users
write SQL against) and the ciphertext column (the column in which the data
table stores ciphertext). The logical column and the ciphertext column do not
have to share the same name. The following configuration is recommended (in
YAML format):
```yaml
-!ENCRYPT
@@ -113,46 +78,31 @@ It can also be different **. The recommended configuration
is as follows (shown
queryWithCipherColumn: true
```
-With this configuration, Apache ShardingSphere only needs to convert
logicColumn and cipherColumn, assistedQueryColumn.
-The underlying data table does not store plain text, only cipher text.
-This is also a requirement of the security audit part. If users want to store
plain text and cipher text together in the database,
-they just need to add plainColumn configuration. The overall processing flow
is shown below:
+With the above configuration, Apache ShardingSphere only needs to convert
between `logicColumn`, `cipherColumn`, and `assistedQueryColumn`.
+
+The underlying data table does not store plaintext, and only ciphertext is
stored, which is also the requirement of the security audit. If you want to
store both plaintext and ciphertext in the database, add the `plainColumn`
configuration. The overall processing flow is shown in the figure below:

### Online Business Transformation
-Business scenario analysis: As the business is already running online, there
must be a large amount of plain text historical data stored in the database.
-The current challenges are how to enable historical data to be encrypted and
cleaned, how to enable incremental data to be encrypted,
-and how to allow businesses to seamlessly and transparently migrate between
the old and new data systems.
-
-Solution description: Before providing a solution, let ’s brainstorm:
-First, if the old business needs to be desensitized, it must have stored very
important and sensitive information.
-This information has a high gold content and the business is relatively
important.
-If it is broken, the whole team KPI is over.
-Therefore, it is impossible to suspend business immediately, prohibit writing
of new data, encrypt and clean all historical data with an encrypt algorithm,
-and then deploy the previously reconstructed code online, so that it can
encrypt and decrypt online and incremental data.
-Such a simple and rough way, based on historical experience, will definitely
not work.
-
-Then another relatively safe approach is to rebuild a pre-release environment
exactly like the production environment,
-and then encrypt the **Inventory plaintext data** of the production
environment through the relevant migration and washing tools and store it in
the pre-release environment.
-The **Increment data** is encrypted by tools such as MySQL replica query and
the business party ’s own development,
-encrypted and stored in the database of the pre-release environment, and then
the refactored code can be deployed to the pre-release environment.
-In this way, the production environment is a set of environment for
**modified/queries with plain text as the core**;
-the pre-release environment is a set of **encrypt/decrypt queries modified
with ciphertext as the core**.
-After comparing for a period of time, the production flow can be cut into the
pre-release environment at night.
-This solution is relatively safe and reliable, but it takes more time,
manpower, capital, and costs.
-It mainly includes: pre-release environment construction, production code
rectification, and related auxiliary tool development.
-Unless there is no way to go, business developers generally go from getting
started to giving up.
-
-Business developers must hope: reduce the burden of capital costs, do not
modify the business code, and be able to safely and smoothly migrate the
system.
-So, the encryption function module of ShardingSphere was born. It can be
divided into three steps:
+Business scenario analysis: as the business is already running, the database
will already have stored a large amount of plaintext historical data. The
current challenges are how to encrypt and clean up the historical data, how to
encrypt and process the incremental data, and how to seamlessly and
transparently migrate business between the old and new data systems.
+
+Solution description: before coming up with a solution, let's brainstorm.
+
+First, since this is an existing business that needs to be transformed to use
encryption, it must have stored very important and sensitive information, which
is valuable and related to critical business. Therefore, it is impossible to
suspend business immediately, prohibit writing new data, encrypt and clean all
historical data with an encryption algorithm, and then deploy and launch the
refactored code to encrypt and decrypt the stock and incremental data
online. Such a complex solution will [...]
+
+Another relatively safe solution is to build a pre-release environment
exactly the same as the production environment, and then encrypt the stock
plaintext data of the production environment and store it in the pre-release
environment through migration and data cleansing tools.
+
+The incremental data is encrypted and stored in the database of the
pre-release environment through tools such as MySQL primary/secondary
replication and tools developed by the business side. The refactored code that
can encrypt and decrypt data is then deployed to the pre-release environment.
This way, the production environment remains a plaintext-based environment for
queries and modifications.
+
+The pre-release environment is a ciphertext-based environment for encrypted
and decrypted queries and modifications. After the two environments have been
compared for a period of time, the production traffic can be switched to the
pre-release environment at night. This method is relatively safe and reliable,
but it is time-consuming and labor- and capital-intensive, mainly because it
involves building a pre-release environment, modifying production code, and
developing auxiliary tools.
+
+What developers want most is to reduce capital cost, leave the business code
unchanged, and migrate the system safely and smoothly. The encryption module of
ShardingSphere was created for this purpose. Its use can be divided into three
steps:
1. Before system migration
-Assuming that the system needs to encrypt the pwd field of t_user, the
business side uses Apache ShardingSphere to replace the standardized JDBC
interface,
-which basically requires no additional modification (we also provide Spring
Boot Starter, Spring Namespace, YAML and other access methods to achieve
different services demand).
-In addition, demonstrate a set of encryption configuration rules, as follows:
+Assuming that the system needs to encrypt the `pwd` field of `t_user`, the
business side uses Apache ShardingSphere to replace the standardized JDBC
interface, which basically requires no additional modification (we also provide
Spring Boot Starter, Spring Namespace, YAML and other access methods to meet
different business requirements). In addition, we would like to demonstrate a
set of encryption configuration rules, as follows:
```yaml
-!ENCRYPT
@@ -173,54 +123,45 @@ In addition, demonstrate a set of encryption
configuration rules, as follows:
queryWithCipherColumn: false
```
-According to the above encryption rules, we need to add a column called
pwd_cipher in the t_user table, that is, cipherColumn, which is used to store
ciphertext data.
-At the same time, we set plainColumn to pwd, which is used to store plaintext
data, and logicColumn is also set to pwd.
-Because the previous SQL was written using pwd, that is, the SQL was written
for logical columns, so the business code did not need to be changed.
-Through Apache ShardingSphere, for the incremental data, the plain text will
be written to the pwd column, and the plain text will be encrypted and stored
in the pwd_cipher column.
-At this time, because `queryWithCipherColumn` is set to false, for business
applications, the plain text column of pwd is still used for query storage,
-but the cipher text data of the new data is additionally stored on the
underlying database table pwd_cipher. The processing flow is shown below:
+According to the above encryption rules, we need to add a field called
`pwd_cipher`, namely `cipherColumn`, in the `t_user` table, which is used to
store ciphertext data.
+
+At the same time, we set `plainColumn` to `pwd`, which is used to store
plaintext data, and `logicColumn` is also set to `pwd`.
+
+Because the previous SQL was written using `pwd`, that is, against the logical
column, the business code does not need to be changed. Through Apache
ShardingSphere, for the incremental data, the plaintext will be written to the
`pwd` column and also encrypted and stored in the `pwd_cipher` column.
+
+At this time, because `queryWithCipherColumn` is set to `false`, business
applications still use the plaintext column `pwd` for queries and storage, but
the ciphertext of the new data is additionally stored in the `pwd_cipher`
column of the underlying database table. The processing flow is shown below:

-When the newly added data is inserted, it is encrypted as ciphertext data
through Apache ShardingSphere and stored in the cipherColumn.
-Now it is necessary to process historical plaintext inventory data.
-**As Apache ShardingSphere currently does not provide the corresponding
migration and washing tools, the business party needs to encrypt and store the
plain text data in pwd to pwd_cipher.**
+When new data is inserted, it is encrypted as ciphertext by Apache
ShardingSphere and stored in the `cipherColumn`. Now you need to deal with the
historical plaintext stock data. Apache ShardingSphere currently does not
provide a migration and data cleansing tool, so you need to encrypt the
plaintext data in the `pwd` column and store it in the `pwd_cipher` column
yourself.
2. During system migration
-The incremental data has been stored by Apache ShardingSphere in the
ciphertext column and the plaintext is stored in the plaintext column; after
the historical data is encrypted and cleaned by the business party itself,
-the ciphertext is also stored in the ciphertext column. That is to say, the
plaintext and the ciphertext are stored in the current database.
-Since the `queryWithCipherColumn = false` in the configuration item, the
ciphertext has never been used.
-Now we need to set the `queryWithCipherColumn` in the encryption configuration
to true in order for the system to cut the ciphertext data for query.
-After restarting the system, we found that the system business is normal, but
Apache ShardingSphere has started to extract the ciphertext data from the
database,
-decrypt it and return it to the user; and for the user's insert, delete and
update requirements,
-the original data will still be stored The plaintext column, the encrypted
ciphertext data is stored in the ciphertext column.
-
-Although the business system extracts the data in the ciphertext column and
returns it after decryption;
-however, it will still save a copy of the original data to the plaintext
column during storage.
-Why? The answer is: in order to be able to roll back the system.
-**Because as long as the ciphertext and plaintext always exist at the same
time, we can freely switch the business query to cipherColumn or plainColumn
through the configuration of the switch item.**
-In other words, if the system is switched to the ciphertext column for query,
the system reports an error and needs to be rolled back.
-Then just set `queryWithCipherColumn = false`, Apache ShardingSphere will
restore, that is, start using plainColumn to query again.
-The processing flow is shown in the following figure:
+Apache ShardingSphere stores the new ciphertext data in the `cipherColumn` and
the new plaintext data in the `plainColumn`. After the historical data is
encrypted and cleaned by the business side, its ciphertext is also stored in
the `cipherColumn`. In other words, the current database stores both plaintext
and ciphertext.
+
+Because of the configuration item `queryWithCipherColumn = false`, the
ciphertext has never been used. Now we need to set `queryWithCipherColumn` in
the encryption configuration to `true` so that the system queries ciphertext
data.
+
+After restarting the system, we find that all business functions work
normally, but Apache ShardingSphere has started to retrieve the `cipherColumn`
data from the database, decrypt it, and return it to the user. For inserts,
deletes, and updates, the original data is still stored in the `plainColumn`,
and the encrypted ciphertext data is stored in the `cipherColumn`.
+
+Although the business system retrieves the data in the `cipherColumn` and
returns it after decryption, it still saves a copy of the original data to
the `plainColumn`. Why? The answer is: to enable system rollback.
+
+Because as long as the ciphertext and plaintext always exist at the same time,
we can freely switch the business query to `cipherColumn` or `plainColumn`
through the configuration of the switch item.
+
+In other words, if the system reports an error after being switched to the
ciphertext column for queries and needs to be rolled back, we only need to
set `queryWithCipherColumn = false`, and Apache ShardingSphere will go back
to using `plainColumn` for queries. The processing flow is shown in the
following figure:
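The switch and rollback behaviour in this step can be sketched in Python as follows. This is a simplified model under stated assumptions: the row is a plain dictionary holding both `pwd` (plaintext) and `pwd_cipher`, the helper names are hypothetical, and base64 again stands in for a real encryption algorithm.

```python
import base64


def toy_decrypt(ciphertext: str) -> str:
    # Stand-in for real decryption (for example AES); base64 is NOT encryption.
    return base64.b64decode(ciphertext.encode("ascii")).decode("utf-8")


def select_pwd(row: dict, query_with_cipher_column: bool) -> str:
    """Return the logical `pwd` value according to the query switch."""
    if query_with_cipher_column:
        return toy_decrypt(row["pwd_cipher"])  # read ciphertext, then decrypt
    return row["pwd"]  # read the plaintext column directly


# During migration both columns are populated for every row.
row = {"pwd": "123", "pwd_cipher": "MTIz"}

after_switch = select_pwd(row, query_with_cipher_column=True)
after_rollback = select_pwd(row, query_with_cipher_column=False)
```

Because both columns hold consistent data, flipping the flag in either direction returns the same logical value, which is what makes the rollback safe.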

3. After system migration
-Due to the requirements of the security audit department,
-it is generally impossible for the business system to keep the plaintext and
ciphertext columns of the database permanently synchronized.
-We need to delete the plaintext data after the system is stable. That is, we
need to delete plainColumn (ie pwd) after system migration.
-The problem is that now the business code is written for pwd SQL,
-delete the pwd in the underlying data table stored in plain text, and use
pwd_cipher to decrypt to get the original data,
-does that mean that the business side needs to rectify all SQL, thus Do not
use the pwd column that is about to be deleted?
-Remember the core meaning of our encrypt module?
+As required by security audit teams, it is generally impossible for the
business system to permanently synchronize the plaintext column and ciphertext
column of the database, so we need to delete the plaintext column data after
the system is stable.
-> This is also the core meaning of encrypt module. According to the encryption
rules provided by the user, the user SQL is separated from the underlying
database table structure, so that the user's SQL writing no longer depends on
the actual database table structure. The connection, mapping, and conversion
between the user and the underlying database are handled by ShardingSphere.
+That is, we need to delete `plainColumn` (i.e. `pwd`) after system migration.
The problem is that the business code's SQL is now written against `pwd`, while
we are about to delete the `pwd` column that stores plaintext in the underlying
data table and use `pwd_cipher` to decrypt and obtain the original data.
-Yes, because of the existence of logicColumn, users write SQL for this virtual
column.
-Apache ShardingSphere can map this logical column and the ciphertext column in
the underlying data table.
-So the encryption configuration after migration is:
+Does that mean that the business side needs to change all SQL so that it no
longer uses the `pwd` column to be deleted? No. Remember the core concept of
Apache ShardingSphere?
+
+> That is exactly the core concept of Apache ShardingSphere's encryption
module. According to the encryption rules provided by the user, the user SQL is
separated from the underlying database table structure, so that the user’s SQL
writing no longer depends on the actual database table structure. The
connection, mapping, and conversion between the user and the underlying
database are handled by ShardingSphere.
+
+The existence of the `logicColumn` means that users write SQL for this virtual
column. Apache ShardingSphere can map this logical column and the ciphertext
column in the underlying data table. So the encryption configuration after the
migration is:
```yaml
-!ENCRYPT
@@ -232,7 +173,7 @@ So the encryption configuration after migration is:
tables:
t_user:
columns:
- pwd: # pwd与pwd_cipher的转换映射
+ pwd: # pwd and pwd_cipher transformation mapping
cipherColumn: pwd_cipher
encryptorName: aes_encryptor
assistedQueryColumn: pwd_assisted_query
@@ -246,39 +187,37 @@ The processing flow is as follows:
4. System migration completed
-Security audit department then requested, the business system needs to modify
the key periodically or certain emergency security events trigger,
-we need to migrate the number of wash again, that is, use the old key
decryption and then use the new key encryption.
-Both to and also to the problem came, the plaintext column data has been
deleted, the amount of data in the database table tens of millions,
-the migration shuffle takes a certain amount of time, the migration shuffle
process in the cipher column changes, the system also needs to provide services
correctly.
-What to do? The answer is: auxiliary query column.
-**Because auxiliary query columns generally use algorithms such as
irreversible MD5 and SM3, queries based on auxiliary columns are served
correctly by the system even during the migration shuffle.**
+As required by security audit teams, the business system needs to change its
keys periodically or when certain emergency security events occur. We then need
to perform migration and data cleansing again, which means decrypting with the
old key and then encrypting with the new key.
+
+Here is the problem: the plaintext column data has been deleted, and the
amount of data in the database table is in the tens of millions. Additionally,
the migration and cleansing take a certain amount of time, during which the
cipher column changes.
-So far, the online service encryption and rectification solutions have all
been demonstrated.
-We provide Java, YAML, Spring Boot Starter, Spring Namespace multiple ways for
users to choose to use, and strive to fulfill business requirements.
-The solution has been continuously launched on JD Digits, providing internal
basic service support.
+Under these circumstances, the system still needs to provide services
correctly. What can we do? The answer lies in the auxiliary query column.
Because auxiliary query columns generally use irreversible algorithms such as
MD5 and SM3, which do not depend on the encryption key, queries based on
auxiliary columns are still served correctly by the system even during the
migration and data cleansing process.
+
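The reasoning above can be illustrated with a minimal Java sketch. It is not
ShardingSphere's implementation: the class `KeyRotationSketch`, its method
names, the sample keys, and the choice of AES in ECB mode with an MD5 digest
are all assumptions made for illustration. It shows that re-encrypting a value
under a new key changes the cipher column, while a keyless digest of the same
plaintext stays the same.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration only; not part of ShardingSphere.
public class KeyRotationSketch {

    // Reversible value for the cipher column; the result depends on the key.
    public static String encrypt(String plain, String key) {
        byte[] out = aes(Cipher.ENCRYPT_MODE, key, plain.getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(out);
    }

    public static String decrypt(String cipherText, String key) {
        byte[] out = aes(Cipher.DECRYPT_MODE, key, Base64.getDecoder().decode(cipherText));
        return new String(out, StandardCharsets.UTF_8);
    }

    // Irreversible digest for the auxiliary query column; no key is involved.
    public static String assistedQueryValue(String plain) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(plain.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Data cleansing for one row: decrypt with the old key, encrypt with the new one.
    public static String rotate(String cipherText, String oldKey, String newKey) {
        return encrypt(decrypt(cipherText, oldKey), newKey);
    }

    // ECB is used only to keep the sketch short and deterministic.
    private static byte[] aes(int mode, String key, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(mode, new SecretKeySpec(key.getBytes(StandardCharsets.UTF_8), "AES"));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        String oldKey = "0123456789abcdef"; // 16 bytes, i.e. an AES-128 key
        String newKey = "fedcba9876543210";
        String pwdCipher = encrypt("secret", oldKey);
        String pwdAssisted = assistedQueryValue("secret");
        String rotated = rotate(pwdCipher, oldKey, newKey);
        System.out.println("cipher column changed: " + !rotated.equals(pwdCipher));
        System.out.println("assisted query value unchanged: "
                + pwdAssisted.equals(assistedQueryValue("secret")));
    }
}
```

In this sketch, an equality query that compares the digest of the incoming
parameter with the stored auxiliary value keeps matching, regardless of which
key currently protects the cipher column.
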
+So far, the encryption retrofit solution for businesses already launched has
been fully demonstrated. We provide Java, YAML, Spring Boot Starter, and
Spring namespace configuration for users to choose from, to meet different
business requirements. This solution has been continuously verified by
enterprise users such as JD Technology.
## The advantages of Middleware encryption service
-1. Transparent data encryption process, users do not need to pay attention to
the implementation details of encryption.
-2. Provide a variety of built-in, third-party (AKS) encryption strategies,
users only need to modify the configuration to use.
-3. Provides a encryption strategy API interface, users can implement the
interface to use a custom encryption strategy for data encryption.
-4. Support switching different encryption strategies.
-5. For online services, it is possible to store plaintext data and ciphertext
data synchronously, and decide whether to use plaintext or ciphertext columns
for query through configuration.
-Without changing the business query SQL, the on-line system can safely and
transparently migrate data before and after encryption.
+1. Automatic and transparent data encryption process. Encryption
implementation details are no longer a concern for users.
+2. It provides a variety of built-in and third-party (AKS) encryption
algorithms, which are available through simple configurations.
+3. It provides an encryption algorithm API interface. Users can implement the
interface to use a custom encryption algorithm for data encryption.
+4. It can switch among different encryption algorithms.
+5. For businesses already launched, it is possible to store plaintext data and
ciphertext data synchronously, and to decide through configuration whether
plaintext or ciphertext columns are used for queries. Without changing the
business query SQL, the released system can safely and transparently migrate
data before and after encryption.
## Solution
-Apache ShardingSphere has provided the encryption solution for data
encryption, the `EncryptAlgorithm`.
+Apache ShardingSphere provides an encryption algorithm for data encryption,
namely `EncryptAlgorithm`.
+
+On the one hand, Apache ShardingSphere provides users with built-in
implementation classes for encryption and decryption, which are available
through configurations by users.
+
+On the other hand, in order to be applicable to different scenarios, we also
opened the encryption and decryption interfaces, and users can provide their
own implementation classes for these interfaces.
-On the one hand, Apache ShardingSphere has provided internal encryption and
decryption implementations for users, which can be used by them only after
configuration.
-On the other hand, to satisfy users' requirements for different scenarios, we
have also opened relevant encryption and decryption interfaces, according to
which, users can provide specific implementation types.
-Then, after simple configurations, Apache ShardingSphere can use encryption
and decryption solutions defined by users themselves to desensitize data.
+After simple configuration, Apache ShardingSphere can call user-defined
encryption and decryption schemes for data encryption.
### EncryptAlgorithm
-The solution has provided two methods `encrypt()` and `decrypt()` to
encrypt/decrypt data for encryption.
+The solution provides two methods, `encrypt()` and `decrypt()`, to encrypt or
decrypt data.
+When users perform `INSERT`, `DELETE` and `UPDATE` operations, ShardingSphere
will parse, rewrite and route SQL according to the configuration.
-When users `INSERT`, `DELETE` and `UPDATE`, ShardingSphere will parse, rewrite
and route SQL according to the configuration. It will also use `encrypt()` to
encrypt data and store them in the database. When using `SELECT`,
-they will decrypt sensitive data from the database with `decrypt()` reversely
and return them to users at last.
+It will also use `encrypt()` to encrypt data and store it in the database.
When users perform `SELECT` operations, it will decrypt sensitive data from the
database with `decrypt()` and finally return the original data to users.
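
The write and read paths described above can be sketched in Java as follows.
This is a simplified illustration, not ShardingSphere code: the interface
`SimpleEncryptAlgorithm` is a hypothetical stand-in for the two-method contract
(the real `EncryptAlgorithm` signature may differ between versions), the
in-memory map stands in for a `pwd_cipher` column, and `Base64Algorithm` is a
toy reversible transform rather than real encryption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of ShardingSphere.
public class EncryptFlowSketch {

    // Stand-in for the encrypt()/decrypt() contract described in the text.
    public interface SimpleEncryptAlgorithm {
        String encrypt(String plainValue);
        String decrypt(String cipherValue);
    }

    // Toy reversible algorithm: Base64 encoding, NOT real encryption.
    public static class Base64Algorithm implements SimpleEncryptAlgorithm {
        @Override
        public String encrypt(String plainValue) {
            return Base64.getEncoder()
                    .encodeToString(plainValue.getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public String decrypt(String cipherValue) {
            return new String(Base64.getDecoder().decode(cipherValue), StandardCharsets.UTF_8);
        }
    }

    // In-memory stand-in for the cipher column, keyed by user id.
    private final Map<Integer, String> pwdCipherColumn = new HashMap<>();
    private final SimpleEncryptAlgorithm algorithm;

    public EncryptFlowSketch(SimpleEncryptAlgorithm algorithm) {
        this.algorithm = algorithm;
    }

    // Write path: the value is encrypted before it is stored.
    public void insert(int userId, String pwd) {
        pwdCipherColumn.put(userId, algorithm.encrypt(pwd));
    }

    // Read path: the stored value is decrypted before it is returned.
    public String select(int userId) {
        return algorithm.decrypt(pwdCipherColumn.get(userId));
    }

    // What the underlying storage actually holds.
    public String storedValue(int userId) {
        return pwdCipherColumn.get(userId);
    }

    public static void main(String[] args) {
        EncryptFlowSketch table = new EncryptFlowSketch(new Base64Algorithm());
        table.insert(1, "secret");
        System.out.println("stored:   " + table.storedValue(1));
        System.out.println("returned: " + table.select(1));
    }
}
```

Swapping in a different `SimpleEncryptAlgorithm` implementation changes what is
stored without touching `insert` or `select`, which mirrors the idea that the
algorithm is chosen through configuration.
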
-Currently, Apache ShardingSphere has provided five types of implementations
for this kind of encrypt solution, MD5 (irreversible), AES (reversible), RC4
(reversible), SM3 (irreversible) and SM4 (reversible), which can be used after
configuration.
+Currently, Apache ShardingSphere provides five types of implementations for
this kind of encryption solution, including MD5 (irreversible), AES
(reversible), RC4 (reversible), SM3 (irreversible) and SM4 (reversible), which
can be used after configuration.