This is an automated email from the ASF dual-hosted git repository.

aloyszhang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/inlong-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 66dc3b6190b [INLONG-1041][Audit] Update audit documentation instructions (#1042)
66dc3b6190b is described below

commit 66dc3b6190b9cc9e6171307be78409348346ec08
Author: doleyzi <43397300+dole...@users.noreply.github.com>
AuthorDate: Mon Sep 30 10:25:11 2024 +0800

    [INLONG-1041][Audit] Update audit documentation instructions (#1042)
---
 docs/modules/audit/overview.md                     | 165 ++++++++++-----------
 .../current/modules/audit/overview.md              | 143 +++++++++---------
 2 files changed, 145 insertions(+), 163 deletions(-)

diff --git a/docs/modules/audit/overview.md b/docs/modules/audit/overview.md
index 834779bc557..59ee0e806e2 100644
--- a/docs/modules/audit/overview.md
+++ b/docs/modules/audit/overview.md
@@ -3,104 +3,97 @@ title: Overview
 sidebar_position: 1
 ---
 
-InLong audit is a subsystem independent of InLong, which performs real-time audit and reconciliation on the incoming and outgoing traffic of the Agent, DataProxy, and Sort modules of the InLong system.
+InLong audit is a subsystem independent of InLong, which performs real-time audit and reconciliation on the incoming and
+outgoing traffic of the Agent, DataProxy, and Sort modules of the InLong system.
 There are three granularities for reconciliation: minutes, hours, and days.
 
-The audit reconciliation is based on the log reporting time, and each service participating in the audit will conduct real-time reconciliation according to the same log time. Through audit reconciliation, we can clearly understand InLong
+The audit reconciliation is based on the log reporting time, and each service participating in the audit will conduct
+real-time reconciliation according to the same log time. Through audit reconciliation, we can clearly understand InLong
 The transmission status of each module, and whether the data stream is lost or repeated
 
 ## Architecture
+
 ![](img/audit_architecture.png)
-1. The audit SDK is nested in the service that needs to be audited, audits the service, and sends the audit result to the audit access layer
-2. The audit proxy writes audit data to MQ (Pulsar, Kafka or TubeMQ)
-3. The distribution service consumes the audit data of MQ, and writes the audit data to MySQL or StarRocks.
-4. The interface layer encapsulates the data of MySQL or StarRocks.
-5. Application scenarios mainly include report display, audit reconciliation, etc.
-6. Support audit and reconciliation of data supplementary recording scenarios.
-7. Support audit reconciliation in Flink checkpoint scenarios.
+
+- The audit SDK is nested in the service that needs to be audited, audits the service, and sends the audit result to
+  the audit access layer
+- The audit proxy writes audit data to MQ (Pulsar, Kafka or TubeMQ)
+- The distribution service consumes the audit data of MQ, and writes the audit data to MySQL or StarRocks.
+- The interface layer encapsulates the data of MySQL or StarRocks.
+- Application scenarios mainly include report display, audit reconciliation, etc.
+- Support audit and reconciliation of data supplementary recording scenarios.
+- Support audit reconciliation in Flink checkpoint scenarios.
 
 ## Module
 
-| Modules                     | Description                                                                                  |
-|:----------------------------|:---------------------------------------------------------------------------------------------|
-| audit-sdk                   | Audit hidden points are reported. Each module uses the SDK to report audit data              |
-| audit-proxy                 | Audit proxy layer, receives data reported by SDK and forwards it to MQ (pulsar/kafka/tubeMQ) |
-| audit-store                 | Audit storage layer, supporting common JDBC protocol                                         |
-| audit-service               | Audit service layer, providing aggregation, cache, OpenAPI and other capabilities            |
+| Modules       | Description                                                                                  |
+|:--------------|:---------------------------------------------------------------------------------------------|
+| audit-sdk     | Audit hidden points are reported. Each module uses the SDK to report audit data              |
+| audit-proxy   | Audit proxy layer, receives data reported by SDK and forwards it to MQ (pulsar/kafka/tubeMQ) |
+| audit-store   | Audit storage layer, supporting common JDBC protocol                                         |
+| audit-service | Audit service layer, providing aggregation, cache, OpenAPI and other capabilities            |
 
 ## Audit Dimension
-| | | || | | | | | |
-| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
+
+|            |              |           ||                    |          |                 |                  |                   |      |
+|------------|--------------|-----------|--------------------|----------|-----------------|------------------|-------------------|------| ---- |
 | Machine ip | Container ID | Thread ID | Log time (minutes) | Audit ID | inlong_group_id | inlong_stream_id | Number of records | Size | Transmission delay (ms) |
+
 ## Audit ID
+
 The receiving and sending of each module are respectively an independent audit item ID
 
-|Inlong Service Module |Audit ID |
-|----|----|
-|Inlong API Received Successfully      |1 |
-|Inlong API Send Successfully  |2|
-|Inlong Agent Received Successfully    |3|
-|Inlong Agent Send Successfully        |4|
-|Inlong DataProxy Received Successfully        |5|
-|Inlong DataProxy Send Successfully    |6|
-
-## Data Transfer Protocol
-The transmission protocol between SDK, Audit Proxy, and Audit Store is Protocol Buffers
-```markdown
-syntax = "proto3";
-
-package org.apache.inlong.audit.protocol;
-
-message BaseCommand {
-    enum Type {
-        PING          = 0;
-        PONG          = 1;
-        AUDITREQUEST  = 2;
-        AUDITREPLY    = 3;
-    }
-    Type type                            = 1;
-    optional AuditRequest audit_request  = 2;
-    optional AuditReply audit_reply      = 3;
-    optional Ping ping                   = 4;
-    optional Pong pong                   = 5;
-}
-
-message Ping {
-}
-
-message Pong {
-}
-
-message AuditRequest {
-  AuditMessageHeader msg_header = 1;   
-  repeated AuditMessageBody msg_body = 2;   
-}
-
-message AuditMessageHeader {
-  string ip = 1;            
-  string docker_id = 2;     
-  string thread_id = 3;     
-  uint64 sdk_ts = 4;        
-  uint64 packet_id = 5;     
-}
-
-message AuditMessageBody {
-  uint64 log_ts = 1;   
-  string inlong_group_id= 2;   
-  string inlong_stream_id= 3; 
-  string audit_id = 4;   
-  uint64 count = 5;     
-  uint64 size = 6;      
-  int64  delay = 7;      
-}
-
-message AuditReply {
-  enum RSP_CODE {
-    SUCCESS  = 0;  
-    FAILED   = 1;   
-    DISASTER = 2; 
-  }
-  RSP_CODE rsp_code = 1;   
-  optional string message = 2;
-}
-```
\ No newline at end of file
+| Inlong Service Module                   | Audit ID |
+|-----------------------------------------|----------|
+| Inlong API Received Successfully            | 1        |
+| Inlong API Send Successfully            | 2        |
+| Inlong Agent Received Successfully        | 3        |
+| Inlong Agent Send Successfully                | 4        |
+| Inlong DataProxy Received Successfully        | 5        |
+| Inlong DataProxy Send Successfully        | 6        |
+
+## Audit data storage
+
+Audit Store supports writing operations to all storage components compatible with the JDBC protocol. Therefore, when
+selecting a storage component compatible with the JDBC protocol, it is only necessary to ensure that it meets the
+following schema:
+
+```mysql
+CREATE TABLE IF NOT EXISTS `audit_data`
+(
+    `id`               int(32)      NOT NULL PRIMARY KEY AUTO_INCREMENT COMMENT 'Incremental primary key',
+    `ip`               varchar(32)  NOT NULL DEFAULT '' COMMENT 'Client IP',
+    `docker_id`        varchar(100) NOT NULL DEFAULT '' COMMENT 'Client docker id',
+    `thread_id`        varchar(50)  NOT NULL DEFAULT '' COMMENT 'Client thread id',
+    `sdk_ts`           TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'SDK timestamp',
+    `packet_id`        BIGINT       NOT NULL DEFAULT '0' COMMENT 'Packet id',
+    `log_ts`           TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Log timestamp',
+    `inlong_group_id`  varchar(100) NOT NULL DEFAULT '' COMMENT 'The target inlong group id',
+    `inlong_stream_id` varchar(100) NOT NULL DEFAULT '' COMMENT 'The target inlong stream id',
+    `audit_id`         varchar(100) NOT NULL DEFAULT '' COMMENT 'Audit id',
+    `audit_tag`        varchar(100)          DEFAULT '' COMMENT 'Audit tag',
+    `audit_version`    BIGINT                DEFAULT -1 COMMENT 'Audit version',
+    `count`            BIGINT       NOT NULL DEFAULT '0' COMMENT 'Message count',
+    `size`             BIGINT       NOT NULL DEFAULT '0' COMMENT 'Message size',
+    `delay`            BIGINT       NOT NULL DEFAULT '0' COMMENT 'Message delay count',
+    `update_time`      timestamp    NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Update time',
+    INDEX group_stream_audit_id (`inlong_group_id`, `inlong_stream_id`, `audit_id`, `log_ts`)
+) ENGINE = InnoDB
+  DEFAULT CHARSET = UTF8 COMMENT ='InLong audit data table';
+```
+
+- ip: Represents the client's IP address;
+- docker_id: String of length 100 that represents the client's Docker ID;
+- thread_id: String of length 50 that represents the client's thread ID;
+- sdk_ts: TIMESTAMP type that represents the SDK timestamp, with a default value of the current timestamp;
+- packet_id: 64-bit integer that represents the ID of the data packet;
+- log_ts: TIMESTAMP type that represents the timestamp of the log, with a default value of the current timestamp;
+- inlong_group_id: String of length 100 that represents the ID of the target Inlong group;
+- inlong_stream_id: String of length 100 that represents the ID of the target Inlong stream;
+- audit_id: String of length 100 that represents the audit ID;
+- audit_tag: String of length 100 that represents the audit tag, with a default value of an empty string;
+- audit_version: 64-bit integer that represents the audit version, with a default value of -1;
+- count: 64-bit integer that represents the message count, with a default value of 0;
+- size: 64-bit integer that represents the message size, with a default value of 0;
+- delay: 64-bit integer that represents the message delay count, with a default value of 0;
+- update_time: TIMESTAMP type that represents the update time, with a default value of the current timestamp.
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/overview.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/overview.md
index e0b925d8932..0aa2cb98cbc 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/overview.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/overview.md
@@ -6,34 +6,39 @@ sidebar_position: 1
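 * InLong audit is a subsystem independent of InLong that performs real-time audit and reconciliation on the incoming and outgoing traffic of the Agent, DataProxy, and Sort modules of the InLong system.
 * Reconciliation granularities include 1 minute, 10 minutes, 30 minutes, 1 hour, 1 day, and so on.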
 
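-Audit reconciliation uses the log reporting time as the unified basis, and each service participating in the audit reconciles in real time against the same log time. Through audit reconciliation, we can clearly understand InLong's
+Audit reconciliation uses the log reporting time as the unified basis, and each service participating in the audit reconciles in real time against the same log time. Through audit reconciliation, we can clearly understand
+InLong's
 per-module transmission status, and whether any data stream has lost or duplicated data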
 
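 ## Architecture
+
 ![](img/audit_architecture.png)
-1. The audit SDK is embedded in each service that needs auditing, audits the service, and sends the audit results to the audit access layer.
-2. The audit access layer writes audit data to MQ (Pulsar, Kafka, or TubeMQ).
-3. The distribution service consumes audit data from MQ and writes it to MySQL or StarRocks.
-4. The interface layer aggregates the MySQL or StarRocks data in real time, caches it, and exposes an OpenAPI.
-5. Application scenarios mainly include report display, audit reconciliation, and so on.
-6. Supports audit reconciliation for data backfill scenarios.
-7. Supports audit reconciliation for Flink CheckPoint scenarios.
+
+- The audit SDK is embedded in each service that needs auditing, audits the service, and sends the audit results to the audit access layer.
+- The audit access layer writes audit data to MQ (Pulsar, Kafka, or TubeMQ).
+- The distribution service consumes audit data from MQ and writes it to MySQL or StarRocks.
+- The interface layer aggregates the MySQL or StarRocks data in real time, caches it, and exposes an OpenAPI.
+- Application scenarios mainly include report display, audit reconciliation, and so on.
+- Supports audit reconciliation for data backfill scenarios.
+- Supports audit reconciliation for Flink CheckPoint scenarios.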
 
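 ## Modules
 
-| Module         | Description                                        |
-|:---------------|:---------------------------------------------------|
-| audit-sdk      | Audit instrumentation reporting; each module uses this SDK to report audit data |
-| audit-proxy    | Audit proxy layer; receives data reported by the SDK and forwards it to MQ (pulsar / kafka / tubeMQ) |
-| audit-store    | Audit storage layer; supports the common JDBC protocol |
-| audit-service  | Audit service layer; provides aggregation, cache, OpenAPI, and other capabilities |
+| Module        | Description                                        |
+|:--------------|:---------------------------------------------------|
+| audit-sdk     | Audit instrumentation reporting; each module uses this SDK to report audit data |
+| audit-proxy   | Audit proxy layer; receives data reported by the SDK and forwards it to MQ (pulsar / kafka / tubeMQ) |
+| audit-store   | Audit storage layer; supports the common JDBC protocol |
+| audit-service | Audit service layer; provides aggregation, cache, OpenAPI, and other capabilities |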
 
 ## Audit Dimensions
-|       |       |       || |       | | | | |
-|-------|-------|-------| ---- |-------| ---- | ---- | ---- | ---- | ---- |
-| Machine ip | Container ID | Thread ID | Log time (minutes) | Audit ID | inlong_group_id | inlong_stream_id | Record count | Size | Transmission delay (ms) |
+
+|       |       |       ||          |       |                 |                  |     |     |
+|-------|-------|-------|----------|-------|-----------------|------------------|-----|-----| ---- |
+| Machine ip | Container ID | Thread ID | Log time (minutes) | Audit ID | inlong_group_id | inlong_stream_id | Record count  | Size  | Transmission delay (ms) |
 
 ## Audit Item ID
+
 Receiving and sending in each module are each a separate audit item ID
 
 | InLong service module            | Audit ID |
@@ -45,63 +50,47 @@ sidebar_position: 1
 | InLong DataProxy received successfully         | 5     |
 | InLong DataProxy sent successfully         | 6     |
 
-## Data Transfer Protocol
-The transmission protocol between the SDK, the access layer, and the distribution layer is Protocol Buffers
-```markdown
-syntax = "proto3";
-
-package org.apache.inlong.audit.protocol;
-
-message BaseCommand {
-    enum Type {
-        PING          = 0;
-        PONG          = 1;
-        AUDITREQUEST  = 2;
-        AUDITREPLY    = 3;
-    }
-    Type type                            = 1;
-    optional AuditRequest audit_request  = 2;
-    optional AuditReply audit_reply      = 3;
-    optional Ping ping                   = 4;
-    optional Pong pong                   = 5;
-}
-
-message Ping {
-}
-
-message Pong {
-}
-
-message AuditRequest {
-  AuditMessageHeader msg_header = 1;   // Packet header
-  repeated AuditMessageBody msg_body = 2;   // Packet body
-}
-
-message AuditMessageHeader {
-  string ip = 1;            // SDK client IP
-  string docker_id = 2;     // ID of the container the SDK runs in
-  string thread_id = 3;     // ID of the thread the SDK runs in
-  uint64 sdk_ts = 4;        // SDK report time
-  uint64 packet_id = 5;     // ID of the packet reported by the SDK
-}
-
-message AuditMessageBody {
-  uint64 log_ts = 1;    // Log time
-  string inlong_group_id= 2;   // InLong Group ID
-  string inlong_stream_id= 3; // InLong Stream ID
-  string audit_id = 4;   // Audit ID
-  uint64 count = 5;     // Record count
-  uint64 size = 6;      // Size
-  int64  delay = 7;      // Total transmission delay
-}
-
-message AuditReply {
-  enum RSP_CODE {
-    SUCCESS  = 0;  // Success
-    FAILED   = 1;   // Failure
-    DISASTER = 2; // Disaster recovery
-  }
-  RSP_CODE rsp_code = 1;   // Server return code
-  optional string message = 2;
-}
-```
\ No newline at end of file
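+## Audit data storage
+
+Audit Store supports writes to any storage component compatible with the JDBC protocol. Therefore, when choosing a JDBC-compatible storage component, it is only necessary to ensure that it satisfies the following
+schema: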
+
+```mysql
+CREATE TABLE IF NOT EXISTS `audit_data`
+(
+    `id`               int(32)      NOT NULL PRIMARY KEY AUTO_INCREMENT COMMENT 'Incremental primary key',
+    `ip`               varchar(32)  NOT NULL DEFAULT '' COMMENT 'Client IP',
+    `docker_id`        varchar(100) NOT NULL DEFAULT '' COMMENT 'Client docker id',
+    `thread_id`        varchar(50)  NOT NULL DEFAULT '' COMMENT 'Client thread id',
+    `sdk_ts`           TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'SDK timestamp',
+    `packet_id`        BIGINT       NOT NULL DEFAULT '0' COMMENT 'Packet id',
+    `log_ts`           TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Log timestamp',
+    `inlong_group_id`  varchar(100) NOT NULL DEFAULT '' COMMENT 'The target inlong group id',
+    `inlong_stream_id` varchar(100) NOT NULL DEFAULT '' COMMENT 'The target inlong stream id',
+    `audit_id`         varchar(100) NOT NULL DEFAULT '' COMMENT 'Audit id',
+    `audit_tag`        varchar(100)          DEFAULT '' COMMENT 'Audit tag',
+    `audit_version`    BIGINT                DEFAULT -1 COMMENT 'Audit version',
+    `count`            BIGINT       NOT NULL DEFAULT '0' COMMENT 'Message count',
+    `size`             BIGINT       NOT NULL DEFAULT '0' COMMENT 'Message size',
+    `delay`            BIGINT       NOT NULL DEFAULT '0' COMMENT 'Message delay count',
+    `update_time`      timestamp    NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Update time',
+    INDEX group_stream_audit_id (`inlong_group_id`, `inlong_stream_id`, `audit_id`, `log_ts`)
+) ENGINE = InnoDB
+  DEFAULT CHARSET = UTF8 COMMENT ='InLong audit data table';
+```
+
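+- ip: the client's IP address;
+- docker_id: string of length 100 representing the client's Docker ID;
+- thread_id: string of length 50 representing the client's thread ID;
+- sdk_ts: TIMESTAMP type representing the SDK timestamp, defaulting to the current timestamp;
+- packet_id: 64-bit integer representing the ID of the data packet;
+- log_ts: TIMESTAMP type representing the log timestamp, defaulting to the current timestamp;
+- inlong_group_id: string of length 100 representing the ID of the target InLong group;
+- inlong_stream_id: string of length 100 representing the ID of the target InLong stream;
+- audit_id: string of length 100 representing the audit ID;
+- audit_tag: string of length 100 representing the audit tag, defaulting to an empty string;
+- audit_version: 64-bit integer representing the audit version, defaulting to -1;
+- count: 64-bit integer representing the message count, defaulting to 0;
+- size: 64-bit integer representing the message size, defaulting to 0;
+- delay: 64-bit integer representing the message delay count, defaulting to 0;
+- update_time: TIMESTAMP type representing the update time, defaulting to the current timestamp and updated automatically whenever the record is updated.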
\ No newline at end of file
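
Illustrative note, not part of the commit: the overview in this diff says reconciliation compares services at the same log time, and that each module's receive and send sides have separate audit IDs. A minimal sketch of that idea follows, assuming rows shaped like a subset of the `audit_data` columns. The `AuditRecord` type, the helper names, and the sample values are hypothetical, and this is not how audit-service itself is implemented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class AuditRecord(NamedTuple):
    """Hypothetical row holding a subset of the audit_data columns."""
    log_ts_minute: str       # log time truncated to the minute
    inlong_group_id: str
    inlong_stream_id: str
    audit_id: str
    count: int
    size: int


# Reconciliation key: same log minute, same group, same stream.
Key = Tuple[str, str, str]


def total_counts(records: Iterable[AuditRecord], audit_id: str) -> Dict[Key, int]:
    """Sum the record counts per key for a single audit ID."""
    totals: Dict[Key, int] = defaultdict(int)
    for r in records:
        if r.audit_id == audit_id:
            key = (r.log_ts_minute, r.inlong_group_id, r.inlong_stream_id)
            totals[key] += r.count
    return dict(totals)


def reconcile(records: Iterable[AuditRecord],
              upstream_id: str,
              downstream_id: str) -> List[Tuple[Key, int]]:
    """Return (key, upstream - downstream) for every key whose counts differ.

    A positive difference suggests lost records, a negative one suggests
    duplicates; keys whose counts match are omitted.
    """
    rows = list(records)
    up = total_counts(rows, upstream_id)
    down = total_counts(rows, downstream_id)
    diffs: List[Tuple[Key, int]] = []
    for key in sorted(set(up) | set(down)):
        diff = up.get(key, 0) - down.get(key, 0)
        if diff != 0:
            diffs.append((key, diff))
    return diffs


if __name__ == "__main__":
    # Made-up sample: audit IDs "5" and "6" stand for DataProxy received
    # and sent, following the Audit ID table in the diff.
    sample = [
        AuditRecord("2024-09-30 10:25", "g1", "s1", "5", 100, 1000),
        AuditRecord("2024-09-30 10:25", "g1", "s1", "6", 98, 980),
    ]
    for key, diff in reconcile(sample, "5", "6"):
        print(key, diff)
```

Grouping by log minute rather than arrival time mirrors the "same log time" rule in the overview; coarser granularities such as hours or days would only change how `log_ts_minute` is truncated.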
