This is an automated email from the ASF dual-hosted git repository.

aloyszhang pushed a commit to branch master
in repository

The following commit(s) were added to refs/heads/master by this push:
     new 66dc3b6190b [INLONG-1041][Audit] Update audit documentation instructions (#1042)
66dc3b6190b is described below

commit 66dc3b6190b9cc9e6171307be78409348346ec08
Author: doleyzi <>
AuthorDate: Mon Sep 30 10:25:11 2024 +0800

    [INLONG-1041][Audit] Update audit documentation instructions (#1042)
 docs/modules/audit/                     | 165 ++++++++++-----------
 .../current/modules/audit/              | 143 +++++++++---------
 2 files changed, 145 insertions(+), 163 deletions(-)

diff --git a/docs/modules/audit/ b/docs/modules/audit/
index 834779bc557..59ee0e806e2 100644
--- a/docs/modules/audit/
+++ b/docs/modules/audit/
@@ -3,104 +3,97 @@ title: Overview
 sidebar_position: 1
-InLong audit is a subsystem independent of InLong, which performs real-time audit and reconciliation on the incoming and outgoing traffic of the Agent, DataProxy, and Sort modules of the InLong system.
+InLong audit is a subsystem independent of InLong, which performs real-time audit and reconciliation on the incoming and
+outgoing traffic of the Agent, DataProxy, and Sort modules of the InLong system.
 There are three granularities for reconciliation: minutes, hours, and days.
-The audit reconciliation is based on the log reporting time, and each service participating in the audit will conduct real-time reconciliation according to the same log time. Through audit reconciliation, we can clearly understand InLong
+The audit reconciliation is based on the log reporting time, and each service participating in the audit will conduct
+real-time reconciliation according to the same log time. Through audit reconciliation, we can clearly understand InLong
 The transmission status of each module, and whether the data stream is lost or 
 ## Architecture
-1. The audit SDK is nested in the service that needs to be audited, audits the service, and sends the audit result to the audit access layer
-2. The audit proxy writes audit data to MQ (Pulsar, Kafka or TubeMQ)
-3. The distribution service consumes the audit data of MQ, and writes the audit data to MySQL or StarRocks.
-4. The interface layer encapsulates the data of MySQL or StarRocks.
-5. Application scenarios mainly include report display, audit reconciliation, etc.
-6. Support audit and reconciliation of data supplementary recording scenarios.
-7. Support audit reconciliation in Flink checkpoint scenarios.
+- The audit SDK is nested in the service that needs to be audited, audits the service, and sends the audit result to
+  the audit access layer
+- The audit proxy writes audit data to MQ (Pulsar, Kafka or TubeMQ)
+- The distribution service consumes the audit data of MQ, and writes the audit data to MySQL or StarRocks.
+- The interface layer encapsulates the data of MySQL or StarRocks.
+- Application scenarios mainly include report display, audit reconciliation, etc.
+- Support audit and reconciliation of data supplementary recording scenarios.
+- Support audit reconciliation in Flink checkpoint scenarios.
 ## Module
-| Modules                     | Description |
-| audit-sdk                   | Audit hidden points are reported. Each module uses the SDK to report audit data              |
-| audit-proxy                 | Audit proxy layer, receives data reported by SDK and forwards it to MQ (pulsar/kafka/tubeMQ) |
-| audit-store                 | Audit storage layer, supporting common JDBC protocol                                         |
-| audit-service               | Audit service layer, providing aggregation, cache, OpenAPI and other capabilities            |
+| Modules       | Description |
+| audit-sdk     | Audit hidden points are reported. Each module uses the SDK to report audit data              |
+| audit-proxy   | Audit proxy layer, receives data reported by SDK and forwards it to MQ (pulsar/kafka/tubeMQ) |
+| audit-store   | Audit storage layer, supporting common JDBC protocol                                         |
+| audit-service | Audit service layer, providing aggregation, cache, OpenAPI and other capabilities            |
 ## Audit Dimension
-| | | || | | | | | |
-| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
+|            |              |           ||                    |          |                 |                  |                   |      |
 ---- |
 | Machine ip | Container ID | Thread ID | Log time (minutes) | Audit ID | inlong_group_id | inlong_stream_id | Number of records | Size | Transmission delay (ms) |
 ## Audit ID
 The receiving and sending of each module are respectively an independent audit item ID
-|Inlong Service Module |Audit ID |
-|Inlong API Received Successfully      |1 |
-|Inlong API Send Successfully  |2|
-|Inlong Agent Received Successfully    |3|
-|Inlong Agent Send Successfully        |4|
-|Inlong DataProxy Received Successfully        |5|
-|Inlong DataProxy Send Successfully    |6|
-## Data Transfer Protocol
-The transmission protocol between SDK, Audit Proxy, and Audit Store is Protocol Buffers
-syntax = "proto3";
-package org.apache.inlong.audit.protocol;
-message BaseCommand {
-    enum Type {
-        PING          = 0;
-        PONG          = 1;
-        AUDITREQUEST  = 2;
-        AUDITREPLY    = 3;
-    }
-    Type type                            = 1;
-    optional AuditRequest audit_request  = 2;
-    optional AuditReply audit_reply      = 3;
-    optional Ping ping                   = 4;
-    optional Pong pong                   = 5;
-}
-message Ping {
-}
-message Pong {
-}
-message AuditRequest {
-  AuditMessageHeader msg_header = 1;
-  repeated AuditMessageBody msg_body = 2;
-}
-message AuditMessageHeader {
-  string ip = 1;
-  string docker_id = 2;
-  string thread_id = 3;
-  uint64 sdk_ts = 4;
-  uint64 packet_id = 5;
-}
-message AuditMessageBody {
-  uint64 log_ts = 1;
-  string inlong_group_id= 2;
-  string inlong_stream_id= 3;
-  string audit_id = 4;
-  uint64 count = 5;
-  uint64 size = 6;
-  int64  delay = 7;
-}
-message AuditReply {
-  enum RSP_CODE {
-    SUCCESS  = 0;
-    FAILED   = 1;
-    DISASTER = 2;
-  }
-  RSP_CODE rsp_code = 1;
-  optional string message = 2;
-}
\ No newline at end of file
+| Inlong Service Module                   | Audit ID |
+| Inlong API Received Successfully            | 1        |
+| Inlong API Send Successfully            | 2        |
+| Inlong Agent Received Successfully        | 3        |
+| Inlong Agent Send Successfully                | 4        |
+| Inlong DataProxy Received Successfully        | 5        |
+| Inlong DataProxy Send Successfully        | 6        |
+## Audit data storage
+Audit Store supports writing operations to all storage components compatible with the JDBC protocol. Therefore, when
+selecting a storage component compatible with the JDBC protocol, it is only necessary to ensure that it meets the
+following schema:
+    `id`               int(32)      NOT NULL PRIMARY KEY AUTO_INCREMENT COMMENT 'Incremental primary key',
+    `ip`               varchar(32)  NOT NULL DEFAULT '' COMMENT 'Client IP',
+    `docker_id`        varchar(100) NOT NULL DEFAULT '' COMMENT 'Client docker id',
+    `thread_id`        varchar(50)  NOT NULL DEFAULT '' COMMENT 'Client thread id',
+    `sdk_ts`           TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'SDK timestamp',
+    `packet_id`        BIGINT       NOT NULL DEFAULT '0' COMMENT 'Packet id',
+    `log_ts`           TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Log timestamp',
+    `inlong_group_id`  varchar(100) NOT NULL DEFAULT '' COMMENT 'The target inlong group id',
+    `inlong_stream_id` varchar(100) NOT NULL DEFAULT '' COMMENT 'The target inlong stream id',
+    `audit_id`         varchar(100) NOT NULL DEFAULT '' COMMENT 'Audit id',
+    `audit_tag`        varchar(100)          DEFAULT '' COMMENT 'Audit tag',
+    `audit_version`    BIGINT                DEFAULT -1 COMMENT 'Audit version',
+    `count`            BIGINT       NOT NULL DEFAULT '0' COMMENT 'Message count',
+    `size`             BIGINT       NOT NULL DEFAULT '0' COMMENT 'Message size',
+    `delay`            BIGINT       NOT NULL DEFAULT '0' COMMENT 'Message delay count',
+    `update_time`      timestamp    NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Update time',
+    INDEX group_stream_audit_id (`inlong_group_id`, `inlong_stream_id`, `audit_id`, `log_ts`)
+) ENGINE = InnoDB
+  DEFAULT CHARSET = UTF8 COMMENT ='InLong audit data table';
+- ip: Represents the client's IP address;
+- docker_id: String of length 100 that represents the client's Docker ID;
+- thread_id: String of length 50 that represents the client's thread ID;
+- sdk_ts: TIMESTAMP type that represents the SDK timestamp, with a default value of the current timestamp;
+- packet_id: 64-bit integer that represents the ID of the data packet;
+- log_ts: TIMESTAMP type that represents the timestamp of the log, with a default value of the current timestamp;
+- inlong_group_id: String of length 100 that represents the ID of the target Inlong group;
+- inlong_stream_id: String of length 100 that represents the ID of the target Inlong stream;
+- audit_id: String of length 100 that represents the audit ID;
+- audit_tag: String of length 100 that represents the audit tag, with a default value of an empty string;
+- audit_version: 64-bit integer that represents the audit version, with a default value of -1;
+- count: 64-bit integer that represents the message count, with a default value of 0;
+- size: 64-bit integer that represents the message size, with a default value of 0;
+- delay: 64-bit integer that represents the message delay count, with a default value of 0;
+- update_time: TIMESTAMP type that represents the update time, with a default value of the current timestamp.
\ No newline at end of file
diff --git 
index e0b925d8932..0aa2cb98cbc 100644
@@ -6,34 +6,39 @@ sidebar_position: 1
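 * InLong audit is a subsystem independent of InLong that performs real-time audit and reconciliation on the incoming and outgoing traffic of the Agent, DataProxy, and Sort modules of the InLong system.
 * Reconciliation granularities include 1 minute, 10 minutes, 30 minutes, 1 hour, 1 day, and so on.
-Audit reconciliation uses the log reporting time as its common basis, and each service participating in the audit reconciles in real time against the same log time. Through audit reconciliation, we can clearly understand InLong
 ## Architecture
-1. The audit SDK is nested in the service that needs to be audited, audits the service, and sends the audit result to the audit access layer.
-2. The audit access layer writes audit data to MQ (Pulsar, Kafka, or TubeMQ).
-3. The distribution service consumes the audit data from MQ and writes it to MySQL or StarRocks.
-4. The interface layer aggregates the MySQL or StarRocks data in real time, caches it, and exposes an OpenAPI.
-5. Application scenarios mainly include report display, audit reconciliation, and so on.
-6. Supports audit reconciliation for data supplementary recording scenarios.
-7. Supports audit reconciliation in Flink CheckPoint scenarios.
+- The audit SDK is nested in the service that needs to be audited, audits the service, and sends the audit result to the audit access layer.
+- The audit access layer writes audit data to MQ (Pulsar, Kafka, or TubeMQ).
+- The distribution service consumes the audit data from MQ and writes it to MySQL or StarRocks.
+- The interface layer aggregates the MySQL or StarRocks data in real time, caches it, and exposes an OpenAPI.
+- Application scenarios mainly include report display, audit reconciliation, and so on.
+- Supports audit reconciliation for data supplementary recording scenarios.
+- Supports audit reconciliation in Flink CheckPoint scenarios.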
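 ## Module
-| Module         | Description |
-| audit-sdk      | Audit instrumentation reporting; each module uses this SDK to report audit data |
-| audit-proxy    | Audit proxy layer; receives data reported by the SDK and forwards it to MQ (pulsar / kafka / tubeMQ) |
-| audit-store    | Audit storage layer; supports the common JDBC protocol |
-| audit-service  | Audit service layer; provides aggregation, cache, OpenAPI, and other capabilities |
+| Module        | Description |
+| audit-sdk     | Audit instrumentation reporting; each module uses this SDK to report audit data |
+| audit-proxy   | Audit proxy layer; receives data reported by the SDK and forwards it to MQ (pulsar / kafka / tubeMQ) |
+| audit-store   | Audit storage layer; supports the common JDBC protocol |
+| audit-service | Audit service layer; provides aggregation, cache, OpenAPI, and other capabilities |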
 ## Audit Dimension
-|       |       |       || |       | | | | |
-|-------|-------|-------| ---- |-------| ---- | ---- | ---- | ---- | ---- |
-| Machine ip | Container ID | Thread ID | Log time (minutes) | Audit ID | inlong_group_id | inlong_stream_id | Number of records | Size | Transmission delay (ms) |
+|       |       |       ||          |       |                 |                  |     |     |
 ---- |
+| Machine ip | Container ID | Thread ID | Log time (minutes) | Audit ID | inlong_group_id | inlong_stream_id | Number of records  | Size  | Transmission delay (ms) |
 ## Audit ID
 The receiving and sending of each module are each an independent audit item ID
 | InLong Service Module            | Audit ID |
@@ -45,63 +50,47 @@ sidebar_position: 1
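 | InLong DataProxy Received Successfully         | 5     |
 | InLong DataProxy Sent Successfully         | 6     |
-## Data Transfer Protocol
-The transmission protocol between the SDK, the access layer, and the distribution layer is Protocol Buffers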
-syntax = "proto3";
-package org.apache.inlong.audit.protocol;
-message BaseCommand {
-    enum Type {
-        PING          = 0;
-        PONG          = 1;
-        AUDITREQUEST  = 2;
-        AUDITREPLY    = 3;
-    }
-    Type type                            = 1;
-    optional AuditRequest audit_request  = 2;
-    optional AuditReply audit_reply      = 3;
-    optional Ping ping                   = 4;
-    optional Pong pong                   = 5;
-}
-message Ping {
-}
-message Pong {
-}
-message AuditRequest {
-  AuditMessageHeader msg_header = 1;   // packet header
-  repeated AuditMessageBody msg_body = 2;   // packet body
-}
-message AuditMessageHeader {
-  string ip = 1;            // SDK client IP
-  string docker_id = 2;     // ID of the container the SDK runs in
-  string thread_id = 3;     // ID of the thread the SDK runs in
-  uint64 sdk_ts = 4;        // SDK report time
-  uint64 packet_id = 5;     // ID of the packet reported by the SDK
-}
-message AuditMessageBody {
-  uint64 log_ts = 1;    // log time
-  string inlong_group_id= 2;   // InLong Group ID
-  string inlong_stream_id= 3; // InLong Stream ID
-  string audit_id = 4;   // audit ID
-  uint64 count = 5;     // number of records
-  uint64 size = 6;      // size
-  int64  delay = 7;      // total transmission delay
-}
-message AuditReply {
-  enum RSP_CODE {
-    SUCCESS  = 0;  // success
-    FAILED   = 1;   // failure
-    DISASTER = 2; // disaster recovery
-  }
-  RSP_CODE rsp_code = 1;   // server return code
-  optional string message = 2;
-}
\ No newline at end of file
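+## Audit Data Storage
+Audit Store supports writes to any storage component compatible with the JDBC protocol. Therefore, when choosing a JDBC-compatible storage component, you only need to make sure that it satisfies the following
+schema: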
+    `id`               int(32)      NOT NULL PRIMARY KEY AUTO_INCREMENT COMMENT 'Incremental primary key',
+    `ip`               varchar(32)  NOT NULL DEFAULT '' COMMENT 'Client IP',
+    `docker_id`        varchar(100) NOT NULL DEFAULT '' COMMENT 'Client docker id',
+    `thread_id`        varchar(50)  NOT NULL DEFAULT '' COMMENT 'Client thread id',
+    `sdk_ts`           TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'SDK timestamp',
+    `packet_id`        BIGINT       NOT NULL DEFAULT '0' COMMENT 'Packet id',
+    `log_ts`           TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Log timestamp',
+    `inlong_group_id`  varchar(100) NOT NULL DEFAULT '' COMMENT 'The target inlong group id',
+    `inlong_stream_id` varchar(100) NOT NULL DEFAULT '' COMMENT 'The target inlong stream id',
+    `audit_id`         varchar(100) NOT NULL DEFAULT '' COMMENT 'Audit id',
+    `audit_tag`        varchar(100)          DEFAULT '' COMMENT 'Audit tag',
+    `audit_version`    BIGINT                DEFAULT -1 COMMENT 'Audit version',
+    `count`            BIGINT       NOT NULL DEFAULT '0' COMMENT 'Message count',
+    `size`             BIGINT       NOT NULL DEFAULT '0' COMMENT 'Message size',
+    `delay`            BIGINT       NOT NULL DEFAULT '0' COMMENT 'Message delay count',
+    `update_time`      timestamp    NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Update time',
+    INDEX group_stream_audit_id (`inlong_group_id`, `inlong_stream_id`, `audit_id`, `log_ts`)
+) ENGINE = InnoDB
+  DEFAULT CHARSET = UTF8 COMMENT ='InLong audit data table';
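+- ip: represents the client's IP address;
+- docker_id: string of length 100 that represents the client's Docker ID;
+- thread_id: string of length 50 that represents the client's thread ID;
+- sdk_ts: TIMESTAMP type that represents the SDK timestamp, with a default value of the current timestamp;
+- packet_id: 64-bit integer that represents the ID of the data packet;
+- log_ts: TIMESTAMP type that represents the timestamp of the log, with a default value of the current timestamp;
+- inlong_group_id: string of length 100 that represents the ID of the target Inlong group;
+- inlong_stream_id: string of length 100 that represents the ID of the target Inlong stream;
+- audit_id: string of length 100 that represents the audit ID;
+- audit_tag: string of length 100 that represents the audit tag, with a default value of an empty string;
+- audit_version: 64-bit integer that represents the audit version, with a default value of -1;
+- count: 64-bit integer that represents the message count, with a default value of 0;
+- size: 64-bit integer that represents the message size, with a default value of 0;
+- delay: 64-bit integer that represents the message delay count, with a default value of 0;
+- update_time: TIMESTAMP type that represents the update time, with a default value of the current timestamp; it is updated automatically whenever the record is updated.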
\ No newline at end of file
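
Editor's note, not part of the commit: the documentation in this diff describes reconciliation as comparing, per log minute, what one audit item received with what the next one sent. A minimal sketch of that comparison follows, assuming a simplified stand-in table in an in-memory SQLite database. The table name `audit_rows`, the `reconcile` helper, and the sample rows are hypothetical and only mirror a few column names from the schema; this is not the InLong audit-service implementation.

```python
import sqlite3

# Hypothetical, simplified stand-in for the audit table: only the
# columns needed to compare counts are kept.
SCHEMA = """
CREATE TABLE audit_rows (
    log_ts           TEXT    NOT NULL,  -- log time, truncated to the minute
    inlong_group_id  TEXT    NOT NULL,
    inlong_stream_id TEXT    NOT NULL,
    audit_id         TEXT    NOT NULL,
    "count"          INTEGER NOT NULL
)
"""

# Sum the counts of two audit IDs per (minute, group, stream) and keep
# only the buckets where the two totals differ.
QUERY = """
SELECT log_ts, inlong_group_id, inlong_stream_id,
       SUM(CASE WHEN audit_id = :recv THEN "count" ELSE 0 END),
       SUM(CASE WHEN audit_id = :sent THEN "count" ELSE 0 END)
FROM audit_rows
WHERE audit_id IN (:recv, :sent)
GROUP BY log_ts, inlong_group_id, inlong_stream_id
HAVING SUM(CASE WHEN audit_id = :recv THEN "count" ELSE 0 END)
    <> SUM(CASE WHEN audit_id = :sent THEN "count" ELSE 0 END)
ORDER BY log_ts
"""


def reconcile(conn, received_id, sent_id):
    """Return (log_ts, group, stream, received, sent) rows that do not match."""
    params = {"recv": received_id, "sent": sent_id}
    return conn.execute(QUERY, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
# Made-up sample data: audit ID 5 = DataProxy received, 6 = DataProxy sent.
conn.executemany(
    "INSERT INTO audit_rows VALUES (?, ?, ?, ?, ?)",
    [
        ("2024-09-30 10:25", "g1", "s1", "5", 60),
        ("2024-09-30 10:25", "g1", "s1", "5", 40),
        ("2024-09-30 10:25", "g1", "s1", "6", 100),
        ("2024-09-30 10:26", "g1", "s1", "5", 50),
        ("2024-09-30 10:26", "g1", "s1", "6", 45),
    ],
)
mismatches = reconcile(conn, "5", "6")
print(mismatches)
```

Grouping by `log_ts` rather than by insertion time follows the document's statement that all services reconcile against the same log time; any minute the query returns is one where the received and sent totals disagree.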
