This is an automated email from the ASF dual-hosted git repository.

doleyzi pushed a commit to branch INLONG-945
in repository https://gitbox.apache.org/repos/asf/inlong-website.git
commit f3fac76a21595d9e16e7c07d12d23c88487fc179
Author: doleyzi <dole...@qq.com>
AuthorDate: Mon Apr 29 20:56:57 2024 +0800

    Update inlong audit usage documentation.
---
 docs/modules/audit/configure.md                    |  25 ++++-
 docs/modules/audit/img/audit_architecture.png      | Bin 30983 -> 83639 bytes
 docs/modules/audit/img/audit_openapi.png           | Bin 0 -> 56423 bytes
 docs/modules/audit/overview.md                     | 116 +++----------------
 docs/modules/audit/quick_start.md                  |  32 +++++-
 .../current/modules/audit/configure.md             |  34 ++++--
 .../modules/audit/img/audit_architecture.png       | Bin 33085 -> 83639 bytes
 .../current/modules/audit/img/audit_openapi.png    | Bin 0 -> 56423 bytes
 .../current/modules/audit/overview.md              | 119 +++-------------------
 .../current/modules/audit/quick_start.md           |  38 ++++++-
 10 files changed, 148 insertions(+), 216 deletions(-)

diff --git a/docs/modules/audit/configure.md b/docs/modules/audit/configure.md
index b53ff068f7..7f7b234db0 100644
--- a/docs/modules/audit/configure.md
+++ b/docs/modules/audit/configure.md
@@ -5,8 +5,9 @@ sidebar_position: 3
 
 ## Overview
 
-Audit-proxy source-channel-sink pipeline configuration (audit-proxy-{tube|pulsar|kafka}.conf).Audit-store storage service
-configuration (application.properties)
+* Audit-proxy source-channel-sink pipeline configuration (`audit-proxy-{tube|pulsar|kafka}.conf`).
+* Audit-store storage service configuration (`application.properties`).
+* OpenAPI audit-service configuration (`audit-service.properties`).
 
 ## Audit-proxy source-channel-sink pipeline configuration (`audit-proxy-{tube|pulsar|kafka}.conf`)
 
@@ -86,4 +87,22 @@ configuration (application.properties)
 | clickhouse.driver | Set the driver type | ru.yandex.clickhouse.ClickHouseDriver | |
 | clickhouse.url | clickhouse URL | jdbc:clickhouse://127.0.0.1:8123/default | |
 | clickhouse.username | account name | default | |
-| clickhouse.password | password | default | |
\ No newline at end of file
+| clickhouse.password | password | default | |
+
+### StarRocks configuration
+
+| Parameter | Description | Default value | Notes |
+|---------------|---------------|---------------|-------|
+| jdbc.driver | Driver type | com.mysql.cj.jdbc.Driver | |
+| jdbc.url | StarRocks URL | jdbc:mysql://127.0.0.1:9020/default | |
+| jdbc.username | account name | default | |
+| jdbc.password | password | default | |
+
+## OpenAPI audit-service configuration (`audit-service.properties`)
+
+| Parameter | Description | Default value | Notes |
+|----------------|-------------|---------------|-------|
+| mysql.jdbc.url | MySQL URL | jdbc:mysql://127.0.0.1:3306/apache_inlong_audit | |
+| mysql.username | account name | default | |
+| mysql.password | password | default | |
\ No newline at end of file
diff --git a/docs/modules/audit/img/audit_architecture.png b/docs/modules/audit/img/audit_architecture.png
index 609b9b210b..3539c489f9 100644
Binary files a/docs/modules/audit/img/audit_architecture.png and b/docs/modules/audit/img/audit_architecture.png differ
diff --git a/docs/modules/audit/img/audit_openapi.png b/docs/modules/audit/img/audit_openapi.png
new file mode 100644
index 0000000000..c7bc0f9774
Binary files /dev/null and b/docs/modules/audit/img/audit_openapi.png differ
diff --git a/docs/modules/audit/overview.md b/docs/modules/audit/overview.md
index 46178d62c2..3f1c5bfc52 100644
--- a/docs/modules/audit/overview.md
+++ b/docs/modules/audit/overview.md
@@ -16,6 +16,8 @@ The transmission status of each module, and whether the data stream
 is lost or repeated
 
 3. The distribution service consumes the audit data of MQ, and writes the audit data to MySQL, Elasticsearch and ClickHouse.
 4. The interface layer encapsulates the data of MySQL, Elasticsearch and ClickHouse.
 5. Application scenarios mainly include report display, audit reconciliation, etc.
+6. Supports audit reconciliation for data backfill (supplementary recording) scenarios.
+7. Supports audit reconciliation in Flink checkpoint scenarios.
 
 ## Audit Dimension
 | | | || | | | | | |
@@ -101,119 +103,31 @@ message AuditReply {
 ***2. Data Uniqueness***
 ***3. Reduce data loss caused by abnormal restart***
 
-### Main Logic Diagram
-
-1. The sdk provides the add interface externally. The parameters are: audit_id, inlong_group_id, inlong_stream_id, number, size.
-2. The sdk uses log time + audit_id + inlong_group_id + inlong_stream_id as the key to perform real-time statistics.
-3. When the sending cycle is reached or the business program actively triggers it, the SDK packages the statistical results with the PB protocol and sends them to the audit access layer.
-4. If the send in (3) fails, the data is put into the failure queue and sent again in the next cycle.
-5. When the failure queue is greater than the threshold, disaster recovery is performed through local files.
-
 ### Service Discovery
 Name discovery between the sdk and the access layer supports plug-ins, including domain name, VIP, etc.
 
 ### Disaster Recovery
-
-1. When the SDK fails to send to the access layer, the data will be placed in the failure queue.
-2. When the failure queue reaches the threshold, it will be written to the local disaster recovery file.
-3. When the local disaster recovery file reaches the threshold, the old data will be eliminated (by time).
+* When the SDK fails to send to the access layer, the data is placed in the failure queue.
+* When the failure queue reaches the threshold, it is written to the local disaster recovery file.
+* When the local disaster recovery file reaches the threshold, the old data is eliminated (by time).
 
 ## Access layer Implementation
 ### Target
 ***1. High reliability***
-***2.at least once***
-
-### Main Logic Diagram
-
-1. After the access layer receives the packet sent by the sdk, it writes it to the message queue.
-2. After writing to the message queue successfully, return success to the sdk.
-3. The data protocol of the message queue is the PB protocol.
-4. Set the ack of the message queue write to -1 or all.
+***2. at least once***
 
-## Elasticsearch Distribution Implementation
+## Distribution Implementation
 ### Target
 ***1. High real-time performance (minute level)***
 ***2. Can operate tens of billions of audit data per day***
 ***3. Can be deduplicated***
 
-### Main Logic Diagram
-
-1. The distribution service AuditDds consumes messages in real time.
-2. According to the audit ID in the audit data, route the data to the corresponding Elasticsearch cluster.
-3. Each audit ID corresponds to an Elasticsearch index.
-
-### Elasticsearch Index Design
-#### Index Name
-The index name consists of date + audit item ID, such as 20211019_1, 20211019_2.
-#### Index Field Schema
-
-|field |type |description |
-|---- |---- |----|
-|audit_id |keyword |Audit ID |
-|inlong_group_id |keyword |inlong_group_id |
-|inlong_stream_id |keyword |inlong_stream_id |
-|docker_id |keyword |ID of the container where the sdk is located |
-|thread_id |keyword |thread ID |
-|packet_id |keyword |Package ID reported by the sdk |
-|ip |keyword |Machine IP |
-|log_ts |keyword |log time |
-|sdk_ts |long |Audit SDK reporting time |
-|count |long |Number of logs |
-|size |long |Size of the logs |
-|delay |long |The log transfer time, equal to the current machine time minus the log time |
-
-#### Elasticsearch Index Storage Period
-Stored by day; the storage period is dynamically configurable.
-
-## Elasticsearch Write Design
-### The relationship between inlong_group_id, inlong_stream_id, audit ID and the Elasticsearch index
-
-In system design and service implementation, inlong_group_id, inlong_stream_id and audit ID have a 1:N relationship with the Elasticsearch index.
-
-### Write Routing Policy
-
-inlong_group_id and inlong_stream_id are used to route to Elasticsearch shards, ensuring that the same inlong_group_id and inlong_stream_id are stored in the same shard.
-When the same inlong_group_id and inlong_stream_id are written to the same shard, queries and aggregations only need to process one shard, which can greatly improve performance.
-
-### Optional deduplication by doc_id
-Elasticsearch is resource-intensive for real-time deduplication, so this function is made optional through configuration.
-
-### Use bulk batch method
-Bulk writes are used, 5000 records per batch, to improve the write performance of the Elasticsearch cluster.
-
-## MySQL Distribution Implementation
-### Target
-***1. High real-time performance (minute level)***
-***2. Simple to deploy***
-***3. Can be deduplicated***
-
-### Main Logic Diagram
-
-MySQL distribution supports distribution to different MySQL instances according to the audit ID, and supports horizontal expansion.
-
-### Usage introduction
- 1. When the audit scale of the business is relatively small, less than ten million records per day, MySQL can be considered as the audit storage, because MySQL is much simpler to deploy than Elasticsearch and the resource cost is much lower.
- 2. If the scale of the audit data is large and MySQL cannot support it, Elasticsearch can be considered as the storage; after all, a single Elasticsearch cluster can support tens of billions of audit records and horizontal expansion.
-
-## ClickHouse Distribution Implementation
-### Target
-***1. High real-time performance (minute level)***
-***2. Simple to deploy***
-***3. Can be deduplicated***
-
-### Main Logic Diagram
-ClickHouse distribution supports distribution to different ClickHouse instances according to the audit ID, and supports horizontal expansion.
-
-### Usage introduction
- 1. When the audit scale of the business is huge and you want to use SQL to access audit data, ClickHouse can be considered as the audit storage, because ClickHouse supports SQL access, tens of billions of audit records, and horizontal expansion.
-
-## Audit Usage Interface Design
-### Main Logic Diagram
-
-The audit interface layer queries MySQL/ClickHouse with SQL, or Elasticsearch via RESTful. Which storage the interface queries depends on which type of storage is used.
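+
+The minute-level reconciliation that the distribution layer enables can be spot-checked directly in the storage. The following is only a sketch, assuming the MySQL schema created by `sql/apache_inlong_audit_mysql.sql` (an `audit_data` table with `audit_id`, `inlong_group_id`, `inlong_stream_id`, `log_ts`, `count` and `size` columns); verify the table and column names against the shipped SQL file before relying on it:
+
+```shell
+# Hypothetical spot-check: compare records/bytes per minute between two audit
+# IDs (e.g. DataProxy received vs. DataProxy sent) for one stream; equal
+# totals per log_ts suggest no loss or duplication between the two points.
+mysql -uDB_USER -pDB_PASSWD apache_inlong_audit -e "
+SELECT audit_id, log_ts, SUM(count) AS records, SUM(size) AS bytes
+  FROM audit_data
+ WHERE inlong_group_id = 'test_group'
+   AND inlong_stream_id = 'test_stream'
+   AND audit_id IN ('5', '6')
+ GROUP BY audit_id, log_ts
+ ORDER BY log_ts;"
+```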
-
-### UI Interface Display
-### Main Logic Diagram
+## OpenAPI Implementation
+### Architecture
+
+
+* The audit interface layer provides OpenAPI capabilities externally through real-time aggregation and local caching of multiple audit data sources.
+
+### UI Interface Display
+### Architecture
 
 
-The front-end page pulls the audit data of each module through the interface layer and displays it.
\ No newline at end of file
+* The front-end page pulls the audit data of each module through the interface layer and displays it.
\ No newline at end of file
diff --git a/docs/modules/audit/quick_start.md b/docs/modules/audit/quick_start.md
index e6c6bce11e..ec51ffe643 100644
--- a/docs/modules/audit/quick_start.md
+++ b/docs/modules/audit/quick_start.md
@@ -44,10 +44,11 @@ agent1.sinks.kafka-sink-msg2.topic = inlong-audit
 # By default, pulsar is used as the MessageQueue, and the audit-proxy-pulsar.conf configuration file is loaded.
 bash +x ./bin/proxy-start.sh [pulsar|tube|kafka]
 ```
+The default listen port is `10081`.
 
 ## Audit Store
 ### Configure
-The configuration file is `conf/application.properties`.
+The configuration file is `conf/application.properties`.
 
 ```Shell
 # proxy.type: pulsar / tube / kafka
@@ -87,6 +88,12 @@ clickhouse.driver=ru.yandex.clickhouse.ClickHouseDriver
 clickhouse.url=jdbc:clickhouse://127.0.0.1:8123/default
 clickhouse.username=default
 clickhouse.password=default
+
+# starrocks config (optional)
+jdbc.driver=com.mysql.cj.jdbc.Driver
+jdbc.url=jdbc:mysql://127.0.0.1:9020/apache_inlong_audit?characterEncoding=utf8&useSSL=false&serverTimezone=GMT%2b8&rewriteBatchedStatements=true&allowMultiQueries=true&zeroDateTimeBehavior=CONVERT_TO_NULL
+jdbc.username=*******
+jdbc.password=********
 ```
 
 ### Dependencies
@@ -99,4 +106,25 @@ clickhouse.password=default
 bash +x ./bin/store-start.sh
 ```
 
-The default listen port is `10081`.
\ No newline at end of file
+## Audit Service
+### Configure
+The configuration file is `conf/audit-service.properties`.
+```Shell
+mysql.jdbc.url=jdbc:mysql://127.0.0.1:3306/apache_inlong_audit?characterEncoding=utf8&useUnicode=true&rewriteBatchedStatements=true
+mysql.username=*****
+mysql.password=*****
+```
+### Configure audit data sources
+In the audit_source_config table used by the Audit Service, configure the data sources for audit storage.
+
+### Configure audit items
+In the audit_id_config table used by the Audit Service, configure the audit items that need to be cached.
+
+### Dependencies
+- If the backend database is MySQL, please download [mysql-connector-java-8.0.28.jar](https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.28/mysql-connector-java-8.0.28.jar) and put it into the `lib/` directory.
+- If the backend database is PostgreSQL, no additional dependencies are needed.
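+
+For example, the MySQL driver can be fetched straight into place (run from the `inlong-audit` directory; the URL is the one given above):
+
+```shell
+# Download the MySQL JDBC driver into the Audit Service classpath.
+wget -P lib/ https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.28/mysql-connector-java-8.0.28.jar
+```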
+
+### Start
+```Shell
+bash +x ./bin/service-start.sh
+```
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/configure.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/configure.md
index 560580f4a2..a15d9ab44b 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/configure.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/configure.md
@@ -5,9 +5,12 @@ sidebar_position: 3
 
 ## Overview
 
-The audit proxy service audit-proxy is configured in `audit-proxy-{tube|pulsar|kafka}.conf`. The audit storage service audit-store is configured in `application.properties`.
+* The audit proxy service audit-proxy is configured in `audit-proxy-{tube|pulsar|kafka}.conf`.
+* The audit storage service audit-store is configured in `application.properties`.
+* The audit OpenAPI service audit-service is configured in `audit-service.properties`.
 
-## Audit proxy layer audit-proxy source-channel-sink pipeline configuration (audit-proxy-{tube|pulsar|kafka}.conf)
+## Audit proxy layer audit-proxy configuration (`audit-proxy-{tube|pulsar|kafka}.conf`)
 
 ### General settings
 
 | Parameter | Description | Default value | Notes |
 |---------------|--------------|------------------|-------|
 | agent1.sources | source type | tcp-source | |
 | agent1.channels | channel used | ch-msg1 | |
 | agent1.sinks | sink used | pulsar-sink-msg1 | |
-|
+
 
 ### sources settings
 
@@ -44,7 +47,8 @@ sidebar_position: 3
 | agent1.sinks.pulsar-sink-msg1.enable_token_auth | Whether security authentication is required | false | |
 | agent1.sinks.pulsar-sink-msg1.auth_token | pulsar authentication token | empty | |
 
-## Audit storage service audit-store configuration `application.properties`
+## Audit storage service audit-store configuration (`application.properties`)
 
 | elasticsearch.indexDeleteDay | Index retention time, in days | 5 | |
 | elasticsearch.auditIdSet | List of audit IDs allowed to be written | 1,2 | |
 
-### clickhouse configuration
+### ClickHouse configuration
 
 | Parameter | Description | Default value | Notes |
 |---------------|--------------|---------------|-------|
 | clickhouse.driver | Driver type | ru.yandex.clickhouse.ClickHouseDriver | |
 | clickhouse.url | clickhouse URL | jdbc:clickhouse://127.0.0.1:8123/default | |
 | clickhouse.username | account name | default | |
-| clickhouse.password | password | default | |
\ No newline at end of file
+| clickhouse.password | password | default | |
+
+### StarRocks configuration
+
+| Parameter | Description | Default value | Notes |
+|---------------|--------------|---------------|-------|
+| jdbc.driver | Driver type | com.mysql.cj.jdbc.Driver | |
+| jdbc.url | StarRocks URL | jdbc:mysql://127.0.0.1:9020/default | |
+| jdbc.username | account name | default | |
+| jdbc.password | password | default | |
+
+## OpenAPI service audit-service configuration (`audit-service.properties`)
+
+| Parameter | Description | Default value | Notes |
+|----------------|-------------|---------------|-------|
+| mysql.jdbc.url | MySQL URL | jdbc:mysql://127.0.0.1:3306/apache_inlong_audit | |
+| mysql.username | account name | default | |
+| mysql.password | password | default | |
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_architecture.png b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_architecture.png
index d11d35767c..3539c489f9 100644
Binary files a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_architecture.png and b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_architecture.png differ
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_openapi.png b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_openapi.png
new file mode 100644
index 0000000000..c7bc0f9774
Binary files /dev/null and b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_openapi.png differ
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/overview.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/overview.md
index 312d2dbb11..df85b260a3 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/overview.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/overview.md
@@ -3,8 +3,8 @@ title: Overview
 sidebar_position: 1
 ---
 
-InLong Audit is a subsystem independent of InLong, which performs real-time audit reconciliation on the inbound and outbound traffic of the Agent, DataProxy and Sort modules of the InLong system.
-There are three reconciliation granularities: minute, hour and day.
+* InLong Audit is a subsystem independent of InLong, which performs real-time audit reconciliation on the inbound and outbound traffic of the Agent, DataProxy and Sort modules of the InLong system.
+* The reconciliation granularities include minute, 10 minutes, 30 minutes, hour, day, and so on.
 
 Audit reconciliation uses the log reporting time as the unified standard, and each audited service reconciles in real time according to the same log time. Through audit reconciliation, we can clearly understand the transmission status of each InLong module, and whether the data stream is lost or repeated.
 
 ## Architecture
 
 1. The audit SDK is embedded in the service that needs to be audited, audits the service, and sends the audit results to the audit access layer.
 2. The audit access layer writes the audit data to MQ (Pulsar, Kafka or TubeMQ).
-3. The distribution service consumes the audit data of MQ, and writes the audit data to MySQL, Elasticsearch and ClickHouse.
-4. The interface layer encapsulates the data of MySQL, Elasticsearch and ClickHouse.
+3. The distribution service consumes the audit data of MQ, and writes the audit data to MySQL, Elasticsearch, ClickHouse and StarRocks.
+4. The interface layer aggregates the data of MySQL, Elasticsearch, ClickHouse and StarRocks in real time, caches it, and provides OpenAPI externally.
 5. Application scenarios mainly include report display, audit reconciliation, and so on.
+6. Supports audit reconciliation for data backfill (supplementary recording) scenarios.
+7. Supports audit reconciliation in Flink checkpoint scenarios.
 
 ## Audit Dimension
 | | | || | | | | | |
@@ -98,124 +100,37 @@ message AuditReply {
   optional string message = 2;
 }
 ```
-## Audit SDK implementation details
+## Audit SDK Implementation
 ### Target
 ***1. Support local disaster recovery***
 ***2. Data uniqueness***
 ***3. Reduce data loss caused by abnormal restart***
 
-### Main Logic Diagram
-
-1. The SDK provides an add interface externally, with the parameters: audit_id, inlong_group_id, inlong_stream_id, count, size.
-2. The SDK performs real-time statistics keyed by log time + audit_id + inlong_group_id + inlong_stream_id.
-3. When the sending cycle is reached or the business program actively triggers it, the SDK packages the statistical results with the PB protocol and sends them to the audit access layer.
-4. If the send in (3) fails, the data is put into the failure queue and sent again in the next cycle.
-5. When the failure queue exceeds the threshold, disaster recovery is performed through local files.
-
 ### Service Discovery
-Name discovery between the audit sdk and the access layer supports plug-ins, including domain name, VIP, etc.
+* Name discovery between the audit sdk and the access layer supports plug-ins, including domain name, VIP, etc.
 
 ### Disaster Recovery Logic
-1. When the sdk fails to send to the access layer, the data is put into the failure queue.
-2. When the failure queue reaches the threshold, it is written to the local disaster recovery file.
-3. When the local disaster recovery file reaches the threshold, the old data is eliminated (by time).
+* When the sdk fails to send to the access layer, the data is put into the failure queue.
+* When the failure queue reaches the threshold, it is written to the local disaster recovery file.
+* When the local disaster recovery file reaches the threshold, the old data is eliminated (by time).
 
-## Access layer implementation details
+## Access Layer Implementation
 ### Target
 ***1. High reliability***
 ***2. at least once***
 
-### Main Logic
-
-1. After the access layer receives the packet sent by the sdk, it writes it to the message queue.
-2. After the message queue is written successfully, success is returned to the sdk.
-3. The data protocol of the message queue is the PB protocol.
-4. The ack for writing to the message queue is set to -1 or all.
-
-## Elasticsearch Distribution Implementation
-### Target
-***1. High real-time performance (minute level)***
-***2. Can operate tens of billions of audit data per day***
-***3. Can be deduplicated***
-
-### Main Logic Diagram
-
-1. The distribution service AuditDds consumes messages in real time.
-2. According to the audit ID in the audit data, the data is routed to the corresponding Elasticsearch cluster.
-3. Each audit ID corresponds to an Elasticsearch index.
-
-### Index Design
-#### Index Name
-The index name consists of date + audit item ID, such as 20211019_1, 20211019_2.
-#### Index Field Schema
-
-|field |type |description |
-|---- |---- |----|
-|audit_id |keyword |audit ID |
-|inlong_group_id |keyword |inlong_group_id |
-|inlong_stream_id |keyword |inlong_stream_id |
-|docker_id |keyword |ID of the container where the sdk is located |
-|thread_id |keyword |thread ID |
-|packet_id |keyword |package ID reported by the sdk |
-|ip |keyword |machine IP |
-|log_ts |keyword |log time |
-|sdk_ts |long |audit SDK reporting time |
-|count |long |number of logs |
-|size |long |size of the logs |
-|delay |long |log transfer time, equal to the current machine time minus the log time |
-
-#### Index Storage Period
-Stored by day; the storage period is dynamically configurable.
-
-## Elasticsearch Write Design
-### The relationship between inlong_group_id, inlong_stream_id, audit ID and the Elasticsearch index
-
-In system design and service implementation, inlong_group_id, inlong_stream_id and audit ID have a 1:N relationship with the Elasticsearch index.
-
-### Write Routing Policy
-
-inlong_group_id and inlong_stream_id are used to route to Elasticsearch shards, ensuring that the same inlong_group_id and inlong_stream_id are stored in the same shard.
-When the same inlong_group_id and inlong_stream_id are written to the same shard, queries and aggregations only need to process one shard, which can greatly improve performance.
-
-### Optional deduplication by doc_id
-Real-time deduplication in Elasticsearch is resource-intensive, so this function is made optional through configuration.
-
-### Bulk batch writing
-Bulk writes are used, 5000 records per batch, to improve the write performance of the Elasticsearch cluster.
-
-## MySQL Distribution Implementation
-### Target
-***1. High real-time performance (minute level)***
-***2. Simple to deploy***
-***3. Can be deduplicated***
-
-### Main Logic Diagram
-
-MySQL distribution supports distribution to different MySQL instances according to the audit ID, and supports horizontal expansion.
-
-### Usage introduction
- 1. When the audit scale of the business is relatively small, less than ten million records per day, MySQL can be considered as the audit storage, because MySQL is much simpler to deploy than Elasticsearch and the resource cost is much lower.
- 2. If the audit data scale is very large and MySQL cannot support it, Elasticsearch can be considered as the storage; after all, a single Elasticsearch cluster can support tens of billions of audit records and horizontal expansion.
-
-## ClickHouse Distribution Implementation
+## Distribution Layer Implementation
 ### Target
 ***1. High real-time performance (minute level)***
 ***2. Simple to deploy***
 ***3. Can be deduplicated***
 
+## OpenAPI Implementation
 ### Main Logic Diagram
-ClickHouse distribution supports distribution to different ClickHouse instances according to the audit ID, and supports horizontal expansion.
-
-### Usage introduction
- 1. A ClickHouse cluster supports tens of billions of audit records and horizontal expansion, and also supports accessing audit data with SQL; the resource cost is similar to that of Elasticsearch.
-
-## Audit Usage Interface Design
-### Main Logic Diagram
-
-The audit interface layer queries MySQL/ClickHouse with SQL, or Elasticsearch via RESTful. Which storage the interface queries depends on which storage is used.
+
+
+* The audit interface layer provides OpenAPI capabilities externally through real-time aggregation and local caching of multiple audit data sources.
 
 ### UI Display
 ### Main Logic Diagram
 
-The front-end page pulls the audit data of each module through the interface layer and displays it.
\ No newline at end of file
+* The front-end page pulls the audit data of each module through the interface layer and displays it.
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/quick_start.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/quick_start.md
index daa8a1fad5..80e57730df 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/quick_start.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/quick_start.md
@@ -5,7 +5,7 @@ title: Installation and Deployment
 All installation files are in the `inlong-audit` directory. If MySQL is used to store audit data, the database needs to be initialized first through `sql/apache_inlong_audit.sql`.
 ```shell
 # Initialize the database
-mysql -uDB_USER -pDB_PASSWD < sql/apache_inlong_audit.sql
+mysql -uDB_USER -pDB_PASSWD < sql/apache_inlong_audit_mysql.sql
 ```
 
 If ClickHouse is used to store audit data, the database needs to be initialized first through `sql/apache_inlong_audit_clickhouse.sql`.
 ```shell
 # Initialize the database
 clickhouse client -u DB_USER --password DB_PASSWD < sql/apache_inlong_audit_clickhouse.sql
 ```
+
+If StarRocks is used to store audit data, the database needs to be initialized first through `sql/apache_inlong_audit_starrocks.sql`.
+```shell
+# Initialize the StarRocks database
+mysql -uDB_USER -pDB_PASSWD < sql/apache_inlong_audit_starrocks.sql
+```
 
 ## Dependencies
 - If the backend database is MySQL, please download [mysql-connector-java-8.0.28.jar](https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.28/mysql-connector-java-8.0.28.jar) and put it into the `lib/` directory.
@@ -47,6 +53,7 @@ agent1.sinks.kafka-sink-msg2.topic = inlong-audit
 # By default, pulsar is used as the message queue, and the audit-proxy-pulsar.conf configuration file is loaded.
 bash +x ./bin/proxy-start.sh [pulsar|tube|kafka]
 ```
+The Audit Proxy listens on port `10081` by default.
 
 ## Audit Store
 ### Configure
@@ -89,6 +96,12 @@ elasticsearch.port=9200
 clickhouse.url=jdbc:clickhouse://127.0.0.1:8123/default
 clickhouse.username=default
 clickhouse.password=default
+
+# starrocks config (optional)
+jdbc.driver=com.mysql.cj.jdbc.Driver
+jdbc.url=jdbc:mysql://127.0.0.1:9020/apache_inlong_audit?characterEncoding=utf8&useSSL=false&serverTimezone=GMT%2b8&rewriteBatchedStatements=true&allowMultiQueries=true&zeroDateTimeBehavior=CONVERT_TO_NULL
+jdbc.username=*******
+jdbc.password=********
 ```
 
 ### Dependencies
@@ -100,4 +113,25 @@ clickhouse.password=default
 bash +x ./bin/store-start.sh
 ```
 
-The Audit Proxy listens on port `10081` by default.
\ No newline at end of file
+## Audit Service
+### Configure
+The configuration file is `conf/audit-service.properties`.
+```Shell
+mysql.jdbc.url=jdbc:mysql://127.0.0.1:3306/apache_inlong_audit?characterEncoding=utf8&useUnicode=true&rewriteBatchedStatements=true
+mysql.username=*****
+mysql.password=*****
+```
+#### Configure audit data sources
+In the audit_source_config table used by the Audit Service, configure the data sources for audit storage.
+
+#### Configure audit items
+In the audit_id_config table used by the Audit Service, configure the audit items that need to be cached.
+
+### Dependencies
+- If the backend database is MySQL, please download [mysql-connector-java-8.0.28.jar](https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.28/mysql-connector-java-8.0.28.jar) and put it into the `lib/` directory.
+- If the backend database is PostgreSQL, no additional dependencies are needed.
+
+### Start
+```Shell
+bash +x ./bin/service-start.sh
+```
\ No newline at end of file
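+
+A minimal post-start sanity check (a sketch only: the grep pattern and the log location are assumptions, adjust them to the actual deployment):
+
+```shell
+# Confirm the Audit Service process is running and look at its latest log output.
+ps -ef | grep -i "audit" | grep -v grep
+tail -n 50 logs/*.log
+```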