This is an automated email from the ASF dual-hosted git repository.

gosonzhang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/inlong-website.git


The following commit(s) were added to refs/heads/master by this push:
     new c6079e594c [INLONG-895][Doc] Improve HTTP report documentation (#896)
c6079e594c is described below

commit c6079e594ca044ad318c0f55807b5d8c612ce788
Author: Goson Zhang <4675...@qq.com>
AuthorDate: Thu Nov 30 19:24:23 2023 +0800

    [INLONG-895][Doc] Improve HTTP report documentation (#896)
---
 docs/sdk/dataproxy-sdk/http.md                      |  16 ++++++++++++++++
 docs/sdk/dataproxy-sdk/img/http_report.png          | Bin 0 -> 166652 bytes
 .../current/sdk/dataproxy-sdk/http.md               |  20 ++++++++++++++++++++
 .../current/sdk/dataproxy-sdk/img/http_report.png   | Bin 0 -> 166652 bytes
 4 files changed, 36 insertions(+)

diff --git a/docs/sdk/dataproxy-sdk/http.md b/docs/sdk/dataproxy-sdk/http.md
index 159884836c..98407553c8 100644
--- a/docs/sdk/dataproxy-sdk/http.md
+++ b/docs/sdk/dataproxy-sdk/http.md
@@ -3,9 +3,25 @@ title: HTTP Report
 sidebar_position: 3
 ---
 
+## Introduction to the HTTP Reporting Process
+InLong processes HTTP report messages through DataProxy nodes: the reporting source periodically obtains the access point list from the Manager, selects an available HTTP reporting node from that list according to its own strategy, and then sends data over the HTTP protocol. The overall HTTP reporting process is illustrated in the following diagram:
+
+![](img/http_report.png)
+
+- Heartbeat reporting: DataProxy periodically reports heartbeats to the 
Manager, providing information about the enabled access points, including {IP, 
Port, Protocol, Load}.
+- Online node caching: The Manager caches the heartbeat information reported by DataProxy, so that it knows which access nodes in the cluster are available and what reporting access information they offer.
+- Access point acquisition: The HTTP SDK (either the HttpProxySender implemented by DataProxy-SDK or an HTTP reporting SDK developed according to the HTTP reporting protocol) periodically obtains the list of available reporting access points for the current groupId by calling the "/inlong/manager/openapi/dataproxy/getIpList/{inlongGroupId}" method on the Manager.
+- Access point selection: The HTTP SDK selects the DataProxy node for message 
reporting based on the reporting node selection strategy.
+- Data reporting: The HTTP SDK constructs the reporting message according to the HTTP reporting protocol and sends the request to the selected DataProxy node. After receiving the response, it decides based on the result whether to resend the message or output an exception.
+- Data acceptance: DataProxy checks the HTTP message. If the message is 
successfully accepted, it returns a success response and forwards the message 
to the MQ cluster. If the message format or value does not meet the 
specifications, or if the message processing fails, DataProxy returns a failure 
response with the corresponding error code and detailed error information.
+
+Suggestion:
+HTTP reporting has low performance, carries a low proportion of valid data, and loses request messages easily, so businesses are advised to prefer the TCP method for data reporting.
+
 ## Create real-time synchronization task
 Create a task on the Dashboard or through the command line, and use `Auto 
Push` (autonomous push) as the data source type.
 
+
 ## Method 1: Call the interface to report (CURL)
 ```bash
 curl -X POST -d 'groupId=give_your_group_id&streamId=give_your_stream_id&dt=data_time&body=give_your_data_body&cnt=1' http://dataproxy_url:46802/dataproxy/message
diff --git a/docs/sdk/dataproxy-sdk/img/http_report.png b/docs/sdk/dataproxy-sdk/img/http_report.png
new file mode 100644
index 0000000000..7b49d8641b
Binary files /dev/null and b/docs/sdk/dataproxy-sdk/img/http_report.png differ
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sdk/dataproxy-sdk/http.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sdk/dataproxy-sdk/http.md
index 427f5712ae..b2b07d422d 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sdk/dataproxy-sdk/http.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sdk/dataproxy-sdk/http.md
@@ -3,6 +3,26 @@ title: HTTP Report
 sidebar_position: 3
 ---
 
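+## Introduction to the HTTP Reporting Process
+InLong processes HTTP report messages through DataProxy nodes: the reporting source periodically obtains the access point list from the Manager, selects an available HTTP reporting node from that list according to its own strategy, and then sends data over the HTTP protocol. The overall HTTP reporting process is shown in the following diagram: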
+
+![](img/http_report.png)
+
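+- Heartbeat reporting: DataProxy periodically reports heartbeats to the Manager, providing the {IP, Port, Protocol, Load} information of the access points enabled on the node;
+
+- Online node caching: The Manager caches the heartbeat information reported by DataProxy, so that it knows which access nodes in the cluster are available and what reporting access information they offer;
+
+- Access point acquisition: The HTTP SDK (the reporting source uses either the HttpProxySender implemented by DataProxy-SDK or an HTTP reporting SDK developed independently according to the HTTP reporting protocol) periodically obtains the list of available reporting access points for the groupId being reported by calling the "/inlong/manager/openapi/dataproxy/getIpList/{inlongGroupId}" method on the Manager;
+
+- Access point selection: The HTTP SDK selects the DataProxy node to report messages to according to the reporting node selection strategy;
+
+- Data reporting: The HTTP SDK constructs the reporting message according to the HTTP reporting protocol and sends the request to the selected DataProxy node. After receiving the response, it decides based on the result whether to resend the message or output an exception;
+
+- Data acceptance: DataProxy checks the HTTP message. If the message is accepted, it returns a success response and forwards the message to the MQ cluster. If the message format or value does not meet the specifications, or if message processing fails, DataProxy returns a failure response carrying the corresponding error code and detailed error information.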
+
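+Suggestion:
+  HTTP reporting has low performance, carries a low proportion of valid data, and loses request messages easily, so businesses are advised to use the TCP method for data reporting whenever possible.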
+
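 ## Create real-time synchronization task
 Create a task on the Dashboard or with the command-line tool, and use `Auto Push` (autonomous push) as the data source type.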
 
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sdk/dataproxy-sdk/img/http_report.png b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sdk/dataproxy-sdk/img/http_report.png
new file mode 100644
index 0000000000..7b49d8641b
Binary files /dev/null and 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sdk/dataproxy-sdk/img/http_report.png
 differ
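
The access point selection and data reporting steps described in the documentation diff above can be sketched in bash. This is a minimal sketch under stated assumptions, not the DataProxy-SDK implementation. The node list is assumed to have been obtained already from the Manager's getIpList method; its response format is not described here, so parsing is left out. The function names `report_url`, `send_once`, and `report_with_retry` are hypothetical. Walking the list from a random starting node is only one possible selection strategy, and `dt` is assumed to be a millisecond timestamp. A request counts as failed only when curl reports a transport error or an HTTP error status, whereas a real client would probably also inspect the error code in the response body. The form fields and the `/dataproxy/message` path are taken from the curl example in the document.

```bash
#!/usr/bin/env bash
# Sketch of "access point selection" and "data reporting" with failover.
# Assumption: nodes are passed as host:port strings already fetched
# from the Manager; parsing the getIpList response is not shown.

# Build the report URL for one node given as host:port.
report_url() {
  echo "http://$1/dataproxy/message"
}

# Send one message to one node. Returns curl's exit status, which is
# non-zero on a transport error or (because of --fail) an HTTP error.
send_once() {
  local node=$1 group_id=$2 stream_id=$3 body=$4
  # dt as milliseconds since the epoch is an assumption.
  curl --silent --show-error --fail --max-time 5 -X POST \
    --data-urlencode "groupId=${group_id}" \
    --data-urlencode "streamId=${stream_id}" \
    --data-urlencode "dt=$(date +%s)000" \
    --data-urlencode "body=${body}" \
    --data-urlencode "cnt=1" \
    "$(report_url "$node")"
}

# Start at a random node and try each node once until one succeeds.
report_with_retry() {
  local group_id=$1 stream_id=$2 body=$3
  shift 3
  local nodes=("$@") count=$#
  if ((count == 0)); then
    echo "no access points available" >&2
    return 1
  fi
  local start=$((RANDOM % count)) i node
  for ((i = 0; i < count; i++)); do
    node=${nodes[$(((start + i) % count))]}
    if send_once "$node" "$group_id" "$stream_id" "$body"; then
      return 0
    fi
    echo "report to ${node} failed, trying next node" >&2
  done
  return 1
}
```

A call might look like `report_with_retry my_group my_stream 'hello' 10.0.0.1:46802 10.0.0.2:46802`, where the group, stream, and addresses are placeholders. Trying each node at most once keeps the retry bounded, which matters because the document notes that HTTP request messages are easily lost.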
