mrzhugit opened a new issue, #8956:
URL: https://github.com/apache/seatunnel/issues/8956

   ### Search before asking
   
   - [x] I had searched in the [issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.
   
   
   ### What happened
   
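   I deployed a SeaTunnel Engine cluster in separated mode, with one master and one worker, configured with 8 cores and 12 GB of memory. I submitted a streaming job that moves data from Kafka to HDFS. The Kafka topic contains 10 million records.
   Job env: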
   env {
     parallelism = 1
     job.mode = "STREAMING"
     checkpoint.interval = 10000
     read_limit.rows_per_second=1
   }
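   After the job starts, memory usage on the worker keeps rising, from 200 MB to 5 GB within 5 minutes. If the job is stopped through the API, the memory is not released.
   If the job is then restored to the RUNNING state, memory keeps growing until OOM, and the worker node restarts.
   read_limit.rows_per_second=1 does not actually limit how fast data is read from Kafka.
   Looking at the createReader method of org.apache.seatunnel.connectors.seatunnel.kafka.source.KafkaSource, elementsQueue = new LinkedBlockingQueue<>(); is created without a capacity. I suspect this is what causes the memory overflow.
   Data reading is not actually tied to the read_limit.rows_per_second limit.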
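
To illustrate the suspicion above, here is a minimal standalone sketch of how a `LinkedBlockingQueue` behaves with and without a capacity when nothing drains it quickly. It is not SeaTunnel code: the class `BoundedQueueSketch`, the method `offerAll`, and the capacity of 100 are hypothetical names and values chosen only for illustration, and I have not verified that bounding the queue is the right fix for `KafkaSource`.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of SeaTunnel.
public class BoundedQueueSketch {

    // Tries to enqueue `count` records while no consumer drains the queue.
    // Returns how many records the queue accepted.
    static int offerAll(BlockingQueue<String> queue, int count) {
        int accepted = 0;
        for (int i = 0; i < count; i++) {
            try {
                // offer with a timeout returns false if a bounded queue stays full
                if (queue.offer("record-" + i, 1, TimeUnit.MILLISECONDS)) {
                    accepted++;
                }
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop producing
                Thread.currentThread().interrupt();
                break;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // No capacity argument: the queue accepts up to Integer.MAX_VALUE elements
        int unbounded = offerAll(new LinkedBlockingQueue<>(), 200);
        // Assumed capacity of 100: the producer is pushed back once the queue is full
        int bounded = offerAll(new LinkedBlockingQueue<>(100), 200);
        System.out.println("unbounded accepted: " + unbounded);
        System.out.println("bounded accepted:   " + bounded);
    }
}
```

With no capacity, the queue keeps accepting records for as long as the producer supplies them, so a fast Kafka fetch paired with a consumer throttled to 1 row per second can accumulate records in heap. With a capacity, the producer gets pushed back once the queue is full.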
   
   ### SeaTunnel Version
   
   2.3.9
   
   ### SeaTunnel Config
   
   ```conf
   null
   ```
   
   ### Running Command
   
   ```shell
   null
   ```
   
   ### Error Exception
   
   ```log
   OOM
   ```
   
   ### Zeta or Flink or Spark Version
   
   Zeta
   
   ### Java or Scala Version
   
   _No response_
   
   ### Screenshots
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@seatunnel.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
