[
https://issues.apache.org/jira/browse/KAFKA-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Davor Poldrugo updated KAFKA-4455:
----------------------------------
Attachment: RocksDBException_IO-error_stacktrace.txt
> Commit during rebalance does not close RocksDB which later causes:
> org.rocksdb.RocksDBException: IO error: lock .../LOCK: No locks available
> --------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-4455
> URL: https://issues.apache.org/jira/browse/KAFKA-4455
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.10.1.0
> Environment: Kafka Streams were running on CentOS - I have observed
> this - after some time the locks were released even if the jvm/process wasn't
> restarted, so I guess CentOS has some lock cleaning policy.
> Reporter: Davor Poldrugo
> Labels: stacktrace
> Attachments: RocksDBException_IO-error_stacktrace.txt
>
>
> h2. Problem description
> From time to time a rebalance in Kafka Streams causes the commit to throw
> CommitFailedException. When this exception is thrown, the tasks and
> processors are not closed. If some processor contains a state store
> (RocksDB), the RocksDB is not closed, which leads to not relasead LOCK's on
> OS level, and when the Kafka Streams app is trying to open tasks and their
> respective processors and state stores the {{org.rocksdb.RocksDBException: IO
> error: lock .../LOCK: No locks available}} is thrown. If the the jvm/process
> is restarted the locks are released.
> h2. Additional info
> I have been running 3 Kafka Streams instances on separate machines with
> {{num.stream.threads=1}} and each with it's own state directory. Other Kafka
> Streams apps were running but they had separate directories for state stores.
> h2. Stacktrace
> [^RocksDBException_IO-error_stacktrace.txt]
> h2. Suggested solution
> To avoid restarting the jvm, modify Kafka Streams to close tasks, which will
> lead to release of resources - in this case - filesystem LOCK files.
> h2. Possible solution code
> Branch: https://github.com/dpoldrugo/kafka/commits/infobip-fork
> Commit: [BUGFIX: When commit fails during rebalance - release
> resources|https://github.com/dpoldrugo/kafka/commit/af0d16fc5f8629ab0583c94edf3dbf41158b73f3]
> h2. Note
> This could be related this issues: KAFKA-3708 and KAFKA-3938
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)