Thanks Matthias. Please find my response inline. On Wed, Dec 6, 2017 at 12:34 AM, Matthias J. Sax <matth...@confluent.io> wrote:
> Hard to say. > > However, deleting state directories will not have any negative impact as > you don't use stores. Thus, why do you not want to do this? > <Sri> We are running this in a prod-like setup. I don't have SSH access to > the box. Its not ideal to request deleting these every now and then. It doesn't > sound prod ready. > > Another workaround you can do, it to start four applications with 1 > thread each -- this would isolate the instances further and avoid the > lock issue (you would need to run on different host, or on the same > host, but with different state directory for the different instances.) > <Sri> App has a large in-mem cache. I'd like more threads to use this > cache to keep memory/cpu usage in balance. > In general, I would recommend to upgrade you applications -- you don't > need to upgrade the brokers for this. 1.0 versions has many additional > bug fixes. This should resolve the issues you are facing. <Sri> I haven't really tried bidirectional compatibility mode. Is there any > caveat I need to be aware of for 1.0.x client and 0.10.2.1 brokers? I probably should try this. > Otherwise, it's hard to say without logs. > <Sri> Unfortunately, that's all I had in the logs. I know it isn't > helpful. I've kept default logging to WARN except for few packages. I assume any logging that indicates why app isn't making progress would be > in WARN. I'll bump the level up for few days. > Hope this helps. > > > -Matthias > > > On 12/5/17 9:35 AM, Srikanth wrote: > > Hello, > > > > We noticed that a kafka streams app is stuck in rebalance state with > below > > error. > > Two instance of the app were running fine until a rebalace was > > triggered(possibly due to network issue). > > Both app instance are running(no app restart) > > App itself doesn't create/use state store. NUM_STREAM_THREADS_CONFIG=2 > > > > I did see several tickets with similar errors that are marked as fixed. > I'm > > using version 0.10.2.1. > > > > 17/12/04 18:34:57 WARN StreamThread: Could not create task 0_2. Will > retry. > > org.apache.kafka.streams.errors.LockException: task [0_2] Failed to lock > > the state directory for task 0_2 > > at > > org.apache.kafka.streams.processor.internals. > ProcessorStateManager.<init>(ProcessorStateManager.java:100) > > at > > org.apache.kafka.streams.processor.internals.AbstractTask.<init>( > AbstractTask.java:73) > > at > > org.apache.kafka.streams.processor.internals. > StreamTask.<init>(StreamTask.java:108) > > at > > org.apache.kafka.streams.processor.internals. > StreamThread.createStreamTask(StreamThread.java:864) > > at > > org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator. > createTask(StreamThread.java:1237) > > at > > org.apache.kafka.streams.processor.internals.StreamThread$ > AbstractTaskCreator.retryWithBackoff(StreamThread.java:1210) > > at > > org.apache.kafka.streams.processor.internals. > StreamThread.addStreamTasks(StreamThread.java:967) > > at > > org.apache.kafka.streams.processor.internals.StreamThread.access$600( > StreamThread.java:69) > > at > > org.apache.kafka.streams.processor.internals.StreamThread$1. > onPartitionsAssigned(StreamThread.java:234) > > at > > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator. > onJoinComplete(ConsumerCoordinator.java:259) > > at > > org.apache.kafka.clients.consumer.internals.AbstractCoordinator. > joinGroupIfNeeded(AbstractCoordinator.java:352) > > at > > org.apache.kafka.clients.consumer.internals.AbstractCoordinator. > ensureActiveGroup(AbstractCoordinator.java:303) > > at > > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll( > ConsumerCoordinator.java:290) > > at > > org.apache.kafka.clients.consumer.KafkaConsumer. > pollOnce(KafkaConsumer.java:1029) > > at > > org.apache.kafka.clients.consumer.KafkaConsumer.poll( > KafkaConsumer.java:995) > > at > > org.apache.kafka.streams.processor.internals.StreamThread.runLoop( > StreamThread.java:592) > > at > > org.apache.kafka.streams.processor.internals. > StreamThread.run(StreamThread.java:361) > > 17/12/04 18:34:58 WARN StreamThread: Could not create task 0_8. Will > retry. > > org.apache.kafka.streams.errors.LockException: task [0_8] Failed to lock > > the state directory for task 0_8 > > > > > > kafka-consumer-groups --new-consumer --bootstrap-server <...> --describe > > --group GeoTest > > Note: This will only show information about consumers that use the Java > > consumer API (non-ZooKeeper-based consumers). > > > > Warning: Consumer group 'GeoTest' is rebalancing. > > > > I keep seeing the above lock exception continuously and app is not making > > any progress. Any idea why it is stuck? > > I read a few suggestions that required me to manually delete state > > directory. I'd like to avoid that. > > > > Thanks, > > Srikanth > > > >