Matthias, We moved to streams client 1.0.0 with 0.10.2.x broker now. Haven't see this issue ever since. Thanks!
Srikanth On Thu, Dec 7, 2017 at 12:15 AM, Matthias J. Sax <matth...@confluent.io> wrote: > Running Streams 1.0.0 should just work with 0.10.2 brokers. Of course, > you can't use EOS feature. > > Not sure how your app got into this bad state. But if is does not > recover from it, delete the store directory seems to be reasonable -- > also for stateful application, you won't loose data, as we recreate the > stores from the changelog topic. A bug, does not mean not production > ready IMHO. Many people run 0.10.2 in production. But newer versions > should fix bugs of course :) > > Please let us know if 1.0.0 works for you. > > -Matthias > > On 12/6/17 3:18 AM, Srikanth wrote: > > Thanks Matthias. Please find my response inline. > > > > On Wed, Dec 6, 2017 at 12:34 AM, Matthias J. Sax <matth...@confluent.io> > > wrote: > > > >> Hard to say. > >> > >> However, deleting state directories will not have any negative impact as > >> you don't use stores. Thus, why do you not want to do this? > >> <Sri> We are running this in a prod-like setup. I don't have SSH access > to > >> the box. > > > > Its not ideal to request deleting these every now and then. It doesn't > >> sound prod ready. > >> > > > > > >> Another workaround you can do, it to start four applications with 1 > >> thread each -- this would isolate the instances further and avoid the > >> lock issue (you would need to run on different host, or on the same > >> host, but with different state directory for the different instances.) > >> <Sri> App has a large in-mem cache. I'd like more threads to use this > >> cache to keep memory/cpu usage in balance. > > > > > > > > > >> In general, I would recommend to upgrade you applications -- you don't > >> need to upgrade the brokers for this. 1.0 versions has many additional > >> bug fixes. This should resolve the issues you are facing. > > > > <Sri> I haven't really tried bidirectional compatibility mode. Is there > any > >> caveat I need to be aware of for 1.0.x client and 0.10.2.1 brokers? > > > > I probably should try this. > > > > > > > >> Otherwise, it's hard to say without logs. > >> <Sri> Unfortunately, that's all I had in the logs. I know it isn't > >> helpful. I've kept default logging to WARN except for few packages. > > > > I assume any logging that indicates why app isn't making progress would > be > >> in WARN. I'll bump the level up for few days. > > > > > > > >> Hope this helps. > >> > >> > >> -Matthias > >> > >> > >> On 12/5/17 9:35 AM, Srikanth wrote: > >>> Hello, > >>> > >>> We noticed that a kafka streams app is stuck in rebalance state with > >> below > >>> error. > >>> Two instance of the app were running fine until a rebalace was > >>> triggered(possibly due to network issue). > >>> Both app instance are running(no app restart) > >>> App itself doesn't create/use state store. NUM_STREAM_THREADS_CONFIG=2 > >>> > >>> I did see several tickets with similar errors that are marked as fixed. > >> I'm > >>> using version 0.10.2.1. > >>> > >>> 17/12/04 18:34:57 WARN StreamThread: Could not create task 0_2. Will > >> retry. > >>> org.apache.kafka.streams.errors.LockException: task [0_2] Failed to > lock > >>> the state directory for task 0_2 > >>> at > >>> org.apache.kafka.streams.processor.internals. > >> ProcessorStateManager.<init>(ProcessorStateManager.java:100) > >>> at > >>> org.apache.kafka.streams.processor.internals.AbstractTask.<init>( > >> AbstractTask.java:73) > >>> at > >>> org.apache.kafka.streams.processor.internals. > >> StreamTask.<init>(StreamTask.java:108) > >>> at > >>> org.apache.kafka.streams.processor.internals. > >> StreamThread.createStreamTask(StreamThread.java:864) > >>> at > >>> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator. > >> createTask(StreamThread.java:1237) > >>> at > >>> org.apache.kafka.streams.processor.internals.StreamThread$ > >> AbstractTaskCreator.retryWithBackoff(StreamThread.java:1210) > >>> at > >>> org.apache.kafka.streams.processor.internals. > >> StreamThread.addStreamTasks(StreamThread.java:967) > >>> at > >>> org.apache.kafka.streams.processor.internals.StreamThread.access$600( > >> StreamThread.java:69) > >>> at > >>> org.apache.kafka.streams.processor.internals.StreamThread$1. > >> onPartitionsAssigned(StreamThread.java:234) > >>> at > >>> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator. > >> onJoinComplete(ConsumerCoordinator.java:259) > >>> at > >>> org.apache.kafka.clients.consumer.internals.AbstractCoordinator. > >> joinGroupIfNeeded(AbstractCoordinator.java:352) > >>> at > >>> org.apache.kafka.clients.consumer.internals.AbstractCoordinator. > >> ensureActiveGroup(AbstractCoordinator.java:303) > >>> at > >>> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll( > >> ConsumerCoordinator.java:290) > >>> at > >>> org.apache.kafka.clients.consumer.KafkaConsumer. > >> pollOnce(KafkaConsumer.java:1029) > >>> at > >>> org.apache.kafka.clients.consumer.KafkaConsumer.poll( > >> KafkaConsumer.java:995) > >>> at > >>> org.apache.kafka.streams.processor.internals.StreamThread.runLoop( > >> StreamThread.java:592) > >>> at > >>> org.apache.kafka.streams.processor.internals. > >> StreamThread.run(StreamThread.java:361) > >>> 17/12/04 18:34:58 WARN StreamThread: Could not create task 0_8. Will > >> retry. > >>> org.apache.kafka.streams.errors.LockException: task [0_8] Failed to > lock > >>> the state directory for task 0_8 > >>> > >>> > >>> kafka-consumer-groups --new-consumer --bootstrap-server <...> > --describe > >>> --group GeoTest > >>> Note: This will only show information about consumers that use the Java > >>> consumer API (non-ZooKeeper-based consumers). > >>> > >>> Warning: Consumer group 'GeoTest' is rebalancing. > >>> > >>> I keep seeing the above lock exception continuously and app is not > making > >>> any progress. Any idea why it is stuck? > >>> I read a few suggestions that required me to manually delete state > >>> directory. I'd like to avoid that. > >>> > >>> Thanks, > >>> Srikanth > >>> > >> > >> > > > >