I will take the rebound of this subject to post a question if that's ok. What is the best solution between start X instances with num_stream_thread=1 to one instance with X threads, knowing that I have only 2 dedicated servers at my disposal ?
The best should be to have more hosts but currently we only have hosts and 2 instances : one on each machine with 2 threads per instances. Thanks for advices Best Regards On Wed, Dec 6, 2017 at 12:18 PM Srikanth <srikanth...@gmail.com> wrote: > Thanks Matthias. Please find my response inline. > > On Wed, Dec 6, 2017 at 12:34 AM, Matthias J. Sax <matth...@confluent.io> > wrote: > > > Hard to say. > > > > However, deleting state directories will not have any negative impact as > > you don't use stores. Thus, why do you not want to do this? > > <Sri> We are running this in a prod-like setup. I don't have SSH access > to > > the box. > > Its not ideal to request deleting these every now and then. It doesn't > > sound prod ready. > > > > > > Another workaround you can do, it to start four applications with 1 > > thread each -- this would isolate the instances further and avoid the > > lock issue (you would need to run on different host, or on the same > > host, but with different state directory for the different instances.) > > <Sri> App has a large in-mem cache. I'd like more threads to use this > > cache to keep memory/cpu usage in balance. > > > > > > In general, I would recommend to upgrade you applications -- you don't > > need to upgrade the brokers for this. 1.0 versions has many additional > > bug fixes. This should resolve the issues you are facing. > > <Sri> I haven't really tried bidirectional compatibility mode. Is there any > > caveat I need to be aware of for 1.0.x client and 0.10.2.1 brokers? > > I probably should try this. > > > > > Otherwise, it's hard to say without logs. > > <Sri> Unfortunately, that's all I had in the logs. I know it isn't > > helpful. I've kept default logging to WARN except for few packages. > > I assume any logging that indicates why app isn't making progress would be > > in WARN. I'll bump the level up for few days. > > > > > Hope this helps. > > > > > > -Matthias > > > > > > On 12/5/17 9:35 AM, Srikanth wrote: > > > Hello, > > > > > > We noticed that a kafka streams app is stuck in rebalance state with > > below > > > error. > > > Two instance of the app were running fine until a rebalace was > > > triggered(possibly due to network issue). > > > Both app instance are running(no app restart) > > > App itself doesn't create/use state store. NUM_STREAM_THREADS_CONFIG=2 > > > > > > I did see several tickets with similar errors that are marked as fixed. > > I'm > > > using version 0.10.2.1. > > > > > > 17/12/04 18:34:57 WARN StreamThread: Could not create task 0_2. Will > > retry. > > > org.apache.kafka.streams.errors.LockException: task [0_2] Failed to > lock > > > the state directory for task 0_2 > > > at > > > org.apache.kafka.streams.processor.internals. > > ProcessorStateManager.<init>(ProcessorStateManager.java:100) > > > at > > > org.apache.kafka.streams.processor.internals.AbstractTask.<init>( > > AbstractTask.java:73) > > > at > > > org.apache.kafka.streams.processor.internals. > > StreamTask.<init>(StreamTask.java:108) > > > at > > > org.apache.kafka.streams.processor.internals. > > StreamThread.createStreamTask(StreamThread.java:864) > > > at > > > org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator. > > createTask(StreamThread.java:1237) > > > at > > > org.apache.kafka.streams.processor.internals.StreamThread$ > > AbstractTaskCreator.retryWithBackoff(StreamThread.java:1210) > > > at > > > org.apache.kafka.streams.processor.internals. > > StreamThread.addStreamTasks(StreamThread.java:967) > > > at > > > org.apache.kafka.streams.processor.internals.StreamThread.access$600( > > StreamThread.java:69) > > > at > > > org.apache.kafka.streams.processor.internals.StreamThread$1. > > onPartitionsAssigned(StreamThread.java:234) > > > at > > > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator. > > onJoinComplete(ConsumerCoordinator.java:259) > > > at > > > org.apache.kafka.clients.consumer.internals.AbstractCoordinator. > > joinGroupIfNeeded(AbstractCoordinator.java:352) > > > at > > > org.apache.kafka.clients.consumer.internals.AbstractCoordinator. > > ensureActiveGroup(AbstractCoordinator.java:303) > > > at > > > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll( > > ConsumerCoordinator.java:290) > > > at > > > org.apache.kafka.clients.consumer.KafkaConsumer. > > pollOnce(KafkaConsumer.java:1029) > > > at > > > org.apache.kafka.clients.consumer.KafkaConsumer.poll( > > KafkaConsumer.java:995) > > > at > > > org.apache.kafka.streams.processor.internals.StreamThread.runLoop( > > StreamThread.java:592) > > > at > > > org.apache.kafka.streams.processor.internals. > > StreamThread.run(StreamThread.java:361) > > > 17/12/04 18:34:58 WARN StreamThread: Could not create task 0_8. Will > > retry. > > > org.apache.kafka.streams.errors.LockException: task [0_8] Failed to > lock > > > the state directory for task 0_8 > > > > > > > > > kafka-consumer-groups --new-consumer --bootstrap-server <...> > --describe > > > --group GeoTest > > > Note: This will only show information about consumers that use the Java > > > consumer API (non-ZooKeeper-based consumers). > > > > > > Warning: Consumer group 'GeoTest' is rebalancing. > > > > > > I keep seeing the above lock exception continuously and app is not > making > > > any progress. Any idea why it is stuck? > > > I read a few suggestions that required me to manually delete state > > > directory. I'd like to avoid that. > > > > > > Thanks, > > > Srikanth > > > > > > > > -- Saïd BOURAS Consultant Big Data Mobile: 0662988731 Zenika Paris 10 rue de Milan 75009 Paris Standard : +33(0)1 45 26 19 15 <+33(0)145261915> - Fax : +33(0)1 72 70 45 10 <+33(0)172704510>