[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16031499#comment-16031499
 ] 

ASF GitHub Bot commented on KAFKA-5154:
---------------------------------------

GitHub user dguy opened a pull request:

    https://github.com/apache/kafka/pull/3181

    KAFKA-5154: Consumer fetches from revoked partitions when SyncGroup fails with disconnection [WIP]

    Scenario is as follows:
    1. Consumer subscribes to topic t1 and begins consuming.
    2. Heartbeat fails because the group is rebalancing.
    3. ConsumerCoordinator.onJoinPrepare is called.
       3.1 onPartitionsRevoked is called.
    4. Consumer becomes the group leader.
    5. Consumer sends a SyncGroup request.
    6. The SyncGroup request is cancelled due to disconnection.
    7. A fetch request is sent for partitions that were previously revoked.
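
The scenario above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `RevokedFetchSketch`, `ToyConsumer`, the `guardFetches` flag, and the partition names `t1-0` and `t1-1` are hypothetical and are not Kafka classes or the change made in the pull request. It shows why a stale assignment stays fetchable after the sync step is cancelled, and one possible guard that skips fetching while a new assignment is pending.

```java
import java.util.Arrays;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the rebalance sequence. NOT Kafka's actual implementation. */
public class RevokedFetchSketch {

    /** Hypothetical stand-in for a consumer's assignment state. */
    static final class ToyConsumer {
        private final Set<String> assigned = new LinkedHashSet<>();
        private final boolean guardFetches;          // assumption: optional guard, for comparison
        private boolean awaitingAssignment = false;  // true between revocation and a completed join

        ToyConsumer(boolean guardFetches, String... partitions) {
            this.guardFetches = guardFetches;
            this.assigned.addAll(Arrays.asList(partitions));
        }

        /** Steps 3 and 3.1: partitions are revoked, but the old assignment is still stored. */
        void onJoinPrepare() {
            awaitingAssignment = true;
        }

        /** Step 6: the sync request is cancelled, so no new assignment ever arrives. */
        void onSyncGroupDisconnected() {
            // Nothing is updated here: the stored assignment is now stale.
        }

        /** A successful rebalance replaces the assignment and clears the pending flag. */
        void onJoinComplete(String... newPartitions) {
            assigned.clear();
            assigned.addAll(Arrays.asList(newPartitions));
            awaitingAssignment = false;
        }

        /** Step 7: the partitions the next fetch request would cover. */
        List<String> fetchablePartitions() {
            if (guardFetches && awaitingAssignment) {
                return Collections.emptyList();
            }
            return new ArrayList<>(assigned);
        }
    }

    public static void main(String[] args) {
        ToyConsumer unguarded = new ToyConsumer(false, "t1-0", "t1-1");
        unguarded.onJoinPrepare();
        unguarded.onSyncGroupDisconnected();
        // The revoked partitions are still fetched.
        System.out.println("unguarded fetches: " + unguarded.fetchablePartitions());

        ToyConsumer guarded = new ToyConsumer(true, "t1-0", "t1-1");
        guarded.onJoinPrepare();
        guarded.onSyncGroupDisconnected();
        // Nothing is fetched until a new assignment arrives.
        System.out.println("guarded fetches: " + guarded.fetchablePartitions());
    }
}
```

In the unguarded case the toy consumer still reports both revoked partitions as fetchable, which mirrors step 7. Whether a guard of this kind is the right fix for the real consumer is exactly what the pull request opens for discussion.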

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dguy/kafka kafka-5154

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/3181.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3181
    
----
commit f84737d30acdb8b49e8e0d4e3da8720083e88354
Author: Damian Guy <damian....@gmail.com>
Date:   2017-05-31T16:40:50Z

    just a test for discussion

----


> Kafka Streams throws NPE during rebalance
> -----------------------------------------
>
>                 Key: KAFKA-5154
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5154
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.10.2.0
>            Reporter: Lukas Gemela
>            Assignee: Damian Guy
>         Attachments: 5154_problem.log, clio_afa596e9b809.gz, clio_reduced.gz, 
> clio.txt.gz
>
>
> Please see the attached log. Kafka Streams throws a NullPointerException
> during rebalance, which is caught by our custom exception handler.
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T17:46:37,254 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T18:04:25,993 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T18:04:29,401 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T18:05:10,877 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-05-01T00:01:55,707 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-05-01T00:01:59,027 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-05-01T00:01:59,031 ERROR StreamThread-1 
> org.apache.kafka.streams.processor.internals.StreamThread.run() @376 - 
> stream-thread [StreamThread-1] Streams application error during processing:
>  java.lang.NullPointerException
>       at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:619)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>       at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  [kafka-streams-0.10.2.0.jar!/:?]
> 2017-05-01T00:02:00,038 INFO  StreamThread-1 
> org.apache.kafka.clients.producer.KafkaProducer.close() @689 - Closing the 
> Kafka producer with timeoutMillis = 9223372036854775807 ms.
> 2017-05-01T00:02:00,949 WARN  StreamThread-1 
> org.apache.kafka.streams.processor.internals.StreamThread.setState() @160 - 
> Unexpected state transition from PARTITIONS_REVOKED to NOT_RUNNING
> 2017-05-01T00:02:00,951 ERROR StreamThread-1 
> com.williamhill.trading.platform.hades.kafka.KafkaStreamManager.uncaughtException()
>  @104 - UncaughtException in thread StreamThread-1, stopping kafka streams
>  java.lang.NullPointerException
>       at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:619)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>       at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
> 2017-05-01T00:02:01,076 WARN  kafka-streams-close-thread 
> org.apache.kafka.streams.processor.internals.StreamThread.setState() @160 - 
> Unexpected state transition from NOT_RUNNING to PENDING_SHUTDOWN
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
