Hey guys,
We are running a stream application in our production environment. Since
our latest restart, the stream threads keep cycling between the
PARTITIONS_REVOKED and PARTITIONS_ASSIGNED states.
From our logs:
grep "State transition from " application.log | jq -r '.message' | sort | uniq -c | sort -n -r
40 stream-thread [yyyy-StreamThread-9] State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED
40 stream-thread [yyyy-StreamThread-8] State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED
40 stream-thread [yyyy-StreamThread-7] State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED
40 stream-thread [yyyy-StreamThread-6] State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED
40 stream-thread [yyyy-StreamThread-5] State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED
40 stream-thread [yyyy-StreamThread-4] State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED
40 stream-thread [yyyy-StreamThread-3] State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED
40 stream-thread [yyyy-StreamThread-2] State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED
40 stream-thread [yyyy-StreamThread-1] State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED
40 stream-thread [yyyy-StreamThread-12] State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED
40 stream-thread [yyyy-StreamThread-11] State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED
40 stream-thread [yyyy-StreamThread-10] State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED
39 stream-thread [yyyy-StreamThread-9] State transition from PARTITIONS_ASSIGNED to PARTITIONS_REVOKED
39 stream-thread [yyyy-StreamThread-8] State transition from PARTITIONS_ASSIGNED to PARTITIONS_REVOKED
39 stream-thread [yyyy-StreamThread-7] State transition from PARTITIONS_ASSIGNED to PARTITIONS_REVOKED
39 stream-thread [yyyy-StreamThread-6] State transition from PARTITIONS_ASSIGNED to PARTITIONS_REVOKED
39 stream-thread [yyyy-StreamThread-5] State transition from PARTITIONS_ASSIGNED to PARTITIONS_REVOKED
39 stream-thread [yyyy-StreamThread-4] State transition from PARTITIONS_ASSIGNED to PARTITIONS_REVOKED
39 stream-thread [yyyy-StreamThread-3] State transition from PARTITIONS_ASSIGNED to PARTITIONS_REVOKED
39 stream-thread [yyyy-StreamThread-2] State transition from PARTITIONS_ASSIGNED to PARTITIONS_REVOKED
39 stream-thread [yyyy-StreamThread-1] State transition from PARTITIONS_ASSIGNED to PARTITIONS_REVOKED
39 stream-thread [yyyy-StreamThread-12] State transition from PARTITIONS_ASSIGNED to PARTITIONS_REVOKED
39 stream-thread [yyyy-StreamThread-11] State transition from PARTITIONS_ASSIGNED to PARTITIONS_REVOKED
39 stream-thread [yyyy-StreamThread-10] State transition from PARTITIONS_ASSIGNED to PARTITIONS_REVOKED
As we can see, each stream thread has its partitions revoked and then
assigned again, over and over. We also see offset-reset messages logged
continuously, such as:
Resetting offset for partition xxxx-2 to offset 9166288.
We had deleted the consumer group on the broker before the restart, because
there was considerable lag on the topic and we did not want to process the
stale data. We had assumed that after deleting the group, the application
would start processing from the latest offset, as specified by the
auto.offset.reset policy in our config.
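
To make that assumption concrete, here is a minimal sketch of the kind of
configuration we mean. It is not our actual config: the class name, the
application id and the broker address are placeholders made up for
illustration. Only the property keys are real Kafka settings.

```java
import java.util.Properties;

public class OffsetResetConfigSketch {

    // Builds an illustrative set of properties. All values are placeholders.
    static Properties build() {
        Properties props = new Properties();
        // In Kafka Streams the application.id is also used as the consumer group id.
        props.setProperty("application.id", "my-streams-app");      // hypothetical
        props.setProperty("bootstrap.servers", "localhost:9092");   // hypothetical
        // Consulted only when the group has no committed offset for a partition
        // (or the committed offset is out of range), e.g. after the group was deleted.
        props.setProperty("auto.offset.reset", "latest");
        return props;
    }

    public static void main(String[] args) {
        // Print the properties; iteration order is not guaranteed.
        build().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

In a real application these properties would be passed to the streams
client at startup. The sketch only shows which setting we expected to take
effect once the group's committed offsets were gone.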
When we describe the consumer group on the broker side, the current offset
and lag are shown as - (example below):
TOPIC  PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID
xxx    10         -               129822997       -    yyyy-StreamThread-2-consumer-a-b-c-d
Please help us understand why this can be happening and how to solve this.