Hi Kostas,

I am pasting the log snippets from around the times we see the
fluctuations. Let me know if this helps.

2020-09-22 23:39:19,646 DEBUG org.apache.kafka.clients.NetworkClient
                 - Node 3 disconnected.
2020-09-22 23:39:19,646 DEBUG org.apache.kafka.clients.NetworkClient
                 - Initialize connection to node
be-kafka-dragonpit-broker-4:8017 (id: 4 rack: null) for sending metadata
request
2020-09-22 23:39:19,646 DEBUG org.apache.kafka.clients.NetworkClient
                 - Initiating connection to node
be-kafka-dragonpit-broker-4:8017 (id: 4 rack: null)
2020-09-22 23:39:19,664 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Fetch
READ_UNCOMMITTED at offset 834984310 for partition captchastream-6 returned
fetch data (error=NONE, highWaterMark=834984311, lastStableOffset = -1,
logStartOffset = 834470755, abortedTransactions = null,
recordsSizeInBytes=1516)
2020-09-22 23:39:19,665 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Added
READ_UNCOMMITTED fetch request for partition captchastream-6 at offset
834984311 to node be-kafka-dragonpit-broker-5:8017 (id: 5 rack: null)
2020-09-22 23:39:19,665 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Sending
READ_UNCOMMITTED fetch for partitions [captchastream-6] to broker
be-kafka-dragonpit-broker-5:8017 (id: 5 rack: null)
2020-09-22 23:39:19,665 DEBUG org.apache.kafka.clients.NetworkClient
                 - Sending metadata request (type=MetadataRequest,
topics=captchastream) to node be-kafka-dragonpit-broker-5:8017 (id: 5 rack:
null)
2020-09-22 23:39:19,665 DEBUG org.apache.kafka.common.network.Selector
                 - Created socket with SO_RCVBUF = 65536, SO_SNDBUF =
131072, SO_TIMEOUT = 0 to node 4
2020-09-22 23:39:19,665 DEBUG org.apache.kafka.clients.NetworkClient
                 - Completed connection to node 4. Fetching API versions.
2020-09-22 23:39:19,665 DEBUG org.apache.kafka.clients.NetworkClient
                 - Initiating API versions fetch from node 4.
2020-09-22 23:39:19,666 DEBUG org.apache.kafka.clients.Metadata
                - Updated cluster metadata version 319 to Cluster(id =
4ou4oBz8TU24ipwW8ws1Bw, nodes = [be-kafka-dragonpit-broker-6:8017 (id: 6
rack: null), be-kafka-dragonpit-broker-4:8017 (id: 4 rack: null),
be-kafka-dragonpit-broker-8:8017 (id: 8 rack: null),
be-kafka-dragonpit-broker-3:8017 (id: 3 rack: null),
be-kafka-dragonpit-broker-5:8017 (id: 5 rack: null),
be-kafka-dragonpit-broker-7:8017 (id: 7 rack: null)], partitions =
[Partition(topic = captchastream, partition = 8, leader = 7, replicas =
[7,3], isr = [7,3]), Partition(topic = captchastream, partition = 9, leader
= 8, replicas = [8,4], isr = [4,8]), Partition(topic = captchastream,
partition = 4, leader = 3, replicas = [3,4], isr = [3,4]), Partition(topic
= captchastream, partition = 5, leader = 4, replicas = [4,5], isr = [4,5]),
Partition(topic = captchastream, partition = 6, leader = 5, replicas =
[5,7], isr = [5,7]), Partition(topic = captchastream, partition = 7, leader
= 6, replicas = [6,8], isr = [8,6]), Partition(topic = captchastream,
partition = 0, leader = 5, replicas = [5,6], isr = [5,6]), Partition(topic
= captchastream, partition = 1, leader = 6, replicas = [6,7], isr = [7,6]),
Partition(topic = captchastream, partition = 2, leader = 7, replicas =
[7,8], isr = [7,8]), Partition(topic = captchastream, partition = 3, leader
= 8, replicas = [8,3], isr = [8,3])])
2020-09-22 23:39:19,666 DEBUG org.apache.kafka.clients.NetworkClient
                 - Recorded API versions for node 4: (Produce(0): 0 to 5
[usable: 3], Fetch(1): 0 to 7 [usable: 5], Offsets(2): 0 to 2 [usable: 2],
Metadata(3): 0 to 5 [usable: 4], LeaderAndIsr(4): 0 to 1 [usable: 0],
StopReplica(5): 0 [usable: 0], UpdateMetadata(6): 0 to 4 [usable: 3],
ControlledShutdown(7): 0 to 1 [usable: 1], OffsetCommit(8): 0 to 3 [usable:
3], OffsetFetch(9): 0 to 3 [usable: 3], FindCoordinator(10): 0 to 1
[usable: 1], JoinGroup(11): 0 to 2 [usable: 2], Heartbeat(12): 0 to 1
[usable: 1], LeaveGroup(13): 0 to 1 [usable: 1], SyncGroup(14): 0 to 1
[usable: 1], DescribeGroups(15): 0 to 1 [usable: 1], ListGroups(16): 0 to 1
[usable: 1], SaslHandshake(17): 0 to 1 [usable: 0], ApiVersions(18): 0 to 1
[usable: 1], CreateTopics(19): 0 to 2 [usable: 2], DeleteTopics(20): 0 to 1
[usable: 1], DeleteRecords(21): 0 [usable: 0], InitProducerId(22): 0
[usable: 0], OffsetForLeaderEpoch(23): 0 [usable: 0],
AddPartitionsToTxn(24): 0 [usable: 0], AddOffsetsToTxn(25): 0 [usable: 0],
EndTxn(26): 0 [usable: 0], WriteTxnMarkers(27): 0 [usable: 0],
TxnOffsetCommit(28): 0 [usable: 0], DescribeAcls(29): 0 [usable: 0],
CreateAcls(30): 0 [usable: 0], DeleteAcls(31): 0 [usable: 0],
DescribeConfigs(32): 0 to 1 [usable: 0], AlterConfigs(33): 0 [usable: 0],
UNKNOWN(34): 0, UNKNOWN(35): 0, UNKNOWN(36): 0, UNKNOWN(37): 0,
UNKNOWN(38): 0, UNKNOWN(39): 0, UNKNOWN(40): 0, UNKNOWN(41): 0,
UNKNOWN(42): 0)
2020-09-22 23:39:19,716 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Fetch
READ_UNCOMMITTED at offset 834984311 for partition captchastream-6 returned
fetch data (error=NONE, highWaterMark=834984312, lastStableOffset = -1,
logStartOffset = 834470755, abortedTransactions = null,
recordsSizeInBytes=3479)
2020-09-22 23:39:19,716 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Added
READ_UNCOMMITTED fetch request for partition captchastream-6 at offset
834984312 to node be-kafka-dragonpit-broker-5:8017 (id: 5 rack: null)
2020-09-22 23:39:19,716 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Sending
READ_UNCOMMITTED fetch for partitions [captchastream-6] to broker
be-kafka-dragonpit-broker-5:8017 (id: 5 rack: null)
2020-09-22 23:39:19,815 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Fetch
READ_UNCOMMITTED at offset 834984312 for partition captchastream-6 returned
fetch data (error=NONE, highWaterMark=834984313, lastStableOffset = -1,
logStartOffset = 834470755, abortedTransactions = null,
recordsSizeInBytes=1523)
2020-09-22 23:39:19,815 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Added
READ_UNCOMMITTED fetch request for partition captchastream-6 at offset
834984313 to node be-kafka-dragonpit-broker-5:8017 (id: 5 rack: null)
2020-09-22 23:39:19,815 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Sending
READ_UNCOMMITTED fetch for partitions [captchastream-6] to broker
be-kafka-dragonpit-broker-5:8017 (id: 5 rack: null)
2020-09-22 23:39:20,239 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Fetch
READ_UNCOMMITTED at offset 834984313 for partition captchastream-6 returned
fetch data (error=NONE, highWaterMark=834984314, lastStableOffset = -1,
logStartOffset = 834470755, abortedTransactions = null,
recordsSizeInBytes=1296)
2020-09-22 23:39:20,239 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Added
READ_UNCOMMITTED fetch request for partition captchastream-6 at offset
834984314 to node be-kafka-dragonpit-broker-5:8017 (id: 5 rack: null)
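
One way to check whether the consumer is keeping up in this snippet, as a
minimal sketch. It assumes the log lines are hard-wrapped exactly as pasted
here; `fetch_headroom` and `SAMPLE` are hypothetical names, not part of
Kafka or Flink. It reports highWaterMark minus the fetch offset for each
fetch response.

```python
import re

# Matches one Fetcher response record after line wraps have been collapsed.
FETCH = re.compile(
    r"Fetch READ_UNCOMMITTED at offset (\d+) for partition (\S+) returned "
    r"fetch data \(error=(\w+), highWaterMark=(\d+)"
)


def fetch_headroom(log_text):
    """Return (partition, fetch_offset, high_water_mark - fetch_offset) per fetch response."""
    # Collapse the mail client's hard line wraps so each record is one string.
    flat = " ".join(log_text.split())
    rows = []
    for offset, partition, _error, hwm in FETCH.findall(flat):
        rows.append((partition, int(offset), int(hwm) - int(offset)))
    return rows


# One record copied from the snippet above, wrapped as it was pasted.
SAMPLE = """\
2020-09-22 23:39:19,664 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Fetch
READ_UNCOMMITTED at offset 834984310 for partition captchastream-6 returned
fetch data (error=NONE, highWaterMark=834984311, lastStableOffset = -1,
logStartOffset = 834470755, abortedTransactions = null,
recordsSizeInBytes=1516)
"""

if __name__ == "__main__":
    for partition, offset, headroom in fetch_headroom(SAMPLE):
        print(f"{partition}: fetch at {offset}, highWaterMark - offset = {headroom}")
```

For the four fetch responses above the difference is 1 each time, which
would suggest that partition captchastream-6 had no backlog at that moment.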


2020-09-22 23:48:19,675 DEBUG org.apache.kafka.clients.NetworkClient
                 - Node 4 disconnected.
2020-09-22 23:48:19,675 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Fetch
READ_UNCOMMITTED at offset 834989827 for partition captchastream-6 returned
fetch data (error=NONE, highWaterMark=834989828, lastStableOffset = -1,
logStartOffset = 834470755, abortedTransactions = null,
recordsSizeInBytes=1019)
2020-09-22 23:48:19,675 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Added
READ_UNCOMMITTED fetch request for partition captchastream-6 at offset
834989828 to node be-kafka-dragonpit-broker-5:8017 (id: 5 rack: null)
2020-09-22 23:48:19,675 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Sending
READ_UNCOMMITTED fetch for partitions [captchastream-6] to broker
be-kafka-dragonpit-broker-5:8017 (id: 5 rack: null)
2020-09-22 23:48:19,675 DEBUG org.apache.kafka.clients.NetworkClient
                 - Initialize connection to node
be-kafka-dragonpit-broker-8:8017 (id: 8 rack: null) for sending metadata
request
2020-09-22 23:48:19,675 DEBUG org.apache.kafka.clients.NetworkClient
                 - Initiating connection to node
be-kafka-dragonpit-broker-8:8017 (id: 8 rack: null)
2020-09-22 23:48:19,683 DEBUG org.apache.kafka.common.network.Selector
                 - Created socket with SO_RCVBUF = 65536, SO_SNDBUF =
131072, SO_TIMEOUT = 0 to node 8
2020-09-22 23:48:19,684 DEBUG org.apache.kafka.clients.NetworkClient
                 - Completed connection to node 8. Fetching API versions.
2020-09-22 23:48:19,684 DEBUG org.apache.kafka.clients.NetworkClient
                 - Initiating API versions fetch from node 8.
2020-09-22 23:48:19,684 DEBUG org.apache.kafka.clients.NetworkClient
                 - Sending metadata request (type=MetadataRequest,
topics=captchastream) to node be-kafka-dragonpit-broker-5:8017 (id: 5 rack:
null)
2020-09-22 23:48:19,685 DEBUG org.apache.kafka.clients.NetworkClient
                 - Recorded API versions for node 8: (Produce(0): 0 to 5
[usable: 3], Fetch(1): 0 to 7 [usable: 5], Offsets(2): 0 to 2 [usable: 2],
Metadata(3): 0 to 5 [usable: 4], LeaderAndIsr(4): 0 to 1 [usable: 0],
StopReplica(5): 0 [usable: 0], UpdateMetadata(6): 0 to 4 [usable: 3],
ControlledShutdown(7): 0 to 1 [usable: 1], OffsetCommit(8): 0 to 3 [usable:
3], OffsetFetch(9): 0 to 3 [usable: 3], FindCoordinator(10): 0 to 1
[usable: 1], JoinGroup(11): 0 to 2 [usable: 2], Heartbeat(12): 0 to 1
[usable: 1], LeaveGroup(13): 0 to 1 [usable: 1], SyncGroup(14): 0 to 1
[usable: 1], DescribeGroups(15): 0 to 1 [usable: 1], ListGroups(16): 0 to 1
[usable: 1], SaslHandshake(17): 0 to 1 [usable: 0], ApiVersions(18): 0 to 1
[usable: 1], CreateTopics(19): 0 to 2 [usable: 2], DeleteTopics(20): 0 to 1
[usable: 1], DeleteRecords(21): 0 [usable: 0], InitProducerId(22): 0
[usable: 0], OffsetForLeaderEpoch(23): 0 [usable: 0],
AddPartitionsToTxn(24): 0 [usable: 0], AddOffsetsToTxn(25): 0 [usable: 0],
EndTxn(26): 0 [usable: 0], WriteTxnMarkers(27): 0 [usable: 0],
TxnOffsetCommit(28): 0 [usable: 0], DescribeAcls(29): 0 [usable: 0],
CreateAcls(30): 0 [usable: 0], DeleteAcls(31): 0 [usable: 0],
DescribeConfigs(32): 0 to 1 [usable: 0], AlterConfigs(33): 0 [usable: 0],
UNKNOWN(34): 0, UNKNOWN(35): 0, UNKNOWN(36): 0, UNKNOWN(37): 0,
UNKNOWN(38): 0, UNKNOWN(39): 0, UNKNOWN(40): 0, UNKNOWN(41): 0,
UNKNOWN(42): 0)
2020-09-22 23:48:19,685 DEBUG org.apache.kafka.clients.Metadata
                - Updated cluster metadata version 321 to Cluster(id =
4ou4oBz8TU24ipwW8ws1Bw, nodes = [be-kafka-dragonpit-broker-6:8017 (id: 6
rack: null), be-kafka-dragonpit-broker-4:8017 (id: 4 rack: null),
be-kafka-dragonpit-broker-3:8017 (id: 3 rack: null),
be-kafka-dragonpit-broker-7:8017 (id: 7 rack: null),
be-kafka-dragonpit-broker-5:8017 (id: 5 rack: null),
be-kafka-dragonpit-broker-8:8017 (id: 8 rack: null)], partitions =
[Partition(topic = captchastream, partition = 8, leader = 7, replicas =
[7,3], isr = [7,3]), Partition(topic = captchastream, partition = 9, leader
= 8, replicas = [8,4], isr = [4,8]), Partition(topic = captchastream,
partition = 4, leader = 3, replicas = [3,4], isr = [3,4]), Partition(topic
= captchastream, partition = 5, leader = 4, replicas = [4,5], isr = [4,5]),
Partition(topic = captchastream, partition = 6, leader = 5, replicas =
[5,7], isr = [5,7]), Partition(topic = captchastream, partition = 7, leader
= 6, replicas = [6,8], isr = [8,6]), Partition(topic = captchastream,
partition = 0, leader = 5, replicas = [5,6], isr = [5,6]), Partition(topic
= captchastream, partition = 1, leader = 6, replicas = [6,7], isr = [7,6]),
Partition(topic = captchastream, partition = 2, leader = 7, replicas =
[7,8], isr = [7,8]), Partition(topic = captchastream, partition = 3, leader
= 8, replicas = [8,3], isr = [8,3])])
2020-09-22 23:48:19,809 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Fetch
READ_UNCOMMITTED at offset 834989828 for partition captchastream-6 returned
fetch data (error=NONE, highWaterMark=834989829, lastStableOffset = -1,
logStartOffset = 834470755, abortedTransactions = null,
recordsSizeInBytes=3489)
2020-09-22 23:48:19,809 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Added
READ_UNCOMMITTED fetch request for partition captchastream-6 at offset
834989829 to node be-kafka-dragonpit-broker-5:8017 (id: 5 rack: null)
2020-09-22 23:48:19,809 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Sending
READ_UNCOMMITTED fetch for partitions [captchastream-6] to broker
be-kafka-dragonpit-broker-5:8017 (id: 5 rack: null)
2020-09-22 23:48:19,902 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Fetch
READ_UNCOMMITTED at offset 834989829 for partition captchastream-6 returned
fetch data (error=NONE, highWaterMark=834989830, lastStableOffset = -1,
logStartOffset = 834470755, abortedTransactions = null,
recordsSizeInBytes=1736)
2020-09-22 23:48:19,903 DEBUG
org.apache.kafka.clients.consumer.internals.Fetcher           - Added
READ_UNCOMMITTED fetch request for partition captchastream-6 at offset
834989830 to node be-kafka-dragonpit-broker-5:8017 (id: 5 rack: null)
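
To put a number on the disconnects, here is a minimal sketch that rejoins
the wrapped lines into records and measures the time between
"Node N disconnected" events. It assumes the input contains only log lines
in the format shown above; all function names are hypothetical.

```python
import re
from datetime import datetime

# A log record starts with a timestamp such as "2020-09-22 23:39:19,646".
RECORD_START = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ")
DISCONNECT = re.compile(r"Node (\d+) disconnected")


def parse_records(lines):
    """Join wrapped continuation lines back into (timestamp, message) records."""
    records = []
    for line in lines:
        match = RECORD_START.match(line)
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            records.append([ts, line[match.end():].strip()])
        elif records and line.strip():
            # Continuation of the previous record (mail-wrapped line).
            records[-1][1] += " " + line.strip()
    return [(ts, msg) for ts, msg in records]


def disconnects(lines):
    """Return (timestamp, node_id) for every 'Node N disconnected' record."""
    events = []
    for ts, msg in parse_records(lines):
        found = DISCONNECT.search(msg)
        if found:
            events.append((ts, int(found.group(1))))
    return events


def gaps_in_seconds(events):
    """Seconds elapsed between consecutive disconnect events."""
    return [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]


# The two disconnect records copied from the snippets above.
SAMPLE = """\
2020-09-22 23:39:19,646 DEBUG org.apache.kafka.clients.NetworkClient
                 - Node 3 disconnected.
2020-09-22 23:48:19,675 DEBUG org.apache.kafka.clients.NetworkClient
                 - Node 4 disconnected.
"""

if __name__ == "__main__":
    events = disconnects(SAMPLE.splitlines())
    for ts, node in events:
        print(f"node {node} disconnected at {ts}")
    print("gaps (s):", gaps_in_seconds(events))
```

For the two disconnects in these snippets (node 3 at 23:39:19,646 and node 4
at 23:48:19,675) the gap is about 540 seconds. That is close to the Kafka
client default of 540000 ms for connections.max.idle.ms, so these may be idle
connections being closed rather than failures, but that is only a guess from
two data points.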

On Wed, Sep 23, 2020 at 3:08 PM Kostas Kloudas <kklou...@gmail.com> wrote:

> Hi Ramya,
>
> Unfortunately I cannot see them.
>
> Kostas
>
> On Wed, Sep 23, 2020 at 10:27 AM Ramya Ramamurthy <hair...@gmail.com>
> wrote:
> >
> > Hi Kostas,
> >
> > Attaching the TaskManager logs regarding this issue.
> > I have attached the Kafka-related metrics. I hope you can see them this
> > time.
> >
> > Not sure why we get so many disconnects from Kafka. Maybe because of these
> > interruptions, our processing seems to slow down. At some point the memory
> > usage also increases and the workers almost stagnate, not doing any
> > processing. I have 3 GB of heap committed and have allotted 5 GB of memory
> > to the pods.
> >
> > Thanks for your help.
> >
> > ~Ramya.
> >
> > On Tue, Sep 22, 2020 at 9:18 PM Kostas Kloudas <kklou...@gmail.com>
> wrote:
> >>
> >> Hi Ramya,
> >>
> >> Unfortunately your images are blocked. Could you upload them somewhere
> >> and post the links here?
> >> Also I think that the TaskManager logs may be able to help a bit more.
> >> Could you please provide them here?
> >>
> >> Cheers,
> >> Kostas
> >>
> >> On Tue, Sep 22, 2020 at 8:58 AM Ramya Ramamurthy <hair...@gmail.com>
> wrote:
> >>
> >> > Hi,
> >> >
> >> > We are seeing an issue with Flink in production. We use version 1.7.
> >> > We started seeing sudden lag on Kafka, and the consumers were no longer
> >> > working/accepting messages. After enabling debug logging, we saw the
> >> > errors below:
> >> > [image: image.jpeg]
> >> >
> >> > I am not sure why this occurs every day. When it happens, I can see
> >> > that the remaining workers aren't able to handle the load. Unless I
> >> > restart my jobs, processing does not resume. This way, there is data
> >> > loss as well.
> >> >
> >> > On the graph below, there is a slight dip in consumption before 5:30.
> >> > That is when this incident happens, and it correlates with the logs.
> >> >
> >> > [image: image.jpeg]
> >> >
> >> > Any pointers/suggestions would be appreciated.
> >> >
> >> > Thanks.
> >> >
> >> >
>
