Hi,

We have 3 brokers in a cluster. Producer requests are failing for
broker 2, and we frequently get the exception below:

15/09/09 22:09:06 WARN async.DefaultEventHandler: Failed to send producer request with correlation id 1455 to broker 2 with data for partitions [UserEvents,0]
java.net.SocketTimeoutException
      at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
      at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
      at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
      at kafka.utils.Utils$.read(Utils.scala:375)
      at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
      at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
      at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
      at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)

After looking at the request logs on all machines, we found that broker 2
is slow. The top 20 request processing times from each broker are listed
below.

Broker 1                     | Broker 2                     | Broker 3
Producer + Fetcher  Producer | Producer + Fetcher  Producer | Producer + Fetcher  Producer
---------------------------- | ---------------------------- | ----------------------------
               493        77 |               1033        85 |                489        19
               494        91 |               1034        86 |                490        20
               495        94 |               1035       114 |                491        21
               496        96 |               1036       121 |                492        22
               497       104 |               1037       123 |                493        23
               498       111 |               1038       136 |                494        24
               499       112 |               1039       153 |                495        27
               500       153 |               1040       201 |                496        28
               501       167 |               1042       225 |                497        31
               502       184 |               1043       226 |                498        32
               503       248 |               1044       240 |                499        60
               504       249 |               1049       299 |                500        89
               519       254 |               1051       405 |                501        98
               520       284 |               1057       406 |                502       104
               541       395 |               1064       448 |                503       110
               542       443 |               1087       449 |                506       114
               545       470 |               1145       455 |                510       259
               551       551 |               1146       464 |                514       288
               577       577 |               1466       505 |                515       337
               633       633 |               1467       658 |                516       385
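
For anyone who wants to build a comparable top-20 list from their own request logs, here is a minimal sketch. It assumes each request-log line carries a field of the form totalTime:<n>. That pattern is an assumption about the log format and should be checked against the actual lines. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TopRequestTimes {
    // Assumed field format: "totalTime:<digits>" somewhere in each request-log line.
    private static final Pattern TOTAL_TIME = Pattern.compile("totalTime:(\\d+)");

    /** Returns the n largest totalTime values found in the lines, in ascending order. */
    static List<Long> topN(List<String> lines, int n) {
        List<Long> times = new ArrayList<>();
        for (String line : lines) {
            Matcher m = TOTAL_TIME.matcher(line);
            if (m.find()) {
                times.add(Long.parseLong(m.group(1)));
            }
        }
        Collections.sort(times);
        // Keep only the tail of the sorted list, i.e. the n slowest requests.
        int from = Math.max(0, times.size() - n);
        return new ArrayList<>(times.subList(from, times.size()));
    }
}
```

Calling TopRequestTimes.topN(lines, 20) once per broker, on lines filtered by request type, would give one column of a table like the one above.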


What could be the reason that the fetcher thread is taking more time to
process requests?

What do we need to do to get better performance? Are there any properties
we should tweak?

Any suggestions are welcome.


Note: We push data to Kafka from the user thread (Tomcat) and have set the
producer request timeout to 2 sec. We don't want to increase the timeout
beyond 2 sec, because if too many threads hang, the application itself
will hang.
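
To make the setup in this note concrete, here is a rough sketch of matching settings for the old Scala producer (the one that logs through DefaultEventHandler). Only the 2000 ms request timeout comes from the note. The broker list, the serializer, and the retry and backoff values are illustrative assumptions. The worst-case estimate also assumes that every attempt waits out the full timeout and is followed by one backoff, and it ignores metadata refresh time.

```java
import java.util.Properties;

final class ProducerSettingsSketch {
    /** Builds old-producer properties; only request.timeout.ms reflects the note. */
    static Properties build(String brokerList) {
        Properties props = new Properties();
        props.setProperty("metadata.broker.list", brokerList);              // placeholder hosts
        props.setProperty("serializer.class", "kafka.serializer.StringEncoder"); // assumed payload type
        props.setProperty("request.timeout.ms", "2000");                    // from the note: 2 sec
        props.setProperty("message.send.max.retries", "3");                 // assumed value
        props.setProperty("retry.backoff.ms", "100");                       // assumed value
        return props;
    }

    /**
     * Rough estimate of how long one send could block the calling thread,
     * assuming each attempt times out and a backoff follows each failure.
     */
    static long worstCaseBlockMs(Properties props) {
        long timeout = Long.parseLong(props.getProperty("request.timeout.ms"));
        long retries = Long.parseLong(props.getProperty("message.send.max.retries"));
        long backoff = Long.parseLong(props.getProperty("retry.backoff.ms"));
        return (retries + 1) * (timeout + backoff);
    }
}
```

Under these assumed values the estimate is well above the 2 sec timeout, so the retry count may matter as much as the timeout when the concern is Tomcat threads hanging.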


Thanks and Regards,
Madhukar
