Hi Claude,

Are you sure that this level of abstraction fits your application? Maybe you could describe your use case a bit.
Since you use one consumer, I'm assuming that all topics have the same schema. Then, for me, hundreds of similarly structured topics with X partitions each sounds counter-intuitive. Couldn't you just put everything into the same topic with X*100 partitions and use a composite key? (Rough sketch at the end of this mail.)

On Thu, Mar 4, 2021 at 9:42 AM Piotr Nowojski <pnowoj...@apache.org> wrote:

> Hi,
>
> Sorry, I don't know. I've heard that this kind of pattern is discouraged
> by Confluent. At least it used to be.
>
> Maybe someone else from the community will be able to help from their
> experience, however keep in mind that under the hood Flink is simply
> using the KafkaConsumer and KafkaProducer provided by the Apache Kafka
> project, only in a distributed fashion. You might have better luck trying
> to find an answer on the Kafka user mailing list, or searching for how to
> tune Kafka to work with hundreds of topics or how to solve timeout
> problems.
>
> Unless this is a well-known and common issue, you will need to dig out
> more information on your own. Maybe for starters try investigating
> whether the problem is caused by overloaded Kafka clients running in
> Flink, or whether the problem is in overloaded Kafka brokers. Start with
> standard Java things like:
> - CPU overloaded
> - memory pressure/GC pauses
> - machines swapping
> - blocking disk IO (iostat)
> - a single thread overloaded
> - network issues?
>
> Once you identify the source of the problem, it will be easier to find a
> solution. But again, most likely you will find your answer quicker asking
> on the Kafka mailing lists.
>
> Best,
> Piotrek
>
> pon., 1 mar 2021 o 18:01 Claude M <claudemur...@gmail.com> napisał(a):
>
>> Hello,
>>
>> I'm trying to run an experiment w/ two Flink jobs:
>>
>> - A producer producing messages to hundreds of topics
>> - A consumer consuming the messages from all the topics
>>
>> After the job runs for a few minutes, it fails w/ the following error:
>>
>> Caused by: org.apache.kafka.common.errors.TimeoutException: Topic
>> <topic-name> not present in metadata after 60000 ms
>>
>> If I run the job w/ only a few topics, it works. I have tried setting the
>> following properties in the job but still encounter the problem:
>>
>> properties.setProperty("retries", "20");
>> properties.setProperty("request.timeout.ms", "300000");
>> properties.setProperty("metadata.fetch.timeout.ms", "300000");
>>
>> Any ideas about this?
>>
>> Thanks
>>
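To make the composite-key idea a bit more concrete, here is a rough, untested sketch of what the producer side could look like with the universal Flink Kafka connector (Flink 1.12-style FlinkKafkaProducer). The topic name "events", the Event type and the key layout are placeholders for whatever your job actually uses:

import java.nio.charset.StandardCharsets;
import java.util.Properties;

import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SingleTopicCompositeKeySink {

    /** Hypothetical event type standing in for whatever the job actually produces. */
    public static class Event {
        public String source;   // previously: the topic name
        public String id;       // previously: the record key within that topic
        public byte[] payload;
    }

    /**
     * Builds "source|id" as the record key, so Kafka's default partitioner
     * spreads records across the partitions of the single topic.
     */
    public static class EventSerializationSchema implements KafkaSerializationSchema<Event> {
        @Override
        public ProducerRecord<byte[], byte[]> serialize(Event event, Long timestamp) {
            byte[] key = (event.source + "|" + event.id).getBytes(StandardCharsets.UTF_8);
            return new ProducerRecord<>("events", key, event.payload);
        }
    }

    public static FlinkKafkaProducer<Event> buildSink(Properties producerProps) {
        return new FlinkKafkaProducer<>(
                "events",                               // default topic, pre-created with enough partitions
                new EventSerializationSchema(),
                producerProps,
                FlinkKafkaProducer.Semantic.AT_LEAST_ONCE);
    }
}

On the consumer side you would then read just that one topic and, if needed, keyBy the composite key to get back the per-"topic" grouping.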