[ https://issues.apache.org/jira/browse/KAFKA-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15220604#comment-15220604 ]
Jiangjie Qin edited comment on KAFKA-3334 at 3/31/16 8:21 PM:
--------------------------------------------------------------
[~singhashish] I think we are on the same page: we want users to have a clear
idea of where to look if something goes wrong. In terms of documentation, it is
probably extremely difficult to document every scenario a user might see,
because there are so many configuration combinations and each combination might
behave differently. Scenario-based documentation might never be enough :) I was
thinking about the following:
1. The default configurations should just work out of the box if the user does
not change anything.
2. For each configuration, we need to document clearly what it is for and what
its possible impact is.
3. For each exception thrown to the user, the error message itself should say
clearly yet briefly what went wrong. The documentation of the exception can
list the places where it may be thrown (this should match the info in the
error message), the possible causes, and suggested solutions.
Taking this particular case as an example, in the TimeoutException thrown from
producer.send() the user would see
{noformat}
"The producer failed to fetch the metadata for the topic XXX after XXX ms.
Please see the exception documentation for possible cause."
{noformat}
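As a rough sketch of how such a message could be assembled in one place, so
that the wording stays in sync with the exception documentation. The class and
method names below (MetadataTimeoutMessage, format) are hypothetical and are
not existing Kafka code; the sample topic name and wait time are made up too.

```java
/** Hypothetical helper for illustration only; not part of the Kafka client. */
final class MetadataTimeoutMessage {
    private MetadataTimeoutMessage() {}

    /**
     * Builds the error text for a metadata fetch that timed out.
     *
     * @param topic    the topic whose metadata could not be fetched
     * @param waitedMs how long the producer waited, in milliseconds
     */
    static String format(String topic, long waitedMs) {
        return "The producer failed to fetch the metadata for the topic " + topic
                + " after " + waitedMs + " ms."
                + " Please see the exception documentation for possible cause.";
    }
}
```

For example, MetadataTimeoutMessage.format("file.topic", 60000L) fills in the
two XXX placeholders of the message above.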
And the documentation of TimeoutException should have something like
{noformat}
"This exception can be thrown in the following cases:
1. The producer cannot fetch the metadata of a topic. This only happens when
the producer is sending a message to the topic for the first time. It is more
likely to happen if the topic does not yet exist on the brokers. Creating the
new topic on the broker might take some time. The user can retry sending the
message in this case.
2. blah blah blah"
{noformat}
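On the user side, the retry suggested in case 1 could look roughly like the
sketch below. It deliberately does not use the Kafka client:
java.util.concurrent.TimeoutException stands in for the producer's
TimeoutException, and RetryOnTimeout, maxAttempts and backoffMs are
hypothetical names chosen for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

/** Hypothetical retry helper for illustration only; not part of the Kafka client. */
final class RetryOnTimeout {
    private RetryOnTimeout() {}

    /**
     * Runs the action, retrying only when it fails with a TimeoutException.
     * Any other exception is propagated immediately.
     *
     * @param action      the operation to attempt, e.g. a blocking send
     * @param maxAttempts total number of attempts, must be at least 1
     * @param backoffMs   fixed pause between attempts, in milliseconds
     */
    static <T> T call(Callable<T> action, int maxAttempts, long backoffMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        TimeoutException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (TimeoutException e) {
                last = e;
                // Give the broker some time (e.g. to finish creating the topic).
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMs);
                }
            }
        }
        // All attempts timed out: surface the last timeout to the caller.
        throw last;
    }
}
```

With a real producer, the action would wrap the send and the blocking get on
the returned future; how many attempts and how long a pause make sense depends
on the application.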
I feel this is a more intuitive way for users to find out what went wrong,
because at the end of the day the first thing a user sees is the exception. If
the exception itself does not provide a clear pointer, users do not know where
to start. For example, if a user sees TimeoutException, what are they supposed
to search for or read?
So my point is that we should provide a crystal clear message in the exception
itself, through both the error message and the documentation.
I agree that it might also be useful to document the details of how
KafkaProducer sends a message. But for users who really care about the
internal details, reading the code is probably the best way.
> First message on new topic not actually being sent, no exception thrown
> -----------------------------------------------------------------------
>
> Key: KAFKA-3334
> URL: https://issues.apache.org/jira/browse/KAFKA-3334
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.9.0.0
> Environment: Linux, Java
> Reporter: Aleksandar Stojadinovic
> Assignee: Ashish K Singh
> Fix For: 0.10.1.0
>
>
> Although I've seen this issue pop up around the internet in a few forms, I'm
> not sure it has been properly fixed yet.
> When publishing to a new topic with auto-create enabled, the Java client
> (0.9.0) shows this WARN message in the log, and the message is evidently not
> sent:
> org.apache.kafka.clients.NetworkClient - Error while fetching metadata with
> correlation id 0 : {file.topic=LEADER_NOT_AVAILABLE}
> In the meantime I see a message in the console that a log for the partition
> has been created. Subsequent messages go through normally, but the first one
> is never sent. No exception is ever thrown, either by calling get on the
> future or with the async usage, as if everything were fine.
> I notice that when I leave my application blocked on the get call in the
> debugger, the message may be processed, but with significant delay. This
> is consistent with another issue I found for the Python client. Also, if I
> call partitionsFor beforehand, the topic is created and the message is sent.
> But it seems silly to call it every time just to mitigate this issue.
> {code}
> Future<RecordMetadata> recordMetadataFuture =
>         producer.send(new ProducerRecord<>(topic, key, file));
> RecordMetadata recordMetadata =
>         recordMetadataFuture.get(30, TimeUnit.SECONDS);
> {code}
> I hope I'm clear enough.
> Related similar (but not same) issues:
> https://issues.apache.org/jira/browse/KAFKA-1124
> https://github.com/dpkp/kafka-python/issues/150
> http://stackoverflow.com/questions/35187933/how-to-resolve-leader-not-available-kafka-error-when-trying-to-consume
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)