For what is worth, I also agree that this is better. It is awkward to implement and it is not done by any other request and that is why I didn't push for it, but happy that we are going in that direction. It's worth noting that even with this change, it seems to me that looking at the logs may still be needed since the error code would be `InvalidRequest` and we don't have a mechanism to pass a string along with the error (the alternative is to add a bunch of error codes).
Ismael On Mon, Jun 20, 2016 at 5:57 AM, Grant Henke <ghe...@cloudera.com> wrote: > Apologies for the delay in response here. > > It will take a bit of tracking inside the request object to track this > error and then handle it in KafkaApis, but it is possible. I am happy to > take that preferred approach. I will update the wiki & patch to handle this > scenario and re-initiate the vote tomorrow. > > Thanks, > Grant > > On Sun, Jun 19, 2016 at 8:59 PM, Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > > I'm on the same page as Jun & Dana wrt disconnecting. Closing a > connection > > should really be a last resort because we can no longer trust correct > > behavior in this session. In this case, we detect a bad request, but > > there's no reason to believe it will affect subsequent requests. There > are > > dependencies to be sure, and if the client doesn't check errors, they may > > try to then write to topics that don't exist or something along those > > lines, but those requests can also be failed without killing the > underlying > > TCP connection. > > > > -Ewen > > > > On Fri, Jun 17, 2016 at 1:46 PM, Jun Rao <j...@confluent.io> wrote: > > > > > Grant, > > > > > > I think Dana has a valid point. Currently, we throw an > > > InvalidRequestException and close the connection only when the broker > > can't > > > deserialize the bytes into a request. In this case, the deserialization > > is > > > fine. It just that there are some additional constraints that can't be > > > specified at the protocol level. We can potentially just remember the > > > topics that violated those constraints in the request and handle them > > > accordingly with the right error code w/o disconnecting. > > > > > > Thanks, > > > > > > Jun > > > > > > On Fri, Jun 17, 2016 at 8:40 AM, Dana Powers <dana.pow...@gmail.com> > > > wrote: > > > > > > > I'm unconvinced (crazy, right?). Comments below: > > > > > > > > On Fri, Jun 17, 2016 at 7:27 AM, Grant Henke <ghe...@cloudera.com> > > > wrote: > > > > > Hi Dana, > > > > > > > > > > You mentioned one of the reasons I error and disconnect. Because I > > > can't > > > > > return an error for every request so the cardinality between > request > > > and > > > > > response would be different. Beyond that though, I am handling this > > > > > protocol rule/parsing error the same way all other messages do. > > > > > > > > But you can return an error for every topic, and isn't that the level > > > > of error required here? > > > > > > > > > CreateTopic Response (Version: 0) => [topic_error_codes] > > > > > topic_error_codes => topic error_code > > > > > topic => STRING > > > > > error_code => INT16 > > > > > > > > If I submit duplicate requests for a topic, it's an error isolated to > > > > that topic. If I mess up the partition / replication / etc semantics > > > > for a topic, that's an error isolated to that topic. Is there a > > > > cardinality problem at this level? > > > > > > > > > > > > > > > > > > Parsing is handled in the RequestChannel and any exception that > > occurs > > > > > during this phase is caught, converted into an > > InvalidRequestException > > > > and > > > > > results in a disconnect: > > > > > > > > > > > > > > > https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/network/RequestChannel.scala#L92-L95 > > > > > > > > > > Since this is an error that could only occur (and would always > occur) > > > due > > > > > to incorrect client implementations, and not because of any cluster > > > state > > > > > or unusual situation, I felt this behavior was okay and made sense. > > For > > > > > client developers the broker logging should make it obvious what > the > > > > issue > > > > > is. My patch also clearly documents the protocol rules in the > > Protocol > > > > > definition. > > > > > > > > Documentation is great and definitely a must. But requiring client > > > > developers to dig through server logs is not ideal. Client developers > > > > don't always have direct access to those logs. And even if they do, > > > > the brokers may have other traffic, which makes it difficult to track > > > > down the exact point in the logs where the error occurred. > > > > > > > > As discussed above, I don't think you need to or should model this as > > > > a request-level parsing error. It may be easier for the current > broker > > > > implementation to do that and just crash the connection, but I don't > > > > think it makes that much sense from a raw api perspective. > > > > > > > > > In the future having a response header with an error code (and > > > optimally > > > > > error message) for every response would help solve this problem > (for > > > all > > > > > message types). > > > > > > > > That will definitely help solve the more general invalid request > error > > > > problem. But I think given the current state of error handling / > > > > feedback from brokers on request-level errors, you should treat > > > > connection crash as a last resort. I think there is a good > opportunity > > > > to avoid it in this case, and I think the api would be better if done > > > > that way. > > > > > > > > -Dana > > > > > > > > > On Fri, Jun 17, 2016 at 12:04 AM, Dana Powers < > dana.pow...@gmail.com > > > > > > > wrote: > > > > > > > > > >> Why disconnect the client on a InvalidRequestException? The 2 > errors > > > > >> you are catching are both topic-level: (1) multiple requests for > the > > > > >> same topic, and (2) ReplicaAssignment and num_partitions / > > > > >> replication_factor both set. Wouldn't it be better to just error > the > > > > >> offending create_topic_request, not the entire connection? The > > > > >> CreateTopicsResponse returns a map of topics to error codes. You > > could > > > > >> just return the topic that caused the error and an > > > > >> InvalidRequestException error code. > > > > >> > > > > >> -Dana > > > > >> > > > > >> On Thu, Jun 16, 2016 at 8:37 AM, Grant Henke <ghe...@cloudera.com > > > > > > wrote: > > > > >> > I have updated the wiki and pull request based on the feedback. > If > > > > there > > > > >> > are no objections I will start a vote at the end of the day. > > > > >> > > > > > >> > Details for this implementation can be read here: > > > > >> > > > > > >> > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-CreateTopicRequest > > > > >> > > > > > >> > The updated pull request can be found here (feel free to > review): > > > > >> > https://github.com/apache/kafka/pull/1489 > > > > >> > > > > > >> > Below is the exact content for clarity: > > > > >> > > > > > >> >> Create Topics Request (KAFKA-2945 > > > > >> >> <https://issues.apache.org/jira/browse/KAFKA-2945>) > > > > >> >> > > > > >> >> > > > > >> >> > > > > >> >> CreateTopics Request (Version: 0) => [create_topic_requests] > > > timeout > > > > >> >> create_topic_requests => topic num_partitions > > replication_factor > > > > >> [replica_assignment] [configs] > > > > >> >> topic => STRING > > > > >> >> num_partitions => INT32 > > > > >> >> replication_factor => INT16 > > > > >> >> replica_assignment => partition_id [replicas] > > > > >> >> partition_id => INT32 > > > > >> >> replicas => INT32 > > > > >> >> configs => config_key config_value > > > > >> >> config_key => STRING > > > > >> >> config_value => STRING > > > > >> >> timeout => INT32 > > > > >> >> > > > > >> >> CreateTopicsRequest is a batch request to initiate topic > creation > > > > with > > > > >> >> either predefined or automatic replica assignment and > optionally > > > > topic > > > > >> >> configuration. > > > > >> >> > > > > >> >> Request semantics: > > > > >> >> > > > > >> >> 1. Must be sent to the controller broker > > > > >> >> 2. If there are multiple instructions for the same topic in > > one > > > > >> >> request an InvalidRequestException will be logged on the > > broker > > > > and > > > > >> the > > > > >> >> client will be disconnected. > > > > >> >> - This is because the list of topics is modeled server > side > > > as > > > > a > > > > >> >> map with TopicName as the key > > > > >> >> 3. The principal must be authorized to the "Create" > Operation > > on > > > > the > > > > >> >> "Cluster" resource to create topics. > > > > >> >> - Unauthorized requests will receive a > > > > >> ClusterAuthorizationException > > > > >> >> 4. > > > > >> >> > > > > >> >> Only one from ReplicaAssignment or (num_partitions + > > > > >> replication_factor > > > > >> >> ), can be defined in one instruction. > > > > >> >> - If both parameters are specified an > InvalidRequestException > > > > will be > > > > >> >> logged on the broker and the client will be disconnected. > > > > >> >> - In the case ReplicaAssignment is defined number of > > > partitions > > > > >> and > > > > >> >> replicas will be calculated from the supplied > > > > replica_assignment. > > > > >> >> - In the case of defined (num_partitions + > > > replication_factor) > > > > >> >> replica assignment will be automatically generated by the > > > > server. > > > > >> >> - One or the other must be defined. The existing broker > > side > > > > auto > > > > >> >> create defaults will not be used > > > > >> >> (default.replication.factor, num.partitions). The client > > > > >> implementation can > > > > >> >> have defaults for these options when generating the > > messages. > > > > >> >> - The first replica in [replicas] is assumed to be the > > > > preferred > > > > >> >> leader. This matches current behavior elsewhere. > > > > >> >> 5. Setting a timeout > 0 will allow the request to block > until > > > the > > > > >> >> topic metadata is "complete" on the controller node. > > > > >> >> - Complete means the local topic metadata cache been > > > completely > > > > >> >> populated and all partitions have leaders > > > > >> >> - The topic metadata is updated when the controller > > sends > > > > out > > > > >> >> update metadata requests to the brokers > > > > >> >> - If a timeout error occurs, the topic could still be > > created > > > > >> >> successfully at a later time. Its up to the client to > query > > > for > > > > >> the state > > > > >> >> at that point. > > > > >> >> 6. Setting a timeout <= 0 will validate arguments and > trigger > > > the > > > > >> >> create topics and return immediately. > > > > >> >> - This is essentially the fully asynchronous mode we have > > in > > > > the > > > > >> >> Zookeeper tools today. > > > > >> >> - The error code in the response will either contain an > > > > argument > > > > >> >> validation exception or a timeout exception. If you > > receive a > > > > >> timeout > > > > >> >> exception, because you asked for 0 timeout, you can > assume > > > the > > > > >> message was > > > > >> >> valid and the topic creation was triggered. > > > > >> >> 7. The request is not transactional. > > > > >> >> 1. If an error occurs on one topic, the others could > still > > be > > > > >> >> created. > > > > >> >> 2. Errors are reported independently. > > > > >> >> > > > > >> >> QA: > > > > >> >> > > > > >> >> - Why is CreateTopicsRequest a batch request? > > > > >> >> - Scenarios where tools or admins want to create many > > topics > > > > >> should > > > > >> >> be able to with fewer requests > > > > >> >> - Example: MirrorMaker may want to create the topics > > > downstream > > > > >> >> - What happens if some topics error immediately? Will it > > > > >> >> return immediately? > > > > >> >> - The request will block until all topics have either > been > > > > >> created, > > > > >> >> errors, or the timeout has been hit > > > > >> >> - There is no "short circuiting" where 1 error stops the > > > other > > > > >> >> topics from being created > > > > >> >> - Why implement "partial blocking" instead of fully async or > > > fully > > > > >> >> consistent? > > > > >> >> - See Cluster Consistent Blocking > > > > >> >> < > > > > >> > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-cluster-consistent-blocking > > > > >> > > > > > >> >> below > > > > >> >> - Why require the request to go to the controller? > > > > >> >> - The controller is responsible for the cluster metadata > > and > > > > >> >> its propagation > > > > >> >> - See Request Forwarding > > > > >> >> < > > > > >> > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-request > > > > >> > > > > > >> >> below > > > > >> >> > > > > >> >> Create Topics Response > > > > >> >> > > > > >> >> > > > > >> >> > > > > >> >> CreateTopics Response (Version: 0) => [topic_error_codes] > > > > >> >> topic_error_codes => topic error_code > > > > >> >> topic => STRING > > > > >> >> error_code => INT16 > > > > >> >> > > > > >> >> CreateTopicsResponse contains a map between topic and topic > > > creation > > > > >> >> result error code (see New Protocol Errors > > > > >> >> < > > > > >> > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-NewProtocolErrors > > > > >> > > > > > >> >> ). > > > > >> >> > > > > >> >> > > > > >> > Thank you, > > > > >> > Grant > > > > >> > > > > > >> > > > > > >> > On Wed, Jun 15, 2016 at 4:11 PM, Grant Henke < > ghe...@cloudera.com > > > > > > > >> wrote: > > > > >> > > > > > >> >> Turns out we already have an InvalidRequestException: > > > > >> >> > > > > >> > > > > > > > > > > https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/network/RequestChannel.scala#L75-L98 > > > > >> >> > > > > >> >> We just don't map it in Errors.java so it results in an UNKNOWN > > > error > > > > >> when > > > > >> >> sent back to the client. > > > > >> >> > > > > >> >> I will migrate the InvalidRequestException to the client > package, > > > > add it > > > > >> >> to Errors and use that to signify any protocol parsing/rule > > errors. > > > > >> >> > > > > >> >> > > > > >> >> > > > > >> >> On Wed, Jun 15, 2016 at 2:55 PM, Dana Powers < > > > dana.pow...@gmail.com> > > > > >> >> wrote: > > > > >> >> > > > > >> >>> On Wed, Jun 15, 2016 at 12:19 PM, Ismael Juma < > > ism...@juma.me.uk> > > > > >> wrote: > > > > >> >>> > Hi Grant, > > > > >> >>> > > > > > >> >>> > Comments below. > > > > >> >>> > > > > > >> >>> > On Wed, Jun 15, 2016 at 6:52 PM, Grant Henke < > > > ghe...@cloudera.com > > > > > > > > > >> >>> wrote: > > > > >> >>> >> > > > > >> >>> >> The one thing I want to avoid is to many super specific > error > > > > >> codes. I > > > > >> >>> am > > > > >> >>> >> not sure how much of a problem it really is but in the case > > of > > > > wire > > > > >> >>> >> protocol errors like multiple instances of the same topic, > do > > > you > > > > >> have > > > > >> >>> any > > > > >> >>> >> thoughts on the error? Should we make a generic > > InvalidRequest > > > > error > > > > >> >>> and > > > > >> >>> >> log the detailed message on the broker for client authors > to > > > > debug? > > > > >> >>> >> > > > > >> >>> > > > > > >> >>> > That is a good question. It would be good to get input from > > > client > > > > >> >>> > developers like Dana on this. > > > > >> >>> > > > > >> >>> I think generic error codes are fine if the wire protocol > > > > requirements > > > > >> >>> are documented [i.e., no duplicate topics and > > partitions/replicas > > > > are > > > > >> >>> either/or not both]. If I get a broker error at the protocol > > level > > > > >> >>> that I don't understand, the first place I look is the > protocol > > > > docs. > > > > >> >>> It may cause a few more emails to the mailing lists asking for > > > > >> >>> clarification, but I think those will be easier to triage than > > > > >> >>> confused emails like "I said create topic with 10 partitions, > > but > > > I > > > > >> >>> only got 5???" > > > > >> >>> > > > > >> >>> -Dana > > > > >> >>> > > > > >> >> > > > > >> >> > > > > >> >> > > > > >> >> -- > > > > >> >> Grant Henke > > > > >> >> Software Engineer | Cloudera > > > > >> >> gr...@cloudera.com | twitter.com/gchenke | > > > > linkedin.com/in/granthenke > > > > >> >> > > > > >> > > > > > >> > > > > > >> > > > > > >> > -- > > > > >> > Grant Henke > > > > >> > Software Engineer | Cloudera > > > > >> > gr...@cloudera.com | twitter.com/gchenke | > > > linkedin.com/in/granthenke > > > > >> > > > > > > > > > > > > > > > > > > > > -- > > > > > Grant Henke > > > > > Software Engineer | Cloudera > > > > > gr...@cloudera.com | twitter.com/gchenke | > > linkedin.com/in/granthenke > > > > > > > > > > > > > > > -- > > Thanks, > > Ewen > > > > > > -- > Grant Henke > Software Engineer | Cloudera > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke >