Hi Jay,

Good point. One of the main benefits of the create topic api is removing the server side auto create. The work is noted in the Follow Up Changes <https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-FollowUpChangesfollow-up-changes> section of the KIP-4 wiki and tracked by KAFKA-2410 <https://issues.apache.org/jira/browse/KAFKA-2410>.

You are pretty much spot on with the plan. But there are some things that would need to be discussed during that change that likely deserve their own KIP. I will lay out some of my thoughts below. However, do you mind if we defer the rest of the discussion until after the create topics patch is done? (I am happy to drive that KIP as soon as this patch is in.)

High level plan:

1. We set auto.create.topics.enable=false by default on the server and deprecate it
2. We add a few new producer configs
   1. auto.create.topics.enable (should document that privileges are required if using ACLs)
   2. auto.create.topics.partitions
   3. auto.create.topics.replicas
3. The producer tracks the location of the controller which it now gets in the metadata response and includes this in its internal cluster metadata representation
4. The producer uses the create topic api to make a request to the controller when it gets an error about a non-existent topic

I mocked up a quick implementation based off of my create topics PR to vet the basics. But there are still some open questions and things to test.

The strawman implementation does the following:

- Updates the metadata with the controller information
- Sets "topicsToBeCreated" in the metadata in NetworkClient.handleResponse
  - Topics are added when receiving an UNKNOWN_TOPIC_OR_PARTITION error
- Sends CreateTopicRequests to the controller in NetworkClient.maybeUpdate
  - This effectively makes create topic part of the metadata updates

Some of the things that need to be thought through include:

- Should auto.create.topics.replicas scale down to the number of known live servers at create time?
- Should the consumer be able to auto create topics too?
- What happens when both client and broker side auto create are enabled?
  - I think the broker wins in this case since the metadata request happens first
- What happens when the user is unauthorized to create topics?
  - Either throw exception or wait for metadata update timeout

Thanks,
Grant

On Sun, Jun 12, 2016 at 2:08 PM, Jay Kreps <j...@confluent.io> wrote:

> Hey Grant,
>
> Great to see this progressing. That API looks good to me. Thanks for the
> thoughtful write-up.
>
> One thing that would be great to add to this KIP would be a quick sketch of
> how the create topic api can be used to get rid of the thing where we
> create topics when you ask for their metadata. This doesn't need to be in
> any great depth, just enough to make sure this new api will work for that
> use case.
>
> I think the plan is something like
>
>    1. We set auto.create.topics.enable=false by default on the server and
>    deprecate it
>    2. We add a new producer config auto.create.topics.enable
>    3. The producer tracks the location of the controller which it now gets
>    in the metadata response and includes this in its internal cluster
>    metadata representation
>    4. The producer uses the create topic api to make a request to the
>    controller when it gets an error about a non-existent topic
>
> I think the semantics of this at first will be the same as they are now--if
> retries are disabled the first produce request will potentially fail if the
> topic creation hasn't quite completed. This isn't great but it isn't worse
> than the current state and I think would be fixed either by future
> improvements to make the requests fully blocking or by idempotence for the
> producer (which would mean retries were always enabled).
>
> One thing I'm not sure of is whether the admin java api, which would
> maintain its own connection pool etc, would be used internally by the
> producer (and potentially consumer) or if they would just reuse the request
> objects.
>
> Just trying to write this down to sanity check that it will work.
>
> -Jay
>
> On Fri, Jun 10, 2016 at 9:21 AM, Grant Henke <ghe...@cloudera.com> wrote:
>
> > Now that Kafka 0.10 has been released I would like to start work on the
> > new protocol messages and client implementation for KIP-4. In order to
> > break up the discussion and feedback I would like to continue breaking up
> > the content in to smaller pieces.
> >
> > This discussion thread is for the CreateTopic request/response and server
> > side implementation. Details for this implementation can be read here:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-CreateTopicRequest
> >
> > I have included the exact content below for clarity:
> >
> > > Create Topic Request (KAFKA-2945
> > > <https://issues.apache.org/jira/browse/KAFKA-2945>)
> > >
> > > CreateTopic Request (Version: 0) => [create_topic_requests] timeout
> > >   create_topic_requests => topic partitions replication_factor [replica_assignment] [configs]
> > >     topic => STRING
> > >     partitions => INT32
> > >     replication_factor => INT32
> > >     replica_assignment => partition_id [replicas]
> > >       partition_id => INT32
> > >       replicas => INT32
> > >     configs => config_key config_value
> > >       config_key => STRING
> > >       config_value => STRING
> > >   timeout => INT32
> > >
> > > CreateTopicRequest is a batch request to initiate topic creation with
> > > either predefined or automatic replica assignment and optionally topic
> > > configuration.
> > >
> > > Request semantics:
> > >
> > >    1. Must be sent to the controller broker
> > >    2. Multiple instructions for the same topic in one request will be
> > >    silently ignored, only the last from the list will be executed.
> > >       - This is because the list of topics is modeled server side as a
> > >       map with TopicName as the key
> > >    3. The principal must be authorized to the "Create" Operation on the
> > >    "Cluster" resource to create topics.
> > >       - Unauthorized requests will receive a ClusterAuthorizationException
> > >    4. Only one of ReplicaAssignment or (Partitions + ReplicationFactor)
> > >    can be defined in one instruction. If both parameters are specified,
> > >    ReplicaAssignment takes precedence.
> > >       - In the case ReplicaAssignment is defined, the number of partitions
> > >       and replicas will be calculated from the supplied ReplicaAssignment.
> > >       - In the case of defined (Partitions + ReplicationFactor), replica
> > >       assignment will be automatically generated by the server.
> > >       - One or the other must be defined. The existing broker side auto
> > >       create defaults will not be used (default.replication.factor,
> > >       num.partitions). The client implementation can have defaults for
> > >       these options when generating the messages.
> > >    5. Setting a timeout > 0 will allow the request to block until the
> > >    topic metadata is "complete" on the controller node.
> > >       - Complete means the topic metadata has been completely populated
> > >       (leaders, replicas, ISRs)
> > >       - If a timeout error occurs, the topic could still be created
> > >       successfully at a later time. It's up to the client to query for
> > >       the state at that point.
> > >    6. The request is not transactional.
> > >       1. If an error occurs on one topic, the other could still be
> > >       created.
> > >       2. Errors are reported independently.
> > >
> > > QA:
> > >
> > >    - Why is CreateTopicRequest a batch request?
> > >       - Scenarios where tools or admins want to create many topics should
> > >       be able to with fewer requests
> > >       - Example: MirrorMaker may want to create the topics downstream
> > >    - What happens if some topics error immediately? Will it return
> > >    immediately?
> > >       - The request will block until all topics have either been created,
> > >       errored, or the timeout has been hit
> > >       - There is no "short circuiting" where 1 error stops the other
> > >       topics from being created
> > >    - Why implement "partial blocking" instead of fully async or fully
> > >    consistent?
> > >       - See Cluster Consistent Blocking
> > >       <https://cwiki.apache.org/#KIP-4-Commandlineandcentralizedadministrativeoperations-clusterconsistentblocking>
> > >       below
> > >    - Why require the request to go to the controller?
> > >       - The controller is responsible for the cluster metadata and
> > >       its propagation
> > >       - See Request Forwarding
> > >       <https://cwiki.apache.org/#KIP-4-Commandlineandcentralizedadministrativeoperations-request>
> > >       below
> > >
> > > Create Topic Response
> > >
> > > CreateTopic Response (Version: 0) => [topic_error_codes]
> > >   topic_error_codes => topic error_code
> > >     topic => STRING
> > >     error_code => INT16
> > >
> > > CreateTopicResponse contains a map between topic and topic creation
> > > result error code (see New Protocol Errors
> > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-NewProtocolErrors>
> > > ).
> >
> > A sample PR is on github (https://github.com/apache/kafka/pull/1489)
> > though it could change drastically based on the feedback here.
> >
> > Thanks,
> > Grant
> >
> > --
> > Grant Henke
> > Software Engineer | Cloudera
> > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>

--
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
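
For anyone skimming the thread, the request semantics quoted above (duplicate topics collapse to the last instruction, ReplicaAssignment takes precedence over Partitions + ReplicationFactor, one of the two must be defined, and errors are reported per topic) can be sketched roughly as below. This is only an illustrative Python model, not the broker code or the sample PR. The names `CreateTopicInstruction`, `resolve`, `handle_create_topics`, and the string error codes are hypothetical, and two further assumptions are made: "not defined" is modeled as `None`, since how it is encoded on the wire is not stated here, and the replication factor is derived from the first partition's replica list.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical error codes for illustration only; the real values live in the
# KIP's "New Protocol Errors" section and are not reproduced here.
NONE = "NONE"
INVALID_REQUEST = "INVALID_REQUEST"


@dataclass
class CreateTopicInstruction:
    """One entry of create_topic_requests (hypothetical in-memory model)."""
    topic: str
    partitions: Optional[int] = None          # None models "not defined" (assumption)
    replication_factor: Optional[int] = None  # None models "not defined" (assumption)
    # partition_id -> list of replica broker ids
    replica_assignment: Dict[int, List[int]] = field(default_factory=dict)
    configs: Dict[str, str] = field(default_factory=dict)


def resolve(instr: CreateTopicInstruction) -> Tuple[int, int, Optional[Dict[int, List[int]]]]:
    """Return (partitions, replication_factor, explicit assignment or None)."""
    if instr.replica_assignment:
        # ReplicaAssignment takes precedence; the counts are calculated from it.
        partitions = len(instr.replica_assignment)
        # Assumption: use the first partition's replica list to derive the factor.
        replication_factor = len(next(iter(instr.replica_assignment.values())))
        return partitions, replication_factor, instr.replica_assignment
    if instr.partitions is not None and instr.replication_factor is not None:
        # No explicit assignment: the server would generate one automatically.
        return instr.partitions, instr.replication_factor, None
    # Broker-side defaults (num.partitions, default.replication.factor) are not used.
    raise ValueError("replica_assignment or partitions + replication_factor is required")


def handle_create_topics(instructions: List[CreateTopicInstruction]) -> Dict[str, str]:
    """Return a topic -> error code map, mirroring the response shape."""
    # The batch is modeled as a map keyed by topic name, so a later
    # instruction for the same topic silently replaces an earlier one.
    by_topic: Dict[str, CreateTopicInstruction] = {}
    for instr in instructions:
        by_topic[instr.topic] = instr

    results: Dict[str, str] = {}
    for topic, instr in by_topic.items():
        try:
            resolve(instr)
            results[topic] = NONE
        except ValueError:
            # Not transactional: one failing topic does not stop the others.
            results[topic] = INVALID_REQUEST
    return results
```

The sketch leaves out the timeout and authorization checks on purpose; it only models how a batch is interpreted before any topic is actually created.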