Re: New Producer Public API

2014-01-31 Thread David Arthur
On 1/30/14 8:18 PM, Joel Koshy wrote: That's a good point about 1A - does seem that we would need to have some kind of TTL for each topic's metadata. Also, WRT ZK dependency I don't think that decision (for the Java client) affects other clients. i.e., other client implementations can use whate

Re: New Producer Public API

2014-01-30 Thread Jay Kreps
I thought a bit about it and I think the getCluster() thing was overly simplistic: we try to only maintain metadata about the current set of topics the producer cares about, so the cluster might not have the partitions for the topic the user cares about. I think actually what we need is a new

Re: New Producer Public API

2014-01-30 Thread Joel Koshy
+ dev (this thread has become a bit unwieldy) On Thu, Jan 30, 2014 at 5:15 PM, Joel Koshy wrote: > Does it preclude those various implementations? i.e., it could become > a producer config: > default.partitioner.strategy="minimize-connections"/"roundrobin" - and > so on; and implement those par

Re: New Producer Public API

2014-01-30 Thread Joel Koshy
That's a good point about 1A - it does seem that we would need to have some kind of TTL for each topic's metadata. Also, WRT ZK dependency I don't think that decision (for the Java client) affects other clients, i.e., other client implementations can use whatever discovery mechanism they choose. That
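
A minimal sketch of the per-topic TTL idea mentioned above. The class and method names (`TopicMetadataCache`, `needsRefresh`, and so on) are hypothetical and not part of any API proposed in this thread; the caller passes in the current time so the logic stays easy to test.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-topic metadata cache with a time-to-live (TTL).
// Illustrative only; not the producer's actual metadata handling.
class TopicMetadataCache {
    // What we remember about one topic and when we learned it.
    private static final class Entry {
        final int partitionCount;
        final long fetchedAtMs;

        Entry(int partitionCount, long fetchedAtMs) {
            this.partitionCount = partitionCount;
            this.fetchedAtMs = fetchedAtMs;
        }
    }

    private final long ttlMs;
    private final Map<String, Entry> entries = new HashMap<>();

    TopicMetadataCache(long ttlMs) {
        this.ttlMs = ttlMs;
    }

    // Record freshly fetched metadata for a topic.
    void update(String topic, int partitionCount, long nowMs) {
        entries.put(topic, new Entry(partitionCount, nowMs));
    }

    // True when the topic is unknown or its metadata is at least ttlMs old.
    boolean needsRefresh(String topic, long nowMs) {
        Entry e = entries.get(topic);
        return e == null || nowMs - e.fetchedAtMs >= ttlMs;
    }

    // Cached partition count, or -1 if the topic has never been fetched.
    int partitionCount(String topic) {
        Entry e = entries.get(topic);
        return e == null ? -1 : e.partitionCount;
    }
}
```

With something like this, a producer would re-fetch a topic's metadata whenever `needsRefresh` returns true, which is one way newly added partitions could eventually be picked up.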

Re: New Producer Public API

2014-01-30 Thread Jun Rao
With option 1A, if we increase the number of partitions on a topic, how will the producer find out about newly created partitions? Do we expect the producer to periodically call getCluster()? As for ZK dependency, one of the goals of the client rewrite is to reduce dependencies so that one can implement the client in l

Re: New Producer Public API

2014-01-30 Thread Jay Kreps
One downside to the 1A proposal is that without a Partitioner interface we can't really package up and provide common partitioner implementations. Examples of these would be 1. HashPartitioner - The default hash partitioning 2. RoundRobinPartitioner - Just round-robins over partitions 3. ConnectionM
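
A rough sketch of what the first two partitioners named above could look like. The `Partitioner` interface and its `partition(key, numPartitions)` signature are assumptions made for illustration; the thread does not settle on a signature, and the hash used here is simply `Arrays.hashCode`, not whatever hash the producer's default partitioning uses.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical interface; the signature is an assumption, not the proposed API.
interface Partitioner {
    // Returns a partition index in [0, numPartitions).
    int partition(byte[] key, int numPartitions);
}

// Maps a hash of the key bytes onto a partition, so equal keys
// always land on the same partition.
class HashPartitioner implements Partitioner {
    @Override
    public int partition(byte[] key, int numPartitions) {
        // Mask the sign bit so the result is never negative.
        return (Arrays.hashCode(key) & 0x7fffffff) % numPartitions;
    }
}

// Ignores the key and cycles through the partitions in order.
class RoundRobinPartitioner implements Partitioner {
    private final AtomicInteger counter = new AtomicInteger(0);

    @Override
    public int partition(byte[] key, int numPartitions) {
        // Mask the sign bit so the index stays valid after counter overflow.
        return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
    }
}
```

Packaging implementations like these behind a shared interface is the benefit the message describes: users pick one by name instead of re-implementing the logic at every call site.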

Re: New Producer Public API

2014-01-24 Thread Jay Kreps
Clark and all, I thought a little bit about the serialization question. Here are the options I see and the pros and cons I can think of. I'd love to hear people's preferences if you have a strong one. One important consideration is that however the producer works will also need to be how the new

Re: New Producer Public API

2014-01-24 Thread Jay Kreps
Yeah I'll fix that name. Hmm, yeah, I agree that often you want to be able to delay network connectivity until you have started everything up. But at the same time I kind of loathe special init() methods because you always forget to call them and get one round of error every time. I wonder if in those

Re: New Producer Public API

2014-01-24 Thread Roger Hoover
Jay, Thanks for the explanation. I didn't realize that the broker list was for bootstrapping and was not required to be a complete list of all brokers (although I see now that it's clearly stated in the text description of the parameter). Nonetheless, does it still make sense to make the config

Re: New Producer Public API

2014-01-24 Thread Jay Kreps
Roger, These are good questions. 1. The producer since 0.8 is actually zookeeper free, so this is not new to this client; it is true for the current client as well. Our experience was that direct zookeeper connections from zillions of producers weren't a good idea for a number of reasons. Our inten

Re: New Producer Public API

2014-01-24 Thread Roger Hoover
A couple of comments: 1) Why does the config use a broker list instead of discovering the brokers in ZooKeeper? It doesn't match the HighLevelConsumer API. 2) It looks like broker connections are created on demand. I'm wondering if sometimes you might want to flush out config or network connectivi