Yes, let's describe that behavior in the FAQ.
Thanks,
Jun
On Tue, Oct 1, 2013 at 8:35 AM, Joe Stein wrote:
> agreed, let's hold off until after 0.8
>
> I will update the JIRA ticket I created with your feedback and options; we
> can discuss it there and then deal with changes in 0.8.1 or 0.9 or such.
agreed, let's hold off until after 0.8
I will update the JIRA ticket I created with your feedback and options; we
can discuss it there and then deal with changes in 0.8.1 or 0.9 or such.
I will update the FAQ (should have time tomorrow unless someone else gets
to it first); I think we should have it.
This proposal still doesn't address the following fundamental issue: The
random partitioner cannot select a random and AVAILABLE partition.
So, we have the following two choices.
1. Stick with the current partitioner api.
Then, we have to pick one way to do random partitioning (when key is null).
How about making UUID.randomUUID.toString() the default in KeyedMessage
instead of null if not supplied:
def this(topic: String, message: V) = this(topic, UUID.randomUUID.toString(), message)
and if you want the random refresh behavior, then pass in "*" on the
KeyedMessage construction, which we can check for in the partitioner.
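Roughly, that change would look like the sketch below; this is only a
sketch (note the unchecked cast it would need, and that the real 0.8
KeyedMessage keeps null as the default key):

import java.util.UUID

case class KeyedMessage[K, V](topic: String, key: K, message: V) {
  // proposed default: substitute a random UUID string when no key is supplied
  // (unchecked cast; only sensible when K is String)
  def this(topic: String, message: V) =
    this(topic, UUID.randomUUID.toString().asInstanceOf[K], message)
}

// new KeyedMessage[String, String]("my-topic", "payload")        // gets a fresh UUID key per message
// new KeyedMessage[String, String]("my-topic", "*", "payload")   // opts in to the sticky/random-refresh behavior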
The main issue is that if we do that, when key is null, we can only select
a random partition, but not a random and available partition, without
changing the partitioner api. Being able to do the latter is important in
my opinion. For example, a user may choose the replication factor of a
topic to be 1; when a broker is down, the partitions on it have no leader,
and being able to pick an available partition means those sends still succeed.
I think Joe's suggesting that we can remove the checking logic for
key == null in DefaultEventHandler, and do that in the partitioner.
One thing about this idea is that any customized partitioner then also has
to handle the key == null case.
Guozhang
On Fri, Sep 27, 2013 at 9:12 PM, Jun Rao wrote:
> We have the following code in DefaultEventHandler:
We have the following code in DefaultEventHandler:
val partition =
  if(key == null) {
    // If the key is null, we don't really need a partitioner
    // So we look up in the send partition cache for the topic to decide the target partition
    val id = sendPartitionPerTopicCache.get(topic)
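The branch goes on to reuse the cached choice if there is one, and otherwise
picks a random partition among those that currently have a leader and caches
it per topic (the cache is cleared when topic metadata is refreshed). A
paraphrased, self-contained sketch of that behavior, with made-up object and
method names rather than the verbatim 0.8 code:

import scala.collection.mutable
import scala.util.Random

object StickyRandomChoice {
  // per-topic sticky choice, analogous to sendPartitionPerTopicCache above
  private val sendPartitionPerTopicCache = mutable.HashMap[String, Int]()

  // availablePartitions: ids of the partitions that currently have a leader
  def getPartition(topic: String, availablePartitions: Seq[Int]): Int =
    sendPartitionPerTopicCache.getOrElseUpdate(topic, {
      if (availablePartitions.isEmpty)
        throw new IllegalStateException("No leader for any partition in topic " + topic)
      availablePartitions(Random.nextInt(availablePartitions.size))
    })

  // the real handler clears its cache when it refreshes topic metadata
  // (every topic.metadata.refresh.interval.ms)
  def onMetadataRefresh(): Unit = sendPartitionPerTopicCache.clear()
}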
hmmm, yeah, I don't want to do that ... if we don't have to.
What if the DefaultPartitioner code looked like this instead =8^)
private class DefaultPartitioner[T](props: VerifiableProperties = null) extends Partitioner[T] {
  def partition(key: T, numPartitions: Int): Int = {
    if (key == null)
      scala.util.Random.nextInt(numPartitions)   // assumed continuation: no key, pick any partition at random
    else
      Utils.abs(key.hashCode) % numPartitions    // keyed messages hash exactly as before
  }
}
However, currently, if key is null, the partitioner is not even called. Do
you want to change DefaultEventHandler too?
This also doesn't allow the partitioner to select a random and available
partition, which in my opinion is more important than making partitions
perfectly evenly balanced.
Thanks
What I was proposing was twofold:
1) revert the DefaultPartitioner class
then
2) create a new partitioner that folks could use (like at LinkedIn you
would use this partitioner instead) in ProducerConfig
private class RandomRefreshTimPartitioner[T](props: VerifiableProperties = null) extends Partitioner[T] {
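  // a sketch of one possible body (an assumed sketch, not Joe's actual code): keep
  // the old "pick one partition and stick with it for a while" behavior inside the
  // partitioner, re-rolling on a time interval since partition(key, numPartitions)
  // never sees topic metadata or partition availability
  private val random = new java.util.Random
  private val refreshMs = 10 * 60 * 1000L   // assumed knob, not a real kafka config
  @volatile private var sticky = -1
  @volatile private var lastRollMs = 0L

  def partition(key: T, numPartitions: Int): Int = {
    val now = System.currentTimeMillis
    if (sticky < 0 || now - lastRollMs > refreshMs) {
      sticky = random.nextInt(numPartitions)   // re-roll the sticky random choice
      lastRollMs = now
    }
    if (key == null) sticky % numPartitions
    else Utils.abs(key.hashCode) % numPartitions   // keyed messages hash as in DefaultPartitioner (kafka.utils.Utils.abs)
  }
}
A partitioner like this can only re-roll blindly on a timer; it has no view of
which partitions currently have a leader, which is the limitation Jun raises
next.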
Joe,
Not sure I fully understand your proposal. Do you want to put the random
partitioning selection logic (for messages without a key) in the
partitioner without changing the partitioner api? That's difficult. The
issue is that in the current partitioner api, we don't know which
partitions are available.
Jun, can we hold this extra change over for 0.8.1 and just go with
reverting to where we were before for the default, with a new partitioner
for meta refresh, and support both?
I am not sure I entirely understand why someone would need the extra
functionality you are talking about, which sounds cool though.
It's reasonable to make the behavior of random producers customizable
through a pluggable partitioner. So, if one doesn't care about # of socket
connections, one can choose to select a random partition on every send. If
one does have many producers, one can choose to periodically select a
random partition.
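Plugging a different policy in is just a producer config change in 0.8; for
example (RandomRefreshTimPartitioner here is the hypothetical class from
earlier in the thread, not something that ships with Kafka):

import java.util.Properties
import kafka.producer.{Producer, ProducerConfig}

val props = new Properties()
props.put("metadata.broker.list", "broker1:9092,broker2:9092")
props.put("serializer.class", "kafka.serializer.StringEncoder")
// partitioner.class is the standard 0.8 producer knob for swapping partitioners
props.put("partitioner.class", "kafka.producer.RandomRefreshTimPartitioner")
val producer = new Producer[String, String](new ProducerConfig(props))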
Sounds good, I will create a JIRA and upload a patch.
/***
Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop
***/
On Sep 17, 2013, at 1:
I would be in favor of that. I agree this is better than 0.7.
-Jay
On Tue, Sep 17, 2013 at 10:19 AM, Joel Koshy wrote:
> I agree that minimizing the number of producer connections (while
> being a good thing) is really required in very large production
> deployments, and the net-effect of the existing change is
> counter-intuitive to users who expect an immediate even distribution
> across _all_ partitions of the topic.
I agree that minimizing the number of producer connections (while
being a good thing) is really required in very large production
deployments, and the net-effect of the existing change is
counter-intuitive to users who expect an immediate even distribution
across _all_ partitions of the topic.
How
Let me ask another question which I think is more objective. Let's say 100
random, smart infrastructure specialists try Kafka; of these 100, how many
do you believe will:
1. Say that this behavior is what they expected to happen?
2. Be happy with this behavior?
I am not being facetious; I am genuinely curious.
I just took a look at this change. I agree with Joe, not to put too fine a
point on it, but this is a confusing hack.
Jun, I don't think wanting to minimize the number of TCP connections is
going to be a very common need for people with less than 10k producers. I
also don't think people are going
Joe,
Thanks for bringing this up. I want to clarify this a bit.
1. Currently, the producer side logic is that if the partitioning key is
not provided (i.e., it is null), the partitioner won't be called. We did
that because we want to select a random and "available" partition to send
messages so that sends can still succeed when some partitions are
temporarily unavailable.
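In code terms, these are the sends that take that path (a usage sketch;
producer here is a kafka.producer.Producer[String, String]):

// both skip the partitioner entirely: the producer itself picks a random
// *available* partition and sticks with it until the next metadata refresh
producer.send(new KeyedMessage[String, String]("my-topic", "payload"))          // key defaults to null
producer.send(new KeyedMessage[String, String]("my-topic", null, "payload"))    // explicit null key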
How about creating a new class called RandomRefreshPartioner, copying the
DefaultPartitioner code to it, and then reverting the DefaultPartitioner code? I
appreciate this is a one-time burden for folks using the existing 0.8-beta1
bumping into KAFKA-1017 in production having to switch to the
RandomRefreshPartioner.
> Thanks for bringing this up - it is definitely an important point to
> discuss. The underlying issue of KAFKA-1017 was uncovered to some degree by
> the fact that in our deployment we did not significantly increase the total
> number of partitions over 0.7 - i.e., in 0.7 we had, say, four partitions
First, let me apologize for not realizing/noticing this until today. One
reason I left my last company was not being paid to work on Kafka nor being
able to afford any time for a while to work on it. Now in my new gig (just
wrapped up my first week, woo hoo) while I am still not "paid to work on
Kafka"