Noting here that the period '.' also causes potentially confusing behavior
when using regex whitelists or blacklists. It can be easily worked around
but users need to be aware of escaping the period.
If I create two topics 'a.c' and 'abc' and start the following consumer,
both topics will be consu
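
To make the pitfall concrete, here is a minimal Python sketch. Kafka itself relies on Java regular expressions, but an unescaped period matches any character in both dialects. The variable names are illustrative only and are not taken from Kafka.

```python
import re

topics = ["a.c", "abc"]

# Unescaped: "." matches any single character, so both topics match.
unescaped = re.compile(r"a.c")
# Escaped: "\." matches only a literal period.
escaped = re.compile(r"a\.c")

matched_unescaped = [t for t in topics if unescaped.fullmatch(t)]
matched_escaped = [t for t in topics if escaped.fullmatch(t)]

print(matched_unescaped)  # both topics
print(matched_escaped)    # only the topic with the literal period
```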
One way to get around this conflict could be to replace . with _ and _ with __
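
A rough sketch of that substitution, assuming both replacements are applied character by character in a single pass (the function name `encode_topic` is hypothetical). It keeps the two example topics from this thread apart, though names with a dot directly next to an underscore can still collide under this reading of the scheme.

```python
def encode_topic(topic):
    # Hypothetical encoding: "." becomes "_" and "_" becomes "__",
    # applied per character so the two rules do not feed into each other.
    mapping = {".": "_", "_": "__"}
    return "".join(mapping.get(ch, ch) for ch in topic)

dotted = encode_topic("kafka.lab.2")       # "kafka_lab_2"
underscored = encode_topic("kafka_lab_2")  # "kafka__lab__2"

# Edge case: a dot adjacent to an underscore yields the same output
# regardless of their order, so the encoding is not fully reversible.
edge_a = encode_topic("a._b")  # "a___b"
edge_b = encode_topic("a_.b")  # "a___b"
```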
On Sat, Jul 11, 2015 at 10:33 AM, Todd Palino wrote:
> I tend to agree with this as a compromise at this point. The reality is that
> this is technical debt that has built up in the project, and it does not go
> away
This did come up in the discussion in KAFKA-1902. It is somewhat
concerning that something very specific - in this case (what I think
is a limitation [1]) in certain metric reporters should drive the
decision on what constitutes a legal topic name in Kafka - especially
when all the characters in qu
Magnus,
Converting dot to _ essentially is our way of escaping in the scope part of
the metric name. The issue is that your options for escaping are limited due
to the constraints in the reporters. For example, the Ganglia reporter
replaces anything other than alphanumerics, -, _ and dot with _ in the
Hi,
since dots seem to be a problem on the metrics side, why not let the
metrics side handle it
by escaping troublesome characters? E.g. "foo.my\.topic.feh"
Let's not push the problem upstream.
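
As an illustration of what metrics-side escaping could look like, here is a small Python sketch. The helper names `escape_component` and `split_metric_name` are hypothetical, and the choice to also escape the backslash itself is an assumption made so that the split is unambiguous.

```python
def escape_component(name):
    # Escape the escape character first, then the separator.
    return name.replace("\\", "\\\\").replace(".", "\\.")

def split_metric_name(metric):
    # Split on unescaped dots and drop the escaping backslashes.
    parts, current, i = [], [], 0
    while i < len(metric):
        ch = metric[i]
        if ch == "\\" and i + 1 < len(metric):
            current.append(metric[i + 1])
            i += 2
        elif ch == ".":
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts

metric = ".".join(["foo", escape_component("my.topic"), "feh"])
components = split_metric_name(metric)  # ["foo", "my.topic", "feh"]
```

Whether a given reporter would accept a backslash in a metric name is a separate question that this sketch does not address.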
Replacing "." with another set of allowed characters "__" seems like a bad
idea since it
is ambiguous:
First, a couple of clarifications on this.
1. Currently, we allow Kafka topic names to contain dots, except that we disallow
topic names that are exactly "." or ".." (which can cause weird problems
when mapping to file directories and ZK paths as Gwen pointed out).
2. When creating the Coda Hale metrics,
I like the "let's warn people of conflicts when creating the topic"
suggestion. IMO, automatic topic creation as currently done is buggy
either way (Send data and hope the topic is ready before retries run
out, potentially failing with the super helpful NO_LEADER error), so I
don't mind leaving it b
Can we provide a tool so folks can "sync back" old topic names to new so
their clusters aren't lopsided in format?
~ Joestein
On Jul 11, 2015 1:33 PM, "Todd Palino" wrote:
> I tend to agree with this as a compromise at this point. The reality is
> that this is technical debt that has built up in th
I tend to agree with this as a compromise at this point. The reality is that
this is technical debt that has built up in the project, and it does not go
away by documenting it, and it will only get worse.
As pointed out, eliminating either character at this point is going to cause
problems for
On Sat, Jul 11, 2015 at 12:54 AM, Ewen Cheslack-Postava
wrote:
> On Fri, Jul 10, 2015 at 4:41 PM, Gwen Shapira wrote:
>
>> Yeah, I have an actual customer who ran into this. Unfortunately,
>> inconsistencies in the way things are named are pretty common - just
>> look at Kafka's many CLI options.
For resolving the metrics conflicts, we can alternatively let Kafka
replace "." with double underscores "__" if that is the primary reason for
topic name restrictions.
Guozhang
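
A quick Python sketch of this double-underscore mapping, with a hypothetical function name. It separates the two topics from the original report, but, as raised elsewhere in the thread, a topic that already contains "__" maps to the same metric name as its dotted counterpart.

```python
def to_metric_name(topic):
    # Assumed mapping: only "." is rewritten, to a double underscore.
    return topic.replace(".", "__")

dotted = to_metric_name("kafka.lab.2")          # "kafka__lab__2"
underscored = to_metric_name("kafka_lab_2")     # "kafka_lab_2"
double_under = to_metric_name("kafka__lab__2")  # "kafka__lab__2"
```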
On Sat, Jul 11, 2015 at 12:54 AM, Ewen Cheslack-Postava
wrote:
> On Fri, Jul 10, 2015 at 4:41 PM, Gwen Shapira
> w
On Fri, Jul 10, 2015 at 4:41 PM, Gwen Shapira wrote:
> Yeah, I have an actual customer who ran into this. Unfortunately,
> inconsistencies in the way things are named are pretty common - just
> look at Kafka's many CLI options.
>
> I don't think that supporting both and pointing at the docs with
Yeah, I have an actual customer who ran into this. Unfortunately,
inconsistencies in the way things are named are pretty common - just
look at Kafka's many CLI options.
I don't think that supporting both and pointing at the docs with "I
told you so" when our metrics break is a good solution.
On F
I figure you'll probably see complaints no matter what change you make.
Gwen, given that you raised this, another important question might be how
many people you see using *both*. I'm guessing this question came up
because you actually saw a conflict? But I'd imagine (or at least hope)
that most or
I find dots more common in my customer base, so I will definitely feel
the pain of removing them.
However, "." is already used in metrics, file names, directories, etc.
- so if we keep the dots, we need to keep code that translates them
and document the translation. Just banning "." seems more nat
I absolutely disagree with #2, Neha. That will break a lot of
infrastructure within LinkedIn. That said, removing "." might break other
people as well, but I think we should have a clearer idea of how much usage
there is on either side.
-Todd
On Fri, Jul 10, 2015 at 2:08 PM, Neha Narkhede wrote
The problem with '.' seems to arise only in the case of metrics. Should Kafka
replace '.' with some special character, not in [a-zA-Z0-9\\._\\-], or some
reserved sequence of characters?
On Fri, Jul 10, 2015 at 2:08 PM, Neha Narkhede wrote:
> "." seems natural for grouping topic names. +1 for 2) going forwar
"." seems natural for grouping topic names. +1 for 2) going forward only
without breaking previously created topics with "_" though that might
require us to patch the code somewhat awkwardly till we phase it out a
couple (purposely left vague to stay out of Ewen's wrath :-)) versions
later.
On Fri
Yes, agree here. While it can be a little confusing, I think it's better to
just disallow the character for all creation steps so you can't create more
"bad" topic names, but not try and enforce it for topics that already
exist. Anyone who is in that situation is already there with regard to
metri
I don't think we should break existing topics. Just disallow new
topics going forward.
Agree that having both is horrible, but we should have a solution that
fails when you run "kafka-topics.sh --create", not when you configure
Ganglia.
Gwen
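
One possible shape for a check that fails at creation time, sketched in Python. It combines this message with the conflict-warning idea raised elsewhere in the thread. The names `metric_scope` and `check_new_topic` are hypothetical, and the dot-to-underscore mapping is the one described in this thread, not code copied from Kafka.

```python
def metric_scope(topic):
    # Assumed metric naming: dots become underscores.
    return topic.replace(".", "_")

def check_new_topic(name, existing_topics):
    # Reject a new topic whose metric name would collide with an
    # existing topic. Existing topics are left untouched.
    scope = metric_scope(name)
    for other in existing_topics:
        if other != name and metric_scope(other) == scope:
            raise ValueError(
                "topic %r collides with %r in metric names" % (name, other)
            )

existing = ["kafka_lab_2"]
check_new_topic("orders.v1", existing)  # no conflict, passes silently
try:
    check_new_topic("kafka.lab.2", existing)
    rejected = False
except ValueError:
    rejected = True
```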
On Fri, Jul 10, 2015 at 1:53 PM, Jay Kreps wrote:
> U
Unfortunately '.' is pretty common too. I agree that it is perverse, but
people seem to do it. Breaking all the topics with '.' in the name seems
like it could be worse than combining metrics for people who have a
'foo_bar' AND 'foo.bar' (and after all, having both is DEEPLY perverse,
no?).
Where
I vote for #1 too.
A special reason Kafka may use '.' in the future is for hierarchical or
namespaced topics.
On Fri, Jul 10, 2015 at 3:32 PM, Todd Palino wrote:
> My selfish point of view is that we do #1, as we use "_" extensively in
> topic names here :) I also happen to think it's the right
My selfish point of view is that we do #1, as we use "_" extensively in
topic names here :) I also happen to think it's the right choice,
specifically because "." has more special meanings, as you noted.
-Todd
On Fri, Jul 10, 2015 at 1:30 PM, Gwen Shapira wrote:
> Unintentional side effect fro
Thanks, Grant. That seems like a bad solution to the problem that John ran
into in that ticket. It's entirely reasonable to have separate validators
for separate things, but it seems like the choice was made to try and mash
it all into a single validator. And it appears that despite the commentary
Unintentional side effect from allowing IP addresses in consumer client IDs :)
So the question is, what do we do now?
1) disallow "."
2) disallow "_"
3) find a reversible way to encode "." and "_" that won't break existing metrics
4) all of the above?
btw. it looks like "." and ".." are currentl
Found it was added here: https://issues.apache.org/jira/browse/KAFKA-697
On Fri, Jul 10, 2015 at 3:18 PM, Todd Palino wrote:
> This was definitely changed at some point after KAFKA-495. The question is
> when and why.
>
> Here's the relevant code from that patch:
>
>
This was definitely changed at some point after KAFKA-495. The question is
when and why.
Here's the relevant code from that patch:
===
--- core/src/main/scala/kafka/utils/Topic.scala (revision 1390178)
+++ core/src/main/scala/kafka/u
kafka.common.Topic shows that a period is currently a valid character, and I
have verified that I can use kafka-topics.sh to create a new topic with a period.
AdminUtils.createOrUpdateTopicPartitionAssignmentPathInZK currently uses
Topic.validate before writing to ZooKeeper.
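
For readers who want to experiment, the rules mentioned in this thread can be approximated in Python as below. This is not the actual Topic.validate implementation. The character class is the one quoted elsewhere in the thread, the rejection of "." and ".." follows the clarification given there, and the real code may enforce further limits not shown here.

```python
import re

# Character class as quoted in the thread: letters, digits, ".", "_", "-".
LEGAL_TOPIC = re.compile(r"[a-zA-Z0-9._-]+")

def validate_topic_name(name):
    # "." and ".." are rejected because they are awkward as
    # directory names and ZooKeeper paths.
    if name in (".", ".."):
        raise ValueError("topic name cannot be '.' or '..'")
    if not LEGAL_TOPIC.fullmatch(name):
        raise ValueError("illegal character in topic name: %r" % name)
    return name

validate_topic_name("kafka.lab.2")  # accepted under these rules
validate_topic_name("kafka_lab_2")  # accepted under these rules
```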
Should period character supp
I had to go look this one up again to make sure -
https://issues.apache.org/jira/browse/KAFKA-495
The only valid characters for topic names are alphanumeric, underscore, and
dash. A period is not supposed to be a valid character to use. If you're
seeing them, then one of two things has happened:
On Fri, Jul 10, 2015 at 11:34 AM, Gwen Shapira wrote:
> Hi Kafka Fans,
>
> If you have one topic named "kafka_lab_2" and the other named
> "kafka.lab.2", the topic level metrics will be named kafka_lab_2 for
> both, effectively making it impossible to monitor them properly.
>
> The reason this hap
Hi Kafka Fans,
If you have one topic named "kafka_lab_2" and the other named
"kafka.lab.2", the topic level metrics will be named kafka_lab_2 for
both, effectively making it impossible to monitor them properly.
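
The collision is easy to reproduce in a few lines of Python, assuming the metric name is derived by replacing dots with underscores as described in this thread (the function name is illustrative only):

```python
def metric_scope(topic):
    # Assumed sanitization: every "." in the topic name becomes "_".
    return topic.replace(".", "_")

first = metric_scope("kafka_lab_2")
second = metric_scope("kafka.lab.2")

print(first, second)  # both topics end up with the same metric name
```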
The reason this happens is that using "." in topic names is pretty
common, especially