RE: Stream naming conventions?

Thunder Stumpges Tue, 03 Mar 2015 06:58:13 -0800

I'm not sure who the question was addressed to, but since Gwen's suggestion 
was only a guideline and not bound to any restrictions, I'll assume you meant 
me :)

We have a concept of a "topic suffix property": some property in the data 
whose value can change dynamically. The full topic name then becomes 
"<avro_class>-<topic_suffix>". By agreement the dash is never used in a topic 
suffix, so we can split on just the last dash to get back to the class name. 
You could pick any delimiter not used in class names or suffixes.
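
A minimal sketch of this naming scheme, in Python for illustration rather than 
the .NET code described in this message. The function names and the example 
class name are hypothetical; only the "<avro_class>-<topic_suffix>" format and 
the split-on-last-dash rule come from the description above.

```python
from typing import Tuple

DELIMITER = "-"  # by agreement, never appears in a topic suffix


def build_topic_name(avro_class: str, topic_suffix: str) -> str:
    """Join class name and suffix as '<avro_class>-<topic_suffix>'."""
    if DELIMITER in topic_suffix:
        raise ValueError("topic suffix must not contain the delimiter")
    return f"{avro_class}{DELIMITER}{topic_suffix}"


def split_topic_name(topic: str) -> Tuple[str, str]:
    """Split on the LAST delimiter to recover (avro_class, topic_suffix)."""
    avro_class, sep, topic_suffix = topic.rpartition(DELIMITER)
    if not sep:
        raise ValueError("topic name has no suffix delimiter")
    return avro_class, topic_suffix


# Hypothetical class name, used only for illustration.
topic = build_topic_name("Example.Core.SearchEvent", "cleaned")
avro_class, suffix = split_topic_name(topic)
```

Splitting from the right is what keeps the class name recoverable as long as 
the suffix itself never contains the delimiter.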

The topic suffix is then where we put things like processing stage (incoming, 
cleaned, duplicate, etc.) as well as any other orthogonal distinction that 
needs to be in a different topic.

We use .NET, so I'm not sure of the Java terminology, but we have property 
attributes to declare a property as the "topic suffix property" (and also the 
"message key property"), and we use "property getters" in a partial class to 
compute these dynamically if necessary.

A "message registry" then uses reflection to get the topic name and message key 
for any message going out through our producer. It also deals with stripping 
the topic suffix for consumers looking for the Avro type given a topic name.
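
A rough sketch of how such a registry could look, again in Python rather than 
.NET. Everything named here (MessageRegistry, SearchEvent, the class-level 
markers standing in for .NET property attributes) is hypothetical and only 
illustrates the idea; it is not the actual implementation.

```python
class SearchEvent:
    """Hypothetical message type; class-level markers stand in for attributes."""
    AVRO_CLASS = "Example.Core.SearchEvent"
    TOPIC_SUFFIX_PROPERTY = "stage"    # which property supplies the topic suffix
    MESSAGE_KEY_PROPERTY = "user_id"   # which property supplies the message key

    def __init__(self, user_id: str, validated: bool):
        self.user_id = user_id
        self.validated = validated

    @property
    def stage(self) -> str:
        # Property getter computing the suffix dynamically from the data.
        return "cleaned" if self.validated else "incoming"


class MessageRegistry:
    """Looks up topic name and key by inspecting the message's declared markers."""

    def __init__(self):
        self._types = {}

    def register(self, message_type) -> None:
        self._types[message_type.AVRO_CLASS] = message_type

    def topic_for(self, message) -> str:
        cls = type(message)
        suffix = getattr(message, cls.TOPIC_SUFFIX_PROPERTY)
        return f"{cls.AVRO_CLASS}-{suffix}"

    def key_for(self, message):
        return getattr(message, type(message).MESSAGE_KEY_PROPERTY)

    def type_for_topic(self, topic: str):
        # Consumer side: strip the suffix after the last dash, then look up the type.
        avro_class, _, _ = topic.rpartition("-")
        return self._types[avro_class]


registry = MessageRegistry()
registry.register(SearchEvent)
message = SearchEvent("u42", validated=True)
topic = registry.topic_for(message)
key = registry.key_for(message)
```

The producer only needs topic_for and key_for; a consumer calls type_for_topic 
with the topic it is reading to find which type to deserialize into.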

So far this has worked great for us.
Cheers,
Thunder

-----Original Message-----
From: Maciej Jaśkowski [maciej.jaskow...@gmail.com]
Received: Tuesday, 03 Mar 2015, 2:34AM
To: users@kafka.apache.org [users@kafka.apache.org]
CC: Taylor Gautier [tgaut...@yahoo.com]; kafka-us...@incubator.apache.org 
[kafka-us...@incubator.apache.org]
Subject: Re: Stream naming conventions?

This approach sounds nice at first but it would fail if you start
sending the same message but partitioned in different (orthogonal)
ways. How would you go about that?

Maciej

On 25 February 2015 at 05:17, Gwen Shapira <gshap...@cloudera.com> wrote:
> Nice :) I like the idea of tying topic name to avro schemas.
>
> I have experience with other people's data, and until now I mostly
> recommended:
> <app type>.<app name>.<data set name>.<stage of processing>
>
> So we end up with things like:
> etl.onlineshop.searches.validated
>
> Or if I have my own test dataset that I don't want to share:
> users.gshapira.newapp.testing1
>
> Makes it relatively easy to share datasets across the organization, and
> also makes white-listing and black-listing relatively simple because of the
> hierarchy (until we add a real topic hierarchy to Kafka...).
>
> Gwen
>
> On Tue, Feb 24, 2015 at 1:13 PM, Thunder Stumpges <tstump...@ntent.com>
> wrote:
>
>> We have a global namespace hierarchy for topics that is exactly our Avro
>> namespace with Class Name. The template is basically:
>>
>> <root_ns>.Core.<core_data_types_shared_across_company>
>> <root_ns>.<product>.<product_specific_hierarchy>
>>
>> The upside of this for us is that since the topics are named based on the
>> Avro schema namespace and type, we can look up the Avro schema in the Avro
>> Schema Repository using the topic name, and the schema ID coded into the
>> message. Each product then also has the flexibility of defining whatever
>> topics they find useful.
>>
>> Hope this helps,
>> Thunder
>>
>> -----Original Message-----
>> From: Taylor Gautier [mailto:tgaut...@yahoo.com.INVALID]
>> Sent: Tuesday, February 24, 2015 12:11 PM
>> To: kafka-us...@incubator.apache.org
>> Subject: Stream naming conventions?
>>
>> Hello all,
>> Just wondering if those with a good amount of experience using Kafka in
>> production with many streams have converged on any sort of naming
>> convention.  If so would you be willing to share?
>> Thanks in advance,
>> Taylor
>>

--

Twitter: @mjaskowski
