Hi all,
We have upgraded our Kafka cluster to version 0.10.0, mainly for the Kerberos
security feature, and our Kafka Java client API is working fine.
But when I tried to use a Python API like Python-Kafka against our secure
Kafka cluster, it seems there isn't any Python API that supports a secure Kafka cluster.
Can any
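For reference, the security-related settings our working Java consumer uses look
roughly like this (a minimal sketch; the broker address, topic, and JAAS file path
are placeholders). Whatever Python client we end up with would need to expose the
equivalent of these settings:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SecureConsumerSketch {
    public static void main(String[] args) {
        // The client principal/keytab come from a JAAS file passed to the JVM, e.g.
        //   -Djava.security.auth.login.config=/etc/kafka/kafka_client_jaas.conf
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "secure-test");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("security.protocol", "SASL_PLAINTEXT");   // or SASL_SSL, depending on the listener
        props.put("sasl.kerberos.service.name", "kafka");   // must match the broker's service principal

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("test-topic"));
        // poll() as usual; Kerberos authentication happens when the connection is established
    }
}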
Thank you, Matthias. A great writeup!
Very detailed and definitely gives us "food for thought" and such.
- Dmitry
On Thu, Apr 13, 2017 at 8:05 PM, Matthias J. Sax
wrote:
> Dmitry,
>
> let me take one step back to help you better understand the tradeoffs:
>
> A message will only be delivered mul
Dmitry,
let me take one step back to help you better understand the tradeoffs:
A message will only be delivered multiple times in case of failure --
i.e., if a consumer crashed or timed out. In this case, another consumer
will take over the partitions assigned to the failing consumer and start
con
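As a rough sketch, these are the consumer settings and the commit pattern that
usually matter in this situation (all values are placeholders; tune them to your
actual per-batch processing time):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.*;

public class ManualCommitSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000");  // raise if processing a batch can take longer
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100");      // smaller batches keep time between poll() calls short
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");  // commit manually, right after processing

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    System.out.println(record.value());   // stand-in for the real processing
                }
                consumer.commitSync();   // fewer uncommitted records -> fewer re-deliveries after a rebalance
            }
        }
    }
}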
Thanks, Matthias. Will read the doc you referenced.
The duplicates are on the consumer side. We've been trying to curtail this
by increasing the consumer session timeout. Would that potentially help?
Basically, we're grappling with the causes of this behavior. Why would
messages ever be delivered
Hi,
reading a topic twice -- which is the first requirement you have -- is
not possible (and not necessary, IMHO) with the Streams API, regardless of
a "delayed" read. The reason is that Streams uses a single consumer
group.id internally, and thus Streams can commit only one offset per
topic-partition
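If you really need both a delayed and a non-delayed read of the same topic, one
workaround is to run two separate Streams applications over the same input topic,
each with its own application.id (and therefore its own consumer group and committed
offsets). A minimal sketch, with names made up:

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class ReaderApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        // the second instance of the app would use e.g. "reader-delayed" here
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "reader-no-delay");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");

        KStreamBuilder builder = new KStreamBuilder();
        KStream<String, String> input =
                builder.stream(Serdes.String(), Serdes.String(), "input-topic");
        input.print();   // stand-in for the real processing

        new KafkaStreams(builder, props).start();
    }
}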
I'm building a prototype with Kafka Streams that will be consuming from the
same topic twice, once with no delay, just like any normal consumer, and
once with a 60-minute delay, using the new timestamp-per-message field. It
will also store state coming from other topics that are being read
simultaneously
Hi,
the first question to ask would be whether you get duplicate writes at the
producer or duplicate reads at the consumer...
For exactly-once: it's work in progress, and we aim for the 0.11 release
(which might still be a beta version).
In short, there will be an idempotent producer that will avoid duplicates
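To give a rough idea of the planned producer side (a sketch only; the config name
below is from the current exactly-once proposal and may still change before the
0.11 release):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class IdempotentProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("enable.idempotence", "true");   // planned for 0.11: broker de-duplicates retried sends per partition
        props.put("acks", "all");                  // required when idempotence is enabled

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
        }
    }
}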
Hi Kyle, (cc-ing user list as well)
This could be an interesting scenario. Two things to help us think through it
some more: 1) it seems you attached a figure, but I cannot seem to open it. 2)
what about using the low-level Processor API instead of the DSL as approach 3?
Do you have any thoughts
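For concreteness, "approach 3" would mean wiring the topology by hand with the
Processor API, roughly along these lines (a sketch; the processor below just
forwards records, whereas a real delayed read would buffer them in a state store
and emit them from punctuate()):

import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.processor.TopologyBuilder;

public class ProcessorApiSketch {

    // Placeholder processor: forwards everything; a real one would buffer and delay.
    static class ForwardingProcessor extends AbstractProcessor<byte[], byte[]> {
        @Override
        public void init(ProcessorContext context) {
            super.init(context);
            context.schedule(60_000L);   // ask for punctuate() roughly once per minute (stream time)
        }

        @Override
        public void process(byte[] key, byte[] value) {
            context().forward(key, value);
        }

        @Override
        public void punctuate(long timestamp) {
            // a delayed read would flush records older than the delay here
        }
    }

    public static void main(String[] args) {
        TopologyBuilder builder = new TopologyBuilder();
        builder.addSource("source", "input-topic")
               .addProcessor("delay", ForwardingProcessor::new, "source")
               .addSink("sink", "output-topic", "delay");
        // pass the builder to KafkaStreams together with the usual StreamsConfig properties
    }
}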
+1 (non-binding)
Built the sources, ran all unit and integration tests, and checked the new documentation,
especially with an eye on the Streams library.
Thanks, Gwen
Eno
> On 12 Apr 2017, at 17:25, Gwen Shapira wrote:
>
> Hello Kafka users, developers, client-developers, friends, romans,
> citizens, etc,
>
>
No, internal topics do not need to be manually created.
Eno
> On 13 Apr 2017, at 10:00, Shimi Kiviti wrote:
>
> Is that (manual topic creation) also true for internal topics?
>
> On Thu, 13 Apr 2017 at 19:14 Matthias J. Sax wrote:
>
>> Hi,
>>
>> thanks for reporting this issue. We are aware
Hi,
Background:
I have the following setup:
Apache server >> Apache Kafka producer >> Apache Kafka cluster >> Apache
Storm
In the normal scenario, the front-end boxes run the Apache server and populate
the log files. The requirement is to read every log and send it to the Kafka
cluster.
The Java producer r
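For what it's worth, the producer side of this can be quite small. A minimal
sketch (the file path and topic name are made up, and it reads the file once
rather than tailing it):

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class LogShipperSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("acks", "all");   // wait for full acknowledgement so no log line is silently dropped
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props);
             BufferedReader reader = new BufferedReader(new FileReader("/var/log/apache2/access.log"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                producer.send(new ProducerRecord<>("access-logs", line));   // no key -> round-robin partitioning
            }
        }
    }
}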
We are also catching the exception in the serde and returning null, and then
filtering out the null values downstream so that they are not included.
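Roughly like this (a sketch; MyEvent is a placeholder value type and Jackson is
assumed to be on the classpath):

import java.util.Map;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.common.serialization.Deserializer;

// Swallows bad records and returns null instead of throwing,
// so a single poison message does not kill the stream.
public class SafeEventDeserializer implements Deserializer<MyEvent> {
    private final ObjectMapper mapper = new ObjectMapper();

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) { }

    @Override
    public MyEvent deserialize(String topic, byte[] data) {
        try {
            return data == null ? null : mapper.readValue(data, MyEvent.class);
        } catch (Exception e) {
            return null;   // corrupt record -> null, dropped downstream
        }
    }

    @Override
    public void close() { }
}

// ...and downstream in the topology:
//   stream.filter((key, value) -> value != null)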
Thanks
Sachin
On Thu, Apr 13, 2017 at 9:13 PM, Mike Gould wrote:
> Great to know I've not gone off in the wrong direction
> Thanks
>
> On Thu, 13 Apr 2017 a
Is that (manual topic creation) also true for internal topics?
On Thu, 13 Apr 2017 at 19:14 Matthias J. Sax wrote:
> Hi,
>
> thanks for reporting this issue. We are aware of a bug in 0.10.2 that
> seems to be related: https://issues.apache.org/jira/browse/KAFKA-5037
>
> However, I also want to p
Thanks, Jayesh and Vincent.
It seems rather extreme that one has to implement a cache of already-seen
messages using Redis, memcached, or some such. I would expect Kafka to "do
the right thing". Data loss is a worse problem, especially for mission-critical
applications. So what is the curren
Very enlightening presentation, thanks for sharing!
On Thu, Apr 13, 2017 at 9:07 AM, Thakrar, Jayesh <
jthak...@conversantmedia.com> wrote:
> Hi Dmitri,
>
> This presentation might help you understand and take appropriate actions
> to deal with data duplication (and data loss)
>
> https://www.sli
Hi,
thanks for reporting this issue. We are aware of a bug in 0.10.2 that
seems to be related: https://issues.apache.org/jira/browse/KAFKA-5037
However, I also want to point out that it is highly recommended not to
use auto topic creation for Streams, but to manually create all
input/output topics
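E.g., one command per topic, along these lines (partition and replication counts
are just examples):

bin/kafka-topics.sh --create --zookeeper zk1:2181 \
    --topic my-input-topic --partitions 12 --replication-factor 3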
Hi Dmitri,
This presentation might help you understand and take appropriate actions to
deal with data duplication (and data loss)
https://www.slideshare.net/JayeshThakrar/kafka-68540012
Regards,
Jayesh
On 4/13/17, 10:05 AM, "Vincent Dautremont"
wrote:
One of the cases where you would get
Hi Mike,
Thank you. Could you open a JIRA to capture this specific problem (a copy-paste
would suffice)? Alternatively, we can open it; up to you.
Thanks
Eno
> On 13 Apr 2017, at 08:43, Mike Gould wrote:
>
> Great to know I've not gone off in the wrong direction
> Thanks
>
> On Thu, 13 Apr 20
Hi Diego,
Confluent offers support for Apache Kafka.
https://www.confluent.io/
Cheers,
Roger
On Wed, Apr 12, 2017 at 11:14 AM, Diego Paes Ramalho Pereira <
diego.pere...@b3.com.br> wrote:
> Hello,
>
>
>
> I work for a stock exchange in Brazil, and we are looking for a company
> that can provid
That's correct: if you're mostly dealing with "latest" message consumption,
faster disks will be mostly worthless. You will get some benefit if you
have to rebalance partitions, since the cluster needs to shuffle a lot of
data around for that, but during normal operations, there will be no
benefit
Great to know I've not gone off in the wrong direction
Thanks
On Thu, 13 Apr 2017 at 16:34, Matthias J. Sax wrote:
> Mike,
>
> thanks for your feedback. You are absolutely right that Streams API does
> not have great support for this atm. And it's very valuable that you
> report this (you are no
Mike,
thanks for your feedback. You are absolutely right that the Streams API does
not have great support for this atm. And it's very valuable that you
report this (you are not the first person); it helps us prioritize :)
For now, there is no better solution than the one you described in your
email,
Hi,
Are there any better error-handling options for Kafka Streams in Java?
Any error in the serdes will break the stream. The suggested
implementation is to use the byte[] serde and do the deserialisation in a
map operation. However, this isn't ideal either, as there's no great way to
handle exceptions
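For reference, the workaround described above looks roughly like this (a fragment
of a topology; MyEvent and parse() are placeholders for the real value type and
deserialisation):

// read raw bytes so the serde itself can never throw...
KStream<String, byte[]> raw =
        builder.stream(Serdes.String(), Serdes.ByteArray(), "input-topic");

// ...then deserialise inside the topology, where bad records can be caught and dropped
KStream<String, MyEvent> events = raw
        .mapValues(bytes -> {
            try {
                return parse(bytes);   // placeholder for the real deserialisation
            } catch (Exception e) {
                return null;           // bad record
            }
        })
        .filter((key, value) -> value != null);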
One of the cases where you would get a message more than once is if you get
disconnected / kicked off the consumer group / etc. after failing to commit the
offsets for messages you have already read.
What I do is insert the message into an in-memory Redis cache.
If it fails to insert because o
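The idea, sketched in Java with a plain in-process set standing in for the Redis
insert (process() and the surrounding poll loop are placeholders):

// add() returns false when the key is already present -- the same
// "insert failed, so this is a duplicate" signal as the Redis approach above
Set<String> seen = Collections.newSetFromMap(new ConcurrentHashMap<>());

for (ConsumerRecord<String, String> record : records) {
    // topic-partition-offset uniquely identifies a message, so a re-delivered
    // message after a rebalance maps to the same key
    String messageId = record.topic() + "-" + record.partition() + "-" + record.offset();
    if (seen.add(messageId)) {
        process(record);   // first time this message is seen
    }                      // otherwise: duplicate delivery, skip it
}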
Hi all,
I was wondering if someone could list some of the causes that may lead to
Kafka delivering the same messages more than once.
We've looked around and we see no notable errors, yet intermittently we
see messages being delivered more than once.
The Kafka documentation talks about the below
Thank you very much, Marcos.
My application does real-time processing, so I would say most of the time I
am dealing with the "latest" messages, which emphasizes page caching. In that
case, does it mean there is no additional throughput gained by using 10k
or 15k disks? What about having virtualized cl