Hi All,
Below are two scenarios that are failing.
1) 3-node ZooKeeper cluster and 3-node Kafka cluster: one of the ZooKeeper
nodes was down for some reason when the Kafka cluster started, and none of
the Kafka instances came up. All were failing with connectivity errors to
the down ZooKeeper instance. Whereas, Kafka contains zo
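One thing worth checking, as an assumption about this setup: each broker's
zookeeper.connect should list every ensemble member so the broker can fail
over to a live instance. A minimal sketch with illustrative hostnames:

  # server.properties (hostnames are illustrative)
  zookeeper.connect=zk1:2181,zk2:2181,zk3:2181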
> As for the second issue you brought up, I agree it is indeed a bug; but
just to clarify it is the CREATION of the first task including restoring
stores that can take longer than MAX_POLL_INTERVAL_MS_CONFIG, not
processing it right
Yes, this is correct. I may have misused the terminology, so let's n
Hi Todd,
I agree that KAFKA-2561 would be good to have for the reasons you state.
Ismael
On Mon, Mar 6, 2017 at 5:17 PM, Todd Palino wrote:
> Thanks for the link, Ismael. I had thought that the most recent kernels
> already implemented this, but I was probably confusing it with BSD. Most of
>
Default broker configurations do not show up in the topic overrides (which is
what you are seeing with the topics tool). It is more accurate to say that
the min.insync.replicas setting in your server.properties file is what will
apply to every topic (regardless of when it is created), if there exists
Hi All,
I need details about min.insync.replicas in server.properties.
I thought that once I add this to server.properties, all subsequently
created topics would have it as the default value.
C:\JAVA_INSTALLATION\kafka\kafka_2.11-0.10.1.1>bin\windows\kafka-topics.bat
--zookeeper localhost:2181/chroot
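To illustrate the distinction drawn above: the broker-wide default lives in
server.properties, while the topics tool only displays explicit per-topic
overrides. A sketch, with the topic name and value as assumptions:

  # server.properties: broker-wide default, applies to every topic
  min.insync.replicas=2

  rem Explicit per-topic override (this is what kafka-topics displays):
  bin\windows\kafka-topics.bat --zookeeper localhost:2181/chroot --alter ^
    --topic my-topic --config min.insync.replicas=2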
Hi,
you can implement custom operators via process(), transform(), and
transformValues().
Also, if you want to have even more control over the topology, you can
use the low-level Processor API directly instead of the DSL.
http://docs.confluent.io/current/streams/developer-guide.html#processor-api
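For example, a minimal sketch of a custom operator written as a
ValueTransformer; the class name and the enrichment logic are hypothetical:

  import org.apache.kafka.streams.kstream.ValueTransformer;
  import org.apache.kafka.streams.processor.ProcessorContext;

  // Hypothetical custom operator: tags each value with the record timestamp.
  class EnrichTransformer implements ValueTransformer<String, String> {
      private ProcessorContext context;

      @Override
      public void init(final ProcessorContext context) {
          // gives access to state stores, record metadata, scheduling, ...
          this.context = context;
      }

      @Override
      public String transform(final String value) {
          return value + "@" + context.timestamp();
      }

      @Override
      public String punctuate(final long timestamp) {
          return null; // no periodic output
      }

      @Override
      public void close() {}
  }

  // usage: stream.transformValues(EnrichTransformer::new);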
-Ma
Dear folks,
Background: I'm learning Kafka Streams and want to use it in my product for
real-time stream processing with data from various sensors.
Question:
1. Can I define my own processing function/API in Kafka Streams besides the
predefined functions like groupBy(), count(), etc.?
2. If I cou
Hey, all. Is there any general guidance around using mirrored topics
in the context of a cluster migration?
We're moving operations from one data center to another, and we want
to stream mirrored data from the old cluster to the new, migrate
consumers, then migrate producers.
Our basic question i
Thanks Le. However, my cluster is kerberized.
From: Le Cyberian
To: Mudit Agarwal
Sent: Monday, 6 March 2017 9:24 PM
Subject: Re: Kafka Kerberos Ansible
Hi Mudit,
I guess it's more related to Ansible than to Kafka itself. However, I will
try to answer.
Since Ansible uses SSH and
Thanks for your input. I now understand the first issue (and the fix).
Still not sure about the second issue.
From my understanding, the deadlock is "caused" by your fix for problem
one. If the thread died, the lock would get released and no deadlock
would occur.
However, because the thread do
Hi Mudit,
I guess it's more related to Ansible than to Kafka itself. However, I will
try to answer.
Since Ansible uses SSH and you already have passwordless SSH between the
Ansible host (which executes playbooks) and the Kafka cluster,
you can simply use the ansible command or shell module to get the list
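As a sketch, where the inventory group, install path, and ZooKeeper address
are assumptions, an ad-hoc invocation from the Ansible host could look like:

  # run the topics tool on a broker via the shell module; names illustrative
  ansible kafka-brokers -m shell \
    -a "/opt/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --list"

Note that listing topics this way talks to ZooKeeper, so Kafka's Kerberos
authentication is not involved unless ZooKeeper itself is secured.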
Hi,
Assume that we are building ZK and Kafka as a messaging system.
We are considering two scenarios.
1. Deploy 3 physical hosts (as opposed to VMs) and create a Kafka cluster
there. On the same physical hosts, create a three-node ZK cluster.
2. Deploy 3 physical hosts (as opposed to VMs) and create
Let me reframe the questions.
How can I list the topics using an Ansible script from an Ansible host that
is outside the Kafka cluster? My Kafka cluster is kerberized. Kafka and the
Ansible host have passwordless SSH.
Thanks, Mudit
From: Le Cyberian
To: users@kafka.apache.org; Mudit Agarwal
Sent: Monda
Hi Han,
Thank you for your response. I understand. It's not possible to have a third
rack/server room at the moment, as the requirement is to have redundancy
between the two. I already tried to get one :-/
Is it possible to have a ZooKeeper ensemble (3 nodes) in one server room and
the same in the other an
Hello Sachin,
Thanks for your findings! Just to add to what Damian said regarding 1): in
KIP-129, where we are introducing exactly-once processing semantics to
Streams, we have also described different categories of error handling for
exactly-once. Commit exceptions due to a rebalance will be handled as
"p
Is there any way you can find a third rack/server room/power supply nearby just
for the 1 extra zookeeper node? You don’t have to put any kafka brokers there,
just a single zookeeper. It’s less likely to have a 3-way split brain because
of a network partition. It’s so much cleaner with 3 avail
Hi Mudit,
What do you mean by accessing the Kafka cluster from outside the Ansible VM?
It needs to listen on an interface that is reachable from the network
outside the VM.
BR,
Lee
On Mon, Mar 6, 2017 at 7:42 PM, Mudit Agarwal
wrote:
> Hi,
> How we can access the kafka cluster from an outside Ansible VM
Thanks, Han and Alexander, for taking the time and for your responses.
I now understand the risks and the possible outcomes of the desired setup.
What would be better, in your opinion, to have failover (active-active)
between both of these server rooms and avoid switching to the clone / 3rd
zookeepe
Hi,
How can we access the Kafka cluster from an outside Ansible VM? Kafka is
kerberized. It is all a Linux environment.
Thanks, Mudit
Yes, that is the parameter I was referring to.
And yes, you can set consumer/producer config via StreamsConfig.
However, it's recommended to use
> props.put(StreamsConfig.consumerPrefix("consumer.parameter.name"), value);
-Matthias
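As a sketch, assuming an application id and bootstrap server, and using the
metadata.max.age.ms consumer parameter discussed in this thread:

  import java.util.Properties;
  import org.apache.kafka.streams.StreamsConfig;

  public class StreamsProps {
      public static Properties build() {
          final Properties props = new Properties();
          props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app"); // illustrative
          props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
          // The prefixed key is forwarded only to the consumers Streams creates:
          props.put(StreamsConfig.consumerPrefix("metadata.max.age.ms"), "30000");
          return props;
      }
  }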
On 3/6/17 6:48 AM, Neil Moore wrote:
> Thanks for the answ
Hi Mickael,
This looks to be the same as KAFKA-4669. In theory, this should never
happen and it's unclear when/how it can happen. Not sure if someone has
investigated it in more detail.
Ismael
On Mon, Mar 6, 2017 at 5:15 PM, Mickael Maison
wrote:
> Hi,
>
> In one of our clusters, some of our c
Hi Todd
Can you please help me with notes or a document on how you achieved
encryption?
I have followed the material available on the official sites but failed, as
I'm no good with TLS.
On Mar 6, 2017 19:55, "Todd Palino" wrote:
> It’s not that Kafka has to decode it, it’s that it has to send it across
Thanks for the link, Ismael. I had thought that the most recent kernels
already implemented this, but I was probably confusing it with BSD. Most of
my systems are stuck in the stone age right now anyway.
It would be nice to get KAFKA-2561 in, either way. First off, if you can
take advantage of it
Hi,
In one of our clusters, some of our clients occasionally see this exception:
java.lang.IllegalStateException: Correlation id for response (4564)
does not match request (4562)
at org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:486)
at org.apache.kafka.clients.NetworkClient.p
Even though OpenSSL is much faster than the Java 8 TLS implementation (I
haven't tested against Java 9, which is much faster than Java 8, but
probably still slower than OpenSSL), all the tests were without zero copy
in the sense that is being discussed here (i.e. sendfile). To benefit from
sendfile
So that's not quite true, Hans. First, regarding the performance hit not
being a big impact: 25% is huge. Nor is it simply to be expected. Part of
the problem is that the Java TLS implementation does not support zero copy.
OpenSSL does, and in fact there’s been a ticket open to allow Kafka to
support
Please find the JIRA https://issues.apache.org/jira/browse/KAFKA-4848
On Mon, Mar 6, 2017 at 5:20 PM, Damian Guy wrote:
> Hi Sachin,
>
> If it is a bug then please file a JIRA for it, too.
>
> Thanks,
> Damian
>
> On Mon, 6 Mar 2017 at 11:23 Sachin Mittal wrote:
>
> > Ok that's great.
> > So y
I agree that this is one cluster, but having one additional ZK node per
site does not help (as far as I understand ZK).
3 out of 6 is also not a majority, so I think you mean 3/5 with a cloned
3rd one. This would mean manually switching in the cloned one to regain a
majority, which can cause issues again.
Thanks for the answers, Matthias.
You mention a metadata refresh interval. I see Kafka producers and consumers
have a property called metadata.max.age.ms, which sounds similar. From the
documentation and the Javadoc for Kafka Streams it is not clear to
me how I can affect KafkaStream
Hi Marco,
I've done some testing and found that there is a performance issue when
caching is enabled. I suspect this might be what you are hitting. It looks
to me like you can work around this by doing something like:
final StateStoreSupplier sessionStore =
    Stores.create("session-store-name")
It's not a single message at a time that is encrypted with TLS; it's the
entire network byte stream. A Kafka broker can't even see the Kafka
protocol tunneled inside TLS unless the TLS is terminated at the broker.
It is true that losing the zero-copy optimization impacts performance somewhat
but it
It’s not that Kafka has to decode it, it’s that it has to send it across
the network. This is specific to enabling TLS support (transport
encryption), and won’t affect any end-to-end encryption you do at the
client level.
The operation in question is called “zero copy”. In order to send a message
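As a sketch of what "zero copy" means at the JVM level (this is a minimal
illustration, not Kafka's actual code): FileChannel.transferTo() maps to
sendfile(2) on Linux, so file bytes move from the page cache to the socket
without entering user space. With TLS, the bytes must instead be read into
user space, encrypted, and written back out.

  import java.io.IOException;
  import java.nio.channels.FileChannel;
  import java.nio.channels.SocketChannel;

  public class ZeroCopySend {
      // Send `count` bytes of the log file starting at `position` straight
      // to the client socket, with no copy through a user-space buffer.
      static long send(final FileChannel log, final SocketChannel socket,
                       final long position, final long count) throws IOException {
          return log.transferTo(position, count, socket);
      }
  }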
In that case it's really one cluster. Make sure to set different rack ids for
each server room so Kafka will ensure that the replicas always span both floors
and you don't lose availability of data if a server room goes down.
You will have to configure one additional zookeeper node in each site w
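The rack id is a per-broker setting; a sketch with illustrative names:

  # server.properties on each broker in server room 1 (ids illustrative)
  broker.rack=room-1
  # ... and on each broker in server room 2:
  broker.rack=room-2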
Hi Hans,
Thank you for your reply.
It's basically two different server rooms on different floors, and they are
connected with fiber, so it's almost like a local connection
between them: no network latency / lag.
If I do Mirror Maker / Replicator then I will not be able to use them at
What do you mean when you say you have "2 sites not datacenters"? You
should be very careful configuring a stretch cluster across multiple sites.
What is the RTT between the two sites? Why do you think that Mirror Maker
(or Confluent Replicator) would not work between the sites and yet you
think a
Hi Guys,
Thank you very much for your reply.
The scenario which I have to implement is that I have 2 sites, not
datacenters, so Mirror Maker would not work here.
There will be 4 nodes in total: 2 in Site A and 2 in Site B. The idea
is to have an Active-Active setup along with fault tolerance, so t
Hi everyone,
I understand one of the reasons why Kafka is performant is its use of
zero-copy.
I often hear that when encryption is enabled, Kafka has to copy the
data into user space to decode the message, so it has a big impact on
performance.
If that is true, I don't get why the message has to
Thanks Damian,
sure, you are right, these details were modified to comply with my
company's rules. However, the main points are unchanged.
The producer of the original data is a "data ingestor" that attaches a few
extra fields and produces a message such as:
row = new JsonObject({
"id" : 1234565
Hi Marco,
Can you try setting StreamsConfig.CACHE_MAX_BYTES_BUFFER_CONFIG to 0 and
see if that resolves the issue?
Thanks,
Damian
On Mon, 6 Mar 2017 at 10:59 Damian Guy wrote:
> Hi Marco,
>
> Your config etc look ok.
>
> 1. It is pretty hard to tell what is going on from just your code below,
Hi Sachin,
If it is a bug then please file a JIRA for it, too.
Thanks,
Damian
On Mon, 6 Mar 2017 at 11:23 Sachin Mittal wrote:
> Ok that's great.
> So you have already fixed that issue.
>
> I have modified my PR to remove that change (which was done keeping
> 0.10.2.0 in mind).
>
> However the
Ok that's great.
So you have already fixed that issue.
I have modified my PR to remove that change (which was done keeping
0.10.2.0 in mind).
However the other issue is still valid.
Please review that change. https://github.com/apache/kafka/pull/2642
Thanks
Sachin
On Mon, Mar 6, 2017 at 3:56
Jens,
I think you are correct that a 4 node zookeeper ensemble can be made to work
but it will be slightly less resilient than a 3 node ensemble because it can
only tolerate 1 failure (same as a 3 node ensemble) and the likelihood of node
failures is higher because there is 1 more node that cou
Hi Marco,
Your config etc look ok.
1. It is pretty hard to tell what is going on from just your code below,
unfortunately. But the behaviour doesn't seem to be in line with what I'm
reading in the streams code. For example, your MySession::new function
should be called once per record. The merger a
On trunk the CommitFailedException isn't thrown anymore. The commitOffsets
method doesn't throw an exception. It returns one if it was thrown. We used
to throw this exception during suspendTasksAndState, but we don't anymore.
On Mon, 6 Mar 2017 at 05:04 Sachin Mittal wrote:
> Hi
> On CommitFaile
Hi Ofir,
My advice is to handle the duplicates. As you said, compaction only runs on
the non-active segments, so there could be duplicates in the active segment.
Further, even after compaction has run there could still be duplicates.
You can attempt to minimize the occurrence of duplicates by adjustin
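For illustration, two topic-level settings that make segments eligible for
compaction sooner are segment.ms and min.cleanable.dirty.ratio; the topic
name and values here are assumptions:

  # smaller values roll segments and compact more aggressively
  bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic my-topic \
    --config segment.ms=3600000 --config min.cleanable.dirty.ratio=0.1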
Hello,
I'm playing around with the brand new SessionWindows. I have a simple
topology such as:
KStream sess =
builder.stream(stringSerde, jsonSerde, SOURCE_TOPIC);
sess
.map(MySession::enhanceWithUserId_And_PutUserIdAsKey)
.groupByKey(stringSerde, jsonSerde)
.aggregate(
MySes
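The aggregate() call is cut off above; for reference, a hedged sketch of how
it could be completed against the 0.10.2 KGroupedStream API, assuming
MySession has add()/merge() methods that return the updated session, that
jsonSerde can handle the aggregate, and a 5-minute inactivity gap:

  sess.map(MySession::enhanceWithUserId_And_PutUserIdAsKey)
      .groupByKey(stringSerde, jsonSerde)
      .aggregate(
          MySession::new,                           // initializer
          (userId, event, agg) -> agg.add(event),   // aggregator
          (userId, agg1, agg2) -> agg1.merge(agg2), // session merger
          SessionWindows.with(5 * 60 * 1000L),      // inactivity gap
          jsonSerde,                                // aggregate value serde
          "session-store-name");                    // state store name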
Hi, all
I'm trying to modify Kafka authentication to use our own authentication
procedure; authorization will stick to Kafka's ACLs.
Does every entity that fetches data from a certain topic need to go through
authentication? (Including Kafka Streams, replica-to-leader, etc.)
Hi Hans,
On Mon, Mar 6, 2017 at 12:10 AM, Hans Jespersen wrote:
> A 4 node zookeeper ensemble will not even work. It MUST be an odd number
> of zookeeper nodes to start.
Are you sure about that? If ZooKeeper doesn't run with four nodes, that means
a running ensemble of three can't be live-migrat
The DSL has some unique features that aren't in the Processor API, such as:
- KStream and KTable abstractions.
- Support for time windows (tumbling windows, hopping windows) and session
windows. The Processor API only has stream-time based `punctuate()`.
- Record caching, which is slightly better
I'd use option 2 (Kafka Connect).
Advantages of #2:
- The code is decoupled from the processing code and easier to refactor in
the future. (same as #4)
- The runtime/uptime/scalability of your Kafka Streams app (processing) is
decoupled from the runtime/uptime/scalability of the data ingestion in