Hi All,
Below are two scenarios that are failing.
1) 3-node ZooKeeper cluster and 3-node Kafka cluster: one of the ZooKeeper
nodes was down for some reason when the Kafka cluster started, and none of
the Kafka instances came up. All were failing with connectivity errors to
the down ZooKeeper instance. Whereas, Kafka contains zo
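One thing worth checking, as an assumption about this setup: each broker's
zookeeper.connect should list every ensemble member so the broker can fail
over to a live instance. A minimal sketch with illustrative hostnames:

  # server.properties (hostnames are illustrative)
  zookeeper.connect=zk1:2181,zk2:2181,zk3:2181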
> As for the second issue you brought up, I agree it is indeed a bug; but
just to clarify it is the CREATION of the first task including restoring
stores that can take longer than MAX_POLL_INTERVAL_MS_CONFIG, not
processing it right
Yes, this is correct. I may have misused the terminology, so let's n
Hi Todd,
I agree that KAFKA-2561 would be good to have for the reasons you state.
Ismael
On Mon, Mar 6, 2017 at 5:17 PM, Todd Palino wrote:
> Thanks for the link, Ismael. I had thought that the most recent kernels
> already implemented this, but I was probably confusing it with BSD. Most of
>
Default broker configurations do not show up in the topic overrides (which is
what you are seeing with the topics tool). It is more accurate to say that
the min.insync.replicas setting in your server.properties file is what will
apply to every topic (regardless of when it is created), if there exists
Hi All,
I need details about min.insync.replicas in server.properties.
I thought that once I add this to server.properties, all subsequently
created topics would have it as the default value.
C:\JAVA_INSTALLATION\kafka\kafka_2.11-0.10.1.1>bin\windows\kafka-topics.bat
--zookeeper localhost:2181/chroot
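To illustrate the distinction drawn above: the broker-wide default lives in
server.properties, while the topics tool only displays explicit per-topic
overrides. A sketch, with the topic name and value as assumptions:

  # server.properties: broker-wide default, applies to every topic
  min.insync.replicas=2

  rem Explicit per-topic override (this is what kafka-topics displays):
  bin\windows\kafka-topics.bat --zookeeper localhost:2181/chroot --alter ^
    --topic my-topic --config min.insync.replicas=2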
Hi,
you can implement custom operators via process(), transform(), and
transformValues().
Also, if you want to have even more control over the topology, you can
use the low-level Processor API directly instead of the DSL.
http://docs.confluent.io/current/streams/developer-guide.html#processor-api
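For example, a minimal sketch of a custom operator written as a
ValueTransformer; the class name and the enrichment logic are hypothetical:

  import org.apache.kafka.streams.kstream.ValueTransformer;
  import org.apache.kafka.streams.processor.ProcessorContext;

  // Hypothetical custom operator: tags each value with the record timestamp.
  class EnrichTransformer implements ValueTransformer<String, String> {
      private ProcessorContext context;

      @Override
      public void init(final ProcessorContext context) {
          // gives access to state stores, record metadata, scheduling, ...
          this.context = context;
      }

      @Override
      public String transform(final String value) {
          return value + "@" + context.timestamp();
      }

      @Override
      public String punctuate(final long timestamp) {
          return null; // no periodic output
      }

      @Override
      public void close() {}
  }

  // usage: stream.transformValues(EnrichTransformer::new);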
-Ma
Dear folks,
Background: I'm learning Kafka Streams and want to use it in my product for
real-time stream processing with data from various sensors.
Question:
1. Can I define my own processing function/API in Kafka Streams besides the
predefined functions like groupBy(), count(), etc.?
2. If I cou
Hey, all. Is there any general guidance around using mirrored topics
in the context of a cluster migration?
We're moving operations from one data center to another, and we want
to stream mirrored data from the old cluster to the new, migrate
consumers, then migrate producers.
Our basic question i
Thanks Le. However, my cluster is kerberized.
From: Le Cyberian
To: Mudit Agarwal
Sent: Monday, 6 March 2017 9:24 PM
Subject: Re: Kafka Kerberos Ansible
Hi Mudit,
I guess it's more related to Ansible than to Kafka itself. However, I will
try to answer.
Since Ansible uses SSH and
Thanks for your input. I now understand the first issue (and the fix).
Still not sure about the second issue.
From my understanding, the deadlock is "caused" by your fix for problem
one. If the thread died, the lock would get released and no deadlock
would occur.
However, because the thread do
Hi Mudit,
I guess it's more related to Ansible than to Kafka itself. However, I will
try to answer.
Since Ansible uses SSH and you already have passwordless SSH between the
Ansible host (which executes playbooks) and the Kafka cluster,
you can simply use the ansible command or shell module to get the list
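As a sketch, where the inventory group, install path, and ZooKeeper address
are assumptions, an ad-hoc invocation from the Ansible host could look like:

  # run the topics tool on a broker via the shell module; names illustrative
  ansible kafka-brokers -m shell \
    -a "/opt/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --list"

Note that listing topics this way talks to ZooKeeper, so Kafka's Kerberos
authentication is not involved unless ZooKeeper itself is secured.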
Hi,
Assume that we are building ZK and Kafka as a messaging system.
We are considering two scenarios.
1. Deploy 3 physical hosts (as opposed to VMs) and create a Kafka cluster
there. On the same physical hosts, create a three-node ZK cluster.
2. Deploy 3 physical hosts (as opposed to VMs) and create
Let me reframe the questions.
How can I list the topics using an Ansible script from an Ansible host that
is outside the Kafka cluster? My Kafka cluster is kerberized. Kafka and the
Ansible host have passwordless SSH.
Thanks, Mudit
From: Le Cyberian
To: users@kafka.apache.org; Mudit Agarwal
Sent: Monda
Hi Han,
Thank you for your response. I understand. It's not possible to have a third
rack/server room at the moment, as the requirement is to have redundancy
between the two. I already tried to get one :-/
Is it possible to have a ZooKeeper ensemble (3 nodes) in one server room and
the same in the other an
Hello Sachin,
Thanks for your findings! Just to add to what Damian said regarding 1): in
KIP-129, where we are introducing exactly-once processing semantics to
Streams, we have also described different categories of error handling for
exactly-once. Commit exceptions due to a rebalance will be handled as
"p
Is there any way you can find a third rack/server room/power supply nearby just
for the 1 extra zookeeper node? You don’t have to put any kafka brokers there,
just a single zookeeper. It’s less likely to have a 3-way split brain because
of a network partition. It’s so much cleaner with 3 avail
Hi Mudit,
What do you mean by accessing the Kafka cluster from outside the Ansible VM?
It needs to listen on an interface that is reachable from the network
outside the VM.
BR,
Lee
On Mon, Mar 6, 2017 at 7:42 PM, Mudit Agarwal
wrote:
> Hi,
> How we can access the kafka cluster from an outside Ansible VM
Thanks, Han and Alexander, for taking the time and for your responses.
I now understand the risks and the possible outcomes of the desired setup.
What would be better, in your opinion, to have failover (active-active)
between both of these server rooms and avoid switching to the clone / 3rd
zookeepe
Hi,
How can we access the Kafka cluster from an outside Ansible VM? Kafka is
kerberized. It is all a Linux environment.
Thanks, Mudit
Yes, that is the parameter I was referring to.
And yes, you can set consumer/producer config via StreamsConfig.
However, it's recommended to use
> props.put(StreamsConfig.consumerPrefix("consumer.parameter.name"), value);
-Matthias
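As a sketch, assuming an application id and bootstrap server, and using the
metadata.max.age.ms consumer parameter discussed in this thread:

  import java.util.Properties;
  import org.apache.kafka.streams.StreamsConfig;

  public class StreamsProps {
      public static Properties build() {
          final Properties props = new Properties();
          props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app"); // illustrative
          props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
          // The prefixed key is forwarded only to the consumers Streams creates:
          props.put(StreamsConfig.consumerPrefix("metadata.max.age.ms"), "30000");
          return props;
      }
  }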
On 3/6/17 6:48 AM, Neil Moore wrote:
> Thanks for the answ
Hi Mickael,
This looks to be the same as KAFKA-4669. In theory, this should never
happen and it's unclear when/how it can happen. Not sure if someone has
investigated it in more detail.
Ismael
On Mon, Mar 6, 2017 at 5:15 PM, Mickael Maison
wrote:
> Hi,
>
> In one of our clusters, some of our c
Hi Todd
Can you please help me with notes or a document on how you achieved
encryption?
I have followed the material available on the official sites but failed, as
I'm no good with TLS.
On Mar 6, 2017 19:55, "Todd Palino" wrote:
> It’s not that Kafka has to decode it, it’s that it has to send it across
Thanks for the link, Ismael. I had thought that the most recent kernels
already implemented this, but I was probably confusing it with BSD. Most of
my systems are stuck in the stone age right now anyway.
It would be nice to get KAFKA-2561 in, either way. First off, if you can
take advantage of it
Hi,
In one of our clusters, some of our clients occasionally see this exception:
java.lang.IllegalStateException: Correlation id for response (4564)
does not match request (4562)
at org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:486)
at org.apache.kafka.clients.NetworkClient.p
Even though OpenSSL is much faster than the Java 8 TLS implementation (I
haven't tested against Java 9, which is much faster than Java 8, but
probably still slower than OpenSSL), all the tests were without zero copy
in the sense that is being discussed here (i.e. sendfile). To benefit from
sendfile
So that's not quite true, Hans. First, regarding the performance hit not
being a big impact: 25% is huge. Nor is it simply to be expected. Part of
the problem is that the Java TLS implementation does not support zero copy.
OpenSSL does, and in fact there’s been a ticket open to allow Kafka to
support
Please find the JIRA https://issues.apache.org/jira/browse/KAFKA-4848
On Mon, Mar 6, 2017 at 5:20 PM, Damian Guy wrote:
> Hi Sachin,
>
> If it is a bug then please file a JIRA for it, too.
>
> Thanks,
> Damian
>
> On Mon, 6 Mar 2017 at 11:23 Sachin Mittal wrote:
>
> > Ok that's great.
> > So y
I agree that this is one cluster, but having one additional ZK node per
site does not help (as far as I understand ZK).
3 out of 6 is also not a majority, so I think you mean 3/5 with a cloned
3rd one. This would mean manually switching in the cloned one to regain a
majority, which can cause issues again.
Thanks for the answers, Matthias.
You mention a metadata refresh interval. I see Kafka producers and consumers
have a property called metadata.max.age.ms, which sounds similar. From the
documentation and the Javadoc for Kafka Streams it is not clear to
me how I can affect KafkaStream
Hi Marco,
I've done some testing and found that there is a performance issue when
caching is enabled. I suspect this might be what you are hitting. It looks
to me like you can work around this by doing something like:
final StateStoreSupplier sessionStore =
    Stores.create("session-store-name")
It's not a single message at a time that is encrypted with TLS; it's the
entire network byte stream. A Kafka broker can't even see the Kafka
protocol tunneled inside TLS unless the TLS is terminated at the broker.
It is true that losing the zero-copy optimization impacts performance somewhat
but it
It’s not that Kafka has to decode it, it’s that it has to send it across
the network. This is specific to enabling TLS support (transport
encryption), and won’t affect any end-to-end encryption you do at the
client level.
The operation in question is called “zero copy”. In order to send a message
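As a sketch of what "zero copy" means at the JVM level (this is a minimal
illustration, not Kafka's actual code): FileChannel.transferTo() maps to
sendfile(2) on Linux, so file bytes move from the page cache to the socket
without entering user space. With TLS, the bytes must instead be read into
user space, encrypted, and written back out.

  import java.io.IOException;
  import java.nio.channels.FileChannel;
  import java.nio.channels.SocketChannel;

  public class ZeroCopySend {
      // Send `count` bytes of the log file starting at `position` straight
      // to the client socket, with no copy through a user-space buffer.
      static long send(final FileChannel log, final SocketChannel socket,
                       final long position, final long count) throws IOException {
          return log.transferTo(position, count, socket);
      }
  }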
In that case it's really one cluster. Make sure to set different rack ids for
each server room so Kafka will ensure that the replicas always span both floors
and you don't lose availability of data if a server room goes down.
You will have to configure one additional zookeeper node in each site w
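The rack id is a per-broker setting; a sketch with illustrative names:

  # server.properties on each broker in server room 1 (ids illustrative)
  broker.rack=room-1
  # ... and on each broker in server room 2:
  broker.rack=room-2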
Hi Hans,
Thank you for your reply.
It's basically two different server rooms on different floors, and they are
connected with fiber, so it's almost like a local connection
between them: no network latency / lag.
If I do Mirror Maker / Replicator then I will not be able to use them at
What do you mean when you say you have "2 sites not datacenters"? You
should be very careful configuring a stretch cluster across multiple sites.
What is the RTT between the two sites? Why do you think that Mirror Maker
(or Confluent Replicator) would not work between the sites and yet you
think a
Hi Guys,
Thank you very much for your reply.
The scenario which I have to implement is that I have 2 sites, not
datacenters, so Mirror Maker would not work here.
There will be 4 nodes in total: 2 in Site A and 2 in Site B. The idea
is to have an Active-Active setup along with fault tolerance, so t
Hi everyone,
I understand one of the reasons why Kafka is performant is its use of
zero-copy.
I often hear that when encryption is enabled, Kafka has to copy the
data into user space to decode the message, so it has a big impact on
performance.
If that is true, I don't get why the message has to
Thanks Damian,
sure, you are right, these details were modified to comply with my
company's rules. However, the main points are unchanged.
The producer of the original data is a "data ingestor" that attaches a few
extra fields and produces a message such as:
row = new JsonObject({
"id" : 1234565
Hi Marco,
Can you try setting StreamsConfig.CACHE_MAX_BYTES_BUFFER_CONFIG to 0 and
see if that resolves the issue?
Thanks,
Damian
On Mon, 6 Mar 2017 at 10:59 Damian Guy wrote:
> Hi Marco,
>
> Your config etc look ok.
>
> 1. It is pretty hard to tell what is going on from just your code below,
Hi Sachin,
If it is a bug then please file a JIRA for it, too.
Thanks,
Damian
On Mon, 6 Mar 2017 at 11:23 Sachin Mittal wrote:
> Ok that's great.
> So you have already fixed that issue.
>
> I have modified my PR to remove that change (which was done keeping
> 0.10.2.0 in mind).
>
> However the
Ok that's great.
So you have already fixed that issue.
I have modified my PR to remove that change (which was done keeping
0.10.2.0 in mind).
However the other issue is still valid.
Please review that change. https://github.com/apache/kafka/pull/2642
Thanks
Sachin
On Mon, Mar 6, 2017 at 3:56
Jens,
I think you are correct that a 4 node zookeeper ensemble can be made to work
but it will be slightly less resilient than a 3 node ensemble because it can
only tolerate 1 failure (same as a 3 node ensemble) and the likelihood of node
failures is higher because there is 1 more node that cou
Hi Marco,
Your config etc look ok.
1. It is pretty hard to tell what is going on from just your code below,
unfortunately. But the behaviour doesn't seem to be in line with what I'm
reading in the streams code. For example, your MySession::new function
should be called once per record. The merger a
On trunk the CommitFailedException isn't thrown anymore. The commitOffsets
method doesn't throw an exception. It returns one if it was thrown. We used
to throw this exception during suspendTasksAndState, but we don't anymore.
On Mon, 6 Mar 2017 at 05:04 Sachin Mittal wrote:
> Hi
> On CommitFaile
Hi Ofir,
My advice is to handle the duplicates. As you said, compaction only runs on
the non-active segments, so there could be duplicates in the active segment.
Further, even after compaction has run there could still be duplicates.
You can attempt to minimize the occurrence of duplicates by adjustin
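For illustration, two topic-level settings that make segments eligible for
compaction sooner are segment.ms and min.cleanable.dirty.ratio; the topic
name and values here are assumptions:

  # smaller values roll segments and compact more aggressively
  bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic my-topic \
    --config segment.ms=3600000 --config min.cleanable.dirty.ratio=0.1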
Hello,
I'm playing around with the brand new SessionWindows. I have a simple
topology such as:
KStream sess =
builder.stream(stringSerde, jsonSerde, SOURCE_TOPIC);
sess
.map(MySession::enhanceWithUserId_And_PutUserIdAsKey)
.groupByKey(stringSerde, jsonSerde)
.aggregate(
MySes
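The aggregate() call is cut off above; for reference, a hedged sketch of how
it could be completed against the 0.10.2 KGroupedStream API, assuming
MySession has add()/merge() methods that return the updated session, that
jsonSerde can handle the aggregate, and a 5-minute inactivity gap:

  sess.map(MySession::enhanceWithUserId_And_PutUserIdAsKey)
      .groupByKey(stringSerde, jsonSerde)
      .aggregate(
          MySession::new,                           // initializer
          (userId, event, agg) -> agg.add(event),   // aggregator
          (userId, agg1, agg2) -> agg1.merge(agg2), // session merger
          SessionWindows.with(5 * 60 * 1000L),      // inactivity gap
          jsonSerde,                                // aggregate value serde
          "session-store-name");                    // state store name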
Hi, all
I'm trying to modify Kafka authentication to use our own authentication
procedure; authorization will stick to Kafka's ACLs.
Does every entity that fetches data from a certain topic need to go through
authentication? (Including Kafka Streams, replica-to-leader, etc.)
Hi Hans,
On Mon, Mar 6, 2017 at 12:10 AM, Hans Jespersen wrote:
> A 4 node zookeeper ensemble will not even work. It MUST be an odd number
> of zookeeper nodes to start.
Are you sure about that? If ZooKeeper doesn't run with four nodes, that means
a running ensemble of three can't be live-migrat
The DSL has some unique features that aren't in the Processor API, such as:
- KStream and KTable abstractions.
- Support for time windows (tumbling windows, hopping windows) and session
windows. The Processor API only has stream-time based `punctuate()`.
- Record caching, which is slightly better
I'd use option 2 (Kafka Connect).
Advantages of #2:
- The code is decoupled from the processing code and easier to refactor in
the future. (same as #4)
- The runtime/uptime/scalability of your Kafka Streams app (processing) is
decoupled from the runtime/uptime/scalability of the data ingestion in