Gwen and Guozhang are very convincing, so I don't have much to add here :)
The only thing I can think to add is that it is less code for you to write!
Once one person writes the connector, we don't have a bunch of people
reimplementing the logic for copying data to a JDBC sink for their KS apps.
0.10.1.0 is considered a major release. The 0.10.0.0 release might have a
follow-up 0.10.0.1 for critical bug fixes, but 0.10.1.0 is a new feature
release. Kafka is a bit odd in that its "major" releases are labeled with
what looks like a normal "minor" version number, because the project hasn't
yet decided to cut an official 1.0.
The GetOffsetShell utility still uses the SimpleConsumer, so I don't think
there's a way to use it with Kerberos. The new consumer doesn't expose all
the APIs that SimpleConsumer does, so I don't think the tool can be
converted to the new consumer yet.
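For reference, a typical GetOffsetShell invocation looks like the sketch below (the broker address and topic name are placeholders):

```shell
# Query the latest offsets (--time -1) for each partition of a topic.
# This tool uses SimpleConsumer under the hood, so it won't work
# against a Kerberos-secured cluster.
bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
  --broker-list localhost:9092 \
  --topic my-topic \
  --time -1
```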
-Ewen
On Wed, Jul 6, 2016 at 11:02 AM, Prabh
Unfortunately there's no ID for the producer of messages -- the client ID
is included when the request is sent, but it isn't recorded on disk. You
*might* be able to dig out the producer of bad messages from the Kafka
logs, but there's nothing in the stored data that would lead you directly
to the
Since you mention ZK timeout, I think you might be confused about new vs
old consumer semantics. With the new consumer, there's no ZK interaction.
If one of the members dies after indicating membership but before the group
protocol completes, it will simply be assigned data and not process it.
After
Kafka will batch messages, but if messages arrive slowly enough, each batch
may end up containing only a single message. What is the total
throughput per broker?
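As a sketch, the producer settings that control batching are `batch.size` and `linger.ms`; the values below are illustrative, not recommendations, and the broker address is a placeholder:

```java
import java.util.Properties;

public class ProducerBatchingConfig {
    // Sketch: standard producer config keys that control batching.
    // Values here are illustrative only.
    static Properties batchingProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed address
        props.setProperty("batch.size", "16384"); // max bytes batched per partition
        props.setProperty("linger.ms", "5");      // wait up to 5 ms to fill a batch
        return props;
    }

    public static void main(String[] args) {
        System.out.println("batch.size=" + batchingProps().getProperty("batch.size"));
    }
}
```

With a low message rate, a batch is sent before it fills, which is why per-batch counts drop to one; raising `linger.ms` trades latency for larger batches.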
-Ewen
On Fri, Jul 15, 2016 at 5:21 PM, Boris Sorochkin wrote:
> Hi All,
> I have Kafka setup with default settings and relatively
Kafka Connect can also help you here. There's nothing nginx specific, but
even a very simple file connector can help you ingest nginx logs into Kafka.
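For example, a minimal standalone file-source configuration might look like the following (the file path and topic name are assumptions):

```properties
# Sketch of a Kafka Connect file source for nginx access logs
name=nginx-access-log-source
connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
tasks.max=1
file=/var/log/nginx/access.log
topic=nginx-logs
```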
-Ewen
On Tue, Jul 19, 2016 at 11:22 AM, Steve Brandon
wrote:
> You can use the ELK stack to push your logs to Kafka, and Kibana to
> visualize
>
There's no strict limit on the number of producers. If you're hitting some
CPU limit, perhaps you are simply overloading the broker? 600 or 700
producers doesn't sound that bad, but if they are producing too much data then of
course eventually the broker will become overwhelmed. How much total data
is
I'd suggest using the new consumer instead of the old consumer. We've
refined the implementation such that even with auto-commit you should get
at-least-once processing in the worst case (and exactly-once when there are
no failures). The 0.10.0.0 release should get all of these semantics right.
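As a sketch, the new-consumer auto-commit settings look like this; the key names are the standard consumer config strings, while the server address and group ID are placeholders:

```java
import java.util.Properties;

public class AutoCommitConfig {
    // Sketch: new-consumer auto-commit configuration.
    // Values are illustrative only.
    static Properties consumerProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed
        props.setProperty("group.id", "my-group");                // assumed
        props.setProperty("enable.auto.commit", "true");
        props.setProperty("auto.commit.interval.ms", "5000");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps());
    }
}
```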
-
On Tue, Jul 19, 2016 at 12:48 AM, Denis Mikhaylov
wrote:
> Hi, I plan to use Kafka for event-based integration between services. And
> I have two questions that bother me a lot:
>
> 1) What topic naming convention do you use?
>
There's no strict convention, but using '.' or '_' to indicate hiera
Hey Josh,
There's no guarantee that a poll() will return data. It might send a
request, but if it takes longer than the timeout to retrieve some data from
the brokers, then the call will have no data to return. This doesn't
indicate anything is wrong, just that no data has been returned yet.
poll
You're probably spending too long processing some messages. The
ILLEGAL_GENERATION message indicates that a new generation was created,
which probably means you didn't call poll() frequently enough to send
heartbeats indicating you are still an active member of the group.
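Two settings affect how often poll() must be called; the sketch below shows them with illustrative values (not recommendations):

```java
import java.util.Properties;

public class RebalanceTuning {
    // Sketch: max.poll.records (new in 0.10.0) caps the work handed
    // back per poll() call, and a larger session.timeout.ms tolerates
    // longer gaps between polls. Values are illustrative only.
    static Properties tuningProps() {
        Properties props = new Properties();
        props.setProperty("session.timeout.ms", "30000");
        props.setProperty("max.poll.records", "100");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(tuningProps());
    }
}
```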
We're currently working o
Yes, that's just a bug in the image -- the second log segment should hold
messages in the range indicated in the left side of the image.
-Ewen
On Sun, Jul 3, 2016 at 10:03 AM, Adam Cardenas wrote:
> Good day Kafka users,
>
> Was looking over the current Kafka docs;
> https://kafka.apache.org/do
Manikumar,
Yeah, that seems bad. Seems like maybe instead of moving to server-side
processing we should make the metadata request limit results to topics the
principal is authorized for? I suspect this is important anyway since
generally it seems we don't want to reveal errors when there's unautho
That exception indicates that another thread is interrupting the consumer
thread. Is there something else in the process that could be causing that
interruption?
The -1 broker ID actually isn't unusual. Since broker IDs should be
positive, this is just a simple approach to identifying bootstrap se
You're right that today you need to distribute jars manually -- we
don't have a built-in distribution mechanism, we just depend on what's on
the classpath. Once you've got the jars installed, to make the jars
accessible you'll need to do a rolling bounce with updated classpaths.
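A rolling bounce for one worker might look like the sketch below (all paths and filenames are assumptions):

```shell
# Sketch of updating one Connect worker at a time.
# 1. Put the new connector jar on the worker machine.
cp my-connector.jar /opt/connect-plugins/
# 2. Restart the worker with the jar on its classpath.
export CLASSPATH=/opt/connect-plugins/my-connector.jar
bin/connect-distributed.sh config/connect-distributed.properties
```

Repeat per worker so the cluster stays up throughout the update.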
We know that
If you have a consumer group, you can use whichever member is assigned
partition 0 of a topic as the leader. Just note that this does *not* handle
zombie leader scenarios, e.g. if a member of the group has a very long GC
and then wakes up later thinking it is still the leader, it will execute
any l
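The check itself is trivial; the sketch below assumes you already have the set of partition numbers assigned to this member (e.g. from the consumer's assignment after a rebalance):

```java
import java.util.Set;

public class LeaderCheck {
    // Sketch: treat whichever group member owns partition 0 of a chosen
    // topic as the "leader". The assignedPartitions set is assumed to
    // come from the consumer's post-rebalance assignment.
    static boolean isLeader(Set<Integer> assignedPartitions) {
        return assignedPartitions.contains(0);
    }

    public static void main(String[] args) {
        System.out.println(isLeader(Set.of(0, 3))); // owns partition 0: leader
        System.out.println(isLeader(Set.of(1, 2))); // does not: follower
    }
}
```

Remember the zombie-leader caveat above: a paused member can still believe it holds partition 0 after the group has moved on.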
They generally implement the same consumer group functionality, but the new
consumer (your option 1) is more modern, will be supported for a long time
(whereas option 2 will eventually be deprecated and removed), and has a
better implementation. The new consumer takes into account a lot of lessons
Hey Shubham,
I'd highly recommend a couple of newbie bugs just to get familiarized (
https://issues.apache.org/jira/issues/?jql=project%20%3D%20KAFKA%20AND%20labels%20%3D%20%22newbie%22%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20key%20DESC
)
After getting familiarized with the project (
That definitely sounds unusual -- rebalancing normally only happens either
when a) there are new workers or b) there are connectivity issues/failures.
Is it possible there's something causing large latencies?
-Ewen
On Sat, Jul 16, 2016 at 6:09 AM, Kristoffer Sjögren
wrote:
> Hi
>
> I'm running
The parameter you want is AUTO_OFFSET_RESET_CONFIG. If setting that to
latest isn't working, can you include some code that reproduces the issue?
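As a sketch, the key resolves to the plain string "auto.offset.reset"; note it only applies when the group has no committed offset, which is a common reason "latest" appears not to work:

```java
import java.util.Properties;

public class OffsetResetConfig {
    // Sketch: "auto.offset.reset" (what ConsumerConfig
    // .AUTO_OFFSET_RESET_CONFIG resolves to) set to "latest".
    // It only takes effect when there is no committed offset for the
    // group; otherwise the consumer resumes from the committed offset.
    static Properties offsetProps() {
        Properties props = new Properties();
        props.setProperty("auto.offset.reset", "latest");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(offsetProps().getProperty("auto.offset.reset"));
    }
}
```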
-Ewen
On Wed, Jul 6, 2016 at 6:21 AM, Pawel Huszcza
wrote:
> Hello,
>
> I tried every different property I can think of - I have set
> ConsumerConfig
On Wed, Jun 29, 2016 at 9:44 AM, Sumit Arora wrote:
> Hello,
>
> We are currently building our data-pipeline using Confluent and as part of
> this implementation, we have written couple of Kafka Connect Sink
> Connectors for Azure and MS SQL server. To provide some more context, we
> are planning
Confluent Platform includes RPM and Debian packages:
http://www.confluent.io/download We tag them a bit differently due to
different release schedules, but the CP builds are entirely open source and
effectively map directly to Apache releases. Check out
http://docs.confluent.io/3.0.0/installation.h
On Fri, Jun 24, 2016 at 11:16 AM, noah wrote:
> I'm having some trouble figuring out the right way to run Kafka Connect in
> production. We will have multiple sink connectors that we need to remain
> running indefinitely and have at least once semantics (with as little
> duplication as possible)
Generally we discourage colocating services with Kafka. Kafka relies
heavily on the page cache. It's generally light on CPU (except maybe if it
has to recompress messages), but may not play well with other services.
For very light installations, colocating some services (e.g. both ZK and
Kafka), m
2 is technically enough, but you're at risk of losing data if one broker
fails and the second broker also fails while a replacement is still
replicating the data. In general, 3 brokers (and replicas) is a good
minimum, but there are some cases that might warrant using fewer, even as
few as 1. For exam
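Creating a topic with that recommended replication factor looks like the sketch below (the ZooKeeper address, topic name, and partition count are placeholders):

```shell
# Create a topic with replication factor 3 so data survives a broker
# failure even while another replica is being rebuilt.
bin/kafka-topics.sh --zookeeper localhost:2181 --create \
  --topic my-topic --partitions 6 --replication-factor 3
```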
My post is not directly referring to KS.
The new free O'Reilly book has a very good explanation of Kafka topic
counts.
You can download it from the link below (see Chapter 4):
http://shop.oreilly.com/product/0636920049463.do
In short, quoting from there:
>>> These problems are likely substantiall
Mmh... Some time ago we had an issue with Kafka 0.8.x.
The consumer was extremely slow (the CPU was sucked up by other processes)
and it was not picking up any messages. Looking at Zookeeper we saw the
offsets were committed as if the messages had already been read by the
consumer. We disabled auto back an