Colocating Kafka Connect on Kafka Broker

2016-06-29 Thread Kristoffer Sjögren
Hi, we want to use Kafka Connect to copy data to HDFS (using kafka-connect-hdfs) in Parquet format and were wondering if it's a good idea to colocate distributed Kafka Connect 1-1 on Kafka brokers? Considering the Parquet indexing process would steal (a lot of / too much?) computing resources from

Last offset in all partitions

2016-07-06 Thread Kristoffer Sjögren
Hi, is there a way to get the last offset written to each partition of a topic programmatically using the 0.10.0.0 API? At the moment I use KafkaConsumer.seekToEnd, as seen in this gist [1], but maybe there is a better, more efficient way to do it? Cheers, -Kristoffer [1] https://gist.github.com/k

Re: Last offset in all partitions

2016-07-06 Thread Kristoffer Sjögren
stamp to offset for every partition with (currently) 60 second > granularity. Useful for offset resets and other tasks. > > -Todd > > On Wednesday, July 6, 2016, Kristoffer Sjögren wrote: > >> Hi >> >> Is there a way to get the last offset written by all partition

Re: Last offset in all partitions

2016-07-06 Thread Kristoffer Sjögren
>> > We do this through our monitoring agents by pulling it as a metric from >> the >> > LogEndOffset beans. By putting it into our metrics system we get a >> mapping >> > of timestamp to offset for every partition with (currently) 60 second >> > granulari
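
The reply quoted above describes sampling each partition's LogEndOffset as a metric roughly every 60 seconds, which yields a timestamp-to-offset mapping useful for offset resets. A minimal sketch of such a lookup follows, assuming the samples are already collected. The class name `OffsetIndex`, its methods, and the sample values are all hypothetical and not part of Kafka or of the setup described in the thread.

```python
import bisect
from collections import defaultdict


class OffsetIndex:
    """Hypothetical in-memory index of (timestamp, log end offset) samples per partition."""

    def __init__(self):
        # partition -> parallel lists of sample timestamps and offsets
        self._timestamps = defaultdict(list)
        self._offsets = defaultdict(list)

    def record(self, partition, timestamp, log_end_offset):
        """Store one sample; samples are assumed to arrive in timestamp order."""
        self._timestamps[partition].append(timestamp)
        self._offsets[partition].append(log_end_offset)

    def offset_at(self, partition, timestamp):
        """Return the offset of the latest sample at or before `timestamp`, or None."""
        times = self._timestamps[partition]
        i = bisect.bisect_right(times, timestamp)
        if i == 0:
            return None  # no sample that early
        return self._offsets[partition][i - 1]

    def latest(self):
        """Return the most recently sampled offset for every partition."""
        return {p: offs[-1] for p, offs in self._offsets.items() if offs}


# Made-up samples at 60-second granularity (timestamps in seconds).
index = OffsetIndex()
for ts, offset in [(0, 100), (60, 250), (120, 400)]:
    index.record(0, ts, offset)
for ts, offset in [(0, 10), (60, 20)]:
    index.record(1, ts, offset)
```

With these samples, `index.offset_at(0, 90)` resolves to the sample taken at 60 seconds, so the answer is only as precise as the sampling interval.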

Kafka Connect issues

2016-07-16 Thread Kristoffer Sjögren
Hi, I'm running Kafka Connect in distributed mode with the Confluent HDFS sink connector. But the WorkerSinkTask is constantly interrupted by rebalancing requests from the broker (onPartitionsRevoked) [1] and gets stuck in a recovery state where the brokers constantly log "Preparing to restabi

Re: Kafka Connect issues

2016-07-26 Thread Kristoffer Sjögren
rkers or b) there are connectivity issues/failures. Is it possible there's something causing large latencies? -Ewen On Sat, Jul 16, 2016 at 6:09 AM, Kristoffer Sjögren wrote: > Hi > > I'm running Kafka Connect in distributed mode with the confluent HDFS > sink connector. >

Re: Kafka Connect issues

2016-07-27 Thread Kristoffer Sjögren
2 - session.timeout.ms=30 - max.poll.records=1 Please let me know if something stands out as bad/imbalanced/under-provisioned. Cheers, -Kristoffer On Tue, Jul 26, 2016 at 12:38 PM, Kristoffer Sjögren wrote: > We found very high CPU usage which might cause the problem. Seems to be > spe

Re: Kafka Connect issues

2016-07-28 Thread Kristoffer Sjögren
We're also seeing lots of failures as the TopicPartitionWriter tries to close WAL files in HDFS [1]. [1] http://pastebin.com/6ipUndZv On Wed, Jul 27, 2016 at 5:01 PM, Kristoffer Sjögren wrote: > The workers seem happier when reducing the number of partitions for each > worker. And when

Kafka consumer stuck in (Re-)joining group

2016-08-11 Thread Kristoffer Sjögren
Hi, I have been using distributed Kafka Connect 0.10.0.0 very successfully for a while now. But then, after a restart, both machines get stuck as they try to join the group and mark the coordinator dead. I found no way of getting out of this state. Restarting the machines does not help. Here are so

Kafka-connect cannot find configuration in config.storage.topic

2016-10-12 Thread Kristoffer Sjögren
Hi, we have noticed that kafka-connect cannot find its connector configuration after a few weeks have passed. The web UI reports that no connectors are available even though the configuration records are still available in config.storage.topic. It's possible to start the connectors again by curling the c