Re: Kafka as an event store for Event Sourcing

2015-06-11 Thread Gwen Shapira
Hi Ben, Thanks for creating the ticket. Having check-and-set capability will be sweet :) Are you planning to implement this yourself? Or is it just an idea for the community? Gwen On Thu, Jun 11, 2015 at 8:01 PM, Ben Kirwin wrote: > As it happens, I submitted a ticket for this feature a couple

Re: Broken auto leader rebalance after using reassign partitions tool

2015-06-15 Thread Gwen Shapira
Can you share the command you ran for partition reassignment? (and the JSON) On Mon, Jun 15, 2015 at 8:41 AM, Valentin wrote: > > Hi guys, > > today I have observed a very strange behavior of the auto leader rebalance > feature after I used the reassign partitions tool. > For some reason only th

Re: Broken auto leader rebalance after using reassign partitions tool

2015-06-16 Thread Gwen Shapira
uot;T5", "partition": 0, "replicas": [1,3,5] }, > { "topic": "T5", "partition": 1, "replicas": [2,4,6] }, > { "topic": "T5", "partition": 2, "replicas": [1,3,5] }, > { "

Re: QuickStart OK locally, but getting "WARN Property topic is not valid" and LeaderNotAvailableException remotely

2015-06-16 Thread Gwen Shapira
The topic warning is a bug (i.e the fact that you get a warning on perfectly valid parameter). We fixed it for next release. It is also unrelated to the real issue with the LeaderNotAvailable On Tue, Jun 16, 2015 at 2:08 PM, Mike Bridge wrote: > I am able to get a simple one-node Kafka (kafka_2.

Re: Need recommendation

2015-06-18 Thread Gwen Shapira
I'm assuming you are sending data in a continuous stream and not a single large batch: 500GB a day = 20GB an hour = 5MB a second. A minimal 3 node cluster should work. You also need enough storage for reasonable retention period (15TB per month). On Thu, Jun 18, 2015 at 10:39 AM, Khanda.Rajat

Re: Is trunk safe for production?

2015-06-23 Thread Gwen Shapira
Out of curiosity, why do you want to run trunk? General fondness for cutting edge stuff? Or are there specific features in trunk that you need? Gwen On Tue, Jun 23, 2015 at 2:59 AM, Achanta Vamsi Subhash wrote: > I am planning to use for the producer part. How stable is trunk generally? > > -- >

Re: Best Practices for Java Consumers

2015-06-23 Thread Gwen Shapira
I don't know of any such resource, but I'll be happy to help contribute from my experience. I'm sure others would too. Do you want to start one? Gwen On Tue, Jun 23, 2015 at 2:03 PM, Tom McKenzie wrote: > Hello > > Is there a good reference for best practices on running Java consumers? > I'm th

Re: Kafka consumer does not work from eclipse when kafka running inside VM

2015-06-26 Thread Gwen Shapira
Zookeeper actually doesn't show any errors - it shows a warning, which is pretty normal. What does your consumer and Kafka broker show? Are there any errors in the consumer? Or is it just hanging? You may want to consult our FAQ: https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whydoesmy

Re: Help Us Nominate Apache Kafka for a 2015 Bossie (Best of OSS) Award - Due June 30th

2015-06-26 Thread Gwen Shapira
Sent! Thanks for letting us know of this opportunity to promote our favorite Apache project :) For inspiration, here's what I wrote: Apache Kafka revolutionized stream processing for big data. There is a dramatic growth in the number and variety of data sources an organization has to track as wel

Re: Producer repeatedly locking up

2015-06-30 Thread Gwen Shapira
Do you see any errors on the brokers when this happens? On Tue, Jun 30, 2015 at 10:14 AM, Shayne S wrote: > This problem is intermittent, not sure what is causing it. Some days > everything runs non-stop with no issues, some days I get the following. > > Setup: > - Single broker > - Running 0.8.2

Re: EOL JDK 1.6 for Kafka

2015-07-01 Thread Gwen Shapira
Huge +1. I don't think there is any other project that still supports 1.6. On Wed, Jul 1, 2015 at 8:05 AM, Harsha wrote: > Hi, > During our SSL Patch KAFKA-1690. Some of the reviewers/users > asked for support this config > https://docs.oracle.com/javase/8/docs/api/javax/net/ssl/

Re: Origin of product name Kafka

2015-07-06 Thread Gwen Shapira
Nice :) I always thought its a reference to the Kafkaesque process of building data pipelines in a large organization :) On Mon, Jul 6, 2015 at 6:52 PM, luo.fucong wrote: > I just found the answer in Quora: > > http://www.quora.com/What-is-the-relation-between-Kafka-the-writer-and-Apache-Kafka-t

Re: Kafka settings for (more) reliable/durable messaging

2015-07-07 Thread Gwen Shapira
I am not sure "different replica" can ACK the second back of messages while not having the first - from what I can see, it will need to be up-to-date on the latest messages (i.e. correct HWM) in order to ACK. On Tue, Jul 7, 2015 at 7:13 AM, Stevo Slavić wrote: > Hello Apache Kafka community, > >

Re: Kafka settings for (more) reliable/durable messaging

2015-07-07 Thread Gwen Shapira
te can be ACKed by any ISR, and then > why not by one which has fallen more behind. > > Kind regards, > Stevo Slavic. > > On Tue, Jul 7, 2015 at 4:47 PM, Gwen Shapira wrote: > >> I am not sure "different replica" can ACK the second back of messages >>

Re: Launch Kafka/Zookeeper servers without using command line

2015-07-07 Thread Gwen Shapira
Is this for unit tests? Or do you need an embedded Kafka / ZK inside an app? Or do you mean launching an external Kafka cluster, but without command line? For unit tests / embedded, you can see this example: https://gist.github.com/vmarcinko/e4e58910bcb77dac16e9 To start an actual Kafka server th

Re: stunning error - Request of length 1550939497 is not valid, it is larger than the maximum size of 104857600 bytes

2015-07-11 Thread Gwen Shapira
You need to configure the Kafka broker to allow you to send larger messages. The relevant parameters are: message.max.bytes (default:100) – Maximum size of a message the broker will accept. This has to be smaller than the consumer fetch.message.max.bytes, or the broker will have messages that

Re: stunning error - Request of length 1550939497 is not valid, it is larger than the maximum size of 104857600 bytes

2015-07-11 Thread Gwen Shapira
> # server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002". > # You can also append an optional chroot string to the urls to specify the > # root directory for all kafka znodes. > #zookeeper.connect=localhost:2181 > zookeeper.connect=<%=@zookeeper%> > > > # T

Re: New producer and ordering of Callbacks when sending to multiple partitions

2015-07-13 Thread Gwen Shapira
James, There are separate queues for each partition, so there are no guarantees on the order of the sends (or callbacks) between partitions. (Actually, IIRC, the code intentionally randomizes the partition order a bit, possibly to avoid starvation) Gwen On Mon, Jul 13, 2015 at 5:41 PM, James Che

Re: Using Kafka as a persistent store

2015-07-13 Thread Gwen Shapira
Hi, 1. What you described sounds like a reasonable architecture, but may I ask why JSON? Avro seems better supported in the ecosystem (Confluent's tools, Hadoop integration, schema evolution, tools, etc). 1.5 If all you do is convert data into JSON, SparkStreaming sounds like a difficult-to-manag

Re: How to run the three producers test

2015-07-14 Thread Gwen Shapira
You need to run 3 of those at the same time. We don't expect any errors, but if you run into anything, let us know and we'll try to help. Gwen On Tue, Jul 14, 2015 at 11:42 AM, Yuheng Du wrote: > Hi, > > I am running the performance test for kafka. https://gist.github.com/jkreps > /c7ddb4041ef62

Re: stunning error - Request of length 1550939497 is not valid, it is larger than the maximum size of 104857600 bytes

2015-07-14 Thread Gwen Shapira
I am not familiar with Apache Bench. Can you share more details on what you are doing? On Tue, Jul 14, 2015 at 11:45 AM, JIEFU GONG wrote: > So I'm trying to make a request with a simple ASCII text file, but what's > strange is even if I change files to send or the contents of the file I get > th

Re: How to run the three producers test

2015-07-14 Thread Gwen Shapira
Are there any errors on the broker logs? On Tue, Jul 14, 2015 at 11:57 AM, Yuheng Du wrote: > Jiefu, > > Thank you. The three producers can run at the same time. I mean should they > be started at exactly the same time? (I have three consoles from each of > the three machines and I just start the

Re: Consumer that consumes only local partition?

2015-07-15 Thread Gwen Shapira
This is not something you can use the consumer API to simply do easily (consumers don't have locality notion). I can imagine using Kafka's low-level API calls to get a list of partitions and the lead replica, figuring out which are local and using those - but that sounds painful. Are you 100% sure

Re: Delete topic using Admintools is not working

2015-07-16 Thread Gwen Shapira
Looks like you try to delete a topic that is already in the process of getting deleted: NodeExists for /admin/delete_topics/testTopic17 (We can improve the error messages for sure, or maybe even catch the exception and ignore it) Gwen On Thu, Jul 16, 2015 at 3:40 PM, Sivananda Reddy wrote: > Hi

Re: ZK chroot path would be automatically created since Kafka 0.8.2.0?

2015-07-22 Thread Gwen Shapira
You are right, this sounds like a doc bug. Do you mind filing a JIRA ticket (http://issues.apache.org/jira/browse/KAFKA) so we can keep track of this issue? On Tue, Jul 21, 2015 at 7:43 PM, yewton wrote: > Hi, > > The document about zookeeper.connect on Broker Configs says that > "Note that you

Re: Custom Zookeeper install with kafka

2015-07-22 Thread Gwen Shapira
All Cloudera customers use ZK 3.4.5 with no issues. On Wed, Jul 22, 2015 at 11:05 AM, Todd Palino wrote: > Yes, we use ZK 3.4.6 exclusively at LinkedIn and there's no problem. > > -Todd > >> On Jul 22, 2015, at 9:49 AM, Adam Dubiel wrote: >> >> Hi, >> >> I don't think it matters much which versi

Re: parametrized number of partitions for topics

2015-07-22 Thread Gwen Shapira
Edenhill's reply actually covers everything: 1) Right now you'll need to use the kafka-topic.sh tool bundled with Kafka. 2) There are future plans to add this capability : https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations On Wed, Ju

Re: Kafka message queue during broker failure

2015-07-22 Thread Gwen Shapira
If you are using the new Kafka Producer (in org.apache.kafka.clients package), you can configure number of retries. The Producer will queue messages and re-attempt to send them as specified. On Wed, Jul 22, 2015 at 11:54 AM, Jeff Gong wrote: > hi all, > > currently working with a team that is int

Re: kafka-topics.sh - Include topic deletion information?

2015-07-22 Thread Gwen Shapira
actually, I believe kafka-topics --list already shows this information. At least, I remember adding this feature... On Wed, Jul 22, 2015 at 3:11 PM, Ashish Singh wrote: > Hey Jaikiran, I think that is a fair ask. However, I am curious in which > scenario would you want to know the topics that sh

Re: wow--kafka--why? unresolved dependency: com.eed3si9n#sbt-assembly;0.8.8: not found

2015-07-23 Thread Gwen Shapira
Sorry, we don't actually do SBT builds anymore. You can build successfully using Gradle: You need to have [gradle](http://www.gradle.org/installation) installed. ### First bootstrap and download the wrapper ### cd kafka_source_dir gradle Now everything else will work ### Building a jar

Re: properducertest on multiple nodes

2015-07-24 Thread Gwen Shapira
Does topic "speedx1" exist? On Fri, Jul 24, 2015 at 7:09 AM, Yuheng Du wrote: > Hi, > > I am trying to run 20 performance test on 10 nodes using pbsdsh. > > The messages will send to a 6 brokers cluster. It seems to work for a > while. When I delete the test queue and rerun the test, the broker d

Re: The meaning of the term "group"

2015-07-30 Thread Gwen Shapira
Can you point specifically to which offsets() function you are referring to? (i.e object or file name will help) I didn't find a method that takes group as a parameter in the consumer API... On Wed, Jul 29, 2015 at 2:37 PM, Keith Wiley wrote: > ?My understanding is that the group id indicated to

Re: Consumer limit for pub-sub mode

2015-08-03 Thread Gwen Shapira
I don't know a specific limit for number of consumers, perhaps someone will have better idea. Do note that with a single topic and a single partition, you can't really scale by adding more machines - the way Kafka is currently designed, all consumers will read from the one machine that has the lea

Re: 0.9.0 release

2015-08-03 Thread Gwen Shapira
According to the plan, never :) Is there a specific feature you are looking forward to? I think the most exciting features are planned for 0.8.3 - which is targeted for Oct. On Mon, Aug 3, 2015 at 1:14 PM, Shrikant Patel wrote: > https://cwiki.apache.org/confluence/display/KAFKA/Future+release+

Re: 0.9.0 release

2015-08-03 Thread Gwen Shapira
> Shrikant Patel | 817.246.6760 | ext. 4302 Enterprise Architecture > Team PDX-NHIN-Rx.com > > -Original Message- > From: Gwen Shapira [mailto:g...@confluent.io] > Sent: Monday, August 03, 2015 3:23 PM > To: users@kafka.apache.org > Subject: Re: 0.9.0 release

Re: Checkpointing with custom metadata

2015-08-03 Thread Gwen Shapira
You are correct. You can see that ZookeeperConsumerConnector is hardcoded with null metadata. https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala#L310 More interesting, it looks like the Metadata is not exposed in the new KafkaConsumer eit

Re: Checkpointing with custom metadata

2015-08-03 Thread Gwen Shapira
how adding metadata to commit message can emulate some light-weight transactions, but I'd be concerned that this capability can get abused... P.S Thanks. I like my new address :) On Mon, Aug 3, 2015 at 6:46 PM, James Cheng wrote: > Nice new email address, Gwen. :) > > On Aug 3, 2015

Re: How to read in batch using HighLevel Consumer?

2015-08-04 Thread Gwen Shapira
To add some internals, the high level consumer actually does read entire batches from Kafka. It just exposes them to the user in an event loop, because its a very natural API. Users can then batch events the way they prefer. So if you are worried about batches being more efficient than single even

Re: message filterin or "selector"

2015-08-04 Thread Gwen Shapira
The way Kafka is currently implemented is that Kafka is not aware of the content of messages, so there is no Selector logic available. The way to go is to implement the Selector in your client - i.e. your consume() loop will get all messages but will throw away those that don't fit your pattern.

Re: InvalidMessageException: Message is corrupt, Consumer stuck

2015-08-04 Thread Gwen Shapira
The high level consumer stores its state in ZooKeeper. Theoretically, you should be able to go into ZooKeeper, find the consumer-group, topic and partition, and increment the offset past the "corrupt" point. On Tue, Aug 4, 2015 at 10:23 PM, Henry Cai wrote: > Hi, > > We are using the Kafka high-

Re: Recovering from Kafka NoReplicaOnlineException with one node

2015-08-10 Thread Gwen Shapira
Maybe it is not ZooKeeper itself, but the Broker connection to ZK timed-out and caused the controller to believe that the broker is dead and therefore attempted to elect a new leader (which doesn't exist, since you have just one node). Increasing the zookeeper session timeout value may help. Also,

Re: use page cache as much as possiblee

2015-08-13 Thread Gwen Shapira
On Thu, Aug 13, 2015 at 4:10 PM, Kishore Senji wrote: > Consumers can only fetch data up to the committed offset and the reason is > reliability and durability on a broker crash (some consumers might get the > new data and some may not as the data is not yet committed and lost). Data > will be co

Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-14 Thread Gwen Shapira
Will be nice to include Kafka-2308 and fix two critical snappy issues in the maintenance release. Gwen On Aug 14, 2015 6:16 AM, "Grant Henke" wrote: > Just to clarify. Will KAFKA-2189 be the only patch in the release? > > On Fri, Aug 14, 2015 at 7:35 AM, Manikumar Reddy > wrote: > > > +1 for 0

Re: 0.8.2 producer and single message requests

2015-08-14 Thread Gwen Shapira
Hi Neelesh :) The new producer has configuration for controlling the batch sizes. By default, it will batch as much as possible without delay (controlled by linger.ms) and without using too much memory (controlled by batch.size). As mentioned in the docs, you can set batch.size to 0 to disable ba

Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-16 Thread Gwen Shapira
and > > https://issues.apache.org/jira/browse/KAFKA-2120 > > > > On Fri, Aug 14, 2015 at 4:03 PM, Gwen Shapira wrote: > > > > > Will be nice to include Kafka-2308 and fix two critical snappy issues > in > > > the maintenance release. > > > >

Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-17 Thread Gwen Shapira
a/browse/KAFKA-2345>: > > > Attempt to delete a topic already marked for deletion throws > > >ZkNodeExistsException > > >- KAFKA-2353 <https://issues.apache.org/jira/browse/KAFKA-2353>: > > >SocketServer.Processor should catch exception

Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-17 Thread Gwen Shapira
but +1 for 0.8.2 patch that marks the new consumer API as unstable (or unimplemented ;) On Mon, Aug 17, 2015 at 9:12 AM, Gwen Shapira wrote: > The network refactoring portion was not tested well enough yet for me to > feel comfortable pushing it into a bugfix release. The new purgato

Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-17 Thread Gwen Shapira
ZkNodeExistsException > >- KAFKA-2353 <https://issues.apache.org/jira/browse/KAFKA-2353>: > >SocketServer.Processor should catch exception and close the socket > properly > >in configureNewConnections. > >- KAFKA-1836 <https://issues.apache.org/

Re: KafkaConsumer from 0.8.3 trunk hangs indefinitely on poll

2015-08-18 Thread Gwen Shapira
As you can see in the javadoc for KafkaConsumer, you need to call poll() in a loop. Something like: while (true) { * ConsumerRecords records = consumer.poll(100); * records.forEach(c -> queue.add(c.value())); * * } On Tue, Aug 18, 2015 at 2:46 AM, Krogh-Moe, Espen wrote: >

Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-18 Thread Gwen Shapira
A-2337 & KAFKA-2393 > > KAFKA-1867 > > KAFKA-2407 > > KAFKA-2234 > > KAFKA-1866 > > KAFKA-2345 & KAFKA-2355 > > > > thoughts? > > > > Thank you, > > Grant > > > > On Mon, Aug 17, 2015 at 4:56 PM, Gwen Shapira wrote: >

Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-18 Thread Gwen Shapira
corporating that feedback and iterate on it. > > We could absolutely do both 0.8.2.2 and 0.8.3. What I'd ask for is for us > to look at the 0.8.3 timeline too and make a call whether 0.8.2.2 still > makes sense. > > Thanks, > Neha > > On Tue, Aug 18, 2015 at 10:24 AM

Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-18 Thread Gwen Shapira
xed in trunk and we weren't planning for an 0.8.2.2 release then. > > Thanks, > > Jun > > On Mon, Aug 17, 2015 at 2:56 PM, Gwen Shapira wrote: > > > Thanks for creating a list, Grant! > > > > I placed it on the wiki with a quick evaluation of the con

Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-18 Thread Gwen Shapira
Any objections if I leave KAFKA-2114 (setting min.insync.replicas default) out? The test code is using changes that were done after 0.8.2.x cut-off, which makes it difficult to cherry-pick. Gwen On Tue, Aug 18, 2015 at 12:16 PM, Gwen Shapira wrote: > Jun, > > KAFKA-2147 doesn'

Re: Possible Memory Leak in Record Accumulator

2015-08-20 Thread Gwen Shapira
Hi, I didn't see this issue during our network hiccups. You wrote you saw: Got error produce response with correlation id 17717 on topic-partition event.beacon-38, retrying (8 attempts left). Error: NETWORK_EXCEPTION What did you see after? Especially once the network issue was resolved? more re

Re: Possible Memory Leak in Record Accumulator

2015-08-20 Thread Gwen Shapira
at 5:35 PM, Gwen Shapira wrote: > Hi, > > I didn't see this issue during our network hiccups. You wrote you saw: > > Got error produce response with correlation id 17717 on topic-partition > event.beacon-38, retrying (8 attempts left). Error: NETWORK_EXCEPTION > > What

Re: Painfully slow kafka recovery

2015-08-21 Thread Gwen Shapira
By default, num.replica.fetchers = 1. This means only one thread per broker is fetching data from leaders. This means it make take a while for the recovering machine to catch up and rejoin the ISR. If you have bandwidth to spare, try increasing this value. Regarding "no data flowing into kafka" -

Re: Painfully slow kafka recovery

2015-08-21 Thread Gwen Shapira
e been minimal? > > Thanks, > Raja. > > > > On Fri, Aug 21, 2015 at 12:31 PM, Gwen Shapira wrote: > > > By default, num.replica.fetchers = 1. This means only one thread per > broker > > is fetching data from leaders. This means it make take a while for the >

Re: SSL between Kafka and Spark Streaming API

2015-08-28 Thread Gwen Shapira
I can't speak for the Spark Community, but checking their code, DirectKafkaStream and KafkaRDD use the SimpleConsumer API: https://github.com/apache/spark/blob/master/external/kafka/src/main/scala/org/apache/spark/streaming/kafka/DirectKafkaInputDStream.scala https://github.com/apache/spark/blob/m

Re: SSL between Kafka and Spark Streaming API

2015-08-28 Thread Gwen Shapira
; > > > We are fine with non SSL consumer as our kafka cluster and spark cluster > > are in the same network > > > > > > Thanks, > > Sourabh > > > > On Fri, Aug 28, 2015 at 12:03 PM, Gwen Shapira > wrote: > > I can't speak for the Spark Co

Re: Get kafka connection status

2015-08-30 Thread Gwen Shapira
Two suggestions: 1. While the consumer is connected, it has one or more threads called "ConsumerFetcherThread---" If you can look at which threads are currently running and check if any called ConsumerFetcherThread-* exist, this is a good indication. The threads are closed when shutdown() is calle

Re: 0.8.2 producer and single message requests

2015-09-01 Thread Gwen Shapira
ntially get async model) for every send() and based on that it > > should > > > > respond to its clients whether the call is successful or not. The > > clients > > > > of your webservice should have fault tolerance built on top of your > > > > response co

Re: Kafka loses data after one broker reboot

2015-09-01 Thread Gwen Shapira
There is another edge-case that can lead to this scenario. It is described in detail in KAFKA-2134, and I'll copy Becket's excellent summary here for reference: 1) Broker A (leader) has committed offset up-to 5000 2) Broker B (follower) has committed offset up to 3000 (he is still in ISR because o

Re: Competing customers

2015-09-03 Thread Gwen Shapira
Yeah, scaling through adding partitions ("sharding") is a basic feature of Kafka. We expect topics to have many partitions (at least as many as number of consumers), and each consumer to get a subset of the messages by getting a subset of partitions. This design gives Kafka its two biggest advanta

Re: API to query cluster metadata on-demand

2015-09-03 Thread Gwen Shapira
Ah, I wish. We are working on it :) On Thu, Sep 3, 2015 at 9:10 AM, Simon Cooper < simon.coo...@featurespace.co.uk> wrote: > Is there a basic interface in the new client APIs to get the list of > topics on a cluster, and get information on the topics (offsets, sizes, > etc), without having to de

Re: Slow ISR catch-up

2015-09-03 Thread Gwen Shapira
The test uses the old producer (we should fix that), and since you don't specify --sync, it runs async. The old async producer simply sends data and doesn't wait for acks, so it is possible that the messages were never acked... On Thu, Sep 3, 2015 at 7:56 AM, Prabhjot Bharaj wrote: > Hi Folks, >

Re: Slow ISR catch-up

2015-09-03 Thread Gwen Shapira
Yes, this should work. Expect lower throughput though. On Thu, Sep 3, 2015 at 12:52 PM, Prabhjot Bharaj wrote: > Hi, > > Can I use sync for acks = -1? > > Regards, > Prabhjot > On Sep 3, 2015 11:49 PM, "Gwen Shapira" wrote: > > > The test uses the old

Re: [VOTE] 0.8.2.2 Candidate 1

2015-09-09 Thread Gwen Shapira
+1 non-binding - verified signatures and build. On Wed, Sep 9, 2015 at 10:28 AM, Ewen Cheslack-Postava wrote: > +1 non-binding. Verified artifacts, unit tests, quick start. > > On Wed, Sep 9, 2015 at 10:09 AM, Guozhang Wang wrote: > > > +1 binding, verified unit tests and quick start. > > > > O

Re: 0.9.0.0 remaining jiras

2015-09-14 Thread Gwen Shapira
We decided to rename 0.8.3 to 0.9.0 since it contains few large changes (Security, new consumer, quotas). On Sun, Sep 13, 2015 at 11:56 PM, Jason Rosenberg wrote: > Hi Jun, > > Can you clarify, will there not be a 0.8.3.0 (and instead we move straight > to 0.9.0.0)? > > Also, can you outline t

Re: 0.9.0.0 remaining jiras

2015-09-14 Thread Gwen Shapira
Agree that these are very nice to have. We've seen many deployments that need to manage these on their own. However, if this is not ready before we are done adding security and the new consumer, it will make sense to still release 0.9.0 and add the broker management improvements in 0.9.1. I'm tryi

Re: Auto preferred leader elections cause data loss?

2015-09-14 Thread Gwen Shapira
acks = all should prevent this scenario: If broker 0 is still in ISR, the produce request for 101 will not be "acked" (because 0 is in ISR and not available for acking), and the producer will retry it until all ISR acks. If broker 0 dropped off ISR, it will not be able to rejoin until it has all

Re: Tools/recommendations to debug performance issues?

2015-09-14 Thread Gwen Shapira
Kafka also collects very useful metrics on request times and their breakdown. They are under kafka.network. On Mon, Sep 14, 2015 at 6:59 AM, Rahul Jain wrote: > Have you checked the consumer lag? You can use the offset checker tool to > see if there is a lag. > On 14 Sep 2015 18:36, "noah" wr

Re: Mapping a consumer in a consumer group to a partition in a topic

2015-09-22 Thread Gwen Shapira
Unfortunately, in order to get a specific partition, you will need to use the simple consumer API, which does not have consumer groups. see here for details: https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example On Tue, Sep 22, 2015 at 6:08 PM, Spandan Harithas Karamchedu

Re: log clean up

2015-09-25 Thread Gwen Shapira
Absolutely. You can go into config/log4j.properties and configure the appenders to roll the logs. For example: log4j.appender.stateChangeAppender=org.apache.log4j.DailyRollingFileAppender log4j.appender.stateChangeAppender.DatePattern='.'-MM-dd-HH log4j.appender.stateChangeAppender.File=${ka

Re: Frequent Consumer and Producer Disconnects

2015-09-25 Thread Gwen Shapira
How busy are the clients? The brokers occasionally close idle connections, this is normal and typically not something to worry about. However, this shouldn't happen to consumers that are actively reading data. I'm wondering if the "consumers not making any progress" could be due to a different is

Re: which producer should be used

2015-09-27 Thread Gwen Shapira
KafkaProducer is the most current and full-featured one, and it should be used. The other producers will be deprecated in a release or two, so I recommend not to use them. On Sun, Sep 27, 2015 at 8:40 PM, Li Tao wrote: > Hi there, > I noticed that there are several producers our there: > > **

Re: which producer should be used

2015-09-28 Thread Gwen Shapira
> http://kafka.apache.org/082/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html > > Is there an equivalent Java API for 0.8.2 yet or is the older one the most > current? > > -- > Sharninder > > > On Mon, Sep 28, 2015 at 9:15 AM, Gwen Shapira wrote: > >

Re: Dealing with large messages

2015-10-06 Thread Gwen Shapira
Storing large blobs in S3 or HDFS and placing URIs in Kafka is the most common solution I've seen in use. On Tue, Oct 6, 2015 at 8:32 AM, Joel Koshy wrote: > The best practice I think is to just put large objects in a blob store > and have messages embed references to those blobs. Interestingly

Re: Partition ownership with high-level consumer

2015-10-06 Thread Gwen Shapira
Zookeeper will have this information under /consumers//owners On Tue, Oct 6, 2015 at 12:22 PM, Joey Echeverria wrote: > Hi! > > Is there a way to track current partition ownership when using the > high-level consumer? It looks like the rebalance callback only tells me the > partitions I'm (pot

Re: Partition ownership with high-level consumer

2015-10-06 Thread Gwen Shapira
I don't think so. AFAIK, even the new API won't send this information to every consumer, because in some cases it can be huge. On Tue, Oct 6, 2015 at 1:44 PM, Joey Echeverria wrote: > But nothing in the API? > > -Joey > > On Tue, Oct 6, 2015 at 3:43 PM, Gwen Shapir

Re: Partition ownership with high-level consumer

2015-10-06 Thread Gwen Shapira
ent should know > that in order to acquire the zookeeper locks and could potentially execute > a callback to tell me the partitions I own after a rebalance. > > -Joey > > On Tue, Oct 6, 2015 at 4:08 PM, Gwen Shapira wrote: > > > I don't think so. AFAIK, even the new

Re: mapping events to topics

2015-10-06 Thread Gwen Shapira
I usually approach this questions by looking at possible consumers. You usually want each consumer to read from relatively few topics, use most of the messages it receives and have fairly cohesive logic for using these messages. Signs that things went wrong with too few topics: * Consumers that t

Re: Datacenter to datacenter over the open internet

2015-10-06 Thread Gwen Shapira
You can configure "advertised.host.name" for each broker, which is the name external consumers and producers will use to refer to the brokers. On Tue, Oct 6, 2015 at 3:31 PM, Tom Brown wrote: > Hello, > > How do you consume a kafka topic from a remote location without a dedicated > connection? H

Re: 0.9.0 Beta Release

2015-10-08 Thread Gwen Shapira
Hi Jarred, At the moment, we still believe that we are planning a release (without a beta, I think) in mid-Nov. On the other hand, we are all engineers, we tend to be optimistic about those things :) Gwen On Thu, Oct 8, 2015 at 10:31 AM, Jarred Ward wrote: > Greetings, > > We were really looki

Re: Partitions - Brokers - Servers

2015-10-13 Thread Gwen Shapira
Hi, We normally run 1 broker per 1 physical server, and up to around 1000 partitions per broker (although that depends on the specific machine the broker is on and specific configuration). In order to enjoy replication, we recommend a minimum of 3 brokers in the cluster, to support 3 replicas per

Re: r/apachekafka subreddit

2015-10-13 Thread Gwen Shapira
Subscribed :) Since the mailing list is rather active, I'm not sure there is significant benefit in a reddit community - but I'll be around to join discussions and see how it turns out. On Tue, Oct 13, 2015 at 1:58 PM, Andrew Pennebaker < andrew.penneba...@gmail.com> wrote: > Any Redditors in th

Re: Strange ZK Error precedes frequent rebalances

2015-10-14 Thread Gwen Shapira
It is not strange, it means that one of the consumers lost connectivity to Zookeeper, its session timed-out and this caused ephemeral ZK nodes (like /consumers/real-time-updates/ids/real-time-updates_infra- buildagent-06-1444854764478-4dd4d6af) to be removed and ultimately cause the rebalance. Wha

Re: Strange ZK Error precedes frequent rebalances

2015-10-14 Thread Gwen Shapira
e subscribed to? > > On Wed, Oct 14, 2015 at 3:52 PM Gwen Shapira wrote: > > > It is not strange, it means that one of the consumers lost connectivity > to > > Zookeeper, its session timed-out and this caused ephemeral ZK nodes (like > > /consumers/real-time-updates/i

Re: Where is replication factor stored?

2015-10-16 Thread Gwen Shapira
We don't store the replication factor per-se. When the topic is created, we use the replication factor to generate replica-assignment, and the replica assignment gets stored in ZK under: /brokers/topics//partitions/... This is what gets modified when we re-assign replicas. Hope this helps. Gwen

Re: Kafka 8.2.2 doesn't want to compress in snappy

2015-10-20 Thread Gwen Shapira
We compress a batch of messages together, but we need to give each message its own offset (and know its key if we want to use topic compaction), so messages are un-compressed and re-compressed. We are working on an improvement to add relative offsets which will allow the broker to skip this re-com

Re: Questions about .9 consumer API

2015-10-23 Thread Gwen Shapira
There are some examples that include error handling. These are to demonstrate the new and awesome seek() method. You don't have to handle errors that way, we are just showing that you can. On Thu, Oct 22, 2015 at 8:34 PM, Mohit Anchlia wrote: > It's in this link. Most of the examples have some ki

Re: Kafka and Spark Issue

2015-11-02 Thread Gwen Shapira
Since the error is from the HBase client and completely unrelated to Kafka, you will have better luck in the HBase mailing list. On Mon, Nov 2, 2015 at 9:16 AM, Nikhil Gs wrote: > Hello Team, > > My scenario is to load the data from producer topic to Hbase by using Spark > API. Our cluster is Ke

Re: Quick kafka-reassign-partitions.sh script question

2015-11-02 Thread Gwen Shapira
Actually, no. You can move partitions online. The way it works is that: 1. A new replica is created for the partition in the new broker 2. It starts replicating from the leader until it catches up - if you continue producing at this time, it will take longer to catch up. 3. Once the new replica ca

Re: Debug kafka code in intellij

2015-11-05 Thread Gwen Shapira
Running tests from intellij is fairly easy - you click on the test name and select "run" or "debug", if you select "debug" it honors breakpoints. Rad, what happens when you try to run a test within Intellij? On Thu, Nov 5, 2015 at 2:55 PM, Dong Lin wrote: > Hi Rad, > > I never use intellij to r

Re: Debug kafka code in intellij

2015-11-05 Thread Gwen Shapira
> de.linkedin.com/in/radgruchalski/ ( > > > http://de.linkedin.com/in/radgruchalski/) > > > > > > Confidentiality: > > > This communication is intended for the above-named person and may be > > > confidential and/or legally privileged. > > > I

Re: Debug kafka code in intellij

2015-11-05 Thread Gwen Shapira
7;Kafka-0.8.2.1' tests > Please configure separate output paths to proceed with the compilation. > TIP: you can use Project Artifacts to combine compiled classes if needed. > > Regards, > Prabhjot > > On Fri, Nov 6, 2015 at 10:00 AM, Gwen Shapira wrote: > > > I also

Re: [kafka-clients] Re: [VOTE] 0.9.0.0 Candiate 1

2015-11-10 Thread Gwen Shapira
BTW. I created a Jenkins job for the 0.9 branch: https://builds.apache.org/job/kafka_0.9.0_jdk7/ Right now its pretty much identical to trunk, but since they may diverge, I figured we want to keep an eye on the branch separately. Gwen On Tue, Nov 10, 2015 at 11:39 AM, Jun Rao wrote: > Ewen, >

Re: Re: Re: Re: Re: Kafka lost data issue

2015-11-12 Thread Gwen Shapira
Hi, First, here's a handy slide-deck on avoiding data loss in Kafka: http://www.slideshare.net/gwenshap/kafka-reliability-when-it-absolutely-positively-has-to-be-there Note configuration parameters like the number of retries. Also, it looks like you are sending data to Kafka asynchronously, but

Re: Kafka log retention questions

2015-11-12 Thread Gwen Shapira
See answers inline On Thu, Nov 12, 2015 at 2:53 PM, Sandhu, Dilpreet wrote: > Hi all, >I am new to Kafka usage. Here are some questions that I have in > mind. Kindly help me understand it better. If some questions make no sense > feel free to call it out. > 1. Is it possible to prune lo

Re: Kafka log retention questions

2015-11-13 Thread Gwen Shapira
e again. > > Your help is much appreciated. > Best regards, > Dilpreet > > > > On 11/13/15, 1:24 AM, "Raju Bairishetti" wrote: > > >Adding some more info inline. > > > >On Fri, Nov 13, 2015 at 10:43 AM, Gwen Shapira wrote: > > > >>

Re: Re: Re: Re: Re: Kafka lost data issue

2015-11-16 Thread Gwen Shapira
r your excellent slides >> >> I will test it again based on your suggestions. >> >> >> >> >> Best regards >> Hawin >> >> On Thu, Nov 12, 2015 at 6:35 PM, Gwen Shapira wrote: >> >> > Hi, >> > >> > First, h

<    1   2   3   4   5   6   >