Re: Logging in new clients

2014-02-03 Thread Jay Kreps
is to use java.util.logging to avoid adding an external > > dependency, > > but I'm not too sure about what's the "standard" out there, so open to > > suggestions > > on picking a different library. > > > > > > > > On Mon, Feb 3,

Re: Logging in new clients

2014-02-03 Thread Jay Kreps
> > > >> Basically my preference would be java.util.logging unless there is > > > some > > > > known problem with it, otherwise I guess slf4j, and if not that then > > > log4j. > > > > > > > > +1. My preference is to use java.util.logging to avoid ad

Config for new clients (and server)

2014-02-04 Thread Jay Kreps
We touched on this a bit in previous discussions, but I wanted to draw out the approach to config specifically as an item of discussion. The new producer and consumer use a similar key-value config approach as the existing scala clients but have different implementation code to help define these c

Re: Message latency

2014-02-04 Thread Jay Kreps
There are two definitions of latency: 1. How long before the writer gets an acknowledgement for their write. This depends on the acks setting the producer has as Guozhang says. If acks = 1 we wait just on the leader, if acks=-1 we wait on all "in sync" brokers (i.e. alive brokers). 2. How long befo
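The first definition can be pictured with a toy model. This is only a sketch, not Kafka code: the function name and the per-broker timings are made up for illustration, and it assumes the slowest broker the producer waits on determines when the acknowledgement arrives.

```python
def ack_latency_ms(acks, leader_ms, in_sync_follower_ms):
    """Toy model of how long the writer waits for an acknowledgement.

    acks=1  -> wait for the leader only
    acks=-1 -> wait for the leader and every in-sync follower
    All timings are hypothetical inputs, not measurements.
    """
    if acks == 1:
        return leader_ms
    if acks == -1:
        # The slowest in-sync broker decides when the ack can be sent.
        return max([leader_ms] + list(in_sync_follower_ms))
    raise ValueError("this sketch only models acks=1 and acks=-1")


leader = 2.0
followers = [5.0, 9.0]
print(ack_latency_ms(1, leader, followers))
print(ack_latency_ms(-1, leader, followers))
```

With acks=-1 the slowest in-sync follower sets the latency, which is why a lagging replica shows up directly in producer latency.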

Re: Surprisingly high network traffic between kafka servers

2014-02-04 Thread Jay Kreps
No this is not normal. Checking twice a second (using 500ms default) for new data shouldn't cause high network traffic (that should be like < 1KB of overhead). I don't think that explains things. Is it possible that setting has been overridden? -Jay On Tue, Feb 4, 2014 at 9:25 PM, Guozhang Wang

Re: New Producer Public API

2014-02-05 Thread Jay Kreps
hich partition exactly they go to. If that is > > true, why not just assign the same byte array as partition key with the > > default hash based partitioning in option 1.A? But again, that is based > on > > my presumption that very few users would want to really specif

Re: Surprisingly high network traffic between kafka servers

2014-02-05 Thread Jay Kreps
napshot > of jnettop from one of the servers. > > https://gist.github.com/carllerche/4f2cf0f0f6d1e891f482 > > The bottom row (89.9K/s) is the producer (it lives on a Kafka server). > The top two rows are Kafkas on other servers, you can see the combined > throughput is ~80MB/s > &

Re: Broker rejoin with big replica lag

2014-02-05 Thread Jay Kreps
Do we have the right default? -Jay On Wed, Feb 5, 2014 at 2:04 PM, Joel Koshy wrote: > > > topics are all caught up, but I have one high volume topic (around > > 40K msgs/sec) that is taking much longer. I just took a few samples > > of Replica-MaxLag to see how long it would take to catch up

Re: It would be nice to be able to read Kafka topics in reverse.

2014-02-05 Thread Jay Kreps
Technically this is possible with the existing server and protocol and could be implemented using the "low level" network client. The high-level client doesn't really allow this. This would be a good thing to think about as we start on the redesign of that client. I don't think it has to be terribl

Re: Config for new clients (and server)

2014-02-05 Thread Jay Kreps
s, > Neha > > > On Wed, Feb 5, 2014 at 10:10 AM, Guozhang Wang wrote: > > > +1 for the key-value approach. > > > > Guozhang > > > > > > On Tue, Feb 4, 2014 at 9:34 AM, Jay Kreps wrote: > > > > > We touched on this a bit in previous d

Re: Config for new clients (and server)

2014-02-06 Thread Jay Kreps
Joel, Ah, I actually don't think the internal usage is a problem for *us*. We just use config in one place, whereas it gets set in 1000s of apps, so I am implicitly optimizing for the application interface. I agree that we can add getters and setters on the ProducerConfig if we like. Basically I w

Re: New Producer Public API

2014-02-06 Thread Jay Kreps
sily aggregate many of these messages to reduce the message count. > If there are messages that store counts, you could aggregate these into a > single message and then send to kafka. > > Thoughts? > > > > On Wed, Feb 5, 2014 at 2:03 PM, Jay Kreps wrote: > > >

Re: Building a producer/consumer supporting exactly-once messaging

2014-02-10 Thread Jay Kreps
The out-of-the-box support for this in Kafka isn't great right now. Exactly once semantics has two parts: avoiding duplication during data production and avoiding duplicates during data consumption. There are two approaches to getting exactly once semantics during data production. 1. Use a singl

Re: Building a producer/consumer supporting exactly-once messaging

2014-02-10 Thread Jay Kreps
Ack, nice, should have thought of doing that... -Jay On Mon, Feb 10, 2014 at 10:12 AM, Neha Narkhede wrote: > Added this to our FAQ - > > https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowdoIgetexactlyonemessagingfromKafka > ? > > > > On Mon, Feb 10, 20

Re: Config for new clients (and server)

2014-02-10 Thread Jay Kreps
add similar support in the new > config. > > Thanks, > > Jun > > > On Tue, Feb 4, 2014 at 9:34 AM, Jay Kreps wrote: > > > We touched on this a bit in previous discussions, but I wanted to draw > out > > the approach to config specifically as an item of d

Re: New Consumer API discussion

2014-02-10 Thread Jay Kreps
A few items: 1. ConsumerRebalanceCallback a. onPartitionsRevoked would be a better name. b. We should discuss the possibility of splitting this into two interfaces. The motivation would be that in Java 8 single method interfaces can directly take methods which might be more intuitive. c. I

Re: Config for new clients (and server)

2014-02-10 Thread Jay Kreps
one comment. Currently, when initiating a > > config > > > (e.g. ProducerConfig), we log those overridden property values and > unused > > > property keys (likely due to mis-spelling). This has been very useful > for > > > config verification. It would be goo

Re: New Consumer API discussion

2014-02-11 Thread Jay Kreps
y identifies a partition of a topic > > > > Thanks, > > Neha > > > > > > On Mon, Feb 10, 2014 at 12:36 PM, Pradeep Gollakota < > pradeep...@gmail.com > > >wrote: > > > > > Couple of very quick thoughts. > > > > > > 1. +1 about

Re: New Consumer API discussion

2014-02-11 Thread Jay Kreps
ten a message. I'm not sure if that covers all cases or not. > About use cases: great point. I will add some more examples of using the > API functions in the wiki pages. > > Guozhang > > > > > On Mon, Feb 10, 2014 at 12:20 PM, Jay Kreps wrote: > > > A few items:

Re: New Consumer API discussion

2014-02-12 Thread Jay Kreps
> > > > List consumers = new ArrayList(); > > List topics = new ArrayList > // populate topics > > assert(consumers.size == topics.size); > > > > for (int i = 0; i < numThreads; i++) { > > MyConsumer c = new MyConsumer(); > > c.subscribe(to

Re: New Consumer API discussion

2014-02-13 Thread Jay Kreps
Hey guys, One thing that bugs me is the lack of symmetry for the different position calls. The way I see it there are two positions we maintain: the fetch position and the last commit position. There are two things you can do to these positions: get the current value or change the current value.

Re: New Consumer API discussion

2014-02-13 Thread Jay Kreps
picOffsetPosition... p) >Map committed(TopicPartition... tp) >void commit(TopicOffsetPosition...) > > These are definitely more clunky than the non batched ones though. > > Thanks, > Neha > > > > On Thu, Feb 13, 2014 at 1:24 PM, Jay Kreps wrote: > > >

Re: New Consumer API discussion

2014-02-14 Thread Jay Kreps
ile APIs. > > I also think your suggestion about ConsumerPosition makes sense. > > Thanks, > Neha > On Feb 13, 2014 9:22 PM, "Jay Kreps" wrote: > > > Hey Neha, > > > > I actually wasn't proposing the name TopicOffsetPosition, that was just a >

Re: Surprisingly high network traffic between kafka servers

2014-02-14 Thread Jay Kreps
Yeah that is a bug. We should be giving an error here rather than retrying. -Jay On Fri, Feb 14, 2014 at 7:54 AM, Jun Rao wrote: > Hi, Zhong, > > Thanks for sharing this. We probably should add a sanity check in the > broker to make sure that replica.fetch.max.bytes >= message.max.bytes. > Cou
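The sanity check suggested in the quoted message could look roughly like the sketch below. The two property names come from the thread; the function name and the idea of passing the broker config as a plain dict are assumptions made for illustration, not how the broker actually validates its config.

```python
def check_replica_fetch_size(config):
    """Return an error message if a replica could never fetch the largest
    message the broker accepts, otherwise None.

    `config` is a hypothetical dict of broker properties (string values).
    """
    message_max = int(config["message.max.bytes"])
    replica_fetch_max = int(config["replica.fetch.max.bytes"])
    if replica_fetch_max < message_max:
        return (
            "replica.fetch.max.bytes (%d) must be >= message.max.bytes (%d)"
            % (replica_fetch_max, message_max)
        )
    return None


bad = {"message.max.bytes": "2000000", "replica.fetch.max.bytes": "1048576"}
print(check_replica_fetch_size(bad))
```

Failing fast at startup with a message like this is the alternative to a replica retrying a fetch that can never succeed.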

Re: New Consumer API discussion

2014-02-16 Thread Jay Kreps
ted(TopicPartition tp); >void commit(TopicPartitionOffset...); > > Thanks, > Neha > > On Friday, February 14, 2014, Jay Kreps wrote: > > > Oh, interesting. So I am assuming the following implementation: > > 1. We have an in-memory fetch position which controls the n

Re: trunk unit test failure

2014-02-18 Thread Jay Kreps
Why do we need to create that gradle file? Can't we ship a default for that? Also we should really comment out the @Test annotation on the failing test. I think we checked that in so I had a case to fix against but I think it is a bit disruptive. -Jay On Mon, Feb 17, 2014 at 7:41 PM, Guozhang W

Re: New Consumer API discussion

2014-02-21 Thread Jay Kreps
sages that won't work. -Jay On Fri, Feb 21, 2014 at 4:56 PM, Jun Rao wrote: > What's the use case of position()? Isn't that just the nextOffset() on the > last message returned from poll()? > > Thanks, > > Jun > > > On Sun, Feb 16, 2014 at 9:12 AM, Ja

Re: New Consumer API discussion

2014-02-22 Thread Jay Kreps
you are trying to add transactional features, then formally define a > DTP capability and pull in other server frameworks to share the > implementation. Should it be XA/Open? How about a new peer2peer DTP > protocol? > > > > Thank you, > > Robert > > > > Ro

Re: New Consumer API discussion

2014-02-25 Thread Jay Kreps
th those methods. In the main loop of our new consumer > >> api, you can just call those methods based on the events you get. > >> > >> Also, we already have an api to get the first and the last offset of a > >> partition (getOffsetBefore). > >> &

Re: New Consumer API discussion

2014-02-25 Thread Jay Kreps
set is committed on a partition, when > a > >>>> leader changes and so on? I call this OOB traffic, since they are not > >>>> the > >>>> core messages streaming, but side-band events, yet they are still > >>>> potentially useful to consu

Usability

2014-03-04 Thread Jay Kreps
Hey guys, It would be good to tag any JIRA for something which is confusing or annoying with the "usability" tag. I am trying to get a list of all these together so we can take a whack at some of them in a coordinated way. -Jay

Documentation for the upcoming 0.8.1 release

2014-03-05 Thread Jay Kreps
Hey guys, I took a stab at updating the docs for the 0.8.1 release. In particular, I added a section on log compaction: http://kafka.apache.org/081/documentation.html#compaction I also updated the configs. This is all under 081, I will flip this over to be the main documentation when 0.8.1 is rel

Re: Documentation for the upcoming 0.8.1 release

2014-03-06 Thread Jay Kreps
zing executed action, and another one to "state change" > topic which ultimately is good for state replication and can be compacted > after period of time? This state change messages should maybe even contain > reference to event message to be able to see the cause of state change..

Re: Anouncing Kafka Offset Monitor 0.1

2014-03-07 Thread Jay Kreps
This is really useful! I added it to the ecosystem page: https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem -Jay On Fri, Mar 7, 2014 at 10:49 AM, Pierre Andrews wrote: > Hello everyone, > > at Quantifind, we are big users of Kafka and we like it a lot! > In a few use cases, we had to f

Re: Question on stability of new shiny KafkaProducer

2014-03-10 Thread Jay Kreps
I would second that. If you are a little bit risk tolerant, though, we would certainly appreciate the additional usage and since we are actively doing QA on it would want to fix any issues you might find. -Jay On Mon, Mar 10, 2014 at 3:23 PM, Neha Narkhede wrote: > The new producer is running o

Re: kafka-websocket on github

2014-03-11 Thread Jay Kreps
Very cool. If I understand correctly this is a kind of proxy that would connect web browsers to Kafka? Any information you could give on the use cases this is for? -Jay On Mon, Mar 10, 2014 at 11:02 AM, Benjamin Black wrote: > I put this up over the weekend, thought it might be useful to folks

Re: kafka-websocket on github

2014-03-11 Thread Jay Kreps
Yeah that is very cool. I added it to the ecosystem page: https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem -Jay On Tue, Mar 11, 2014 at 10:48 AM, Benjamin Black wrote: > exactly. i'm using it for streaming query output to dashboards. > > > On Tue, Mar 11, 2014

Re: [ANNOUNCEMENT] Apache Kafka 0.8.1 Released

2014-03-12 Thread Jay Kreps
That is a good point Michael. It looks like Joe fixed the typo. I added an explanation of the scala versioning. -Jay On Wed, Mar 12, 2014 at 2:17 PM, Michael G. Noll wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > Many thanks to everyone involved in the release! > > Please let me s

What's new in Kafka 0.8.1?

2014-03-12 Thread Jay Kreps
Hi guys, I wrote up a quick blog post on the new stuff in 0.8.1: http://blog.empathybox.com/post/79427855885/whats-new-in-kafka-0-8-1 This provides a little more human readable version of the release notes. Let me know if I missed anything. -Jay

Re: What's new in Kafka 0.8.1?

2014-03-13 Thread Jay Kreps
operations ( > http://localhost/documentation.html#basic_ops). It's in the paragraph > that > starts with "We also improved a lot of operational activities...". > > Roger > > > On Wed, Mar 12, 2014 at 8:57 PM, Jay Kreps wrote: > > > Hi guys, > >

Re: Documentation for metadata.broker.list

2014-03-17 Thread Jay Kreps
Currently metadata.broker.list is a separate list, unrelated to the cluster, which is used for all metadata requests. We are actually just now discussing whether this makes sense or not with respect to the new producer and consumer we are working on. I actually misunderstood how the existing produ

Re: Impact of slow consumers

2014-03-18 Thread Jay Kreps
In general this is expected. Consumers generally read from the OS cache and so they do no I/O. However a slow consumer may fall out of this cached portion of the log and do real reads. These reads will compete with writes for disk bandwidth. However what is puzzling is that you mention that other

Re: 0.8.1 stability

2014-03-18 Thread Jay Kreps
+1 -Jay On Tue, Mar 18, 2014 at 5:56 PM, Neha Narkhede wrote: > Thanks for giving the 0.8.1 release a spin! A few people have reported bugs > with delete topic and > also the automatic leader > rebalancing

Re: Requesting advice about Producer & Consumer Design

2014-03-19 Thread Jay Kreps
Hey Krishna, Let me clarify the current state of things a little: 1. Kafka offers a single producer interface as well as two consumer interfaces: the low-level "simple consumer" which just directly makes network requests, and the higher level interface which handles fault-tolerance, partition assi

Re: notes from parsing the protocol in erlang + 1 wiki feedback

2014-03-21 Thread Jay Kreps
Hey Bosky, This is super helpful, I'll make those changes on the wiki. -Jay On Fri, Mar 21, 2014 at 12:01 PM, Bosky wrote: > Hello all, > > I've enjoyed reading the docs, and have feedback that might save other > client contributors from hours of staring at binary data and tcpdump. > > 1) Par

Re: Separate broker replication traffic from producer/consumer traffic

2014-03-25 Thread Jay Kreps
No not at the moment. Are you seeing a problem that this would resolve? -Jay On Tue, Mar 25, 2014 at 2:55 PM, Otto Mok wrote: > Hi all, > > Is there any way to configure the brokers such that producers & consumers > are talking via IP1, while the brokers are replicating between themselves > us

Re: Separate broker replication traffic from producer/consumer traffic

2014-03-26 Thread Jay Kreps
dwidth usage by 33%, down to (4n). > Or 50% more capacity for producers to push before hitting the NIC's cap (1 > Gbps) > > We're not quite at the cap yet, but would like to see if we can make use > of the second NIC to give us more room in the primary NIC. > > Thanks

Re: Cluster design distribution and JBOD vs RAID

2014-04-18 Thread Jay Kreps
If you lose one drive in a JBOD setup you will just re-replicate the data on that disk. It is similar to what you would do during RAID repair except that instead of having the data coming 100% from the mirror drives the load will be spread over the rest of the cluster. The real downside of RAID is

Re: Cluster design distribution and JBOD vs RAID

2014-04-19 Thread Jay Kreps
I think we are saying the same thing. If any drive fails the broker is down. But when the drive is repaired only the data on the destroyed drive will need to be restored from replicas. -Jay On Fri, Apr 18, 2014 at 3:24 PM, Maxime Brugidou wrote: > Are you sure about that? Our latest tests show

Re: Topic Reassignment Tool Improvements

2014-04-23 Thread Jay Kreps
I don't think we are doing anything at the moment to improve this, but I think we agree it could be improved. We would welcome a contribution here. The best way to proceed would just be to write up a wiki or JIRA on how you think it should work and kick off a discussion. -Jay On Wed, Apr 23, 201

Re: 0.8.1 cpu usage

2014-04-30 Thread Jay Kreps
Hm, this seems counter-intuitive. Running 3x per second vs 2x per second should not really register on a modern CPU, right? Can you try this and see if you notice any difference? On Wed, Apr 30, 2014 at 8:10 AM, Libo Yu wrote: > Hi team, > > We have noticed that the cpu usage of 0.8.1 has more

Re: Kafka Mock

2014-04-30 Thread Jay Kreps
With the new Java Kafka clients we are shipping a mock producer and consumer. E.g. https://github.com/apache/kafka/blob/0.8.1/clients/src/main/java/org/apache/kafka/clients/producer/MockProducer.java I suppose that doesn't help you now unless you are using the new producer already, but at least we

Re: 0.8.1 Java Producer API Callbacks

2014-05-02 Thread Jay Kreps
This summary is correct. If you are just starting development now it is probably reasonable to start with the new producer. We would certainly appreciate any feedback on it. My recommendation would be to build the producer off trunk as there were a few bug fixes since 0.8.1.x that are worth gettin

Re: 0.7.1 Will there be any log info under poor network which packet may lose?

2014-05-05 Thread Jay Kreps
TCP will attempt to resend until the packets are successfully delivered or a timeout occurs. So a packet loss should not lead to an error. The error you see in your log is the queue of unsent events in the client backing up to the point it hits its configured memory limit and then dropping events w
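The failure mode described here, a client-side queue that fills up while the network is slow and then starts dropping events, can be sketched as follows. The class and its fields are hypothetical and are not the 0.7.1 producer implementation; the sketch only shows why the error comes from the full queue rather than from packet loss itself.

```python
from collections import deque


class BoundedSendBuffer:
    """Hypothetical stand-in for a producer's queue of unsent events."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self.queue = deque()
        self.dropped = 0

    def enqueue(self, payload):
        # When the network is slow the queue is not drained, so new
        # events eventually no longer fit and are dropped.
        if self.used_bytes + len(payload) > self.max_bytes:
            self.dropped += 1
            return False
        self.queue.append(payload)
        self.used_bytes += len(payload)
        return True

    def drain_one(self):
        # Called when an event has actually been sent to the broker.
        payload = self.queue.popleft()
        self.used_bytes -= len(payload)
        return payload


buf = BoundedSendBuffer(max_bytes=10)
print(buf.enqueue(b"12345"), buf.enqueue(b"12345"), buf.enqueue(b"1"))
```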

Re: JAVA HEAP settings for KAFKA in production

2014-05-06 Thread Jay Kreps
Hey Todd, Doc patch? :-) svn co http://svn.apache.org/repos/asf/kafka/site/081/ Don't stress about html or formatting, I'm happy to do that part. I would love to give people more authoritative advice. Right now everything is a bit obsolete and wrong. -Jay On Mon, May 5, 2014 at 10:36 PM, Todd

Re: 100 MB messages

2014-05-13 Thread Jay Kreps
It can, but it will not perform very well. Kafka fully instantiates messages in memory (as a byte[] basically) so if you send a 100MB message the server will do a 100MB allocation to hold that data prior to writing to disk. I think MongoDB does have blob support so passing a pointer via Kafka as y

Re: keyed-messages & de-duplication

2014-05-13 Thread Jay Kreps
Hi, The compaction is done to clean up space. It isn't done immediately, only periodically. I suspect the reason you see no compaction is that we never compact the active segment of the log (the most recent file) as that is still being written to. The compaction would not happen until a new segmen
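A simplified model of the point about the active segment, for illustration only. It assumes a log is a list of segments, each a list of (offset, key, value) records, with the last segment being the active one; the real log cleaner is considerably more involved.

```python
def compact(segments):
    """Keep only the latest record per key in the closed segments.

    The last segment is treated as active and returned untouched,
    so duplicates inside it survive until a new segment is rolled.
    """
    if not segments:
        return []
    closed, active = segments[:-1], segments[-1]

    # Offset of the most recent record for each key among closed segments.
    latest = {}
    for segment in closed:
        for offset, key, _value in segment:
            latest[key] = offset

    cleaned = [
        [record for record in segment if latest[record[1]] == record[0]]
        for segment in closed
    ]
    return cleaned + [active]


log = [
    [(0, "a", 1), (1, "b", 1)],   # closed
    [(2, "a", 2)],                # closed
    [(3, "a", 3), (4, "a", 4)],   # active: never compacted
]
print(compact(log))
```

A log with a single segment comes back unchanged, which matches the situation in the thread: nothing is compacted until a new segment exists.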

Re: New consumer APIs

2014-05-16 Thread Jay Kreps
Hey Eric, Yeah this is more similar to what we currently have but with a richer api than a simple Iterator. I think the question is how the poll() on the various streams translates into the ultimate poll that we need to do against the individual socket connections. Some of the things that make t

Re: [DISCUSS] Kafka Security Specific Features

2014-06-04 Thread Jay Kreps
Hey Joe, Thanks for kicking this discussion off! I totally agree that for something that acts as a central message broker security is a critical feature. I think a number of people have been interested in this topic and several people have put effort into special purpose security efforts. Since mos

Re: [DISCUSS] Kafka Security Specific Features

2014-06-05 Thread Jay Kreps
e Stein wrote: > I like the idea of working on the spec and prioritizing. I will update the > wiki. > > - Joestein > > > On Wed, Jun 4, 2014 at 1:11 PM, Jay Kreps wrote: > > > Hey Joe, > > > > Thanks for kicking this discussion off! I totally agree that fo

Re: [DISCUSS] Kafka Security Specific Features

2014-06-05 Thread Jay Kreps
tricting access to it > (both modification and visibility) is a critical component. > > -Todd > > > On 6/5/14, 2:01 PM, "Jay Kreps" wrote: > > >Hey Joe, > > > >I don't really understand the sections you added to the wiki. Can you > >cla

Re: kafka producer, one per web app?

2014-06-16 Thread Jay Kreps
Yes, the producer is thread safe, and sharing instances will be more efficient if you are producing in async mode. -Jay On Mon, Jun 16, 2014 at 9:12 AM, S Ahmed wrote: > In my web application, I should be creating a single instance of a producer > correct? > > So in scala I should be doing som

Re: Broken download link?

2014-06-16 Thread Jay Kreps
Ack! Thanks for pointing that out. Should be fixed now. -Jay On Mon, Jun 16, 2014 at 11:22 AM, William Borg Barthet < w.borgbart...@onehippo.com> wrote: > Hi there, > > as I was excitedly reading through the introductory documentation, I came > across a download link [1] in the quickstart sectio

Re: Kafka latency measures

2014-06-19 Thread Jay Kreps
There were actually several patches against trunk since 0.8.1.1 that may impact latency however, especially when using acks=-1. So those results in the blog may be a bit better than what you would see in 0.8.1.1. -Jay On Wed, Jun 18, 2014 at 7:58 PM, Supun Kamburugamuva wrote: > My machine con

Re: How does number of partitions affect sequential disk IO

2014-06-24 Thread Jay Kreps
The primary relevant factor here is the fsync interval. Kafka's replication guarantees do not require fsyncing every message, so the reason for doing so is to handle correlated power loss (a pretty uncommon failure in a real data center). Replication will handle most other failure modes with much m

Re: Scalability question?

2014-06-26 Thread Jay Kreps
I think currently we do a little over 200 billion events per day at LinkedIn, though we are not actually the largest Kafka user any more. On the whole scaling the volume of messages is actually not that hard in Kafka. Data is partitioned, and partitions don't really communicate with each other, so

Re: Apache Kafka NYC Users Group!

2014-06-27 Thread Jay Kreps
This is great! -Jay On Thu, Jun 26, 2014 at 5:47 PM, Joe Stein wrote: > Hi folks, I just started a new Meetup specifically for Apache Kafka in NYC > (everyone is welcome of course) http://www.meetup.com/Apache-Kafka-NYC/ > > For the last couple of years we have been piggy backing talks and the >

Re: Intercept broker operation in Kafka

2014-06-27 Thread Jay Kreps
Hey Ravi, I think what you want is available via log4j and jmx. Log4j is pluggable: you can plug in any Java code you want at runtime to handle the log events. JMX can be called in any way you like too. -Jay On Mon, Jun 23, 2014 at 11:51 PM, ravi singh wrote: > Primarily we want to log below dat

Re: performance test on github failing

2014-07-03 Thread Jay Kreps
Yes, the code in the 0.8.1 release just took those parameters listed in the usage. In trunk it will take arbitrary properties and pass them on as config to the producer. This lets you control more of the tunable configurations. -Jay On Thu, Jul 3, 2014 at 1:00 PM, Guozhang Wang wrote: > Bredan,

Re: request.required.acks=-1 under high data volume

2014-07-11 Thread Jay Kreps
I think the root problem is that replicas are falling behind and hence are effectively "failed" under normal load and also that you have unclean leader election enabled which "solves" this catastrophic failure by electing new leaders without complete data. Starting in 0.8.2 you will be able to sel

Interested in contributing to Kafka?

2014-07-16 Thread Jay Kreps
Hey All, A number of people have been submitting really nice patches recently. If you are interested in contributing and are looking for something to work on, or if you are contributing and are interested in ramping up to be a committer on the project, please let us know--we are happy to help you

Re: Interested in contributing to Kafka?

2014-07-17 Thread Jay Kreps
.twitter.com/allthingshadoop> >> > / >> > >> > >> > On Wed, Jul 16, 2014 at 8:39 PM, hsy...@gmail.com >> > wrote: >> > >> > > Is there a scala API doc for the entire kafka library? >

Re: Kafka 0.8.1.1 / Can't read latest message

2014-07-17 Thread Jay Kreps
This makes sense if you think about it. If you want to start "now" you don't want the last message in the log, which could be seconds, minutes, days, or weeks old, you actually want the next message that comes in. -Jay On Thu, Jul 17, 2014 at 10:00 AM, Guozhang Wang wrote: > Hello Tanguy, > > Wi
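A small model of why starting from "latest" returns nothing until a new message arrives. The `Log` class is a made-up stand-in for a partition and does not use the Kafka client API.

```python
class Log:
    """Hypothetical append-only log where offsets are list indexes."""

    def __init__(self):
        self.messages = []

    def append(self, message):
        self.messages.append(message)
        return len(self.messages) - 1  # offset assigned to this message

    def latest_offset(self):
        # "Latest" is the offset the NEXT message will get,
        # not the offset of the last message already written.
        return len(self.messages)

    def read_from(self, offset):
        return self.messages[offset:]


log = Log()
log.append("written last week")
log.append("written yesterday")

start = log.latest_offset()
print(log.read_from(start))   # nothing yet: both messages are older than "now"
log.append("written just now")
print(log.read_from(start))
```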

Re: Interested in contributing to Kafka?

2014-07-18 Thread Jay Kreps
many Manager classes, so many...). I find Java to have this super high > type-to-thought ratio. Would you guys have anything to say about Scala > compared to Java? How has your experience been with coding in it, and > building large systems with it? > > > Philip > > >

Improving the Kafka client ecosystem

2014-07-18 Thread Jay Kreps
A question was asked in another thread about what was an effective way to contribute to the Kafka project for people who weren't very enthusiastic about writing Java/Scala code. I wanted to kind of advocate for an area I think is really important and not as good as it could be--the client ecosyste

Re: Interested in contributing to Kafka?

2014-07-18 Thread Jay Kreps
; and more Go recently, and would like to write more. So working with Kafka > (tools, clients, monitoring), but coding in Go, would be appealing. Others > will have other preferences, I am sure. > > > Philip > > > --- > www.philipotoole.com > >

Re: Improving the Kafka client ecosystem

2014-07-18 Thread Jay Kreps
? > > Thanks, > > Jun > > > On Fri, Jul 18, 2014 at 2:34 PM, Jay Kreps wrote: > >> A question was asked in another thread about what was an effective way >> to contribute to the Kafka project for people who weren't very >> enthusiastic about writing

Re: Improving the Kafka client ecosystem

2014-07-19 Thread Jay Kreps
ation > -- it would not really be about coding, but system design. Ideally it would > be reviewed by you committers, but someone else would do the work. > > Philip > > > ------- > www.philipotoole.com > > > > On Friday, July 18, 2014 3:58 PM,

Re: Switch Apache logo URL to HTTPS

2014-07-20 Thread Jay Kreps
Fixed. Thanks for pointing that out. -Jay On Sat, Jul 19, 2014 at 8:48 AM, Marcin Zajączkowski wrote: > Hi, > > Reading Kafka webpage/documentation through HTTPS (e.g. > https://kafka.apache.org/documentation.html) I spotted that it generates > warning (in Firefox and Chromium) about a reference

Re: New Consumer Design

2014-07-21 Thread Jay Kreps
This thread is a bit long, but let me see if I can restate it correctly (not sure I fully follow). There are two suggestions: 1. Allow partial rebalances that move just some partitions. I.e. if a consumer fails and has only one partition only one other consumer should be affected (the one who pick

Re: New Consumer Design

2014-07-22 Thread Jay Kreps
inator, which would then inform the new consumer and > ISR brokers about the partition gain. Broker security would be the master > record of love assignments. > > Thanks, > Rob > >> On Jul 21, 2014, at 6:10 PM, Jay Kreps wrote: >> >> This thread is a bit long,

Re: much reduced io utilization after upgrade to 0.8.0 -> 0.8.1.1

2014-07-23 Thread Jay Kreps
Yes, it could definitely be related to KAFKA-615. The default in 0.8.1 is to let the OS handle disk writes. This is much more efficient as it will schedule them in an order friendly to the layout on disk and do a good job of merging adjacent writes. However if you are explicitly configuring an fsyn

Re: Kafka on yarn

2014-07-23 Thread Jay Kreps
Hey Kam, It would be nice to have a way to get a failed node back with its original data, but this isn't strictly necessary, it is just a good optimization. As long as you run with replication you can restart a broker elsewhere with no data, and it will restore its state off the other replicas.

Re: Kafka on yarn

2014-07-23 Thread Jay Kreps
ting from > this at the same time kafka's appeal IMO is its simplicity. Spark has chosen > to include YARN within its distribution, not sure what the kafka team thinks. > > > > On Wednesday, July 23, 2014 4:19 PM, Jay Kreps wrote: > > > > Hey Kam, > &g

Re: kafka support in collectd and syslog-ng

2014-07-25 Thread Jay Kreps
This is great! -Jay On Fri, Jul 25, 2014 at 8:55 AM, Pierre-Yves Ritschard wrote: > Hi list, > > Just a quick note to let you know that kafka support has now been merged in > collectd, which means that system and application metrics can directly be > produced on a topic from the collectd daemon

Re: undesirable log retention behavior

2014-07-31 Thread Jay Kreps
This is a real problem you describe. Unfortunately adding the timestamp to the file name won't help the case you describe as replicas don't directly interact with files they just fetch messages by offset so there is really no clean way for them to get modification times from the source broker. I t

Re: Request: Adding us to the "Powered By" list

2014-08-01 Thread Jay Kreps
Sure, can you give me the blurb you want? -Jay On Fri, Aug 1, 2014 at 6:58 AM, Vitaliy Verbenko wrote: > Dear Kafka team, > > Would you mind add us @ > https://cwiki.apache.org/confluence/display/KAFKA/Powered+By ? > We're using it as part of our ticket sequencing system for our helpdesk > softw

Re: Writing to Kafka

2014-08-14 Thread Jay Kreps
Hi Telles, 1. One broker is probably fine with the load, though if you want replication for fault tolerance you will need more than one. 2. The host/port you configure is just to discover the full cluster topology. Messages will be partitioned semantically and balanced over all hosts that have par

Re: Improving the Kafka client ecosystem

2014-08-19 Thread Jay Kreps
iling list ever get created? Was there consensus that it did or >> didn't need created? >> >> -Mark >> >> > On Jul 18, 2014, at 14:34, Jay Kreps wrote: >> > >> > A question was asked in another thread about what was an effective way >

Re: OutOfMemoryError during index mapping

2014-08-24 Thread Jay Kreps
Is it possible that this is a 32 bit machine and you have more than 2gb of index? -jay On Sunday, August 24, 2014, pavel.zalu...@sematext.com < pavel.zalu...@sematext.com> wrote: > Hi, > > Sometimes we get: > > Caused by: java.lang.OutOfMemoryError: Map failed > at sun.nio.ch.FileChannel

Re: Handling send failures with async producer

2014-08-26 Thread Jay Kreps
Also, Jonathan, to answer your question, the new producer on trunk is running in prod for some use cases at LinkedIn and can be used with any 0.8.x. version. -Jay On Tue, Aug 26, 2014 at 12:38 PM, Jonathan Weeks wrote: > I am interested in this very topic as well. Also, can the trunk version of

Re: Consumer sensitive expiration of topic

2014-08-28 Thread Jay Kreps
Hey Dominique, What you describe makes sense, and it would certainly be possible for the broker to more aggressively discard data once it sees that the consumer has read it once. The reason we haven't really taken that as a priority is because modern drives are so large relative to their throughp

Re: Trunk backwards compatibility (producer / consumer client questions)

2014-08-29 Thread Jay Kreps
; trunk that is backwards compatible with an existing 0.8.1.1 broker cluster? > If not 0.8.1.1, will the consumer code on trunk work with a 0.8.2 broker > cluster when 0.8.2 is released? > > (Our code is scala, BTW) > > Best Regards, > > -Jonathan > > > On Aug 26,

Re: experience getting 0.8 perf scripts running

2014-09-01 Thread Jay Kreps
Hey Bill, Thanks for this very useful list. I think part of the problem is this. We have started versioning our documentation with each release. We do this to ensure people on older releases can still see the documentation. In the latest release we migrated a bunch of stuff off the wiki to help i

Re: Need Document and Explanation Of New Metrics Name in New Java Producer on Kafka Trunk

2014-09-09 Thread Jay Kreps
Hi Bhavesh, Each of those JMX attributes comes with documentation. If you open up jconsole and attach to a jvm running the consumer you should be able to read the descriptions for each attribute. -Jay On Tue, Sep 9, 2014 at 2:07 PM, Bhavesh Mistry wrote: > Kafka Team, > > Can you please let me
