I've opened KAFKA-639 to track this feature.

On Nov 28, 2012, at 3:32 PM, David Arthur wrote:
> Are you referring to the HTTP side of things? If not, I'm not sure exactly
> what you mean by threaded I/O in this context.
>
> -David
>
> On Nov 21, 2012, at 10:54 AM, Taylor Gautier wrote:
>
>> It would make sense to use nio rather than threaded io.
>>
>> On Nov 20, 2012, at 2:06 PM, David Arthur <mum...@gmail.com> wrote:
>>
>>> BTW, here are some cURL calls from my test environment:
>>>
>>> https://gist.github.com/e59b9c8ee4ae56dad44f
>>>
>>> On Nov 20, 2012, at 4:08 PM, David Arthur wrote:
>>>
>>>> Another bump for this thread...
>>>>
>>>> For those just joining, this prototype is a simple HTTP server that
>>>> proxies the complex consumer code through two HTTP endpoints.
>>>>
>>>> https://github.com/mumrah/kafka/blob/rest/contrib/rest-proxy/src/main/scala/RESTServer.scala
>>>>
>>>> E.g.,
>>>>
>>>> curl http://localhost:8888/my-topic -X POST -d 'Here is a message'
>>>>
>>>> and
>>>>
>>>> curl http://localhost:8888/my-topic/my-group -X GET
>>>>
>>>> This is not an attempt to expose the FetchRequest/ProduceRequest protocol
>>>> over HTTP.
>>>>
>>>> A few questions:
>>>>
>>>> * Would including offsets be useful here? Since it is utilizing the
>>>>   ZK-backed consumer code, I would think not.
>>>> * I have chosen to create one thread per topic+group (mostly for
>>>>   simplicity's sake). Multiple REST servers could be run and load balanced
>>>>   across to increase the consumer parallelism. Maybe it would make sense for
>>>>   an individual REST server to create more than one thread per topic+group?
>>>>
>>>> Cheers
>>>> -David
>>>>
>>>> On Sep 10, 2012, at 9:49 AM, David Arthur wrote:
>>>>
>>>>> Bump.
>>>>>
>>>>> Anyone have feedback on this approach?
>>>>>
>>>>> -David
>>>>>
>>>>> On Aug 24, 2012, at 12:37 PM, David Arthur wrote:
>>>>>
>>>>>> Here is an initial pass at a Kafka REST proxy (in Scala)
>>>>>>
>>>>>> https://github.com/mumrah/kafka/blob/rest/contrib/rest-proxy/src/main/scala/RESTServer.scala
>>>>>>
>>>>>> The basic gist is:
>>>>>> * Jetty for webserver
>>>>>> * Messages are strings
>>>>>> * GET /topic/group to get a message (timeout after 1s)
>>>>>> * POST /topic, the request body is the message
>>>>>> * One consumer thread per topic+group
>>>>>>
>>>>>> Be wary, many things are hard coded at this point (port numbers, etc).
>>>>>> Obviously, this will need to change. Also, I haven't the slightest idea
>>>>>> how to set up/use sbt properly, so I just checked in the libs.
>>>>>>
>>>>>> Feedback is welcome in this thread or on Github. Be gentle please, this
>>>>>> is my first go at Scala.
>>>>>>
>>>>>> -David
>>>>>>
>>>>>> On Aug 12, 2012, at 10:39 AM, Taylor Gautier wrote:
>>>>>>
>>>>>>> Jay I agree with you 100%.
>>>>>>>
>>>>>>> At Tagged we have implemented a proxy for various internal reasons
>>>>>>> (primarily to act as a high performance relay from PHP to Kafka). It's
>>>>>>> implemented in Node.js (JavaScript).
>>>>>>>
>>>>>>> Currently it services UDP packets encoded in binary but it could
>>>>>>> easily be modified to accept http also since Node support for http is
>>>>>>> pretty simple.
>>>>>>>
>>>>>>> If others are interested in maintaining something like this we could
>>>>>>> consider adding this to the public domain alongside the already
>>>>>>> existing Node.js client implementation.
>>>>>>>
>>>>>>> On Aug 10, 2012, at 3:51 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>>>>>>
>>>>>>>> My personal preference would be to have only a single protocol in kafka
>>>>>>>> core.
>>>>>>>> I have been down the multiple protocol route and my experience was
>>>>>>>> that it adds a lot of burden for each change that needs to be made and a
>>>>>>>> lot of complexity to abstract over the different protocols. From the point
>>>>>>>> of view of a user they are generally a bit agnostic as to how bytes are
>>>>>>>> sent back and forth provided it is reliable and easily implementable in any
>>>>>>>> language. Generally they care more about the quality of the client in their
>>>>>>>> language of choice.
>>>>>>>>
>>>>>>>> My belief is that the main benefit of REST is ease of implementing a
>>>>>>>> client. But currently the biggest barrier is really the use of zk and
>>>>>>>> fairly thick consumer design. So I think the current thinking is that we
>>>>>>>> should focus on thinning that out and removing the client-side zk
>>>>>>>> dependency. I actually don't think TCP is a huge burden if the protocol is
>>>>>>>> simple, and there are actually some advantages (for example the consumer
>>>>>>>> needs to consume from multiple servers so select/poll/epoll is natural but
>>>>>>>> this is not always available from HTTP client libraries).
>>>>>>>>
>>>>>>>> Basically this is an area where I think it is best to pick one way and
>>>>>>>> make it really bullet proof rather than providing lots of options.
>>>>>>>> In some sense each option tends to increase the complexity of testing
>>>>>>>> (since now there are many combinations to try) and also of implementation
>>>>>>>> (since now a lot of things that were concrete now need to be abstracted
>>>>>>>> away).
>>>>>>>>
>>>>>>>> So from this perspective I would prefer a standalone proxy that could
>>>>>>>> evolve independently rather than retro-fitting the current socket server to
>>>>>>>> handle other protocols.
>>>>>>>> There will be some overhead for the extra hop, but
>>>>>>>> then there is some overhead for HTTP itself.
>>>>>>>>
>>>>>>>> This is just my personal opinion, it would be great to hear what others
>>>>>>>> think.
>>>>>>>>
>>>>>>>> -Jay
>>>>>>>>
>>>>>>>> On Mon, Aug 6, 2012 at 5:39 AM, David Arthur <mum...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I'd be happy to collaborate on this, though it's been a while since I've
>>>>>>>>> used PHP.
>>>>>>>>>
>>>>>>>>> From what it looks like, what you have is a true proxy that runs outside
>>>>>>>>> of Kafka and translates some REST routes into Kafka client calls. This
>>>>>>>>> sounds more in line with what the project page describes. What I have
>>>>>>>>> proposed is more like a translation layer between some REST routes and
>>>>>>>>> FetchRequests. In this case the client is responsible for managing offsets.
>>>>>>>>> Using the consumer groups and ZooKeeper would be another nice way of
>>>>>>>>> consuming messages (which is probably more like what you have).
>>>>>>>>>
>>>>>>>>> Any maintainers have feedback on this?
>>>>>>>>>
>>>>>>>>> On Aug 3, 2012, at 4:13 PM, Jonathan Creasy wrote:
>>>>>>>>>
>>>>>>>>>> I have an internal one working and was hoping to have it open sourced in
>>>>>>>>>> the next week. The one at Box is based on the CodeIgniter framework, we
>>>>>>>>>> have about 45 RESTful interfaces built on this framework so I just put
>>>>>>>>>> together another one for Kafka.
>>>>>>>>>>
>>>>>>>>>> Here are my notes, these were pre-dev so may be a little different than
>>>>>>>>>> what we ended up with.
>>>>>>>>>>
>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/Restful+API+Proposal
>>>>>>>>>>
>>>>>>>>>> I will read yours later this afternoon, we should work together.
>>>>>>>>>>
>>>>>>>>>> -Jonathan
>>>>>>>>>>
>>>>>>>>>> On Fri, Aug 3, 2012 at 7:41 AM, David Arthur <mum...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I'd like to tackle this project (assuming it hasn't been started yet).
>>>>>>>>>>>
>>>>>>>>>>> I wrote up some initial thoughts here:
>>>>>>>>>>> https://gist.github.com/3248179
>>>>>>>>>>>
>>>>>>>>>>> TL;DR: use Range header for specifying offsets, simple URIs like
>>>>>>>>>>> /kafka/topics/[topic]/[partition], use for a simple transport of bytes
>>>>>>>>>>> and/or represent the messages as some media type (text, json, xml)
>>>>>>>>>>>
>>>>>>>>>>> Feedback is most welcome (in the Gist or in this thread).
>>>>>>>>>>>
>>>>>>>>>>> Cheers!
>>>>>>>>>>>
>>>>>>>>>>> -David
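
For anyone skimming the thread, the two routes of the prototype (POST /topic to produce, GET /topic/group to consume with a timeout) can be sketched roughly as below. This is only a toy Python model, not the Scala/Jetty code linked above: an in-memory list stands in for Kafka, and the class name `InMemoryProxy`, the `handle` method, the `poll_timeout` parameter, and the 200/204/404 status codes are all assumptions made for illustration. The thread does not say what the real prototype returns on a timeout.

```python
import threading
from collections import defaultdict


class InMemoryProxy:
    """Toy stand-in for the REST proxy; a list per topic replaces Kafka."""

    def __init__(self, poll_timeout=1.0):
        # The thread mentions a 1s timeout on GET; it is a parameter here.
        self.poll_timeout = poll_timeout
        self._logs = defaultdict(list)      # topic -> messages in arrival order
        self._positions = defaultdict(int)  # (topic, group) -> next index to read
        self._cond = threading.Condition()

    def handle(self, method, path, body=""):
        """Route a request and return (status_code, response_body)."""
        parts = [p for p in path.split("/") if p]
        if method == "POST" and len(parts) == 1:
            return self._produce(parts[0], body)
        if method == "GET" and len(parts) == 2:
            return self._consume(parts[0], parts[1])
        return 404, ""  # assumed response for unknown routes

    def _produce(self, topic, message):
        # POST /topic: the request body is the message.
        with self._cond:
            self._logs[topic].append(message)
            self._cond.notify_all()
        return 200, ""

    def _consume(self, topic, group):
        # GET /topic/group: return this group's next unread message,
        # waiting up to poll_timeout seconds for one to arrive.
        key = (topic, group)
        with self._cond:
            ready = self._cond.wait_for(
                lambda: self._positions[key] < len(self._logs[topic]),
                timeout=self.poll_timeout,
            )
            if not ready:
                return 204, ""  # assumed "nothing available" response
            message = self._logs[topic][self._positions[key]]
            self._positions[key] += 1
            return 200, message
```

Mirroring the cURL calls quoted above, `proxy.handle("POST", "/my-topic", "Here is a message")` followed by `proxy.handle("GET", "/my-topic/my-group")` hands the message back to that group. Each group keeps its own read position, which loosely imitates consumer-group semantics without ZooKeeper or per-group consumer threads.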