Are you referring to the HTTP side of things? If not, I'm not sure exactly what
you mean by threaded I/O in this context.

-David

On Nov 21, 2012, at 10:54 AM, Taylor Gautier wrote:

> It would make sense to use NIO rather than threaded I/O.
> 
> 
> 
> On Nov 20, 2012, at 2:06 PM, David Arthur <mum...@gmail.com> wrote:
> 
>> BTW, here are some cURL calls from my test environment:
>> 
>> https://gist.github.com/e59b9c8ee4ae56dad44f
>> 
>> 
>> On Nov 20, 2012, at 4:08 PM, David Arthur wrote:
>> 
>>> Another bump for this thread...
>>> 
>>> For those just joining, this prototype is a simple HTTP server that proxies 
>>> the complex consumer code through two HTTP endpoints. 
>>> 
>>> https://github.com/mumrah/kafka/blob/rest/contrib/rest-proxy/src/main/scala/RESTServer.scala
>>> 
>>> E.g., 
>>> 
>>>   curl http://localhost:8888/my-topic -X POST -d 'Here is a message'
>>> 
>>> and 
>>> 
>>>   curl http://localhost:8888/my-topic/my-group -X GET
>>> 
>>> 
>>> This is not an attempt to expose the FetchRequest/ProduceRequest protocol 
>>> over HTTP.
>>> 
>>> A few questions:
>>> 
>>> * Would including offsets be useful here? Since it is utilizing the
>>> ZK-backed consumer code, I would think not.
>>> * I have chosen to create one thread per topic+group (mostly for
>>> simplicity's sake). Multiple REST servers could be run and load balanced
>>> across to increase the consumer parallelism. Maybe it would make sense for
>>> an individual REST server to create more than one thread per topic+group?
>>> 
>>> Cheers
>>> -David
>>> 
>>> On Sep 10, 2012, at 9:49 AM, David Arthur wrote:
>>> 
>>>> Bump. 
>>>> 
>>>> Anyone have feedback on this approach?
>>>> 
>>>> -David
>>>> 
>>>> On Aug 24, 2012, at 12:37 PM, David Arthur wrote:
>>>> 
>>>>> Here is an initial pass at a Kafka REST proxy (in Scala)
>>>>> 
>>>>> https://github.com/mumrah/kafka/blob/rest/contrib/rest-proxy/src/main/scala/RESTServer.scala
>>>>> 
>>>>> The basic gist is:
>>>>> * Jetty for webserver
>>>>> * Messages are strings
>>>>> * GET /topic/group to get a message (timeout after 1s)
>>>>> * POST /topic, the request body is the message
>>>>> * One consumer thread per topic+group
>>>>> 
>>>>> Be wary, many things are hard-coded at this point (port numbers, etc.).
>>>>> Obviously, this will need to change. Also, I haven't the slightest idea
>>>>> how to set up or use sbt properly, so I just checked in the libs.
>>>>> 
>>>>> Feedback is welcome in this thread or on GitHub. Be gentle please; this
>>>>> is my first go at Scala.
>>>>> 
>>>>> -David
>>>>> 
>>>>> On Aug 12, 2012, at 10:39 AM, Taylor Gautier wrote:
>>>>> 
>>>>>> Jay I agree with you 100%.
>>>>>> 
>>>>>> At Tagged we have implemented a proxy for various internal reasons
>>>>>> (primarily to act as a high-performance relay from PHP to Kafka). It's
>>>>>> implemented in Node.js (JavaScript).
>>>>>> 
>>>>>> Currently it services UDP packets encoded in binary, but it could
>>>>>> easily be modified to accept HTTP also, since Node support for HTTP is
>>>>>> pretty simple.
>>>>>> 
>>>>>> If others are interested in maintaining something like this we could
>>>>>> consider adding this to the public domain alongside the already
>>>>>> existing Node.js client implementation.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Aug 10, 2012, at 3:51 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>>>>> 
>>>>>>> My personal preference would be to have only a single protocol in kafka
>>>>>>> core. I have been down the multiple protocol route and my experience was
>>>>>>> that it adds a lot of burden for each change that needs to be made and a
>>>>>>> lot of complexity to abstract over the different protocols. From the point
>>>>>>> of view of a user they are generally a bit agnostic as to how bytes are
>>>>>>> sent back and forth provided it is reliable and easily implementable in any
>>>>>>> language. Generally they care more about the quality of the client in their
>>>>>>> language of choice.
>>>>>>> 
>>>>>>> My belief is that the main benefit of REST is ease of implementing a
>>>>>>> client. But currently the biggest barrier is really the use of zk and
>>>>>>> fairly thick consumer design. So I think the current thinking is that we
>>>>>>> should focus on thinning that out and removing the client-side zk
>>>>>>> dependency. I actually don't think TCP is a huge burden if the protocol is
>>>>>>> simple, and there are actually some advantages (for example the consumer
>>>>>>> needs to consume from multiple servers so select/poll/epoll is natural but
>>>>>>> this is not always available from HTTP client libraries).
>>>>>>> 
>>>>>>> Basically this is an area where I think it is best to pick one way and
>>>>>>> really make it bulletproof rather than providing lots of options.
>>>>>>> In some sense each option tends to increase the complexity of testing
>>>>>>> (since now there are many combinations to try) and also of implementation
>>>>>>> (since now a lot of things that were concrete need to be abstracted away).
>>>>>>> 
>>>>>>> So from this perspective I would prefer a standalone proxy that could
>>>>>>> evolve independently rather than retro-fitting the current socket server to
>>>>>>> handle other protocols. There will be some overhead for the extra hop, but
>>>>>>> then there is some overhead for HTTP itself.
>>>>>>> 
>>>>>>> This is just my personal opinion; it would be great to hear what others
>>>>>>> think.
>>>>>>> 
>>>>>>> -Jay
>>>>>>> 
>>>>>>> On Mon, Aug 6, 2012 at 5:39 AM, David Arthur <mum...@gmail.com> wrote:
>>>>>>> 
>>>>>>>> I'd be happy to collaborate on this, though it's been a while since I've
>>>>>>>> used PHP.
>>>>>>>> 
>>>>>>>> From what it looks like, what you have is a true proxy that runs outside
>>>>>>>> of Kafka and translates some REST routes into Kafka client calls. This
>>>>>>>> sounds more in line with what the project page describes. What I have
>>>>>>>> proposed is more like a translation layer between some REST routes and
>>>>>>>> FetchRequests. In this case the client is responsible for managing offsets.
>>>>>>>> Using the consumer groups and ZooKeeper would be another nice way of
>>>>>>>> consuming messages (which is probably more like what you have).
>>>>>>>> 
>>>>>>>> Any maintainers have feedback on this?
>>>>>>>> 
>>>>>>>> On Aug 3, 2012, at 4:13 PM, Jonathan Creasy wrote:
>>>>>>>> 
>>>>>>>>> I have an internal one working and was hoping to have it open sourced in
>>>>>>>>> the next week. The one at Box is based on the CodeIgniter framework; we
>>>>>>>>> have about 45 RESTful interfaces built on this framework so I just put
>>>>>>>>> together another one for Kafka.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Here are my notes; these were pre-dev so may be a little different than
>>>>>>>>> what we ended up with.
>>>>>>>>> 
>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/Restful+API+Proposal
>>>>>>>>> 
>>>>>>>>> I will read yours later this afternoon, we should work together.
>>>>>>>>> 
>>>>>>>>> -Jonathan
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Aug 3, 2012 at 7:41 AM, David Arthur <mum...@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>> I'd like to tackle this project (assuming it hasn't been started yet).
>>>>>>>>>> 
>>>>>>>>>> I wrote up some initial thoughts here: 
>>>>>>>>>> https://gist.github.com/3248179
>>>>>>>>>> 
>>>>>>>>>> TL;DR: use Range header for specifying offsets, simple URIs like
>>>>>>>>>> /kafka/topics/[topic]/[partition], use for a simple transport of bytes
>>>>>>>>>> and/or represent the messages as some media type (text, json, xml)
>>>>>>>>>> 
>>>>>>>>>> Feedback is most welcome (in the Gist or in this thread).
>>>>>>>>>> 
>>>>>>>>>> Cheers!
>>>>>>>>>> 
>>>>>>>>>> -David
>> 
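
For readers skimming the thread, here is a minimal sketch of the proxy design
described above (POST /topic with the message as the request body, GET
/topic/group returning one message or timing out after 1s). It is written in
Python rather than the prototype's Scala, and it is not the code in
RESTServer.scala: InMemoryBroker is a hypothetical stand-in for Kafka, and the
status codes (204 for an accepted POST or a timed-out GET, 404 for an unknown
path) are assumptions, since the thread does not say what the prototype returns.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class InMemoryBroker:
    """Hypothetical stand-in for Kafka: one append-only log per topic and
    one read position per (topic, group), so each group sees every message."""

    def __init__(self):
        self._cond = threading.Condition()
        self._logs = {}       # topic -> list of messages
        self._positions = {}  # (topic, group) -> index of the next message

    def produce(self, topic, message):
        with self._cond:
            self._logs.setdefault(topic, []).append(message)
            self._cond.notify_all()  # wake any GET waiting on a message

    def consume(self, topic, group, timeout=1.0):
        """Return the group's next message, or None if none arrives in time."""
        key = (topic, group)
        with self._cond:
            log = self._logs.setdefault(topic, [])
            self._positions.setdefault(key, 0)
            ready = self._cond.wait_for(
                lambda: len(log) > self._positions[key], timeout=timeout)
            if not ready:
                return None
            pos = self._positions[key]
            self._positions[key] = pos + 1
            return log[pos]


BROKER = InMemoryBroker()


class ProxyHandler(BaseHTTPRequestHandler):
    def _reply(self, status, body=b""):
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        # POST /topic: the request body is the message (messages are strings).
        parts = [p for p in self.path.split("/") if p]
        if len(parts) != 1:
            return self._reply(404)
        length = int(self.headers.get("Content-Length", 0))
        BROKER.produce(parts[0], self.rfile.read(length).decode("utf-8"))
        self._reply(204)  # assumed status code

    def do_GET(self):
        # GET /topic/group: one message, or an empty reply after 1s.
        parts = [p for p in self.path.split("/") if p]
        if len(parts) != 2:
            return self._reply(404)
        message = BROKER.consume(parts[0], parts[1], timeout=1.0)
        if message is None:
            return self._reply(204)  # assumed status code for a timeout
        self._reply(200, message.encode("utf-8"))


def serve(port=8888):
    # Port 8888 matches the curl examples in the thread.
    ThreadingHTTPServer(("localhost", port), ProxyHandler).serve_forever()
```

Calling serve() and then running the two curl commands quoted earlier in the
thread should round-trip a message under these assumptions. The sketch uses
ThreadingHTTPServer so that a GET waiting out its timeout does not block other
requests; that is a choice made here for illustration, not something the
thread specifies.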
