Yeah, I just meant that as an example; it should be configurable.
-Jay
On Thu, Dec 20, 2012 at 2:15 PM, Milind Parikh wrote:
+1 on limiting the size. But could you do 2K instead of 1K? Using Interval
Tree Clocks gets you a lot on distributed autonomous processing, but most
large-scale ITCs go up to 1.5K.
http://code.google.com/p/itclocks/ (refer to the link to the conference paper there).
Regards
Milind
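As a rough illustration of the configurable limit being discussed, here is a minimal Java sketch of a per-commit metadata size check; the property name offset.metadata.max.bytes and the 1K default are assumptions made for the example, not anything fixed in this thread.

import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Minimal sketch of a configurable metadata size limit on offset commits.
// The property name and the default below are assumptions for illustration.
public class OffsetMetadataValidator {
    private final int maxMetadataSize;

    public OffsetMetadataValidator(Properties props) {
        // Default to 1K, but let operators raise it (e.g. to 2K for large
        // interval tree clock stamps) through configuration.
        this.maxMetadataSize =
            Integer.parseInt(props.getProperty("offset.metadata.max.bytes", "1024"));
    }

    public void validate(String metadata) {
        if (metadata != null
                && metadata.getBytes(StandardCharsets.UTF_8).length > maxMetadataSize) {
            throw new IllegalArgumentException(
                "offset metadata exceeds the configured limit of "
                    + maxMetadataSize + " bytes");
        }
    }
}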
Sounds good to me
On 12/20/12 5:04 PM, Jay Kreps wrote:
Err, to clarify, I meant punt on persisting the metadata, not on persisting
the offset. Basically, that field would be in the protocol but would be
unused in this phase.
-Jay
On Thu, Dec 20, 2012 at 2:03 PM, Jay Kreps wrote:
I actually recommend we just punt on implementing persistence in ZooKeeper
entirely; otherwise we have to have an upgrade path to grandfather existing
ZooKeeper data over to the new format. Let's just add it to the API and only
actually store it out when we redo the backend. We can handle the size
limit then too.
No particular objection, though in order to support atomic writes of
(offset, metadata), we will need to define a protocol for the ZooKeeper
payloads. Something like:
OffsetPayload => Offset [Metadata]
Metadata => length-prefixed string
should suffice. Otherwise we would have to rely on th
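To make the proposed payload concrete, here is one possible Java encoding of OffsetPayload => Offset [Metadata] as a single znode value, so the (offset, metadata) pair is written atomically; the fixed 8-byte offset and 4-byte length prefix are assumptions about a wire format the thread does not pin down.

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// One possible encoding of the proposed ZooKeeper payload: a fixed-width
// offset followed by a length-prefixed metadata string, stored as a single
// znode value so that (offset, metadata) is updated atomically.
public final class OffsetPayload {
    public final long offset;
    public final String metadata; // null when no metadata was committed

    public OffsetPayload(long offset, String metadata) {
        this.offset = offset;
        this.metadata = metadata;
    }

    public byte[] encode() {
        byte[] meta = metadata == null
            ? new byte[0] : metadata.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(8 + 4 + meta.length);
        buf.putLong(offset);     // Offset
        buf.putInt(meta.length); // length prefix, 0 when metadata is absent
        buf.put(meta);           // Metadata bytes (UTF-8)
        return buf.array();
    }

    public static OffsetPayload decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long offset = buf.getLong();
        byte[] meta = new byte[buf.getInt()];
        buf.get(meta);
        // Note: this sketch does not distinguish empty metadata from absent.
        String metadata = meta.length == 0
            ? null : new String(meta, StandardCharsets.UTF_8);
        return new OffsetPayload(offset, metadata);
    }
}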
Okay, I did some assessment of the use cases we have which aren't using the
default offset storage API and came up with one generalization. I would
like to propose adding a generic metadata field to the offset API on a
per-partition basis. That would leave us with the following:
OffsetCommitRequest =>
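The request definition is cut off in the archive. Purely as a hypothetical reconstruction of the shape described above (an opaque metadata string carried per partition), it might look something like this in Java; the actual field layout may well differ.

import java.util.List;

// Hypothetical sketch of the proposed request shape, reconstructed only from
// the description above (a generic, opaque metadata field on a per-partition
// basis); the real definition is truncated in the archive.
public class OffsetCommitRequest {
    public final String consumerGroup;
    public final List<TopicCommit> topics;

    public OffsetCommitRequest(String consumerGroup, List<TopicCommit> topics) {
        this.consumerGroup = consumerGroup;
        this.topics = topics;
    }

    public static class TopicCommit {
        public final String topic;
        public final List<PartitionCommit> partitions;

        public TopicCommit(String topic, List<PartitionCommit> partitions) {
            this.topic = topic;
            this.partitions = partitions;
        }
    }

    public static class PartitionCommit {
        public final int partition;
        public final long offset;
        public final String metadata; // opaque to the broker, chosen by the client

        public PartitionCommit(int partition, long offset, String metadata) {
            this.partition = partition;
            this.offset = offset;
            this.metadata = metadata;
        }
    }
}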
Good point about compressed message sets. I think that works and is
simpler. We might still need the txn id to be able to do the application to
the hashmap atomically, but this depends on a number of details that aren't
really spec'd out, in particular how the replicas keep their hashmap fed
a
Thanks for the proposal. Added a couple of comments to the wiki.
Thanks,
Jun
On Mon, Dec 17, 2012 at 10:45 AM, Jay Kreps wrote:
> Hey Guys,
>
> David has made a bunch of progress on the offset commit api implementation.
>
> Since this is a public API it would be good to do as much thinking up-
There are two questions:
1. Who controls the consumer's position in the stream?
2. How is that position stored?
The theory for (1) is that the consumer should control this so it can choose
to move forwards or backwards or wherever it wants, and has full control
over when position changes take effect.
Currently, consumers must deal with ZooKeeper directly and implement the
consumer re-balancing algorithm (correctly). As this is a rather
difficult and error-prone process, the intent of this new API is to
provide an easy mechanism for storing offsets for non-Scala clients.
At least, that's ho
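As a hypothetical sketch of how a client might use such an API, the consumer keeps control of its position and only delegates storage; the OffsetClient type and its method names below are invented stand-ins, not the proposed wire protocol.

// Invented stand-in for the proposed offset storage API; the names here are
// illustrative only.
interface OffsetClient {
    long fetchOffset(String group, String topic, int partition);
    void commitOffset(String group, String topic, int partition, long offset);
}

// The consumer still owns its position: it decides where to start, and a
// position change only takes effect when it chooses to commit.
public class CheckpointingConsumer {
    private final OffsetClient offsets;
    private final String group, topic;
    private final int partition;
    private long position;

    public CheckpointingConsumer(OffsetClient offsets, String group,
                                 String topic, int partition) {
        this.offsets = offsets;
        this.group = group;
        this.topic = topic;
        this.partition = partition;
        // Resume from the stored position, but nothing stops the consumer
        // from rewinding or skipping ahead before it starts fetching.
        this.position = offsets.fetchOffset(group, topic, partition);
    }

    public void markProcessed(long nextOffset) {
        position = nextOffset;
        offsets.commitOffset(group, topic, partition, position);
    }
}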
Perhaps I don't understand the motivation well enough, and perhaps I am
misreading the intent.
But I thought the design principle behind Kafka was that state (from a
consumer's standpoint) is managed by the consumer and not the broker. I
understand that "These APIs are optional, clients can store