Re: [DISCUSS] KIP-82 - Add Record Headers

2016-11-08 Thread Sean McCauliff
+1 for String keys.

I've been doing some benchmarking, and it looks like the speedup from using
integer keys is about 2-5x, depending on the length of the strings and which
collections are being used.  The overall amount of time spent parsing a set
of header key-value pairs probably does not matter unless you are getting
close to 1M messages per second per consumer, in which case you probably
shouldn't use headers anyway.  There is also the option to use very short
strings, some even shorter than an integer.
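
A rough single-threaded illustration of the string-vs-int key lookup cost (not
the benchmark from the linked document; collection choice, key length, and JIT
warm-up all move the numbers, so treat the ratio as approximate):

import java.util.HashMap;
import java.util.Map;

public class HeaderKeyLookupSketch {
    public static void main(String[] args) {
        Map<String, byte[]> stringKeyed = new HashMap<>();
        Map<Integer, byte[]> intKeyed = new HashMap<>();
        String[] stringKeys = new String[16];
        Integer[] intKeys = new Integer[16];
        byte[] value = new byte[8];
        for (int i = 0; i < 16; i++) {
            stringKeys[i] = "myproject.header-" + i;   // hypothetical namespaced keys
            intKeys[i] = 1000 + i;
            stringKeyed.put(stringKeys[i], value);
            intKeyed.put(intKeys[i], value);
        }
        long hits = 0;
        long t0 = System.nanoTime();
        for (int iter = 0; iter < 10_000_000; iter++)
            hits += stringKeyed.get(stringKeys[iter & 15]).length;
        long stringNanos = System.nanoTime() - t0;
        t0 = System.nanoTime();
        for (int iter = 0; iter < 10_000_000; iter++)
            hits += intKeyed.get(intKeys[iter & 15]).length;
        long intNanos = System.nanoTime() - t0;
        System.out.printf("string/int lookup time ratio ~ %.1f (checksum %d)%n",
                (double) stringNanos / intNanos, hits);
    }
}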

Partitioning the string key space will be easier than partitioning an
integer key space, and we won't need a global registry.  Kafka can internally
reserve some prefix like "_" as its namespace.  Everyone else can use their
company or project name as a namespace prefix and life should be good.
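
A minimal sketch of that prefix convention; the "_" reservation and the example
key names here are purely illustrative, not from the KIP:

import java.util.LinkedHashMap;
import java.util.Map;

public class HeaderNamespaceSketch {
    private static final String RESERVED_PREFIX = "_";   // assumed Kafka-reserved prefix

    static void putHeader(Map<String, byte[]> headers, String key, byte[] value) {
        if (key.startsWith(RESERVED_PREFIX))
            throw new IllegalArgumentException("Keys starting with '_' are reserved for Kafka: " + key);
        headers.put(key, value);
    }

    public static void main(String[] args) {
        Map<String, byte[]> headers = new LinkedHashMap<>();
        putHeader(headers, "myco.tracing.span-id", new byte[]{1, 2, 3});
        putHeader(headers, "myproject.schema.version", new byte[]{0, 4});
        System.out.println(headers.keySet());
    }
}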

Here's the link to some of the benchmarking info:
https://docs.google.com/document/d/1tfT-6SZdnKOLyWGDH82kS30PnUkmgb7nPLdw6p65pAI/edit?usp=sharing



--
Sean McCauliff
Staff Software Engineer
Kafka

smccaul...@linkedin.com
linkedin.com/in/sean-mccauliff-b563192

On Mon, Nov 7, 2016 at 11:51 PM, Michael Pearce 
wrote:

> +1 on this slimmer version of our proposal
>
> I definitely think we can reduce the Id space from the proposed int32 (4 bytes)
> down to int16 (2 bytes); it saves space, and as headers we wouldn't expect
> the number of ids in concurrent use to be that high.
>
> I would wonder if we should keep the value byte array length as int32,
> though, since that is the standard max array length in Java.  That said, it is
> a header, so I guess limiting the size is sensible and would work for all the
> use cases we have in mind, so I'm happy with limiting this.
>
> Do people generally concur on Magnus's slimmer version? Anyone see any
> issues if we moved from int32 to int16?
>
> Re configurable ids per plugin rather than a global registry: that would also
> work for us.  As such, if this has better consensus than the proposed global
> registry, I'd be happy to change that.
>
> I was already sold on ints over strings for keys ;)
>
> Cheers
> Mike
>
> 
> From: Magnus Edenhill 
> Sent: Monday, November 7, 2016 10:10:21 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
>
> Hi,
>
> I'm +1 for adding generic message headers, but I do share the concerns
> previously aired on this thread and during the KIP meeting.
>
> So let me propose a slimmer alternative that does not require any sort of
> global header registry, does not affect broker performance or operations,
> and adds as little overhead as possible.
>
>
> Message
> 
> The protocol Message type is extended with a Headers array consisting of
> Tags, where a Tag is defined as:
>int16 Id
>int16 Len  // binary_data length
>binary_data[Len]  // opaque binary data
>
>
> Ids
> ---
> The Id space is not centrally managed, so whenever an application needs to
> add headers, or use an eco-system plugin that does, its Id allocation will
> need to be manually configured.
> This moves the allocation concern from the global space down to
> the organization level and avoids the risk of id conflicts.
> Example pseudo-config for some app:
> sometrackerplugin.tag.sourcev3.id=1000
> dbthing.tag.tablename.id=1001
> myschemareg.tag.schemaname.id=1002
> myschemareg.tag.schemaversion.id=1003
>
>
> Each header-writing or header-reading plugin must provide means (typically
> through configuration) to specify the tag for each header it uses. Defaults
> should be avoided.
> A consumer silently ignores tags it does not have a mapping for (since the
> binary_data can't be parsed without knowing what it is).
>
> Id range 0..999 is reserved for future use by the broker and must not be
> used by plugins.
>
>
>
> Broker
> -
> The broker does not process the tags (other than the standard protocol
> syntax verification), it simply stores and forwards them as opaque data.
>
> Standard message translation (removal of Headers) kicks in for older
> clients.
>
>
> Why not string ids?
> -
> String ids might seem like a good idea, but:
>  * does not really solve uniqueness
>  * consumes a lot of space (2 byte string length + string, per header) to
> be meaningful
>  * doesn't really say anything about how to parse the tag's data, so it is in
> effect useless on its own.
>
>
> Regards,
> Magnus
>
>
>
>
> 2016-11-07 18:32 GMT+01:00 Michael Pearce :
>
> > Hi Roger,
> >
> > Thanks for the support.
> >
> > I think the key thing is to have a common key space to make an ecosystem,
> > there does have to be some le
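
A rough sketch of how a producer-side serializer might encode the tag layout
Magnus describes above (int16 Id, int16 Len, opaque bytes). This is not Kafka
code, and the leading int16 tag count is an assumed framing detail, so treat it
as an illustration only:

import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

public class TagSetEncoderSketch {
    static ByteBuffer encode(Map<Short, byte[]> tags) {
        int size = 2;                                  // int16 tag count (assumed framing)
        for (byte[] v : tags.values())
            size += 2 + 2 + v.length;                  // id + len + data
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putShort((short) tags.size());
        for (Map.Entry<Short, byte[]> e : tags.entrySet()) {
            buf.putShort(e.getKey());                  // int16 Id
            buf.putShort((short) e.getValue().length); // int16 Len
            buf.put(e.getValue());                     // binary_data[Len]
        }
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        Map<Short, byte[]> tags = new LinkedHashMap<>();
        tags.put((short) 1000, "sourcev3".getBytes()); // ids >= 1000, per the proposal
        tags.put((short) 1001, "orders".getBytes());
        System.out.println("encoded bytes: " + encode(tags).remaining());
    }
}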

Re: [DISCUSS] KIP-82 - Add Record Headers

2016-11-08 Thread Sean McCauliff
On Tue, Nov 8, 2016 at 2:15 PM, Gwen Shapira  wrote:

> Since Kafka specifically targets high-throughput, low-latency
> use-cases, I don't think we should trade them off that easily.
>

I find these kinds of design goals not really helpful unless they're
quantified in some way, because it's always possible to argue against
something as either not being performant enough or being just an
implementation detail.

These are single-threaded benchmarks, so all the measurements are per
thread.

At 1M messages/s/thread, if header keys are ints and you have even a single
header key-value pair, parsing still takes about 2^-2 (0.25) microseconds,
which means you only have another 0.75 microseconds to do everything else you
want to do with a message (1M messages/s means 1 microsecond per message).
With string header keys there is still about 0.5 microseconds left to process
a message.
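
A back-of-the-envelope version of that budget argument; the per-header parse
costs (0.25 us for int keys, 0.5 us for string keys) are simply the numbers
quoted in this thread:

public class HeaderBudgetSketch {
    public static void main(String[] args) {
        double perMessageBudgetUs = 1.0;       // 1M msgs/s/thread -> 1 us per message
        double intKeyParseUs = 0.25;           // ~2^-2 us for one int-keyed header
        double stringKeyParseUs = 0.5;         // ~0.5 us for one string-keyed header
        System.out.printf("int keys leave %.2f us, string keys leave %.2f us per message%n",
                perMessageBudgetUs - intKeyParseUs,
                perMessageBudgetUs - stringKeyParseUs);
    }
}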



I love strings as much as the next guy (we had them in Flume), but I
> was convinced by Magnus/Michael/Radai that strings don't actually have
> strong benefits as opposed to ints (you'll need a string registry
> anyway - otherwise, how will you know what the "profile_id"
> header refers to?) and I want to keep closer to our original design
> goals for Kafka.
>

"confluent.profile_id"


>
> If someone likes strings in the headers and doesn't do millions of
> messages a sec, they probably have lots of other systems they can use
> instead.
>

None of them will scale like Kafka.  Horizontal scaling is still good.


>
>
> On Tue, Nov 8, 2016 at 1:22 PM, Sean McCauliff
>  wrote:
> > +1 for String keys.
> >
> > I've been doing some benchmarking and it seems like the speedup for using
> > integer keys is about 2-5 depending on the length of the strings and what
> > collections are being used.  The overall amount of time spent parsing a
> set
> > of header key, value pairs probably does not matter unless you are
> getting
> > close to 1M messages per consumer.  In which case probably don't use
> > headers.  There is also the option to use very short strings; some that
> are
> > even shorter than integers.
> >
> > Partitioning the string key space will be easier than partitioning an
> > integer key space. We won't need a global registry.  Kafka internally can
> > reserve some prefix like "_" as its namespace.  Everyone else can use
> their
> > company or project name as namespace prefix and life should be good.
> >
> > Here's the link to some of the benchmarking info:
> > https://docs.google.com/document/d/1tfT-6SZdnKOLyWGDH82kS30PnUkmgb7nPL
> dw6p65pAI/edit?usp=sharing
> >
> >
> >
> > --
> > Sean McCauliff
> > Staff Software Engineer
> > Kafka
> >
> > smccaul...@linkedin.com
> > linkedin.com/in/sean-mccauliff-b563192
> >
> > On Mon, Nov 7, 2016 at 11:51 PM, Michael Pearce 
> > wrote:
> >
> >> +1 on this slimmer version of our proposal
> >>
> >> I def think the Id space we can reduce from the proposed int32(4bytes)
> >> down to int16(2bytes) it saves on space and as headers we wouldn't
> expect
> >> the number of headers being used concurrently being that high.
> >>
> >> I would wonder if we should make the value byte array length still int32
> >> though as This is the standard Max array length in Java saying that it
> is a
> >> header and I guess limiting the size is sensible and would work for all
> the
> >> use cases we have in mind so happy with limiting this.
> >>
> >> Do people generally concur on Magnus's slimmer version? Anyone see any
> >> issues if we moved from int32 to int16?
> >>
> >> Re configurable ids per plugin over a global registry also would work
> for
> >> us.  As such if this has better consensus over the proposed global
> registry
> >> I'd be happy to change that.
> >>
> >> I was already sold on ints over strings for keys ;)
> >>
> >> Cheers
> >> Mike
> >>
> >> 
> >> From: Magnus Edenhill 
> >> Sent: Monday, November 7, 2016 10:10:21 PM
> >> To: dev@kafka.apache.org
> >> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
> >>
> >> Hi,
> >>
> >> I'm +1 for adding generic message headers, but I do share the concerns
> >> previously aired on this thread and during the KIP meeting.
> >>
> >> So let me propose a slimmer alternative that does not require any sort
> of
> >> global header registry, does not affect broker perf

Re: [DISCUSS] KIP-82 - Add Record Headers

2016-11-08 Thread Sean McCauliff
A local mapping from namespaces to int ids would definitely solve
the problem of having a global namespace and would make the int header keys
potentially more readable for logging and debugging purposes.  But this
means another (potentially very large) set of configuration parameters that
needs to be present on each component that wants to inspect the headers. I'm
sure it will be a fun day to track down that class of misconfiguration.
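
A sketch of the kind of per-component mapping being described here; the
"header.namespace.*" property names are made up for illustration and are not
from the KIP or any Kafka config:

import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class NamespaceIdMappingSketch {
    static Map<String, Short> loadNamespaceIds(Properties props) {
        Map<String, Short> namespaceToId = new HashMap<>();
        Map<Short, String> idToNamespace = new HashMap<>();
        for (String name : props.stringPropertyNames()) {
            if (!name.startsWith("header.namespace."))
                continue;
            String namespace = name.substring("header.namespace.".length());
            short id = Short.parseShort(props.getProperty(name));
            String clash = idToNamespace.putIfAbsent(id, namespace);
            if (clash != null)   // exactly the misconfiguration that is hard to track down
                throw new IllegalStateException("id " + id + " claimed by " + clash + " and " + namespace);
            namespaceToId.put(namespace, id);
        }
        return namespaceToId;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("header.namespace.io.confluent.schema-registry", "1");
        props.setProperty("header.namespace.com.mycompany.tracing", "87");
        System.out.println(loadNamespaceIds(props));
    }
}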

If the brokers are inspecting the headers then the brokers need this
config.  If the config changes then the brokers need to be restarted, which
seems pretty expensive.  Otherwise there now needs to be a new way to
update the brokers with this information.

Java itself does not have namespace collisions often, and there is no
central registration of package namespaces.  The set of Kafka infrastructure
engineers is much smaller than the set of people naming Java packages.  Having
reasonable names should allow every header user to peacefully coexist.

--
Sean McCauliff
Staff Software Engineer
Kafka

smccaul...@linkedin.com
linkedin.com/in/sean-mccauliff-b563192

On Mon, Nov 7, 2016 at 2:10 PM, Magnus Edenhill  wrote:

> Hi,
>
> I'm +1 for adding generic message headers, but I do share the concerns
> previously aired on this thread and during the KIP meeting.
>
> So let me propose a slimmer alternative that does not require any sort of
> global header registry, does not affect broker performance or operations,
> and adds as little overhead as possible.
>
>
> Message
> 
> The protocol Message type is extended with a Headers array consisting of
> Tags, where a Tag is defined as:
>int16 Id
>int16 Len  // binary_data length
>binary_data[Len]  // opaque binary data
>
>
> Ids
> ---
> The Id space is not centrally managed, so whenever an application needs to
> add headers, or use an eco-system plugin that does, its Id allocation will
> need to be manually configured.
> This moves the allocation concern from the global space down to
> the organization level and avoids the risk of id conflicts.
> Example pseudo-config for some app:
> sometrackerplugin.tag.sourcev3.id=1000
> dbthing.tag.tablename.id=1001
> myschemareg.tag.schemaname.id=1002
> myschemareg.tag.schemaversion.id=1003
>
>
> Each header-writing or header-reading plugin must provide means (typically
> through configuration) to specify the tag for each header it uses. Defaults
> should be avoided.
> A consumer silently ignores tags it does not have a mapping for (since the
> binary_data can't be parsed without knowing what it is).
>
> Id range 0..999 is reserved for future use by the broker and must not be
> used by plugins.
>
>
>
> Broker
> -
> The broker does not process the tags (other than the standard protocol
> syntax verification), it simply stores and forwards them as opaque data.
>
> Standard message translation (removal of Headers) kicks in for older
> clients.
>
>
> Why not string ids?
> -
> String ids might seem like a good idea, but:
>  * does not really solve uniqueness
>  * consumes a lot of space (2 byte string length + string, per header) to
> be meaningful
>  * doesn't really say anything about how to parse the tag's data, so it is in
> effect useless on its own.
>
>
> Regards,
> Magnus
>
>
>
>
> 2016-11-07 18:32 GMT+01:00 Michael Pearce :
>
> > Hi Roger,
> >
> > Thanks for the support.
> >
> > I think the key thing is to have a common key space to make an ecosystem,
> > there does have to be some level of contract for people to play nicely.
> >
> > Having map<String, byte[]>, or as currently proposed in the KIP a
> > numerical key space of map<int, byte[]>, is a level of the contract that
> > most people would expect.
> >
> > I think the example someone else made in a previous comment, linking to the
> > AWS blog and their implemented API (originally they didn't have a header
> > space but now they do, with uniform keys but values that can be a string,
> > an int, anything), is a good example.
> >
> > Having a custom MetadataSerializer is something we had played with, but we
> > discounted the idea: if you want everyone to work the same way in the
> > ecosystem, having this customizable as well makes it a bit harder.
> > Think about making the whole message record custom serializable; this would
> > make it fairly tricky (though not impossible) to make work nicely.  Having
> > the value customizable we thought is a reasonable tradeoff here of
> > flexibility against the contract of interaction between different parties.
> >
> > Is there a particular case or

Re: any plans to switch to java 8?

2016-11-10 Thread Sean McCauliff
Wait for JDK 9, which is supposed to be 4-5 months from now?

Sean

On Thu, Nov 10, 2016 at 10:23 AM, radai  wrote:
> with java 7 being EOL'ed for more than a year and a half now (apr 2015, see
> http://www.oracle.com/technetwork/java/eol-135779.html) i was wondering if
> there's an official plan/timetable for transitioning the kafka codebase
> over to java 8?


Re: [DISCUSS] KIP-82 - Add Record Headers

2016-12-01 Thread Sean McCauliff
>> >>
>> > >>  > > > > > >> >>
>> > >>  > > > > > >> >> > On Nov 8, 2016, at 5:54 PM, Gwen Shapira <
>> > >>  > g...@confluent.io>
>> > >>  > > > > > wrote:
>> > >>  > > > > > >> >> >
>> > >>  > > > > > >> >> > Thank you so much for this clear and fair
>> summary of
>> > the
>> > >>  > > > > arguments.
>> > >>  > > > > > >> >> >
>> > >>  > > > > > >> >> > I'm in favor of ints. Not a deal-breaker, but
>> in
>> > favor.
>> > >>  > > > > > >> >> >
>> > >>  > > > > > >> >> > Even more in favor of Magnus's decentralized
>> > suggestion
>> > >>  > with
>> > >>  > > > > > Roger's
>> > >>  > > > > > >> >> > tweak: add a namespace for headers. This will
>> allow
>> > each
>> > >>  > app
>> > >>  > > to
>> > >>  > > > > > just
>> > >>  > > > > > >> >> > use whatever IDs it wants internally, and then
>> let
>> > the
>> > >>  > admin
>> > >>  > > > > > >> deploying
>> > >>  > > > > > >> >> > the app figure out an available namespace ID
>> for the
>> > app
>> > >>  to
>> > >>  > > > live
>> > >>  > > > > > in.
>> > >>  > > > > > >> >> > So io.confluent.schema-registry can be
>> namespace
>> > 0x01 on
>> > >>  my
>> > >>  > > > > > >> deployment
>> > >>  > > > > > >> >> > and 0x57 on yours, and the poor guys
>> developing the
>> > app
>> > >>  > don't
>> > >>  > > > > need
>> > >>  > > > > > to
>> > >>  > > > > > >> >> > worry about that.
>> > >>  > > > > > >> >> >
>> > >>  > > > > > >> >>
>> > >>  > > > > > >> >> Gwen, if I understand your example right, an
>> > application
>> > >>  > > deployer
>> > >>  > > > > > might
>> > >>  > > > > > >> >> decide to use 0x01 in one deployment, and that
>> means
>> > that
>> > >>  > once
>> > >>  > > > the
>> > >>  > > > > > >> message
>> > >>  > > > > > >> >> is written into the broker, it will be saved on
>> the
>> > broker
>> > >>  > with
>> > >>  > > > > that
>> > >>  > > > > > >> >> specific namespace (0x01).
>> > >>  > > > > > >> >>
>> > >>  > > > > > >> >> If you were to mirror that message into another
>> > cluster,
>> > >>  the
>> > >>  > > 0x01
>> > >>  > > > > > would
>> > >>  > > > > > >> >> accompany the message, right? What if the
>> deployers of
>> > the
>> > >>  > same
>> > >>  > > > app
>> > >>  > > > > > in
>> > >>  > > > > > >> the
>> > >>  > > > > > >> >> other cluster uses 0x57? They won't understand
>> each
>> > other?
>> > >>  > > > > > >> >>
>> > >>  > > > > > >> >> I'm not sure that's an avoidable problem. I
>> think it
>> > simply
>> > >>  > > means
>> > >>  > > > > > that
>> > >>  > > > >

Re: [DISCUSS] 0.10.1.1 Plan

2016-12-01 Thread Sean McCauliff
Well, I would like KAFKA-4250 (make ProducerRecord and ConsumerRecord
extensible) in the 0.10.1 branch if it is not a big deal.  They are just
dumb structs, but they are final so no extensibility is possible.

Sean

On Tue, Nov 29, 2016 at 5:32 PM, Ignacio Solis  wrote:
> I don't think anybody from LinkedIn asked for features on this release.  We
> just jumped in at the discussion of including a patch which was not a bug
> fix and whether it mattered.
>
> Having said that, the internal release we're working on came off the 0.10.1
> branch with a few internal hotfix patches and a few cherry picked fixes...
> Including the final keyword removal patch.
>
> Nacho
>
> On Tue, Nov 29, 2016, 5:15 PM Gwen Shapira  wrote:
>
>> btw. is LinkedIn no longer running from trunk? I'm not used to
>> LinkedIn employees requesting specific patches to be included in a
>> bugfix release.
>>
>> Any discussion on the content of any release is obviously welcome, I'm
>> just wondering if there was a change in policy.
>>
>> On Tue, Nov 29, 2016 at 2:17 PM, Ismael Juma  wrote:
>> > OK, so it seems like there are no changes that break compatibility in the
>> > 0.10.1 branch since we offer no compatibility guarantees for logging
>> > output. That's good. :)
>> >
>> > About the removal of final, it happened in trunk and it doesn't seem like
>> > anyone is still asking for it to be included in the 0.10.1 branch so it
>> is
>> > indeed not important for this bug fix release (I thought we had
>> established
>> > that quite a while ago).
>> >
>> > Ismael
>> >
>> > On Tue, Nov 29, 2016 at 9:35 PM, Ignacio Solis  wrote:
>> >
>> >> Sorry, that was a hasty reply.  There are also various logging things
>> that
>> >> change format. This could break parsers.
>> >>
>> >> None of them are important, my only argument is that the final keyword
>> >> removal is not important either.
>> >>
>> >> Nacho
>> >>
>> >>
>> >> On Tue, Nov 29, 2016 at 1:25 PM, Ignacio Solis  wrote:
>> >>
>> >> > https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=commit;h=
>> >> > 10cfc1628df024f7596d3af5c168fa90f59035ca
>> >> >
>> >> > On Tue, Nov 29, 2016 at 1:24 PM, Ismael Juma 
>> wrote:
>> >> >
>> >> >> Which changes break compatibility in the 0.10.1 branch? It would be
>> good
>> >> >> to
>> >> >> fix before the release goes out.
>> >> >>
>> >> >> Ismael
>> >> >>
>> >> >> On 29 Nov 2016 9:09 pm, "Ignacio Solis"  wrote:
>> >> >>
>> >> >> > Some of the changes in the 0.10.1 branch already are not bug fixes.
>> >> Some
>> >> >> > break compatibility.
>> >> >> >
>> >> >> > Having said that, at this level we should maintain a stable API and
>> >> >> leave
>> >> >> > any changes for real version bumps.  This should be only a bugfix
>> >> >> release.
>> >> >> >
>> >> >> > Nacho
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > On Tue, Nov 29, 2016 at 8:35 AM, Ismael Juma 
>> >> wrote:
>> >> >> >
>> >> >> > > I disagree, but let's discuss it another time and in a separate
>> >> >> thread.
>> >> >> > :)
>> >> >> > >
>> >> >> > > Ismael
>> >> >> > >
>> >> >> > > On Tue, Nov 29, 2016 at 4:30 PM, radai <
>> radai.rosenbl...@gmail.com>
>> >> >> > wrote:
>> >> >> > >
>> >> >> > > > designing kafka code for stable extensibility is a worthy and
>> >> noble
>> >> >> > > cause.
>> >> >> > > > however, seeing as there are no such derivatives out in the
>> wild
>> >> >> yet i
>> >> >> > > > think investing the effort right now is a bit premature from
>> >> kafka's
>> >> >> > pov.
>> >> >> > > > I think its enough simply not to purposefully prevent such
>> >> >> extensions.
>> >> >> > > >
>> >> >> > > > On Tue, Nov 29, 2016 at 4:05 AM, Ismael Juma <
>> ism...@juma.me.uk>
>> >> >> > wrote:
>> >> >> > > >
>> >> >> > > > > On Sat, Nov 26, 2016 at 11:08 PM, radai <
>> >> >> radai.rosenbl...@gmail.com>
>> >> >> > > > > wrote:
>> >> >> > > > >
>> >> >> > > > > > "compatibility guarantees that are expected by people who
>> >> >> subclass
>> >> >> > > > these
>> >> >> > > > > > classes"
>> >> >> > > > > >
>> >> >> > > > > > sorry if this is not the best thread for this discussion,
>> but
>> >> I
>> >> >> > just
>> >> >> > > > > wanted
>> >> >> > > > > > to pop in and say that since any subclassing of these will
>> >> >> > obviously
>> >> >> > > > not
>> >> >> > > > > be
>> >> >> > > > > > done within the kafka codebase - what guarantees are
>> needed?
>> >> >> > > > > >
>> >> >> > > > >
>> >> >> > > > > I elaborated a little in my other message in this thread. A
>> >> simple
>> >> >> > and
>> >> >> > > > > somewhat contrived example: `ConsumerRecord.toString` calls
>> the
>> >> >> > `topic`
>> >> >> > > > > method. Someone overrides the `topic` method and it all
>> works as
>> >> >> > > > expected.
>> >> >> > > > > In a subsequent release, we change `toString` to use the
>> field
>> >> >> > directly
>> >> >> > > > > (like it's done for other fields like `key` and `value`) and
>> it
>> >> >> will
>> >> >> > > > break
>> >> >> > > > > `toString` for this user. One may wonder: why would one
>> >> ov

Kafka SNAPSHOT artifact repositories.

2016-12-01 Thread Sean McCauliff
Is there an artifact repository where to-be-released versions of Kafka
are published?

There appears to be one at http://repository.apache.org/snapshots/  ,
but I'm not seeing anything published there after 0.8.2.

Thanks!
Sean


Re: [DISCUSS] KIP-82 - Add Record Headers

2016-10-03 Thread Sean McCauliff
Change to public interfaces:

"Add ProduceRequest/ProduceResponse V3 which uses the new message format.
Add FetchRequest/FetchResponse V3 which uses the new message format."

When I look at org.apache.kafka.common.requests.FetchResponse on
master I see that there is already a version 3.  Seems like this is
from a recent commit about implementing KIP-74.  Do we need to
coordinate these changes with KIP-74?


"The serialisation of the [int, bye[]] header set will on the wire
using a strict format"  bye[] -> byte[]

Sean
--
Sean McCauliff
Staff Software Engineer
Kafka

smccaul...@linkedin.com
linkedin.com/in/sean-mccauliff-b563192


On Fri, Sep 30, 2016 at 3:43 PM, radai  wrote:
> I think headers are a great idea.
>
> Right now, people who are trying to implement any sort of org-wide
> functionality like monitoring, tracing, profiling etc pretty much have to
> define their own wrapper layers, which probably leads to everyone
> implementing their own variants of the same underlying functionality.
>
> I think a common base for headers would allow implementing a lot of this
> functionality only once, in a way that different header-based capabilities
> could be shared and composed, and open the door to a wide range of possible
> Kafka middleware that's simply impossible to write against the current API.
>
> Here's a list of things that could be implemented as "plugins" on top of a
> header mechanism (full list here -
> https://cwiki.apache.org/confluence/display/KAFKA/A+Case+for+Kafka+Headers).
>
> A lot of these already exist within LinkedIn and could for example be open
> sourced if Kafka had headers. I'm fairly certain the same is true in other
> organizations using Kafka
>
>
>
> On Thu, Sep 22, 2016 at 12:31 PM, Michael Pearce 
> wrote:
>
>> Hi All,
>>
>>
>> I would like to discuss the following KIP proposal:
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> 82+-+Add+Record+Headers
>>
>>
>>
>> I have some initial drafts of roughly the changes that would be needed.
>> This is nowhere near finalized, and I look forward to the discussion,
>> especially as some bits I'm personally in two minds about.
>>
>> https://github.com/michaelandrepearce/kafka/tree/kafka-headers-properties
>>
>>
>>
>> Here is a link to an alternative option mentioned in the KIP, but one I
>> would personally discard (disadvantages mentioned in the KIP):
>>
>> https://github.com/michaelandrepearce/kafka/tree/kafka-headers-full?
>>
>>
>> Thanks
>>
>> Mike
>>
>>
>>
>>
>>
>> The information contained in this email is strictly confidential and for
>> the use of the addressee only, unless otherwise indicated. If you are not
>> the intended recipient, please do not read, copy, use or disclose to others
>> this message or any attachment. Please also notify the sender by replying
>> to this email or by telephone (+44(020 7896 0011) and then delete the email
>> and any copies of it. Opinions, conclusion (etc) that do not relate to the
>> official business of this company shall be understood as neither given nor
>> endorsed by it. IG is a trading name of IG Markets Limited (a company
>> registered in England and Wales, company number 04008957) and IG Index
>> Limited (a company registered in England and Wales, company number
>> 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill,
>> London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG
>> Index Limited (register number 114059) are authorised and regulated by the
>> Financial Conduct Authority.
>>


[jira] [Created] (KAFKA-4840) There are still cases where producer buffer pool will not remove waiters.

2017-03-03 Thread Sean McCauliff (JIRA)
Sean McCauliff created KAFKA-4840:
-

 Summary: There are still cases where producer buffer pool will
not remove waiters.
 Key: KAFKA-4840
 URL: https://issues.apache.org/jira/browse/KAFKA-4840
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 0.10.2.0
Reporter: Sean McCauliff


In BufferPool.allocate(int size, long maxTimeToBlockMs):
If a Throwable other than InterruptedException is thrown out of await() for 
some reason or if there is an exception thrown in the corresponding finally 
block around the await(), for example if waitTime.record(.) throws an 
exception, then the waiters are not removed from the waiters deque.
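
A simplified sketch of the cleanup pattern this issue is asking for; it is not
the real BufferPool code. The point is that the waiter must be removed (and the
next waiter signalled) in a finally block so that a timeout, an
InterruptedException, or an unexpected Throwable from a metrics call cannot
leave a stale entry in the deque:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class WaiterCleanupSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Deque<Condition> waiters = new ArrayDeque<>();
    private long availableBytes = 0;

    public void allocate(int size, long maxTimeToBlockMs) throws InterruptedException {
        lock.lock();
        try {
            if (availableBytes >= size) {
                availableBytes -= size;
                return;
            }
            Condition moreMemory = lock.newCondition();
            waiters.addLast(moreMemory);
            try {
                long remainingNs = TimeUnit.MILLISECONDS.toNanos(maxTimeToBlockMs);
                while (availableBytes < size) {
                    if (remainingNs <= 0)
                        throw new IllegalStateException("timed out waiting for " + size + " bytes");
                    remainingNs = moreMemory.awaitNanos(remainingNs);
                }
                availableBytes -= size;
            } finally {
                waiters.remove(moreMemory);          // always runs, even on timeout or Throwable
                if (availableBytes > 0 && !waiters.isEmpty())
                    waiters.peekFirst().signal();    // don't strand the next waiter
            }
        } finally {
            lock.unlock();
        }
    }

    public void deallocate(int size) {
        lock.lock();
        try {
            availableBytes += size;
            Condition next = waiters.peekFirst();
            if (next != null)
                next.signal();
        } finally {
            lock.unlock();
        }
    }
}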



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4840) There are still cases where producer buffer pool will not remove waiters.

2017-03-08 Thread Sean McCauliff (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean McCauliff updated KAFKA-4840:
--
Description: 
In BufferPool.allocate(int size, long maxTimeToBlockMs):
If a Throwable other than InterruptedException is thrown out of await() for 
some reason or if there is an exception thrown in the corresponding finally 
block around the await(), for example if waitTime.record(.) throws an 
exception, then the waiters are not removed from the waiters deque.

The number of available bytes are also not restored when an exception happens.

  was:
In BufferPool.allocate(int size, long maxTimeToBlockMs):
If a Throwable other than InterruptedException is thrown out of await() for 
some reason or if there is an exception thrown in the corresponding finally 
block around the await(), for example if waitTime.record(.) throws an 
exception, then the waiters are not removed from the waiters deque.


> There are still cases where producer buffer pool will not remove waiters.
> -
>
> Key: KAFKA-4840
> URL: https://issues.apache.org/jira/browse/KAFKA-4840
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.2.0
>    Reporter: Sean McCauliff
>
> In BufferPool.allocate(int size, long maxTimeToBlockMs):
> If a Throwable other than InterruptedException is thrown out of await() for 
> some reason or if there is an exception thrown in the corresponding finally 
> block around the await(), for example if waitTime.record(.) throws an 
> exception, then the waiters are not removed from the waiters deque.
> The number of available bytes are also not restored when an exception happens.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4840) There are still cases where producer buffer pool will not remove waiters.

2017-03-08 Thread Sean McCauliff (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean McCauliff updated KAFKA-4840:
--
Status: Patch Available  (was: Open)

> There are still cases where producer buffer pool will not remove waiters.
> -
>
> Key: KAFKA-4840
> URL: https://issues.apache.org/jira/browse/KAFKA-4840
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.2.0
>    Reporter: Sean McCauliff
>
> In BufferPool.allocate(int size, long maxTimeToBlockMs):
> If a Throwable other than InterruptedException is thrown out of await() for 
> some reason or if there is an exception thrown in the corresponding finally 
> block around the await(), for example if waitTime.record(.) throws an 
> exception, then the waiters are not removed from the waiters deque.
> The number of available bytes are also not restored when an exception happens.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4840) There are still cases where producer buffer pool will not remove waiters.

2017-03-13 Thread Sean McCauliff (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean McCauliff updated KAFKA-4840:
--
Description: 
There are several problems dealing with errors in  BufferPool.allocate(int 
size, long maxTimeToBlockMs):

* The accumulated number of bytes is not put back into the available pool when
an exception happens and a thread is waiting for bytes to become available.
This will cause the capacity of the buffer pool to decrease over time whenever
a timeout is hit within this method.
* If a Throwable other than InterruptedException is thrown out of await() for 
some reason or if there is an exception thrown in the corresponding finally 
block around the await(), for example if waitTime.record(.) throws an 
exception, then the waiters are not removed from the waiters deque.
* On timeout or other exception waiters could be signaled, but are not.  If no 
other buffers are freed then the next waiting thread will also timeout and so 
on.


  was:
In BufferPool.allocate(int size, long maxTimeToBlockMs):
If a Throwable other than InterruptedException is thrown out of await() for 
some reason or if there is an exception thrown in the corresponding finally 
block around the await(), for example if waitTime.record(.) throws an 
exception, then the waiters are not removed from the waiters deque.

The number of available bytes are also not restored when an exception happens.


> There are still cases where producer buffer pool will not remove waiters.
> -
>
> Key: KAFKA-4840
> URL: https://issues.apache.org/jira/browse/KAFKA-4840
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.2.0
>    Reporter: Sean McCauliff
>
> There are several problems dealing with errors in  BufferPool.allocate(int 
> size, long maxTimeToBlockMs):
> * The accumulated number of bytes is not put back into the available pool
> when an exception happens and a thread is waiting for bytes to become
> available.  This will cause the capacity of the buffer pool to decrease over
> time whenever a timeout is hit within this method.
> * If a Throwable other than InterruptedException is thrown out of await() for 
> some reason or if there is an exception thrown in the corresponding finally 
> block around the await(), for example if waitTime.record(.) throws an 
> exception, then the waiters are not removed from the waiters deque.
> * On timeout or other exception waiters could be signaled, but are not.  If 
> no other buffers are freed then the next waiting thread will also timeout and 
> so on.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-8969) Log the partition being made online due to unclean leader election.

2019-10-02 Thread Sean McCauliff (Jira)
Sean McCauliff created KAFKA-8969:
-

 Summary: Log the partition being made online due to unclean leader 
election.
 Key: KAFKA-8969
 URL: https://issues.apache.org/jira/browse/KAFKA-8969
 Project: Kafka
  Issue Type: Bug
Reporter: Sean McCauliff
Assignee: Sean McCauliff


When unclean leader election happens it's difficult to find which partitions 
were affected. Knowledge of the affected partitions is sometimes needed when 
users are doing root cause investigations and want to narrow down the source of 
data loss.  Without logging this information somewhere it's not possible to 
know which partitions were affected by ULE.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-5239) Producer buffer pool allocates memory inside a lock.

2017-05-14 Thread Sean McCauliff (JIRA)
Sean McCauliff created KAFKA-5239:
-

 Summary: Producer buffer pool allocates memory inside a lock.
 Key: KAFKA-5239
 URL: https://issues.apache.org/jira/browse/KAFKA-5239
 Project: Kafka
  Issue Type: Bug
  Components: clients
Reporter: Sean McCauliff


KAFKA-4840 placed the ByteBuffer allocation inside the critical section.  
Previously byte buffer allocation happened outside of the critical section.
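
A sketch of the shape of the concern: do the bookkeeping under the lock, but
call ByteBuffer.allocate() after releasing it so other producer threads are not
blocked behind the JVM allocation. This is not the actual BufferPool code:

import java.nio.ByteBuffer;
import java.util.concurrent.locks.ReentrantLock;

public class AllocateOutsideLockSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private long availableMemory = 32 * 1024 * 1024;

    public ByteBuffer allocate(int size) {
        lock.lock();
        try {
            if (availableMemory < size)
                throw new IllegalStateException("not enough memory for " + size + " bytes");
            availableMemory -= size;          // reserve the bytes while holding the lock
        } finally {
            lock.unlock();
        }
        // The potentially slow JVM allocation happens outside the critical section.
        return ByteBuffer.allocate(size);
    }

    public void deallocate(ByteBuffer buffer) {
        lock.lock();
        try {
            availableMemory += buffer.capacity();
        } finally {
            lock.unlock();
        }
    }
}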



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-10877) Instantiating loggers for every FetchContext causes low request handler idle pool ratio.

2020-12-21 Thread Sean McCauliff (Jira)
Sean McCauliff created KAFKA-10877:
--

 Summary: Instantiating loggers for every FetchContext causes low 
request handler idle pool ratio.
 Key: KAFKA-10877
 URL: https://issues.apache.org/jira/browse/KAFKA-10877
 Project: Kafka
  Issue Type: Bug
Reporter: Sean McCauliff


JDK 11 has removed some classes used by log4j2 to initialize logging contexts.
Now log4j2 uses StackWalker to discover where it has been instantiated.
StackWalker is apparently very expensive.

Kafka has a Logging trait.  Classes that want to log application messages get
access to the methods provided by the trait by mixing it in using "with
Logging".  When this is done on a Scala object (a singleton) this is fine, as
the logging context in the Logging trait is initialized at most once.  When
this is done on a class (e.g. class X extends Logging) the logging context is
potentially created for each instance.  The logging context is needed to
determine whether a log message will be emitted, so if debug("log me") is
called the logging context is still initialized just to determine whether
debug logging is enabled.  Initializing the logging context calls StackWalker.
This can't be avoided even if the log message would never be written to the
log.

IncrementalFetchContext is one such class: it inherits from Logging and
incurs a very high CPU cost, and it does so inside of locks.
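
A Java analogue of the cost pattern described above (the Kafka code in question
is Scala mixing in the Logging trait; the class and method names here are
hypothetical). The no-argument LogManager.getLogger() has to discover its
caller class, which on JDK 9+ is where the StackWalker cost shows up, so doing
that once per request object is what hurts:

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

public class FetchContextLoggingSketch {

    // Costly pattern: a logger resolved per instance, even if debug is disabled.
    static class PerInstanceLogging {
        private final Logger logger = LogManager.getLogger();   // caller-class lookup per instance

        void handle(String partition) {
            logger.debug("handling fetch for {}", partition);
        }
    }

    // Cheaper pattern: resolve the logger once per class (the Scala equivalent
    // is hoisting the logger into a companion object or caching it).
    static class SharedLogging {
        private static final Logger LOGGER = LogManager.getLogger(SharedLogging.class);

        void handle(String partition) {
            LOGGER.debug("handling fetch for {}", partition);
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            new PerInstanceLogging().handle("topic-" + i);  // logger lookup on every construction
            new SharedLogging().handle("topic-" + i);       // logger resolved once
        }
    }
}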

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)