Hi Joel,
I'm looking more closely at the OffsetCommitRequest wire protocol change
you mentioned below, and I cannot figure out how to explicitly construct a
request with the earlier version. Should the api version be different for
requests that do not include it and/or servers that do not support
…re an example?
Gwen
On Sun, Jan 4, 2015 at 11:15 PM, Dana Powers wrote:
> Hi Joel,
>
> I'm looking more closely at the OffsetCommitRequest wire protocol change
> you mentioned below, and I cannot figure out how to explicitly construct a
> request with the earlier version.
…check?
Dana Powers
Rdio, Inc.
dana.pow...@rd.io
rdio.com/people/dpkp/
On Mon, Jan 5, 2015 at 9:49 AM, Gwen Shapira wrote:
> Ah, I see :)
>
> The readFrom function basically tries to read two extra fields if you
> are on version 1:
>
> if (versionId == 1) {
>   groupGenerationId = buffer.getInt
>   consumerId = readShortString(buffer)
> }
>
> …but the writeTo function doesn't make a matching version check, and I don't
> think the new Java client does either. It will send
> the timestamp no matter which version is used.
>
> This looks like a bug and I'd even mark it as blocker for 0.8.2 since
> it may prevent rolling upgrades.
>
> Are you opening the JIRA?
>
> Gwen
>
> On Mon, Jan 5, 2015 at
> > https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/server/KafkaApis.scala#L678-L700
> > and version = 1 the new…
…and I think we should work to tighten that process up.
Dana Powers
Rdio, Inc.
dana.pow...@rd.io
rdio.com/people/dpkp/
On Wed, Jan 14, 2015 at 5:43 PM, Jun Rao wrote:
> Hi, Dana,
>
> Thanks for reporting this. I investigated this a bit more. What you
> observed is the following: a
discussion. Thanks to everyone that contributed!
https://pypi.python.org/pypi/kafka-python
Dana Powers
Rdio, Inc.
dana.pow...@rd.io
rdio.com/people/dpkp/
I think the answer here is that the Kafka protocol includes a broker
metadata api. The client uses the broker host(s) you provide to discover
the full list of brokers in the cluster (and the topics+partitions each
manages/leads). The java client has a similar interface via
metadata.brokers.list / bootstrap.servers…
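A minimal sketch of that bootstrap behavior with kafka-python (host and topic
names are placeholders):

from kafka import KafkaConsumer

# Only two seed brokers are listed; the client sends a MetadataRequest to
# whichever seed answers and discovers the rest of the cluster from the response.
consumer = KafkaConsumer('my-topic',
                         bootstrap_servers=['broker1:9092', 'broker2:9092'])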
Hi Keith,
kafka-python raises FailedPayloadsError on unspecified server failures.
Typically this is caused by a server exception that results in a 0 byte
response. Have you checked your server logs?
-Dana
On Tue, Jul 28, 2015 at 2:01 PM, JIEFU GONG wrote:
> This won't be very helpful as I am
Hi Keith,
you can use the `auto_offset_reset` parameter to kafka-python's
KafkaConsumer. It behaves the same as the java consumer configuration of
the same name. See
http://kafka-python.readthedocs.org/en/latest/apidoc/kafka.consumer.html#kafka.consumer.KafkaConsumer.configure
for more details on…
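For example, a minimal sketch (recent kafka-python releases take
'earliest'/'latest'; older ones used 'smallest'/'largest'):

from kafka import KafkaConsumer

# With no committed offset for the group, start from the earliest available
# message instead of starting at the log end.
consumer = KafkaConsumer('my-topic',
                         bootstrap_servers='localhost:9092',
                         group_id='my-group',
                         auto_offset_reset='earliest')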
print " New message"
> > print " " + message.topic
> > print " " + str(message.partition)
> > print " " + str(message.offset)
> > print " " + str(message.key)
> > print
Hi AL,
kafka deals in blobs, so you generally have to manage serialization /
deserialization at the producer + consumer level. kafka-python's
SimpleProducer and SimpleConsumer classes are fairly naive and operate
exclusively on bytes, so if you use those you will have to serialize before
producing
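A sketch, assuming the newer KafkaProducer/KafkaConsumer API (which accepts
serializer/deserializer callables) rather than the Simple* classes:

import json
from kafka import KafkaConsumer, KafkaProducer

# Serialize to bytes on the way in...
producer = KafkaProducer(
    bootstrap_servers='localhost:9092',
    value_serializer=lambda v: json.dumps(v).encode('utf-8'))
producer.send('my-topic', {'user': 'al', 'clicks': 3})

# ...and deserialize from bytes on the way out.
consumer = KafkaConsumer(
    'my-topic',
    bootstrap_servers='localhost:9092',
    value_deserializer=lambda b: json.loads(b.decode('utf-8')))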
Hi Ben and Marko -- great suggestions re: connection failures and docker.
The specific error here is: LeaderNotAvailableError:
TopicMetadata(topic='topic-test-production', error=5, partitions=[])
That is an error code (5) returned from a MetadataRequest. In this context
it means that the topic di
I don't have much to add on this, but q: what is version 0.8.2.3? I thought
the latest in 0.8 series was 0.8.2.2?
-Dana
On Dec 17, 2015 5:56 PM, "Rajiv Kurian" wrote:
> Yes we are in the process of upgrading to the new producers. But the
> problem seems deeper than a compatibility issue. We have
vers and rebuild.
>
> Thanks
>
> On Fri, Dec 18, 2015 at 2:28 AM, Dana Powers
> wrote:
>
> > Hi Ben and Marko -- great suggestions re: connection failures and docker.
> >
> > The specific error here is: LeaderNotAvailableError:
> > TopicMetadata(topic=
If you don't like messing w/ ZK directly, another alternative is to
manually seek to offset 0 on all relevant topic-partitions (via
OffsetCommitRequest or your favorite client api) and change the
auto-offset-reset policy on your consumer to earliest/smallest. Bonus is
that this should also work for
Hi all,
I've been helping debug an issue filed against kafka-python related to
compatibility w/ Hortonworks 2.3.0.0 kafka release. As I understand it, HDP
is currently based on snapshots of apache/kafka trunk, merged with some
custom patches from HDP itself.
In this case, HDP's 2.3.0.0 kafka release…
y compatibility patches like these with 3rd party developers and have
> tests to ensure that.
>
> Thanks,
> Harsha
> On Wed, Dec 23, 2015, at 04:11 PM, Dana Powers wrote:
> > Hi all,
> >
> > I've been helping debug an issue filed against kafka-python related
Hi David - did you see my response on the earlier thread? Let me know if
this is a different issue.
http://mail-archives.apache.org/mod_mbox/kafka-users/201512.mbox/%3CCAKRFjjVZ7u4vnnMiPzxGc_oYYnnBqNjAGpfkC8GLj6MdLUvq6w%40mail.gmail.com%3E
-Dana
On Dec 27, 2015 3:08 AM, "David Montgomery"
wrote:
Do you have access to the server logs? Any error is likely recorded there
with a stack trace. You also might check what server version you are
connecting to.
-Dana
On Dec 30, 2015 3:49 AM, "Birendra Kumar Singh" wrote:
> I keep getting such warnings intermittently in my application. The
> application…
A few thoughts from a non-expert:
connections are also processed asynchronously in the poll loop. If you are
not enabling any timeout, you may be seeing a few initial iterations spent
on setting up the channel connections. Also you probably need a few loop
iterations to get through an initial metadata…
then ERROR
>
> On Wed, Dec 30, 2015 at 9:51 PM, Dana Powers
> wrote:
>
> > Do you have access to the server logs? Any error is likely recorded there
> > with a stack trace. You also might check what server version you are
> > connecting to.
> >
> > -Dana
>
Very nice!
On Jan 7, 2016 04:41, "Pierre-Yves Ritschard" wrote:
> Hi list,
>
> While the 0.9.0.0 client lib is great to work with, I extracted some of
> the facade code I use internally into a library which smooths some
> aspects of interacting with Kafka from Clojure.
>
> The library provides a
That is expected when a topic is first created. Once the topic and all
partitions are initialized, the leader(s) will be available and you can
produce messages.
Note that if you disable topic auto creation, you should get an
UnknownTopicOrPartitionError, and you would need to create the topic
manually…
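A rough sketch of handling that first-produce race (names and retry numbers
are placeholders, assuming kafka-python's newer KafkaProducer):

import time
from kafka import KafkaProducer
from kafka.errors import LeaderNotAvailableError

producer = KafkaProducer(bootstrap_servers='localhost:9092')
for attempt in range(5):
    try:
        # .get() blocks on the broker response and raises if no leader yet
        producer.send('brand-new-topic', b'hello').get(timeout=10)
        break
    except LeaderNotAvailableError:
        time.sleep(1)  # give the cluster time to elect partition leaders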
Hi Doug,
The differences are fairly subtle. kafka-python is a community-backed
project that aims to be consistent w/ the official java client; pykafka is
sponsored by parse.ly and aims to provide a pythonic interface. Whichever
you go with, I would love to hear your specific feedback on kafka-python…
e no other big difference.
>
>
> Best Regards
>
> > On Jan 9, 2016, at 08:58, Dana Powers wrote:
> >
> > Hi Doug,
> >
> > The differences are fairly subtle. kafka-python is a community-backed
> > project that aims to be consistent w/ the official java client…
Looks like you aren't setting the request client-id, and server is crashing
on it. I'm not sure whether server api is expected to work w/o client-id,
but you can probably fix by sending one. Fwiw, kafka-python sends
'kafka-python' unless user specifies something else.
-Dana
On Jan 11, 2016 8:41 AM
> > python-kafka (oh, I just saw this 0.9x version, hm!) was better at
> > producing than pykafka for us, so we are currently using pykafka for
> > consumption, and python-kafka for production. python-kafka allows you to
> > produce to multiple topics using the same client instance.
Not exactly - there is some documentation in the source code, but I agree
that a wiki on this would be extremely useful.
Can anyone create a wiki page? If so, I'm happy to get something started.
It is really the missing piece for folks writing custom clients / anything
at the api layer.
-Dana
On
is, it would be great to move the protocol docs to the
> >> docs folder of the git repo:
> >>
> >> https://github.com/apache/kafka/tree/trunk/docs
> >>
> >> This way, we can ensure that the protocol docs are updated at the same
> >> time
Sadly HDP 2.3.2 shipped with a broken OffsetCommit api (the 0.8.2-beta
version). You should use the apache releases, or upgrade to HDP 2.3.4.0 or
later.
-Dana
On Tue, Jan 19, 2016 at 12:03 AM, Krzysztof Zarzycki
wrote:
> Hi Kafka users,
> I have an issue with saving Kafka offsets to Zookeeper t
…think? Is Kafka 0.9 completely backward compatible? I.e.
> clients(both producers & consumers) using libraries for 0.8.2 (both
> "kafka-clients" as well as straight "kafka") connecting to it will work
> after upgrade?
>
> Thanks for your answer,
> Krzysztof
version 0.9.5 of kafka-python does not support coordinated consumer groups.
You can get this feature in the master branch on github (
https://github.com/dpkp/kafka-python) using kafka 0.9.0.0 brokers. I expect
to release the updates to pypi soon, but for now you'll have to install
from source.
Oth
Hi Fang, take a look at the docs on KIP-1 for some background info on acks
policy:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1+-+Remove+support+of+request.required.acks
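For reference, after that KIP the producer acks setting is 0, 1, or 'all';
numeric counts like 2 are no longer supported. A one-line kafka-python sketch:

from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers='localhost:9092', acks='all')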
-Dana
On Wed, Jan 20, 2016 at 3:50 PM, Fang Wong wrote:
> We are using kafka 0.8.2.1 and set acks to 2, see the following…
You can find protocol documentation here (including a list of api key #s):
https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
-Dana
On Sun, Jan 31, 2016 at 5:46 PM, Heath Ivie wrote:
> To piggy back , where can I find the api key values?
>
> Sent from Outlook Mobile
The committed offset is actually the next message to consume, not the last
message consumed. So that sounds like expected behavior to me. The consumer
code handles this internally, but if you write code to commit offsets
manually, it can be a gotcha.
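A sketch of the gotcha in kafka-python (topic/group names are placeholders):

from kafka import KafkaConsumer, TopicPartition, OffsetAndMetadata

consumer = KafkaConsumer('my-topic',
                         bootstrap_servers='localhost:9092',
                         group_id='my-group',
                         enable_auto_commit=False)
msg = next(consumer)
tp = TopicPartition(msg.topic, msg.partition)
# Commit the *next* offset to consume, not the offset just processed.
consumer.commit({tp: OffsetAndMetadata(msg.offset + 1, '')})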
-Dana
On Mon, Feb 1, 2016 at 1:35 PM, Adam Kun
Some comments based on your code snippet:
(1) you aren't looping on brokerCount -- you should be decoding broker
metadata for each count
(2) you are missing the topic metadata array count (and loop) -- 4 bytes
(3) topic errorcode is an Int16, so you should be reading 2 bytes, not 4
(4) you are missing…
Hi Heath, a few comments:
(1) you should be looping on brokerCount
(2) you are missing the topic array count (4 bytes), and loop
(3) topic error code is int16, so only 2 bytes not 4
(4) you are missing the partition metadata array count (4 bytes), and loop
(5) you are missing the replicas and isr arrays…
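Pulling those fixes together, a rough python sketch of decoding a
MetadataResponse v0 body (per the protocol guide; illustrative, not
production code):

import struct

def read_string(buf, pos):
    (n,) = struct.unpack_from('>h', buf, pos)  # int16 length prefix
    return buf[pos + 2:pos + 2 + n].decode('utf-8'), pos + 2 + n

def read_int32_array(buf, pos):
    (n,) = struct.unpack_from('>i', buf, pos)
    return list(struct.unpack_from('>%di' % n, buf, pos + 4)), pos + 4 + 4 * n

def decode_metadata_v0(buf):
    pos = 0
    (broker_count,) = struct.unpack_from('>i', buf, pos); pos += 4
    brokers = []
    for _ in range(broker_count):  # (1) loop on brokerCount
        (node_id,) = struct.unpack_from('>i', buf, pos); pos += 4
        host, pos = read_string(buf, pos)
        (port,) = struct.unpack_from('>i', buf, pos); pos += 4
        brokers.append((node_id, host, port))
    (topic_count,) = struct.unpack_from('>i', buf, pos); pos += 4  # (2) topic array count
    topics = []
    for _ in range(topic_count):
        (terr,) = struct.unpack_from('>h', buf, pos); pos += 2  # (3) int16 error code
        name, pos = read_string(buf, pos)
        (part_count,) = struct.unpack_from('>i', buf, pos); pos += 4  # (4) partition array count
        partitions = []
        for _ in range(part_count):
            perr, pid, leader = struct.unpack_from('>hii', buf, pos); pos += 10
            replicas, pos = read_int32_array(buf, pos)  # (5) replicas array...
            isr, pos = read_int32_array(buf, pos)       # ...and isr array
            partitions.append((perr, pid, leader, replicas, isr))
        topics.append((terr, name, partitions))
    return brokers, topics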
I think it is possible to infer a version based on a remote broker's
response to various newer api methods. I wrote some probes in python that
check ListGroups for 0.9, GroupCoordinatorRequest for 0.8.2,
OffsetFetchRequest for 0.8.1, and MetadataRequest for 0.8.0. It seems to
work well enough.
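The probe order looks roughly like the sketch below. try_request is a
hypothetical helper -- shorthand for sending the named request on an open
connection and reporting whether the broker answers it (kafka-python
implements this idea for real in its connection code):

def try_request(conn, api_name):
    # Hypothetical helper: send the named request and return True if the
    # broker responds rather than dropping the connection.
    raise NotImplementedError  # wire-protocol plumbing omitted from this sketch

def guess_broker_version(conn):
    if try_request(conn, 'ListGroups'):        # answered by 0.9+ brokers only
        return '0.9'
    if try_request(conn, 'GroupCoordinator'):  # added in 0.8.2
        return '0.8.2'
    if try_request(conn, 'OffsetFetch'):       # added in 0.8.1
        return '0.8.1'
    return '0.8.0'                             # MetadataRequest works everywhere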
Also…
Hi karthik - I usually address kafka-python specific questions via github.
Can you file an issue at github.com/dpkp/kafka-python and I will follow up
there?
My initial reaction is you should leave group_id=None if you want to
duplicate behavior of the console consumer.
-Dana
Hello,
I am having trouble…
+1 -- passes kafka-python test suite.
-Dana
On Sun, Jun 18, 2017 at 10:49 PM, Magnus Edenhill
wrote:
> +1 (non-binding)
>
> Passes librdkafka integration tests (v0.9.5 and master)
>
>
> 2017-06-19 0:32 GMT+02:00 Ismael Juma :
>
> > Hello Kafka users, developers and client-developers,
> >
> > Th
+1. passed kafka-python integration tests, and manually verified
producer/consumer on both compressed and non-compressed data.
-Dana
On Mon, Oct 23, 2017 at 6:00 PM, Guozhang Wang wrote:
> Hello Kafka users, developers and client-developers,
>
> This is the third candidate for release of Apache Kafka…
The LZ4 implementation "works" but has a framing bug that can make third
party client use difficult. See KAFKA-3160. If you only plan to use the
official Java client then that issue shouldn't be a problem.
-Dana
On Mar 21, 2016 12:26 PM, "Pete Wright" wrote:
>
>
> On 03/17/2016 04:03 PM, Virendr
The MemberAssignment bytes returned in SyncResponse should be the bytes
that your leader sent in its SyncRequest. <<0, 0, 0, 0>> is simply an empty
Bytes array (32 bit length of 0). The broker does not alter those bytes as
far as I know, so despite the protocol doc describing what a
MemberAssignment…
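Illustrated with python's struct module:

import struct

# A kafka protocol 'bytes' field is a 32-bit big-endian length followed by the
# payload, so an empty MemberAssignment is exactly four zero bytes:
assert struct.pack('>i', 0) == b'\x00\x00\x00\x00'

# A non-empty assignment just prefixes the payload with its length:
payload = b'opaque-assignment-bytes'
encoded = struct.pack('>i', len(payload)) + payload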
> …misinterpreting the docs I accidentally drop four zeroes in a bit that the
> server expects as a length marker.
>
> On Sat, Mar 26, 2016 at 6:05 PM, Dana Powers
> wrote:
>
> > The MemberAssignment bytes returned in SyncResponse should be the bytes
> > that your leader
I also found the documentation difficult to parse when it came time to
implement group APIs. I ended up just reading the client source code and
trying api calls until it made sense.
My general description from off the top of my head:
(1) have all consumers submit a shared protocol_name string* and…
Somewhat of an aside, but you might note that the kafka producer is
intended to block during send() as backpressure on memory allocation.
This is admittedly different than blocking on metadata, but it is worth
considering that the premise that send() should *never* block because
it returns a Future
Hi Marcos,
ConsumerMetadata* was renamed to GroupCoordinator* in 0.9 . the api
protocol is unchanged.
However, the new Java clients use non-blocking network channels. It looks
like the example code may reference the deprecated, or
soon-to-be-deprecated, Scala client.
Rather than roll your own mo
The prior discussion explained:
(1) The code you point to blocks for a maximum of max.block.ms, which is
user configurable. It does not block indefinitely with no user control as
you suggest. You are free to configure this to 0 if you like and it will not
block at all. Have you tried this like I suggested…
Not a typo. This happens because the consumer closes the coordinator,
and the coordinator attempts to commit any pending offsets
synchronously in order to avoid duplicate message delivery. The
Coordinator method commitOffsetsSync will retry indefinitely unless a
non-recoverable error is encountered
getting
> stuck on close() call when I really want my system that uses KafkaConsumer to
> exit. So Consumer.close(timeout) is what I was really asking about.
> So, is there a way now to interrupt such block?
>
> Cheers
> Oleg
>
>> On Apr 11, 2016, at 4:08 PM, Dana Powers…
…move a few mountains to work around what most would
> perceive to be a design issue is not the acceptable answer.
> I’ll raise the JIRA
>
> Cheers
> Oleg
>
>> On Apr 11, 2016, at 4:25 PM, Dana Powers wrote:
>>
>> If you wanted to implement a timeout, you'd need to…
This error is thrown by the old scala producer, correct? You might
also consider switching to the new java producer. It handles this a
bit differently by blocking during the send call until the internal
buffers can enqueue a new message. There are a few more configuration
options available as well,
There's also an irc channel on freenode #apache-kafka that hosts some
periodic user discussion.
On Tue, Apr 26, 2016 at 7:17 PM, Harsha wrote:
> We use apache JIRA and mailing lists for any discussion.
> Thanks,
> Harsha
>
> On Tue, Apr 26, 2016, at 06:20 PM, Kanagha wrote:
>> Hi,
>>
>> Is there
It means there was a consumer group rebalance that this consumer missed.
You may be spending too much time in msg processing between poll() calls.
-Dana
0.8 clients manage groups by connecting directly to zookeeper and
implementing shared group management code. There are no broker APIs used.
0.9 clients manage groups using new kafka broker APIs. These clients no
longer connect directly to zookeeper. JoinGroupRequest is an 0.9 api.
For an 0.8ish r
…result of an asynchronous computation"
> >>
> >> executor.submit(new Callable() {
> >>@Override
> >>public RecordMetadata call() throws Exception {
> >>// first make sure the metadata for the topic is available
> >>long waitedO
If you're attempting to run a heterogeneous consumer group (different
clients joining same group) then you must take care that the group
protocols have been implemented the same. kafka-python attempts to mirror
the Java client exactly. I haven't looked at pykafka in a while, but it is
possible that
I'd also add that there is some discussion of the mixed-broker issue
on the KIP-35 wiki page as well:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version#KIP-35-Retrievingprotocolversion-Aclientdeveloperwantstoaddsupportforanewfeature
Of course, since 0.9 (and e
Hi all,
Happy to release kafka-python v1.2.0 to pypi, including support for
new 0.10 features: ApiVersions, and the new v1 message format.
Changelog details are at
http://kafka-python.readthedocs.io/en/1.2.0/changelog.html
Many thanks to all of the great contributions, testing, and bug fixing!
Barry - i believe the error refers to the consumer group "protocol" that is
used to decide which partitions get assigned to which consumers. The way it
works is that each consumer says it wants to join X group and it can
support protocols (1, 2, 3...). The broker looks at all consumers in group
X and…
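A toy illustration of that selection step (the idea only, not broker code):

# Each member advertises the protocols it supports; the broker must pick one
# that every member of the group has in common, else the join fails.
members = {
    'consumer-1': ['range', 'roundrobin'],
    'consumer-2': ['roundrobin'],
}
common = set.intersection(*(set(protos) for protos in members.values()))
if not common:
    raise Exception('InconsistentGroupProtocol')
protocol = common.pop()  # here: 'roundrobin'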
Very nice!
On Wed, Jun 15, 2016 at 6:40 PM, John Dennison wrote:
> My team has published a post comparing python kafka clients. Might be of
> interest to python users.
>
> http://activisiongamescience.github.io/2016/06/15/Kafka-Client-Benchmarking/
Is your test reusing a group name? And if so, are your consumer instances
gracefully leaving? This may cause subsequent 'rebalance' operations to
block until those old consumers check-in or the session timeout happens
(30secs)
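With kafka-python, leaving gracefully is just a matter of closing the
consumer; a sketch:

from kafka import KafkaConsumer

consumer = KafkaConsumer('my-topic', group_id='my-group',
                         bootstrap_servers='localhost:9092')
try:
    for msg in consumer:
        pass  # process msg
finally:
    # close() sends a LeaveGroup so rebalances don't wait out the session timeout
    consumer.close()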
-Dana
On Jun 18, 2016 8:56 PM, "Rohit Sardesai"
wrote:
> I am using t
Yes. The expectation is that brokers are backwards compatible: new broker
releases should work with old clients (to 0.8, but not back to 0.7).
The opposite, backwards compatible clients, is generally not supported: new
clients may not always work with old brokers (except for, *cough* *cough*,
kafka-python…
+1
tested against kafka-python integration test suite = pass.
Aside: as the scope of kafka gets bigger, it may be useful to organize
release notes into functional groups like core, brokers, clients,
kafka-streams, etc. I've found this useful when organizing
kafka-python release notes.
-Dana
On
Are you asking about a Kafka python driver? Or are referring to pyspark?
On Aug 1, 2016 10:03, "Andy Davidson" wrote:
> I am new to python.
>
> I find my self working with several data frames at the same time. I have
> run
> into some driver memory problems and want to make sure I release all
> r
kafka-python by default uses the same partitioning algorithm as the Java
client. If there are bugs, please let me know. I think the issue here is
with the default nodejs partitioner.
-Dana
On Aug 3, 2016 7:03 PM, "Jack Huang" wrote:
I see, thanks for the clarification.
On Tue, Aug 2, 2016 at 10
passed kafka-python integration tests, +1
-Dana
On Fri, Aug 5, 2016 at 9:35 AM, Tom Crayford wrote:
> Heroku has tested this using the same performance testing setup we used to
> evaluate the impact of 0.9 -> 0.10 (see https://engineering.
> heroku.com/blogs/2016-05-27-apache-kafka-010-evaluati
python is generally restricted to a single CPU, and kafka-python will max
out a single CPU well before it maxes a network card. I would recommend
other tools for bulk transfers. Otherwise you may find that partitioning
your data set and running separate python processes for each will increase
the o
Kafka
Producer API, but should this really affect the performance under the
assumption that the Kafka configuration stays as is?
>
>> On 25 Aug 2016, at 18:43, Dana Powers wrote:
>>
>> python is generally restricted to a single CPU, and kafka-python will
>> max out a single…
[+ dev list]
I have not worked on KAFKA-1793 directly, but I believe most of the
work so far has been in removing all zookeeper dependencies from
clients. The two main areas for that are (1) consumer rebalancing, and
(2) administrative commands.
1) Consumer rebalancing APIs were added to the brokers…
I am not aware of any active development. I did some initial work on a
branch on my laptop w/ basic functionality built on kafka-python. I'm
happy to ping you if/when I push to github. I expect that Confluent
may also be preparing something with their python client wrapper
around librdkafka, but I
+1; all kafka-python integration tests pass.
-Dana
On Wed, Oct 12, 2016 at 10:41 AM, Jason Gustafson wrote:
> Hello Kafka users, developers and client-developers,
>
> One more RC for 0.10.1.0. I think we're getting close!
>
> Release plan:
> https://cwiki.apache.org/confluence/display/KAFKA/Rel
+1 -- passes kafka-python integration tests
On Mon, Oct 17, 2016 at 1:28 PM, Jun Rao wrote:
> Thanks for preparing the release. Verified quick start on scala 2.11 binary.
> +1
>
> Jun
>
> On Fri, Oct 14, 2016 at 4:29 PM, Jason Gustafson wrote:
>>
>> Hello Kafka users, developers and client-developers,
-1
On Wed, Oct 26, 2016 at 9:23 AM, Shekar Tippur wrote:
> +1
>
> Thanks
Requires stopping your existing consumers, but otherwise should work:
from kafka import KafkaConsumer, TopicPartition, OffsetAndMetadata

def reset_offsets(group_id, topic, bootstrap_servers):
    consumer = KafkaConsumer(bootstrap_servers=bootstrap_servers,
                             group_id=group_id)
    partitions = [TopicPartition(topic, p)
                  for p in consumer.partitions_for_topic(topic)]
    consumer.assign(partitions)
    # commit offset 0 so the group restarts from the beginning
    consumer.commit({tp: OffsetAndMetadata(0, '') for tp in partitions})
kafka-python, yes.
On May 4, 2017 2:28 AM, "Paul van der Linden" wrote:
Thanks everyone. @Dana is that using the kafka-python library?
On Thu, May 4, 2017 at 4:52 AM, Dana Powers wrote:
> Requires stopping your existing consumers, but otherwise should work:
>
>
Be aware that JMX metrics changed between 0.7 and 0.8. If you use chef,
you might also check out https://github.com/bflad/chef-jmxtrans which has
recipes for both 0.7 and 0.8 kafka metrics -> graphite.
Dana Powers
Rdio, Inc.
dana.pow...@rd.io
rdio.com/people/dpkp/
On Thu, Mar 20, 2014 at 7
I created kafka-clie...@groups.google.com
https://groups.google.com/forum/m/#!forum/kafka-clients
No members and no guidelines yet, but it's a start. Would love to get this
going.
Dana
On Aug 19, 2014 9:03 AM, "Mark Roberts" wrote:
> Did this mailing list ever get created? Was there consensus…
n.com/wiki/cf/display/ENGS/How+to+write+a+0.9+consumer
From here:
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Consumer+Rewrite+Design#Kafka0.9ConsumerRewriteDesign-ConsumerHowTo
Thanks,
Dana Powers
Rdio, Inc.
dana.pow...@rd.io
rdio.com/people/dpkp/
following up on this -- I think the online API docs for OffsetCommitRequest
still incorrectly refer to client-side timestamps:
https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest
Wasn't that removed and now always handled server-side?