Re: Kafka Read Data from All Partition Using Key or Timestamp

2017-05-25 Thread SenthilKumar K
Hello Dev Team, please let me know if there is any option to read data from Kafka (all
partitions) using a timestamp. Also, can we set a custom offset value for
messages?

Cheers,
Senthil

On Wed, May 24, 2017 at 7:33 PM, SenthilKumar K 
wrote:

> Hi All, we have been using Kafka for our use case, which helps in
> delivering real-time raw logs. I have a requirement to fetch data from
> Kafka by using an offset.
>
> DataSet Example :
> {"access_date":"2017-05-24 13:57:45.044","format":"json",
> "start":"1490296463.031"}
> {"access_date":"2017-05-24 13:57:46.044","format":"json",
> "start":"1490296463.031"}
> {"access_date":"2017-05-24 13:57:47.044","format":"json",
> "start":"1490296463.031"}
> {"access_date":"2017-05-24 13:58:02.042","format":"json",
> "start":"1490296463.031"}
>
> Above JSON data will be stored in Kafka..
>
> Key --> access_date in epoch format
> Value --> whole JSON.
>
> Data Access Pattern:
>   1) Get me the last 2 minutes of data?
>   2) Get me records between 2017-05-24 13:57:42:00 and 2017-05-24
> 13:57:44:00?
>
> How to achieve this in Kafka ?
>
> I tried using SimpleConsumer, but it expects a partition, and I'm not sure
> SimpleConsumer would match our requirement...
>
> Appreciate your help!
>
> Cheers,
> Senthil
>


[GitHub] kafka pull request #3123: KAFKA-4935: Deprecate client checksum API and comp...

2017-05-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3123


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4935) Consider disabling record level CRC checks for message format V2

2017-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024335#comment-16024335
 ] 

ASF GitHub Bot commented on KAFKA-4935:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3123


> Consider disabling record level CRC checks for message format V2
> 
>
> Key: KAFKA-4935
> URL: https://issues.apache.org/jira/browse/KAFKA-4935
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Apurva Mehta
>Assignee: Jason Gustafson
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> With the new message format proposed in KIP-98, the record level CRC has been 
> moved to the batch header.
> Because we expose the record-level CRC in `RecordMetadata` and 
> `ConsumerRecord`, we currently compute it eagerly based on the key, value and 
> timestamp even though these methods are rarely used. Ideally, we'd deprecate 
> the relevant methods in `RecordMetadata` and `ConsumerRecord` while making 
> the CRC computation lazy. This seems pretty hard to achieve in the Producer 
> without increasing memory retention, but it may be possible to do in the 
> Consumer.
> An alternative option is to return the batch CRC from the relevant methods.
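(For reference, the client-facing accessors in question are the checksum getters; a minimal sketch,
assuming the current Java client API:

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

// Sketch only: these are the record-level CRC accessors the description refers to,
// and the reason the CRC is currently computed eagerly on both clients.
public class CrcAccessors {
    static long producerSideCrc(RecordMetadata metadata) {
        return metadata.checksum();   // CRC reported for a produced record
    }

    static long consumerSideCrc(ConsumerRecord<byte[], byte[]> record) {
        return record.checksum();     // CRC reported for a consumed record
    }
}
)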



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4935) Consider disabling record level CRC checks for message format V2

2017-05-25 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-4935:
---
   Resolution: Fixed
Fix Version/s: (was: 0.11.0.0)
   0.11.1.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 3123
[https://github.com/apache/kafka/pull/3123]

> Consider disabling record level CRC checks for message format V2
> 
>
> Key: KAFKA-4935
> URL: https://issues.apache.org/jira/browse/KAFKA-4935
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Apurva Mehta
>Assignee: Jason Gustafson
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.1.0
>
>
> With the new message format proposed in KIP-98, the record level CRC has been 
> moved to the batch header.
> Because we expose the record-level CRC in `RecordMetadata` and 
> `ConsumerRecord`, we currently compute it eagerly based on the key, value and 
> timestamp even though these methods are rarely used. Ideally, we'd deprecate 
> the relevant methods in `RecordMetadata` and `ConsumerRecord` while making 
> the CRC computation lazy. This seems pretty hard to achieve in the Producer 
> without increasing memory retention, but it may be possible to do in the 
> Consumer.
> An alternative option is to return the batch CRC from the relevant methods.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5325) Connection Loss during Kafka Kerberos Renewal process

2017-05-25 Thread MuthuKumar (JIRA)
MuthuKumar created KAFKA-5325:
-

 Summary: Connection Loss during Kafka Kerberos Renewal process
 Key: KAFKA-5325
 URL: https://issues.apache.org/jira/browse/KAFKA-5325
 Project: Kafka
  Issue Type: Bug
  Components: producer 
Affects Versions: 0.9.0.0
Reporter: MuthuKumar


During Kerberos ticket renewal, all requests that reach the server in the window 
between the renewal's logout and re-login fail with the error mentioned below.

kafka-clients-0.9.0.0.jar is used on the producer side. Kafka 0.9.0.0 is used at 
the producer end because the server is running 0.10.0.x.

OS: Oracle Linux Server release 6.7

Kerberos Configuration - Producer end
-
KafkaClient {
com.sun.security.auth.module.Krb5LoginModule required
refreshKrb5Config=true
principal="u...@.com"
useKeyTab=true
serviceName="kafka"
keyTab="x.keytab"
client=true;
};

Application Log
---
2017-05-25 02:20:37,515 INF [Login.java:354] Initiating logout for u...@.com
2017-05-25 02:20:37,515 INF [Login.java:365] Initiating re-login for 
u...@.com
2017-05-25 02:20:37,525 INF [SaslChannelBuilder.java:91] Failed to create 
channel due to
org.apache.kafka.common.KafkaException: Failed to configure 
SaslClientAuthenticator
at 
org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.configure(SaslClientAuthenticator.java:94)
at 
org.apache.kafka.common.network.SaslChannelBuilder.buildChannel(SaslChannelBuilder.java:88)
at org.apache.kafka.common.network.Selector.connect(Selector.java:162)
at 
org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:514)
at org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:169)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:180)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.NoSuchElementException: null
at java.util.LinkedList$ListItr.next(LinkedList.java:890)
at javax.security.auth.Subject$SecureSet$1.next(Subject.java:1056)
at 
org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.configure(SaslClientAuthenticator.java:90)
... 7 common frames omitted
2017-05-25 02:20:37,526 ERR [Sender.java:130] Uncaught error in kafka producer 
I/O thread:
org.apache.kafka.common.KafkaException: org.apache.kafka.common.KafkaException: 
Failed to configure SaslClientAuthenticator
at 
org.apache.kafka.common.network.SaslChannelBuilder.buildChannel(SaslChannelBuilder.java:92)
at org.apache.kafka.common.network.Selector.connect(Selector.java:162)
at 
org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:514)
at org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:169)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:180)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.kafka.common.KafkaException: Failed to configure 
SaslClientAuthenticator
at 
org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.configure(SaslClientAuthenticator.java:94)
at 
org.apache.kafka.common.network.SaslChannelBuilder.buildChannel(SaslChannelBuilder.java:88)
... 6 common frames omitted
Caused by: java.util.NoSuchElementException: null
at java.util.LinkedList$ListItr.next(LinkedList.java:890)
at javax.security.auth.Subject$SecureSet$1.next(Subject.java:1056)
at 
org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.configure(SaslClientAuthenticator.java:90)
... 7 common frames omitted
2017-05-25 02:20:37,536 ERR [Sender.java:130] Uncaught error in kafka producer 
I/O thread:
java.lang.NullPointerException: null
2017-05-25 02:20:37,536 ERR [Sender.java:130] Uncaught error in kafka producer 
I/O thread:
java.lang.NullPointerException: null
2017-05-25 02:20:37,536 ERR [Sender.java:130] Uncaught error in kafka producer 
I/O thread:
java.lang.NullPointerException: null




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Kafka Read Data from All Partition Using Key or Timestamp

2017-05-25 Thread Mayuresh Gharat
Hi Senthil,

Kafka does allow searching messages by timestamp after KIP-33:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index#KIP-33-Addatimebasedlogindex-Searchmessagebytimestamp

The new consumer does provide you a way to get offsets by timestamp. You
can use these offsets to seek to that offset and consume from there. So if
you want to consume between a range you can get the start and end offset
based on the timestamps, seek to the start offset and consume and process
the data till you reach the end offset.
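A minimal sketch of that flow with the new consumer (Java; the broker address, topic,
partition, and timestamps below are placeholders, and the lookup would be repeated per
partition):

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class TimeRangeConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "time-range-reader");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        long startTs = System.currentTimeMillis() - 2 * 60 * 1000;   // e.g. "last 2 minutes"
        long endTs = System.currentTimeMillis();

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("my-topic", 0);
            consumer.assign(Collections.singletonList(tp));

            // Translate timestamps to offsets; the result is null if no record
            // has a timestamp at or after the requested one.
            Map<TopicPartition, OffsetAndTimestamp> start =
                    consumer.offsetsForTimes(Collections.singletonMap(tp, startTs));
            Map<TopicPartition, OffsetAndTimestamp> end =
                    consumer.offsetsForTimes(Collections.singletonMap(tp, endTs));

            OffsetAndTimestamp startOffset = start.get(tp);
            if (startOffset == null) return;   // nothing at or after startTs
            long stopAt = end.get(tp) != null
                    ? end.get(tp).offset()
                    : consumer.endOffsets(Collections.singletonList(tp)).get(tp);

            // Seek to the start offset and consume until the end offset is reached.
            consumer.seek(tp, startOffset.offset());
            boolean done = false;
            while (!done) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    if (record.offset() >= stopAt) { done = true; break; }
                    System.out.println(record.offset() + " @ " + record.timestamp());
                }
            }
        }
    }
}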

But these timestamps are either CreateTime(when the message was created and
you will have to specify this when you do the send()) or LogAppendTime(when
the message was appended to the log on the kafka broker) :
https://kafka.apache.org/0101/javadoc/org/apache/kafka/clients/producer/ProducerRecord.html

Kafka does not look at the fields in your data (key/value) when giving the data
back to you. What I mean is that it will not look at the timestamp you specified
in the actual data payload.

Thanks,

Mayuresh

On Thu, May 25, 2017 at 12:43 PM, SenthilKumar K 
wrote:

> Hello Dev Team, Pls let me know if any option to read data from Kafka (all
> partition ) using timestamp . Also can we set custom offset value to
> messages ?
>
> Cheers,
> Senthil
>
> On Wed, May 24, 2017 at 7:33 PM, SenthilKumar K 
> wrote:
>
> > Hi All ,  We have been using Kafka for our Use Case which helps in
> > delivering real time raw logs.. I have a requirement to fetch data from
> > Kafka by using offset ..
> >
> > DataSet Example :
> > {"access_date":"2017-05-24 13:57:45.044","format":"json",
> > "start":"1490296463.031"}
> > {"access_date":"2017-05-24 13:57:46.044","format":"json",
> > "start":"1490296463.031"}
> > {"access_date":"2017-05-24 13:57:47.044","format":"json",
> > "start":"1490296463.031"}
> > {"access_date":"2017-05-24 13:58:02.042","format":"json",
> > "start":"1490296463.031"}
> >
> > Above JSON data will be stored in Kafka..
> >
> > Key --> acces_date in epoch format
> > Value --> whole JSON.
> >
> > Data Access Pattern:
> >   1) Get me last 2 minz data ?
> >2) Get me records between 2017-05-24 13:57:42:00 to 2017-05-24
> > 13:57:44:00 ?
> >
> > How to achieve this in Kafka ?
> >
> > I tried using SimpleConsumer , but it expects partition and not sure
> > SimpleConsumer would match our requirement...
> >
> > Appreciate you help !
> >
> > Cheers,
> > Senthil
> >
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


[jira] [Commented] (KAFKA-5211) KafkaConsumer should not skip a corrupted record after throwing an exception.

2017-05-25 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024372#comment-16024372
 ] 

Eno Thereska commented on KAFKA-5211:
-

Thanks [~becket_qin], [~ijuma], [~hachikuji]. Makes sense that this JIRA 
doesn't need a KIP and that for the future we'll consider a KIP if we change 
things. I'll do a KIP for streams-specific changes (that don't touch the Kafka 
protocol) and we can have some discussion as part of that KIP. Thanks.

> KafkaConsumer should not skip a corrupted record after throwing an exception.
> -
>
> Key: KAFKA-5211
> URL: https://issues.apache.org/jira/browse/KAFKA-5211
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
>  Labels: clients, consumer
> Fix For: 0.11.0.0
>
>
> In 0.10.2, when there is a corrupted record, KafkaConsumer.poll() will throw 
> an exception and block on that corrupted record. In the latest trunk this 
> behavior has changed to skip the corrupted record (which is the old consumer 
> behavior). With KIP-98, skipping corrupted messages would be a little 
> dangerous as the message could be a control message for a transaction. We 
> should fix the issue to let the KafkaConsumer block on the corrupted messages.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4935) Consider disabling record level CRC checks for message format V2

2017-05-25 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-4935:
---
Fix Version/s: 0.11.0.0

> Consider disabling record level CRC checks for message format V2
> 
>
> Key: KAFKA-4935
> URL: https://issues.apache.org/jira/browse/KAFKA-4935
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Apurva Mehta
>Assignee: Jason Gustafson
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0, 0.11.1.0
>
>
> With the new message format proposed in KIP-98, the record level CRC has been 
> moved to the batch header.
> Because we expose the record-level CRC in `RecordMetadata` and 
> `ConsumerRecord`, we currently compute it eagerly based on the key, value and 
> timestamp even though these methods are rarely used. Ideally, we'd deprecate 
> the relevant methods in `RecordMetadata` and `ConsumerRecord` while making 
> the CRC computation lazy. This seems pretty hard to achieve in the Producer 
> without increasing memory retention, but it may be possible to do in the 
> Consumer.
> An alternative option is to return the batch CRC from the relevant methods.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5173) SASL tests failing with Could not find a 'KafkaServer' or 'sasl_plaintext.KafkaServer' entry in the JAAS configuration

2017-05-25 Thread liumei (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024439#comment-16024439
 ] 

liumei commented on KAFKA-5173:
---

When I start the broker with a SASL config, it fails. Is it caused by the same 
issue as KAFKA-5173?

[2017-05-23 18:03:53,722] FATAL Fatal error during KafkaServerStartable 
startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
org.apache.kafka.common.KafkaException: org.apache.kafka.common.KafkaException: 
Jaas configuration not found
 at 
org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:86)
 at 
org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:70)
 at kafka.network.Processor.<init>(SocketServer.scala:406)
 at kafka.network.SocketServer.newProcessor(SocketServer.scala:141)
 at 
kafka.network.SocketServer$$anonfun$startup$1$$anonfun$apply$1.apply$mcVI$sp(SocketServer.scala:94)
 at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
 at kafka.network.SocketServer$$anonfun$startup$1.apply(SocketServer.scala:93)
 at kafka.network.SocketServer$$anonfun$startup$1.apply(SocketServer.scala:89)
 at scala.collection.Iterator$class.foreach(Iterator.scala:893)
 at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
 at scala.collection.MapLike$DefaultValuesIterable.foreach(MapLike.scala:206)
 at kafka.network.SocketServer.startup(SocketServer.scala:89)
 at kafka.server.KafkaServer.startup(KafkaServer.scala:219)
 at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:39)
 at kafka.Kafka$.main(Kafka.scala:67)
 at kafka.Kafka.main(Kafka.scala)
Caused by: org.apache.kafka.common.KafkaException: Jaas configuration not found
 at 
org.apache.kafka.common.security.kerberos.KerberosLogin.getServiceName(KerberosLogin.java:299)
 at 
org.apache.kafka.common.security.kerberos.KerberosLogin.configure(KerberosLogin.java:103)
 at 
org.apache.kafka.common.security.authenticator.LoginManager.<init>(LoginManager.java:45)
 at 
org.apache.kafka.common.security.authenticator.LoginManager.acquireLoginManager(LoginManager.java:68)
 at 
org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:78)
 ... 15 more
Caused by: java.io.IOException: Could not find a 'KafkaServer' entry in this 
configuration.
 at org.apache.kafka.common.security.JaasUtils.jaasConfig(JaasUtils.java:50)
 at 
org.apache.kafka.common.security.kerberos.KerberosLogin.getServiceName(KerberosLogin.java:297)
 ... 19 more
[2017-05-23 18:03:53,724] INFO [Kafka Server 0], shutting down 
(kafka.server.KafkaServer)
-Djava.security.auth.login.config=/home/admin/kafka/0.10.1.0/kafka_2.11-0.10.1.0/kafka_server_jaas.conf
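(The exception above means the JAAS file referenced by java.security.auth.login.config has no
'KafkaServer' section. A minimal example of the kind of broker-side entry that lookup expects;
the keytab path, principal, and realm here are placeholders:

KafkaServer {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="/etc/security/keytabs/kafka.keytab"
    principal="kafka/broker-host@EXAMPLE.COM"
    serviceName="kafka";
};
)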

> SASL tests failing with Could not find a 'KafkaServer' or 
> 'sasl_plaintext.KafkaServer' entry in the JAAS configuration
> --
>
> Key: KAFKA-5173
> URL: https://issues.apache.org/jira/browse/KAFKA-5173
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ismael Juma
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0
>
>
> I've seen this a few times. One example:
> {code}
> java.lang.IllegalArgumentException: Could not find a 'KafkaServer' or 
> 'sasl_plaintext.KafkaServer' entry in the JAAS configuration. System property 
> 'java.security.auth.login.config' is /tmp/kafka8162725028002772063.tmp
>   at 
> org.apache.kafka.common.security.JaasContext.defaultContext(JaasContext.java:131)
>   at 
> org.apache.kafka.common.security.JaasContext.load(JaasContext.java:96)
>   at 
> org.apache.kafka.common.security.JaasContext.load(JaasContext.java:78)
>   at 
> org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:100)
>   at 
> org.apache.kafka.common.network.ChannelBuilders.serverChannelBuilder(ChannelBuilders.java:73)
>   at kafka.network.Processor.<init>(SocketServer.scala:423)
>   at kafka.network.SocketServer.newProcessor(SocketServer.scala:145)
>   at 
> kafka.network.SocketServer$$anonfun$startup$1$$anonfun$apply$1.apply$mcVI$sp(SocketServer.scala:96)
>   at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
>   at 
> kafka.network.SocketServer$$anonfun$startup$1.apply(SocketServer.scala:95)
>   at 
> kafka.network.SocketServer$$anonfun$startup$1.apply(SocketServer.scala:90)
>   at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at kafka.network.SocketServer.startup(SocketServer.scala:90)
>   at kafka.server.KafkaServer.startup(KafkaServer.scala:218)
>   at kafka.utils.TestUtils$.createServer(TestUtils.scala:126)
>   at 
> kafka.integration.BaseTopicMetadataTest.setUp(BaseTopicMetadataTest.scala:51)
>   at 
> kafk

Jenkins build is back to normal : kafka-trunk-jdk8 #1590

2017-05-25 Thread Apache Jenkins Server
See 




[DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-25 Thread Eno Thereska
Hi there,

I’ve added a KIP on improving exception handling in streams:
KIP-161: streams record processing exception handlers. 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-161%3A+streams+record+processing+exception+handlers
 


Discussion and feedback is welcome, thank you.
Eno

Re: Kafka Read Data from All Partition Using Key or Timestamp

2017-05-25 Thread SenthilKumar K
Thanks a lot Mayuresh. I will look into SearchMessageByTimestamp feature in
Kafka ..

Cheers,
Senthil

On Thu, May 25, 2017 at 1:12 PM, Mayuresh Gharat  wrote:

> Hi Senthil,
>
> Kafka does allow search message by timestamp after KIP-33 :
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 33+-+Add+a+time+based+log+index#KIP-33-Addatimebasedlogindex-
> Searchmessagebytimestamp
>
> The new consumer does provide you a way to get offsets by timestamp. You
> can use these offsets to seek to that offset and consume from there. So if
> you want to consume between a range you can get the start and end offset
> based on the timestamps, seek to the start offset and consume and process
> the data till you reach the end offset.
>
> But these timestamps are either CreateTime(when the message was created
> and you will have to specify this when you do the send()) or
> LogAppendTime(when the message was appended to the log on the kafka broker)
> : https://kafka.apache.org/0101/javadoc/org/apache/kafka/clients/producer/
> ProducerRecord.html
>
> Kafka does not look at the fields in your data (key/value) for giving back
> you the data. What I meant was it will not look at the timestamp specified
> by you in the actual data payload.
>
> Thanks,
>
> Mayuresh
>
> On Thu, May 25, 2017 at 12:43 PM, SenthilKumar K 
> wrote:
>
>> Hello Dev Team, Pls let me know if any option to read data from Kafka (all
>> partition ) using timestamp . Also can we set custom offset value to
>> messages ?
>>
>> Cheers,
>> Senthil
>>
>> On Wed, May 24, 2017 at 7:33 PM, SenthilKumar K 
>> wrote:
>>
>> > Hi All ,  We have been using Kafka for our Use Case which helps in
>> > delivering real time raw logs.. I have a requirement to fetch data from
>> > Kafka by using offset ..
>> >
>> > DataSet Example :
>> > {"access_date":"2017-05-24 13:57:45.044","format":"json",
>> > "start":"1490296463.031"}
>> > {"access_date":"2017-05-24 13:57:46.044","format":"json",
>> > "start":"1490296463.031"}
>> > {"access_date":"2017-05-24 13:57:47.044","format":"json",
>> > "start":"1490296463.031"}
>> > {"access_date":"2017-05-24 13:58:02.042","format":"json",
>> > "start":"1490296463.031"}
>> >
>> > Above JSON data will be stored in Kafka..
>> >
>> > Key --> acces_date in epoch format
>> > Value --> whole JSON.
>> >
>> > Data Access Pattern:
>> >   1) Get me last 2 minz data ?
>> >2) Get me records between 2017-05-24 13:57:42:00 to 2017-05-24
>> > 13:57:44:00 ?
>> >
>> > How to achieve this in Kafka ?
>> >
>> > I tried using SimpleConsumer , but it expects partition and not sure
>> > SimpleConsumer would match our requirement...
>> >
>> > Appreciate you help !
>> >
>> > Cheers,
>> > Senthil
>> >
>>
>
>
>
> --
> -Regards,
> Mayuresh R. Gharat
> (862) 250-7125
>


[GitHub] kafka pull request #3139: MINOR: fix flakiness in testDeleteAcls

2017-05-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3139


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5263) kafka-clients consume 100% CPU with manual partition assignment when network connection is lost

2017-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024492#comment-16024492
 ] 

ASF GitHub Bot commented on KAFKA-5263:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3124


> kafka-clients consume 100% CPU with manual partition assignment when network 
> connection is lost
> ---
>
> Key: KAFKA-5263
> URL: https://issues.apache.org/jira/browse/KAFKA-5263
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.1.0, 0.10.1.1, 0.10.2.0, 0.10.2.1
>Reporter: Konstantin Smirnov
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0, 0.11.1.0
>
> Attachments: cpu_consuming.log, cpu_consuming_profile.csv
>
>
> Noticed that loss of the connection to the Kafka broker leads kafka-clients to 
> consume 100% CPU. The bug only appears when manual partition assignment is 
> used. It has been present since version 0.10.1.0. The bug is quite similar to 
> KAFKA-1642.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3124: KAFKA-5263: Avoid tight polling loop in consumer w...

2017-05-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3124


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-5263) kafka-clients consume 100% CPU with manual partition assignment when network connection is lost

2017-05-25 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram resolved KAFKA-5263.
---
   Resolution: Fixed
Fix Version/s: 0.11.1.0
   0.11.0.0

Issue resolved by pull request 3124
[https://github.com/apache/kafka/pull/3124]

> kafka-clients consume 100% CPU with manual partition assignment when network 
> connection is lost
> ---
>
> Key: KAFKA-5263
> URL: https://issues.apache.org/jira/browse/KAFKA-5263
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.1.0, 0.10.1.1, 0.10.2.0, 0.10.2.1
>Reporter: Konstantin Smirnov
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0, 0.11.1.0
>
> Attachments: cpu_consuming.log, cpu_consuming_profile.csv
>
>
> Noticed that loss of the connection to the Kafka broker leads kafka-clients to 
> consume 100% CPU. The bug only appears when manual partition assignment is 
> used. It has been present since version 0.10.1.0. The bug is quite similar to 
> KAFKA-1642.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5263) kafka-clients consume 100% CPU with manual partition assignment when network connection is lost

2017-05-25 Thread Konstantin Smirnov (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024593#comment-16024593
 ] 

Konstantin Smirnov commented on KAFKA-5263:
---

Thanks a lot for your fix!

> kafka-clients consume 100% CPU with manual partition assignment when network 
> connection is lost
> ---
>
> Key: KAFKA-5263
> URL: https://issues.apache.org/jira/browse/KAFKA-5263
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.1.0, 0.10.1.1, 0.10.2.0, 0.10.2.1
>Reporter: Konstantin Smirnov
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0, 0.11.1.0
>
> Attachments: cpu_consuming.log, cpu_consuming_profile.csv
>
>
> Noticed that loss of the connection to the Kafka broker leads kafka-clients to 
> consume 100% CPU. The bug only appears when manual partition assignment is 
> used. It has been present since version 0.10.1.0. The bug is quite similar to 
> KAFKA-1642.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5319) Add a tool to make cluster replica and leader balance

2017-05-25 Thread Ma Tianchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ma Tianchi updated KAFKA-5319:
--
Attachment: (was: ArithmeticDescription.png)

> Add a tool to make cluster replica and leader balance
> -
>
> Key: KAFKA-5319
> URL: https://issues.apache.org/jira/browse/KAFKA-5319
> Project: Kafka
>  Issue Type: New Feature
>  Components: admin
>Affects Versions: 0.10.2.1
>Reporter: Ma Tianchi
>
> When a new broker is added to the cluster, it holds no topics. When we use the 
> console command to create a topic without 'replicaAssignmentOpt', Kafka uses 
> AdminUtils.assignReplicasToBrokers to compute the replica assignment. Even if 
> that assignment is balanced at creation time, as more and more brokers are added 
> to the cluster the replica balance degrades. We can also use 
> 'kafka-reassign-partitions.sh' to rebalance, but the large number of topics makes 
> that difficult. In addition, at topic creation time Kafka chooses a preferred 
> replica leader, placed at the first position of the AR, to balance leadership; 
> that balance is also destroyed when the cluster changes. Using 
> 'kafka-reassign-partitions.sh' for partition reassignment can likewise destroy 
> the leader balance, because the user can change the AR of a partition. The 
> cluster may then be unbalanced even though Kafka considers leader balance fine 
> as long as every leader is the first replica in its AR.
> So we created a tool that balances the number of replicas and the number of 
> leaders on every broker. It uses an algorithm to compute a balanced replica and 
> leader reassignment, then uses ReassignPartitionsCommand to actually rebalance 
> the cluster.
> It can be used to rebalance when brokers are added or when the cluster is 
> otherwise unbalanced. It does not handle moving replicas from a dead broker to a 
> live broker; it only balances replicas and leaders across live brokers.
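For context, the existing kafka-reassign-partitions.sh flow that the proposed tool builds on
looks roughly like this (a sketch; the ZooKeeper address, broker IDs, topic and file names are
placeholders):

# topics.json lists the topics to rebalance, e.g.
# {"version": 1, "topics": [{"topic": "my-topic"}]}
bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
  --topics-to-move-json-file topics.json --broker-list "1,2,3" --generate
# save the proposed assignment to plan.json, then apply and verify it
bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
  --reassignment-json-file plan.json --execute
bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
  --reassignment-json-file plan.json --verify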



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5319) Add a tool to make cluster replica and leader balance

2017-05-25 Thread Ma Tianchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ma Tianchi updated KAFKA-5319:
--
Attachment: AlgorithmDescription.png

> Add a tool to make cluster replica and leader balance
> -
>
> Key: KAFKA-5319
> URL: https://issues.apache.org/jira/browse/KAFKA-5319
> Project: Kafka
>  Issue Type: New Feature
>  Components: admin
>Affects Versions: 0.10.2.1
>Reporter: Ma Tianchi
> Attachments: AlgorithmDescription.png
>
>
> When a new broker is added to the cluster, it holds no topics. When we use the 
> console command to create a topic without 'replicaAssignmentOpt', Kafka uses 
> AdminUtils.assignReplicasToBrokers to compute the replica assignment. Even if 
> that assignment is balanced at creation time, as more and more brokers are added 
> to the cluster the replica balance degrades. We can also use 
> 'kafka-reassign-partitions.sh' to rebalance, but the large number of topics makes 
> that difficult. In addition, at topic creation time Kafka chooses a preferred 
> replica leader, placed at the first position of the AR, to balance leadership; 
> that balance is also destroyed when the cluster changes. Using 
> 'kafka-reassign-partitions.sh' for partition reassignment can likewise destroy 
> the leader balance, because the user can change the AR of a partition. The 
> cluster may then be unbalanced even though Kafka considers leader balance fine 
> as long as every leader is the first replica in its AR.
> So we created a tool that balances the number of replicas and the number of 
> leaders on every broker. It uses an algorithm to compute a balanced replica and 
> leader reassignment, then uses ReassignPartitionsCommand to actually rebalance 
> the cluster.
> It can be used to rebalance when brokers are added or when the cluster is 
> otherwise unbalanced. It does not handle moving replicas from a dead broker to a 
> live broker; it only balances replicas and leaders across live brokers.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5326) Log offset index resize cause empty index file

2017-05-25 Thread Ma Tianchi (JIRA)
Ma Tianchi created KAFKA-5326:
-

 Summary: Log offset index  resize cause empty  index file
 Key: KAFKA-5326
 URL: https://issues.apache.org/jira/browse/KAFKA-5326
 Project: Kafka
  Issue Type: Bug
  Components: core, log
Affects Versions: 0.10.2.1
Reporter: Ma Tianchi


After a log index file is lost or removed while the Kafka server is running, a 
subsequent call to OffsetIndex.resize(newSize: Int) creates a new index file with 
no values in it, even though its size is the same as the old one.
The resize does the following:
val raf = new RandomAccessFile(file, "rw")
At this point the file does not exist, so a new file is created.
Then at:
mmap = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, roundedNewSize)
the mmap is created from the new file, which is empty.
And then:
mmap.position(position)
This produces a new index file whose size matches the old index file, but there 
is nothing in it.
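The same behaviour can be illustrated with plain Java NIO (the path, size, and saved position
below are hypothetical): opening a missing file in "rw" mode silently recreates it, and mapping
it to the old size yields an all-zero mapping.

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class IndexResizeIllustration {
    public static void main(String[] args) throws IOException {
        File file = new File("/tmp/00000000000000000000.index");   // hypothetical index path
        // If the index file was deleted while the broker is running, opening it in
        // "rw" mode recreates it as a zero-length file.
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            int roundedNewSize = 10 * 1024 * 1024;                  // stands in for the resize target
            // Mapping beyond the current length extends the file with zero bytes, so the
            // mapping has the requested size but contains none of the old entries.
            MappedByteBuffer mmap = raf.getChannel()
                    .map(FileChannel.MapMode.READ_WRITE, 0, roundedNewSize);
            // Restoring the old write position then points into an all-zero region: the
            // index "looks" the same size as before but is effectively empty.
            mmap.position(1024);                                    // stands in for the saved position
            System.out.println("mapped size = " + mmap.capacity() + ", position = " + mmap.position());
        }
    }
}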



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3143: to resolved the KAFKA-5326

2017-05-25 Thread ZanderXu
GitHub user ZanderXu opened a pull request:

https://github.com/apache/kafka/pull/3143

to resolved the KAFKA-5326

This modify to resolved the 
[KAFKA-5326](https://issues.apache.org/jira/browse/KAFKA-5326).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ZanderXu/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3143.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3143


commit 0be6d5b1397443f2d6bb953ecd1cd13e31abd4fd
Author: xuzq 
Date:   2017-05-25T13:26:38Z

to resolved the KAFKA-5326




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5326) Log offset index resize cause empty index file

2017-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024728#comment-16024728
 ] 

ASF GitHub Bot commented on KAFKA-5326:
---

GitHub user ZanderXu opened a pull request:

https://github.com/apache/kafka/pull/3143

to resolved the KAFKA-5326

This modify to resolved the 
[KAFKA-5326](https://issues.apache.org/jira/browse/KAFKA-5326).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ZanderXu/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3143.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3143


commit 0be6d5b1397443f2d6bb953ecd1cd13e31abd4fd
Author: xuzq 
Date:   2017-05-25T13:26:38Z

to resolved the KAFKA-5326




> Log offset index  resize cause empty  index file
> 
>
> Key: KAFKA-5326
> URL: https://issues.apache.org/jira/browse/KAFKA-5326
> Project: Kafka
>  Issue Type: Bug
>  Components: core, log
>Affects Versions: 0.10.2.1
>Reporter: Ma Tianchi
>
> After a log index file lost or be removed when kafka server is running,then 
> some one calls OffsetIndex.resize(newSize: Int) .It will cause a new index 
> file be created but there is no values in it,even though the size is same 
> with old one.
> It will do something as below. 
> val raf = new RandomAccessFile(file, "rw")
> This time file does not exit.It will create a new file.
> Then at:
> mmap = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, roundedNewSize)
> The mmap is created from new file which is empty.
> and then:
> mmap.position(position)
> This make the new index file which size is same with old index file,but there 
> is nothing in it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Kafka Read Data from All Partition Using Key or Timestamp

2017-05-25 Thread SenthilKumar K
I did an experiment on searching messages using timestamps ..

Step 1: Used Producer with Create Time ( CT )
Step 2 : Verify whether it reflects in Kafka or not
  .index  .log
  .timeindex
These three files are on disk, so the time index seems to be working.

Step 3: Let's look into data
offset: 121 position: 149556 *CreateTime*: 1495718896912 isvalid:
true payloadsize: 1194 magic: 1 compresscodec: NONE crc: 1053048980
keysize: 8

  Looks good ..
Step 4 :  Check .timeindex file .
  timestamp: 1495718846912 offset: 116
  timestamp: 1495718886912 offset: 120
  timestamp: 1495718926912 offset: 124
  timestamp: 1495718966912 offset: 128

So are we all set for querying data using timestamps?

Kafka version : kafka_2.11-0.10.2.1

Here is the code i'm using to search query -->
https://gist.github.com/senthilec566/bc8ed1dfcf493f0bb5c473c50854dff9

requestInfo.put(topicAndPartition, new PartitionOffsetRequestInfo(queryTime,
1));
If I pass my own timestamp, I always get zero results.
The same question was asked here too:
https://stackoverflow.com/questions/31917134/how-to-use-unix-timestamp-to-get-offset-using-simpleconsumer-api


I could also see the errors below when dumping the index file:

*Found timestamp mismatch* in
:/home/user/kafka-logs/topic-0/.timeindex

  Index timestamp: 0, log timestamp: 1495717686913

*Found out of order timestamp* in
:/home/user/kafka-logs/topic-0/.timeindex

  Index timestamp: 0, Previously indexed timestamp: 1495719406912

Not sure what is missing here :-( ... Please advise!


Cheers,
Senthil

On Thu, May 25, 2017 at 3:36 PM, SenthilKumar K 
wrote:

> Thanks a lot Mayuresh. I will look into SearchMessageByTimestamp feature
> in Kafka ..
>
> Cheers,
> Senthil
>
> On Thu, May 25, 2017 at 1:12 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
>
>> Hi Senthil,
>>
>> Kafka does allow search message by timestamp after KIP-33 :
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+
>> Add+a+time+based+log+index#KIP-33-Addatimebasedlogindex-S
>> earchmessagebytimestamp
>>
>> The new consumer does provide you a way to get offsets by timestamp. You
>> can use these offsets to seek to that offset and consume from there. So if
>> you want to consume between a range you can get the start and end offset
>> based on the timestamps, seek to the start offset and consume and process
>> the data till you reach the end offset.
>>
>> But these timestamps are either CreateTime(when the message was created
>> and you will have to specify this when you do the send()) or
>> LogAppendTime(when the message was appended to the log on the kafka broker)
>> : https://kafka.apache.org/0101/javadoc/org/apache/kafka/clien
>> ts/producer/ProducerRecord.html
>>
>> Kafka does not look at the fields in your data (key/value) for giving
>> back you the data. What I meant was it will not look at the timestamp
>> specified by you in the actual data payload.
>>
>> Thanks,
>>
>> Mayuresh
>>
>> On Thu, May 25, 2017 at 12:43 PM, SenthilKumar K 
>> wrote:
>>
>>> Hello Dev Team, Pls let me know if any option to read data from Kafka
>>> (all
>>> partition ) using timestamp . Also can we set custom offset value to
>>> messages ?
>>>
>>> Cheers,
>>> Senthil
>>>
>>> On Wed, May 24, 2017 at 7:33 PM, SenthilKumar K 
>>> wrote:
>>>
>>> > Hi All ,  We have been using Kafka for our Use Case which helps in
>>> > delivering real time raw logs.. I have a requirement to fetch data from
>>> > Kafka by using offset ..
>>> >
>>> > DataSet Example :
>>> > {"access_date":"2017-05-24 13:57:45.044","format":"json",
>>> > "start":"1490296463.031"}
>>> > {"access_date":"2017-05-24 13:57:46.044","format":"json",
>>> > "start":"1490296463.031"}
>>> > {"access_date":"2017-05-24 13:57:47.044","format":"json",
>>> > "start":"1490296463.031"}
>>> > {"access_date":"2017-05-24 13:58:02.042","format":"json",
>>> > "start":"1490296463.031"}
>>> >
>>> > Above JSON data will be stored in Kafka..
>>> >
>>> > Key --> acces_date in epoch format
>>> > Value --> whole JSON.
>>> >
>>> > Data Access Pattern:
>>> >   1) Get me last 2 minz data ?
>>> >2) Get me records between 2017-05-24 13:57:42:00 to 2017-05-24
>>> > 13:57:44:00 ?
>>> >
>>> > How to achieve this in Kafka ?
>>> >
>>> > I tried using SimpleConsumer , but it expects partition and not sure
>>> > SimpleConsumer would match our requirement...
>>> >
>>> > Appreciate you help !
>>> >
>>> > Cheers,
>>> > Senthil
>>> >
>>>
>>
>>
>>
>> --
>> -Regards,
>> Mayuresh R. Gharat
>> (862) 250-7125
>>
>
>


GlobalKTable limitations

2017-05-25 Thread Thomas Becker
We need to do a series of joins against a KTable that we can't co-
partition with the stream, so we're looking at GlobalKTable.  But the
topic backing the table is not ideally keyed for the sort of lookups
this particular processor needs to do. Unfortunately, GlobalKTable is
very limited in that you can only build one with the exact keys/values
from the backing topic. I'd like to be able to perform various
transformations on the topic before materializing the table.  I'd
envision it looking something like the following:

builder.globalTable(keySerde, valueSerde, topicName)
.filter((k, v) -> k.isFoo())
.map((k, v) -> new KeyValue<>(k.getBar(), v.getBaz()))
.build(tableKeySerde, tableValueSerde, storeName);

Is this something that has been considered or that others would find
useful?
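One workaround available today, sketched under the assumption of the 0.10.2-era Streams DSL
(exact overloads may differ by version, and the filter/map logic below is only a placeholder for
the transformations described above), is to write the transformed records to a derived topic and
build the GlobalKTable from that topic:

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStreamBuilder;

import java.util.Properties;

public class DerivedGlobalTable {
    public static void main(String[] args) {
        KStreamBuilder builder = new KStreamBuilder();

        // Re-key/transform the backing topic into a derived topic
        // (the derived topic must exist or auto-creation must be enabled).
        builder.stream(Serdes.String(), Serdes.String(), "source-topic")
               .filter((k, v) -> k != null && k.startsWith("foo"))   // stand-in for k.isFoo()
               .map((k, v) -> new KeyValue<>(k + "-bar", v))          // stand-in for the desired re-keying
               .to(Serdes.String(), Serdes.String(), "source-topic-derived");

        // Materialize the GlobalKTable from the derived topic instead of the original one.
        GlobalKTable<String, String> table =
                builder.globalTable(Serdes.String(), Serdes.String(), "source-topic-derived", "derived-store");

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "derived-global-table-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        new KafkaStreams(builder, props).start();
    }
}

The cost is an extra topic and a second pass over the data, which is essentially what a
first-class transformation API on GlobalKTable would avoid.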

--


Tommy Becker

Senior Software Engineer

O +1 919.460.4747

tivo.com




This email and any attachments may contain confidential and privileged material 
for the sole use of the intended recipient. Any review, copying, or 
distribution of this email (or any attachments) by others is prohibited. If you 
are not the intended recipient, please contact the sender immediately and 
permanently delete this email and any attachments. No employee or agent of TiVo 
Inc. is authorized to conclude any binding agreement on behalf of TiVo Inc. by 
email. Binding agreements with TiVo Inc. may only be made by a signed written 
agreement.


[GitHub] kafka pull request #3143: KAFKA-5326

2017-05-25 Thread ZanderXu
Github user ZanderXu closed the pull request at:

https://github.com/apache/kafka/pull/3143


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3143: KAFKA-5326

2017-05-25 Thread ZanderXu
GitHub user ZanderXu reopened a pull request:

https://github.com/apache/kafka/pull/3143

KAFKA-5326

This modify to resolved the 
[KAFKA-5326](https://issues.apache.org/jira/browse/KAFKA-5326).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ZanderXu/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3143.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3143


commit 0be6d5b1397443f2d6bb953ecd1cd13e31abd4fd
Author: xuzq 
Date:   2017-05-25T13:26:38Z

to resolved the KAFKA-5326




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5326) Log offset index resize cause empty index file

2017-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024752#comment-16024752
 ] 

ASF GitHub Bot commented on KAFKA-5326:
---

Github user ZanderXu closed the pull request at:

https://github.com/apache/kafka/pull/3143


> Log offset index  resize cause empty  index file
> 
>
> Key: KAFKA-5326
> URL: https://issues.apache.org/jira/browse/KAFKA-5326
> Project: Kafka
>  Issue Type: Bug
>  Components: core, log
>Affects Versions: 0.10.2.1
>Reporter: Ma Tianchi
>
> After a log index file lost or be removed when kafka server is running,then 
> some one calls OffsetIndex.resize(newSize: Int) .It will cause a new index 
> file be created but there is no values in it,even though the size is same 
> with old one.
> It will do something as below. 
> val raf = new RandomAccessFile(file, "rw")
> This time file does not exit.It will create a new file.
> Then at:
> mmap = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, roundedNewSize)
> The mmap is created from new file which is empty.
> and then:
> mmap.position(position)
> This make the new index file which size is same with old index file,but there 
> is nothing in it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3143: KAFKA-5326

2017-05-25 Thread ZanderXu
Github user ZanderXu closed the pull request at:

https://github.com/apache/kafka/pull/3143


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5326) Log offset index resize cause empty index file

2017-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024755#comment-16024755
 ] 

ASF GitHub Bot commented on KAFKA-5326:
---

Github user ZanderXu closed the pull request at:

https://github.com/apache/kafka/pull/3143


> Log offset index  resize cause empty  index file
> 
>
> Key: KAFKA-5326
> URL: https://issues.apache.org/jira/browse/KAFKA-5326
> Project: Kafka
>  Issue Type: Bug
>  Components: core, log
>Affects Versions: 0.10.2.1
>Reporter: Ma Tianchi
>
> After a log index file lost or be removed when kafka server is running,then 
> some one calls OffsetIndex.resize(newSize: Int) .It will cause a new index 
> file be created but there is no values in it,even though the size is same 
> with old one.
> It will do something as below. 
> val raf = new RandomAccessFile(file, "rw")
> This time file does not exit.It will create a new file.
> Then at:
> mmap = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, roundedNewSize)
> The mmap is created from new file which is empty.
> and then:
> mmap.position(position)
> This make the new index file which size is same with old index file,but there 
> is nothing in it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5326) Log offset index resize cause empty index file

2017-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024753#comment-16024753
 ] 

ASF GitHub Bot commented on KAFKA-5326:
---

GitHub user ZanderXu reopened a pull request:

https://github.com/apache/kafka/pull/3143

KAFKA-5326

This modify to resolved the 
[KAFKA-5326](https://issues.apache.org/jira/browse/KAFKA-5326).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ZanderXu/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3143.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3143


commit 0be6d5b1397443f2d6bb953ecd1cd13e31abd4fd
Author: xuzq 
Date:   2017-05-25T13:26:38Z

to resolved the KAFKA-5326




> Log offset index  resize cause empty  index file
> 
>
> Key: KAFKA-5326
> URL: https://issues.apache.org/jira/browse/KAFKA-5326
> Project: Kafka
>  Issue Type: Bug
>  Components: core, log
>Affects Versions: 0.10.2.1
>Reporter: Ma Tianchi
>
> After a log index file lost or be removed when kafka server is running,then 
> some one calls OffsetIndex.resize(newSize: Int) .It will cause a new index 
> file be created but there is no values in it,even though the size is same 
> with old one.
> It will do something as below. 
> val raf = new RandomAccessFile(file, "rw")
> This time file does not exit.It will create a new file.
> Then at:
> mmap = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, roundedNewSize)
> The mmap is created from new file which is empty.
> and then:
> mmap.position(position)
> This make the new index file which size is same with old index file,but there 
> is nothing in it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3143: KAFKA-5326

2017-05-25 Thread ZanderXu
GitHub user ZanderXu reopened a pull request:

https://github.com/apache/kafka/pull/3143

KAFKA-5326

This modify to resolved the 
[KAFKA-5326](https://issues.apache.org/jira/browse/KAFKA-5326).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ZanderXu/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3143.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3143


commit 0be6d5b1397443f2d6bb953ecd1cd13e31abd4fd
Author: xuzq 
Date:   2017-05-25T13:26:38Z

to resolved the KAFKA-5326




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5326) Log offset index resize cause empty index file

2017-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024759#comment-16024759
 ] 

ASF GitHub Bot commented on KAFKA-5326:
---

GitHub user ZanderXu reopened a pull request:

https://github.com/apache/kafka/pull/3143

KAFKA-5326

This modify to resolved the 
[KAFKA-5326](https://issues.apache.org/jira/browse/KAFKA-5326).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ZanderXu/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3143.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3143


commit 0be6d5b1397443f2d6bb953ecd1cd13e31abd4fd
Author: xuzq 
Date:   2017-05-25T13:26:38Z

to resolved the KAFKA-5326




> Log offset index  resize cause empty  index file
> 
>
> Key: KAFKA-5326
> URL: https://issues.apache.org/jira/browse/KAFKA-5326
> Project: Kafka
>  Issue Type: Bug
>  Components: core, log
>Affects Versions: 0.10.2.1
>Reporter: Ma Tianchi
>
> After a log index file lost or be removed when kafka server is running,then 
> some one calls OffsetIndex.resize(newSize: Int) .It will cause a new index 
> file be created but there is no values in it,even though the size is same 
> with old one.
> It will do something as below. 
> val raf = new RandomAccessFile(file, "rw")
> This time file does not exit.It will create a new file.
> Then at:
> mmap = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, roundedNewSize)
> The mmap is created from new file which is empty.
> and then:
> mmap.position(position)
> This make the new index file which size is same with old index file,but there 
> is nothing in it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5326) Log offset index resize cause empty index file

2017-05-25 Thread xuzq (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024835#comment-16024835
 ] 

xuzq commented on KAFKA-5326:
-

While the Kafka server is running, if someone manually removes the index file from 
disk and the function AbstractIndex.resize(newSize: Int) is then triggered, it will 
create a new ".index" file whose size is "roundedNewSize" but whose content is 
empty.

> Log offset index  resize cause empty  index file
> 
>
> Key: KAFKA-5326
> URL: https://issues.apache.org/jira/browse/KAFKA-5326
> Project: Kafka
>  Issue Type: Bug
>  Components: core, log
>Affects Versions: 0.10.2.1
>Reporter: Ma Tianchi
>
> After a log index file lost or be removed when kafka server is running,then 
> some one calls OffsetIndex.resize(newSize: Int) .It will cause a new index 
> file be created but there is no values in it,even though the size is same 
> with old one.
> It will do something as below. 
> val raf = new RandomAccessFile(file, "rw")
> This time file does not exit.It will create a new file.
> Then at:
> mmap = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, roundedNewSize)
> The mmap is created from new file which is empty.
> and then:
> mmap.position(position)
> This make the new index file which size is same with old index file,but there 
> is nothing in it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-5326) Log offset index resize cause empty index file

2017-05-25 Thread xuzq (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024835#comment-16024835
 ] 

xuzq edited comment on KAFKA-5326 at 5/25/17 3:04 PM:
--

While the Kafka server is running, if someone manually removes the index file from 
disk and the function AbstractIndex.resize(newSize: Int) is then triggered, it will 
create a new ".index" file whose size is "roundedNewSize" but whose content is 
empty.

I think that if the ".index" file does not exist, we should copy the contents from 
the old "mmap" to the new "mmap" to avoid the empty file.


was (Author: xuzq_zander):
While the Kafka server is running, if someone manually removes the index file from 
disk and the function AbstractIndex.resize(newSize: Int) is then triggered, it will 
create a new ".index" file whose size is "roundedNewSize" but whose content is 
empty.

> Log offset index  resize cause empty  index file
> 
>
> Key: KAFKA-5326
> URL: https://issues.apache.org/jira/browse/KAFKA-5326
> Project: Kafka
>  Issue Type: Bug
>  Components: core, log
>Affects Versions: 0.10.2.1
>Reporter: Ma Tianchi
>
> After a log index file lost or be removed when kafka server is running,then 
> some one calls OffsetIndex.resize(newSize: Int) .It will cause a new index 
> file be created but there is no values in it,even though the size is same 
> with old one.
> It will do something as below. 
> val raf = new RandomAccessFile(file, "rw")
> This time file does not exit.It will create a new file.
> Then at:
> mmap = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, roundedNewSize)
> The mmap is created from new file which is empty.
> and then:
> mmap.position(position)
> This make the new index file which size is same with old index file,but there 
> is nothing in it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Kafka Read Data from All Partition Using Key or Timestamp

2017-05-25 Thread Hans Jespersen
The timeindex was added in 0.10 so I think you need to use the new Consumer API 
to access this functionality. Specifically you should call offsetsForTimes()

https://kafka.apache.org/0102/javadoc/org/apache/kafka/clients/consumer/Consumer.html#offsetsForTimes(java.util.Map)

-hans

> On May 25, 2017, at 6:39 AM, SenthilKumar K  wrote:
> 
> I did an experiment on searching messages using timestamps ..
> 
> Step 1: Used Producer with Create Time ( CT )
> Step 2 : Verify whether it reflects in Kafka or not
>  .index  .log
>  .timeindex
>These three files in disk and seems to be time_index working .
> 
> Step 3: Let's look into data
>offset: 121 position: 149556 *CreateTime*: 1495718896912 isvalid:
> true payloadsize: 1194 magic: 1 compresscodec: NONE crc: 1053048980
> keysize: 8
> 
>  Looks good ..
> Step 4 :  Check .timeindex file .
>  timestamp: 1495718846912 offset: 116
>  timestamp: 1495718886912 offset: 120
>  timestamp: 1495718926912 offset: 124
>  timestamp: 1495718966912 offset: 128
> 
> So all set for Querying data using timestamp ?
> 
> Kafka version : kafka_2.11-0.10.2.1
> 
> Here is the code i'm using to search query -->
> https://gist.github.com/senthilec566/bc8ed1dfcf493f0bb5c473c50854dff9
> 
> requestInfo.put(topicAndPartition, new PartitionOffsetRequestInfo(queryTime,
> 1));
> If i pass my own timestamp , always getting zero result ..
> *Same question asked here too
> **https://stackoverflow.com/questions/31917134/how-to-use-unix-timestamp-to-get-offset-using-simpleconsumer-api
> *
> .
> 
> 
> Also i could notice below error in index file:
> 
> *Found timestamp mismatch* in
> :/home/user/kafka-logs/topic-0/.timeindex
> 
>  Index timestamp: 0, log timestamp: 1495717686913
> 
> *Found out of order timestamp* in
> :/home/user/kafka-logs/topic-0/.timeindex
> 
>  Index timestamp: 0, Previously indexed timestamp: 1495719406912
> 
> Not sure what is missing here :-( ... Pls advise me here!
> 
> 
> Cheers,
> Senthil
> 
> On Thu, May 25, 2017 at 3:36 PM, SenthilKumar K 
> wrote:
> 
>> Thanks a lot Mayuresh. I will look into SearchMessageByTimestamp feature
>> in Kafka ..
>> 
>> Cheers,
>> Senthil
>> 
>> On Thu, May 25, 2017 at 1:12 PM, Mayuresh Gharat <
>> gharatmayures...@gmail.com> wrote:
>> 
>>> Hi Senthil,
>>> 
>>> Kafka does allow search message by timestamp after KIP-33 :
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+
>>> Add+a+time+based+log+index#KIP-33-Addatimebasedlogindex-S
>>> earchmessagebytimestamp
>>> 
>>> The new consumer does provide you a way to get offsets by timestamp. You
>>> can use these offsets to seek to that offset and consume from there. So if
>>> you want to consume between a range you can get the start and end offset
>>> based on the timestamps, seek to the start offset and consume and process
>>> the data till you reach the end offset.
>>> 
>>> But these timestamps are either CreateTime(when the message was created
>>> and you will have to specify this when you do the send()) or
>>> LogAppendTime(when the message was appended to the log on the kafka broker)
>>> : https://kafka.apache.org/0101/javadoc/org/apache/kafka/clien
>>> ts/producer/ProducerRecord.html
>>> 
>>> Kafka does not look at the fields in your data (key/value) for giving
>>> back you the data. What I meant was it will not look at the timestamp
>>> specified by you in the actual data payload.
>>> 
>>> Thanks,
>>> 
>>> Mayuresh
>>> 
>>> On Thu, May 25, 2017 at 12:43 PM, SenthilKumar K 
>>> wrote:
>>> 
 Hello Dev Team, Pls let me know if any option to read data from Kafka
 (all
 partition ) using timestamp . Also can we set custom offset value to
 messages ?
 
 Cheers,
 Senthil
 
 On Wed, May 24, 2017 at 7:33 PM, SenthilKumar K 
 wrote:
 
> Hi All ,  We have been using Kafka for our Use Case which helps in
> delivering real time raw logs.. I have a requirement to fetch data from
> Kafka by using offset ..
> 
> DataSet Example :
> {"access_date":"2017-05-24 13:57:45.044","format":"json",
> "start":"1490296463.031"}
> {"access_date":"2017-05-24 13:57:46.044","format":"json",
> "start":"1490296463.031"}
> {"access_date":"2017-05-24 13:57:47.044","format":"json",
> "start":"1490296463.031"}
> {"access_date":"2017-05-24 13:58:02.042","format":"json",
> "start":"1490296463.031"}
> 
> Above JSON data will be stored in Kafka..
> 
> Key --> acces_date in epoch format
> Value --> whole JSON.
> 
> Data Access Pattern:
>  1) Get me last 2 minz data ?
>   2) Get me records between 2017-05-24 13:57:42:00 to 2017-05-24
> 13:57:44:00 ?
> 
> How to achieve this in Kafka ?

[jira] [Commented] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-25 Thread Damian Guy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024889#comment-16024889
 ] 

Damian Guy commented on KAFKA-5154:
---

[~Lukas Gemela] if it happens again can you please provide logs from the 
brokers and from the other Kafka Streams instances so we can get a better 
picture of what is happening?
Thanks,
Damian

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>Assignee: Damian Guy
> Attachments: 5154_problem.log, clio_afa596e9b809.gz, clio_reduced.gz, 
> clio.txt.gz
>
>
> please see attached log, Kafka streams throws NullPointerException during 
> rebalance, which is caught by our custom exception handler
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T17:46:37,254 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T18:04:25,993 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T18:04:29,401 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T18:05:10,877 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-05-01T00:01:55,707 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-05-01T00:01:59,027 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-05-01T00:01:59,031 ERROR StreamThread-1 
> org.apache.kafka.streams.processor.internals.StreamThread.run() @376 - 
> stream-thread [StreamThread-1] Streams application error during processing:
>  java.lang.NullPointerException
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:619)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.proces

Re: GlobalKTable limitations

2017-05-25 Thread Eno Thereska
Hi Thomas,

Have you considered doing the transformations on the topic, then outputting to 
another topic and then constructing the GlobalKTable from the latter? 

The GlobalKTable has the limitations you mention since it was primarily 
designed for joins only. We should consider allowing a less restrictive 
interface if it makes sense.

Eno
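
For instance, a rough sketch of that two-step approach against the 0.10.2-era 
DSL (the topic names, the re-keying logic, and the store name below are made 
up):

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class DerivedGlobalTableExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "derived-global-table");  // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");     // placeholder
        props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        KStreamBuilder builder = new KStreamBuilder();

        // Step 1: filter and re-key the original topic into a derived topic.
        KStream<String, String> source = builder.stream("lookup-source");        // placeholder topic
        source.filter((k, v) -> k.startsWith("foo"))
              .map((k, v) -> new KeyValue<>(v.split(",")[0], v))                  // illustrative re-keying
              .to("lookup-source-rekeyed");                                       // placeholder topic

        // Step 2: materialize the GlobalKTable from the derived, correctly keyed topic,
        // which can then be used for joins against other streams.
        GlobalKTable<String, String> table =
            builder.globalTable("lookup-source-rekeyed", "lookup-store");

        new KafkaStreams(builder, props).start();
    }
}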

> On 25 May 2017, at 14:48, Thomas Becker  wrote:
> 
> We need to do a series of joins against a KTable that we can't co-
> partition with the stream, so we're looking at GlobalKTable.  But the
> topic backing the table is not ideally keyed for the sort of lookups
> this particular processor needs to do. Unfortunately, GlobalKTable is
> very limited in that you can only build one with the exact keys/values
> from the backing topic. I'd like to be able to perform various
> transformations on the topic before materializing the table.  I'd
> envision it looking something like the following:
> 
> builder.globalTable(keySerde, valueSerde, topicName)
>.filter((k, v) -> k.isFoo())
>.map((k, v) -> new KeyValue<>(k.getBar(), v.getBaz()))
>.build(tableKeySerde, tableValueSerde, storeName);
> 
> Is this something that has been considered or that others would find
> useful?
> 
> --
> 
> 
>Tommy Becker
> 
>Senior Software Engineer
> 
>O +1 919.460.4747
> 
>tivo.com
> 
> 
> 
> 
> This email and any attachments may contain confidential and privileged 
> material for the sole use of the intended recipient. Any review, copying, or 
> distribution of this email (or any attachments) by others is prohibited. If 
> you are not the intended recipient, please contact the sender immediately and 
> permanently delete this email and any attachments. No employee or agent of 
> TiVo Inc. is authorized to conclude any binding agreement on behalf of TiVo 
> Inc. by email. Binding agreements with TiVo Inc. may only be made by a signed 
> written agreement.



[jira] [Updated] (KAFKA-4362) Consumer can fail after reassignment of the offsets topic partition

2017-05-25 Thread Jeff Widman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Widman updated KAFKA-4362:
---
Affects Version/s: 0.10.0.1

> Consumer can fail after reassignment of the offsets topic partition
> ---
>
> Key: KAFKA-4362
> URL: https://issues.apache.org/jira/browse/KAFKA-4362
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1, 0.10.1.0
>Reporter: Joel Koshy
>Assignee: Mayuresh Gharat
> Fix For: 0.10.1.1
>
>
> When a consumer offsets topic partition reassignment completes, an offset 
> commit shows this:
> {code}
> java.lang.IllegalArgumentException: Message format version for partition 100 
> not found
> at 
> kafka.coordinator.GroupMetadataManager$$anonfun$14.apply(GroupMetadataManager.scala:633)
>  ~[kafka_2.10.jar:?]
> at 
> kafka.coordinator.GroupMetadataManager$$anonfun$14.apply(GroupMetadataManager.scala:633)
>  ~[kafka_2.10.jar:?]
> at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.4.jar:?]
> at 
> kafka.coordinator.GroupMetadataManager.kafka$coordinator$GroupMetadataManager$$getMessageFormatVersionAndTimestamp(GroupMetadataManager.scala:632)
>  ~[kafka_2.10.jar:?]
> at 
> ...
> {code}
> The issue is that the replica has been deleted so the 
> {{GroupMetadataManager.getMessageFormatVersionAndTimestamp}} throws this 
> exception instead which propagates as an unknown error.
> Unfortunately consumers don't respond to this and will fail their offset 
> commits.
> One workaround in the above situation is to bounce the cluster - the consumer 
> will be forced to rediscover the group coordinator.
> (Incidentally, the message incorrectly prints the number of partitions 
> instead of the actual partition.)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4362) Consumer can fail after reassignment of the offsets topic partition

2017-05-25 Thread Jeff Widman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024997#comment-16024997
 ] 

Jeff Widman commented on KAFKA-4362:


Updated the applicable version as we just encountered this on a {{0.10.0.1}} 
cluster.

> Consumer can fail after reassignment of the offsets topic partition
> ---
>
> Key: KAFKA-4362
> URL: https://issues.apache.org/jira/browse/KAFKA-4362
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.1, 0.10.1.0
>Reporter: Joel Koshy
>Assignee: Mayuresh Gharat
> Fix For: 0.10.1.1
>
>
> When a consumer offsets topic partition reassignment completes, an offset 
> commit shows this:
> {code}
> java.lang.IllegalArgumentException: Message format version for partition 100 
> not found
> at 
> kafka.coordinator.GroupMetadataManager$$anonfun$14.apply(GroupMetadataManager.scala:633)
>  ~[kafka_2.10.jar:?]
> at 
> kafka.coordinator.GroupMetadataManager$$anonfun$14.apply(GroupMetadataManager.scala:633)
>  ~[kafka_2.10.jar:?]
> at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.4.jar:?]
> at 
> kafka.coordinator.GroupMetadataManager.kafka$coordinator$GroupMetadataManager$$getMessageFormatVersionAndTimestamp(GroupMetadataManager.scala:632)
>  ~[kafka_2.10.jar:?]
> at 
> ...
> {code}
> The issue is that the replica has been deleted so the 
> {{GroupMetadataManager.getMessageFormatVersionAndTimestamp}} throws this 
> exception instead which propagates as an unknown error.
> Unfortunately consumers don't respond to this and will fail their offset 
> commits.
> One workaround in the above situation is to bounce the cluster - the consumer 
> will be forced to rediscover the group coordinator.
> (Incidentally, the message incorrectly prints the number of partitions 
> instead of the actual partition.)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3138: KAFKA-5017: Record batch first offset remains accu...

2017-05-25 Thread hachikuji
Github user hachikuji closed the pull request at:

https://github.com/apache/kafka/pull/3138


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5017) Consider making baseOffset the first offset in message format v2

2017-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025074#comment-16025074
 ] 

ASF GitHub Bot commented on KAFKA-5017:
---

Github user hachikuji closed the pull request at:

https://github.com/apache/kafka/pull/3138


> Consider making baseOffset the first offset in message format v2
> 
>
> Key: KAFKA-5017
> URL: https://issues.apache.org/jira/browse/KAFKA-5017
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Ismael Juma
>Assignee: Jason Gustafson
> Fix For: 0.11.0.0
>
>
> Currently baseOffset starts as the first offset of the batch. If the first 
> record is removed by compaction, baseOffset doesn't change and it is no 
> longer the same as the first offset of the batch. This is inconsistent with 
> every other field in the record batch header and it seems like there is no 
> longer a reason for this behaviour.
> We should consider simplifying the behaviour so that baseOffset is simply the 
> first offset of the record batch. We need to do this before 0.11 or we 
> probably won't be able to do it until the next message format version change.
> cc [~hachikuji]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5017) Consider making baseOffset the first offset in message format v2

2017-05-25 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-5017:
---
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

I had thought that this would be OK, but the idempotent producer's dedup check 
depends both on the first and last sequence number in the batch, so changing 
the former would cause that check to fail. This could result in an incorrect 
OutOfSequence error. We could change the logic to only depend on the last 
sequence, but it weakens the check and we would need a bit of ugly work to 
compute the first offset to include in the duplicate ProduceResponse.

> Consider making baseOffset the first offset in message format v2
> 
>
> Key: KAFKA-5017
> URL: https://issues.apache.org/jira/browse/KAFKA-5017
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Ismael Juma
>Assignee: Jason Gustafson
> Fix For: 0.11.0.0
>
>
> Currently baseOffset starts as the first offset of the batch. If the first 
> record is removed by compaction, baseOffset doesn't change and it is no 
> longer the same as the first offset of the batch. This is inconsistent with 
> every other field in the record batch header and it seems like there is no 
> longer a reason for this behaviour.
> We should consider simplifying the behaviour so that baseOffset is simply the 
> first offset of the record batch. We need to do this before 0.11 or we 
> probably won't be able to do it until the next message format version change.
> cc [~hachikuji]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5279) TransactionCoordinator must expire transactionalIds

2017-05-25 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-5279:
-
   Resolution: Fixed
Fix Version/s: 0.11.0.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 3101
[https://github.com/apache/kafka/pull/3101]

> TransactionCoordinator must expire transactionalIds
> ---
>
> Key: KAFKA-5279
> URL: https://issues.apache.org/jira/browse/KAFKA-5279
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Apurva Mehta
>Assignee: Damian Guy
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> Currently transactionalIds are not expired anywhere, so we accumulate forever.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3101: KAFKA-5279: TransactionCoordinator must expire tra...

2017-05-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3101


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5279) TransactionCoordinator must expire transactionalIds

2017-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025128#comment-16025128
 ] 

ASF GitHub Bot commented on KAFKA-5279:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3101


> TransactionCoordinator must expire transactionalIds
> ---
>
> Key: KAFKA-5279
> URL: https://issues.apache.org/jira/browse/KAFKA-5279
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Apurva Mehta
>Assignee: Damian Guy
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> Currently transactionalIds are not expired anywhere, so we accumulate forever.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3144: KAFKA-5323: AdminUtils.createTopic should check to...

2017-05-25 Thread onurkaraman
GitHub user onurkaraman opened a pull request:

https://github.com/apache/kafka/pull/3144

KAFKA-5323: AdminUtils.createTopic should check topic existence upfront

When a topic exists, AdminUtils.createTopic unnecessarily does N+2 
zookeeper reads where N is the number of brokers. Here is the breakdown of the 
N+2 zookeeper reads:
1. reads the current list of brokers in zookeeper (1 zookeeper read)
2. reads metadata for each broker in zookeeper (N zookeeper reads where N 
is the number of brokers)
3. checks for topic existence in zookeeper (1 zookeeper read)

This can have a larger impact than one might initially suspect. For 
instance, a broker only populates its MetadataCache after it has joined the 
cluster and the controller sends it an UpdateMetadataRequest. But a broker can 
begin processing requests even before registering itself in zookeeper (before 
the controller even knows the broker is alive). In other words, a broker can 
begin processing MetadataRequests before processing the controller's 
UpdateMetadataRequest following broker registration.

Processing these MetadataRequests in this scenario leads to large local 
times and can cause substantial request queue backup, causing significant 
delays in the broker processing its initial UpdateMetadataRequest. Since the 
broker hasn't received any UpdateMetadataRequest from the controller yet, its 
MetadataCache is empty. So the topics from all the client MetadataRequests are 
treated as brand new topics, which means the broker tries to auto create these 
topics. For each pre-existing topic queried in the MetadataRequest, auto topic 
creation performs the N+2 zookeeper reads mentioned earlier.

In one bad production scenario (while recovering from KAFKA-4959), this 
caused a significant delay in bringing replicas online, as both the initial 
LeaderAndIsrRequest and UpdateMetadataRequest from the controller on broker 
startup was stuck behind these client MetadataRequests hammering zookeeper.

We can reduce the N+2 reads down to 1 by checking topic existence upfront.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/onurkaraman/kafka KAFKA-5323

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3144.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3144


commit 90dc1c45db30d579d04529ba6edfead8e198e762
Author: Onur Karaman 
Date:   2017-05-25T18:03:48Z

KAFKA-5323: AdminUtils.createTopic should check topic existence upfront

When a topic exists, AdminUtils.createTopic unnecessarily does N+2 
zookeeper reads where N is the number of brokers. Here is the breakdown of the 
N+2 zookeeper reads:
1. reads the current list of brokers in zookeeper (1 zookeeper read)
2. reads metadata for each broker in zookeeper (N zookeeper reads where N 
is the number of brokers)
3. checks for topic existence in zookeeper (1 zookeeper read)

This can have a larger impact than one might initially suspect. For 
instance, a broker only populates its MetadataCache after it has joined the 
cluster and the controller sends it an UpdateMetadataRequest. But a broker can 
begin processing requests even before registering itself in zookeeper (before 
the controller even knows the broker is alive). In other words, a broker can 
begin processing MetadataRequests before processing the controller's 
UpdateMetadataRequest following broker registration.

Processing these MetadataRequests in this scenario leads to large local 
times and can cause substantial request queue backup, causing significant 
delays in the broker processing its initial UpdateMetadataRequest. Since the 
broker hasn't received any UpdateMetadataRequest from the controller yet, its 
MetadataCache is empty. So the topics from all the client MetadataRequests are 
treated as brand new topics, which means the broker tries to auto create these 
topics. For each pre-existing topic queried in the MetadataRequest, auto topic 
creation performs the N+2 zookeeper reads mentioned earlier.

In one bad production scenario (while recovering from KAFKA-4959), this 
caused a significant delay in bringing replicas online, as both the initial 
LeaderAndIsrRequest and UpdateMetadataRequest from the controller on broker 
startup was stuck behind these client MetadataRequests hammering zookeeper.

We can reduce the N+2 reads down to 1 by checking topic existence upfront.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5323) AdminUtils.createTopic should check topic existence upfront

2017-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025147#comment-16025147
 ] 

ASF GitHub Bot commented on KAFKA-5323:
---

GitHub user onurkaraman opened a pull request:

https://github.com/apache/kafka/pull/3144

KAFKA-5323: AdminUtils.createTopic should check topic existence upfront

When a topic exists, AdminUtils.createTopic unnecessarily does N+2 
zookeeper reads where N is the number of brokers. Here is the breakdown of the 
N+2 zookeeper reads:
1. reads the current list of brokers in zookeeper (1 zookeeper read)
2. reads metadata for each broker in zookeeper (N zookeeper reads where N 
is the number of brokers)
3. checks for topic existence in zookeeper (1 zookeeper read)

This can have a larger impact than one might initially suspect. For 
instance, a broker only populates its MetadataCache after it has joined the 
cluster and the controller sends it an UpdateMetadataRequest. But a broker can 
begin processing requests even before registering itself in zookeeper (before 
the controller even knows the broker is alive). In other words, a broker can 
begin processing MetadataRequests before processing the controller's 
UpdateMetadataRequest following broker registration.

Processing these MetadataRequests in this scenario leads to large local 
times and can cause substantial request queue backup, causing significant 
delays in the broker processing its initial UpdateMetadataRequest. Since the 
broker hasn't received any UpdateMetadataRequest from the controller yet, its 
MetadataCache is empty. So the topics from all the client MetadataRequests are 
treated as brand new topics, which means the broker tries to auto create these 
topics. For each pre-existing topic queried in the MetadataRequest, auto topic 
creation performs the N+2 zookeeper reads mentioned earlier.

In one bad production scenario (while recovering from KAFKA-4959), this 
caused a significant delay in bringing replicas online, as both the initial 
LeaderAndIsrRequest and UpdateMetadataRequest from the controller on broker 
startup was stuck behind these client MetadataRequests hammering zookeeper.

We can reduce the N+2 reads down to 1 by checking topic existence upfront.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/onurkaraman/kafka KAFKA-5323

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3144.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3144


commit 90dc1c45db30d579d04529ba6edfead8e198e762
Author: Onur Karaman 
Date:   2017-05-25T18:03:48Z

KAFKA-5323: AdminUtils.createTopic should check topic existence upfront

When a topic exists, AdminUtils.createTopic unnecessarily does N+2 
zookeeper reads where N is the number of brokers. Here is the breakdown of the 
N+2 zookeeper reads:
1. reads the current list of brokers in zookeeper (1 zookeeper read)
2. reads metadata for each broker in zookeeper (N zookeeper reads where N 
is the number of brokers)
3. checks for topic existence in zookeeper (1 zookeeper read)

This can have a larger impact than one might initially suspect. For 
instance, a broker only populates its MetadataCache after it has joined the 
cluster and the controller sends it an UpdateMetadataRequest. But a broker can 
begin processing requests even before registering itself in zookeeper (before 
the controller even knows the broker is alive). In other words, a broker can 
begin processing MetadataRequests before processing the controller's 
UpdateMetadataRequest following broker registration.

Processing these MetadataRequests in this scenario leads to large local 
times and can cause substantial request queue backup, causing significant 
delays in the broker processing its initial UpdateMetadataRequest. Since the 
broker hasn't received any UpdateMetadataRequest from the controller yet, its 
MetadataCache is empty. So the topics from all the client MetadataRequests are 
treated as brand new topics, which means the broker tries to auto create these 
topics. For each pre-existing topic queried in the MetadataRequest, auto topic 
creation performs the N+2 zookeeper reads mentioned earlier.

In one bad production scenario (while recovering from KAFKA-4959), this 
caused a significant delay in bringing replicas online, as both the initial 
LeaderAndIsrRequest and UpdateMetadataRequest from the controller on broker 
startup was stuck behind these client MetadataRequests hammering zookeeper.

We can reduce the N+2 reads down to 1 by checking topic existence upfront.




> AdminUtils.createTopic should check topic existence upfront
> 

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-25 Thread Matthias J. Sax
Thanks for the KIP Eno!

Couple of comments:

I think we don't need `RecordContext` in `RecordExceptionHandler#handle`
because the `ConsumerRecord` provides all this information anyway.

Why do we introduce `ExceptionType` and not just hand in the actual exception?

As return type of `handle()` is void, how would the handler fail? By
throwing an exception? Maybe it would be better to add a proper return
type from the beginning on -- this might also make backward
compatibility easier later on.


Question about `LogAndThresholdExceptionHandler` -- how would we be able
to track this?


With regard to `DeserializationException`, do you think it might make
sense to have a "dead letter queue" as a feature to provide out-of-the-box?


-Matthias
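
For illustration, a purely hypothetical sketch of what a handler along those 
lines could look like -- taking the ConsumerRecord plus the actual exception 
and returning an explicit decision. This is not the interface proposed in the 
KIP:

import org.apache.kafka.clients.consumer.ConsumerRecord;

public interface RecordExceptionHandler {
    enum HandlerResponse { CONTINUE, FAIL }

    // The record carries topic/partition/offset etc., so no extra context object
    // is needed, and the return value tells the runtime whether to keep going.
    HandlerResponse handle(ConsumerRecord<byte[], byte[]> record, Exception exception);
}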

On 5/25/17 2:47 AM, Eno Thereska wrote:
> Hi there,
> 
> I’ve added a KIP on improving exception handling in streams:
> KIP-161: streams record processing exception handlers. 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-161%3A+streams+record+processing+exception+handlers
>  
> 
> 
> Discussion and feedback is welcome, thank you.
> Eno
> 



signature.asc
Description: OpenPGP digital signature


[jira] [Updated] (KAFKA-5323) AdminUtils.createTopic should check topic existence upfront

2017-05-25 Thread Onur Karaman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Onur Karaman updated KAFKA-5323:

Status: Patch Available  (was: Open)

> AdminUtils.createTopic should check topic existence upfront
> ---
>
> Key: KAFKA-5323
> URL: https://issues.apache.org/jira/browse/KAFKA-5323
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Onur Karaman
>Assignee: Onur Karaman
>
> When a topic exists, AdminUtils.createTopic unnecessarily does N+2 zookeeper 
> reads where N is the number of brokers. Here is the breakdown of the N+2 
> zookeeper reads:
> # reads the current list of brokers in zookeeper (1 zookeeper read)
> # reads metadata for each broker in zookeeper (N zookeeper reads where N is 
> the number of brokers)
> # checks for topic existence in zookeeper (1 zookeeper read)
> We can reduce the N+2 reads down to 1 by checking topic existence upfront.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5327) Console Consumer should only poll for up to max messages

2017-05-25 Thread Dustin Cote (JIRA)
Dustin Cote created KAFKA-5327:
--

 Summary: Console Consumer should only poll for up to max messages
 Key: KAFKA-5327
 URL: https://issues.apache.org/jira/browse/KAFKA-5327
 Project: Kafka
  Issue Type: Improvement
  Components: tools
Reporter: Dustin Cote
Priority: Minor


The ConsoleConsumer has a --max-messages flag that can be used to limit the 
number of messages consumed. However, the number of records actually consumed 
is governed by max.poll.records. This means you see one message on the console, 
but your offset has moved forward by a default of 500, which is kind of 
counterintuitive. It would be good to only commit offsets for messages we have 
printed to the console.
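
A schematic illustration of the mismatch (this is not the actual ConsoleConsumer 
code; broker address, group id, and topic are placeholders):

{code}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class MaxMessagesIllustration {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder
        props.put("group.id", "console-like");               // placeholder
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("enable.auto.commit", "false");
        int maxMessages = 1;                                  // the --max-messages equivalent

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test"));  // placeholder topic
            // poll() may return up to max.poll.records (default 500) records...
            ConsumerRecords<String, String> records = consumer.poll(1000);
            int printed = 0;
            for (ConsumerRecord<String, String> r : records) {
                if (printed++ >= maxMessages) break;          // ...but only maxMessages are printed
                System.out.println(r.value());
            }
            // ...and committing here commits the position after the whole polled batch,
            // not just after the records that were printed.
            consumer.commitSync();
        }
    }
}
{code}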



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3145: MINOR: Add test case for duplicate check after log...

2017-05-25 Thread hachikuji
GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/3145

MINOR: Add test case for duplicate check after log cleaning



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka 
minor-improve-base-sequence-docs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3145.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3145


commit b75e295f2743b367ca92525c7863fc05fca77438
Author: Jason Gustafson 
Date:   2017-05-25T18:48:02Z

MINOR: Add test case for duplicate check after log cleaning




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-25 Thread Avi Flax

> On May 25, 2017, at 14:27, Matthias J. Sax  wrote:
> 
> Thanks for the KIP Eno!

Yes, thank you Eno! It’s very welcome and heartening to see work on this front.

I have a few follow-ups to Matthias’ comments:

> Why do we introduce `ExceptionType` and not just hand in the actual exception?

I was wondering this myself.

> As return type of `handle()` is void, how would the handler fail? By
> throwing an exception?

I was wondering this myself.

> Maybe it would be better to add a proper return
> type from the beginning on -- this might also make backward
> compatibility easier later on.

I agree, that would be very clear.

> With regard to `DeserializationException`, do you think it might make
> sense to have a "dead letter queue" as a feature to provide out-of-the-box?

+1 that would be *fantastic*!

Thanks!

Avi


Software Architect @ Park Assist » http://tech.parkassist.com/

[GitHub] kafka pull request #3146: MINOR: Cleanup in tests to avoid threads being lef...

2017-05-25 Thread rajinisivaram
GitHub user rajinisivaram opened a pull request:

https://github.com/apache/kafka/pull/3146

MINOR: Cleanup in tests to avoid threads being left behind



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rajinisivaram/kafka MINOR-test-cleanup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3146.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3146


commit 2fb408fd46c0c090067268a6276aab4c44672e48
Author: Rajini Sivaram 
Date:   2017-05-25T18:34:20Z

MINOR: Cleanup in tests to avoid threads being left behind




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3137: KAFKA-5320: Include all request throttling in clie...

2017-05-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3137


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-5320) Update produce/fetch throttle time metrics for any request throttle

2017-05-25 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram resolved KAFKA-5320.
---
   Resolution: Fixed
Fix Version/s: 0.11.1.0

Issue resolved by pull request 3137
[https://github.com/apache/kafka/pull/3137]

> Update produce/fetch throttle time metrics for any request throttle
> ---
>
> Key: KAFKA-5320
> URL: https://issues.apache.org/jira/browse/KAFKA-5320
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0, 0.11.1.0
>
>
> From KIP-124:
> {quote}
> Producers and consumers currently expose average and maximum producer/fetch 
> request throttle time as JMX metrics. These metrics will be updated to 
> reflect total throttle time for the producer or consumer including byte-rate 
> throttling and request time throttling for all requests of the 
> producer/consumer. 
> {quote}
> Missed this during the request quota implementation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5320) Update produce/fetch throttle time metrics for any request throttle

2017-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025256#comment-16025256
 ] 

ASF GitHub Bot commented on KAFKA-5320:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3137


> Update produce/fetch throttle time metrics for any request throttle
> ---
>
> Key: KAFKA-5320
> URL: https://issues.apache.org/jira/browse/KAFKA-5320
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0, 0.11.1.0
>
>
> From KIP-124:
> {quote}
> Producers and consumers currently expose average and maximum producer/fetch 
> request throttle time as JMX metrics. These metrics will be updated to 
> reflect total throttle time for the producer or consumer including byte-rate 
> throttling and request time throttling for all requests of the 
> producer/consumer. 
> {quote}
> Missed this during the request quota implementation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-3925) Default log.dir=/tmp/kafka-logs is unsafe

2017-05-25 Thread Scott Glajch (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025259#comment-16025259
 ] 

Scott Glajch commented on KAFKA-3925:
-

This really bit us as well.  I don't know why the default directory for data is 
in /tmp; defaulting to a directory which is by definition transient seems insane.

> Default log.dir=/tmp/kafka-logs is unsafe
> -
>
> Key: KAFKA-3925
> URL: https://issues.apache.org/jira/browse/KAFKA-3925
> Project: Kafka
>  Issue Type: Bug
>  Components: config
>Affects Versions: 0.10.0.0
> Environment: Various, depends on OS and configuration
>Reporter: Peter Davis
>
> Many operating systems are configured to delete files under /tmp.  For 
> example Ubuntu has 
> [tmpreaper|http://manpages.ubuntu.com/manpages/wily/man8/tmpreaper.8.html], 
> others use tmpfs, others delete /tmp on startup. 
> Defaults are OK to make getting started easier but should not be unsafe 
> (risking data loss). 
> Something under /var would be a better default log.dir under *nix.  Or 
> relative to the Kafka bin directory to avoid needing root.  
> If the default cannot be changed, I would suggest a special warning print to 
> the console on broker startup if log.dir is under /tmp. 
> See [users list 
> thread|http://mail-archives.apache.org/mod_mbox/kafka-users/201607.mbox/%3cCAD5tkZb-0MMuWJqHNUJ3i1+xuNPZ4tnQt-RPm65grxE0=0o...@mail.gmail.com%3e].
>   I've also been bitten by this. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Kafka Read Data from All Partition Using Key or Timestamp

2017-05-25 Thread SenthilKumar K
Thanks a lot Hans..  By using KafkaConsumer API
https://gist.github.com/senthilec566/16e8e28b32834666fea132afc3a4e2f9 i can
query the data using timestamp ..

It worked !


Now another question to achieve Parallelism on reading data ..

Example :  topic : test
  partitions : 4

KafkaConsumer allows searching for timestamp-based messages in each
partition. Right now the way I coded it is:
1) Fetch the number of partitions
2) Use ForkJoinPool
3) Submit tasks to the ForkJoinPool
4) Combine the results

Each task creates its own Consumer and reads the data, e.g. 4 consumers in
total, but the search response time is a little high. The same holds if I use
a single Consumer, since it has to read 4 times and join the results.

How can I implement this efficiently, i.e. read data from all partitions in a
single request? Whichever way gives me good performance, I would opt for it
:-)

Pls suggest me here !

Cheers,
Senthil


On Thu, May 25, 2017 at 8:30 PM, Hans Jespersen  wrote:

> The timeindex was added in 0.10 so I think you need to use the new
> Consumer API to access this functionality. Specifically you should call
> offsetsForTimes()
>
> https://kafka.apache.org/0102/javadoc/org/apache/kafka/
> clients/consumer/Consumer.html#offsetsForTimes(java.util.Map)
>
> -hans
>
> > On May 25, 2017, at 6:39 AM, SenthilKumar K 
> wrote:
> >
> > I did an experiment on searching messages using timestamps ..
> >
> > Step 1: Used Producer with Create Time ( CT )
> > Step 2 : Verify whether it reflects in Kafka or not
> >  .index  .log
> >  .timeindex
> >These three files in disk and seems to be time_index working .
> >
> > Step 3: Let's look into data
> >offset: 121 position: 149556 *CreateTime*: 1495718896912 isvalid:
> > true payloadsize: 1194 magic: 1 compresscodec: NONE crc: 1053048980
> > keysize: 8
> >
> >  Looks good ..
> > Step 4 :  Check .timeindex file .
> >  timestamp: 1495718846912 offset: 116
> >  timestamp: 1495718886912 offset: 120
> >  timestamp: 1495718926912 offset: 124
> >  timestamp: 1495718966912 offset: 128
> >
> > So all set for Querying data using timestamp ?
> >
> > Kafka version : kafka_2.11-0.10.2.1
> >
> > Here is the code i'm using to search query -->
> > https://gist.github.com/senthilec566/bc8ed1dfcf493f0bb5c473c50854dff9
> >
> > requestInfo.put(topicAndPartition, new PartitionOffsetRequestInfo(
> queryTime,
> > 1));
> > If i pass my own timestamp , always getting zero result ..
> > *Same question asked here too
> > **https://stackoverflow.com/questions/31917134/how-to-use-
> unix-timestamp-to-get-offset-using-simpleconsumer-api
> >  unix-timestamp-to-get-offset-using-simpleconsumer-api>*
> > .
> >
> >
> > Also i could notice below error in index file:
> >
> > *Found timestamp mismatch* in
> > :/home/user/kafka-logs/topic-0/.timeindex
> >
> >  Index timestamp: 0, log timestamp: 1495717686913
> >
> > *Found out of order timestamp* in
> > :/home/user/kafka-logs/topic-0/.timeindex
> >
> >  Index timestamp: 0, Previously indexed timestamp: 1495719406912
> >
> > Not sure what is missing here :-( ... Pls advise me here!
> >
> >
> > Cheers,
> > Senthil
> >
> > On Thu, May 25, 2017 at 3:36 PM, SenthilKumar K 
> > wrote:
> >
> >> Thanks a lot Mayuresh. I will look into SearchMessageByTimestamp feature
> >> in Kafka ..
> >>
> >> Cheers,
> >> Senthil
> >>
> >> On Thu, May 25, 2017 at 1:12 PM, Mayuresh Gharat <
> >> gharatmayures...@gmail.com> wrote:
> >>
> >>> Hi Senthil,
> >>>
> >>> Kafka does allow search message by timestamp after KIP-33 :
> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+
> >>> Add+a+time+based+log+index#KIP-33-Addatimebasedlogindex-S
> >>> earchmessagebytimestamp
> >>>
> >>> The new consumer does provide you a way to get offsets by timestamp.
> You
> >>> can use these offsets to seek to that offset and consume from there.
> So if
> >>> you want to consume between a range you can get the start and end
> offset
> >>> based on the timestamps, seek to the start offset and consume and
> process
> >>> the data till you reach the end offset.
> >>>
> >>> But these timestamps are either CreateTime(when the message was created
> >>> and you will have to specify this when you do the send()) or
> >>> LogAppendTime(when the message was appended to the log on the kafka
> broker)
> >>> : https://kafka.apache.org/0101/javadoc/org/apache/kafka/clien
> >>> ts/producer/ProducerRecord.html
> >>>
> >>> Kafka does not look at the fields in your data (key/value) for giving
> >>> back you the data. What I meant was it will not look at the timestamp
> >>> specified by you in the actual data payload.
> >>>
> >>> Thanks,
> >>>
> >>> Mayuresh
> >>>
> >>> On Thu, May 25, 2017 at 12:43 PM, SenthilKumar K <
> senthilec...@gmail.com

Jenkins build is back to normal : kafka-trunk-jdk8 #1593

2017-05-25 Thread Apache Jenkins Server
See 




[jira] [Commented] (KAFKA-5265) Move ACLs, Config, NodeVersions classes into org.apache.kafka.common

2017-05-25 Thread Colin P. McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025356#comment-16025356
 ] 

Colin P. McCabe commented on KAFKA-5265:


CreateTopicResults -> CreateTopicsResults
DeleteTopicResults -> DeleteTopicsResults
move TopicPartitionInfo to org.apache.kafka.common
move AccessControlEntry to org.apache.kafka.common.acl
move AccessControlEntryData to org.apache.kafka.common.acl
move AccessControlEntryFilter to org.apache.kafka.common.acl
move AclBinding to org.apache.kafka.common.acl
move AclBindingFilter to org.apache.kafka.common.acl
move AclOperation to org.apache.kafka.common.acl
move AclPermissionType to org.apache.kafka.common.acl
move ConfigResource to org.apache.common.config
move Resource to org.apache.common.resource
move ResourceFilter to org.apache.common.resource
move ResourceType to org.apache.common.resource
Add TopicConfigs.java

> Move ACLs, Config, NodeVersions classes into org.apache.kafka.common
> 
>
> Key: KAFKA-5265
> URL: https://issues.apache.org/jira/browse/KAFKA-5265
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.0.0
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
> Fix For: 0.11.0.0
>
>
> We should move the `ACLs`, `Config`, and `NodeVersions` classes into 
> `org.apache.kafka.common`.  That will make the easier to use in server code 
> as well as admin client code.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5226) NullPointerException (NPE) in SourceNodeRecordDeserializer.deserialize

2017-05-25 Thread Bill Bejeck (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025404#comment-16025404
 ] 

Bill Bejeck commented on KAFKA-5226:


I've been able to reproduce the problem.  The issue is not with the 
`SourceNodeRecordDeserializer`, that's just the symptom.  

My steps to reproduce:
   1. Start 3 instances of a simple streams application with regex subscription.
   2. Add topics with names matching the pattern.
  
It doesn't always happen, so I needed to make a few attempts.
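
A minimal sketch of the kind of application used in step 1 above (all names are 
placeholders, not the actual test code):

{code}
import java.util.Properties;
import java.util.regex.Pattern;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class RegexSubscriptionRepro {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "regex-repro-app");    // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // placeholder
        props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        KStreamBuilder builder = new KStreamBuilder();
        // Subscribe by regex; topics like us.Trigger, demo.Trigger, foo.Trigger all match.
        KStream<String, String> all = builder.stream(Pattern.compile(".*\\.Trigger"));
        all.foreach((k, v) -> System.out.println(k + " -> " + v));

        new KafkaStreams(builder, props).start();
        // While this runs (ideally as 3 instances), create a new topic matching the
        // pattern, e.g. foo.Trigger, and wait for the rebalance.
    }
}
{code}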

The issue is that sometimes, when a topic is added while a streams app with a 
regex subscription is running, the new topic does not get passed into the 
`StreamPartitionAssignor.subscribe` method, but it will show up during a 
rebalance. In the `StreamPartitionAssignor.subscribe` method we update the 
topology with any topics matching the subscription regex. When we go to create 
the stream task for the new assignment we don't have a corresponding source 
node for the topic, causing the error.

Here are the relevant log lines (logs from all 3 applications attached) from 
reproducing the error by adding a topic named `foo.Trigger`. The 
`StreamPartitionAssignor` only gets passed the topics `us.Trigger` and 
`demo.Trigger`, but during the rebalance the assignments passed to the 
`ConsumerRebalanceListener.onPartitionsAssigned` method are `us.Trigger`, 
`demo.Trigger` and `foo.Trigger`:

[2017-05-24 15:30:14,367] DEBUG stream-thread 
[reballancing-test-df1e3cfa-5620-4a81-89bf-88ed14115321-StreamThread-1] found 
[us.Trigger, demo.Trigger] topics possibly matching regex 
(org.apache.kafka.streams.processor.internals.StreamPartitionAssignor:258)

[2017-05-24 15:30:14,474] INFO stream-thread 
[reballancing-test-df1e3cfa-5620-4a81-89bf-88ed14115321-StreamThread-1] at 
state PARTITIONS_REVOKED: new partitions [foo.Trigger-1, us.Trigger-1, 
demo.Trigger-1] assigned at the end of consumer rebalance.
.
2017-05-24 15:30:14,478] INFO stream-thread 
[reballancing-test-df1e3cfa-5620-4a81-89bf-88ed14115321-StreamThread-1] Created 
active task 0_1 with assigned partitions [foo.Trigger-1, us.Trigger-1, 
demo.Trigger-1] (org.apache.kafka.streams.processor.internals.StreamThread:1240)



> NullPointerException (NPE) in SourceNodeRecordDeserializer.deserialize
> --
>
> Key: KAFKA-5226
> URL: https://issues.apache.org/jira/browse/KAFKA-5226
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1
> Environment: 64-bit Amazon Linux, JDK8
>Reporter: Ian Springer
>Assignee: Bill Bejeck
> Attachments: kafka.log, streamsAppOne.log, streamsAppThree.log, 
> streamsAppTwo.log
>
>
> I saw the following NPE in our Kafka Streams app, which has 3 nodes running 
> on 3 separate machines. Out of hundreds of messages processed, the NPE only 
> occurred twice. I am not sure of the cause, so I am unable to reproduce it. 
> I'm hoping the Kafka Streams team can guess the cause based on the stack 
> trace. If I can provide any additional details about our app, please let me 
> know.
>  
> {code}
> INFO  2017-05-10 02:58:26,021 org.apache.kafka.common.utils.AppInfoParser  
> Kafka version : 0.10.2.1
> INFO  2017-05-10 02:58:26,021 org.apache.kafka.common.utils.AppInfoParser  
> Kafka commitId : e89bffd6b2eff799
> INFO  2017-05-10 02:58:26,031 o.s.context.support.DefaultLifecycleProcessor  
> Starting beans in phase 0
> INFO  2017-05-10 02:58:26,075 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from CREATED to RUNNING.
> INFO  2017-05-10 02:58:26,075 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] Started 
> Kafka Stream process
> INFO  2017-05-10 02:58:26,086 o.a.k.c.consumer.internals.AbstractCoordinator  
> Discovered coordinator p1kaf1.prod.apptegic.com:9092 (id: 2147482646 rack: 
> null) for group evergage-app.
> INFO  2017-05-10 02:58:26,126 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Revoking previously assigned partitions [] for group evergage-app
> INFO  2017-05-10 02:58:26,126 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from RUNNING to REBALANCING.
> INFO  2017-05-10 02:58:26,127 o.a.k.c.consumer.internals.AbstractCoordinator  
> (Re-)joining group evergage-app
> INFO  2017-05-10 02:58:27,712 o.a.k.c.consumer.internals.AbstractCoordinator  
> Successfully joined group evergage-app with generation 18
> INFO  2017-05-10 02:58:27,716 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Setting newly assigned partitions [us.app.Trigger-0] for group evergage-app
> INFO  2017-05-10 02:58:27,716 org.apache.kaf

[jira] [Updated] (KAFKA-5226) NullPointerException (NPE) in SourceNodeRecordDeserializer.deserialize

2017-05-25 Thread Bill Bejeck (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck updated KAFKA-5226:
---
Attachment: streamsAppTwo.log
streamsAppThree.log
streamsAppOne.log

> NullPointerException (NPE) in SourceNodeRecordDeserializer.deserialize
> --
>
> Key: KAFKA-5226
> URL: https://issues.apache.org/jira/browse/KAFKA-5226
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1
> Environment: 64-bit Amazon Linux, JDK8
>Reporter: Ian Springer
>Assignee: Bill Bejeck
> Attachments: kafka.log, streamsAppOne.log, streamsAppThree.log, 
> streamsAppTwo.log
>
>
> I saw the following NPE in our Kafka Streams app, which has 3 nodes running 
> on 3 separate machines. Out of hundreds of messages processed, the NPE only 
> occurred twice. I am not sure of the cause, so I am unable to reproduce it. 
> I'm hoping the Kafka Streams team can guess the cause based on the stack 
> trace. If I can provide any additional details about our app, please let me 
> know.
>  
> {code}
> INFO  2017-05-10 02:58:26,021 org.apache.kafka.common.utils.AppInfoParser  
> Kafka version : 0.10.2.1
> INFO  2017-05-10 02:58:26,021 org.apache.kafka.common.utils.AppInfoParser  
> Kafka commitId : e89bffd6b2eff799
> INFO  2017-05-10 02:58:26,031 o.s.context.support.DefaultLifecycleProcessor  
> Starting beans in phase 0
> INFO  2017-05-10 02:58:26,075 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from CREATED to RUNNING.
> INFO  2017-05-10 02:58:26,075 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] Started 
> Kafka Stream process
> INFO  2017-05-10 02:58:26,086 o.a.k.c.consumer.internals.AbstractCoordinator  
> Discovered coordinator p1kaf1.prod.apptegic.com:9092 (id: 2147482646 rack: 
> null) for group evergage-app.
> INFO  2017-05-10 02:58:26,126 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Revoking previously assigned partitions [] for group evergage-app
> INFO  2017-05-10 02:58:26,126 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from RUNNING to REBALANCING.
> INFO  2017-05-10 02:58:26,127 o.a.k.c.consumer.internals.AbstractCoordinator  
> (Re-)joining group evergage-app
> INFO  2017-05-10 02:58:27,712 o.a.k.c.consumer.internals.AbstractCoordinator  
> Successfully joined group evergage-app with generation 18
> INFO  2017-05-10 02:58:27,716 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Setting newly assigned partitions [us.app.Trigger-0] for group evergage-app
> INFO  2017-05-10 02:58:27,716 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from REBALANCING to REBALANCING.
> INFO  2017-05-10 02:58:27,729 
> o.a.kafka.streams.processor.internals.StreamTask  task [0_0] Initializing 
> state stores
> INFO  2017-05-10 02:58:27,731 
> o.a.kafka.streams.processor.internals.StreamTask  task [0_0] Initializing 
> processor nodes of the topology
> INFO  2017-05-10 02:58:27,742 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from REBALANCING to RUNNING.
> [14 hours pass...]
> INFO  2017-05-10 16:21:27,476 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Revoking previously assigned partitions [us.app.Trigger-0] for group 
> evergage-app
> INFO  2017-05-10 16:21:27,477 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from RUNNING to REBALANCING.
> INFO  2017-05-10 16:21:27,482 o.a.k.c.consumer.internals.AbstractCoordinator  
> (Re-)joining group evergage-app
> INFO  2017-05-10 16:21:27,489 o.a.k.c.consumer.internals.AbstractCoordinator  
> Successfully joined group evergage-app with generation 19
> INFO  2017-05-10 16:21:27,489 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Setting newly assigned partitions [us.app.Trigger-0] for group evergage-app
> INFO  2017-05-10 16:21:27,489 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from REBALANCING to REBALANCING.
> INFO  2017-05-10 16:21:27,489 
> o.a.kafka.streams.processor.internals.StreamTask  task [0_0] Initializing 
> processor nodes of the topology
> INFO  2017-05-10 16:21:27,493 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from REBALANCING to RUNNING.
> INFO  2017-05-10 16:21:30,584 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Revoking previ

[GitHub] kafka pull request #3147: MINOR: Remove unused method parameter in `SimpleAc...

2017-05-25 Thread vahidhashemian
GitHub user vahidhashemian opened a pull request:

https://github.com/apache/kafka/pull/3147

MINOR: Remove unused method parameter in `SimpleAclAuthorizer`



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vahidhashemian/kafka 
minor/remove_unsed_method_parameter_simpleaclauthorizer

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3147.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3147


commit 249e23cd1f36bd878c90c03fde872a40909029cf
Author: Vahid Hashemian 
Date:   2017-05-25T20:52:17Z

MINOR: Remove unused method parameter in `SimpleAclAuthorizer`




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Work started] (KAFKA-4585) Offset fetch and commit requests use the same permissions

2017-05-25 Thread Vahid Hashemian (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on KAFKA-4585 started by Vahid Hashemian.
--
> Offset fetch and commit requests use the same permissions
> -
>
> Key: KAFKA-4585
> URL: https://issues.apache.org/jira/browse/KAFKA-4585
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.1.1
>Reporter: Ewen Cheslack-Postava
>Assignee: Vahid Hashemian
>  Labels: needs-kip
>
> Currently the handling of permissions for consumer groups seems a bit odd 
> because most of the requests use the Read permission on the Group (join, 
> sync, heartbeat, leave, offset commit, and offset fetch). This means you 
> cannot lock down certain functionality for certain users. For this issue I'll 
> highlight one realistic scenario, since conflating the ability to perform most 
> of these operations may not be a serious problem by itself.
> In particular, if you want tooling for monitoring offsets (i.e. you want to 
> be able to read from all groups) but don't want that tool to be able to write 
> offsets, you currently cannot achieve this. Part of the reason this seems odd 
> to me is that any operation which can mutate state seems like it should be a 
> Write operation (i.e. joining, syncing, leaving, and committing; maybe 
> heartbeat as well). However, [~hachikuji] has mentioned that the use of Read 
> may have been intentional. If that is the case, changing at least offset 
> fetch to be a Describe operation instead would allow isolating the mutating 
> vs non-mutating request types.
> Note that this would require a KIP and would potentially have some 
> compatibility implications. Note however, that if we went with the Describe 
> option, Describe is allowed by default when Read, Write, or Delete are 
> allowed, so this may not have to have any compatibility issues (if the user 
> previously allowed Read, they'd still have all the same capabilities as 
> before).
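
For context, a minimal sketch of the monitoring use case described above: a tool
that only reads committed offsets and never commits them. With the proposed
change, Describe on the group would be enough for this kind of tool. Broker
address, group, and topic names below are placeholders.

{code}
// Read-only offset monitor sketch: it only fetches committed offsets
// (consumer.committed) and never commits, so it has no need for write-level
// access to the group. All names/addresses are placeholders.
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class OffsetMonitorSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "group-to-monitor");
        props.put("enable.auto.commit", "false");          // never write offsets
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("some-topic", 0);
            OffsetAndMetadata committed = consumer.committed(tp);  // offset fetch only
            System.out.println(tp + " committed offset: "
                    + (committed == null ? "none" : committed.offset()));
        }
    }
}
{code}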



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5147) KafkaProducer's TransactionManager needs a review on synchronization

2017-05-25 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-5147:
---
   Resolution: Fixed
Fix Version/s: 0.11.1.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 3132
[https://github.com/apache/kafka/pull/3132]

> KafkaProducer's TransactionManager needs a review on synchronization
> 
>
> Key: KAFKA-5147
> URL: https://issues.apache.org/jira/browse/KAFKA-5147
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Apurva Mehta
>Assignee: Apurva Mehta
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0, 0.11.1.0
>
>
> Currently, the completion handlers are not synchronized, though they access 
> shared state like `partitionsInTransaction`, and `lastError`, 
> `pendingPartitionsToBeaddedToTransaction`, etc. 
> We should either make the collections concurrent or synchronize the handlers. 
> In general, we need to review this code to ensure that the synchronization is 
> correct. 
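
As a rough illustration of the proposed fix (synchronize the handlers so that
network-thread callbacks and application threads see a consistent view), here is
a minimal Java sketch. The class and field names only mirror the ticket; this is
not the real TransactionManager code.

{code}
// Illustrative only: a hypothetical holder whose response handlers run on the
// network thread while application threads read the same fields. Field names
// mirror the ticket; this is not the real TransactionManager code.
import java.util.HashSet;
import java.util.Set;
import org.apache.kafka.common.TopicPartition;

public class SynchronizedTxnStateSketch {
    private final Set<TopicPartition> partitionsInTransaction = new HashSet<>();
    private final Set<TopicPartition> pendingPartitionsToBeAddedToTransaction = new HashSet<>();
    private RuntimeException lastError = null;

    // Runs on the sender/network thread when a response arrives; synchronized so
    // the mutation is atomic and visible to application threads.
    public synchronized void handleAddPartitionsResponse(Set<TopicPartition> added) {
        pendingPartitionsToBeAddedToTransaction.removeAll(added);
        partitionsInTransaction.addAll(added);
    }

    // Runs on the application thread; uses the same monitor so it never sees a
    // half-updated view of the two sets.
    public synchronized boolean isPartitionAdded(TopicPartition tp) {
        return partitionsInTransaction.contains(tp);
    }

    public synchronized void transitionToError(RuntimeException error) {
        lastError = error;
    }

    public synchronized RuntimeException lastError() {
        return lastError;
    }
}
{code}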



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3132: KAFKA-5147: Add missing synchronization to Transac...

2017-05-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3132


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5147) KafkaProducer's TransactionManager needs a review on synchronization

2017-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025552#comment-16025552
 ] 

ASF GitHub Bot commented on KAFKA-5147:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3132


> KafkaProducer's TransactionManager needs a review on synchronization
> 
>
> Key: KAFKA-5147
> URL: https://issues.apache.org/jira/browse/KAFKA-5147
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Apurva Mehta
>Assignee: Apurva Mehta
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0, 0.11.1.0
>
>
> Currently, the completion handlers are not synchronized, though they access 
> shared state like `partitionsInTransaction`, and `lastError`, 
> `pendingPartitionsToBeaddedToTransaction`, etc. 
> We should either make the collections concurrent or synchronize the handlers. 
> In general, we need to review this code to ensure that the synchronization is 
> correct. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4585) Offset fetch and commit requests use the same permissions

2017-05-25 Thread Vahid Hashemian (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025569#comment-16025569
 ] 

Vahid Hashemian commented on KAFKA-4585:


[~ewencp] I'll work on a KIP for changing the required permission for offset 
fetch. I am wondering why the required permission for heartbeat is set to Read 
(vs. Describe), since heartbeat is non-mutating by itself and processing a 
heartbeat request does not change the group or the consumer. Also, cc 
[~hachikuji]. Thanks.

> Offset fetch and commit requests use the same permissions
> -
>
> Key: KAFKA-4585
> URL: https://issues.apache.org/jira/browse/KAFKA-4585
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.1.1
>Reporter: Ewen Cheslack-Postava
>Assignee: Vahid Hashemian
>  Labels: needs-kip
>
> Currently the handling of permissions for consumer groups seems a bit odd 
> because most of the requests use the Read permission on the Group (join, 
> sync, heartbeat, leave, offset commit, and offset fetch). This means you 
> cannot lock down certain functionality for certain users. For this issue I'll 
> highlight one realistic scenario, since conflating the ability to perform most 
> of these operations may not be a serious problem by itself.
> In particular, if you want tooling for monitoring offsets (i.e. you want to 
> be able to read from all groups) but don't want that tool to be able to write 
> offsets, you currently cannot achieve this. Part of the reason this seems odd 
> to me is that any operation which can mutate state seems like it should be a 
> Write operation (i.e. joining, syncing, leaving, and committing; maybe 
> heartbeat as well). However, [~hachikuji] has mentioned that the use of Read 
> may have been intentional. If that is the case, changing at least offset 
> fetch to be a Describe operation instead would allow isolating the mutating 
> vs non-mutating request types.
> Note that this would require a KIP and would potentially have some 
> compatibility implications. Note however, that if we went with the Describe 
> option, Describe is allowed by default when Read, Write, or Delete are 
> allowed, so this may not have to have any compatibility issues (if the user 
> previously allowed Read, they'd still have all the same capabilities as 
> before).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


RE: GlobalKTable limitations

2017-05-25 Thread Thomas Becker
Hey Eno,
Thanks for the response. We have considered, but not yet tried that. One of the 
nice things about the GlobalKTable is that it fully "bootstraps" before the 
rest of the topology is started. But if part of the topology is itself 
generating the topic that backs the global table, it seems like that would 
effectively break. Also, doing this obviously requires re-materializing the 
data in a new topic.  To be fair, the topic we are building the table from has 
a bit of an unusual format, which is why I was trying to see if anyone else 
would think this was useful.

-Tommy


From: Eno Thereska [eno.there...@gmail.com]
Sent: Thursday, May 25, 2017 12:03 PM
To: dev@kafka.apache.org
Subject: Re: GlobalKTable limitations

Hi Thomas,

Have you considered doing the transformations on the topic, then outputting to 
another topic and then constructing the GlobalKTable from the latter?

The GlobalKTable has the limitations you mention since it was primarily 
designed for joins only. We should consider allowing a less restrictive 
interface if it makes sense.

Eno
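
A rough sketch of that suggestion against the 0.10.2.x Streams API is below.
String serdes and a trivial filter/map stand in for the real transformation, and
the topic and store names are placeholders:

{code}
// Re-key/filter the source topic into a derived topic first, then build the
// GlobalKTable from the derived topic.
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class DerivedGlobalTableSketch {
    public static void main(String[] args) {
        KStreamBuilder builder = new KStreamBuilder();

        // Transform the original topic and write the result to a derived topic.
        KStream<String, String> source =
                builder.stream(Serdes.String(), Serdes.String(), "source-topic");
        source.filter((k, v) -> k != null)
              .map((k, v) -> new KeyValue<>(k.toUpperCase(), v))
              .to(Serdes.String(), Serdes.String(), "derived-topic");

        // Build the global table from the derived topic.
        GlobalKTable<String, String> table = builder.globalTable(
                Serdes.String(), Serdes.String(), "derived-topic", "derived-store");

        // The rest of the topology can now join against `table` as usual; the
        // builder would then be passed to new KafkaStreams(builder, config).
    }
}
{code}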

> On 25 May 2017, at 14:48, Thomas Becker  wrote:
>
> We need to do a series of joins against a KTable that we can't co-
> partition with the stream, so we're looking at GlobalKTable.  But the
> topic backing the table is not ideally keyed for the sort of lookups
> this particular processor needs to do. Unfortunately, GlobalKTable is
> very limited in that you can only build one with the exact keys/values
> from the backing topic. I'd like to be able to perform various
> transformations on the topic before materializing the table.  I'd
> envision it looking something like the following:
>
> builder.globalTable(keySerde, valueSerde, topicName)
>.filter((k, v) -> k.isFoo())
>.map((k, v) -> new KeyValue<>(k.getBar(), v.getBaz()))
>.build(tableKeySerde, tableValueSerde, storeName);
>
> Is this something that has been considered or that others would find
> useful?
>
> --
>
>
>Tommy Becker
>
>Senior Software Engineer
>
>O +1 919.460.4747
>
>tivo.com
>
>
> 
>
> This email and any attachments may contain confidential and privileged 
> material for the sole use of the intended recipient. Any review, copying, or 
> distribution of this email (or any attachments) by others is prohibited. If 
> you are not the intended recipient, please contact the sender immediately and 
> permanently delete this email and any attachments. No employee or agent of 
> TiVo Inc. is authorized to conclude any binding agreement on behalf of TiVo 
> Inc. by email. Binding agreements with TiVo Inc. may only be made by a signed 
> written agreement.



This email and any attachments may contain confidential and privileged material 
for the sole use of the intended recipient. Any review, copying, or 
distribution of this email (or any attachments) by others is prohibited. If you 
are not the intended recipient, please contact the sender immediately and 
permanently delete this email and any attachments. No employee or agent of TiVo 
Inc. is authorized to conclude any binding agreement on behalf of TiVo Inc. by 
email. Binding agreements with TiVo Inc. may only be made by a signed written 
agreement.


[jira] [Updated] (KAFKA-5320) Update produce/fetch throttle time metrics for any request throttle

2017-05-25 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-5320:
---
Fix Version/s: (was: 0.11.1.0)

> Update produce/fetch throttle time metrics for any request throttle
> ---
>
> Key: KAFKA-5320
> URL: https://issues.apache.org/jira/browse/KAFKA-5320
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0
>
>
> From KIP-124:
> {quote}
> Producers and consumers currently expose average and maximum producer/fetch 
> request throttle time as JMX metrics. These metrics will be updated to 
> reflect total throttle time for the producer or consumer including byte-rate 
> throttling and request time throttling for all requests of the 
> producer/consumer. 
> {quote}
> Missed this during the request quota implementation.
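
For reference, the client-side throttle-time metrics mentioned above can be read
without JMX via the producer's metrics() map. A small sketch follows, filtering
on the "throttle-time" substring since the exact metric names are not quoted in
the ticket; the broker address is a placeholder.

{code}
// Print any throttle-time metrics exposed by the Java producer's metrics() map.
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class ThrottleMetricsSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder
        props.put("key.serializer", ByteArraySerializer.class.getName());
        props.put("value.serializer", ByteArraySerializer.class.getName());

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            for (Map.Entry<MetricName, ? extends Metric> e : producer.metrics().entrySet()) {
                if (e.getKey().name().contains("throttle-time")) {
                    System.out.println(e.getKey().name() + " = " + e.getValue().value());
                }
            }
        }
    }
}
{code}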



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5226) NullPointerException (NPE) in SourceNodeRecordDeserializer.deserialize

2017-05-25 Thread Ian Springer (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025688#comment-16025688
 ] 

Ian Springer commented on KAFKA-5226:
-

Thanks for the update, [~bbejeck]. That's great news.

> NullPointerException (NPE) in SourceNodeRecordDeserializer.deserialize
> --
>
> Key: KAFKA-5226
> URL: https://issues.apache.org/jira/browse/KAFKA-5226
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1
> Environment: 64-bit Amazon Linux, JDK8
>Reporter: Ian Springer
>Assignee: Bill Bejeck
> Attachments: kafka.log, streamsAppOne.log, streamsAppThree.log, 
> streamsAppTwo.log
>
>
> I saw the following NPE in our Kafka Streams app, which has 3 nodes running 
> on 3 separate machines. Out of hundreds of messages processed, the NPE only 
> occurred twice. I am not sure of the cause, so I am unable to reproduce it. 
> I'm hoping the Kafka Streams team can guess the cause based on the stack 
> trace. If I can provide any additional details about our app, please let me 
> know.
>  
> {code}
> INFO  2017-05-10 02:58:26,021 org.apache.kafka.common.utils.AppInfoParser  
> Kafka version : 0.10.2.1
> INFO  2017-05-10 02:58:26,021 org.apache.kafka.common.utils.AppInfoParser  
> Kafka commitId : e89bffd6b2eff799
> INFO  2017-05-10 02:58:26,031 o.s.context.support.DefaultLifecycleProcessor  
> Starting beans in phase 0
> INFO  2017-05-10 02:58:26,075 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from CREATED to RUNNING.
> INFO  2017-05-10 02:58:26,075 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] Started 
> Kafka Stream process
> INFO  2017-05-10 02:58:26,086 o.a.k.c.consumer.internals.AbstractCoordinator  
> Discovered coordinator p1kaf1.prod.apptegic.com:9092 (id: 2147482646 rack: 
> null) for group evergage-app.
> INFO  2017-05-10 02:58:26,126 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Revoking previously assigned partitions [] for group evergage-app
> INFO  2017-05-10 02:58:26,126 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from RUNNING to REBALANCING.
> INFO  2017-05-10 02:58:26,127 o.a.k.c.consumer.internals.AbstractCoordinator  
> (Re-)joining group evergage-app
> INFO  2017-05-10 02:58:27,712 o.a.k.c.consumer.internals.AbstractCoordinator  
> Successfully joined group evergage-app with generation 18
> INFO  2017-05-10 02:58:27,716 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Setting newly assigned partitions [us.app.Trigger-0] for group evergage-app
> INFO  2017-05-10 02:58:27,716 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from REBALANCING to REBALANCING.
> INFO  2017-05-10 02:58:27,729 
> o.a.kafka.streams.processor.internals.StreamTask  task [0_0] Initializing 
> state stores
> INFO  2017-05-10 02:58:27,731 
> o.a.kafka.streams.processor.internals.StreamTask  task [0_0] Initializing 
> processor nodes of the topology
> INFO  2017-05-10 02:58:27,742 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from REBALANCING to RUNNING.
> [14 hours pass...]
> INFO  2017-05-10 16:21:27,476 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Revoking previously assigned partitions [us.app.Trigger-0] for group 
> evergage-app
> INFO  2017-05-10 16:21:27,477 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from RUNNING to REBALANCING.
> INFO  2017-05-10 16:21:27,482 o.a.k.c.consumer.internals.AbstractCoordinator  
> (Re-)joining group evergage-app
> INFO  2017-05-10 16:21:27,489 o.a.k.c.consumer.internals.AbstractCoordinator  
> Successfully joined group evergage-app with generation 19
> INFO  2017-05-10 16:21:27,489 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Setting newly assigned partitions [us.app.Trigger-0] for group evergage-app
> INFO  2017-05-10 16:21:27,489 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from REBALANCING to REBALANCING.
> INFO  2017-05-10 16:21:27,489 
> o.a.kafka.streams.processor.internals.StreamTask  task [0_0] Initializing 
> processor nodes of the topology
> INFO  2017-05-10 16:21:27,493 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from REBALANCING to RUNNING.
> INFO  2017-05-10 16:21:30,584 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Revo

[jira] [Assigned] (KAFKA-5327) Console Consumer should only poll for up to max messages

2017-05-25 Thread huxi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huxi reassigned KAFKA-5327:
---

Assignee: huxi

> Console Consumer should only poll for up to max messages
> 
>
> Key: KAFKA-5327
> URL: https://issues.apache.org/jira/browse/KAFKA-5327
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Dustin Cote
>Assignee: huxi
>Priority: Minor
>
> The ConsoleConsumer has a --max-messages flag that can be used to limit the 
> number of messages consumed. However, the number of records actually consumed 
> is governed by max.poll.records. This means you see one message on the 
> console, but your offset has moved forward a default of 500, which is kind of 
> counterintuitive. It would be good to only commit offsets for messages we 
> have printed to the console.
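
A sketch of the fix idea from the plain Java consumer's side: cap
max.poll.records at the number of messages wanted, disable auto-commit, and
commit only past the records actually printed. Broker address, group, and topic
names are placeholders.

{code}
// Consume at most `maxMessages` records and commit only what was printed.
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class BoundedConsoleRead {
    public static void main(String[] args) {
        int maxMessages = 1;
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "console-like-group");
        props.put("enable.auto.commit", "false");
        props.put("max.poll.records", Integer.toString(maxMessages)); // don't over-fetch
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test-topic"));
            int printed = 0;
            while (printed < maxMessages) {
                for (ConsumerRecord<String, String> record : consumer.poll(1000)) {
                    System.out.println(record.value());
                    // Commit only one past the record we actually printed.
                    consumer.commitSync(Collections.singletonMap(
                            new TopicPartition(record.topic(), record.partition()),
                            new OffsetAndMetadata(record.offset() + 1)));
                    if (++printed >= maxMessages)
                        break;
                }
            }
        }
    }
}
{code}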



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3148: KAFKA-5327: ConsoleConsumer explicitly set `max.po...

2017-05-25 Thread amethystic
GitHub user amethystic opened a pull request:

https://github.com/apache/kafka/pull/3148

KAFKA-5327: ConsoleConsumer explicitly set `max.poll.records`...

KAFKA-5327: ConsoleConsumer explicitly set `max.poll.records` if 
`--max-messages` is set.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/amethystic/kafka 
KAFKA-5327_ConsoleConsumer_distable_autocommit

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3148.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3148


commit fc884eb0bd9940c3914df5e74eae25ff12bb14b9
Author: amethystic 
Date:   2017-05-26T03:12:17Z

KAFKA-5327: ConsoleConsumer explicitly set `max.poll.records` if 
`--max-messages` is set.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5327) Console Consumer should only poll for up to max messages

2017-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025743#comment-16025743
 ] 

ASF GitHub Bot commented on KAFKA-5327:
---

GitHub user amethystic opened a pull request:

https://github.com/apache/kafka/pull/3148

KAFKA-5327: ConsoleConsumer explicitly set `max.poll.records`...

KAFKA-5327: ConsoleConsumer explicitly set `max.poll.records` if 
`--max-messages` is set.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/amethystic/kafka 
KAFKA-5327_ConsoleConsumer_distable_autocommit

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3148.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3148


commit fc884eb0bd9940c3914df5e74eae25ff12bb14b9
Author: amethystic 
Date:   2017-05-26T03:12:17Z

KAFKA-5327: ConsoleConsumer explicitly set `max.poll.records` if 
`--max-messages` is set.




> Console Consumer should only poll for up to max messages
> 
>
> Key: KAFKA-5327
> URL: https://issues.apache.org/jira/browse/KAFKA-5327
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Dustin Cote
>Assignee: huxi
>Priority: Minor
>
> The ConsoleConsumer has a --max-messages flag that can be used to limit the 
> number of messages consumed. However, the number of records actually consumed 
> is governed by max.poll.records. This means you see one message on the 
> console, but your offset has moved forward a default of 500, which is kind of 
> counterintuitive. It would be good to only commit offsets for messages we 
> have printed to the console.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4585) Offset fetch and commit requests use the same permissions

2017-05-25 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025746#comment-16025746
 ] 

Ewen Cheslack-Postava commented on KAFKA-4585:
--

[~vahid] Honestly not sure -- there are a number of permissions that I find odd 
choices, especially in the consumer. Unfortunately I didn't participate in the 
discussions/patches that introduced those, so I'm not sure of the reasoning. 
I've discussed this previously with [~hachikuji] and iirc he also agreed some 
of them may seem a bit odd. Not sure if he has anything more to add here wrt 
your specific question though.

> Offset fetch and commit requests use the same permissions
> -
>
> Key: KAFKA-4585
> URL: https://issues.apache.org/jira/browse/KAFKA-4585
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.1.1
>Reporter: Ewen Cheslack-Postava
>Assignee: Vahid Hashemian
>  Labels: needs-kip
>
> Currently the handling of permissions for consumer groups seems a bit odd 
> because most of the requests use the Read permission on the Group (join, 
> sync, heartbeat, leave, offset commit, and offset fetch). This means you 
> cannot lock down certain functionality for certain users. For this issue I'll 
> highlight one realistic scenario, since conflating the ability to perform most 
> of these operations may not be a serious problem by itself.
> In particular, if you want tooling for monitoring offsets (i.e. you want to 
> be able to read from all groups) but don't want that tool to be able to write 
> offsets, you currently cannot achieve this. Part of the reason this seems odd 
> to me is that any operation which can mutate state seems like it should be a 
> Write operation (i.e. joining, syncing, leaving, and committing; maybe 
> heartbeat as well). However, [~hachikuji] has mentioned that the use of Read 
> may have been intentional. If that is the case, changing at least offset 
> fetch to be a Describe operation instead would allow isolating the mutating 
> vs non-mutating request types.
> Note that this would require a KIP and would potentially have some 
> compatibility implications. Note however, that if we went with the Describe 
> option, Describe is allowed by default when Read, Write, or Delete are 
> allowed, so this may not have to have any compatibility issues (if the user 
> previously allowed Read, they'd still have all the same capabilities as 
> before).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-trunk-jdk8 #1596

2017-05-25 Thread Apache Jenkins Server
See 

--
[...truncated 2.94 MB...]

kafka.security.auth.SimpleAclAuthorizerTest > 
testDistributedConcurrentModificationOfResourceAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testAclManagementAPIs STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testAclManagementAPIs PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testWildCardAcls STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testWildCardAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testTopicAcl STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testTopicAcl PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testSuperUserHasAccess STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testSuperUserHasAccess PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testDenyTakesPrecedence STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testDenyTakesPrecedence PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testNoAclFoundOverride STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testNoAclFoundOverride PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testHighConcurrencyModificationOfResourceAcls STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testHighConcurrencyModificationOfResourceAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testLoadCache STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testLoadCache PASSED

kafka.integration.PrimitiveApiTest > testMultiProduce STARTED

kafka.integration.PrimitiveApiTest > testMultiProduce PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch STARTED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize 
STARTED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests STARTED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch STARTED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch PASSED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression STARTED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic STARTED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest STARTED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete STARTED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 
PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooHigh 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooHigh 
PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooHigh 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooHigh 
PASSED

kafka.integration.TopicMetadataTest > testIsrAfterBrokerShutDownAndJoinsBack 
STARTED

kafka.integration.TopicMetadataTest > testIsrAfterBrokerShutDownAndJoinsBack 
PASSED

kafka.integration.TopicMetadataTest > testAutoCreateTopicWithCollision STARTED

kafka.integration.TopicMetadataTest > testAutoCreateTopicWithCollision PASSED

kafka.integration.TopicMetadataTest > testAliveBrokerListWithNoTopics STARTED

kafka.integration.TopicMetadataTest > testAliveBrokerListWithNoTopics PASSED

kafka.integration.TopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.TopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.TopicMetadataTest > testGetAllTopicMetadata STARTED

kafka.integration.TopicMetadataTest > testGetAllTopicMetadata PASSED

kafka.integration.TopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.TopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.TopicMetadataTest > testBasicTopicMetadata STARTED

kafka.integration.TopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.TopicMetadataTest > testAutoCreateTopicWithInvalidReplication 
STARTED

kafka.integration.TopicMetadataTest > testAutoCreateTopicWithInvalidReplication 
PASSED

kafka.integration.TopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.TopicMetadat

[jira] [Created] (KAFKA-5328) consider switching json parser from scala to jackson

2017-05-25 Thread Onur Karaman (JIRA)
Onur Karaman created KAFKA-5328:
---

 Summary: consider switching json parser from scala to jackson
 Key: KAFKA-5328
 URL: https://issues.apache.org/jira/browse/KAFKA-5328
 Project: Kafka
  Issue Type: Sub-task
Reporter: Onur Karaman
Assignee: Onur Karaman


The scala json parser is significantly slower than jackson.

This can have a nontrivial impact on controller initialization since the 
controller loads and parses almost all zookeeper state.
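
For a sense of what Jackson-based parsing would look like, a minimal sketch
using jackson-databind on a partition-state style JSON string; the payload here
is illustrative.

{code}
// Parse a zookeeper partition-state style JSON blob with Jackson.
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class PartitionStateParseSketch {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    public static void main(String[] args) throws Exception {
        String json = "{\"controller_epoch\":1,\"leader\":1,\"version\":1,"
                    + "\"leader_epoch\":4,\"isr\":[1,2,3]}";
        JsonNode state = MAPPER.readTree(json);
        int leader = state.get("leader").asInt();
        JsonNode isr = state.get("isr");
        System.out.println("leader=" + leader + ", isr size=" + isr.size());
    }
}
{code}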



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5329) Metadata cache on the broker may have replica list may have different order from zookeeper

2017-05-25 Thread Jiangjie Qin (JIRA)
Jiangjie Qin created KAFKA-5329:
---

 Summary: Metadata cache on the broker may have replica list may 
have different order from zookeeper
 Key: KAFKA-5329
 URL: https://issues.apache.org/jira/browse/KAFKA-5329
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.10.2.1
Reporter: Jiangjie Qin
 Fix For: 0.11.0.1


It looks like in {{PartitionStateInfo}} we are storing the replicas in a set 
instead of a Seq. This causes the replica order to be lost. In most cases this 
is fine, but in the context of preferred leader election, the replica order 
determines which replica is the preferred leader of a partition. It would be 
useful to make the order always consistent with the order in zookeeper. 
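
A plain-Java illustration of why the set loses the information that matters for
preferred leader election; the replica ids are made up.

{code}
// A hash-based set forgets the original replica order (and hence the preferred
// leader, which is the first replica), while a list preserves it.
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReplicaOrderSketch {
    public static void main(String[] args) {
        List<Integer> assignedReplicas = Arrays.asList(3, 1, 2); // preferred leader is 3

        Set<Integer> asSet = new HashSet<>(assignedReplicas);
        System.out.println("as a set : " + asSet);             // e.g. [1, 2, 3] -- order lost
        System.out.println("as a list: " + assignedReplicas);  // [3, 1, 2] -- leader first
    }
}
{code}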



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5330) Use per-task converters in Connect

2017-05-25 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-5330:


 Summary: Use per-task converters in Connect
 Key: KAFKA-5330
 URL: https://issues.apache.org/jira/browse/KAFKA-5330
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect
Affects Versions: 0.11.0.0
Reporter: Ewen Cheslack-Postava


Because Connect started with a worker-wide model of data formats, we currently 
allocate a single Converter per worker and only allocate an independent one 
when the user overrides the converter.

This can lead to performance problems when the worker-level default converter 
is used by a large number of tasks because converters need to be threadsafe to 
support this model and they may spend a lot of time just on synchronization.

We could, instead, simply allocate one converter per task. There is some 
overhead involved, but generally it shouldn't be that large. For example, 
Confluent's Avro converters will each have their own schema cache and have to 
make their own calls to the schema registry API, but these are relatively small, 
likely inconsequential compared to any normal overhead we would already have 
for creating and managing each task. 
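
Not the actual worker code, but a sketch of what "one converter per task" could
look like: each task reflectively creates and configures its own Converter
instance instead of sharing the worker-wide one. Config handling is simplified
and the class and field names are placeholders.

{code}
// Allocate a fresh, independently configured Converter per task so tasks never
// contend on a shared converter's internal locks or caches.
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.connect.storage.Converter;

public class PerTaskConverterFactory {
    private final String converterClassName;          // e.g. the worker's key.converter setting
    private final Map<String, Object> converterConfig;

    public PerTaskConverterFactory(String converterClassName, Map<String, Object> converterConfig) {
        this.converterClassName = converterClassName;
        this.converterConfig = new HashMap<>(converterConfig);
    }

    // Called once per task instead of once per worker.
    public Converter newConverter(boolean isKey) {
        try {
            Converter converter =
                    (Converter) Class.forName(converterClassName).newInstance();
            converter.configure(converterConfig, isKey);
            return converter;
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("Failed to create converter " + converterClassName, e);
        }
    }
}
{code}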



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5329) Replica list in the metadata cache on the broker may have different order from zookeeper

2017-05-25 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin updated KAFKA-5329:

Summary: Replica list in the metadata cache on the broker may have 
different order from zookeeper  (was: Metadata cache on the broker may have 
replica list may have different order from zookeeper)

> Replica list in the metadata cache on the broker may have different order 
> from zookeeper
> 
>
> Key: KAFKA-5329
> URL: https://issues.apache.org/jira/browse/KAFKA-5329
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.1
>Reporter: Jiangjie Qin
>  Labels: newbie
> Fix For: 0.11.0.1
>
>
> It looks like in {{PartitionStateInfo}} we are storing the replicas in a set 
> instead of a Seq. This causes the replica order to be lost. In most cases this 
> is fine, but in the context of preferred leader election, the replica order 
> determines which replica is the preferred leader of a partition. It would be 
> useful to make the order always consistent with the order in zookeeper. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3149: KAFKA-5281: System tests for transactions

2017-05-25 Thread apurvam
GitHub user apurvam opened a pull request:

https://github.com/apache/kafka/pull/3149

KAFKA-5281: System tests for transactions



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apurvam/kafka 
KAFKA-5281-transactions-system-tests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3149.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3149


commit 07d38380e8734349666e0ddef0da469bfd2a38a4
Author: Apurva Mehta 
Date:   2017-05-25T00:58:20Z

Initial commit of TransactionalMessageCopier program

commit 9a711d7a0ff9c9a3ecb0aefe153a9eb411265609
Author: Apurva Mehta 
Date:   2017-05-25T21:17:18Z

Initial commit of transactions system test

commit 679df52ad77f731baca5eb5a74d1ad317856096a
Author: Apurva Mehta 
Date:   2017-05-25T21:45:27Z

WIP

commit fc62578008353cac254ea76aa5eef8dbf2b35c9d
Author: Apurva Mehta 
Date:   2017-05-26T05:13:05Z

WIP - Got the basic copy / validate to work. Now to add bouncing and 
failures

commit aa88be80623909d2bda2b2536db690e905fe702d
Author: Apurva Mehta 
Date:   2017-05-26T06:10:50Z

Implemented clean and hard bounces of brokers. Hard bounce test fails 
because we can't find the coordinator
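
For readers following along, the heart of a copier like the
TransactionalMessageCopier above, sketched against the KIP-98 producer APIs:
copy a consumed batch and commit the consumed offsets in the same transaction.
It assumes a producer created with a transactional.id (with initTransactions()
already called) and a consumer with auto-commit disabled; topic and group names
are placeholders and error handling is simplified.

{code}
// Copy one consumed batch to an output topic inside a single transaction.
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.TopicPartition;

public class CopyLoopSketch {
    static void copyOneBatch(KafkaConsumer<byte[], byte[]> consumer,
                             KafkaProducer<byte[], byte[]> producer,
                             String outputTopic, String consumerGroup) {
        ConsumerRecords<byte[], byte[]> records = consumer.poll(1000);
        if (records.isEmpty())
            return;
        Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
        producer.beginTransaction();
        try {
            for (ConsumerRecord<byte[], byte[]> record : records) {
                producer.send(new ProducerRecord<>(outputTopic, record.key(), record.value()));
                offsets.put(new TopicPartition(record.topic(), record.partition()),
                            new OffsetAndMetadata(record.offset() + 1));
            }
            // Commit the copied records and the consumed offsets atomically.
            producer.sendOffsetsToTransaction(offsets, consumerGroup);
            producer.commitTransaction();
        } catch (KafkaException e) {
            producer.abortTransaction();   // neither output nor offsets become visible
        }
    }
}
{code}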




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5281) System tests for KIP-98 / transactions

2017-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025865#comment-16025865
 ] 

ASF GitHub Bot commented on KAFKA-5281:
---

GitHub user apurvam opened a pull request:

https://github.com/apache/kafka/pull/3149

KAFKA-5281: System tests for transactions



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apurvam/kafka 
KAFKA-5281-transactions-system-tests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3149.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3149


commit 07d38380e8734349666e0ddef0da469bfd2a38a4
Author: Apurva Mehta 
Date:   2017-05-25T00:58:20Z

Initial commit of TransactionalMessageCopier program

commit 9a711d7a0ff9c9a3ecb0aefe153a9eb411265609
Author: Apurva Mehta 
Date:   2017-05-25T21:17:18Z

Initial commit of transactions system test

commit 679df52ad77f731baca5eb5a74d1ad317856096a
Author: Apurva Mehta 
Date:   2017-05-25T21:45:27Z

WIP

commit fc62578008353cac254ea76aa5eef8dbf2b35c9d
Author: Apurva Mehta 
Date:   2017-05-26T05:13:05Z

WIP - Got the basic copy / validate to work. Now to add bouncing and 
failures

commit aa88be80623909d2bda2b2536db690e905fe702d
Author: Apurva Mehta 
Date:   2017-05-26T06:10:50Z

Implemented clean and hard bounces of brokers. Hard bounce test fails 
because we can't find the coordinator




> System tests for KIP-98 / transactions
> --
>
> Key: KAFKA-5281
> URL: https://issues.apache.org/jira/browse/KAFKA-5281
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Apurva Mehta
>Assignee: Apurva Mehta
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> Kafka core System tests:
> # Failures for consumer coordinator, transaction coordinator, producer, 
> partition leader, and controller.
> # Same as above, but with multiple producers.
> # Multiple producers and consumers work together in failure modes, with 
> aborts.
> # Hard and soft failures.
> Integration test:
> # Multiple producers and consumers work together in a normal mode, with 
> aborts.
> # Transaction expiration.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5331) MirrorMaker can't guarantee data is consistent between clusters

2017-05-25 Thread xuzq (JIRA)
xuzq created KAFKA-5331:
---

 Summary: MirrorMaker can't guarantee data is consistent between 
clusters
 Key: KAFKA-5331
 URL: https://issues.apache.org/jira/browse/KAFKA-5331
 Project: Kafka
  Issue Type: Improvement
  Components: core, tools
Affects Versions: 0.10.2.1
Reporter: xuzq
Priority: Critical


MirrorMaker can't guarantee data is consistent between clusters even when the 
topic partition counts are equal.

If we want to copy data from the old cluster to the new cluster while 
guaranteeing data consistency, we have no choice but to implement a custom 
MirrorMakerMessageHandler class manually to achieve it.

I think we can add a parameter that says whether or not you want to preserve 
ordering, such as data.consistency, with a default value of true.

Of course, you can set data.consistency = false if you don't want the data 
ordered consistently between clusters, or if some topic partition counts are not 
equal between clusters.
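
To make the description concrete, below is the core of a partition-preserving
handler of the kind referred to above. The real
kafka.tools.MirrorMaker.MirrorMakerMessageHandler interface takes an internal
record type and its exact signature varies by version, so this sketch only shows
the mapping such a handler would return.

{code}
// Pin the target partition to the source partition so per-partition ordering
// carries over (this only works when source and target partition counts match).
import java.util.Collections;
import java.util.List;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PartitionPreservingHandlerSketch {
    public static List<ProducerRecord<byte[], byte[]>> handle(ConsumerRecord<byte[], byte[]> record) {
        return Collections.singletonList(
                new ProducerRecord<>(record.topic(),      // same topic name downstream
                                     record.partition(),  // same partition => same relative order
                                     record.key(),
                                     record.value()));
    }
}
{code}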



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (KAFKA-5251) Producer should drop queued sends when transaction is aborted

2017-05-25 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson reassigned KAFKA-5251:
--

Assignee: Jason Gustafson

> Producer should drop queued sends when transaction is aborted
> -
>
> Key: KAFKA-5251
> URL: https://issues.apache.org/jira/browse/KAFKA-5251
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> As an optimization, if a transaction is aborted, we can drop any records 
> which have not yet been sent to the brokers. However, to avoid the sequence 
> number getting out of sync, we need to continue sending any request which has 
> been sent at least once.
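
An illustrative sketch of that rule (not the real accumulator code; QueuedBatch
is a hypothetical stand-in): on abort, fail only the batches that were never
handed to the network, and let anything sent at least once drain so sequence
numbers stay in sync.

{code}
// Drop never-sent batches on abort; keep anything that has been sent at least once.
import java.util.Deque;
import java.util.Iterator;

public class AbortQueuedSendsSketch {
    interface QueuedBatch {                 // hypothetical stand-in for a producer batch
        int attempts();                     // how many times it has been sent
        void abort(RuntimeException reason);
    }

    static void abortUnsentBatches(Deque<QueuedBatch> queue, RuntimeException reason) {
        Iterator<QueuedBatch> it = queue.iterator();
        while (it.hasNext()) {
            QueuedBatch batch = it.next();
            if (batch.attempts() == 0) {    // never sent: safe to drop
                batch.abort(reason);
                it.remove();
            }
            // Batches with attempts() > 0 stay queued so retries complete and
            // already-assigned sequence numbers are not skipped.
        }
    }
}
{code}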



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5281) System tests for KIP-98 / transactions

2017-05-25 Thread Apurva Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apurva Mehta updated KAFKA-5281:

Status: Patch Available  (was: In Progress)

> System tests for KIP-98 / transactions
> --
>
> Key: KAFKA-5281
> URL: https://issues.apache.org/jira/browse/KAFKA-5281
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Apurva Mehta
>Assignee: Apurva Mehta
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> Kafka core System tests:
> # Failures for consumer coordinator, transaction coordinator, producer, 
> partition leader, and controller.
> # Same as above, but with multiple producers.
> # Multiple producers and consumers work together in failure modes, with 
> aborts.
> # Hard and soft failures.
> Integration test:
> # Multiple producers and consumers work together in a normal mode, with 
> aborts.
> # Transaction expiration.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5331) MirrorMaker can't guarantee data is consistent between clusters

2017-05-25 Thread xuzq (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuzq updated KAFKA-5331:

Attachment: KAFKA-5331.patch

> MirrorMaker can't guarantee data is consistent between clusters
> ---
>
> Key: KAFKA-5331
> URL: https://issues.apache.org/jira/browse/KAFKA-5331
> Project: Kafka
>  Issue Type: Improvement
>  Components: core, tools
>Affects Versions: 0.10.2.1
>Reporter: xuzq
>Priority: Critical
> Attachments: KAFKA-5331.patch
>
>
> MirrorMaker can't guarantee data is consistent between clusters even when the 
> topic partition counts are equal.
> If we want to copy data from the old cluster to the new cluster while 
> guaranteeing data consistency, we have no choice but to implement a custom 
> MirrorMakerMessageHandler class manually to achieve it.
> I think we can add a parameter that says whether or not you want to preserve 
> ordering, such as data.consistency, with a default value of true.
> Of course, you can set data.consistency = false if you don't want the data 
> ordered consistently between clusters, or if some topic partition counts are 
> not equal between clusters.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5282) Transactions integration test: Use factory methods to keep track of open producers and consumers and close them all on tearDown

2017-05-25 Thread Vahid Hashemian (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vahid Hashemian updated KAFKA-5282:
---
Status: Patch Available  (was: In Progress)

> Transactions integration test: Use factory methods to keep track of open 
> producers and consumers and close them all on tearDown
> ---
>
> Key: KAFKA-5282
> URL: https://issues.apache.org/jira/browse/KAFKA-5282
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Apurva Mehta
>Assignee: Vahid Hashemian
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> See: https://github.com/apache/kafka/pull/3093/files#r117354588
> The current transactions integration test creates individual producers and 
> consumer per test, and closes them independently. 
> It would be more robust to create them through a central factory method that 
> keeps track of each instance, and then close those instances on `tearDown`.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)