[jira] [Created] (KAFKA-7617) Document security primitives

2018-11-12 Thread Viktor Somogyi (JIRA)
Viktor Somogyi created KAFKA-7617:
-

 Summary: Document security primitives
 Key: KAFKA-7617
 URL: https://issues.apache.org/jira/browse/KAFKA-7617
 Project: Kafka
  Issue Type: Task
Reporter: Viktor Somogyi
Assignee: Viktor Somogyi


Although the documentation gives help on configuring authentication and 
authorization, it does not list the security primitives (operations and 
resources) that can be used, which makes it hard for users to set up 
thorough authorization rules.
This task would cover adding these to the security page of the Kafka 
documentation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-380: Detect outdated control requests and bounced brokers using broker generation

2018-11-12 Thread Becket Qin
Hi Patrick,

I am wondering why the controller should send STALE_BROKER_EPOCH error to
the broker if the broker epoch is stale? Would this be a little confusing
to the current broker if the request was sent by a broker with previous
epoch? Should the controller just ignore those requests in that case?

Thanks,

Jiangjie (Becket) Qin

On Fri, Nov 9, 2018 at 2:17 AM Patrick Huang  wrote:

> Hi,
>
> In this KIP, we are also going to add a new exception and a new error code
> "STALE_BROKER_EPOCH" in order to allow the broker to respond back the right
> error when it sees outdated broker epoch in the control requests. Since
> adding a new exception and error code is also considered as public
> interface change, I have updated the original KIP accordingly to include
> this change. Feel free to comment if there is any concern.
>
> Thanks,
> Zhanxiang (Patrick) Huang
>
> 
> From: Patrick Huang 
> Sent: Tuesday, October 23, 2018 6:20
> To: Jun Rao; dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-380: Detect outdated control requests and
> bounced brokers using broker generation
>
> Agreed. I have updated the PR to add czxid in ControlledShutdownRequest (
> https://github.com/apache/kafka/pull/5821). Appreciated if you can take a
> look.
>
> Btw, I also have the vote thread for this KIP:
> https://lists.apache.org/thread.html/3689d83db537d5aa86d10967dee7ee29578897fc123daae4f77a8605@%3Cdev.kafka.apache.org%3E
>
> Best,
> Zhanxiang (Patrick) Huang
>
> 
> From: Jun Rao 
> Sent: Monday, October 22, 2018 21:31
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-380: Detect outdated control requests and
> bounced brokers using broker generation
>
> Hi, Patrick,
>
> Yes, that's the general sequence. After step 2, the shutting down broker
> can give up the controlled shutdown process and proceed to shut down. When
> it's restarted, it could still receive StopReplica requests from the
> controller in reaction to the previous controlled shutdown requests. This
> could lead the restarted broker to a bad state.
>
> Thanks,
>
> Jun
>
>
> On Mon, Oct 22, 2018 at 4:32 PM, Patrick Huang  wrote:
>
> > Hi Jun,
> >
> > That is a good point. I want to make it clear about the scenario you
> > mentioned. Correct me if I am wrong. Here is the sequence of the event:
> >
> >1. Broker sends ControlledShutdown request 1 to controller
> >2. Broker sends ControlledShutdown request 2 to controller due to
> >retries
> >3. Controller processes ControlledShutdown request 1
> >4. Controller sends control requests to the broker
> >5. Broker receives ControlledShutdown response 1 from controller
> >6. Broker shuts down and restarts quickly
> >7. Controller processes ControllerShutdown request 2
> >8. Controller sends control requests to the broker
> >
> > It is possible that the controller processes the broker change event between
> > 6) and 7). In this case, the controller has already updated the cached czxid
> > to the up-to-date one, so the bounced broker will not reject control
> > requests in 8), which causes a correctness problem.
> >
> >
> > Best,
> > Zhanxiang (Patrick) Huang
> >
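The czxid-based fencing under discussion can be sketched as a simple generation check on the controller side (all names here are illustrative, not the actual controller code). The race above arises precisely because the cached generation is refreshed between steps 6 and 7, so the retried request passes this check:

```java
import java.util.HashMap;
import java.util.Map;

public class BrokerGenerationFencing {
    // Latest broker generation (czxid of the broker's registration znode)
    // cached by the controller, per broker id.
    private final Map<Integer, Long> cachedCzxid = new HashMap<>();

    // Record the czxid observed when the broker (re)registers in ZooKeeper.
    public void onBrokerRegistration(int brokerId, long czxid) {
        cachedCzxid.put(brokerId, czxid);
    }

    // A request is accepted only if it carries the current generation.
    // zxids are totally ordered, so an older generation is strictly smaller.
    public boolean isCurrentGeneration(int brokerId, long requestCzxid) {
        Long current = cachedCzxid.get(brokerId);
        return current != null && requestCzxid == current;
    }

    public static void main(String[] args) {
        BrokerGenerationFencing f = new BrokerGenerationFencing();
        f.onBrokerRegistration(1, 100L);                 // broker 1 registers
        boolean before = f.isCurrentGeneration(1, 100L); // current generation
        f.onBrokerRegistration(1, 200L);                 // broker bounces, re-registers
        boolean stale = f.isCurrentGeneration(1, 100L);  // retry from old generation
        System.out.println(before + " " + stale);        // true false
    }
}
```

In the race described above, the retried ControlledShutdown request would carry the *new* czxid from the controller's point of view, which is why the check alone does not help there.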
> > --
> > *From:* Jun Rao 
> > *Sent:* Monday, October 22, 2018 14:45
> > *To:* dev
> > *Subject:* Re: [DISCUSS] KIP-380: Detect outdated control requests and
> > bounced brokers using broker generation
> >
> > Hi, Patrick,
> >
> > There is another thing that may be worth considering.
> >
> > 10. It will be useful to include the czxid also in the ControlledShutdown
> > request. This way, if the broker has been restarted, the controller can
> > ignore an old ControlledShutdown request(e.g., due to retries). This will
> > prevent the restarted broker from incorrectly stopping replicas.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Thu, Oct 11, 2018 at 5:56 PM, Patrick Huang 
> > wrote:
> >
> > > Hi Jun,
> > >
> > > Thanks a lot for the comments.
> > >
> > > 1. czxid is globally unique and monotonically increasing based on the
> > > zookeeper doc.
> > > References (from
> > > https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html):
> > > "Every change to the ZooKeeper state receives a stamp in the form of a
> > > *zxid* (ZooKeeper Transaction Id). This exposes the total ordering of
> all
> > > changes to ZooKeeper. Each change will have a unique zxid and if zxid1
> is
> > > smaller than zxid2 then zxid1 happened before zxid2."
> > > "czxid: The zxid of the change that caused this znode to be created."
> > >
> > > 2. You are right. There will be only one broker change event fired in
> the
> > > case I mentioned because we will not register the watcher before the
> > read.
> > >
> > > 3. Let's say we want to initialize the states of broker set A and we
> want
> > > the cluster to be aware of the absence of broker set B. The currently
> > live
> > > broker set in the cluster is C.
> > >
> > > From the design point of view, here are the rules 

Jenkins build is back to normal : kafka-trunk-jdk11 #89

2018-11-12 Thread Apache Jenkins Server
See 



Re: [VOTE] KIP-374: Add '--help' option to all available Kafka CLI commands

2018-11-12 Thread Mickael Maison
+1 (non-binding)
Thanks for the KIP!
On Mon, Nov 12, 2018 at 5:16 AM Becket Qin  wrote:
>
> Thanks for the KIP. +1 (binding).
>
> On Mon, Nov 12, 2018 at 9:59 AM Harsha Chintalapani  wrote:
>
> > +1 (binding)
> >
> > -Harsha
> > On Nov 11, 2018, 3:49 PM -0800, Daniele Ascione ,
> > wrote:
> > > +1 (non-binding)
> > >
> > > Il ven 9 nov 2018, 02:09 Colin McCabe  ha scritto:
> > >
> > > > +1 (binding)
> > > >
> > > >
> > > >
> > > > On Wed, Oct 31, 2018, at 05:42, Srinivas Reddy wrote:
> > > > > Hi All,
> > > > >
> > > > > I would like to call for a vote on KIP-374:
> > > > > https://cwiki.apache.org/confluence/x/FgSQBQ
> > > > >
> > > > > Summary:
> > > > > Currently, the '--help' option is recognized by some Kafka commands
> > > > > but not all. To provide a consistent user experience, it would
> > > > > be nice to add a '--help' option to all Kafka commands.
> > > > >
> > > > > I'd appreciate any votes or feedback.
> > > > >
> > > > > --
> > > > > Srinivas Reddy
> > > > >
> > > > > http://mrsrinivas.com/
> > > > >
> > > > >
> > > > > (Sent via gmail web)
> > > >
> > > >
> >


[jira] [Created] (KAFKA-7618) Trogdor - Fix /tasks endpoint parameters

2018-11-12 Thread Stanislav Kozlovski (JIRA)
Stanislav Kozlovski created KAFKA-7618:
--

 Summary: Trogdor - Fix /tasks endpoint parameters
 Key: KAFKA-7618
 URL: https://issues.apache.org/jira/browse/KAFKA-7618
 Project: Kafka
  Issue Type: Bug
Reporter: Stanislav Kozlovski
Assignee: Stanislav Kozlovski


A Trogdor Coordinator's `/tasks` endpoint returns the status of all tasks. It 
supports arguments to filter the tasks, like `firstStartMs`, `lastStartMs`, 
`firstEndMs`, `lastEndMs`.
These arguments denote milliseconds since the unix epoch.

There is currently a bug where the endpoint parses these arguments as 
integers, whereas they should be longs (the current unix millisecond timestamp 
does not fit into an integer).

This results in API calls failing with a 500 Internal Server Error:
{code:java}
curl -v -L -G -d "firstStartMs=1542028764787" localhost:8889/coordinator/tasks
* Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 8889 (#0)
> GET /coordinator/tasks?firstStartMs=1542028764787 HTTP/1.1
> Host: localhost:8889
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 500 Internal Server Error
< Date: Mon, 12 Nov 2018 13:28:59 GMT
< Content-Type: application/json
< Content-Length: 43
< Server: Jetty(9.4.12.v20180830)
<
* Connection #0 to host localhost left intact{code}
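The overflow is easy to reproduce outside Trogdor: a current epoch-millisecond timestamp exceeds `Integer.MAX_VALUE`, so an `int`-typed query parameter cannot hold it. A minimal standalone illustration (not the Trogdor code itself):

```java
public class TimestampParam {
    public static void main(String[] args) {
        String firstStartMs = "1542028764787"; // epoch milliseconds from the example above

        // Parsing as long works: the value fits comfortably in 64 bits.
        long asLong = Long.parseLong(firstStartMs);
        System.out.println(asLong > Integer.MAX_VALUE); // true

        // Parsing as int, which is effectively what the buggy endpoint does, throws.
        try {
            Integer.parseInt(firstStartMs);
        } catch (NumberFormatException e) {
            System.out.println("parse as int fails");
        }
    }
}
```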





[jira] [Created] (KAFKA-7619) Trogdor - Allow filtering tasks by state in /coordinator/tasks endpoint

2018-11-12 Thread Stanislav Kozlovski (JIRA)
Stanislav Kozlovski created KAFKA-7619:
--

 Summary: Trogdor - Allow filtering tasks by state in 
/coordinator/tasks endpoint
 Key: KAFKA-7619
 URL: https://issues.apache.org/jira/browse/KAFKA-7619
 Project: Kafka
  Issue Type: Improvement
Reporter: Stanislav Kozlovski
Assignee: Stanislav Kozlovski


A Trogdor Coordinator's `/tasks` endpoint returns the status of all tasks. It 
supports arguments to filter the tasks, like `firstStartMs`, `lastStartMs`, 
`firstEndMs`, `lastEndMs`.
These arguments denote milliseconds since the unix epoch.

It would be useful to support filtering by the state of the task. We currently 
have no way of getting every `RUNNING`, `STOPPED`, or `PENDING` task unless we 
want to manually filter through everything returned by `/coordinator/tasks`.





Re: [VOTE] KIP-374: Add '--help' option to all available Kafka CLI commands

2018-11-12 Thread Vahid Hashemian
+1 (non-binding)
Thanks for the KIP.

--Vahid

On Mon, Nov 12, 2018, 04:06 Mickael Maison  wrote:

> +1 (non-binding)
> Thanks for the KIP!
> On Mon, Nov 12, 2018 at 5:16 AM Becket Qin  wrote:
> >
> > Thanks for the KIP. +1 (binding).
> >
> > On Mon, Nov 12, 2018 at 9:59 AM Harsha Chintalapani 
> wrote:
> >
> > > +1 (binding)
> > >
> > > -Harsha
> > > On Nov 11, 2018, 3:49 PM -0800, Daniele Ascione ,
> > > wrote:
> > > > +1 (non-binding)
> > > >
> > > > Il ven 9 nov 2018, 02:09 Colin McCabe  ha
> scritto:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Oct 31, 2018, at 05:42, Srinivas Reddy wrote:
> > > > > > Hi All,
> > > > > >
> > > > > > I would like to call for a vote on KIP-374:
> > > > > > https://cwiki.apache.org/confluence/x/FgSQBQ
> > > > > >
> > > > > > Summary:
> > > > > > Currently, the '--help' option is recognized by some Kafka
> commands
> > > > > > but not all. To provide a consistent user experience, it would
> > > > > > be nice to add a '--help' option to all Kafka commands.
> > > > > >
> > > > > > I'd appreciate any votes or feedback.
> > > > > >
> > > > > > --
> > > > > > Srinivas Reddy
> > > > > >
> > > > > > http://mrsrinivas.com/
> > > > > >
> > > > > >
> > > > > > (Sent via gmail web)
> > > > >
> > > > >
> > >
>


Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id

2018-11-12 Thread Boyang Chen
Thanks Mayuresh for the feedback! Do you have a quick example of passing in 
consumer config dynamically? I mainly use Kafka Streams in my daily work, so I 
am probably missing how to do this in the current consumer setting.


For differentiating session timeout and registration timeout, I would try to 
enhance the documentation in the first stage and see whether people still find 
it confusing (it would be great if they find it straightforward!). Since one 
doesn't have to fully understand the difference unless defining the new config 
"member name", for current users we can take the time to gather their 
understanding and improve our documentation correspondingly in follow-up 
KIPs.


Boyang


From: Mayuresh Gharat 
Sent: Sunday, November 11, 2018 1:06 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by 
specifying member id

Hi Boyang,

Thanks for the reply.

Please find the replies inline below :
For having a consumer config at runtime, I think it's not necessary to
address in this KIP because most companies run sidecar jobs through daemon
software like puppet. It should be easy to change the config through script
or UI without actual code change. We still want to leave flexibility for
user to define member name as they like.
This might be a little different for companies that use configuration
management tools that do not allow the applications to define/change the
configs dynamically. For example, if we use something similar to spring to
pull in the configs for the KafkaConsumer and pass it to the constructor to
create the KafkaConsumer object, it will be hard to specify a unique value
to the "MEMBER_NAME" config unless someone deploying the app generates a
unique string for this config outside the deployment workflow and copies it
statically before starting up each consumer instance. Unless we can loosen
the criteria for uniqueness of this config value, for each consumer
instance in the consumer group, I am not sure of a better way of
addressing this. If we don't want to loosen the criteria, then providing a
dynamic way to pass this in at runtime, would put the onus of having the
same unique value each time a consumer is restarted, on to the application
that is running the consumer.
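As one possible illustration of the runtime approach: derive the member name from host information when building the consumer config. The `member.name` key corresponds to the config proposed in this KIP; everything else here is a hypothetical sketch (plain `Properties`, no Kafka client dependency), not an endorsement of any particular scheme:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Properties;

public class StaticMemberConfig {
    // Build consumer properties at runtime, deriving the member name from the
    // host so each instance in the group gets a stable, unique value across
    // restarts on the same machine.
    static Properties consumerProps(String groupId) {
        Properties props = new Properties();
        props.put("group.id", groupId);
        String host;
        try {
            host = InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            host = "unknown-host"; // fallback so startup still succeeds
        }
        // "member.name" is the config name proposed in KIP-345; it is shown as
        // a plain string because no client-side constant exists yet.
        props.put("member.name", groupId + "-" + host);
        return props;
    }

    public static void main(String[] args) {
        Properties p = consumerProps("my-app");
        System.out.println(p.getProperty("member.name"));
    }
}
```

This sidesteps the static-config limitation, but only if two instances never share a hostname — which is exactly the uniqueness burden Mayuresh describes.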

I just updated the kip about having both "registration timeout" and
"session timeout". The benefit of having two configs instead of one is to
reduce the mental burden for operation, for example user just needs to
unset "member name" to cast back to dynamic membership without worrying
about tuning the "session timeout" back to a smaller value.
--- That is a good point. I was thinking that if both configs are specified, 
it could be confusing for end users who don't understand the internals of the 
consumer and its interaction with the group coordinator: which one takes 
precedence when, and how each affects consumer behavior. Just 
my 2 cents.

Thanks,

Mayuresh

On Fri, Nov 9, 2018 at 8:27 PM Boyang Chen  wrote:

> Hey Mayuresh,
>
>
> thanks for the thoughtful questions! Let me try to answer your questions
> one by one.
>
>
> For having a consumer config at runtime, I think it's not necessary to
> address in this KIP because most companies run sidecar jobs through daemon
> software like puppet. It should be easy to change the config through script
> or UI without actual code change. We still want to leave flexibility for
> user to define member name as they like.
>
>
> I just updated the kip about having both "registration timeout" and
> "session timeout". The benefit of having two configs instead of one is to
> reduce the mental burden for operation, for example user just needs to
> unset "member name" to cast back to dynamic membership without worrying
> about tuning the "session timeout" back to a smaller value.
>
>
> For backup topic, I think it's a low-level detail which could be addressed
> in the implementation. I feel no preference of adding a new topic vs reuse
> consumer offsets topic. I will do more analysis and make a trade-off
> comparison. Nice catch!
>
>
> I hope the explanations make sense to you. I will keep polishing on the
> edge cases and details.
>
>
> Best,
>
> Boyang
>
> 
> From: Mayuresh Gharat 
> Sent: Saturday, November 10, 2018 10:25 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by
> specifying member id
>
> Hi Boyang,
>
> Thanks for the KIP and sorry for being late to the party. This KIP is
> really useful for us at Linkedin.
>
> I had a few questions :
>
> The idea of having static member name seems nice, but instead of a config,
> would it be possible for it to be passed in to the consumer at runtime?
> This is because an app might want to decide the config value at runtime
> using its host information for example, to generate the unique member name.
>
> Also the KIP talks about using the "REGISTRATION_TIMEOUT_MS". I was
> wond

[jira] [Resolved] (KAFKA-7557) optimize LogManager.truncateFullyAndStartAt()

2018-11-12 Thread Jun Rao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-7557.

   Resolution: Fixed
Fix Version/s: 2.2.0

merged to trunk

> optimize LogManager.truncateFullyAndStartAt()
> -
>
> Key: KAFKA-7557
> URL: https://issues.apache.org/jira/browse/KAFKA-7557
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Jun Rao
>Assignee: huxihx
>Priority: Major
> Fix For: 2.2.0
>
>
> When a ReplicaFetcherThread calls LogManager.truncateFullyAndStartAt() for a 
> partition, we call LogManager.checkpointLogRecoveryOffsetsInDir() and then 
> Log.deleteSnapshotsAfterRecoveryPointCheckpoint() on all the logs in that 
> directory. This requires listing all the files in each log dir to figure out 
> the snapshot files. If some logs have many log segment files, this could take 
> some time and can potentially block a replica fetcher thread, which 
> indirectly causes the request handler threads to be blocked.
>  





Re: [DISCUSS] KIP-380: Detect outdated control requests and bounced brokers using broker generation

2018-11-12 Thread Patrick Huang
Hi Becket,

I think you are right. STALE_BROKER_EPOCH only makes sense when the broker 
detects outdated control requests and wants the controller to know about that. 
For ControlledShutdownRequest, the controller should just ignore the request 
with a stale broker epoch, since the broker does not need and will not act on 
a STALE_BROKER_EPOCH response. Thanks for pointing it out.

Thanks,
Zhanxiang (Patrick) Huang


From: Becket Qin 
Sent: Monday, November 12, 2018 6:46
To: dev
Subject: Re: [DISCUSS] KIP-380: Detect outdated control requests and bounced 
brokers using broker generation

Hi Patrick,

I am wondering why the controller should send STALE_BROKER_EPOCH error to
the broker if the broker epoch is stale? Would this be a little confusing
to the current broker if the request was sent by a broker with previous
epoch? Should the controller just ignore those requests in that case?

Thanks,

Jiangjie (Becket) Qin

On Fri, Nov 9, 2018 at 2:17 AM Patrick Huang  wrote:

> Hi,
>
> In this KIP, we are also going to add a new exception and a new error code
> "STALE_BROKER_EPOCH" in order to allow the broker to respond back the right
> error when it sees outdated broker epoch in the control requests. Since
> adding a new exception and error code is also considered as public
> interface change, I have updated the original KIP accordingly to include
> this change. Feel free to comment if there is any concern.
>
> Thanks,
> Zhanxiang (Patrick) Huang
>
> 
> From: Patrick Huang 
> Sent: Tuesday, October 23, 2018 6:20
> To: Jun Rao; dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-380: Detect outdated control requests and
> bounced brokers using broker generation
>
> Agreed. I have updated the PR to add czxid in ControlledShutdownRequest (
> https://github.com/apache/kafka/pull/5821). Appreciated if you can take a
> look.
>
> Btw, I also have the vote thread for this KIP:
> https://lists.apache.org/thread.html/3689d83db537d5aa86d10967dee7ee29578897fc123daae4f77a8605@%3Cdev.kafka.apache.org%3E
>
> Best,
> Zhanxiang (Patrick) Huang
>
> 
> From: Jun Rao 
> Sent: Monday, October 22, 2018 21:31
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-380: Detect outdated control requests and
> bounced brokers using broker generation
>
> Hi, Patrick,
>
> Yes, that's the general sequence. After step 2, the shutting down broker
> can give up the controlled shutdown process and proceed to shut down. When
> it's restarted, it could still receive StopReplica requests from the
> controller in reaction to the previous controlled shutdown requests. This
> could lead the restarted broker to a bad state.
>
> Thanks,
>
> Jun
>
>
> On Mon, Oct 22, 2018 at 4:32 PM, Patrick Huang  wrote:
>
> > Hi Jun,
> >
> > That is a good point. I want to make it clear about the scenario you
> > mentioned. Correct me if I am wrong. Here is the sequence of the event:
> >
> >1. Broker sends ControlledShutdown request 1 to controller
> >2. Broker sends ControlledShutdown request 2 to controller due to
> >retries
> >3. Controller processes ControlledShutdown request 1
> >4. Controller sends control requests to the broker
> >5. Broker receives ControlledShutdown response 1 from controller
> >6. Broker shuts down and restarts quickly
> >7. Controller processes ControllerShutdown request 2
> >8. Controller sends control requests to the broker
> >
> > It is possible that the controller processes the broker change event between
> > 6) and 7). In this case, the controller has already updated the cached czxid
> > to the up-to-date one, so the bounced broker will not reject control
> > requests in 8), which causes a correctness problem.
> >
> >
> > Best,
> > Zhanxiang (Patrick) Huang
> >
> > --
> > *From:* Jun Rao 
> > *Sent:* Monday, October 22, 2018 14:45
> > *To:* dev
> > *Subject:* Re: [DISCUSS] KIP-380: Detect outdated control requests and
> > bounced brokers using broker generation
> >
> > Hi, Patrick,
> >
> > There is another thing that may be worth considering.
> >
> > 10. It will be useful to include the czxid also in the ControlledShutdown
> > request. This way, if the broker has been restarted, the controller can
> > ignore an old ControlledShutdown request(e.g., due to retries). This will
> > prevent the restarted broker from incorrectly stopping replicas.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Thu, Oct 11, 2018 at 5:56 PM, Patrick Huang 
> > wrote:
> >
> > > Hi Jun,
> > >
> > > Thanks a lot for the comments.
> > >
> > > 1. czxid is globally unique and monotonically increasing based on the
> > > zookeeper doc.
> > > References (from
> > > https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html):
> > > "Every change to the ZooKeeper state receives a stamp in the form of a
> > > *zxid* (ZooKeeper Transaction Id). This exposes the total ordering of
> all
> > > changes to ZooKeeper. Each cha

[jira] [Resolved] (KAFKA-7452) Deleting snapshot files after check-pointing log recovery offsets can slow down replication when truncation happens

2018-11-12 Thread Zhanxiang (Patrick) Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhanxiang (Patrick) Huang resolved KAFKA-7452.
--
Resolution: Duplicate

KAFKA-7557 fixed this.

> Deleting snapshot files after check-pointing log recovery offsets can slow 
> down replication when truncation happens
> ---
>
> Key: KAFKA-7452
> URL: https://issues.apache.org/jira/browse/KAFKA-7452
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.0.0, 1.0.1, 1.1.0, 1.1.1, 2.0.0
>Reporter: Zhanxiang (Patrick) Huang
>Assignee: Zhanxiang (Patrick) Huang
>Priority: Major
>
> After KAFKA-5829, Kafka will try to iterate through all the partition dirs to 
> delete useless snapshot files in "checkpointLogRecoveryOffsetsInDir". 
> Currently, "checkpointLogRecoveryOffsetsInDir" is used in the following 
> places:
>  # Truncation
>  # Log dir deletion and movement
>  # Background thread checkpointing recovery offsets
> In a 2.0 deployment on a cluster with 10k partitions per broker, we found out 
> that deleting useless snapshot files in the critical path of log truncation 
> can significantly slow down followers to catch up with leader during rolling 
> bounce (~2x slower than 0.11). The reason is that we basically do a "ls -R" 
> for the whole data directory only to potentially delete the snapshot files in 
> one partition directory because the way we identify snapshot files is to list 
> the directories and check the filename suffix.
> In our case, "listSnapshotFiles" takes ~1ms per partition directory so it 
> takes ~1ms * 10k = ~10s to just delete snapshot files in one partition after 
> the truncation, which delays future fetches in the fetcher thread.
> Here are the related code snippets:
>  LogManager.scala
>  
> {code:java}
> private def checkpointLogRecoveryOffsetsInDir(dir: File): Unit = {
>   for {
> partitionToLog <- logsByDir.get(dir.getAbsolutePath)
> checkpoint <- recoveryPointCheckpoints.get(dir)
>   } {
> try {
>   checkpoint.write(partitionToLog.mapValues(_.recoveryPoint))
>   allLogs.foreach(_.deleteSnapshotsAfterRecoveryPointCheckpoint())
> } catch {
>   case e: IOException =>
> logDirFailureChannel.maybeAddOfflineLogDir(dir.getAbsolutePath, 
> s"Disk error while writing to recovery point " +
>   s"file in directory $dir", e)
> }
>   }
> }
> {code}
>  
>  ProducerStateChangeManager.scala
>  
> {code:java}
> private[log] def listSnapshotFiles(dir: File): Seq[File] = {
>   if (dir.exists && dir.isDirectory) {
> Option(dir.listFiles).map { files =>
>   files.filter(f => f.isFile && isSnapshotFile(f)).toSeq
> }.getOrElse(Seq.empty)
>   } else Seq.empty
> }
> private def deleteSnapshotFiles(dir: File, predicate: Long => Boolean = _ => 
> true) {
>   listSnapshotFiles(dir).filter(file => 
> predicate(offsetFromFile(file))).foreach { file =>
> Files.deleteIfExists(file.toPath)
>   }
> }
> {code}
>  
> There are a few things that can be optimized here:
>  # We can have an in-memory cache for the snapshot files metadata (filename) 
> in ProducerStateManager to avoid calling dir.listFiles in 
> "deleteSnapshotFiles", "latestSnapshotFile" and "oldestSnapshotFile".
>  # After truncation, we can only try to delete snapshot files for the 
> truncated partitions (in replica fetcher thread, we truncate one partition at 
> a time) instead of all partitions. Or maybe we don't even need to delete 
> snapshot files in the critical path of truncation because the background 
> log-recovery-offset-checkpoint-thread will do it periodically. This can also 
> apply on log deletion/movement.
>  # If we want to further optimize the actual snapshot file deletion, we can 
> make it async. But I am not sure whether it is needed after we have 1) and 2).
> Also, we notice that there is no way to disable transaction/exactly-once 
> support on the broker side, given that it will bring in some extra overhead 
> even though we have no clients using this feature. Not sure whether this is a 
> common use case, but it is useful if we can have a switch to avoid the extra 
> performance overhead.
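Suggestion 1 above boils down to keeping snapshot offsets in memory so deletion no longer requires a directory listing. A rough standalone sketch of the idea — class and method names are illustrative, not the actual ProducerStateManager API:

```java
import java.util.Iterator;
import java.util.TreeSet;
import java.util.function.LongPredicate;

public class SnapshotOffsetCache {
    // Offsets of existing snapshot files, kept sorted. Populated once by
    // listing the directory at startup, then maintained incrementally.
    private final TreeSet<Long> snapshotOffsets = new TreeSet<>();

    public void onSnapshotCreated(long offset) {
        snapshotOffsets.add(offset);
    }

    // Delete matching snapshots without re-listing the directory; returns
    // how many entries were removed.
    public int deleteSnapshots(LongPredicate predicate) {
        int deleted = 0;
        for (Iterator<Long> it = snapshotOffsets.iterator(); it.hasNext(); ) {
            long offset = it.next();
            if (predicate.test(offset)) {
                // Files.deleteIfExists(snapshotPath(offset)) would go here.
                it.remove();
                deleted++;
            }
        }
        return deleted;
    }

    // latestSnapshotFile/oldestSnapshotFile become O(log n) lookups.
    public Long latestSnapshotOffset() {
        return snapshotOffsets.isEmpty() ? null : snapshotOffsets.last();
    }

    public static void main(String[] args) {
        SnapshotOffsetCache cache = new SnapshotOffsetCache();
        cache.onSnapshotCreated(100L);
        cache.onSnapshotCreated(200L);
        cache.onSnapshotCreated(300L);
        // Drop snapshots past a recovery point of 150, without touching the dir.
        int removed = cache.deleteSnapshots(o -> o > 150L);
        System.out.println(removed + " " + cache.latestSnapshotOffset()); // 2 100
    }
}
```

The cache must stay consistent with the filesystem (e.g. rebuilt on log-dir failure), which is the main complexity this sketch omits.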
>  
>  





Re: [VOTE] KIP-354 Time-based log compaction policy

2018-11-12 Thread xiongqi wu
Hi all,

Can I have one more vote on this KIP?
Any comment is appreciated.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-354%3A+Add+a+Maximum+Log+Compaction+Lag


Xiongqi (Wesley) Wu


On Fri, Nov 9, 2018 at 7:56 PM xiongqi wu  wrote:

> Thanks Dong.
> I have updated the KIP.
>
> Xiongqi (Wesley) Wu
>
>
> On Fri, Nov 9, 2018 at 5:31 PM Dong Lin  wrote:
>
>> Thanks for the KIP Xiongqi. LGTM. +1 (binding)
>>
>> One minor comment: it may be a bit better to clarify in the public
>> interface section that the value of the newly added metric is determined
>> by applying that formula across all compactable segments. For
>> example:
>>
>> The maximum value of Math.max(now -
>> earliest_timestamp_in_ms_of_uncompacted_segment - max.compaction.lag.ms,
>> 0)/1000 across all compactable partitions, where the
>> max.compaction.lag.ms
>> can be overridden on per-topic basis.
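The formula above can be checked with a small standalone computation. Names are illustrative — `maxCompactionLagMs` stands in for the `max.compaction.lag.ms` config:

```java
public class CompactionDelay {
    // Seconds by which a partition's oldest uncompacted data exceeds the
    // configured maximum compaction lag, floored at zero.
    static long maxCompactionDelaySecs(long nowMs,
                                       long earliestUncompactedTsMs,
                                       long maxCompactionLagMs) {
        return Math.max(nowMs - earliestUncompactedTsMs - maxCompactionLagMs, 0) / 1000;
    }

    public static void main(String[] args) {
        long now = 1_542_000_000_000L;
        // Oldest uncompacted record is 2 hours old, lag limit is 1 hour:
        long overdue = maxCompactionDelaySecs(now, now - 7_200_000L, 3_600_000L);
        // Oldest uncompacted record is still within the lag limit:
        long onTime = maxCompactionDelaySecs(now, now - 1_800_000L, 3_600_000L);
        System.out.println(overdue + " " + onTime); // 3600 0
    }
}
```

The reported metric would then be the maximum of this value across all compactable partitions, as described above.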
>>
>>
>>
>> On Fri, Nov 9, 2018 at 5:16 PM xiongqi wu  wrote:
>>
>> > Thanks Joel.
>> > Tracking the delay at second granularity makes sense
>> > I have updated KIP.
>> >
>> > Xiongqi (Wesley) Wu
>> >
>> >
>> > On Fri, Nov 9, 2018 at 5:05 PM Joel Koshy  wrote:
>> >
>> > > +1 with one suggestion on the proposed metric. You should probably
>> > include
>> > > the unit. So for e.g., max-compaction-delay-secs.
>> > >
>> > > Joel
>> > >
>> > > On Tue, Nov 6, 2018 at 5:30 PM xiongqi wu 
>> wrote:
>> > >
>> > > > bump
>> > > > Xiongqi (Wesley) Wu
>> > > >
>> > > >
>> > > > On Thu, Sep 27, 2018 at 4:20 PM xiongqi wu 
>> > wrote:
>> > > >
>> > > > >
>> > > > > Thanks Eno, Brett, Dong, Guozhang, Colin,  and Xiaohe for
>> feedback.
>> > > > > Can I have more feedback or VOTE on this KIP?
>> > > > >
>> > > > >
>> > > > > Xiongqi (Wesley) Wu
>> > > > >
>> > > > >
>> > > > > On Wed, Sep 19, 2018 at 10:52 AM xiongqi wu 
>> > > wrote:
>> > > > >
>> > > > >> Any other votes or comments?
>> > > > >>
>> > > > >> Xiongqi (Wesley) Wu
>> > > > >>
>> > > > >>
>> > > > >> On Tue, Sep 11, 2018 at 11:45 AM xiongqi wu > >
>> > > > wrote:
>> > > > >>
>> > > > >>> Yes, more votes and code review.
>> > > > >>>
>> > > > >>> Xiongqi (Wesley) Wu
>> > > > >>>
>> > > > >>>
>> > > > >>> On Mon, Sep 10, 2018 at 11:37 PM Brett Rann
>> > > > > > > >
>> > > > >>> wrote:
>> > > > >>>
>> > > >  +1 (non binding) from on 0 then, and on the KIP.
>> > > > 
>> > > >  Where do we go from here? More votes?
>> > > > 
>> > > >  On Tue, Sep 11, 2018 at 5:34 AM Colin McCabe <
>> cmcc...@apache.org>
>> > > >  wrote:
>> > > > 
>> > > >  > On Mon, Sep 10, 2018, at 11:44, xiongqi wu wrote:
>> > > >  > > Thank you for comments. I will use '0' for now.
>> > > >  > >
>> > > >  > > If we create topics through admin client in the future, we
>> > might
>> > > >  perform
>> > > >  > > some useful checks. (but the assumption is all brokers in
>> the
>> > > same
>> > > >  > cluster
>> > > >  > > have the same default configuration values, otherwise, it
>> might
>> > > >  still be
>> > > >  > > tricky to do such cross validation check.)
>> > > >  >
>> > > >  > This isn't something that we might do in the future-- this is
>> > > >  something we
>> > > >  > are doing now. We already have Create Topic policies which
>> are
>> > > >  enforced by
>> > > >  > the broker. Check KIP-108 and KIP-170 for details. This is
>> one
>> > of
>> > > > the
>> > > >  > motivations for getting rid of direct ZK access-- making sure
>> > that
>> > > >  these
>> > > >  > policies are applied.
>> > > >  >
>> > > >  > I agree that having different configurations on different
>> > brokers
>> > > > can
>> > > >  be
>> > > >  > confusing and frustrating . That's why more configurations
>> are
>> > > being
>> > > >  made
>> > > >  > dynamic using KIP-226. Dynamic configurations are stored
>> > centrally
>> > > > in
>> > > >  ZK,
>> > > >  > so they are the same on all brokers (modulo propagation
>> delays).
>> > > In
>> > > >  any
>> > > >  > case, this is a general issue, not specific to "create
>> topics".
>> > > >  >
>> > > >  > cheers,
>> > > >  > Colin
>> > > >  >
>> > > >  >
>> > > >  > >
>> > > >  > >
>> > > >  > > Xiongqi (Wesley) Wu
>> > > >  > >
>> > > >  > >
>> > > >  > > On Mon, Sep 10, 2018 at 11:15 AM Colin McCabe <
>> > > cmcc...@apache.org
>> > > > >
>> > > >  > wrote:
>> > > >  > >
>> > > >  > > > I don't have a strong opinion. But I think we should
>> > probably
>> > > be
>> > > >  > > > consistent with how segment.ms works, and just use 0.
>> > > >  > > >
>> > > >  > > > best,
>> > > >  > > > Colin
>> > > >  > > >
>> > > >  > > >
>> > > >  > > > On Wed, Sep 5, 2018, at 21:19, Brett Rann wrote:
>> > > >  > > > > OK thanks for that clarification. I see why you're
>> > > > uncomfortable
>> > > >  > with 0
>> > > >  > > > now.
>> > > >  > > > >
>> > > >  > > > > I'm not re

Re: [DISCUSS] KIP-380: Detect outdated control requests and bounced brokers using broker generation

2018-11-12 Thread Dong Lin
Hey Becket, Patrick,

Currently we expect that the controller can only receive a
ControlledShutdownRequest with an outdated broker epoch in two cases: 1) the
ControlledShutdownRequest was sent by the broker before it restarted; or 2)
there is a bug.

In case 1), it seems that ControlledShutdownResponse will not be delivered
to the broker as the channel should have been disconnected. Thus there is
no confusion for the broker because the broker will not receive a response
with the STALE_BROKER_EPOCH error.

In case 2), it seems useful to still deliver a ControlledShutdownResponse
with STALE_BROKER_EPOCH so that the broker at least knows which
request was not accepted. This helps us debug the issue.

Also, in terms of both design and implementation, it seems simpler to still
define a response for a given invalid request rather than have special path
to skip response for the invalid request. Does this sound reasonable?

Thanks,
Dong


On Mon, Nov 12, 2018 at 9:52 AM Patrick Huang  wrote:

> Hi Becket,
>
> I think you are right. STALE_BROKER_EPOCH only makes sense when the broker
> detects outdated control requests and wants the controller to know about
> that. For ControlledShutdownRequest, the controller should just ignore the
> request with a stale broker epoch, since the broker does not need and will not
> do anything with a STALE_BROKER_EPOCH response. Thanks for pointing it out.
>
> Thanks,
> Zhanxiang (Patrick) Huang
>
> 
> From: Becket Qin 
> Sent: Monday, November 12, 2018 6:46
> To: dev
> Subject: Re: [DISCUSS] KIP-380: Detect outdated control requests and
> bounced brokers using broker generation
>
> Hi Patrick,
>
> I am wondering why the controller should send STALE_BROKER_EPOCH error to
> the broker if the broker epoch is stale? Would this be a little confusing
> to the current broker if the request was sent by a broker with previous
> epoch? Should the controller just ignore those requests in that case?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Fri, Nov 9, 2018 at 2:17 AM Patrick Huang  wrote:
>
> > Hi,
> >
> > In this KIP, we are also going to add a new exception and a new error
> code
> > "STALE_BROKER_EPOCH" in order to allow the broker to respond back the
> right
> > error when it sees outdated broker epoch in the control requests. Since
> > adding a new exception and error code is also considered as public
> > interface change, I have updated the original KIP accordingly to include
> > this change. Feel free to comment if there is any concern.
> >
> > Thanks,
> > Zhanxiang (Patrick) Huang
> >
> > 
> > From: Patrick Huang 
> > Sent: Tuesday, October 23, 2018 6:20
> > To: Jun Rao; dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-380: Detect outdated control requests and
> > bounced brokers using broker generation
> >
> > Agreed. I have updated the PR to add czxid in ControlledShutdownRequest (
> > https://github.com/apache/kafka/pull/5821). Appreciated if you can take
> a
> > look.
> >
> > Btw, I also have the vote thread for this KIP:
> >
> https://lists.apache.org/thread.html/3689d83db537d5aa86d10967dee7ee29578897fc123daae4f77a8605@%3Cdev.kafka.apache.org%3E
> >
> > Best,
> > Zhanxiang (Patrick) Huang
> >
> > 
> > From: Jun Rao 
> > Sent: Monday, October 22, 2018 21:31
> > To: dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-380: Detect outdated control requests and
> > bounced brokers using broker generation
> >
> > Hi, Patrick,
> >
> > Yes, that's the general sequence. After step 2, the shutting down broker
> > can give up the controlled shutdown process and proceed to shut down.
> When
> > it's restarted, it could still receive StopReplica requests from the
> > controller in reaction to the previous controlled shutdown requests. This
> > could lead the restarted broker to a bad state.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Mon, Oct 22, 2018 at 4:32 PM, Patrick Huang 
> wrote:
> >
> > > Hi Jun,
> > >
> > > That is a good point. I want to make sure I understand the scenario you
> > > mentioned. Correct me if I am wrong. Here is the sequence of events:
> > >
> > >1. Broker sends ControlledShutdown request 1 to controller
> > >2. Broker sends ControlledShutdown request 2 to controller due to
> > >reties
> > >3. Controller processes ControlledShutdown request 1
> > >4. Controller sends control requests to the broker
> > >5. Broker receives ControlledShutdown response 1 from controller
> > >6. Broker shuts down and restarts quickly
> > >7. Controller processes ControllerShutdown request 2
> > >8. Controller sends control requests to the broker
> > >
> > > It is possible that controller processes the broker change event
> between
> > > 6) and 7). In this case, controller already updates the cached czxid to
> > the
> > > up-to-date ones so the bounced broker will not reject control requests
> in
> > > 8), which causes a correctness problem.
> > >
> > >
> > > 
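The broker-generation check discussed in this thread boils down to comparing the epoch carried in a request (the czxid of the broker's ephemeral znode) against the value the controller has cached. A minimal illustrative sketch follows; the class and method names are hypothetical, not Kafka's actual implementation:

```java
// Hypothetical sketch of the KIP-380 broker-epoch check discussed above.
// The "broker epoch" is the czxid of the broker's ephemeral znode; class
// and method names here are illustrative, not Kafka's actual code.
public class BrokerEpochCheck {

    /** Returns true if the epoch carried in a request is older than the
     *  epoch the controller has cached for that broker. */
    public static boolean isStale(long requestBrokerEpoch, long cachedBrokerEpoch) {
        return requestBrokerEpoch < cachedBrokerEpoch;
    }

    /** Controller-side handling of a ControlledShutdownRequest: a stale
     *  epoch means the sender has (very likely) bounced since sending. */
    public static String handleControlledShutdown(long requestEpoch, long cachedEpoch) {
        if (isStale(requestEpoch, cachedEpoch)) {
            // Per the thread: respond with the error so a still-live sender
            // can detect the problem and retry with its refreshed epoch.
            return "STALE_BROKER_EPOCH";
        }
        return "PROCESSED";
    }

    public static void main(String[] args) {
        System.out.println(handleControlledShutdown(100L, 200L)); // request from before the bounce
        System.out.println(handleControlledShutdown(200L, 200L)); // current incarnation
    }
}
```

Because zxids are assigned monotonically by ZooKeeper, a bounced broker's re-created ephemeral znode always has a strictly larger czxid than the previous one, which is what makes the simple comparison safe.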

Re: [VOTE] KIP-374: Add '--help' option to all available Kafka CLI commands

2018-11-12 Thread Srinivas Reddy
Thank you all for the votes. Here is the summary of all votes.

Binding votes:
 Colin McCabe
 Harsha Chintalapani
 Becket Qin

Non-binding votes:
 Viktor Somogyi-Vass
 Daniele Ascione
 Mickael Maison
 Vahid Hashemian

I am concluding this voting thread and marking this KIP as accepted.

Thanks


--
Srinivas Reddy

http://mrsrinivas.com/


(Sent via gmail web)


On Mon, 12 Nov 2018 at 23:19, Vahid Hashemian 
wrote:

> +1 (non-binding)
> Thanks for the KIP.
>
> --Vahid
>
> On Mon, Nov 12, 2018, 04:06 Mickael Maison 
> wrote:
>
>> +1 (non-binding)
>> Thanks for the KIP!
>> On Mon, Nov 12, 2018 at 5:16 AM Becket Qin  wrote:
>> >
>> > Thanks for the KIP. +1 (binding).
>> >
>> > On Mon, Nov 12, 2018 at 9:59 AM Harsha Chintalapani 
>> wrote:
>> >
>> > > +1 (binding)
>> > >
>> > > -Harsha
>> > > On Nov 11, 2018, 3:49 PM -0800, Daniele Ascione > >,
>> > > wrote:
>> > > > +1 (non-binding)
>> > > >
>> > > > Il ven 9 nov 2018, 02:09 Colin McCabe  ha
>> scritto:
>> > > >
>> > > > > +1 (binding)
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Wed, Oct 31, 2018, at 05:42, Srinivas Reddy wrote:
>> > > > > > Hi All,
>> > > > > >
>> > > > > > I would like to call for a vote on KIP-374:
>> > > > > > https://cwiki.apache.org/confluence/x/FgSQBQ
>> > > > > >
>> > > > > > Summary:
>> > > > > > Currently, the '--help' option is recognized by some Kafka
>> commands
>> > > > > > but not all. To provide a consistent user experience, it would
>> > > > > > be nice to add a '--help' option to all Kafka commands.
>> > > > > >
>> > > > > > I'd appreciate any votes or feedback.
>> > > > > >
>> > > > > > --
>> > > > > > Srinivas Reddy
>> > > > > >
>> > > > > > http://mrsrinivas.com/
>> > > > > >
>> > > > > >
>> > > > > > (Sent via gmail web)
>> > > > >
>> > > > >
>> > >
>>
>
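The behavior standardized by KIP-374 is simply that every command recognizes '--help' and prints its usage. Kafka's command-line tools actually build their help output with joptsimple; the dependency-free sketch below only illustrates the uniform contract, with hypothetical names:

```java
import java.util.Arrays;

// Dependency-free sketch of the KIP-374 contract: every CLI command
// recognizes --help and prints usage instead of running. Illustrative
// only; Kafka's real tools use joptsimple for option parsing.
public class HelpOption {

    /** True if --help appears anywhere on the command line. */
    public static boolean wantsHelp(String[] args) {
        return Arrays.asList(args).contains("--help");
    }

    /** Either prints usage or "runs" the command (stubbed as a string). */
    public static String run(String command, String[] args) {
        if (wantsHelp(args)) {
            return "usage: " + command + " [options]";
        }
        return "running " + command;
    }

    public static void main(String[] args) {
        System.out.println(run("kafka-topics.sh", new String[]{"--help"}));
        System.out.println(run("kafka-topics.sh", new String[]{"--list"}));
    }
}
```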


Build failed in Jenkins: kafka-trunk-jdk8 #3192

2018-11-12 Thread Apache Jenkins Server
See 


Changes:

[junrao] KAFKA-7557: optimize LogManager.truncateFullyAndStartAt() (#5848)

--
[...truncated 2.49 MB...]

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestampWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestampWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValueTimestamp 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValueTimestamp 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualWithNullForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualWithNullForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDiffer

Re: [VOTE] KIP-386: Standardize on Min/Avg/Max metrics' default values

2018-11-12 Thread Stanislav Kozlovski
Thank you, everybody, for the votes and discussion. The KIP has passed with
3 binding votes (Harsha, Jun, Dong) and 1 non-binding vote (Kevin)

@jun, I added a note on the KIP stating that this isn't related to the
Yammer metrics

Best,
Stanislav

On Thu, Nov 8, 2018 at 10:54 PM Dong Lin  wrote:

> Thanks for the KIP Stanislav. +1 (binding)
>
> On Thu, Nov 8, 2018 at 2:40 PM Jun Rao  wrote:
>
> > Hi, Stanislav,
> >
> > Thanks for the KIP. +1. I guess this only covers the Kafka metrics (not
> the
> > Yammer metrics). It would be useful to make this clear.
> >
> > Jun
> >
> > On Tue, Nov 6, 2018 at 1:00 AM, Stanislav Kozlovski <
> > stanis...@confluent.io>
> > wrote:
> >
> > > Hey everybody,
> > >
> > > I'm starting a vote thread on KIP-386: Standardize on Min/Avg/Max
> > metrics'
> > > default values
> > > <
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652345
> > > >
> > > In short, after the discussion thread
> > >  > > 4b9014fbb50f663bf14e5aec67@%3Cdev.kafka.apache.org%3E>,
> > > we decided to have all min/avg/max metrics output `NaN` as a default
> > value.
> > >
> > > --
> > > Best,
> > > Stanislav
> > >
> >
>


-- 
Best,
Stanislav
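The semantics agreed in this vote are that a Min/Avg/Max stat with no recorded samples reports NaN rather than positive/negative infinity or zero. A small illustrative sketch (this is not Kafka's actual org.apache.kafka.common.metrics.stats code):

```java
// Illustrative sketch of the KIP-386 default-value semantics: min/avg/max
// return Double.NaN until at least one sample has been recorded.
public class MinAvgMax {
    private double sum = 0.0;
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;
    private long count = 0;

    public void record(double value) {
        sum += value;
        min = Math.min(min, value);
        max = Math.max(max, value);
        count++;
    }

    // NaN, not the infinity sentinels, is exposed when there is no data
    public double min() { return count == 0 ? Double.NaN : min; }
    public double max() { return count == 0 ? Double.NaN : max; }
    public double avg() { return count == 0 ? Double.NaN : sum / count; }

    public static void main(String[] args) {
        MinAvgMax m = new MinAvgMax();
        System.out.println(m.min() + " " + m.avg() + " " + m.max()); // NaN NaN NaN
        m.record(2.0);
        m.record(4.0);
        System.out.println(m.min() + " " + m.avg() + " " + m.max()); // 2.0 3.0 4.0
    }
}
```

NaN is a convenient sentinel here because it propagates through arithmetic and is unambiguous to monitoring systems, whereas 0 or infinity can be mistaken for a real observation.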


Build failed in Jenkins: kafka-trunk-jdk11 #90

2018-11-12 Thread Apache Jenkins Server
See 


Changes:

[junrao] KAFKA-7557: optimize LogManager.truncateFullyAndStartAt() (#5848)

--
[...truncated 2.33 MB...]
org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordsFromKeyValuePairs STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordsFromKeyValuePairs PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKeyAndDefaultTimestamp 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKeyAndDefaultTimestamp 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithDefaultTimestamp 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithDefaultTimestamp 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKey STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKey PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateNullKeyConsumerRecordWithOtherTopicNameAndTimestampWithTimetamp 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateNullKeyConsumerRecordWithOtherTopicNameAndTimestampWithTimetamp 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithTimestamp STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithTimestamp PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullHeaders STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullHeaders PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithDefaultTimestamp STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithDefaultTimestamp PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicName STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicName PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithNullKey STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithNullKey PASSED

org.a

Jenkins build is back to normal : kafka-trunk-jdk11 #91

2018-11-12 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-307: Allow to define custom processor names with KStreams DSL

2018-11-12 Thread Florian Hussonnois
Hi Matthias,

Sorry, I was absent for a while. I have started a new PR for this KIP; it is
still in progress, and I'm working on it.
https://github.com/apache/kafka/pull/5909

On Fri, Oct 19, 2018 at 20:13, Matthias J. Sax  wrote:

> What is the status of this KIP?
>
> -Matthias
>
> On 7/19/18 5:17 PM, Guozhang Wang wrote:
> > Hello Florian,
> >
> > Sorry for being late... Found myself keep apologizing for late replies
> > these days. But I do want to push this KIP's progress forward as I see it
> > very important and helpful feature for extensibility.
> >
> > About the exceptions, I've gone through them and hopefully it is an
> > exhaustive list:
> >
> > 1. KTable#toStream()
> > 2. KStream#merge(KStream)
> > 3. KStream#process() / transform() / transformValues()
> > 4. KGroupedTable / KGroupedStream#count()
> >
> >
> > Here's my reasoning:
> >
> > * It is okay not letting users to override the name for 1/2, since they
> are
> > too trivial to be useful for debugging, plus their processor names would
> > not determine any related topic / store names.
> > * For 3, I'd vote for adding overloaded functions with Named.
> > * For 4, if users really want to name the processor she can call
> > aggregate() instead, so I think it is okay to skip this case.
> >
> >
> > Guozhang
> >
> >
> >
> > On Fri, Jul 6, 2018 at 3:06 PM, Florian Hussonnois <
> fhussonn...@gmail.com>
> > wrote:
> >
> >> Hi,
> >>
> >> The option #3 seems to be a good alternative and I find the API more
> >> elegant (thanks John).
> >>
> >> But, we still have the need to overload some methods either because
> they do
> >> not accept an action instance or because they are translated to multiple
> >> processors.
> >>
> >> For example, this is the case for methods branch() and merge(). We could
> >> introduce a new interface Named (or maybe a different name ?) with a
> method
> >> name(). All action interfaces could extend this one to implement the
> option
> >> 3).
> >> This would result in the following overloads:
> >>
> >> KStream<K, V> merge(final Named name, final KStream<K, V> stream);
> >> KStream<K, V>[] branch(final Named name, final Predicate<? super K, ? super
> >> V>... predicates)
> >>
> >> N.B : The list above is  not exhaustive
> >>
> >> -
> >> user's code will become :
> >>
> >> KStream stream = builder.stream("test");
> >> KStream[] branches =
> >> stream.branch(Named.with("BRANCH-STREAM-ON-VALUE"),
> >> Predicate.named("STREAM-PAIR-VALUE", (k, v) -> v % 2 ==
> >> 0),
> >> Predicate.named("STREAM-IMPAIR-VALUE", (k, v) -> v % 2
> !=
> >> 0));
> >>
> >> branches[0].to("pair");
> >> branches[1].to("impair");
> >> -
> >>
> >> This is a mix of the options 3) and 1)
> >>
> >> On Fri, Jul 6, 2018 at 22:58, Guozhang Wang  wrote:
> >>
> >>> Hi folks, just to summarize the options we have so far:
> >>>
> >>> 1) Add a new "as" for KTable / KStream, plus adding new fields for
> >>> operators-returns-void control objects (the current wiki's proposal).
> >>>
> >>> Pros: no more overloads.
> >>> Cons: a bit departing with the current high-level API design of the
> DSL,
> >>> plus, the inconsistency between operators-returns-void and
> >>> operators-not-return-voids.
> >>>
> >>> 2) Add overloaded functions for all operators, that accepts a new
> control
> >>> object "Described".
> >>>
> >>> Pros: consistent with current APIs.
> >>> Cons: lots of overloaded functions to add.
> >>>
> >>> 3) Add another default function in the interface (thank you J8!) as
> John
> >>> proposed.
> >>>
> >>> Pros: no overloaded functions, no "Described".
> >>> Cons: do we lose lambda functions really (seems not if we provide a
> >> "named"
> >>> for each func)? Plus "Described" may be more extensible than a single
> >>> `String`.
> >>>
> >>>
> >>> My principle of considering which one is better depends primarily on
> "how
> >>> to make advanced users easily use the additional API, while keeping it
> >>> hidden from normal users who do not care at all". For that purpose I
> >> think
> >>> 3) > 1) > 2).
> >>>
> >>> One caveat though, is that changing the interface would not be
> >>> binary-compatible though source-compatible, right? I.e. users need to
> >>> recompile their code though no changes needed.
> >>>
> >>>
> >>>
> >>> Another note: for 3), if we really want to keep extensibility of
> >> Described
> >>> we could do sth. like:
> >>>
> >>> -
> >>>
> >>> public interface Predicate<K, V> {
> >>> // existing method
> >>> boolean test(final K key, final V value);
> >>>
> >>> // new default method adds the ability to name the predicate
> >>> default Described described() {
> >>> return new Described(null);
> >>> }
> >>> }
> >>>
> >>> --
> >>>
> >>> where user's code becomes:
> >>>
> >>> stream.filter(named("key", (k, v) -> true));   // note `named` now just
> >>> sets a Described("key") in "described()".
> >>>
> >>> stream.filter(described(Described.as("key", /* any oth
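The "option 3" design debated in this thread can be compiled into a small self-contained sketch: a functional interface with a default naming method, plus a static helper so lambdas remain usable. This is a simplification for illustration; these are not Kafka's actual interfaces, and the names are hypothetical:

```java
// Self-contained sketch of the "default method" naming approach from the
// KIP-307 discussion. Simplified: not Kafka's real Predicate interface.
public class NamedPredicateDemo {

    @FunctionalInterface
    interface Predicate<K, V> {
        boolean test(K key, V value);

        // default method: a plain lambda stays unnamed
        default String name() { return null; }

        // static helper: wraps a lambda and attaches a processor name,
        // so callers keep lambda syntax while advanced users get names
        static <K, V> Predicate<K, V> named(String name, Predicate<K, V> delegate) {
            return new Predicate<K, V>() {
                @Override public boolean test(K key, V value) { return delegate.test(key, value); }
                @Override public String name() { return name; }
            };
        }
    }

    public static void main(String[] args) {
        Predicate<String, Integer> isEven =
                Predicate.named("STREAM-PAIR-VALUE", (k, v) -> v % 2 == 0);
        System.out.println(isEven.name() + " -> " + isEven.test("key", 4));
    }
}
```

As noted in the thread, adding a default method is source-compatible (existing lambdas still work unchanged), which is the main appeal of this option over a new overload per operator.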

Build failed in Jenkins: kafka-trunk-jdk8 #3193

2018-11-12 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFKA-7518: Fix FutureRecordMetadata.get when TimeUnit is not ms 
(#5815)

--
[...truncated 2.73 MB...]
org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfTimestampIsDifferentForCompareValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfTimestampIsDifferentForCompareValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestampWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestampWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValueTimestamp 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValueTimestamp 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualWithNullForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAn

Jenkins build is back to normal : kafka-2.1-jdk8 #53

2018-11-12 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-trunk-jdk8 #3194

2018-11-12 Thread Apache Jenkins Server
See 


Changes:

[ismael] MINOR: Remove unused IteratorTemplate (#5903)

--
[...truncated 2.49 MB...]

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfTimestampIsDifferentForCompareValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfTimestampIsDifferentForCompareValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestampWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestampWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > shouldNotAllowNullExpectedRecordForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > shouldPassIfValueIsEqualForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > shouldPassIfValueIsEqualForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfKeyIsDifferentForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfKeyIsDifferentForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfKeyIsDifferentWithNullForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfKeyIsDifferentWithNullForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestampWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestampWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > shouldPassIfValueIsEqualWithNullForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > shouldPassIfValueIsEqualWithNullForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > shouldPassIfValueAndTimestampIsEqualWithNullForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > shouldPassIfValueAndTimestampIsEqualWithNullForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentWithNullForC

Re: [DISCUSS] KIP-380: Detect outdated control requests and bounced brokers using broker generation

2018-11-12 Thread Becket Qin
Hi Dong,

That is a good point. But I think the STALE_BROKER_EPOCH error may still be
sent to the broker. For example, think about the following case:

1. Broker sends a ControlledShutdownRequest to the controller
2. Broker had a ZK session timeout
3. Broker created the ephemeral node
4. Controller processes the ControlledShutdownRequest in step 1
5. Broker receives a ControlledShutdownResponse with STALE_BROKER_EPOCH.

However, in this case, the broker should probably resend the controlled
shutdown request with the new epoch. So it looks like returning
STALE_BROKER_EPOCH is the correct behavior. If the broker has really been
bounced, that response will not be delivered to the broker. If the broker
has not really restarted, it will just resend the ControlledShutdownRequest
with the current epoch again.

It might be worth updating the KIP wiki to mention this behavior.

Thanks,

Jiangjie (Becket) Qin


On Tue, Nov 13, 2018 at 2:04 AM Dong Lin  wrote:

> Hey Becket, Patrick,
>
> Currently we expect that the controller can only receive a ControlledShutdownRequest
> with an outdated broker epoch in two cases: 1) the ControlledShutdownRequest
> was sent by the broker before it restarted; and 2) there is a bug.
>
> In case 1), it seems that ControlledShutdownResponse will not be delivered
> to the broker as the channel should have been disconnected. Thus there is
> no confusion to the broker because broker will not receive response
> with STALE_BROKER_EPOCH error.
>
> In case 2), it seems that it can be useful to still deliver a
> ControlledShutdownResponse
> with STALE_BROKER_EPOCH so that this broker at least knows which
> response was not accepted. This helps us debug the issue.
>
> Also, in terms of both design and implementation, it seems simpler to
> still define a response for a given invalid request rather than have a
> special path to skip the response for the invalid request. Does this sound
> reasonable?
>
> Thanks,
> Dong
>
>
> On Mon, Nov 12, 2018 at 9:52 AM Patrick Huang  wrote:
>
>> Hi Becket,
>>
>> I think you are right. STALE_BROKER_EPOCH only makes sense when the
>> broker detects outdated control requests and wants the controller to know
>> about that. For a ControlledShutdownRequest, the controller should just
>> ignore a request with a stale broker epoch, since the broker does not
>> need and will not act on a STALE_BROKER_EPOCH response. Thanks for
>> pointing it out.
>>
>> Thanks,
>> Zhanxiang (Patrick) Huang
>>
>> 
>> From: Becket Qin 
>> Sent: Monday, November 12, 2018 6:46
>> To: dev
>> Subject: Re: [DISCUSS] KIP-380: Detect outdated control requests and
>> bounced brokers using broker generation
>>
>> Hi Patrick,
>>
>> I am wondering why the controller should send STALE_BROKER_EPOCH error to
>> the broker if the broker epoch is stale? Would this be a little confusing
>> to the current broker if the request was sent by a broker with previous
>> epoch? Should the controller just ignore those requests in that case?
>>
>> Thanks,
>>
>> Jiangjie (Becket) Qin
>>
>> On Fri, Nov 9, 2018 at 2:17 AM Patrick Huang  wrote:
>>
>> > Hi,
>> >
>> > In this KIP, we are also going to add a new exception and a new error
>> code
>> > "STALE_BROKER_EPOCH" in order to allow the broker to respond back the
>> right
>> > error when it sees outdated broker epoch in the control requests. Since
>> > adding a new exception and error code is also considered as public
>> > interface change, I have updated the original KIP accordingly to include
>> > this change. Feel free to comment if there is any concern.
>> >
>> > Thanks,
>> > Zhanxiang (Patrick) Huang
>> >
>> > 
>> > From: Patrick Huang 
>> > Sent: Tuesday, October 23, 2018 6:20
>> > To: Jun Rao; dev@kafka.apache.org
>> > Subject: Re: [DISCUSS] KIP-380: Detect outdated control requests and
>> > bounced brokers using broker generation
>> >
>> > Agreed. I have updated the PR to add czxid in ControlledShutdownRequest
>> (
>> > https://github.com/apache/kafka/pull/5821). Appreciated if you can
>> take a
>> > look.
>> >
>> > Btw, I also have the vote thread for this KIP:
>> >
>> https://lists.apache.org/thread.html/3689d83db537d5aa86d10967dee7ee29578897fc123daae4f77a8605@%3Cdev.kafka.apache.org%3E
>> >
>> > Best,
>> > Zhanxiang (Patrick) Huang
>> >
>> > 
>> > From: Jun Rao 
>> > Sent: Monday, October 22, 2018 21:31
>> > To: dev@kafka.apache.org
>> > Subject: Re: [DISCUSS] KIP-380: Detect outdated control requests and
>> > bounced brokers using broker generation
>> >
>> > Hi, Patrick,
>> >
>> > Yes, that's the general sequence. After step 2, the shutting down broker
>> > can give up the controlled shutdown process and proceed to shut down.
>> When
>> > it's restarted, it could still receive StopReplica requests from the
>> > controller in reaction to the previous controlled shutdown requests.
>> This
>> > could lead the restarted broker to a bad state.
>> >
>> > Thanks,
>> 
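The epoch check discussed in this thread can be sketched as a tiny model. This is a hypothetical illustration, not Kafka's actual controller code: the class and method names (BrokerEpochValidator, onBrokerReregistration) are invented for the example, and the "epoch" stands in for the czxid of the broker's ZooKeeper ephemeral node described in KIP-380.

```java
// Hypothetical sketch of controller-side broker-epoch validation (KIP-380).
// Names are illustrative, not Kafka's actual implementation.
public class BrokerEpochValidator {
    private long latestBrokerEpoch; // stands in for the czxid of the broker's ZK node

    public BrokerEpochValidator(long initialEpoch) {
        this.latestBrokerEpoch = initialEpoch;
    }

    // A bounced broker re-creates its ephemeral node, which bumps the epoch.
    public void onBrokerReregistration(long newEpoch) {
        if (newEpoch > latestBrokerEpoch) {
            latestBrokerEpoch = newEpoch;
        }
    }

    // A request carrying an older epoch was sent by the broker's previous
    // incarnation, so the controller answers with STALE_BROKER_EPOCH.
    public String validate(long requestBrokerEpoch) {
        return requestBrokerEpoch < latestBrokerEpoch ? "STALE_BROKER_EPOCH" : "NONE";
    }

    public static void main(String[] args) {
        BrokerEpochValidator v = new BrokerEpochValidator(100L);
        v.onBrokerReregistration(101L);       // broker bounced: new ZK session
        System.out.println(v.validate(100L)); // request from the old incarnation
        System.out.println(v.validate(101L)); // request from the current incarnation
    }
}
```

As the thread concludes, the stale response is either never delivered (the old incarnation is gone) or tells a live broker to retry with its current epoch.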

[jira] [Created] (KAFKA-7620) ConfigProvider is broken for connect when TTL is not null

2018-11-12 Thread Ye Ji (JIRA)
Ye Ji created KAFKA-7620:


 Summary: ConfigProvider is broken for connect when TTL is not null
 Key: KAFKA-7620
 URL: https://issues.apache.org/jira/browse/KAFKA-7620
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 2.0.1
Reporter: Ye Ji


If the ConfigData returned by a ConfigProvider.get implementation has a non-null 
and non-negative TTL, it will trigger infinite recursion; here is an excerpt of 
the stack trace:
{code:java}
at 
org.apache.kafka.connect.runtime.WorkerConfigTransformer.scheduleReload(WorkerConfigTransformer.java:62)
at 
org.apache.kafka.connect.runtime.WorkerConfigTransformer.scheduleReload(WorkerConfigTransformer.java:56)
at 
org.apache.kafka.connect.runtime.WorkerConfigTransformer.transform(WorkerConfigTransformer.java:49)
at 
org.apache.kafka.connect.runtime.distributed.ClusterConfigState.connectorConfig(ClusterConfigState.java:121)
at 
org.apache.kafka.connect.runtime.distributed.DistributedHerder.connectorConfigReloadAction(DistributedHerder.java:648)
at 
org.apache.kafka.connect.runtime.WorkerConfigTransformer.scheduleReload(WorkerConfigTransformer.java:62)
at 
org.apache.kafka.connect.runtime.WorkerConfigTransformer.scheduleReload(WorkerConfigTransformer.java:56)
at 
org.apache.kafka.connect.runtime.WorkerConfigTransformer.transform(WorkerConfigTransformer.java:49)
{code}


Basically:
1) If a non-null TTL is returned from the config provider, the Connect runtime 
will try to schedule a reload in the future.
2) The scheduleReload function reads the config again to see whether it is a 
restart or not, by calling 
org.apache.kafka.connect.runtime.WorkerConfigTransformer.transform to transform 
the config.
3) The transform function calls the config provider and gets a non-null TTL, 
causing scheduleReload to be called again, and we are back to step 1.


To reproduce, simply fork the provided FileConfigProvider, and add a 
non-negative ttl to the ConfigData returned by the get functions.
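The cycle above can be modeled with a small self-contained sketch. The names below (ReloadLoopModel, providerGet) are invented for illustration; the real code lives in WorkerConfigTransformer and ConfigProvider. A depth guard is added so the demo terminates instead of overflowing the stack, which is exactly the guard the real code lacks:

```java
// Minimal model of the KAFKA-7620 cycle: transform() schedules a reload,
// and the reload calls transform() again whenever the provider returns a TTL.
// Names are illustrative, not the actual Connect classes.
public class ReloadLoopModel {
    static final int DEPTH_LIMIT = 10; // guard so this demo terminates
    int transformCalls = 0;

    Long providerGet() { // stands in for ConfigProvider.get(): non-null TTL
        return 1000L;
    }

    void transform(int depth) {
        transformCalls++;
        Long ttl = providerGet();
        if (ttl != null && ttl >= 0) {
            scheduleReload(depth); // step 1: TTL present, schedule a reload
        }
    }

    void scheduleReload(int depth) {
        if (depth >= DEPTH_LIMIT) return; // the real code has no such guard
        transform(depth + 1);             // step 2: reload re-reads the config
    }                                     // step 3: provider returns a TTL again

    public static void main(String[] args) {
        ReloadLoopModel m = new ReloadLoopModel();
        m.transform(0);
        // Without the depth guard this mutual recursion would never terminate.
        System.out.println(m.transformCalls); // 11: depths 0 through 10
    }
}
```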



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7612) Fix javac warnings and enable warnings as errors

2018-11-12 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-7612.

   Resolution: Fixed
Fix Version/s: 2.2.0

> Fix javac warnings and enable warnings as errors
> 
>
> Key: KAFKA-7612
> URL: https://issues.apache.org/jira/browse/KAFKA-7612
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Major
> Fix For: 2.2.0
>
>
> The only way to keep warnings away is to treat them as errors.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7605) Flaky Test `SaslMultiMechanismConsumerTest.testCoordinatorFailover`

2018-11-12 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-7605.

   Resolution: Fixed
Fix Version/s: 2.2.0

> Flaky Test `SaslMultiMechanismConsumerTest.testCoordinatorFailover`
> ---
>
> Key: KAFKA-7605
> URL: https://issues.apache.org/jira/browse/KAFKA-7605
> Project: Kafka
>  Issue Type: Bug
>  Components: unit tests
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Major
> Fix For: 2.2.0
>
>
> {code}
> java.lang.AssertionError: Failed to observe commit callback before timeout
>   at kafka.utils.TestUtils$.fail(TestUtils.scala:351)
>   at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:761)
>   at kafka.utils.TestUtils$.pollUntilTrue(TestUtils.scala:727)
>   at 
> kafka.api.BaseConsumerTest.awaitCommitCallback(BaseConsumerTest.scala:198)
>   at 
> kafka.api.BaseConsumerTest.ensureNoRebalance(BaseConsumerTest.scala:214)
>   at 
> kafka.api.BaseConsumerTest.testCoordinatorFailover(BaseConsumerTest.scala:117)
> {code}
> Probably just need to increase the timeout a little.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
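The failing assertion above comes from a poll-until-true helper. As a rough sketch of what kafka.utils.TestUtils.waitUntilTrue does (simplified and in Java rather than Scala, so the signature and defaults here are assumptions, not Kafka's exact code), the pattern is:

```java
import java.util.function.BooleanSupplier;

// Simplified sketch of the TestUtils.waitUntilTrue pattern: poll a condition
// until it holds or a deadline passes. Not Kafka's exact implementation.
public class WaitUntil {
    public static boolean waitUntilTrue(BooleanSupplier condition,
                                        long timeoutMs, long pauseMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) return true;
            Thread.sleep(pauseMs); // back off between polls
        }
        return condition.getAsBoolean(); // one last check at the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Condition becomes true after ~50 ms; with a too-short timeout the
        // helper returns false, which is why flaky tests like this one are
        // often fixed by raising the timeout.
        boolean ok = waitUntilTrue(
                () -> System.currentTimeMillis() - start > 50, 500, 10);
        System.out.println(ok);
    }
}
```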


Re: [DISCUSS] KIP-388 Add observer interface to record request and response

2018-11-12 Thread Lincong Li
Thanks Mayuresh, Ismael and Colin for your feedback!

I updated the KIP based on your feedback. The change is basically that two
interfaces are introduced to prevent the internal classes from being
exposed. These two interfaces contain getters that allow users to extract
information from the request and response in their own implementation(s) of
the observer interface, and they would not constrain future implementation
changes in either RequestChannel.Request or AbstractResponse. There could
be more getters defined in these two interfaces. The implementation of
these two interfaces will be provided as part of the KIP.

I also expanded on "Add code to the broker (in KafkaApis) to allow Kafka
servers to invoke any
observers defined. More specifically, change KafkaApis code to invoke all
defined observers, in the order in which they were defined, for every
request-response pair" by providing a sample code block which shows how
these interfaces are used in the KafkaApis class.

Let me know if you have any question, concern, or comments. Thank you very
much!

Best regards,
Lincong Li

On Fri, Nov 9, 2018 at 10:34 AM Mayuresh Gharat 
wrote:

> Hi Lincong,
>
> Thanks for the KIP.
>
> As Colin pointed out, would it be better to expose certain specific pieces of
> information from the request/response, like the API key, request headers, record
> counts, and client ID, instead of the entire request/response objects? This
> enables us to change the request/response APIs independently of this
> pluggable public API in the future, unless you think we have a strong reason
> to expose the request and response objects.
>
> Also, it would be great if you can expand on :
> "Add code to the broker (in KafkaApis) to allow Kafka servers to invoke any
> observers defined. More specifically, change KafkaApis code to invoke all
> defined observers, in the order in which they were defined, for every
> request-response pair."
> probably with an example of how you visualize it. It would help the KIP to
> be more concrete and easier to understand the end to end workflow.
>
> Thanks,
>
> Mayuresh
>
> On Thu, Nov 8, 2018 at 7:44 PM Ismael Juma  wrote:
>
> > I agree, the current KIP doesn't discuss the public API that we would be
> > exposing and it's extensive if the normal usage would allow for casting
> > AbstractRequest into the various subclasses and potentially even
> accessing
> > Records and related for produce request.
> >
> > There are many use cases where this could be useful, but it requires
> quite
> > a bit of thinking around the APIs that we expose and the expected usage.
> >
> > Ismael
> >
> > On Thu, Nov 8, 2018, 6:09 PM Colin McCabe  >
> > > Hi Lincong Li,
> > >
> > > I agree that server-side instrumentation is helpful.  However, I don't
> > > think this is the right approach.
> > >
> > > The problem is that RequestChannel.Request and AbstractResponse are
> > > internal classes that should not be exposed.  These are implementation
> > > details that we may change in the future.  Freezing these into a public
> > API
> > > would really hold back the project.  For example, for really large
> > > responses, we might eventually want to avoid materializing the whole
> > > response all at once.  It would make more sense to return it in a
> > streaming
> > > fashion.  But if we need to support this API forever, we can't do that.
> > >
> > > I think it's fair to say that this is, at best, half a solution to the
> > > problem of tracing requests.  Users still need to write the plugin code
> > and
> > > arrange for it to be on their classpath to make this work.  I think the
> > > alternative here is not client-side instrumentation, but simply making
> > the
> > > change to the broker without using a plugin interface.
> > >
> > > If a public interface is absolutely necessary here we should expose
> only
> > > things like the API key, client ID, time, etc. that don't constrain the
> > > implementation a lot in the future.  I think we should also use java
> here
> > > to avoid the compatibility issues we have had with Scala APIs in the
> > past.
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Thu, Nov 8, 2018, at 11:34, radai wrote:
> > > > another downside to client instrumentation (beyond the number of
> > > > client codebases one would need to cover) is that in large
> > > > environments you'll have a very long tail of applications on older
> > > > clients that need upgrading - it would be a long and disruptive process (as
> > > > opposed to updating broker-side instrumentation)
> > > > On Thu, Nov 8, 2018 at 11:04 AM Peter M. Elias <
> petermel...@gmail.com>
> > > wrote:
> > > > >
> > > > > I know we have a lot of use cases for this type of functionality at
> > my
> > > > > enterprise deployment. I think it's helpful for maintaining
> > > reliability of
> > > > > the cluster especially and identifying clients that are not
> properly
> > > tuned
> > > > > and therefore applying excessive load to the brokers. Additionally,
> > > there
> >
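A minimal sketch of the kind of narrow observer API discussed in this thread might look as follows. Every name here (RequestInfo, RequestObserver, CountingObserver) is hypothetical; the KIP itself defines the actual interfaces. The point is the shape Colin and Mayuresh suggest: expose only stable metadata such as the API key and client ID, never the internal RequestChannel.Request or AbstractResponse classes.

```java
// Hypothetical sketch of a request observer that sees only stable metadata.
// All names are illustrative, not the interfaces KIP-388 finally defines.
public class ObserverSketch {

    interface RequestInfo {
        short apiKey();
        String clientId();
        long receivedTimeMs();
    }

    interface RequestObserver {
        void onRequestCompleted(RequestInfo info);
    }

    // Example observer implementation: counts requests per client id.
    static class CountingObserver implements RequestObserver {
        final java.util.Map<String, Integer> counts = new java.util.HashMap<>();

        public void onRequestCompleted(RequestInfo info) {
            counts.merge(info.clientId(), 1, Integer::sum);
        }
    }

    public static void main(String[] args) {
        CountingObserver obs = new CountingObserver();
        RequestInfo info = new RequestInfo() { // stand-in for a broker request
            public short apiKey() { return 0; } // e.g. Produce
            public String clientId() { return "app-1"; }
            public long receivedTimeMs() { return System.currentTimeMillis(); }
        };
        obs.onRequestCompleted(info);
        obs.onRequestCompleted(info);
        System.out.println(obs.counts.get("app-1")); // 2
    }
}
```

Because observers depend only on the getter interface, the broker is free to change how requests are represented internally, which is the compatibility concern raised above.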

Heads up: javac warnings are now treated as errors

2018-11-12 Thread Ismael Juma
Hi all,

As part of KAFKA-7612, all javac warnings were fixed or suppressed. To
prevent them from reappearing, javac warnings are now treated as errors. We
still have some scalac warnings (see KAFKA-7614 for details on what's
needed to eliminate them) and 3 xlint warnings are not yet enabled
(KAFKA-7613).

Before merging PRs that were submitted before KAFKA-7612 was merged, it's a
good idea to rerun the PR tests.

Ismael
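For PR authors hitting the new check, a warning either gets fixed or explicitly suppressed. A small illustrative example (not from the Kafka codebase, and assuming the build passes javac's -Xlint and -Werror flags, per the policy above):

```java
// With javac warnings treated as errors (-Xlint ... -Werror), a deprecation
// warning fails the build unless the code is fixed or the warning is
// explicitly suppressed at the narrowest possible scope.
public class WarningExample {
    @SuppressWarnings("deprecation") // without this, -Werror rejects the file
    static int boxedValue() {
        return new Integer(42); // Integer(int) constructor is deprecated since Java 9
    }

    public static void main(String[] args) {
        System.out.println(boxedValue()); // 42
    }
}
```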


[KAFKA-7382] Guarantee atleast one replica of partition to be alive during topic creation

2018-11-12 Thread Suman B N
Team,

Review and merge - https://github.com/apache/kafka/pull/5822.
One round of review has been done by Mani Kumar Reddy in PR -
https://github.com/apache/kafka/pull/5665.

-- 
*Suman*
*OlaCabs*