Re: [DISCUSS] KIP-48 Support for delegation tokens as an authentication mechanism

2016-12-15 Thread Manikumar
@Gwen, @Rajini,

As mentioned in the KIP, the main motivation is to reduce the load on the
Kerberos server in large Kafka deployments with a large number of clients.

Also, it looks like we are combining two overlapping concepts:
1. A single client sending requests with multiple users/authentications
2. Impersonation

Option 1 is definitely useful in some use cases and can be used to
implement a workaround for impersonation.

With impersonation, a superuser can send requests on behalf of another
user (Alice) in a secure way. The superuser has credentials, but user Alice
doesn't have any. The requests are required to run as user Alice, and
accesses/ACL checks on the broker are required to be done as user Alice.
User Alice must be able to connect to the broker on a connection
authenticated with the superuser's credentials. In other words, the superuser
is impersonating user Alice.

The approach mentioned by Harsha in the previous mail is implemented in Hadoop,
Storm, etc.

Some more details here:
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/Superusers.html
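For reference, Hadoop enables this proxy-user impersonation via core-site.xml; per the page above, configuration like the following allows user "super" to impersonate users connecting from the listed hosts and groups (the host/group values here are illustrative):

```xml
<!-- core-site.xml: allow user "super" to impersonate (illustrative values) -->
<property>
  <name>hadoop.proxyuser.super.hosts</name>
  <value>host1,host2</value>
</property>
<property>
  <name>hadoop.proxyuser.super.groups</name>
  <value>group1,group2</value>
</property>
```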


@Rajini

Thanks for your comments on SASL/SCRAM usage. I am thinking of sending the
tokenHmac (salted-hashed version) as the password for authentication, and the
tokenID for retrieval of the tokenHmac on the server side.
Does the above sound OK?
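As a rough sketch (not part of the KIP text), a client using this scheme might be configured with the tokenID as the SCRAM username and the tokenHmac as the password; the `sasl.jaas.config` property and the mechanism name below are assumptions for illustration:

```java
import java.util.Properties;

public class TokenClientConfig {
    /** Hypothetical client properties for delegation-token auth over SASL/SCRAM:
     *  the tokenID is sent as the username so the broker can look up the stored
     *  HMAC, and the tokenHmac serves as the password in the SCRAM exchange. */
    static Properties tokenProps(String tokenId, String tokenHmac) {
        Properties props = new Properties();
        props.put("security.protocol", "SASL_SSL");   // tokens expected only over TLS
        props.put("sasl.mechanism", "SCRAM-SHA-256");
        props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.scram.ScramLoginModule required "
                + "username=\"" + tokenId + "\" password=\"" + tokenHmac + "\";");
        return props;
    }
}
```

With SCRAM, the broker would use the tokenID to retrieve the stored credential and complete the challenge-response proof, so the tokenHmac itself never crosses the wire.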


Thanks,
Manikumar

On Wed, Dec 14, 2016 at 10:33 PM, Harsha Chintalapani 
wrote:

> @Gwen @Mani  Not sure why we want to authenticate on every request. Even if
> the token exchange is cheap, it is still a few calls that need to go through a
> round trip. Impersonation doesn't require authentication for every
> request.
>
> "So a centralized app can create few producers, do the metadata request and
> broker discovery with its own user auth, but then use delegation tokens to
> allow performing produce/fetch requests as different users? Instead of
> having to re-connect for each impersonated user?"
>
> Yes. But what we will have is this centralized user acting as the impersonating
> user on behalf of other users. When it authenticates initially, we will create a
> "Subject", and from then onwards the centralized user can call
> Subject.doAsPrivileged
> on behalf of other users.
> On the server side, we can retrieve two principals out of this: one is the
> authenticated user (the centralized user) and the other is the impersonated user. We
> will first check whether the authenticated user is allowed to impersonate, and then
> move on to check whether the user Alice has access to the topic "X" to
> read/write.
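A toy sketch of the flow described above, assuming an invented `User` principal class (Kafka's real principal types differ):

```java
import java.security.Principal;
import java.security.PrivilegedAction;
import java.util.Set;
import javax.security.auth.Subject;

public class ImpersonationSketch {
    /** Toy principal; a real broker would use its own principal implementation. */
    static class User implements Principal {
        private final String name;
        User(String name) { this.name = name; }
        @Override public String getName() { return name; }
    }

    /** Runs the action with a Subject carrying both the authenticated
     *  (centralized) principal and the impersonated principal. */
    static <T> T runAs(String authenticated, String impersonated, PrivilegedAction<T> action) {
        Subject subject = new Subject(true,
                Set.of(new User(authenticated), new User(impersonated)),
                Set.of(), Set.of());
        return Subject.doAsPrivileged(subject, action, null);
    }

    public static void main(String[] args) {
        // The server side would recover both principals: first check that
        // "superuser" may impersonate, then authorize the operation as "alice".
        System.out.println(runAs("superuser", "alice", () -> "produce allowed"));
    }
}
```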
>
> @Rajini The intention of this KIP is to support token auth via SASL/SCRAM, not
> just with TLS. What you raised is a good point; let me take a look and add
> details.
>
> It will be easier to add impersonation once we reach agreement on this KIP.
>
>
> On Wed, Dec 14, 2016 at 5:51 AM Ismael Juma  wrote:
>
> > Hi Rajini,
> >
> > I think it would definitely be valuable to have a KIP for impersonation.
> >
> > Ismael
> >
> > On Wed, Dec 14, 2016 at 4:03 AM, Rajini Sivaram 
> > wrote:
> >
> > > It would clearly be very useful to enable clients to send requests on
> > > behalf of multiple users. A separate KIP makes sense, but it may be
> worth
> > > thinking through some of the implications now, especially if the main
> > > interest in delegation tokens comes from its potential to enable
> > > impersonation.
> > >
> > > I understand that delegation tokens are only expected to be used with
> > TLS.
> > > But the choice of SASL/SCRAM for authentication must be based on a
> > > requirement to protect the tokenHmac - otherwise you could just use
> > > SASL/PLAIN. With SASL/SCRAM the tokenHmac is never propagated
> > on-the-wire,
> > > only a salted-hashed version of it is used in the SASL authentication
> > > exchange. If impersonation is based on sending tokenHmac in requests,
> any
> > > benefit of using SCRAM is lost.
> > >
> > > An alternative may be to allow clients to authenticate multiple times
> > using
> > > SASL and include one of its authenticated principals in each request
> > > (optionally). I haven't thought it through yet, obviously. But if the
> > > approach is of interest and no one is working on a KIP for
> impersonation
> > at
> > > the moment, I am happy to write one. It may provide something for
> > > comparison at least.
> > >
> > > Thoughts?
> > >
> > >
> > > On Wed, Dec 14, 2016 at 9:53 AM, Manikumar 
> > > wrote:
> > >
> > > > That's a good idea. Authenticating every request with a delegation token
> > > > will be useful for impersonation use-cases. But as of now, we are thinking of
> > > > delegation tokens as just another way to authenticate users. We haven't
> > > > thought through all the use cases related to impersonation or using
> > > > delegation tokens for impersonation. We want to handle impersonation
> > > > (KAFKA-3712) as part of a separate KIP.
> > > >
> > > > Will that be OK?
> > > >
> > > >
> > > > On Wed, Dec 14, 2016 at 8:09 AM, Gwen Shapira 
> > wrote:
> > > >
> > > > > Thinking out loud here:
> > > > >
> > > > > It

Re: [DISCUSS] KIP-86: Configurable SASL callback handlers

2016-12-15 Thread Rajini Sivaram
Ismael,

1. At the moment AuthCallbackHandler is not a public interface, so I am
assuming that it can be modified. Yes, I agree that we should keep non-public
interfaces separate; I will do that as part of the implementation of this KIP.

2. Callback handlers do tend to depend on ordering, including those
included in the JVM and those in Kafka. I have specified the ordering in
the KIP, and will make sure it gets included in the documentation too.

3. Added a note to the KIP. Kafka needs access to the SCRAM credentials to
perform SCRAM authentication, whereas for PLAIN, Kafka only needs to know whether
the password is valid for the user. We want to support external authentication
servers whose interface validates a password rather than retrieving it.

4. Added code of ScramCredential to the KIP.
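To illustrate the asymmetry from points 3 and 4: for SCRAM the handler hands back a stored credential and Kafka performs the authentication, while for PLAIN Kafka hands the handler the password and the handler decides. Hypothetical, simplified shapes of the two callbacks (names and signatures here are illustrative, not the KIP's exact API):

```java
import javax.security.auth.callback.Callback;

/** SCRAM style: the handler supplies the stored credential; Kafka verifies. */
class ScramCredentialCallback implements Callback {
    private final String username;
    private Object credential;                 // stand-in for a ScramCredential
    ScramCredentialCallback(String username) { this.username = username; }
    String username() { return username; }
    void credential(Object c) { this.credential = c; }   // handler -> Kafka
    Object credential() { return credential; }
}

/** PLAIN style: Kafka supplies the password; the handler decides the outcome. */
class PlainAuthenticateCallback implements Callback {
    private final char[] password;             // Kafka -> handler
    private boolean authenticated;
    PlainAuthenticateCallback(char[] password) { this.password = password; }
    char[] password() { return password; }
    void authenticated(boolean ok) { this.authenticated = ok; }  // handler -> Kafka
    boolean authenticated() { return authenticated; }
}
```

The PLAIN shape lets an external server that only answers "is this password valid?" be plugged in, without ever exposing the stored password to Kafka.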


On Wed, Dec 14, 2016 at 3:54 PM, Ismael Juma  wrote:

> Thanks Rajini, that helps. A few comments:
>
> 1. The `AuthCallbackHandler` interface already exists and we are making
> breaking changes (removing a parameter from `configure` and adding
> additional methods). Is the reasoning that it was not a public interface
> before? It would be good to clearly separate public versus non-public
> interfaces in the security code (and we should tweak Gradle to publish
> javadoc for the public ones).
>
> 2. It seems like there is an ordering when it comes to the invocation of
> callbacks. At least the current code assumes that `NameCallback` is called
> first. If I am interpreting this correctly, we should specify that
> ordering.
>
> 3. The approach taken by `ScramCredentialCallback` is different from the
> one taken by `PlainAuthenticateCallback`. The former lets the user pass the
> credentials information while the latter passes the credentials and lets
> the user do the authentication. It would be good to explain the
> inconsistency.
>
> 4. We reference `ScramCredential` in a few places, so it would be good to
> define that class too.
>
> Ismael
>
> On Wed, Dec 14, 2016 at 7:32 AM, Rajini Sivaram <
> rajinisiva...@googlemail.com> wrote:
>
> > Have added sample callback handlers for PLAIN and SCRAM.
> >
> > On Tue, Dec 13, 2016 at 4:10 PM, Rajini Sivaram <
> > rajinisiva...@googlemail.com> wrote:
> >
> > > Ismael,
> > >
> > > Thank you for the review. I will add an example.
> > >
> > > On Tue, Dec 13, 2016 at 1:07 PM, Ismael Juma 
> wrote:
> > >
> > >> Hi Rajini,
> > >>
> > >> Thanks for the KIP. I think this is useful and users have asked for
> > >> something like that. I like that you have a scenarios section, do you
> > >> think
> > >> you could provide a rough sketch of what a callback handler would look
> > >> like
> > >> for the first 2 scenarios? They seem to be the common ones, so it
> would
> > >> help to see a concrete example.
> > >>
> > >> Ismael
> > >>
> > >> On Tue, Oct 11, 2016 at 7:28 AM, Rajini Sivaram <
> > >> rajinisiva...@googlemail.com> wrote:
> > >>
> > >> > Hi all,
> > >> >
> > >> > I have just created KIP-86 make callback handlers in SASL
> configurable
> > >> so
> > >> > that credential providers for SASL/PLAIN (and SASL/SCRAM when it is
> > >> > implemented) can be used with custom credential callbacks:
> > >> >
> > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > >> > 86%3A+Configurable+SASL+callback+handlers
> > >> >
> > >> > Comments and suggestions are welcome.
> > >> >
> > >> > Thank you...
> > >> >
> > >> >
> > >> > Regards,
> > >> >
> > >> > Rajini
> > >> >
> > >>
> > >
> > >
> > >
> > > --
> > > Regards,
> > >
> > > Rajini
> > >
> >
> >
> >
> > --
> > Regards,
> >
> > Rajini
> >
>


Re: [DISCUSS] KIP-98: Exactly Once Delivery and Transactional Messaging

2016-12-15 Thread Rajini Sivaram
Hi Apurva,

Thank you, makes sense.

Rajini

On Wed, Dec 14, 2016 at 7:36 PM, Apurva Mehta  wrote:

> Hi Rajini,
>
> I think my original response to your point 15 was not accurate. The regular
> definition of durability is that data once committed would never be lost.
> So it is not enough for only the control messages to be flushed before
> being acknowledged -- all the messages (and offset commits) which are part
> of the transaction would need to be flushed before being acknowledged as
> well.
>
> Otherwise, it is possible that if all replicas of a topic partition crash
> before the transactional messages are flushed, those messages will be lost
> even if the commit marker exists in the log. In this case, the transaction
> would be 'committed' with incomplete data.
>
> Right now, there isn't any config which will ensure that the flush to disk
> happens before the acknowledgement. We could add it in the future, and get
> durability guarantees for kafka transactions.
>
> I hope this clarifies the situation. The present KIP does not intend to add
> the aforementioned config, so even the control messages are susceptible to
> being lost if there is a simultaneous crash across all replicas. So
> transactions are only as durable as existing Kafka messages. We don't
> strengthen any durability guarantees as part of this KIP.
>
> Thanks,
> Apurva
>
>
> On Wed, Dec 14, 2016 at 1:52 AM, Rajini Sivaram 
> wrote:
>
> > Hi Apurva,
> >
> > Thank you for the answers. Just one follow-on.
> >
> > 15. Let me rephrase my original question. If all control messages
> (messages
> > to transaction logs and markers on user logs) were acknowledged only
> after
> > flushing the log segment, will transactions become durable in the
> > traditional sense (i.e. not restricted to min.insync.replicas failures) ?
> > This is not a suggestion to update the KIP. It seems to me that the
> design
> > enables full durability if required in the future with a rather
> > non-intrusive change. I just wanted to make sure I haven't missed
> anything
> > fundamental that prevents Kafka from doing this.
> >
> >
> >
> > On Wed, Dec 14, 2016 at 5:30 AM, Ben Kirwin  wrote:
> >
> > > Hi Apurva,
> > >
> > > Thanks for the detailed answers... and sorry for the late reply!
> > >
> > > It does sound like, if the input-partitions-to-app-id mapping never
> > > changes, the existing fencing mechanisms should prevent duplicates.
> > Great!
> > > I'm a bit concerned the proposed API will be delicate to program
> against
> > > successfully -- even in the simple case, we need to create a new
> producer
> > > instance per input partition, and anything fancier is going to need its
> > own
> > > implementation of the Streams/Samza-style 'task' idea -- but that may
> be
> > > fine for this sort of advanced feature.
> > >
> > > For the second question, I notice that Jason also elaborated on this
> > > downthread:
> > >
> > > > We also looked at removing the producer ID.
> > > > This was discussed somewhere above, but basically the idea is to
> store
> > > the
> > > > AppID in the message set header directly and avoid the mapping to
> > > producer
> > > > ID altogether. As long as batching isn't too bad, the impact on total
> > > size
> > > > may not be too bad, but we were ultimately more comfortable with a
> > fixed
> > > > size ID.
> > >
> > > ...which suggests that the distinction is useful for performance, but
> not
> > > necessary for correctness, which makes good sense to me. (Would a
> > > 128-bit
> > > ID be a reasonable compromise? That's enough room for a UUID, or a
> > > reasonable hash of an arbitrary string, and has only a marginal
> increase
> > on
> > > the message size.)
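Purely for illustration (not part of the proposal), a 128-bit ID fits either a UUID or a hash of an arbitrary app-chosen string:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

public class FixedSizeId {
    /** 16 bytes from a random UUID. */
    static byte[] fromUuid(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    /** 16 bytes by hashing an arbitrary string; MD5 is used only because its
     *  output happens to be 128 bits, not for any security property. */
    static byte[] fromString(String appId) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(appId.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is always present in the JDK
        }
    }
}
```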
> > >
> > > On Tue, Dec 6, 2016 at 11:44 PM, Apurva Mehta 
> > wrote:
> > >
> > > > Hi Ben,
> > > >
> > > > Now, on to your first question of how to deal with consumer rebalances. The
> > > > short answer is that the application needs to ensure that the
> > > > assignment of input partitions to appId is consistent across
> > > > rebalances.
> > > >
> > > > For Kafka streams, they already ensure that the mapping of input
> > > partitions
> > > > to task Id is invariant across rebalances by implementing a custom
> > sticky
> > > > assignor. Other non-streams apps can trivially have one producer per
> > > input
> > > > partition and have the appId be the same as the partition number to
> > > achieve
> > > > the same effect.
> > > >
> > > > With this precondition in place, we can maintain transactions across
> > > > rebalances.
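The per-partition scheme described above could be sketched as follows; the config key `transactional.app.id` is invented here as a stand-in for the KIP's AppID:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class PerPartitionProducers {
    /** Builds one producer config per input partition, deriving the AppID from
     *  the topic and partition number so it stays stable across rebalances.
     *  ("transactional.app.id" is a hypothetical key for illustration.) */
    static Map<Integer, Properties> configs(String topic, int partitionCount) {
        Map<Integer, Properties> byPartition = new HashMap<>();
        for (int p = 0; p < partitionCount; p++) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker:9092");
            props.put("transactional.app.id", topic + "-" + p); // invariant mapping
            byPartition.put(p, props);
        }
        return byPartition;
    }
}
```

Because the AppID depends only on the partition number, whichever consumer instance picks up a partition after a rebalance resumes with the same AppID, preserving the fencing guarantees.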
> > > >
> > > > Hope this answers your question.
> > > >
> > > > Thanks,
> > > > Apurva
> > > >
> > > > On Tue, Dec 6, 2016 at 3:22 PM, Ben Kirwin  wrote:
> > > >
> > > > > Thanks for this! I'm looking forward to going through the full
> > proposal
> > > > in
> > > > > detail soon; a few early questions:
> > > > >
> > > > > First: what happens when a consumer rebalances in the middle of a
> > > > > transaction? The full documentation suggests that such a
> transaction
>

[jira] [Assigned] (KAFKA-4543) Add capability to renew/expire delegation tokens.

2016-12-15 Thread Manikumar Reddy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar Reddy reassigned KAFKA-4543:
--

Assignee: Manikumar Reddy

> Add capability to renew/expire delegation tokens.
> -
>
> Key: KAFKA-4543
> URL: https://issues.apache.org/jira/browse/KAFKA-4543
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ashish K Singh
>Assignee: Manikumar Reddy
>
> Add capability to renew/expire delegation tokens.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4547) Consumer.position returns incorrect results for Kafka 0.10.1.0 client

2016-12-15 Thread Pranav Nakhe (JIRA)
Pranav Nakhe created KAFKA-4547:
---

 Summary: Consumer.position returns incorrect results for Kafka 
0.10.1.0 client
 Key: KAFKA-4547
 URL: https://issues.apache.org/jira/browse/KAFKA-4547
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 0.10.1.0
 Environment: Windows Kafka 0.10.1.0
Reporter: Pranav Nakhe


Consider the following code -

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
List<TopicPartition> listOfPartitions = new ArrayList<>();
for (int i = 0; i < consumer.partitionsFor("IssueTopic").size(); i++) {
    listOfPartitions.add(new TopicPartition("IssueTopic", i));
}
consumer.assign(listOfPartitions);
consumer.pause(listOfPartitions);
consumer.seekToEnd(listOfPartitions);
// consumer.resume(listOfPartitions); -- commented out
for (int i = 0; i < listOfPartitions.size(); i++) {
    System.out.println(consumer.position(listOfPartitions.get(i)));
}

I have created a topic IssueTopic with 3 partitions, each with a single replica,
on my single-node Kafka installation (0.10.1.0).

The behavior noticed for Kafka client 0.10.1.0 as against Kafka client 0.10.0.1:

A) Initially when there are no messages on IssueTopic running the above program 
returns
0.10.1.0    0.10.0.1
0   0
0   0
0   0

B) Next I send 6 messages and see that the messages have been evenly 
distributed across the three partitions. Running the above program now returns 
0.10.1.0    0.10.0.1
0   2
0   2
2   2

Clearly there is a difference in behavior between the two clients.

Now, if after the seekToEnd call I make a call to resume (uncomment the resume call
in the code above), then the behavior is

0.10.1.0    0.10.0.1
2   2
2   2
2   2

This is an issue I came across when using the Spark Kafka integration for 0.10.
When I use Kafka 0.10.1.0 I started seeing this issue. I had raised a pull
request to resolve that issue [SPARK-18779], but when looking at the Kafka
client implementation/documentation now, it seems the issue is with Kafka and
not with Spark. There does not seem to be any documentation which
specifies/implies that we need to call resume after seekToEnd for position to
return the correct value. Also, there is a clear difference in behavior between
the two Kafka client implementations.






[jira] [Updated] (KAFKA-4547) Consumer.position returns incorrect results for Kafka 0.10.1.0 client

2016-12-15 Thread Pranav Nakhe (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pranav Nakhe updated KAFKA-4547:

Description: 
Consider the following code -

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
List<TopicPartition> listOfPartitions = new ArrayList<>();
for (int i = 0; i < consumer.partitionsFor("IssueTopic").size(); i++) {
    listOfPartitions.add(new TopicPartition("IssueTopic", i));
}
consumer.assign(listOfPartitions);
consumer.pause(listOfPartitions);
consumer.seekToEnd(listOfPartitions);
// consumer.resume(listOfPartitions); -- commented out
for (int i = 0; i < listOfPartitions.size(); i++) {
    System.out.println(consumer.position(listOfPartitions.get(i)));
}

I have created a topic IssueTopic with 3 partitions, each with a single replica,
on my single-node Kafka installation (0.10.1.0).

The behavior noticed for Kafka client 0.10.1.0 as against Kafka client 0.10.0.1:

A) Initially when there are no messages on IssueTopic running the above program 
returns
0.10.1.0   
0  
0  
0   

0.10.0.1
0
0
0

B) Next I send 6 messages and see that the messages have been evenly 
distributed across the three partitions. Running the above program now returns 
0.10.1.0   
0  
0  
2  

0.10.0.1
2
2
2

Clearly there is a difference in behavior between the two clients.

Now, if after the seekToEnd call I make a call to resume (uncomment the resume call
in the code above), then the behavior is

0.10.1.0   
2  
2  
2  

0.10.0.1
2
2
2

This is an issue I came across when using the Spark Kafka integration for 0.10.
When I use Kafka 0.10.1.0 I started seeing this issue. I had raised a pull
request to resolve that issue [SPARK-18779], but when looking at the Kafka
client implementation/documentation now, it seems the issue is with Kafka and
not with Spark. There does not seem to be any documentation which
specifies/implies that we need to call resume after seekToEnd for position to
return the correct value. Also, there is a clear difference in behavior between
the two Kafka client implementations.



> Consumer.position

[jira] [Commented] (KAFKA-1120) Controller could miss a broker state change

2016-12-15 Thread Ben Stopford (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15750982#comment-15750982
 ] 

Ben Stopford commented on KAFKA-1120:
-

[~junrao] added this comment on a mail thread on this topic which is useful: 
https://www.mail-archive.com/dev@kafka.apache.org/msg62261.html


> Controller could miss a broker state change 
> 
>
> Key: KAFKA-1120
> URL: https://issues.apache.org/jira/browse/KAFKA-1120
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.1
>Reporter: Jun Rao
>  Labels: reliability
>
> When the controller is in the middle of processing a task (e.g., preferred 
> leader election, broker change), it holds a controller lock. During this 
> time, a broker could have de-registered and re-registered itself in ZK. After 
> the controller finishes processing the current task, it will start processing 
> the logic in the broker change listener. However, it will see no broker 
> change and therefore won't do anything to the restarted broker. This broker 
> will be in a weird state since the controller doesn't inform it to become the 
> leader of any partition. Yet, the cached metadata in other brokers could 
> still list that broker as the leader for some partitions. Client requests 
> routed to that broker will then get a TopicOrPartitionNotExistException. This 
> broker will continue to be in this bad state until it's restarted again.





[jira] [Commented] (KAFKA-4477) Node reduces its ISR to itself, and doesn't recover. Other nodes do not take leadership, cluster remains sick until node is restarted.

2016-12-15 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15750994#comment-15750994
 ] 

Michael Andre Pearce (IG) commented on KAFKA-4477:
--

Have we a timeline on RC1? 

It would be good to have this available asap.

> Node reduces its ISR to itself, and doesn't recover. Other nodes do not take 
> leadership, cluster remains sick until node is restarted.
> --
>
> Key: KAFKA-4477
> URL: https://issues.apache.org/jira/browse/KAFKA-4477
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.0
> Environment: RHEL7
> java version "1.8.0_66"
> Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
> Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
>Reporter: Michael Andre Pearce (IG)
>Assignee: Apurva Mehta
>Priority: Critical
>  Labels: reliability
> Attachments: issue_node_1001.log, issue_node_1001_ext.log, 
> issue_node_1002.log, issue_node_1002_ext.log, issue_node_1003.log, 
> issue_node_1003_ext.log, kafka.jstack, state_change_controller.tar.gz
>
>
> We have encountered a critical issue that has re-occurred in different 
> physical environments. We haven't worked out what is going on, though we do 
> have a nasty workaround to keep service alive. 
> We have not had this issue on clusters still running 0.9.0.1.
> We have noticed a node randomly shrinking the ISRs for the partitions it owns 
> down to itself; moments later we see other nodes having disconnects, 
> followed finally by app issues, where producing to these partitions is 
> blocked.
> It seems only restarting the Kafka instance's Java process resolves the 
> issue.
> We have had this occur multiple times, and from all network and machine 
> monitoring the machine never left the network or had any other glitches.
> Below are logs seen during the issue.
> Node 7:
> [2016-12-01 07:01:28,112] INFO Partition 
> [com_ig_trade_v1_position_event--demo--compacted,10] on broker 7: Shrinking 
> ISR for partition [com_ig_trade_v1_position_event--demo--compacted,10] from 
> 1,2,7 to 7 (kafka.cluster.Partition)
> All other nodes:
> [2016-12-01 07:01:38,172] WARN [ReplicaFetcherThread-0-7], Error in fetch 
> kafka.server.ReplicaFetcherThread$FetchRequest@5aae6d42 
> (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 7 was disconnected before the response was 
> read
> All clients:
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.NetworkException: The server disconnected 
> before a response was received.
> After this occurs, we then suddenly see on the sick machine an increasing 
> number of close_waits and open file descriptors.
> As a workaround to keep service alive, we are currently putting in an automated 
> process that tails the log and matches the regex below; where new_partitions is 
> just the node itself, we restart the node. 
> "\[(?P<date>.+)\] INFO Partition \[.*\] on broker .* Shrinking ISR for 
> partition \[.*\] from (?P<old_partitions>.+) to (?P<new_partitions>.+) 
> \(kafka.cluster.Partition\)"
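The detection step of this workaround can be sketched in Java (named groups rewritten in Java's `(?<name>)` syntax; the tailing loop and the restart action are omitted):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IsrShrinkWatch {
    // Matches the "Shrinking ISR" log line and captures broker id and ISR sets.
    static final Pattern SHRINK = Pattern.compile(
        "\\[(?<date>.+)\\] INFO Partition \\[.*\\] on broker (?<broker>\\d+): "
        + "Shrinking ISR for partition \\[.*\\] from (?<old>.+) to (?<new>.+) "
        + "\\(kafka\\.cluster\\.Partition\\)");

    /** True when the broker shrank the ISR down to only itself,
     *  which is the trigger for restarting the node. */
    static boolean shrankToSelf(String logLine) {
        Matcher m = SHRINK.matcher(logLine);
        return m.matches() && m.group("new").trim().equals(m.group("broker"));
    }
}
```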





[jira] [Commented] (KAFKA-4477) Node reduces its ISR to itself, and doesn't recover. Other nodes do not take leadership, cluster remains sick until node is restarted.

2016-12-15 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15751019#comment-15751019
 ] 

Michael Andre Pearce (IG) commented on KAFKA-4477:
--

Hi [~junrao]

We had a similar issue this morning; this time, though, we DO see a deadlock.

I've attached all the logs and the stack we gathered from all the processes. Logs 
are in CSV; the original log line is in column _raw in the CSV.

I've also attached screenshots of our monitoring graphs of the follower stats. 
We do see a spike, but this seems to be post-restart of the process (I think we 
should expect that?).

All are in 2016_12_15.zip.

Cheers
Mike



> Node reduces its ISR to itself, and doesn't recover. Other nodes do not take 
> leadership, cluster remains sick until node is restarted.
> --
>
> Key: KAFKA-4477
> URL: https://issues.apache.org/jira/browse/KAFKA-4477
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.0
> Environment: RHEL7
> java version "1.8.0_66"
> Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
> Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
>Reporter: Michael Andre Pearce (IG)
>Assignee: Apurva Mehta
>Priority: Critical
>  Labels: reliability
> Attachments: issue_node_1001.log, issue_node_1001_ext.log, 
> issue_node_1002.log, issue_node_1002_ext.log, issue_node_1003.log, 
> issue_node_1003_ext.log, kafka.jstack, state_change_controller.tar.gz
>
>





[jira] [Updated] (KAFKA-4477) Node reduces its ISR to itself, and doesn't recover. Other nodes do not take leadership, cluster remains sick until node is restarted.

2016-12-15 Thread Michael Andre Pearce (IG) (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Andre Pearce (IG) updated KAFKA-4477:
-
Attachment: 2016_12_15.zip

IG ISR issue of 2016-12-15 04:27 (this time we see deadlock) attached.

> Node reduces its ISR to itself, and doesn't recover. Other nodes do not take 
> leadership, cluster remains sick until node is restarted.
> --
>
> Key: KAFKA-4477
> URL: https://issues.apache.org/jira/browse/KAFKA-4477
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.0
> Environment: RHEL7
> java version "1.8.0_66"
> Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
> Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
>Reporter: Michael Andre Pearce (IG)
>Assignee: Apurva Mehta
>Priority: Critical
>  Labels: reliability
> Attachments: 2016_12_15.zip, issue_node_1001.log, 
> issue_node_1001_ext.log, issue_node_1002.log, issue_node_1002_ext.log, 
> issue_node_1003.log, issue_node_1003_ext.log, kafka.jstack, 
> state_change_controller.tar.gz
>
>
> We have encountered a critical issue that has re-occured in different 
> physical environments. We haven't worked out what is going on. We do though 
> have a nasty work around to keep service alive. 
> We do have not had this issue on clusters still running 0.9.01.
> We have noticed a node randomly shrinking for the partitions it owns the 
> ISR's down to itself, moments later we see other nodes having disconnects, 
> followed by finally app issues, where producing to these partitions is 
> blocked.
> It seems only by restarting the kafka instance java process resolves the 
> issues.
> We have had this occur multiple times and from all network and machine 
> monitoring the machine never left the network, or had any other glitches.
> Below are seen logs from the issue.
> Node 7:
> [2016-12-01 07:01:28,112] INFO Partition 
> [com_ig_trade_v1_position_event--demo--compacted,10] on broker 7: Shrinking 
> ISR for partition [com_ig_trade_v1_position_event--demo--compacted,10] from 
> 1,2,7 to 7 (kafka.cluster.Partition)
> All other nodes:
> [2016-12-01 07:01:38,172] WARN [ReplicaFetcherThread-0-7], Error in fetch 
> kafka.server.ReplicaFetcherThread$FetchRequest@5aae6d42 
> (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 7 was disconnected before the response was 
> read
> All clients:
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.NetworkException: The server disconnected 
> before a response was received.
> After this occurs, we then suddenly see on the sick machine an increasing 
> amount of close_waits and file descriptors.
> As a workaround to keep the service up, we are currently putting in an 
> automated process that tails the logs and matches the regex below; where 
> the new ISR is just the node itself, we restart the node. 
> "\[(?P.+)\] INFO Partition \[.*\] on broker .* Shrinking ISR for 
> partition \[.*\] from (?P.+) to (?P.+) 
> \(kafka.cluster.Partition\)"
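The named groups in the regex above were stripped when this report was archived; a hedged reconstruction of the watchdog match (the group names are guesses, not the original ones) might look like:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ShrinkWatch {
    // Reconstruction of the watchdog pattern from this report; the original
    // named groups were lost during archiving, so the names here are
    // illustrative only.
    static final Pattern SHRINK = Pattern.compile(
        "\\[(?<date>.+)\\] INFO Partition \\[.*\\] on broker .* Shrinking ISR for "
      + "partition \\[.*\\] from (?<oldIsr>.+) to (?<newIsr>.+) "
      + "\\(kafka\\.cluster\\.Partition\\)");

    public static void main(String[] args) {
        String line = "[2016-12-01 07:01:28,112] INFO Partition "
            + "[com_ig_trade_v1_position_event--demo--compacted,10] on broker 7: Shrinking "
            + "ISR for partition [com_ig_trade_v1_position_event--demo--compacted,10] from "
            + "1,2,7 to 7 (kafka.cluster.Partition)";
        Matcher m = SHRINK.matcher(line);
        // Restart the node when the new ISR is just the broker itself.
        if (m.find())
            System.out.println(m.group("newIsr")); // prints "7"
    }
}
```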



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (KAFKA-4477) Node reduces its ISR to itself, and doesn't recover. Other nodes do not take leadership, cluster remains sick until node is restarted.

2016-12-15 Thread Michael Andre Pearce (IG) (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Andre Pearce (IG) updated KAFKA-4477:
-
Comment: was deleted

(was: IG ISR issue of 2016-12-15 04:27 (this time we see deadlock) attached.)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] 0.10.1.1 RC0

2016-12-15 Thread Michael Pearce
Is there any update on this? When do we expect an RC1 for vote?

We see KAFKA-4497 is marked Resolved.

Cheers
Mike

On 13/12/2016, 00:59, "Guozhang Wang"  wrote:

I see. Currently the upload command has not included the 2.12 version yet,
I will manually do that in the next RC.

Guozhang

On Mon, Dec 12, 2016 at 4:47 PM, Bernard Leach 
wrote:

> I found those ones but was hoping to see them in
> https://repository.apache.org/content/groups/staging/org/apache/kafka/ <
> https://repository.apache.org/content/groups/staging/org/apache/kafka/>
> so we can just point the maven build there for testing.
>
> > On 13 Dec 2016, at 11:38, Guozhang Wang  wrote:
> >
> > @Bernard
> >
> > You should be able to find the artifacts in the staging RC repository I
> > listed above, named as "kafka_2.12-0.10.1.1.tgz".
> >
> > @Everyone
> >
> > Since KAFKA-4497 is a critical issue, that anyone using compaction with
> > compressed message on server logs can possibly hit it, I'm going to
> create
> > a new RC after it is merged (presumably soon).
> >
> > If you discover any new problems in the meantime, let me know on this
> > thread.
> >
> >
> > Guozhang
> >
> >
> > On Mon, Dec 12, 2016 at 2:51 PM, Ismael Juma  wrote:
> >
> >> Yes, it would be good to include that (that's why I set the fix version
> to
> >> 0.10.1.1 a couple of days ago).
> >>
> >> Ismael
> >>
> >> On Mon, Dec 12, 2016 at 3:47 PM, Moczarski, Swen <
> >> smoczar...@ebay-kleinanzeigen.de> wrote:
> >>
> >>> -0 (non-binding)
> >>>
> >>> Would it make sense to include https://issues.apache.org/
> >>> jira/browse/KAFKA-4497 ? The issue sounds quite critical and a patch
> for
> >>> 0.10.1.1 seems to be available.
> >>>
> >>> On 2016-12-07 23:46 (+0100), Guozhang Wang wrote:
>  Hello Kafka users, developers and client-developers,
> 
>  This is the first candidate for the release of Apache Kafka 0.10.1.1.
>  This is a bug fix release and it includes fixes and improvements from 27
>  JIRAs. See the release notes for more details:
> 
>  http://home.apache.org/~guozhang/kafka-0.10.1.1-rc0/RELEASE_NOTES.html
> 
>  *** Please download, test and vote by Monday, 13 December, 8am PT ***
> 
>  Kafka's KEYS file containing PGP keys we use to sign the release:
>  http://kafka.apache.org/KEYS
> 
>  * Release artifacts to be voted upon (source and binary):
>  http://home.apache.org/~guozhang/kafka-0.10.1.1-rc0/
> 
>  * Maven artifacts to be voted upon:
>  https://repository.apache.org/content/groups/staging/org/apache/kafka/
> 
>  * Javadoc:
>  http://home.apache.org/~guozhang/kafka-0.10.1.1-rc0/javadoc/
> 
>  * Tag to be voted upon (off the 0.10.1 branch) is the 0.10.1.1-rc0 tag:
>  https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=8b77507083fdd427ce81021228e7e346da0d814c
> 
> 
>  Thanks,
>  Guozhang
> >>>
> >>
> >
> >
> >
> > --
> > -- Guozhang
>
>


--
-- Guozhang




Re: [DISCUSS] KIP-48 Support for delegation tokens as an authentication mechanism

2016-12-15 Thread Rajini Sivaram
@Mani

Can you add a sample Jaas configuration using delegation tokens to the KIP?
Since delegation tokens will be handled differently from other SCRAM
credentials, it should work anyway, but it will be good to see an example
of the configuration the user provides. It sounds like users provide both
tokenID and delegation token, but I wasn't sure.
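A sample Jaas configuration along these lines might reuse the SCRAM login module, with the tokenID as the user name and the tokenHmac as the password. The extra option name below is illustrative only — the KIP had not fixed it at this point:

```
KafkaClient {
  org.apache.kafka.common.security.scram.ScramLoginModule required
  username="<tokenID>"
  password="<tokenHmac>"
  tokenauth="true";   /* illustrative flag telling the broker to look up a token, not a user */
};
```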

@Gwen, @Ismael, @Harsha, @Mani

To make sure I have understood correctly, KAFKA-3712 is aimed at enabling a
superuser to impersonate another (single) user, say alice. A producer using
impersonation will authenticate with superuser credentials. All requests
from the producer will be run with the principal alice. But alice is not
involved in the authentication and alice's credentials are not actually
provided to the broker?

The use case I was thinking of was the other one on Mani's list above. I
want to make sure that my understanding matches Gwen's and Ismael's for
this one. Using REST proxy as an example, users sending requests to the
REST proxy provide a delegation token as an auth token. REST proxy itself
authenticates as a different user, perhaps with access to read metadata of
all topics. But Kafka requests corresponding to each REST request should be
authenticated based on the authentication token provided in the request.
At the moment, the REST proxy has to create producers for each user (or
perform authentication and authorization for the user). Instead we want to
create a single producer which has the ability to send requests on behalf
of multiple users where the user is authenticated by Kafka and each request
from the producer is authorized based on the user who initiated the request.

1a) I think Gwen's suggestion was something along the lines of:
producer.send(record1, delegationToken1);
producer.send(record2, delegationToken2);
1b) I was thinking more along the lines of:
subject1 = producer.authenticate(user1Credentials);
subject2 = producer.authenticate(user2Credentials);

Subject.doAs(subject1, (PrivilegedExceptionAction<Future<RecordMetadata>>)
() -> producer.send(record1));
Subject.doAs(subject2, (PrivilegedExceptionAction<Future<RecordMetadata>>)
() -> producer.send(record2));

To make either approach work, each request needs to be associated with a
user. This would be the delegation token in 1a) and the user principal in
1b). Both need to ensure that producers don't send records from multiple
users in one request. And if producers hold a single metadata instance
rather than one per user, calls like partitionsFor() shouldn't return
metadata for unauthorized topics. It is a very big change, so it will be
good to know whether there is a real requirement to support this.
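To make 1a) concrete, here is a rough sketch of how a single producer could keep per-token batches so that no produce request ever mixes users. All names are illustrative, not proposed Kafka API:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of option 1a: one producer accepts a delegation token per send(),
// and pending records are grouped by token so that each produce request
// carries records for exactly one user.
public class PerUserBatcher {
    private final Map<String, List<String>> pendingByToken = new LinkedHashMap<>();

    public void send(String record, String delegationToken) {
        pendingByToken.computeIfAbsent(delegationToken, t -> new ArrayList<>())
                      .add(record);
    }

    // One produce request per token: the broker authorizes each request
    // against the single user its token identifies.
    public Map<String, List<String>> drain() {
        Map<String, List<String>> out = new LinkedHashMap<>(pendingByToken);
        pendingByToken.clear();
        return out;
    }

    public static void main(String[] args) {
        PerUserBatcher b = new PerUserBatcher();
        b.send("r1", "token-alice");
        b.send("r2", "token-bob");
        b.send("r3", "token-alice");
        System.out.println(b.drain()); // {token-alice=[r1, r3], token-bob=[r2]}
    }
}
```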


On Thu, Dec 15, 2016 at 9:04 AM, Manikumar 
wrote:

> @Gwen, @Rajini,
>
> As mentioned in the KIP, main motivation for this KIP is to reduce load on
> Kerberos
> server on large kafka deployments with large number of clients.
>
> Also it looks like we are combining two overlapping concepts
> 1. Single client sending requests with multiple users/authentications
> 2. Impersonation
>
> Option 1, is definitely useful in some use cases and can be used to
> implement workaround for
> impersonation
>
> In Impersonation, a super user can send requests on behalf of another
> user(Alice) in a secured way.
> superuser has credentials but user Alice doesn't have any. The requests are
> required
> to run as user Alice and accesses/ACLs on Broker are required to be done as
> user Alice.
> It is required that user Alice can connect to the Broker on a connection
> authenticated with
> superuser's credentials. In other words superuser is impersonating the user
> Alice.
>
> The approach mentioned by Harsha in previous mail is implemented in hadoop,
> storm etc..
>
> Some more details here:
> https://hadoop.apache.org/docs/r2.7.2/hadoop-project-
> dist/hadoop-common/Superusers.html
>
>
> @Rajini
>
> Thanks for your comments on SASL/SCRAM usage. I am thinking to send
> tokenHmac (salted-hashed version)
> as password for authentication and tokenID for retrial of tokenHmac at
> server side.
> Does above sound OK?
>
>
> Thanks,
> Manikumar
>
> On Wed, Dec 14, 2016 at 10:33 PM, Harsha Chintalapani 
> wrote:
>
> > @Gwen @Mani  Not sure why we want to authenticate at every request. Even
> if
> > the token exchange is cheap it still a few calls that need to go through
> > round trip.  Impersonation doesn't require authentication for every
> > request.
> >
> > "So a centralized app can create few producers, do the metadata request
> and
> > broker discovery with its own user auth, but then use delegation tokens
> to
> > allow performing produce/fetch requests as different users? Instead of
> > having to re-connect for each impersonated user?"
> >
> > Yes. But what we will have is this centralized user as impersonation user
> > on behalf of other users. When it authenticates initially we will create
> a
> > "Subject" and from there on wards centralized user can do
> > Subject.doAsPrivileged
> > on behalf, other users.
> > On the server side, we can retrieve two princi

Re: Improve default Kafka logger settings to prevent extremely high disk space usage issue?

2016-12-15 Thread Jaikiran Pai



On Tuesday 13 December 2016 03:29 PM, Ismael Juma wrote:

> The log config settings for the controller and state change logger have
> been that way since they were introduced. They're generally useful when
> investigating issues with the controller. Looks like this is too noisy in
> some scenarios though. It may be worth filing a JIRA with specifics of your
> use case to see if something can be done to improve that.

For now, we decided to use a custom log4j config. I'll get back to this 
and check if there are specific log messages that cause this or if it's 
just regular logs at TRACE level which contribute to this, before filing 
a JIRA.


-Jaikiran





Ismael

On Tue, Dec 13, 2016 at 9:32 AM, Jaikiran Pai 
wrote:


We happened to run into a disk space usage issue with Kafka 0.10.0.1 (the
version we are using) on one of our production setups this morning. Turns
out (log4j) logging from Kafka ended up using 81G and more of disk space.
Looking at the files, I see the controller.log itself is 30G and more (for
a single day). Looking at the default log4j.properties that's shipped in
Kafka, it uses the DailyRollingFileAppender which is one of the things that
contributes to this issue. I see that there's already a patch and JIRA to
fix this https://issues.apache.org/jira/browse/KAFKA-2394. It's been
marked for 0.11 because there wasn't a clear decision when to ship it.

Given that we have been going through 0.10.x releases these days and the
0.11 release looking a bit away, is there any chance, this specific JIRA
can make it to 0.10.x? I personally don't see any compatibility issues that
it will introduce when it comes to *functionality/features* of Kafka
itself, so I am not sure if it's that big a change to wait all the way till
0.11. Furthermore, since the default shipped setting can cause issues like
the one I noted, I think it probably would be fine to include it in one of
the 0.10.x releases. Of course, we ourselves can setup the logging config
on our setup to use a size based rolling file config instead of the one
shipped by default, but if this is something that can make it to 0.10.x, we
would like to avoid doing this customization ourselves.
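For reference, a size-based alternative to the shipped DailyRollingFileAppender might look like the following (sizes, counts and paths are illustrative):

```
log4j.appender.controllerAppender=org.apache.log4j.RollingFileAppender
log4j.appender.controllerAppender.File=${kafka.logs.dir}/controller.log
log4j.appender.controllerAppender.MaxFileSize=100MB
log4j.appender.controllerAppender.MaxBackupIndex=10
log4j.appender.controllerAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.controllerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

# Caps total controller logging at ~1GB instead of unbounded daily files
log4j.logger.kafka.controller=INFO, controllerAppender
log4j.additivity.kafka.controller=false
```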

That's one part of the issue. The other is, I see this in the default
shipped log4j.properties:


log4j.logger.kafka.controller=TRACE, controllerAppender
log4j.additivity.kafka.controller=false

log4j.logger.state.change.logger=TRACE, stateChangeAppender
log4j.additivity.state.change.logger=false


Is it intentional to have this at TRACE level for the default shipped
config instead of having something like INFO or maybe DEBUG?


-Jaikiran





[DISCUSS] KIP-102 - Add close with timeout for consumers

2016-12-15 Thread Rajini Sivaram
Hi all,

I have just created KIP-102 to add a new close method for consumers with a
timeout parameter, making Consumer consistent with Producer:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-102+-+Add+close+with+timeout+for+consumers

Comments and suggestions are welcome.

Thank you...

Regards,

Rajini


Re: [DISCUSS] KIP-48 Support for delegation tokens as an authentication mechanism

2016-12-15 Thread Ismael Juma
Hi Rajini,

The use case you outline is indeed the one I was thinking of. The concern
is indeed what you pointed out, it could be a large change.

Ismael

On Thu, Dec 15, 2016 at 3:34 AM, Rajini Sivaram  wrote:

> @Mani
>
> Can you add a sample Jaas configuration using delegation tokens to the KIP?
> Since delegation tokens will be handled differently from other SCRAM
> credentials, it should work anyway, but it will be good to see an example
> of the configuration the user provides. It sounds like users provide both
> tokenID and delegation token, but I wasn't sure.
>
> @Gwen, @Ismael, @Harsha, @Mani
>
> To make sure I have understood correctly, KAFKA-3712 is aimed at enabling a
> superuser to impersonate another (single) user, say alice. A producer using
> impersonation will authenticate with superuser credentials. All requests
> from the producer will be run with the principal alice. But alice is not
> involved in the authentication and alice's credentials are not actually
> provided to the broker?
>
> The use case I was thinking of was the other one on Mani's list above. I
> want to make sure that my understanding matches Gwen's and Ismael's for
> this one. Using REST proxy as an example, users sending requests to the
> REST proxy provide a delegation token as an auth token. REST proxy itself
> authenticates as a different user, perhaps with access to read metadata of
> all topics. But Kafka requests corresponding to each REST request should be
> authenticated based on the authentication token provided in the request.
> At the moment, the REST proxy has to create producers for each user (or
> perform authentication and authorization for the user). Instead we want to
> create a single producer which has the ability to send requests on behalf
> of multiple users where the user is authenticated by Kafka and each request
> from the producer is authorized based on the user who initiated the
> request.
>
> 1a) I think Gwen's suggestion was something along the lines of:
> producer.send(record1, delegationToken1);
> producer.send(record2, delegationToken2);
> 1b) I was thinking more along the lines of:
> subject1 = producer.authenticate(user1Credentials);
> subject2 = producer.authenticate(user2Credentials);
> 
> Subject.doAs(subject1, (PrivilegedExceptionAction<Future<RecordMetadata>>)
> () -> producer.send(record1));
> Subject.doAs(subject2, (PrivilegedExceptionAction<Future<RecordMetadata>>)
> () -> producer.send(record2));
>
> To make either approach work, each request needs to be associated with a
> user. This would be delegation token in 1a) and user principal in 1b). And
> both need to ensure that producers don't send records from multiple users in
> one request. And if producers hold one metadata rather than one per-user,
> calls like partitionsFor() shouldn't return metadata for unauthorized
> topics. It is a very big change, so it will be good to know whether there
> is a real requirement to support this.
>
>
> On Thu, Dec 15, 2016 at 9:04 AM, Manikumar 
> wrote:
>
> > @Gwen, @Rajini,
> >
> > As mentioned in the KIP, main motivation for this KIP is to reduce load
> on
> > Kerberos
> > server on large kafka deployments with large number of clients.
> >
> > Also it looks like we are combining two overlapping concepts
> > 1. Single client sending requests with multiple users/authentications
> > 2. Impersonation
> >
> > Option 1, is definitely useful in some use cases and can be used to
> > implement workaround for
> > impersonation
> >
> > In Impersonation, a super user can send requests on behalf of another
> > user(Alice) in a secured way.
> > superuser has credentials but user Alice doesn't have any. The requests
> are
> > required
> > to run as user Alice and accesses/ACLs on Broker are required to be done
> as
> > user Alice.
> > It is required that user Alice can connect to the Broker on a connection
> > authenticated with
> > superuser's credentials. In other words superuser is impersonating the
> user
> > Alice.
> >
> > The approach mentioned by Harsha in previous mail is implemented in
> hadoop,
> > storm etc..
> >
> > Some more details here:
> > https://hadoop.apache.org/docs/r2.7.2/hadoop-project-
> > dist/hadoop-common/Superusers.html
> >
> >
> > @Rajini
> >
> > Thanks for your comments on SASL/SCRAM usage. I am thinking to send
> > tokenHmac (salted-hashed version)
> > as password for authentication and tokenID for retrial of tokenHmac at
> > server side.
> > Does above sound OK?
> >
> >
> > Thanks,
> > Manikumar
> >
> > On Wed, Dec 14, 2016 at 10:33 PM, Harsha Chintalapani 
> > wrote:
> >
> > > @Gwen @Mani  Not sure why we want to authenticate at every request.
> Even
> > if
> > > the token exchange is cheap it still a few calls that need to go
> through
> > > round trip.  Impersonation doesn't require authentication for every
> > > request.
> > >
> > > "So a centralized app can create few producers, do the metadata request
> > and
> > > broker discovery with its own user auth, but then use delegation tokens
> > to
> > > allow perf

Re: [DISCUSS] KIP-86: Configurable SASL callback handlers

2016-12-15 Thread Ismael Juma
Thanks Rajini, your answers make sense to me. One more general point: we
are following the JAAS callback architecture and exposing that to the user
where the user has to write code like:

@Override
public void handle(Callback[] callbacks) throws IOException,
UnsupportedCallbackException {
String username = null;
for (Callback callback: callbacks) {
if (callback instanceof NameCallback)
username = ((NameCallback) callback).getDefaultName();
else if (callback instanceof PlainAuthenticateCallback) {
PlainAuthenticateCallback plainCallback =
(PlainAuthenticateCallback) callback;
boolean authenticated = authenticate(username,
plainCallback.password());
plainCallback.authenticated(authenticated);
} else
throw new UnsupportedCallbackException(callback);
}
}

protected boolean authenticate(String username, char[] password) throws
IOException {
if (username == null)
return false;
else {
String expectedPassword =
JaasUtils.jaasConfig(LoginType.SERVER.contextName(), "user_" + username,
PlainLoginModule.class.getName());
return Arrays.equals(password, expectedPassword.toCharArray());
}
}

Since we need to create a new callback type for Plain, Scram and so on, is
it really worth it to do it this way? For example, in the code above, the
`authenticate` method could be the only thing the user has to implement and
we could do the necessary work to unwrap the data from the various
callbacks when interacting with the SASL API. More work for us, but a much
more pleasant API for users. What are the drawbacks?
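One way to sketch that simpler API: a base handler owns the callback-unwrapping boilerplate, and users override authenticate() only. PlainAuthenticateCallback below is a minimal stand-in for the class discussed in this thread, not Kafka's actual implementation:

```java
import java.io.IOException;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.UnsupportedCallbackException;

public abstract class PlainAuthenticateHandler implements CallbackHandler {

    // Stand-in for Kafka's PlainAuthenticateCallback, for illustration only.
    public static class PlainAuthenticateCallback implements Callback {
        private final char[] password;
        private boolean authenticated;
        public PlainAuthenticateCallback(char[] password) { this.password = password; }
        public char[] password() { return password; }
        public void authenticated(boolean ok) { this.authenticated = ok; }
        public boolean authenticated() { return authenticated; }
    }

    @Override
    public void handle(Callback[] callbacks) throws IOException, UnsupportedCallbackException {
        String username = null;
        for (Callback callback : callbacks) {
            if (callback instanceof NameCallback)
                username = ((NameCallback) callback).getDefaultName();
            else if (callback instanceof PlainAuthenticateCallback) {
                PlainAuthenticateCallback plain = (PlainAuthenticateCallback) callback;
                plain.authenticated(authenticate(username, plain.password()));
            } else
                throw new UnsupportedCallbackException(callback);
        }
    }

    // The only method a user would need to implement in this design.
    protected abstract boolean authenticate(String username, char[] password) throws IOException;
}
```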

Ismael

On Thu, Dec 15, 2016 at 1:06 AM, Rajini Sivaram  wrote:

> Ismael,
>
> 1. At the moment AuthCallbackHandler is not a public interface, so I am
> assuming that it can be modified. Yes, agree that we should keep non-public
> methods separate. Will do that as part of the implementation of this KIP.
>
> 2. Callback handlers do tend to depend on ordering, including those
> included in the JVM and these in Kafka. I have specified the ordering in
> the KIP. Will make sure they get included in documentation too.
>
> 3. Added a note to the KIP. Kafka needs access to the SCRAM credentials to
> perform SCRAM authentication. For PLAIN, Kafka only needs to know if the
> password is valid for the user. We want to support external authentication
> servers whose interface is to validate password, not retrieve it.
>
> 4. Added code of ScramCredential to the KIP.
>
>
> On Wed, Dec 14, 2016 at 3:54 PM, Ismael Juma  wrote:
>
> > Thanks Rajini, that helps. A few comments:
> >
> > 1. The `AuthCallbackHandler` interface already exists and we are making
> > breaking changes (removing a parameter from `configure` and adding
> > additional methods). Is the reasoning that it was not a public interface
> > before? It would be good to clearly separate public versus non-public
> > interfaces in the security code (and we should tweak Gradle to publish
> > javadoc for the public ones).
> >
> > 2. It seems like there is an ordering when it comes to the invocation of
> > callbacks. At least the current code assumes that `NameCallback` is
> called
> > first. If I am interpreting this correctly, we should specify that
> > ordering.
> >
> > 3. The approach taken by `ScramCredentialCallback` is different than the
> > one taken by `PlainAuthenticateCallback`. The former lets the user pass
> the
> > credentials information while the latter passes the credentials and lets
> > the user do the authentication. It would be good to explain the
> > inconsistency.
> >
> > 4. We reference `ScramCredential` in a few places, so it would be good to
> > define that class too.
> >
> > Ismael
> >
> > On Wed, Dec 14, 2016 at 7:32 AM, Rajini Sivaram <
> > rajinisiva...@googlemail.com> wrote:
> >
> > > Have added sample callback handlers for PLAIN and SCRAM.
> > >
> > > On Tue, Dec 13, 2016 at 4:10 PM, Rajini Sivaram <
> > > rajinisiva...@googlemail.com> wrote:
> > >
> > > > Ismael,
> > > >
> > > > Thank you for the review. I will add an example.
> > > >
> > > > On Tue, Dec 13, 2016 at 1:07 PM, Ismael Juma 
> > wrote:
> > > >
> > > >> Hi Rajini,
> > > >>
> > > >> Thanks for the KIP. I think this is useful and users have asked for
> > > >> something like that. I like that you have a scenarios section, do
> you
> > > >> think
> > > >> you could provide a rough sketch of what a callback handler would
> look
> > > >> like
> > > >> for the first 2 scenarios? They seem to be the common ones, so it
> > would
> > > >> help to see a concrete example.
> > > >>
> > > >> Ismael
> > > >>
> > > >> On Tue, Oct 11, 2016 at 7:28 AM, Rajini Sivaram <
> > > >> rajinisiva...@googlemail.com> wrote:
> > > >>
> > > >> > Hi all,
> > > >> >
> > > >> > I have just created KIP-86 make callback handlers in SASL
> > configurable
> > > >> so
> > > >> > that credential providers for S

Re: Brokers crashing with OOME Map failed

2016-12-15 Thread Ismael Juma
Hi,

This is probably not a Kafka bug, but we should improve the information we
report in this case. Something along the lines of what Lucene did here:

https://issues.apache.org/jira/browse/LUCENE-5673

This error may be caused by a lack of sufficient unfragmented virtual
address space, or by overly restrictive virtual memory limits enforced by
the operating system, preventing Kafka from mapping a chunk. Kafka could be
asking for a chunk that is unnecessarily large, but the former seems more
likely.

Ismael
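For anyone hitting this, a first check worth making is the kernel's per-process mmap cap rather than the heap. A quick Java sketch of the relevant numbers (assumes a Linux host; the procfs paths are standard):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// "OutOfMemoryError: Map failed" usually means the process ran into a
// virtual-memory or mmap-count limit, not heap exhaustion.
public class MapLimitCheck {

    // Kernel cap on memory-mapped regions per process (vm.max_map_count).
    static long maxMapCount() throws IOException {
        return Long.parseLong(
                Files.readString(Path.of("/proc/sys/vm/max_map_count")).trim());
    }

    // Mappings currently held by this process; a broker near the cap is at risk,
    // since every log segment and index is memory-mapped.
    static long mappingsInUse() throws IOException {
        try (Stream<String> lines = Files.lines(Path.of("/proc/self/maps"))) {
            return lines.count();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.printf("vm.max_map_count=%d, mappings in use=%d%n",
                maxMapCount(), mappingsInUse());
    }
}
```

Run this (or the equivalent shell checks) as the broker user; raising vm.max_map_count is a common fix on hosts with many partitions.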

On Wed, Dec 14, 2016 at 12:03 PM, Zakee  wrote:

> Recently, we have seen our brokers crash with below errors, any idea what
> might be wrong here?  The brokers have been running for long with the same
> hosts/configs without this issue before. Is this something to do with new
> version 0.10.0.1 (which we upgraded recently) or could it be a h/w issue?
> 10 hosts are dedicated for one broker per host. Each host has 128 gb RAM
> and 20TB of storage mounts. Any pointers will help...
>
>
> [2016-12-12 02:49:58,134] FATAL [app=broker] [ReplicaFetcherThread-15-15]
> [ReplicaFetcherThread-15-15], Disk error while replicating data for
> mytopic-19 (kafka.server.ReplicaFetcherThread)
> kafka.common.KafkaStorageException: I/O exception in append to log ’
> mytopic-19'
> at kafka.log.Log.append(Log.scala:349)
> at kafka.server.ReplicaFetcherThread.processPartitionData(Repli
> caFetcherThread.scala:130)
> at kafka.server.ReplicaFetcherThread.processPartitionData(Repli
> caFetcherThread.scala:42)
> at kafka.server.AbstractFetcherThread$$anonfun$
> processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfu
> n$apply$2.apply(AbstractFetcherThread.scala:159)
> at kafka.server.AbstractFetcherThread$$anonfun$
> processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfu
> n$apply$2.apply(AbstractFetcherThread.scala:141)
> at scala.Option.foreach(Option.scala:257)
> at kafka.server.AbstractFetcherThread$$anonfun$
> processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(A
> bstractFetcherThread.scala:141)
> at kafka.server.AbstractFetcherThread$$anonfun$
> processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(A
> bstractFetcherThread.scala:138)
> at scala.collection.mutable.ResizableArray$class.foreach(Resiza
> bleArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.
> scala:48)
> at kafka.server.AbstractFetcherThread$$anonfun$
> processFetchRequest$2.apply$mcV$sp(AbstractFetcherThread.scala:138)
> at kafka.server.AbstractFetcherThread$$anonfun$
> processFetchRequest$2.apply(AbstractFetcherThread.scala:138)
> at kafka.server.AbstractFetcherThread$$anonfun$
> processFetchRequest$2.apply(AbstractFetcherThread.scala:138)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:234)
> at kafka.server.AbstractFetcherThread.processFetchRequest(Abstr
> actFetcherThread.scala:136)
> at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThr
> ead.scala:103)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> Caused by: java.io.IOException: Map failed
> at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:907)
> at kafka.log.AbstractIndex$$anonfun$resize$1.apply(AbstractInde
> x.scala:116)
> at kafka.log.AbstractIndex$$anonfun$resize$1.apply(AbstractInde
> x.scala:106)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:234)
> at kafka.log.AbstractIndex.resize(AbstractIndex.scala:106)
> at kafka.log.AbstractIndex$$anonfun$trimToValidSize$1.apply$
> mcV$sp(AbstractIndex.scala:160)
> at kafka.log.AbstractIndex$$anonfun$trimToValidSize$1.apply(
> AbstractIndex.scala:160)
> at kafka.log.AbstractIndex$$anonfun$trimToValidSize$1.apply(
> AbstractIndex.scala:160)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:234)
> at kafka.log.AbstractIndex.trimToValidSize(AbstractIndex.scala:
> 159)
> at kafka.log.Log.roll(Log.scala:772)
> at kafka.log.Log.maybeRoll(Log.scala:742)
> at kafka.log.Log.append(Log.scala:405)
> ... 16 more
> Caused by: java.lang.OutOfMemoryError: Map failed
> at sun.nio.ch.FileChannelImpl.map0(Native Method)
> at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:904)
> ... 28 more
>
>
> Thanks
> -Zakee


Re: [DISCUSS] KIP-86: Configurable SASL callback handlers

2016-12-15 Thread Rajini Sivaram
Ismael,

The reason for choosing CallbackHandler interface as the configurable
interface is flexibility. As you say, we could instead define a simpler
PlainCredentialProvider and ScramCredentialProvider. But that would tie
users to Kafka's SaslServer implementation for PLAIN and SCRAM.
SaslServer/SaslClient implementations are already pluggable using standard
Java security provider mechanism. Callback handlers are the configuration
mechanism for SaslServer/SaslClient. By making the handlers configurable,
SASL becomes fully configurable for mechanisms supported by Kafka as well
as custom mechanisms. From the 'Scenarios' section in the KIP, a simpler
PLAIN/SCRAM interface satisfies the first two, but configurable callback
handlers enable all five. I agree that most users don't require this level
of flexibility, but we have had discussions about custom mechanisms in the
past for integration with existing authentication servers. So I think it is
a feature worth supporting.

On Thu, Dec 15, 2016 at 2:21 PM, Ismael Juma  wrote:

> Thanks Rajini, your answers make sense to me. One more general point: we
> are following the JAAS callback architecture and exposing that to the user
> where the user has to write code like:
>
> @Override
> public void handle(Callback[] callbacks) throws IOException, UnsupportedCallbackException {
>     String username = null;
>     for (Callback callback : callbacks) {
>         if (callback instanceof NameCallback)
>             username = ((NameCallback) callback).getDefaultName();
>         else if (callback instanceof PlainAuthenticateCallback) {
>             PlainAuthenticateCallback plainCallback = (PlainAuthenticateCallback) callback;
>             boolean authenticated = authenticate(username, plainCallback.password());
>             plainCallback.authenticated(authenticated);
>         } else
>             throw new UnsupportedCallbackException(callback);
>     }
> }
>
> protected boolean authenticate(String username, char[] password) throws IOException {
>     if (username == null)
>         return false;
>     else {
>         String expectedPassword = JaasUtils.jaasConfig(LoginType.SERVER.contextName(),
>             "user_" + username, PlainLoginModule.class.getName());
>         return Arrays.equals(password, expectedPassword.toCharArray());
>     }
> }
>
> Since we need to create a new callback type for Plain, Scram and so on, is
> it really worth it to do it this way? For example, in the code above, the
> `authenticate` method could be the only thing the user has to implement and
> we could do the necessary work to unwrap the data from the various
> callbacks when interacting with the SASL API. More work for us, but a much
> more pleasant API for users. What are the drawbacks?
>
> Ismael
>
> On Thu, Dec 15, 2016 at 1:06 AM, Rajini Sivaram 
> wrote:
>
> > Ismael,
> >
> > 1. At the moment AuthCallbackHandler is not a public interface, so I am
> > assuming that it can be modified. Yes, agree that we should keep non-public
> > methods separate. Will do that as part of the implementation of this KIP.
> >
> > 2. Callback handlers do tend to depend on ordering, including those
> > included in the JVM and those in Kafka. I have specified the ordering in
> > the KIP. Will make sure it gets included in the documentation too.
> >
> > 3. Added a note to the KIP. Kafka needs access to the SCRAM credentials to
> > perform SCRAM authentication. For PLAIN, Kafka only needs to know if the
> > password is valid for the user. We want to support external authentication
> > servers whose interface is to validate a password, not retrieve it.
> >
> > 4. Added code of ScramCredential to the KIP.
> >
> >
> > On Wed, Dec 14, 2016 at 3:54 PM, Ismael Juma  wrote:
> >
> > > Thanks Rajini, that helps. A few comments:
> > >
> > > 1. The `AuthCallbackHandler` interface already exists and we are making
> > > breaking changes (removing a parameter from `configure` and adding
> > > additional methods). Is the reasoning that it was not a public interface
> > > before? It would be good to clearly separate public versus non-public
> > > interfaces in the security code (and we should tweak Gradle to publish
> > > javadoc for the public ones).
> > >
> > > 2. It seems like there is an ordering when it comes to the invocation of
> > > callbacks. At least the current code assumes that `NameCallback` is called
> > > first. If I am interpreting this correctly, we should specify that
> > > ordering.
> > >
> > > 3. The approach taken by `ScramCredentialCallback` is different from the
> > > one taken by `PlainAuthenticateCallback`. The former lets the user pass
> > > the credentials information while the latter passes the credentials and
> > > lets the user do the authentication. It would be good to explain the
> > > inconsistency.
> > >
> > > 4. We reference `ScramCredential` in a few places, so it would be go

[jira] [Commented] (KAFKA-4541) Add capability to create delegation token

2016-12-15 Thread Manikumar Reddy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15751744#comment-15751744
 ] 

Manikumar Reddy commented on KAFKA-4541:


[~singhashish] Does this JIRA's work include token storage in ZK and
notifications as well?

> Add capability to create delegation token
> -
>
> Key: KAFKA-4541
> URL: https://issues.apache.org/jira/browse/KAFKA-4541
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ashish K Singh
>Assignee: Ashish K Singh
>
> Add request/ response and server side handling to create delegation tokens.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2261: MINOR: Replace deepIterator/shallowIterator with d...

2016-12-15 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/2261

MINOR: Replace deepIterator/shallowIterator with deepEntries/shallowEntries

The latter return `Iterable` instead of `Iterator` so that enhanced foreach 
can be used
in Java.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka deepEntries-shallowEntries

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2261.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2261


commit e1c50825cb54f8971bc2507c6fe8fe1020a580f6
Author: Ismael Juma 
Date:   2016-12-15T16:17:27Z

Replace `deepIterator` and `shallowIterator` with `deepEntries` and 
`shallowEntries`

The latter return `Iterable` instead of `Iterator` so that enhanced foreach 
can be used
in Java.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-1696) Kafka should be able to generate Hadoop delegation tokens

2016-12-15 Thread Manikumar Reddy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15751936#comment-15751936
 ] 

Manikumar Reddy commented on KAFKA-1696:


Thanks for creating sub-tasks. Regarding authentication, we are still
discussing the credentials for the SCRAM mechanism. As mentioned on the
mailing list, I am planning to pass the tokenID and HMAC as username and
password. We now need to store SCRAM credentials in ZK along with the token
details. Please check whether this fits into your design. This work depends
on KAFKA-3751.
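
To make this concrete, a client-side JAAS configuration under this scheme might look roughly as follows. This is only a sketch: the login module name follows Kafka's SCRAM support (KIP-84), and exactly how a delegation token would be flagged versus a regular SCRAM credential was still under discussion at this point.

```
KafkaClient {
    org.apache.kafka.common.security.scram.ScramLoginModule required
    username="<delegation token ID>"
    password="<base64-encoded token HMAC>";
};
```

The broker would then use the tokenID (username) to look up the stored token HMAC and complete the SCRAM exchange against it.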

> Kafka should be able to generate Hadoop delegation tokens
> -
>
> Key: KAFKA-1696
> URL: https://issues.apache.org/jira/browse/KAFKA-1696
> Project: Kafka
>  Issue Type: New Feature
>  Components: security
>Reporter: Jay Kreps
>Assignee: Parth Brahmbhatt
>
> For access from MapReduce/etc jobs run on behalf of a user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2252: HOTFIX: fix state transition stuck on rebalance

2016-12-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2252


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [VOTE] 0.10.1.1 RC0

2016-12-15 Thread Guozhang Wang
Michael,

I am rolling out a new RC right now, stay tuned.


Guozhang


On Thu, Dec 15, 2016 at 2:34 AM, Michael Pearce 
wrote:

> Is there any update on this? When do we expect an RC1 for vote?
>
> We see KAFKA-4497 is marked Resolved.
>
> Cheers
> Mike
>
> On 13/12/2016, 00:59, "Guozhang Wang"  wrote:
>
> I see. Currently the upload command does not include the 2.12 version yet;
> I will manually add it in the next RC.
>
> Guozhang
>
> On Mon, Dec 12, 2016 at 4:47 PM, Bernard Leach <
> leac...@bouncycastle.org>
> wrote:
>
> > I found those ones but was hoping to see them in
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > so we can just point the maven build there for testing.
> >
> > > On 13 Dec 2016, at 11:38, Guozhang Wang 
> wrote:
> > >
> > > @Bernard
> > >
> > > You should be able to find the artifacts in the staging RC repository I
> > > listed above, named as "kafka_2.12-0.10.1.1.tgz".
> > >
> > > @Everyone
> > >
> > > Since KAFKA-4497 is a critical issue that anyone using compaction with
> > > compressed messages on server logs can possibly hit, I'm going to create
> > > a new RC after it is merged (presumably soon).
> > >
> > > If you discover any new problems in the meantime, let me know on
> this
> > > thread.
> > >
> > >
> > > Guozhang
> > >
> > >
> > > On Mon, Dec 12, 2016 at 2:51 PM, Ismael Juma 
> wrote:
> > >
> > >> Yes, it would be good to include that (that's why I set the fix
> version
> > to
> > >> 0.10.1.1 a couple of days ago).
> > >>
> > >> Ismael
> > >>
> > >> On Mon, Dec 12, 2016 at 3:47 PM, Moczarski, Swen <
> > >> smoczar...@ebay-kleinanzeigen.de> wrote:
> > >>
> > >>> -0 (non-binding)
> > >>>
> > >>> Would it make sense to include
> > >>> https://issues.apache.org/jira/browse/KAFKA-4497 ? The issue sounds
> > >>> quite critical and a patch for 0.10.1.1 seems to be available.
> > >>>
> > >>> On 2016-12-07 23:46 (+0100), Guozhang Wang wrote:
> >  Hello Kafka users, developers and client-developers,>
> > 
> >  This is the first candidate for the release of Apache Kafka
> 0.10.1.1.
> > >>> This is>
> >  a bug fix release and it includes fixes and improvements from 27
> > JIRAs.
> > >>> See>
> >  the release notes for more details:>
> > 
> >  http://home.apache.org/~guozhang/kafka-0.10.1.1-rc0/
> > RELEASE_NOTES.html
> > >>>
> > 
> >  *** Please download, test and vote by Monday, 13 December, 8am
> PT ***>
> > 
> >  Kafka's KEYS file containing PGP keys we use to sign the
> release:>
> >  http://kafka.apache.org/KEYS>
> > 
> >  * Release artifacts to be voted upon (source and binary):>
> >  http://home.apache.org/~guozhang/kafka-0.10.1.1-rc0/>
> > 
> >  * Maven artifacts to be voted upon:>
> >  https://repository.apache.org/content/groups/staging/org/
> > apache/kafka/
> > >>>
> > 
> >  * Javadoc:>
> >  http://home.apache.org/~guozhang/kafka-0.10.1.1-rc0/javadoc/>
> > 
> >  * Tag to be voted upon (off 0.10.0 branch) is the 0.10.0.1-rc0
> tag:>
> >  https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=
> > >>> 8b77507083fdd427ce81021228e7e346da0d814c>
> > 
> > 
> >  Thanks,>
> >  Guozhang>
> > 
> > >>>
> > >>
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> >
> >
>
>
> --
> -- Guozhang
>
>
>



-- 
-- Guozhang


[jira] [Resolved] (KAFKA-4529) tombstone may be removed earlier than it should

2016-12-15 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-4529.
--
Resolution: Fixed

Marked as fixed for the 0.10.1.1 release process.

> tombstone may be removed earlier than it should
> ---
>
> Key: KAFKA-4529
> URL: https://issues.apache.org/jira/browse/KAFKA-4529
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.0
>Reporter: Jun Rao
>Assignee: Jiangjie Qin
> Fix For: 0.10.1.1
>
>
> As part of KIP-33, we introduced a regression on how tombstone is removed in 
> a compacted topic. We want to delay the removal of a tombstone to avoid the 
> case that a reader first reads a non-tombstone message on a key and then 
> doesn't see the tombstone for the key because it's deleted too quickly. So, a 
> tombstone is supposed to only be removed from a compacted topic after the 
> tombstone is part of the cleaned portion of the log after delete.retention.ms.
> Before KIP-33, deleteHorizonMs in LogCleaner is calculated based on the last 
> modified time, which is monotonically increasing from old to new segments. 
> With KIP-33, deleteHorizonMs is calculated based on the message timestamp, 
> which is not necessarily monotonically increasing.
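
The distinction can be seen with a tiny simulation: file last-modified times advance as segments roll, but the maximum message timestamp per segment is under client control and need not. The values below are invented purely for illustration.

```java
// Demonstrates the property KAFKA-4529 hinges on: segment last-modified times
// are non-decreasing from old to new segments, while per-segment max message
// timestamps (client-supplied) need not be.
public class TimestampMonotonicityDemo {
    static boolean isNonDecreasing(long[] xs) {
        for (int i = 1; i < xs.length; i++)
            if (xs[i] < xs[i - 1]) return false;
        return true;
    }

    public static void main(String[] args) {
        long[] lastModifiedMs = {1_000L, 2_000L, 3_000L};        // advances as segments roll
        long[] maxMessageTimestampMs = {1_000L, 2_500L, 1_500L}; // newest segment holds an older timestamp

        System.out.println("lastModified non-decreasing: " + isNonDecreasing(lastModifiedMs));
        System.out.println("messageTs non-decreasing:    " + isNonDecreasing(maxMessageTimestampMs));
        // A delete horizon derived from message timestamps can therefore make a
        // tombstone in a newer segment eligible for removal before one in an
        // older segment, unlike a horizon derived from last-modified times.
    }
}
```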



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3540) KafkaConsumer.close() may block indefinitely

2016-12-15 Thread Sriharsha Chintalapani (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriharsha Chintalapani updated KAFKA-3540:
--
Assignee: Manikumar Reddy

> KafkaConsumer.close() may block indefinitely
> 
>
> Key: KAFKA-3540
> URL: https://issues.apache.org/jira/browse/KAFKA-3540
> Project: Kafka
>  Issue Type: Bug
>Reporter: Oleg Zhurakousky
>Assignee: Manikumar Reddy
>
> KafkaConsumer API doc states 
> {code}
> Close the consumer, waiting indefinitely for any needed cleanup. . . . 
> {code}
> That is not acceptable, as it creates an artificial deadlock which directly
> affects systems that rely on the Kafka API, essentially rendering them
> unavailable. Consider adding a _close(timeout)_ method.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3539) KafkaProducer.send() may block even though it returns the Future

2016-12-15 Thread Sriharsha Chintalapani (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriharsha Chintalapani updated KAFKA-3539:
--
Assignee: Manikumar Reddy

> KafkaProducer.send() may block even though it returns the Future
> 
>
> Key: KAFKA-3539
> URL: https://issues.apache.org/jira/browse/KAFKA-3539
> Project: Kafka
>  Issue Type: Bug
>Reporter: Oleg Zhurakousky
>Assignee: Manikumar Reddy
>Priority: Critical
>
> You can get more details from the us...@kafka.apache.org by searching on the 
> thread with the subject "KafkaProducer block on send".
> The bottom line is that a method that returns a Future must never block,
> since that essentially violates the Future contract: it was specifically
> designed to return immediately, passing control back to the user to check
> for completion, cancel, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2260: KAFKA-4529; LogCleaner should not delete the tombs...

2016-12-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2260


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4529) tombstone may be removed earlier than it should

2016-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15752230#comment-15752230
 ] 

ASF GitHub Bot commented on KAFKA-4529:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2260


> tombstone may be removed earlier than it should
> ---
>
> Key: KAFKA-4529
> URL: https://issues.apache.org/jira/browse/KAFKA-4529
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.0
>Reporter: Jun Rao
>Assignee: Jiangjie Qin
> Fix For: 0.10.1.1
>
>
> As part of KIP-33, we introduced a regression on how tombstone is removed in 
> a compacted topic. We want to delay the removal of a tombstone to avoid the 
> case that a reader first reads a non-tombstone message on a key and then 
> doesn't see the tombstone for the key because it's deleted too quickly. So, a 
> tombstone is supposed to only be removed from a compacted topic after the 
> tombstone is part of the cleaned portion of the log after delete.retention.ms.
> Before KIP-33, deleteHorizonMs in LogCleaner is calculated based on the last 
> modified time, which is monotonically increasing from old to new segments. 
> With KIP-33, deleteHorizonMs is calculated based on the message timestamp, 
> which is not necessarily monotonically increasing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4477) Node reduces its ISR to itself, and doesn't recover. Other nodes do not take leadership, cluster remains sick until node is restarted.

2016-12-15 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15752252#comment-15752252
 ] 

Jun Rao commented on KAFKA-4477:


[~michael.andre.pearce], the deadlock seems to be the same as in KAFKA-3994, 
which will be fixed in 0.10.1.1. We will try rolling 0.10.1.1 RC1 today.

> Node reduces its ISR to itself, and doesn't recover. Other nodes do not take 
> leadership, cluster remains sick until node is restarted.
> --
>
> Key: KAFKA-4477
> URL: https://issues.apache.org/jira/browse/KAFKA-4477
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.0
> Environment: RHEL7
> java version "1.8.0_66"
> Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
> Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
>Reporter: Michael Andre Pearce (IG)
>Assignee: Apurva Mehta
>Priority: Critical
>  Labels: reliability
> Attachments: 2016_12_15.zip, issue_node_1001.log, 
> issue_node_1001_ext.log, issue_node_1002.log, issue_node_1002_ext.log, 
> issue_node_1003.log, issue_node_1003_ext.log, kafka.jstack, 
> state_change_controller.tar.gz
>
>
> We have encountered a critical issue that has recurred in different
> physical environments. We haven't worked out what is going on, though we
> have a nasty workaround to keep the service alive.
> We have not had this issue on clusters still running 0.9.0.1.
> We have noticed a node randomly shrinking the ISRs for the partitions it
> owns down to itself; moments later we see other nodes having disconnects,
> followed finally by application issues, where producing to these partitions
> is blocked.
> It seems that only restarting the Kafka Java process resolves the issue.
> We have had this occur multiple times and from all network and machine 
> monitoring the machine never left the network, or had any other glitches.
> Below are seen logs from the issue.
> Node 7:
> [2016-12-01 07:01:28,112] INFO Partition 
> [com_ig_trade_v1_position_event--demo--compacted,10] on broker 7: Shrinking 
> ISR for partition [com_ig_trade_v1_position_event--demo--compacted,10] from 
> 1,2,7 to 7 (kafka.cluster.Partition)
> All other nodes:
> [2016-12-01 07:01:38,172] WARN [ReplicaFetcherThread-0-7], Error in fetch 
> kafka.server.ReplicaFetcherThread$FetchRequest@5aae6d42 
> (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 7 was disconnected before the response was 
> read
> All clients:
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.NetworkException: The server disconnected 
> before a response was received.
> After this occurs, we then suddenly see on the sick machine an increasing
> number of CLOSE_WAIT connections and open file descriptors.
> As a workaround to keep service alive, we are currently putting in an
> automated process that tails the logs and matches the regex below; when
> new_partitions is just the node itself, we restart the node.
> "\[(?P.+)\] INFO Partition \[.*\] on broker .* Shrinking ISR for 
> partition \[.*\] from (?P.+) to (?P.+) 
> \(kafka.cluster.Partition\)"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4546) a consumer could miss tombstone when leader changes during the reads

2016-12-15 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-4546:
---
Affects Version/s: 0.10.1.0

This affects all versions of Kafka. I am not sure what the best way to
address it is, though. If anyone has an idea, please chime in.

> a consumer could miss tombstone when leader changes during the reads
> 
>
> Key: KAFKA-4546
> URL: https://issues.apache.org/jira/browse/KAFKA-4546
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1, 0.10.1.0
>Reporter: Jun Rao
>
> Currently, in a compacted topic, we delay the removal of a tombstone by 
> delete.retention.ms to avoid the case that a consumer reads a non-tombstone 
> before the tombstone, then tombstone is removed and the consumer misses the 
> tombstone.
> However, tombstones are removed independently in different replicas. Suppose 
> that a tombstone is already removed in the follower, but not in the leader. A
> consumer could read a non-tombstone message on the leader, then the leader 
> changes, and the consumer will miss the tombstone on the new leader since 
> it's already deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-66 Kafka Connect Transformers for messages

2016-12-15 Thread Shikhar Bhushan
I have updated KIP-66
https://cwiki.apache.org/confluence/display/KAFKA/KIP-66%3A+Single+Message+Transforms+for+Kafka+Connect
with
the changes I proposed in the design.

Gwen, I think the main downside to not including some transformations with
Kafka Connect is that it seems less user friendly if folks have to make
sure to have the right transformation(s) on the classpath as well, besides
their connector(s). Additionally by going in with a small set included, we
can encourage a consistent configuration and implementation style and
provide utilities for e.g. data transformations, which I expect we will
definitely need (discussed under 'Patterns for data transformations').

It does get hard to draw the line once you go from 'none' to 'some'. To get
discussion going, if we get agreement on 'none' vs 'some', I added a table
under 'Bundled transformations' for transformations which I think are worth
including.

For many of these, I have noticed their absence in the wild as a pain point
--
TimestampRouter:
https://github.com/confluentinc/kafka-connect-elasticsearch/issues/33
Mask:
https://groups.google.com/d/msg/confluent-platform/3yHb8_mCReQ/sTQc3dNgBwAJ
Insert:
http://stackoverflow.com/questions/40664745/elasticsearch-connector-for-kafka-connect-offset-and-timestamp
RegexRouter:
https://groups.google.com/d/msg/confluent-platform/yEBwu1rGcs0/gIAhRp6kBwAJ
NumericCast:
https://github.com/confluentinc/kafka-connect-jdbc/issues/101#issuecomment-249096119
TimestampConverter:
https://groups.google.com/d/msg/confluent-platform/gGAOsw3Qeu4/8JCqdDhGBwAJ
ValueToKey: https://github.com/confluentinc/kafka-connect-jdbc/pull/166

In other cases, their functionality is already being implemented by
connectors in divergent ways: RegexRouter, Insert, Replace, HoistToStruct,
ExtractFromStruct
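
To make the shape of these concrete, here is a self-contained sketch of a map-style single message transform in the spirit of RegexRouter. `Transformation` and `SimpleRecord` are stand-ins invented for this example; the real Connect API (built on `ConnectRecord`, configured via `ConfigDef`) was still being finalized in the KIP.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal map/filter transform contract, in the spirit of KIP-66.
interface Transformation<R> {
    R apply(R record); // return null to filter the record out
}

// Simplified record stand-in; real transforms operate on ConnectRecord.
class SimpleRecord {
    final String topic;
    final Object value;
    SimpleRecord(String topic, Object value) { this.topic = topic; this.value = value; }
}

// Rewrites the topic name when it matches a regex, e.g. to strip a suffix
// before handing the record to a sink connector.
public class RegexRouter implements Transformation<SimpleRecord> {
    private final Pattern pattern;
    private final String replacement;

    RegexRouter(String regex, String replacement) {
        this.pattern = Pattern.compile(regex);
        this.replacement = replacement;
    }

    @Override
    public SimpleRecord apply(SimpleRecord record) {
        Matcher m = pattern.matcher(record.topic);
        // Route to a new topic when the pattern matches; pass through otherwise.
        return m.matches() ? new SimpleRecord(m.replaceAll(replacement), record.value) : record;
    }
}
```

A chain of such transforms applied between a source connector and the producer (or the consumer and a sink connector) is the core of the proposal.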

On Wed, Dec 14, 2016 at 6:00 PM Gwen Shapira  wrote:

I'm a bit concerned about adding transformations in Kafka. NiFi has 150
processors, presumably they are all useful for someone. I don't know if I'd
want all of that in Apache Kafka. What's the downside of keeping it out? Or
at least keeping the built-in set super minimal (Flume has like 3 built-in
interceptors)?

Gwen

On Wed, Dec 14, 2016 at 1:36 PM, Shikhar Bhushan 
wrote:

> With regard to a), just using `ConnectRecord` with `newRecord` as a new
> abstract method would be a fine choice. In prototyping, both options end up
> looking pretty similar (in terms of how transformations are implemented and
> the runtime initializes and uses them) and I'm starting to lean towards not
> adding a new interface into the mix.
>
> On b) I think we should include a small set of useful transformations with
> Connect, since they can be applicable across different connectors and we
> should encourage some standardization for common operations. I'll update
> KIP-66 soon including a spec of transformations that I believe are worth
> including.
>
> On Sat, Dec 10, 2016 at 11:52 PM Ewen Cheslack-Postava 
> wrote:
>
> If anyone has time to review here, it'd be great to get feedback. I'd
> imagine that the proposal itself won't be too controversial -- keeps
> transformations simple (by only allowing map/filter), doesn't affect the
> rest of the framework much, and fits in with the general config structure
> we've used elsewhere (although ConfigDef could use some updates to make this
> easier...).
>
> I think the main open questions for me are:
>
> a) Is TransformableRecord worth it to avoid reimplementing small bits of
> code (it allows for a single implementation of the interface to trivially
> apply to both Source and SinkRecords). I think I prefer this, but it does
> come with some commitment to another interface on top of ConnectRecord. We
> could alternatively modify ConnectRecord which would require fewer changes.
> b) How do folks feel about built-in transformations and the set that are
> mentioned here? This brings us way back to the discussion of built-in
> connectors. Transformations, especially when intended to be lightweight and
> touch nothing besides the data already in the record, seem different from
> connectors -- there might be quite a few, but hopefully limited. Since we
> (hopefully) already factor out most serialization-specific stuff via
> Converters, I think we can keep this pretty limited. That said, I have no
> doubt some folks will (in my opinion) abuse this feature to do data
> enrichment by querying external systems, so building a bunch of
> transformations in could potentially open the floodgates, or at least make
> decisions about what is included vs what should be 3rd party muddy.
>
> -Ewen
>
>
> On Wed, Dec 7, 2016 at 11:46 AM, Shikhar Bhushan 
> wrote:
>
> > Hi all,
> >
> > I have another iteration at a proposal for this feature here:
> > https://cwiki.apache.org/confluence/display/KAFKA/
> > Connect+Transforms+-+Proposed+Design
> >
> > I'd welcome your feedback and comments.
> >
> > Thanks,
> >
> > Shikhar
> >
> > On Tue, Aug 2, 2016 at 7:21 PM Ewen Cheslack-Postava 
> > wrote:

Jenkins build is back to normal : kafka-trunk-jdk8 #1104

2016-12-15 Thread Apache Jenkins Server
See 



[jira] [Assigned] (KAFKA-4547) Consumer.position returns incorrect results for Kafka 0.10.1.0 client

2016-12-15 Thread Vahid Hashemian (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vahid Hashemian reassigned KAFKA-4547:
--

Assignee: Vahid Hashemian

> Consumer.position returns incorrect results for Kafka 0.10.1.0 client
> -
>
> Key: KAFKA-4547
> URL: https://issues.apache.org/jira/browse/KAFKA-4547
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.1.0
> Environment: Windows Kafka 0.10.1.0
>Reporter: Pranav Nakhe
>Assignee: Vahid Hashemian
>  Labels: clients
>
> Consider the following code -
> KafkaConsumer consumer = new KafkaConsumer(props);
> List listOfPartitions = new ArrayList();
> for (int i = 0; i < consumer.partitionsFor("IssueTopic").size(); i++) {
>     listOfPartitions.add(new TopicPartition("IssueTopic", i));
> }
> consumer.assign(listOfPartitions);
> consumer.pause(listOfPartitions);
> consumer.seekToEnd(listOfPartitions);
> // consumer.resume(listOfPartitions); -- commented out
> for (int i = 0; i < listOfPartitions.size(); i++) {
>     System.out.println(consumer.position(listOfPartitions.get(i)));
> }
>   
> I have created a topic IssueTopic with 3 partitions with a single replica on 
> my single node kafka installation (0.10.1.0)
> The behavior noticed for Kafka client 0.10.1.0 as against Kafka client 
> 0.10.0.1
> A) Initially when there are no messages on IssueTopic running the above 
> program returns
> 0.10.1.0: 0, 0, 0
> 0.10.0.1: 0, 0, 0
> B) Next I send 6 messages and see that the messages have been evenly 
> distributed across the three partitions. Running the above program now 
> returns 
> 0.10.1.0: 0, 0, 2
> 0.10.0.1: 2, 2, 2
> Clearly there is a difference in behavior for the 2 clients.
> Now after seekToEnd call if I make a call to resume (uncomment the resume 
> call in code above) then the behavior is
> 0.10.1.0: 2, 2, 2
> 0.10.0.1: 2, 2, 2
> This is an issue I came across when using the spark kafka integration for 
> 0.10. When I use kafka 0.10.1.0 I started seeing this issue. I had raised a 
> pull request to resolve that issue [SPARK-18779] but when looking at the 
> kafka client implementation/documentation now it seems the issue is with 
> kafka and not with spark. There does not seem to be any documentation which 
> specifies/implies that we need to call resume after seekToEnd for position to 
> return the correct value. Also there is a clear difference in the behavior in 
> the two kafka client implementations. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4521) MirrorMaker should flush all messages before releasing partition ownership during rebalance

2016-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15752550#comment-15752550
 ] 

ASF GitHub Bot commented on KAFKA-4521:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2241


> MirrorMaker should flush all messages before releasing partition ownership 
> during rebalance
> ---
>
> Key: KAFKA-4521
> URL: https://issues.apache.org/jira/browse/KAFKA-4521
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
>
> In order to ensure that messages from a given partition in the source cluster
> are mirrored to the same partition in the destination cluster in the *same*
> order, MirrorMaker needs to produce and flush all messages that its consumer
> has received from the source cluster before giving up ownership of the
> partition. However, in the current implementation of Apache Kafka this is not
> guaranteed, which causes out-of-order message delivery in the following
> scenario.
> - mirror maker process 1 fetches messages 1, 2, 3, from source cluster in one 
> FetchRequest
> - mirror maker process 2 starts up and triggers rebalance.
> - `InternalRebalanceListenerForOldConsumer` is executed by zookeeper listener 
> thread, which does producer.flush() and commit offset of this consumer. 
> However, at this moment messages 1, 2, 3 haven't even been produced.
> - consumer of mirror maker process 1 releases ownership of this partition
> - consumer of mirror maker process 2 gets ownership of this partition.
> - mirror maker process 2 fetches messages 4, 5, 6 from source cluster.
> - messages 4, 5, 6 can be produced before messages 1, 2, 3. 
> To fix this problem, the rebalance listener callback function should signal 
> MirrorMakerThread to get all messages from consumer, produce these messages 
> to destination cluster, flush producer, and commit offset. Rebalance listener 
> callback function should wait for MirrorMakerThread to finish these steps 
> before it allows ownership of this partition to be released.
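The race described above can be made concrete with a minimal simulation. This is illustrative Python only, not MirrorMaker code (the real logic lives in Kafka's Scala/Java codebase); `rebalance`, `p1_buffer`, and `p2_messages` are hypothetical names modeling the handover of one partition from mirror maker 1 to mirror maker 2:

```python
def rebalance(dest_log, p1_buffer, p2_messages, flush_before_release):
    """Model partition handover from mirror maker 1 (MM1) to MM2.

    p1_buffer: messages MM1's consumer has fetched but MM1's producer
    has not yet flushed to the destination cluster.
    """
    if flush_before_release:
        # The proposed fix: the rebalance listener waits for MM1 to
        # produce and flush everything it has fetched before ownership
        # of the partition is released.
        dest_log.extend(p1_buffer)
        p1_buffer = []
    # Ownership moves to MM2, which fetches and mirrors the next batch.
    dest_log.extend(p2_messages)
    # Without the barrier, MM1's producer flushes its buffer only later,
    # after MM2 has already produced newer messages.
    dest_log.extend(p1_buffer)
    return dest_log

buggy = rebalance([], [1, 2, 3], [4, 5, 6], flush_before_release=False)
fixed = rebalance([], [1, 2, 3], [4, 5, 6], flush_before_release=True)
print(buggy)  # [4, 5, 6, 1, 2, 3] -- out of order
print(fixed)  # [1, 2, 3, 4, 5, 6]
```

The key point is that the flush must happen strictly before ownership is released, which is why the rebalance listener has to block on the MirrorMakerThread rather than run concurrently with it.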



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2241: KAFKA-4521; MirrorMaker should flush all messages ...

2016-12-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2241


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[VOTE] 0.10.1.1 RC1

2016-12-15 Thread Guozhang Wang
Hello Kafka users, developers and client-developers,

This is the second, and hopefully the last candidate for the release of
Apache Kafka 0.10.1.1 before the break. This is a bug fix release and it
includes fixes and improvements from 30 JIRAs. See the release notes for
more details:

http://home.apache.org/~guozhang/kafka-0.10.1.1-rc1/RELEASE_NOTES.html

*** Please download, test and vote by Tuesday, 20 December, 8pm PT ***

Kafka's KEYS file containing PGP keys we use to sign the release:
http://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
http://home.apache.org/~guozhang/kafka-0.10.1.1-rc1/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/org/apache/kafka/

NOTE the artifacts include ones built with Scala 2.12.1 and Java 8, which
are treated as pre-alpha artifacts for the Scala community to try out and
test:

https://repository.apache.org/content/groups/staging/org/apache/kafka/kafka_2.12/0.10.1.1/

We will formally add Scala 2.12 support in future minor releases.


* Javadoc:
http://home.apache.org/~guozhang/kafka-0.10.1.1-rc1/javadoc/

* Tag to be voted upon (off 0.10.0 branch) is the 0.10.0.1-rc0 tag:
https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=c3638376708ee6c02dfe4e57747acae0126fa6e7


Thanks,
Guozhang

-- 
-- Guozhang


Re: [VOTE] 0.10.1.1 RC0

2016-12-15 Thread Becket Qin
Hey Guozhang,

Thanks for running the release.
KAFKA-4521 is just checked in. It fixes a bug in mirror maker that may
result in message loss. Can we include that in 0.10.1.1 as well?

Thanks,

Jiangjie (Becket) Qin

On Thu, Dec 15, 2016 at 9:46 AM, Guozhang Wang  wrote:

> Michael,
>
> I am rolling out a new RC right now, stay tuned.
>
>
> Guozhang
>
>
> On Thu, Dec 15, 2016 at 2:34 AM, Michael Pearce 
> wrote:
>
> > Is there any update on this, when do we expect an RC1 for vote?
> >
> > We see KAFKA-4497 is marked Resolved.
> >
> > Cheers
> > Mike
> >
> > On 13/12/2016, 00:59, "Guozhang Wang"  wrote:
> >
> > I see. Currently the upload command has not included the 2.12 version
> > yet,
> > I will manually do that in the next RC.
> >
> > Guozhang
> >
> > On Mon, Dec 12, 2016 at 4:47 PM, Bernard Leach <
> > leac...@bouncycastle.org>
> > wrote:
> >
> > > I found those ones but was hoping to see them in
> > > https://repository.apache.org/content/groups/staging/org/
> > apache/kafka/ <
> > > https://repository.apache.org/content/groups/staging/org/
> > apache/kafka/>
> > > so we can just point the maven build there for testing.
> > >
> > > > On 13 Dec 2016, at 11:38, Guozhang Wang 
> > wrote:
> > > >
> > > > @Bernard
> > > >
> > > > You should be able to find the artifacts in the staging RC
> > repository I
> > > > listed above, named as "kafka_2.12-0.10.1.1.tgz
> > > >  > > kafka_2.12-0.10.1.1.tgz>
> > > > ".
> > > >
> > > > @Everyone
> > > >
> > > > Since KAFKA-4497 is a critical issue, that anyone using
> compaction
> > with
> > > > compressed message on server logs can possibly hit it, I'm going
> to
> > > create
> > > > a new RC after it is merged (presumably soon).
> > > >
> > > > If you discover any new problems in the meantime, let me know on
> > this
> > > > thread.
> > > >
> > > >
> > > > Guozhang
> > > >
> > > >
> > > > On Mon, Dec 12, 2016 at 2:51 PM, Ismael Juma 
> > wrote:
> > > >
> > > >> Yes, it would be good to include that (that's why I set the fix
> > version
> > > to
> > > >> 0.10.1.1 a couple of days ago).
> > > >>
> > > >> Ismael
> > > >>
> > > >> On Mon, Dec 12, 2016 at 3:47 PM, Moczarski, Swen <
> > > >> smoczar...@ebay-kleinanzeigen.de> wrote:
> > > >>
> > > >>> -0 (non-binding)
> > > >>>
> > > >>> Would it make sense to include https://issues.apache.org/
> > > >>> jira/browse/KAFKA-4497 ? The issue sounds quite critical and a
> > patch
> > > for
> > > >>> 0.10.1.1 seems to be available.
> > > >>>
> > > >>> On 2016-12-07 23:46 (+0100), Guozhang Wang wrote:
> > >  Hello Kafka users, developers and client-developers,>
> > > 
> > >  This is the first candidate for the release of Apache Kafka
> > 0.10.1.1.
> > > >>> This is>
> > >  a bug fix release and it includes fixes and improvements from
> 27
> > > JIRAs.
> > > >>> See>
> > >  the release notes for more details:>
> > > 
> > >  http://home.apache.org/~guozhang/kafka-0.10.1.1-rc0/
> > > RELEASE_NOTES.html
> > > >>>
> > > 
> > >  *** Please download, test and vote by Monday, 13 December, 8am
> > PT ***>
> > > 
> > >  Kafka's KEYS file containing PGP keys we use to sign the
> > release:>
> > >  http://kafka.apache.org/KEYS>
> > > 
> > >  * Release artifacts to be voted upon (source and binary):>
> > >  http://home.apache.org/~guozhang/kafka-0.10.1.1-rc0/>
> > > 
> > >  * Maven artifacts to be voted upon:>
> > >  https://repository.apache.org/content/groups/staging/org/
> > > apache/kafka/
> > > >>>
> > > 
> > >  * Javadoc:>
> > >  http://home.apache.org/~guozhang/kafka-0.10.1.1-rc0/javadoc/>
> > > 
> > >  * Tag to be voted upon (off 0.10.0 branch) is the 0.10.0.1-rc0
> > tag:>
> > >  https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=
> > > >>> 8b77507083fdd427ce81021228e7e346da0d814c>
> > > 
> > > 
> > >  Thanks,>
> > >  Guozhang>
> > > 
> > > >>>
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > >
> > >
> >
> >
> > --
> > -- Guozhang
> >
> >

Re: [DISCUSS] KIP-102 - Add close with timeout for consumers

2016-12-15 Thread Ewen Cheslack-Postava
Rajini,

Thanks for this KIP, I'd definitely like to see this. Connect has had a
long-standing TODO around stopping sink tasks where we can't properly
manage the rebalance process (which involves stopping consumers) because we
lack a timeout here. Not a huge problem in practice, but would be nice to
fix. And I know I've seen at least a half dozen requests for this (and
other timeouts to be respected in the consumer) on the mailing list.

My only concern is that this could be a pretty substantial change to the
consumer code. We've had TODO items in the code since Jay wrote the first
version that say something about avoiding infinite looping and waiting
indefinitely on group membership. If we implement this, I'd hope to get it
committed in the early part of the release cycle so we have plenty of time
to stabilize + bug fix.

-Ewen
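The semantics being proposed — bound the graceful shutdown work by a deadline, then force-close so the caller is never blocked indefinitely — can be sketched as follows. This is an illustrative Python model under stated assumptions, not the KafkaConsumer implementation; `close_with_timeout`, `force_close`, and the named steps are hypothetical:

```python
import time

def force_close():
    # Unconditionally release sockets and buffers, graceful or not.
    pass

def close_with_timeout(graceful_steps, timeout_s):
    """Attempt graceful shutdown steps (e.g. commit offsets, leave the
    group) until the deadline expires, then force-close regardless.
    Returns the names of the steps that completed in time."""
    deadline = time.monotonic() + timeout_s
    completed = []
    for name, step in graceful_steps:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # deadline hit: skip the remaining graceful work
        try:
            step(remaining)  # each step gets only the remaining budget
            completed.append(name)
        except TimeoutError:
            break
    force_close()  # resources are always released, bounded by the timeout
    return completed

steps = [("commit_offsets", lambda remaining: None),
         ("leave_group", lambda remaining: None)]
print(close_with_timeout(steps, 5.0))  # ['commit_offsets', 'leave_group']
```

Passing the remaining budget into each step is what makes the overall bound hold even when an individual step (like a group-leave round trip) can block — exactly the property Connect needs when stopping sink tasks during a rebalance.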

On Thu, Dec 15, 2016 at 5:32 AM, Rajini Sivaram  wrote:

> Hi all,
>
> I have just created KIP-102 to add a new close method for consumers with a
> timeout parameter, making Consumer consistent with Producer:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 102+-+Add+close+with+timeout+for+consumers
>
> Comments and suggestions are welcome.
>
> Thank you...
>
> Regards,
>
> Rajini
>


[jira] [Created] (KAFKA-4548) Add CompatibilityTest to verify that individual features are supported or not by the broker we're connecting to

2016-12-15 Thread Colin P. McCabe (JIRA)
Colin P. McCabe created KAFKA-4548:
--

 Summary: Add CompatibilityTest to verify that individual features 
are supported or not by the broker we're connecting to
 Key: KAFKA-4548
 URL: https://issues.apache.org/jira/browse/KAFKA-4548
 Project: Kafka
  Issue Type: Sub-task
  Components: clients, system tests, unit tests
Reporter: Colin P. McCabe
Assignee: Colin P. McCabe


Add CompatibilityTest to verify that individual features are supported or not 
by the broker we're connecting to.  This can be used in a ducktape test to 
verify that the feature is present or absent.
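The shape of such a check can be sketched as below. This is a hedged illustration only — the feature names and version matrix here are hypothetical placeholders, not the actual CompatibilityTest or real broker capability data:

```python
# Hypothetical feature matrix: which features a given broker version
# supports. Illustrative values only, for the shape of the check.
BROKER_FEATURES = {
    "0.9.0":  {"offsets_for_times": False, "create_topics_request": False},
    "0.10.0": {"offsets_for_times": False, "create_topics_request": False},
    "0.10.1": {"offsets_for_times": True,  "create_topics_request": True},
}

def check_compatibility(broker_version, required_features):
    """Return the list of required features the broker does NOT support,
    so a ducktape test can assert presence or absence per broker version."""
    supported = BROKER_FEATURES.get(broker_version, {})
    return [f for f in required_features if not supported.get(f, False)]

print(check_compatibility("0.10.0", ["offsets_for_times"]))
# ['offsets_for_times'] -- feature absent on this broker
print(check_compatibility("0.10.1", ["offsets_for_times"]))
# [] -- all required features present
```

A ducktape test would then assert that the "missing" list is empty against new brokers and non-empty against old ones.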



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] 0.10.1.1 RC0

2016-12-15 Thread Guozhang Wang
Hey Becket,

I just cut the release this morning and the RC1 is out a few minutes ago so
that we can possibly have the release out before the break. I looked
through https://issues.apache.org/jira/browse/KAFKA-4521 and feel it is OK
to have it in the next minor release happening in a month or so. Does it
sound good to you?


Guozhang


On Thu, Dec 15, 2016 at 1:30 PM, Becket Qin  wrote:

> Hey Guozhang,
>
> Thanks for running the release.
> KAFKA-4521 is just checked in. It fixes a bug in mirror maker that may
> result in message loss. Can we include that in 0.10.1.1 as well?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Thu, Dec 15, 2016 at 9:46 AM, Guozhang Wang  wrote:
>
> > Michael,
> >
> > I am rolling out a new RC right now, stay tuned.
> >
> >
> > Guozhang
> >
> >
> > On Thu, Dec 15, 2016 at 2:34 AM, Michael Pearce 
> > wrote:
> >
> > > Is there any update on this, when do we expect an RC1 for vote?
> > >
> > > We see KAFKA-4497 is marked Resolved.
> > >
> > > Cheers
> > > Mike
> > >
> > > On 13/12/2016, 00:59, "Guozhang Wang"  wrote:
> > >
> > > I see. Currently the upload command has not included the 2.12
> version
> > > yet,
> > > I will manually do that in the next RC.
> > >
> > > Guozhang
> > >
> > > On Mon, Dec 12, 2016 at 4:47 PM, Bernard Leach <
> > > leac...@bouncycastle.org>
> > > wrote:
> > >
> > > > I found those ones but was hoping to see them in
> > > > https://repository.apache.org/content/groups/staging/org/
> > > apache/kafka/ <
> > > > https://repository.apache.org/content/groups/staging/org/
> > > apache/kafka/>
> > > > so we can just point the maven build there for testing.
> > > >
> > > > > On 13 Dec 2016, at 11:38, Guozhang Wang 
> > > wrote:
> > > > >
> > > > > @Bernard
> > > > >
> > > > > You should be able to find the artifacts in the staging RC
> > > repository I
> > > > > listed above, named as "kafka_2.12-0.10.1.1.tgz
> > > > >  > > > kafka_2.12-0.10.1.1.tgz>
> > > > > ".
> > > > >
> > > > > @Everyone
> > > > >
> > > > > Since KAFKA-4497 is a critical issue, that anyone using
> > compaction
> > > with
> > > > > compressed message on server logs can possibly hit it, I'm
> going
> > to
> > > > create
> > > > > a new RC after it is merged (presumably soon).
> > > > >
> > > > > If you discover any new problems in the meantime, let me know
> on
> > > this
> > > > > thread.
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > >
> > > > > On Mon, Dec 12, 2016 at 2:51 PM, Ismael Juma <
> ism...@juma.me.uk>
> > > wrote:
> > > > >
> > > > >> Yes, it would be good to include that (that's why I set the
> fix
> > > version
> > > > to
> > > > >> 0.10.1.1 a couple of days ago).
> > > > >>
> > > > >> Ismael
> > > > >>
> > > > >> On Mon, Dec 12, 2016 at 3:47 PM, Moczarski, Swen <
> > > > >> smoczar...@ebay-kleinanzeigen.de> wrote:
> > > > >>
> > > > >>> -0 (non-binding)
> > > > >>>
> > > > >>> Would it make sense to include https://issues.apache.org/
> > > > >>> jira/browse/KAFKA-4497 ? The issue sounds quite critical and
> a
> > > patch
> > > > for
> > > > >>> 0.10.1.1 seems to be available.
> > > > >>>
> > > > >>> On 2016-12-07 23:46 (+0100), Guozhang Wang wrote:
> > > >  Hello Kafka users, developers and client-developers,>
> > > > 
> > > >  This is the first candidate for the release of Apache Kafka
> > > 0.10.1.1.
> > > > >>> This is>
> > > >  a bug fix release and it includes fixes and improvements
> from
> > 27
> > > > JIRAs.
> > > > >>> See>
> > > >  the release notes for more details:>
> > > > 
> > > >  http://home.apache.org/~guozhang/kafka-0.10.1.1-rc0/
> > > > RELEASE_NOTES.html
> > > > >>>
> > > > 
> > > >  *** Please download, test and vote by Monday, 13 December,
> 8am
> > > PT ***>
> > > > 
> > > >  Kafka's KEYS file containing PGP keys we use to sign the
> > > release:>
> > > >  http://kafka.apache.org/KEYS>
> > > > 
> > > >  * Release artifacts to be voted upon (source and binary):>
> > > >  http://home.apache.org/~guozhang/kafka-0.10.1.1-rc0/>
> > > > 
> > > >  * Maven artifacts to be voted upon:>
> > > >  https://repository.apache.org/content/groups/staging/org/
> > > > apache/kafka/
> > > > >>>
> > > > 
> > > >  * Javadoc:>
> > > >  http://home.apache.org/~guozha
> ng/kafka-0.10.1.1-rc0/javadoc/>
> > > > 
> > > >  * Tag to be voted upon (off 0.10.0 branch) is the
> 0.10.0.1-rc0
> > > tag:>
> > > >  https://git-wip-us.apache.org/
> repos/asf?p=kafka.git;a=tag;h=
> > > > >>> 8b77507083fdd427ce81021228e7e346da0d814c>
> > > > 
> 

Re: [DISCUSS] KIP-66 Kafka Connect Transformers for messages

2016-12-15 Thread Gwen Shapira
I agree about the ease of use in adding a small-subset of built-in
transformations.

But the same thing is true for connectors - there are maybe 5 super popular
OSS connectors and the rest is a very long tail. We drew the line at not
adding any, because that's the easiest and because we did not want to turn
Kafka into a collection of transformations.

I really don't want to end up with 135 (or even 20) transformations in
Kafka. So either we have a super-clear definition of what belongs and what
doesn't - or we put in one minimal example and the rest goes into the
ecosystem.

We can also start by putting transformations on github and just see if
there is huge demand for them in Apache. It is easier to add stuff to the
project later than to remove functionality.



On Thu, Dec 15, 2016 at 11:59 AM, Shikhar Bhushan 
wrote:

> I have updated KIP-66
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-66%3A+Single+Message+Transforms+for+Kafka+Connect
> with the changes I proposed in the design.
>
> Gwen, I think the main downside to not including some transformations with
> Kafka Connect is that it seems less user friendly if folks have to make
> sure to have the right transformation(s) on the classpath as well, besides
> their connector(s). Additionally by going in with a small set included, we
> can encourage a consistent configuration and implementation style and
> provide utilities for e.g. data transformations, which I expect we will
> definitely need (discussed under 'Patterns for data transformations').
>
> It does get hard to draw the line once you go from 'none' to 'some'. To get
> discussion going, if we get agreement on 'none' vs 'some', I added a table
> under 'Bundled transformations' for transformations which I think are worth
> including.
>
> For many of these, I have noticed their absence in the wild as a pain point
> --
> TimestampRouter:
> https://github.com/confluentinc/kafka-connect-elasticsearch/issues/33
> Mask:
> https://groups.google.com/d/msg/confluent-platform/3yHb8_mCReQ/sTQc3dNgBwAJ
> Insert:
> http://stackoverflow.com/questions/40664745/elasticsearch-connector-for-kafka-connect-offset-and-timestamp
> RegexRouter:
> https://groups.google.com/d/msg/confluent-platform/yEBwu1rGcs0/gIAhRp6kBwAJ
> NumericCast:
> https://github.com/confluentinc/kafka-connect-jdbc/issues/101#issuecomment-249096119
> TimestampConverter:
> https://groups.google.com/d/msg/confluent-platform/gGAOsw3Qeu4/8JCqdDhGBwAJ
> ValueToKey: https://github.com/confluentinc/kafka-connect-jdbc/pull/166
>
> In other cases, their functionality is already being implemented by
> connectors in divergent ways: RegexRouter, Insert, Replace, HoistToStruct,
> ExtractFromStruct
>
> On Wed, Dec 14, 2016 at 6:00 PM Gwen Shapira  wrote:
>
> I'm a bit concerned about adding transformations in Kafka. NiFi has 150
> processors, presumably they are all useful for someone. I don't know if I'd
> want all of that in Apache Kafka. What's the downside of keeping it out? Or
> at least keeping the built-in set super minimal (Flume has like 3 built-in
> interceptors)?
>
> Gwen
>
> On Wed, Dec 14, 2016 at 1:36 PM, Shikhar Bhushan 
> wrote:
>
> > With regard to a), just using `ConnectRecord` with `newRecord` as a new
> > abstract method would be a fine choice. In prototyping, both options end
> up
> > looking pretty similar (in terms of how transformations are implemented
> and
> > the runtime initializes and uses them) and I'm starting to lean towards
> not
> > adding a new interface into the mix.
> >
> > On b) I think we should include a small set of useful transformations
> with
> > Connect, since they can be applicable across different connectors and we
> > should encourage some standardization for common operations. I'll update
> > KIP-66 soon including a spec of transformations that I believe are worth
> > including.
> >
> > On Sat, Dec 10, 2016 at 11:52 PM Ewen Cheslack-Postava <
> e...@confluent.io>
> > wrote:
> >
> > If anyone has time to review here, it'd be great to get feedback. I'd
> > imagine that the proposal itself won't be too controversial -- keeps
> > transformations simple (by only allowing map/filter), doesn't affect the
> > rest of the framework much, and fits in with general config structure
> we've
> > used elsewhere (although ConfigDef could use some updates to make this
> > easier...).
> >
> > I think the main open questions for me are:
> >
> > a) Is TransformableRecord worth it to avoid reimplementing small bits of
> > code (it allows for a single implementation of the interface to trivially
> > apply to both Source and SinkRecords). I think I prefer this, but it does
> > come with some commitment to another interface on top of ConnectRecord.
> We
> > could alternatively modify ConnectRecord which would require fewer
> changes.
> > b) How do folks feel about built-in transformations and the set that are
> > mentioned here? This brings us way back to the discussion of built-in
> > connectors. Transformat
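The map/filter transformation-chain model under discussion can be sketched as follows. This is an illustrative Python model, not Connect's Java `Transformation<R>` interface; the example transforms (`insert_source_field`, `mask_field`, `drop_tombstones`) are hypothetical stand-ins in the spirit of the bundled set proposed above:

```python
def apply_chain(record, transforms):
    """Apply single-message transforms in configured order. A transform
    may return a modified record (map) or None to drop it (filter)."""
    for transform in transforms:
        record = transform(record)
        if record is None:
            return None  # record filtered out; stop the chain
    return record

def insert_source_field(record):
    return {**record, "source": "orders-db"}   # in the spirit of 'Insert'

def mask_field(record):
    return {**record, "ssn": None}             # in the spirit of 'Mask'

def drop_tombstones(record):
    # Filter example: drop records with a null value.
    return None if record.get("value") is None else record

chain = [drop_tombstones, insert_source_field, mask_field]
print(apply_chain({"value": 1, "ssn": "123"}, chain))
print(apply_chain({"value": None}, chain))  # None (dropped)
```

Restricting transforms to this map/filter shape is what keeps the framework changes small: the runtime only needs to thread each record through the chain and honor a None result.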

Build failed in Jenkins: kafka-trunk-jdk8 #1105

2016-12-15 Thread Apache Jenkins Server
See 

Changes:

[junrao] KAFKA-4529; LogCleaner should not delete the tombstone too early.

--
[...truncated 26063 lines...]
org.apache.kafka.common.record.MemoryRecordsBuilderTest > 
buildUsingLogAppendTime[6] PASSED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > 
convertUsingLogAppendTime[6] STARTED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > 
convertUsingLogAppendTime[6] PASSED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > writePastLimit[6] 
STARTED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > writePastLimit[6] 
PASSED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > 
buildUsingCreateTime[7] STARTED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > 
buildUsingCreateTime[7] PASSED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > 
testCompressionRateV0[7] STARTED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > 
testCompressionRateV0[7] PASSED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > 
testCompressionRateV1[7] STARTED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > 
testCompressionRateV1[7] PASSED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > 
convertUsingCreateTime[7] STARTED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > 
convertUsingCreateTime[7] PASSED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > 
buildUsingLogAppendTime[7] STARTED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > 
buildUsingLogAppendTime[7] PASSED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > 
convertUsingLogAppendTime[7] STARTED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > 
convertUsingLogAppendTime[7] PASSED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > writePastLimit[7] 
STARTED

org.apache.kafka.common.record.MemoryRecordsBuilderTest > writePastLimit[7] 
PASSED

org.apache.kafka.common.SerializeCompatibilityTopicPartitionTest > 
testSerializationRoundtrip STARTED

org.apache.kafka.common.SerializeCompatibilityTopicPartitionTest > 
testSerializationRoundtrip PASSED

org.apache.kafka.common.SerializeCompatibilityTopicPartitionTest > 
testTopiPartitionSerializationCompatibility STARTED

org.apache.kafka.common.SerializeCompatibilityTopicPartitionTest > 
testTopiPartitionSerializationCompatibility PASSED

org.apache.kafka.common.protocol.ApiKeysTest > testForIdWithInvalidIdLow STARTED

org.apache.kafka.common.protocol.ApiKeysTest > testForIdWithInvalidIdLow PASSED

org.apache.kafka.common.protocol.ApiKeysTest > testForIdWithInvalidIdHigh 
STARTED

org.apache.kafka.common.protocol.ApiKeysTest > testForIdWithInvalidIdHigh PASSED

org.apache.kafka.common.protocol.ErrorsTest > testExceptionName STARTED

org.apache.kafka.common.protocol.ErrorsTest > testExceptionName PASSED

org.apache.kafka.common.protocol.ErrorsTest > testForExceptionDefault STARTED

org.apache.kafka.common.protocol.ErrorsTest > testForExceptionDefault PASSED

org.apache.kafka.common.protocol.ErrorsTest > testUniqueExceptions STARTED

org.apache.kafka.common.protocol.ErrorsTest > testUniqueExceptions PASSED

org.apache.kafka.common.protocol.ErrorsTest > testForExceptionInheritance 
STARTED

org.apache.kafka.common.protocol.ErrorsTest > testForExceptionInheritance PASSED

org.apache.kafka.common.protocol.ErrorsTest > testNoneException STARTED

org.apache.kafka.common.protocol.ErrorsTest > testNoneException PASSED

org.apache.kafka.common.protocol.ErrorsTest > testUniqueErrorCodes STARTED

org.apache.kafka.common.protocol.ErrorsTest > testUniqueErrorCodes PASSED

org.apache.kafka.common.protocol.ErrorsTest > testExceptionsAreNotGeneric 
STARTED

org.apache.kafka.common.protocol.ErrorsTest > testExceptionsAreNotGeneric PASSED

org.apache.kafka.common.protocol.types.ProtocolSerializationTest > testNulls 
STARTED

org.apache.kafka.common.protocol.types.ProtocolSerializationTest > testNulls 
PASSED

org.apache.kafka.common.protocol.types.ProtocolSerializationTest > testToString 
STARTED

org.apache.kafka.common.protocol.types.ProtocolSerializationTest > testToString 
PASSED

org.apache.kafka.common.protocol.types.ProtocolSerializationTest > 
testReadStringSizeTooLarge STARTED

org.apache.kafka.common.protocol.types.ProtocolSerializationTest > 
testReadStringSizeTooLarge PASSED

org.apache.kafka.common.protocol.types.ProtocolSerializationTest > 
testNullableDefault STARTED

org.apache.kafka.common.protocol.types.ProtocolSerializationTest > 
testNullableDefault PASSED

org.apache.kafka.common.protocol.types.ProtocolSerializationTest > 
testReadNegativeStringSize STARTED

org.apache.kafka.common.protocol.types.ProtocolSerializationTest > 
testReadNegativeStringSize PASSED

org.apache.kafka.common.protocol.types.ProtocolSerializationTest > 
testReadArraySizeTooLarge STARTED

org.apache.kafka.common.protocol.types.ProtocolSerializationTest > 
testReadArraySizeTooLarge PASSED


Re: [VOTE] 0.10.1.1 RC0

2016-12-15 Thread Becket Qin
Yes, that sounds good. Thanks.

Jiangjie (Becket) Qin

On Thu, Dec 15, 2016 at 1:46 PM, Guozhang Wang  wrote:

> Hey Becket,
>
> I just cut the release this morning and the RC1 is out a few minutes ago so
> that we can possibly have the release out before the break. I looked
> through https://issues.apache.org/jira/browse/KAFKA-4521 and feel it is OK
> to have it in the next minor release happening in a month or so. Does it
> sound good to you?
>
>
> Guozhang
>
>
> On Thu, Dec 15, 2016 at 1:30 PM, Becket Qin  wrote:
>
> > Hey Guozhang,
> >
> > Thanks for running the release.
> > KAFKA-4521 is just checked in. It fixes a bug in mirror maker that may
> > result in message loss. Can we include that in 0.10.1.1 as well?
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Thu, Dec 15, 2016 at 9:46 AM, Guozhang Wang 
> wrote:
> >
> > > Michael,
> > >
> > > I am rolling out a new RC right now, stay tuned.
> > >
> > >
> > > Guozhang
> > >
> > >
> > > On Thu, Dec 15, 2016 at 2:34 AM, Michael Pearce  >
> > > wrote:
> > >
> > > > Is there any update on this, when do we expect an RC1 for vote?
> > > >
> > > > We see KAFKA-4497 is marked Resolved.
> > > >
> > > > Cheers
> > > > Mike
> > > >
> > > > On 13/12/2016, 00:59, "Guozhang Wang"  wrote:
> > > >
> > > > I see. Currently the upload command has not included the 2.12
> > version
> > > > yet,
> > > > I will manually do that in the next RC.
> > > >
> > > > Guozhang
> > > >
> > > > On Mon, Dec 12, 2016 at 4:47 PM, Bernard Leach <
> > > > leac...@bouncycastle.org>
> > > > wrote:
> > > >
> > > > > I found those ones but was hoping to see them in
> > > > > https://repository.apache.org/content/groups/staging/org/
> > > > apache/kafka/ <
> > > > > https://repository.apache.org/content/groups/staging/org/
> > > > apache/kafka/>
> > > > > so we can just point the maven build there for testing.
> > > > >
> > > > > > On 13 Dec 2016, at 11:38, Guozhang Wang 
> > > > wrote:
> > > > > >
> > > > > > @Bernard
> > > > > >
> > > > > > You should be able to find the artifacts in the staging RC
> > > > repository I
> > > > > > listed above, named as "kafka_2.12-0.10.1.1.tgz
> > > > > >  > > > > kafka_2.12-0.10.1.1.tgz>
> > > > > > ".
> > > > > >
> > > > > > @Everyone
> > > > > >
> > > > > > Since KAFKA-4497 is a critical issue, that anyone using
> > > compaction
> > > > with
> > > > > > compressed message on server logs can possibly hit it, I'm
> > going
> > > to
> > > > > create
> > > > > > a new RC after it is merged (presumably soon).
> > > > > >
> > > > > > If you discover any new problems in the meantime, let me know
> > on
> > > > this
> > > > > > thread.
> > > > > >
> > > > > >
> > > > > > Guozhang
> > > > > >
> > > > > >
> > > > > > On Mon, Dec 12, 2016 at 2:51 PM, Ismael Juma <
> > ism...@juma.me.uk>
> > > > wrote:
> > > > > >
> > > > > >> Yes, it would be good to include that (that's why I set the
> > fix
> > > > version
> > > > > to
> > > > > >> 0.10.1.1 a couple of days ago).
> > > > > >>
> > > > > >> Ismael
> > > > > >>
> > > > > >> On Mon, Dec 12, 2016 at 3:47 PM, Moczarski, Swen <
> > > > > >> smoczar...@ebay-kleinanzeigen.de> wrote:
> > > > > >>
> > > > > >>> -0 (non-binding)
> > > > > >>>
> > > > > >>> Would it make sense to include https://issues.apache.org/
> > > > > >>> jira/browse/KAFKA-4497 ? The issue sounds quite critical
> and
> > a
> > > > patch
> > > > > for
> > > > > >>> 0.10.1.1 seems to be available.
> > > > > >>>
> > > > > >>> On 2016-12-07 23:46 (+0100), Guozhang Wang wrote:
> > > > >  Hello Kafka users, developers and client-developers,>
> > > > > 
> > > > >  This is the first candidate for the release of Apache
> Kafka
> > > > 0.10.1.1.
> > > > > >>> This is>
> > > > >  a bug fix release and it includes fixes and improvements
> > from
> > > 27
> > > > > JIRAs.
> > > > > >>> See>
> > > > >  the release notes for more details:>
> > > > > 
> > > > >  http://home.apache.org/~guozhang/kafka-0.10.1.1-rc0/
> > > > > RELEASE_NOTES.html
> > > > > >>>
> > > > > 
> > > > >  *** Please download, test and vote by Monday, 13 December,
> > 8am
> > > > PT ***>
> > > > > 
> > > > >  Kafka's KEYS file containing PGP keys we use to sign the
> > > > release:>
> > > > >  http://kafka.apache.org/KEYS>
> > > > > 
> > > > >  * Release artifacts to be voted upon (source and binary):>
> > > > >  http://home.apache.org/~guozhang/kafka-0.10.1.1-rc0/>
> > > > > 
> > > > >  * Maven artifacts to be voted upon:>
> > > > >  https://repository.apache.org/content/groups/staging/org/
> > > > > apache

Re: [DISCUSS] KIP-98: Exactly Once Delivery and Transactional Messaging

2016-12-15 Thread radai
I can see several issues with the current proposal.

messages, even if sent under a TX, are delivered directly to their
destination partitions, so downstream consumers need to be TX-aware. They can
either:
   1. be oblivious to TXs. That means they will deliver "garbage" - msgs
sent during eventually-aborted TXs.
   2. "opt-in" - they would have to not deliver _ANY_ msg until they know
the fate of all outstanding overlapping TXs - if I see msg A1 (in a TX),
followed by B, which is not under any TX, I cannot deliver B until I know
if A1 was committed or not (or I violate ordering). This would require some
sort of buffering on consumers. With a naive buffering impl I could DOS
everyone on a topic - just start a TX on a very busy topic and keep it open
as long as I can.
   3. explode if you're an old consumer that sees a control msg (what's your
migration plan?)
   4. cross-cluster replication mechanisms either replicate the garbage or
need to clean it up. There are >1 such different mechanisms (almost one per
company really :-) ) so lots of adjustments.

I think the end result could be better if ongoing TXs are treated as
logically separate topic partitions, and only atomically appended onto the
target partitions on commit (meaning they are written to separate journal
file(s) on the broker).

such a design would present a "clean" view to any downstream consumers -
anything not committed wont even show up. old consumers wont need to know
about control msgs, no issues with unbounded msg buffering, generally
cleaner overall?

there would need to be adjustments made to watermark and follower fetch
logic, but some of us here have discussed this over lunch and we think it's
doable.
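The buffering concern in option 2 above can be made concrete with a small model. This is an illustrative Python sketch, not the KIP-98 design; `deliverable`, the `(txn, payload)` log shape, and `open_txns` are hypothetical names. It shows that, to preserve ordering, nothing can be delivered past the first message whose transaction is still open — so one long-running transaction stalls everything behind it:

```python
def deliverable(log, open_txns):
    """log: list of (txn_id_or_None, payload) in offset order.
    open_txns: set of transaction ids whose fate is still unknown.
    A TX-aware consumer may only deliver the prefix of the log that
    precedes the first message belonging to a still-open transaction."""
    out = []
    for txn, payload in log:
        if txn in open_txns:
            break  # ordering: cannot deliver anything past this point
        out.append(payload)
    return out

log = [("tx1", "A1"), (None, "B"), (None, "C")]
print(deliverable(log, {"tx1"}))  # [] -- B and C blocked behind open tx1
print(deliverable(log, set()))    # ['A1', 'B', 'C'] -- once tx1 commits
```

Note that B and C are not themselves transactional, yet they are held back indefinitely by tx1 — which is exactly the unbounded-buffering / DOS exposure described above, and part of the motivation for staging open TXs in separate journal files instead.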


On Thu, Dec 15, 2016 at 1:08 AM, Rajini Sivaram  wrote:

> Hi Apurva,
>
> Thank you, makes sense.
>
> Rajini
>
> On Wed, Dec 14, 2016 at 7:36 PM, Apurva Mehta  wrote:
>
> > Hi Rajini,
> >
> > I think my original response to your point 15 was not accurate. The
> regular
> > definition of durability is that data once committed would never be lost.
> > So it is not enough for only the control messages to be flushed before
> > being acknowledged -- all the messages (and offset commits) which are
> part
> > of the transaction would need to be flushed before being acknowledged as
> > well.
> >
> > Otherwise, it is possible that if all replicas of a topic partition crash
> > before the transactional messages are flushed, those messages will be
> lost
> > even if the commit marker exists in the log. In this case, the
> transaction
> > would be 'committed' with incomplete data.
> >
> > Right now, there isn't any config which will ensure that the flush to
> disk
> > happens before the acknowledgement. We could add it in the future, and
> get
> > durability guarantees for kafka transactions.
> >
> > I hope this clarifies the situation. The present KIP does not intend to
> add
> > the aforementioned config, so even the control messages are susceptible
> to
> > being lost if there is a simultaneous crash across all replicas. So
> > transactions are only as durable as existing Kafka messages. We don't
> > strengthen any durability guarantees as part of this KIP.
> >
> > Thanks,
> > Apurva
> >
> >
> > On Wed, Dec 14, 2016 at 1:52 AM, Rajini Sivaram 
> > wrote:
> >
> > > Hi Apurva,
> > >
> > > Thank you for the answers. Just one follow-on.
> > >
> > > 15. Let me rephrase my original question. If all control messages
> > (messages
> > > to transaction logs and markers on user logs) were acknowledged only
> > after
> > > flushing the log segment, will transactions become durable in the
> > > traditional sense (i.e. not restricted to min.insync.replicas
> failures) ?
> > > This is not a suggestion to update the KIP. It seems to me that the
> > design
> > > enables full durability if required in the future with a rather
> > > non-intrusive change. I just wanted to make sure I haven't missed
> > anything
> > > fundamental that prevents Kafka from doing this.
> > >
> > >
> > >
> > > On Wed, Dec 14, 2016 at 5:30 AM, Ben Kirwin  wrote:
> > >
> > > > Hi Apurva,
> > > >
> > > > Thanks for the detailed answers... and sorry for the late reply!
> > > >
> > > > It does sound like, if the input-partitions-to-app-id mapping never
> > > > changes, the existing fencing mechanisms should prevent duplicates.
> > > Great!
> > > > I'm a bit concerned the proposed API will be delicate to program
> > against
> > > > successfully -- even in the simple case, we need to create a new
> > producer
> > > > instance per input partition, and anything fancier is going to need
> its
> > > own
> > > > implementation of the Streams/Samza-style 'task' idea -- but that may
> > be
> > > > fine for this sort of advanced feature.
> > > >
> > > > For the second question, I notice that Jason also elaborated on this
> > > > downthread:
> > > >
> > > > > We also looked at removing the producer ID.
> > > > > This was discussed somewhere above, but basically the idea is to
> > store
> > > > the
> > >

[GitHub] kafka pull request #2262: KAFKA-4548: Add CompatibilityTest to verify that i...

2016-12-15 Thread cmccabe
GitHub user cmccabe opened a pull request:

https://github.com/apache/kafka/pull/2262

KAFKA-4548: Add CompatibilityTest to verify that individual features …

Add CompatibilityTest to verify that individual features are supported or 
not by the broker we're connecting to

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cmccabe/kafka KAFKA-4548

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2262.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2262


commit 408f785bee202f5a175186fe9abd37801e27c2b8
Author: Colin P. Mccabe 
Date:   2016-12-15T22:01:30Z

KAFKA-4548: Add CompatibilityTest to verify that individual features are 
supported or not by the broker we're connecting to




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Work started] (KAFKA-4548) Add CompatibilityTest to verify that individual features are supported or not by the broker we're connecting to

2016-12-15 Thread Colin P. McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on KAFKA-4548 started by Colin P. McCabe.
--
> Add CompatibilityTest to verify that individual features are supported or not 
> by the broker we're connecting to
> ---
>
> Key: KAFKA-4548
> URL: https://issues.apache.org/jira/browse/KAFKA-4548
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, system tests, unit tests
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>
> Add CompatibilityTest to verify that individual features are supported or not 
> by the broker we're connecting to.  This can be used in a ducktape test to 
> verify that the feature is present or absent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4548) Add CompatibilityTest to verify that individual features are supported or not by the broker we're connecting to

2016-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15752717#comment-15752717
 ] 

ASF GitHub Bot commented on KAFKA-4548:
---

GitHub user cmccabe opened a pull request:

https://github.com/apache/kafka/pull/2262

KAFKA-4548: Add CompatibilityTest to verify that individual features …

Add CompatibilityTest to verify that individual features are supported or 
not by the broker we're connecting to

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cmccabe/kafka KAFKA-4548

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2262.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2262


commit 408f785bee202f5a175186fe9abd37801e27c2b8
Author: Colin P. Mccabe 
Date:   2016-12-15T22:01:30Z

KAFKA-4548: Add CompatibilityTest to verify that individual features are 
supported or not by the broker we're connecting to




> Add CompatibilityTest to verify that individual features are supported or not 
> by the broker we're connecting to
> ---
>
> Key: KAFKA-4548
> URL: https://issues.apache.org/jira/browse/KAFKA-4548
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, system tests, unit tests
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>
> Add CompatibilityTest to verify that individual features are supported or not 
> by the broker we're connecting to.  This can be used in a ducktape test to 
> verify that the feature is present or absent.





Re: [DISCUSS] KIP-66 Kafka Connect Transformers for messages

2016-12-15 Thread Shikhar Bhushan
I think the tradeoffs for including connectors are different. Connectors
are comparatively larger in scope, they tend to come with their own set of
dependencies for the systems they need to talk to. Transformations as I
imagine them - at least the ones on the table in the wiki currently -
should be a single not-very-large class (or 3 when there are simple *Key
and *Value variants deriving from a base implementing the common
functionality), in some cases relying on common utilities for munging with
the Connect data API. Correspondingly, the maintenance burden is also
smaller.

It's true that it would probably be easier to add specific transformations
down the line than evolve/remove, but I have faith we can strike a good
balance in making the call on what to include from the start.

On "super-clear definition of what belongs and what doesn't":

How about: small and broadly applicable, configurable in an easily
understandable manner, no external dependencies, concrete use-case
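To make those criteria concrete, here is a rough Python sketch of a RegexRouter-style transform (the real Connect API is Java's Transformation interface; the class, method names, and dict-based record below are invented for illustration):

```python
import re

class RegexRouter:
    """RegexRouter-style single-message transform: rewrites a record's topic
    via a configured regex. Illustrative sketch only -- the real Connect API
    is Java, and these names are invented."""

    def configure(self, config):
        self.pattern = re.compile(config["regex"])
        self.replacement = config["replacement"]

    def apply(self, record):
        # A record is a plain dict here; Connect would pass a ConnectRecord.
        new_topic = self.pattern.sub(self.replacement, record["topic"])
        return {**record, "topic": new_topic}

router = RegexRouter()
router.configure({"regex": r"^db-(.*)", "replacement": r"mysql-\1"})
out = router.apply({"topic": "db-users", "key": "k", "value": "v"})
print(out["topic"])  # -> mysql-users
```

Note how it is small, broadly applicable, configured in an obvious way, and needs no external dependencies -- which is the bar being proposed.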

On Thu, Dec 15, 2016 at 2:01 PM Gwen Shapira  wrote:

I agree about the ease of use in adding a small subset of built-in
transformations.

But the same thing is true for connectors - there are maybe 5 super popular
OSS connectors and the rest is a very long tail. We drew the line at not
adding any, because that's the easiest and because we did not want to turn
Kafka into a collection of transformations.

I really don't want to end up with 135 (or even 20) transformations in
Kafka. So either we have a super-clear definition of what belongs and what
doesn't - or we put in one minimal example and the rest goes into the
ecosystem.

We can also start by putting transformations on github and just see if
there is huge demand for them in Apache. It is easier to add stuff to the
project later than to remove functionality.



On Thu, Dec 15, 2016 at 11:59 AM, Shikhar Bhushan 
wrote:

> I have updated KIP-66
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 66%3A+Single+Message+Transforms+for+Kafka+Connect
> with
> the changes I proposed in the design.
>
> Gwen, I think the main downside to not including some transformations with
> Kafka Connect is that it seems less user friendly if folks have to make
> sure to have the right transformation(s) on the classpath as well, besides
> their connector(s). Additionally by going in with a small set included, we
> can encourage a consistent configuration and implementation style and
> provide utilities for e.g. data transformations, which I expect we will
> definitely need (discussed under 'Patterns for data transformations').
>
> It does get hard to draw the line once you go from 'none' to 'some'. To
get
> discussion going, if we get agreement on 'none' vs 'some', I added a table
> under 'Bundled transformations' for transformations which I think are
worth
> including.
>
> For many of these, I have noticed their absence in the wild as a pain
point
> --
> TimestampRouter:
> https://github.com/confluentinc/kafka-connect-elasticsearch/issues/33
> Mask:
> https://groups.google.com/d/msg/confluent-platform/3yHb8_
> mCReQ/sTQc3dNgBwAJ
> Insert:
> http://stackoverflow.com/questions/40664745/elasticsearch-connector-for-
> kafka-connect-offset-and-timestamp
> RegexRouter:
> https://groups.google.com/d/msg/confluent-platform/
> yEBwu1rGcs0/gIAhRp6kBwAJ
> NumericCast:
> https://github.com/confluentinc/kafka-connect-
> jdbc/issues/101#issuecomment-249096119
> TimestampConverter:
> https://groups.google.com/d/msg/confluent-platform/
> gGAOsw3Qeu4/8JCqdDhGBwAJ
> ValueToKey: https://github.com/confluentinc/kafka-connect-jdbc/pull/166
>
> In other cases, their functionality is already being implemented by
> connectors in divergent ways: RegexRouter, Insert, Replace, HoistToStruct,
> ExtractFromStruct
>
> On Wed, Dec 14, 2016 at 6:00 PM Gwen Shapira  wrote:
>
> I'm a bit concerned about adding transformations in Kafka. NiFi has 150
> processors, presumably they are all useful for someone. I don't know if
I'd
> want all of that in Apache Kafka. What's the downside of keeping it out?
Or
> at least keeping the built-in set super minimal (Flume has like 3 built-in
> interceptors)?
>
> Gwen
>
> On Wed, Dec 14, 2016 at 1:36 PM, Shikhar Bhushan 
> wrote:
>
> > With regard to a), just using `ConnectRecord` with `newRecord` as a new
> > abstract method would be a fine choice. In prototyping, both options end
> up
> > looking pretty similar (in terms of how transformations are implemented
> and
> > the runtime initializes and uses them) and I'm starting to lean towards
> not
> > adding a new interface into the mix.
> >
> > On b) I think we should include a small set of useful transformations
> with
> > Connect, since they can be applicable across different connectors and we
> > should encourage some standardization for common operations. I'll update
> > KIP-66 soon including a spec of transformations that I believe are worth
> > including.
> >
> > On Sat, Dec 10, 2016 at 11:52 PM Ewen Cheslack-Postava <
> e...@confluent.io

Re: [DISCUSS] KIP-66 Kafka Connect Transformers for messages

2016-12-15 Thread Ewen Cheslack-Postava
I think there are a couple of factors that make transformations and
connectors different.

First, NiFi's 150 processors is a bit misleading. In NiFi, processors cover
data sources, data sinks, serialization/deserialization, *and*
transformations. I haven't filtered the list to see how many fall into the
first 3 categories, but it's a *lot* of the processors they have.

Second, since transformations only apply to a single message and I'd think
they generally shouldn't be interacting with external services (i.e. I
think trying to do enrichment in SMT is probably a bad idea), the scope of
possible transformations is reasonably limited and the transformations
themselves tend to be small and easily maintainable. I think this is a
dramatic difference from connectors, which are each substantial projects in
their own right.

While I get the slippery slope argument re: including specific
transformations, I think we can come up with a reasonable policy (and via
KIPs we can, as a community, come to an agreement based purely on taste if
it comes down to that). In particular, I'd say keep the core general (i.e.
no domain-specific transformations/parsing like HL7), pure data
manipulation (i.e. no enrichment), and nothing that could just as well be
done as a converter/serializer/deserializer/source connector/sink connector.

I was very staunchly against including connectors (aside from a simple
example) directly in Kafka, so this may seem like a reversal of position.
But I think the % of use cases covered will look very different between
connectors and transformations. Sure, some connectors are very popular, and
moreso right now because they are the most thoroughly developed, tested,
etc. But the top 3 most common transformations will probably be used across
all the top 20 most popular connectors. I have no doubt people will end up
writing custom ones (which is why it's nice to make them pluggable rather
than choosing a fixed set), but they'll either be very niche (like people
write custom connectors for their internal systems) or be more broadly
applicable but very domain specific such that they are easy to reject for
inclusion.

@Gwen if we filtered the list of NiFi processors to ones that fit that
criteria, would that still be too long a list for your taste? Similarly,
let's say we were going to include some baked in; in that case, does
anything look out of place to you in the list Shikhar has included in the
KIP?

-Ewen

On Thu, Dec 15, 2016 at 2:01 PM, Gwen Shapira  wrote:

> I agree about the ease of use in adding a small-subset of built-in
> transformations.
>
> But the same thing is true for connectors - there are maybe 5 super popular
> OSS connectors and the rest is a very long tail. We drew the line at not
> adding any, because thats the easiest and because we did not want to turn
> Kafka into a collection of transformations.
>
> I really don't want to end up with 135 (or even 20) transformations in
> Kafka. So either we have a super-clear definition of what belongs and what
> doesn't - or we put in one minimal example and the rest goes into the
> ecosystem.
>
> We can also start by putting transformations on github and just see if
> there is huge demand for them in Apache. It is easier to add stuff to the
> project later than to remove functionality.
>
>
>
> On Thu, Dec 15, 2016 at 11:59 AM, Shikhar Bhushan 
> wrote:
>
> > I have updated KIP-66
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 66%3A+Single+Message+Transforms+for+Kafka+Connect
> > with
> > the changes I proposed in the design.
> >
> > Gwen, I think the main downside to not including some transformations
> with
> > Kafka Connect is that it seems less user friendly if folks have to make
> > sure to have the right transformation(s) on the classpath as well,
> besides
> > their connector(s). Additionally by going in with a small set included,
> we
> > can encourage a consistent configuration and implementation style and
> > provide utilities for e.g. data transformations, which I expect we will
> > definitely need (discussed under 'Patterns for data transformations').
> >
> > It does get hard to draw the line once you go from 'none' to 'some'. To
> get
> > discussion going, if we get agreement on 'none' vs 'some', I added a
> table
> > under 'Bundled transformations' for transformations which I think are
> worth
> > including.
> >
> > For many of these, I have noticed their absence in the wild as a pain
> point
> > --
> > TimestampRouter:
> > https://github.com/confluentinc/kafka-connect-elasticsearch/issues/33
> > Mask:
> > https://groups.google.com/d/msg/confluent-platform/3yHb8_
> > mCReQ/sTQc3dNgBwAJ
> > Insert:
> > http://stackoverflow.com/questions/40664745/elasticsearch-connector-for-
> > kafka-connect-offset-and-timestamp
> > RegexRouter:
> > https://groups.google.com/d/msg/confluent-platform/
> > yEBwu1rGcs0/gIAhRp6kBwAJ
> > NumericCast:
> > https://github.com/confluentinc/kafka-connect-
> > jdbc/issues/101#issuecomment-24909

Re: [DISCUSS] KIP 88: OffsetFetch Protocol Update

2016-12-15 Thread Vahid S Hashemian
Hi all,

Even though KIP-88 was recently approved, I'll have to re-open it to 
address a limitation that comes with its proposed protocol change.
I'd like to thank Jason Gustafson for catching this issue.

I'll explain this in the KIP as well, but to summarize, KIP-88 suggests 
adding the option of passing a "null" array in the OffsetFetch request to 
query all existing offsets for a consumer group. It does not suggest any 
modification to the OffsetFetch response.

In the existing protocol, group- or coordinator-related errors are reported 
along with each partition in the OffsetFetch response.

If there are partitions in the request, they are guaranteed to appear in 
the response (there could be an error code associated with each). So if 
there is an error, it is reported back by being attached to some partition 
in the request.
If an empty array is passed, no error is reported (no matter what the 
group or coordinator status is). The response comes back with an empty 
list.

With the proposed change in KIP-88 we could have a scenario in which a 
null array is sent in the OffsetFetch request, and due to some error (for 
example, if the coordinator just started and hasn't caught up yet with the 
offsets topic), an empty list is returned in the OffsetFetch response (the 
group may or may not actually be empty). The issue is that in situations 
like this no error can be returned in the response, because there is no 
partition to attach the error to.

I'll update the KIP with more details and propose to add to OffsetFetch 
response schema an "error_code" at the top level that can be used to 
report group related errors (instead of reporting those errors with each 
individual partition).

I apologize if this causes any inconvenience.

Feedback and comments are always welcome.

Thanks.
--Vahid
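To make the proposed fix concrete, a rough sketch of the updated OffsetFetch response shape (field names and layout are illustrative, not the final KIP-88 schema):

```
OffsetFetchResponse => error_code [responses]
  error_code => INT16              // NEW: top-level group/coordinator errors
  responses => topic [partition_responses]
    topic => STRING
    partition_responses => partition offset metadata error_code
      partition => INT32
      offset => INT64
      metadata => NULLABLE_STRING
      error_code => INT16          // existing per-partition error
```

With a top-level error_code, a null-array request that hits a not-yet-ready coordinator can report the error even though no partitions appear in the response.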



[jira] [Commented] (KAFKA-3540) KafkaConsumer.close() may block indefinitely

2016-12-15 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15752780#comment-15752780
 ] 

Rajini Sivaram commented on KAFKA-3540:
---

I have opened 
[KIP-102|https://cwiki.apache.org/confluence/display/KAFKA/KIP-102+-+Add+close+with+timeout+for+consumers]
 to add a close method to Consumer with timeout similar to Producer.

> KafkaConsumer.close() may block indefinitely
> 
>
> Key: KAFKA-3540
> URL: https://issues.apache.org/jira/browse/KAFKA-3540
> Project: Kafka
>  Issue Type: Bug
>Reporter: Oleg Zhurakousky
>Assignee: Manikumar Reddy
>
> KafkaConsumer API doc states 
> {code}
> Close the consumer, waiting indefinitely for any needed cleanup. . . . 
> {code}
> That is not acceptable as it creates an artificial deadlock which directly 
> affects systems that rely on Kafka API essentially rendering them unavailable.
> Consider adding _close(timeout)_ method





Re: [DISCUSS] KIP-98: Exactly Once Delivery and Transactional Messaging

2016-12-15 Thread radai
some clarifications on my alternative proposal:

TX msgs are written "sideways" to a transaction (ad-hoc) partition. this
partition can be replicated to followers, or can be an in-mem buffer -
depends on the resilience guarantees you want to provide for TXs in case of
broker crash.
on "commit" the partition leader broker (being the single point of
synchronization for the partition anyway) can atomically append the
contents of this TX "partition" onto the real target partition. this is the
point where the msgs get "real" offsets. there's some trickiness around how
not to expose these offsets to any consumers until everything's been
replicated to followers, but we believe it's possible.
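A toy model of this alternative may help: transactional messages accumulate in a per-transaction staging buffer and are appended onto the real partition only on commit, which is where they first receive real offsets. The class and method names below are invented for illustration and ignore replication and watermark handling entirely:

```python
class PartitionLog:
    """Toy model of the staging-partition proposal: TX messages are written
    "sideways" and only reach the real log on commit. Aborted data is simply
    dropped, so downstream consumers never see it."""

    def __init__(self):
        self.log = []       # the real partition: committed messages only
        self.staging = {}   # tx_id -> buffered messages (no offsets yet)

    def send(self, tx_id, msg):
        self.staging.setdefault(tx_id, []).append(msg)

    def commit(self, tx_id):
        # Atomic append on the partition leader: messages get offsets here.
        msgs = self.staging.pop(tx_id, [])
        base = len(self.log)
        self.log.extend(msgs)
        return list(range(base, base + len(msgs)))

    def abort(self, tx_id):
        # Aborted data never reaches the log.
        self.staging.pop(tx_id, None)

log = PartitionLog()
log.send("tx1", "a")
log.send("tx1", "b")
log.abort("tx1")            # consumers see nothing from tx1
log.send("tx2", "c")
offsets = log.commit("tx2")
print(log.log, offsets)     # -> ['c'] [0]
```

This is the sense in which downstream consumers get a "clean" view: the log itself contains only committed data, so no control messages or consumer-side buffering are needed.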



On Thu, Dec 15, 2016 at 2:31 PM, radai  wrote:

> I can see several issues with the current proposal.
>
> messages, even if sent under a TX, are delivered directly to their
> destination partitions, so downstream consumers need to be TX-aware. they can
> either:
>1. be oblivious to TXs. that means they will deliver "garbage" - msgs
> sent during eventually-aborted TXs.
>2. "opt-in" - they would have to not deliver _ANY_ msg until they know
> the fate of all outstanding overlapping TXs - if i see msg A1 (in a TX),
> followed by B, which is not under any TX, i cannot deliver B until i know
> if A1 was committed or not (or I violate ordering). this would require some
> sort of buffering on consumers. with a naive buffering impl i could DOS
> everyone on a topic - just start a TX on a very busy topic and keep it open
> as long as I can 
>   3. explode if you're an old consumer that sees a control msg (what's your
> migration plan?)
>4. cross-cluster replication mechanisms either replicate the garbage or
> need to clean it up. there is more than one such mechanism (almost one per
> company really :-) ) so lots of adjustments.
>
> I think the end result could be better if ongoing TXs are treated as
> logically separate topic partitions, and only atomically appended onto the
> target partitions on commit (meaning they are written to separate journal
> file(s) on the broker).
>
> such a design would present a "clean" view to any downstream consumers -
> anything not committed wont even show up. old consumers wont need to know
> about control msgs, no issues with unbounded msg buffering, generally
> cleaner overall?
>
> there would need to be adjustments made to watermark and follower fetch
> logic but some of us here have discussed this over lunch and we think its
> doable.
>
>
> On Thu, Dec 15, 2016 at 1:08 AM, Rajini Sivaram 
> wrote:
>
>> Hi Apurva,
>>
>> Thank you, makes sense.
>>
>> Rajini
>>
>> On Wed, Dec 14, 2016 at 7:36 PM, Apurva Mehta 
>> wrote:
>>
>> > Hi Rajini,
>> >
>> > I think my original response to your point 15 was not accurate. The
>> regular
>> > definition of durability is that data once committed would never be
>> lost.
>> > So it is not enough for only the control messages to be flushed before
>> > being acknowledged -- all the messages (and offset commits) which are
>> part
>> > of the transaction would need to be flushed before being acknowledged as
>> > well.
>> >
>> > Otherwise, it is possible that if all replicas of a topic partition
>> crash
>> > before the transactional messages are flushed, those messages will be
>> lost
>> > even if the commit marker exists in the log. In this case, the
>> transaction
>> > would be 'committed' with incomplete data.
>> >
>> > Right now, there isn't any config which will ensure that the flush to
>> disk
>> > happens before the acknowledgement. We could add it in the future, and
>> get
>> > durability guarantees for kafka transactions.
>> >
>> > I hope this clarifies the situation. The present KIP does not intend to
>> add
>> > the aforementioned config, so even the control messages are susceptible
>> to
>> > being lost if there is a simultaneous crash across all replicas. So
>> > transactions are only as durable as existing Kafka messages. We don't
>> > strengthen any durability guarantees as part of this KIP.
>> >
>> > Thanks,
>> > Apurva
>> >
>> >
>> > On Wed, Dec 14, 2016 at 1:52 AM, Rajini Sivaram 
>> > wrote:
>> >
>> > > Hi Apurva,
>> > >
>> > > Thank you for the answers. Just one follow-on.
>> > >
>> > > 15. Let me rephrase my original question. If all control messages
>> > (messages
>> > > to transaction logs and markers on user logs) were acknowledged only
>> > after
>> > > flushing the log segment, will transactions become durable in the
>> > > traditional sense (i.e. not restricted to min.insync.replicas
>> failures) ?
>> > > This is not a suggestion to update the KIP. It seems to me that the
>> > design
>> > > enables full durability if required in the future with a rather
>> > > non-intrusive change. I just wanted to make sure I haven't missed
>> > anything
>> > > fundamental that prevents Kafka from doing this.
>> > >
>> > >
>> > >
>> > > On Wed, Dec 14, 2016 at 5:30 AM, Ben Kirwin  wrote:
>> > >
>> > > > Hi Apurva,
>> > > >
>> > > > Thanks fo

Re: [DISCUSS] KIP-102 - Add close with timeout for consumers

2016-12-15 Thread Rajini Sivaram
Ewen,

Thank you, I will try to prototype a solution early next week to get a
better understanding of how invasive the changes are.



On Thu, Dec 15, 2016 at 9:35 PM, Ewen Cheslack-Postava 
wrote:

> Rajini,
>
> Thanks for this KIP, I'd definitely like to see this. Connect has had a
> long-standing TODO around stopping sink tasks where we can't properly
> manage the rebalance process (which involves stopping consumers) because we
> lack a timeout here. Not a huge problem in practice, but would be nice to
> fix. And I know I've seen at least a half dozen requests for this (and
> other timeouts to be respected in the consumer) on the mailing list.
>
> My only concern is that this could be a pretty substantial change to the
> consumer code. We've had TODO items in the code since Jay wrote the first
> version that say something about avoiding infinite looping and waiting
> indefinitely on group membership. If we implement this, I'd hope to get it
> committed in the early part of the release cycle so we have plenty of time
> to stabilize + bug fix.
>
> -Ewen
>
> On Thu, Dec 15, 2016 at 5:32 AM, Rajini Sivaram 
> wrote:
>
> > Hi all,
> >
> > I have just created KIP-102 to add a new close method for consumers with
> a
> > timeout parameter, making Consumer consistent with Producer:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 102+-+Add+close+with+timeout+for+consumers
> >
> > Comments and suggestions are welcome.
> >
> > Thank you...
> >
> > Regards,
> >
> > Rajini
> >
>
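The semantics being discussed for KIP-102 -- attempt graceful cleanup, but bound the wait and force-close on timeout -- can be sketched generically. This is not the Java consumer API; the class and its behavior are invented purely to illustrate the timeout-bounded close pattern:

```python
import threading
import time

class ToyConsumer:
    """Sketch of close-with-timeout semantics: run cleanup (e.g. committing
    offsets, leaving the group) in the background and wait a bounded amount
    of time for it, instead of waiting indefinitely."""

    def __init__(self, cleanup_seconds):
        self._cleanup_seconds = cleanup_seconds
        self.closed_gracefully = False

    def _cleanup(self):
        time.sleep(self._cleanup_seconds)   # stand-in for real cleanup work
        self.closed_gracefully = True

    def close(self, timeout):
        t = threading.Thread(target=self._cleanup, daemon=True)
        t.start()
        t.join(timeout)                 # bounded wait instead of forever
        return self.closed_gracefully   # False => had to force-close

fast = ToyConsumer(cleanup_seconds=0.01)
print(fast.close(timeout=1.0))   # -> True (cleanup finished in time)

slow = ToyConsumer(cleanup_seconds=5.0)
print(slow.close(timeout=0.05))  # -> False (timed out, force-close path)
```

The invasive part Rajini and Ewen discuss is that the real consumer's cleanup paths currently contain unbounded loops, which all need to respect such a deadline.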


[GitHub] kafka pull request #2263: KAFKA-4508. Create system tests that run newer ver...

2016-12-15 Thread cmccabe
GitHub user cmccabe opened a pull request:

https://github.com/apache/kafka/pull/2263

KAFKA-4508. Create system tests that run newer versions of the client…

KAFKA-4508. Create system tests that run newer versions of the client 
against older versions of the broker

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cmccabe/kafka KAFKA-4508

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2263.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2263


commit 9e64000cf3ee7af800568b3d910f51a954dbd56f
Author: Colin P. Mccabe 
Date:   2016-12-15T23:01:11Z

KAFKA-4508. Create system tests that run newer versions of the client 
against older versions of the broker






[jira] [Commented] (KAFKA-4508) Create system tests that run newer versions of the client against older versions of the broker

2016-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15752827#comment-15752827
 ] 

ASF GitHub Bot commented on KAFKA-4508:
---

GitHub user cmccabe opened a pull request:

https://github.com/apache/kafka/pull/2263

KAFKA-4508. Create system tests that run newer versions of the client…

KAFKA-4508. Create system tests that run newer versions of the client 
against older versions of the broker

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cmccabe/kafka KAFKA-4508

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2263.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2263


commit 9e64000cf3ee7af800568b3d910f51a954dbd56f
Author: Colin P. Mccabe 
Date:   2016-12-15T23:01:11Z

KAFKA-4508. Create system tests that run newer versions of the client 
against older versions of the broker




> Create system tests that run newer versions of the client against older 
> versions of the broker
> --
>
> Key: KAFKA-4508
> URL: https://issues.apache.org/jira/browse/KAFKA-4508
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>
> Create system tests that run newer versions of the client against older 
> versions of the broker.  These system tests will assume that a tarball of the 
> Apache release of Kafka 0.10.0 (and possibly other releases) is located 
> somewhere, and go from there.  We can write the overall test harness stuff in 
> Python, and have some utilities in Java.





[jira] [Work started] (KAFKA-4508) Create system tests that run newer versions of the client against older versions of the broker

2016-12-15 Thread Colin P. McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on KAFKA-4508 started by Colin P. McCabe.
--
> Create system tests that run newer versions of the client against older 
> versions of the broker
> --
>
> Key: KAFKA-4508
> URL: https://issues.apache.org/jira/browse/KAFKA-4508
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>
> Create system tests that run newer versions of the client against older 
> versions of the broker.  These system tests will assume that a tarball of the 
> Apache release of Kafka 0.10.0 (and possibly other releases) is located 
> somewhere, and go from there.  We can write the overall test harness stuff in 
> Python, and have some utilities in Java.





Re: [DISCUSS] KIP-95: Incremental Batch Processing for Kafka Streams

2016-12-15 Thread Avi Flax

> On Dec 13, 2016, at 21:02, Matthias J. Sax  wrote:
> 
> thanks for your feedback.

My pleasure!

> We want to enlarge the scope for Streams
> application and started to collect use cases in the Wiki:
> 
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Streams+Data+%28Re%29Processing+Scenarios

This page looks great, I love this sort of thing. Nice work.

> Feel free to add there via editing the page or writing a comment.

Thanks, but I don’t appear to have permissions to edit or comment on the page.

Perhaps you could paste in the two use cases I described as a new comment?

Thanks,
Avi

Re: [DISCUSS] KIP-66 Kafka Connect Transformers for messages

2016-12-15 Thread Gwen Shapira
You are absolutely right that the vast majority of NiFi's processors are
not what we would consider SMT.

I went over the list and I think it still contains just short of 50 legit
SMTs:
https://cwiki.apache.org/confluence/display/KAFKA/Analyzing+NiFi+Transformations

You are right that ExtractHL7 is an extreme that clearly doesn't belong in
Apache Kafka, but just before that we have ExtractAvroMetadata that may
fit? and ExtractEmailHeaders doesn't sound totally outlandish either...

Nothing in the baked-in list by Shikhar looks out of place. I am concerned
about the slippery slope, or the arbitrariness of the decision if we say that
this list is final and nothing else will ever make it into Kafka.

Gwen

On Thu, Dec 15, 2016 at 3:00 PM, Ewen Cheslack-Postava 
wrote:

> I think there are a couple of factors that make transformations and
> connectors different.
>
> First, NiFi's 150 processors is a bit misleading. In NiFi, processors cover
> data sources, data sinks, serialization/deserialization, *and*
> transformations. I haven't filtered the list to see how many fall into the
> first 3 categories, but it's a *lot* of the processors they have.
>
> Second, since transformations only apply to a single message and I'd think
> they generally shouldn't be interacting with external services (i.e. I
> think trying to do enrichment in SMT is probably a bad idea), the scope of
> possible transformations is reasonably limited and the transformations
> themselves tend to be small and easily maintainable. I think this is a
> dramatic difference from connectors, which are each substantial projects in
> their own right.
>
> While I get the slippery slope argument re: including specific
> transformations, I think we can come up with a reasonable policy (and via
> KIPs we can, as a community, come to an agreement based purely on taste if
> it comes down to that). In particular, I'd say keep the core general (i.e.
> no domain-specific transformations/parsing like HL7), pure data
> manipulation (i.e. no enrichment), and nothing that could just as well be
> done as a converter/serializer/deserializer/source connector/sink
> connector.
>
> I was very staunchly against including connectors (aside from a simple
> example) directly in Kafka, so this may seem like a reversal of position.
> But I think the % of use cases covered will look very different between
> connectors and transformations. Sure, some connectors are very popular, and
> moreso right now because they are the most thoroughly developed, tested,
> etc. But the top 3 most common transformations will probably be used across
> all the top 20 most popular connectors. I have no doubt people will end up
> writing custom ones (which is why it's nice to make them pluggable rather
> than choosing a fixed set), but they'll either be very niche (like people
> write custom connectors for their internal systems) or be more broadly
> applicable but very domain specific such that they are easy to reject for
> inclusion.
>
> @Gwen if we filtered the list of NiFi processors to ones that fit that
> criteria, would that still be too long a list for your taste? Similarly,
> let's say we were going to include some baked in; in that case, does
> anything look out of place to you in the list Shikhar has included in the
> KIP?
>
> -Ewen
>
> On Thu, Dec 15, 2016 at 2:01 PM, Gwen Shapira  wrote:
>
> > I agree about the ease of use in adding a small-subset of built-in
> > transformations.
> >
> > But the same thing is true for connectors - there are maybe 5 super
> popular
> > OSS connectors and the rest is a very long tail. We drew the line at not
> > adding any, because that's the easiest and because we did not want to turn
> > Kafka into a collection of transformations.
> >
> > I really don't want to end up with 135 (or even 20) transformations in
> > Kafka. So either we have a super-clear definition of what belongs and
> what
> > doesn't - or we put in one minimal example and the rest goes into the
> > ecosystem.
> >
> > We can also start by putting transformations on github and just see if
> > there is huge demand for them in Apache. It is easier to add stuff to the
> > project later than to remove functionality.
> >
> >
> >
> > On Thu, Dec 15, 2016 at 11:59 AM, Shikhar Bhushan 
> > wrote:
> >
> > > I have updated KIP-66
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 66%3A+Single+Message+Transforms+for+Kafka+Connect
> > > with
> > > the changes I proposed in the design.
> > >
> > > Gwen, I think the main downside to not including some transformations
> > with
> > > Kafka Connect is that it seems less user friendly if folks have to make
> > > sure to have the right transformation(s) on the classpath as well,
> > besides
> > > their connector(s). Additionally by going in with a small set included,
> > we
> > > can encourage a consistent configuration and implementation style and
> > > provide utilities for e.g. data transformations, which I expect we will
> > > definitely 
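The single message transforms being discussed here are per-record, side-effect-free functions. A simplified, hypothetical sketch of the shape such a transform takes (illustrative only, not Connect's actual SMT API; the `SimpleTransform` and `MaskField` names are invented for this example):

```java
import java.util.Map;

// Hypothetical, simplified stand-in for a Connect-style single message
// transform: a pure, per-record function with no external services involved.
interface SimpleTransform<R> {
    void configure(Map<String, String> config);
    R apply(R record);
}

// Example transform: mask the value of a configured field in a record
// represented here as a plain Map.
class MaskField implements SimpleTransform<Map<String, Object>> {
    private String field;

    @Override
    public void configure(Map<String, String> config) {
        this.field = config.get("field");
    }

    @Override
    public Map<String, Object> apply(Map<String, Object> record) {
        if (record.containsKey(field)) {
            record.put(field, "****");
        }
        return record;
    }

    public static void main(String[] args) {
        MaskField t = new MaskField();
        t.configure(Map.of("field", "ssn"));
        Map<String, Object> rec = new java.util.HashMap<>();
        rec.put("ssn", "123-45-6789");
        rec.put("name", "alice");
        System.out.println(t.apply(rec)); // prints the record with "ssn" masked
    }
}
```

Because such transforms are this small and purely data-in/data-out, they stay easy to review and maintain, which is the contrast with connectors drawn above.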

Re: [DISCUSS] KIP-95: Incremental Batch Processing for Kafka Streams

2016-12-15 Thread Matthias J. Sax
What is your wiki ID? We can grant you permission.

-Matthias

On 12/15/16 3:27 PM, Avi Flax wrote:
> 
>> On Dec 13, 2016, at 21:02, Matthias J. Sax  wrote:
>>
>> thanks for your feedback.
> 
> My pleasure!
> 
>> We want to enlarge the scope for Streams
>> application and started to collect use cases in the Wiki:
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Streams+Data+%28Re%29Processing+Scenarios
> 
> This page looks great, I love this sort of thing. Nice work.
> 
>> Feel free to add there via editing the page or writing a comment.
> 
> Thanks, but I don’t appear to have permissions to edit or comment on the page.
> 
> Perhaps you could paste in the two use cases I described as a new comment?
> 
> Thanks,
> Avi
> 





[jira] [Resolved] (KAFKA-4451) Recovering empty replica yields negative offsets in index of compact partitions

2016-12-15 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-4451.

   Resolution: Fixed
Fix Version/s: 0.10.2.0

Issue resolved by pull request 2210
[https://github.com/apache/kafka/pull/2210]

> Recovering empty replica yields negative offsets in index of compact 
> partitions
> ---
>
> Key: KAFKA-4451
> URL: https://issues.apache.org/jira/browse/KAFKA-4451
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Michael Schiff
>  Labels: reliability
> Fix For: 0.10.2.0
>
>
> When bringing up an empty broker, the partition for a compact topic is not 
> split into multiple log files. All data is written into a single log file, 
> causing the relative offsets stored in the index to overflow.
> A dump of the affected broker shortly after it started replicating:
> {code}
> michael.schiff@stats-kafka09:~$ sudo /opt/kafka/bin/kafka-run-class.sh 
> kafka.tools.DumpLogSegments --files 
> /kafka/attainment_event-0/.index | head -n 10
> Dumping /kafka/attainment_event-0/.index
> offset: 1022071124 position: 1037612
> offset: -1713432120 position: 1348740
> offset: -886291423 position: 2397130
> offset: -644750126 position: 3445630
> offset: -57889876 position: 4493972
> offset: 433950099 position: 5388461
> offset: 1071769472 position: 6436837
> offset: 1746859069 position: 7485367
> offset: 2090359736 position: 8533822
> ...
> {code}
> and the dump of the same log file from the leader of this partition
> {code}
> michael.schiff@stats-kafka12:~$ sudo /opt/kafka/bin/kafka-run-class.sh 
> kafka.tools.DumpLogSegments --files 
> /kafka/attainment_event-0/.index
> [sudo] password for michael.schiff:
> Dumping /kafka/attainment_event-0/.index
> offset: 353690666 position: 262054
> offset: 633140428 position: 523785
> offset: 756537951 position: 785815
> {code}
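The negative values are a symptom of 32-bit arithmetic: the offset index stores each entry's offset relative to the segment's base offset in a 4-byte field, so once all data lands in a single segment the relative offset can exceed Integer.MAX_VALUE and wrap negative. A minimal illustration of the wrap-around (not Kafka's actual index code):

```java
public class RelativeOffsetOverflow {
    // An offset index entry stores (absoluteOffset - baseOffset) in a 4-byte
    // int; casting the long difference down reproduces the overflow.
    static int relativeOffset(long baseOffset, long absoluteOffset) {
        return (int) (absoluteOffset - baseOffset);
    }

    public static void main(String[] args) {
        long base = 0L;
        // Fits in an int: stored correctly.
        System.out.println(relativeOffset(base, 1022071124L));  // 1022071124
        // Exceeds Integer.MAX_VALUE (2147483647): wraps negative, matching
        // the second entry in the dump above.
        System.out.println(relativeOffset(base, 2581535176L));  // -1713432120
    }
}
```

Splitting the log into multiple segments (so each segment gets its own base offset) keeps the relative offsets within int range.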



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2210: KAFKA-4451: Fix OffsetIndex overflow when replicat...

2016-12-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2210


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4451) Recovering empty replica yields negative offsets in index of compact partitions

2016-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15752877#comment-15752877
 ] 

ASF GitHub Bot commented on KAFKA-4451:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2210







[jira] [Commented] (KAFKA-4540) Suspended tasks that are not assigned to the StreamThread need to be closed before new active and standby tasks are created

2016-12-15 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15752887#comment-15752887
 ] 

Guozhang Wang commented on KAFKA-4540:
--

Thanks for the explanation. Makes sense.

> Suspended tasks that are not assigned to the StreamThread need to be closed 
> before new active and standby tasks are created
> ---
>
> Key: KAFKA-4540
> URL: https://issues.apache.org/jira/browse/KAFKA-4540
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Damian Guy
>Assignee: Damian Guy
> Fix For: 0.10.2.0
>
>
> When partition assignment happens we first try to add the active tasks and 
> then add the standby tasks. The problem with this is that a new active task 
> might already be an existing suspended standby task. If this is the case, then 
> when the active task initialises it will throw an exception from RocksDB:
> {{Caused by: org.rocksdb.RocksDBException: IO error: lock 
> /tmp/kafka-streams-7071/kafka-music-charts/1_1/rocksdb/all-songs/LOCK: No 
> locks available}}
> We need to make sure we have removed and closed any no-longer-assigned 
> suspended tasks before creating new tasks.
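The fix described above amounts to a close-then-create ordering on assignment. A rough sketch of that ordering, using illustrative stand-in types rather than the real StreamThread and task classes:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.IntFunction;

// Illustrative sketch of the ordering fix: on assignment, first close any
// suspended task that is no longer assigned (releasing e.g. its RocksDB
// lock), and only then resume or create the newly assigned tasks.
class TaskLifecycle {
    interface Task { void close(); }

    final Map<Integer, Task> suspended = new HashMap<>();
    final Map<Integer, Task> active = new HashMap<>();

    void onAssignment(Set<Integer> assignment, IntFunction<Task> factory) {
        // Step 1: close suspended tasks that were not re-assigned, so their
        // state-store locks are released before any new task tries to
        // acquire them.
        suspended.entrySet().removeIf(e -> {
            if (!assignment.contains(e.getKey())) {
                e.getValue().close();
                return true;
            }
            return false;
        });
        // Step 2: resume still-assigned suspended tasks, create the rest.
        for (int id : assignment) {
            Task resumed = suspended.remove(id);
            active.put(id, resumed != null ? resumed : factory.apply(id));
        }
    }
}
```

With the steps in the opposite order, step 2 can try to open a store whose lock is still held by a suspended task, which is the RocksDB error quoted above.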





[GitHub] kafka pull request #2258: MINOR: update KStream JavaDocs

2016-12-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2258




Jenkins build is back to normal : kafka-trunk-jdk8 #1106

2016-12-15 Thread Apache Jenkins Server
See 



[jira] [Commented] (KAFKA-4541) Add capability to create delegation token

2016-12-15 Thread Ashish K Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15752982#comment-15752982
 ] 

Ashish K Singh commented on KAFKA-4541:
---

Yes, it will.

> Add capability to create delegation token
> -
>
> Key: KAFKA-4541
> URL: https://issues.apache.org/jira/browse/KAFKA-4541
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ashish K Singh
>Assignee: Ashish K Singh
>
> Add request/ response and server side handling to create delegation tokens.





Re: [DISCUSS] KIP-66 Kafka Connect Transformers for messages

2016-12-15 Thread Shikhar Bhushan
There is no decision being proposed on the final list of transformations
that will ever be in Kafka :-) Just the initial set we should roll with.

On Thu, Dec 15, 2016 at 3:34 PM Gwen Shapira  wrote:

You are absolutely right that the vast majority of NiFi's processors are
not what we would consider SMT.

I went over the list and I think they still contain just short of 50 legit
SMTs:
https://cwiki.apache.org/confluence/display/KAFKA/Analyzing+NiFi+Transformations

You are right that ExtractHL7 is an extreme that clearly doesn't belong in
Apache Kafka, but just before that we have ExtractAvroMetadata that may
fit? and ExtractEmailHeaders doesn't sound totally outlandish either...

Nothing in the baked-in list by Shikhar looks out of place. I am concerned
about a slippery slope, or the arbitrariness of the decision if we say that
this list is final and nothing else will ever make it into Kafka.

Gwen

On Thu, Dec 15, 2016 at 3:00 PM, Ewen Cheslack-Postava 
wrote:

> I think there are a couple of factors that make transformations and
> connectors different.
>
> First, NiFi's 150 processors is a bit misleading. In NiFi, processors
cover
> data sources, data sinks, serialization/deserialization, *and*
> transformations. I haven't filtered the list to see how many fall into the
> first 3 categories, but it's a *lot* of the processors they have.
>
> Second, since transformations only apply to a single message and I'd think
> they generally shouldn't be interacting with external services (i.e. I
> think trying to do enrichment in SMT is probably a bad idea), the scope of
> possible transformations is reasonably limited and the transformations
> themselves tend to be small and easily maintainable. I think this is a
> dramatic difference from connectors, which are each substantial projects
in
> their own right.
>
> While I get the slippery slope argument re: including specific
> transformations, I think we can come up with a reasonable policy (and via
> KIPs we can, as a community, come to an agreement based purely on taste if
> it comes down to that). In particular, I'd say keep the core general (i.e.
> no domain-specific transformations/parsing like HL7), pure data
> manipulation (i.e. no enrichment), and nothing that could just as well be
> done as a converter/serializer/deserializer/source connector/sink
> connector.
>
> I was very staunchly against including connectors (aside from a simple
> example) directly in Kafka, so this may seem like a reversal of position.
> But I think the % of use cases covered will look very different between
> connectors and transformations. Sure, some connectors are very popular,
and
> moreso right now because they are the most thoroughly developed, tested,
> etc. But the top 3 most common transformations will probably be used
across
> all the top 20 most popular connectors. I have no doubt people will end up
> writing custom ones (which is why it's nice to make them pluggable rather
> than choosing a fixed set), but they'll either be very niche (like people
> write custom connectors for their internal systems) or be more broadly
> applicable but very domain specific such that they are easy to reject for
> inclusion.
>
> @Gwen if we filtered the list of NiFi processors to ones that fit that
> criteria, would that still be too long a list for your taste? Similarly,
> let's say we were going to include some baked in; in that case, does
> anything look out of place to you in the list Shikhar has included in the
> KIP?
>
> -Ewen
>
> On Thu, Dec 15, 2016 at 2:01 PM, Gwen Shapira  wrote:
>
> > I agree about the ease of use in adding a small-subset of built-in
> > transformations.
> >
> > But the same thing is true for connectors - there are maybe 5 super
> popular
> > OSS connectors and the rest is a very long tail. We drew the line at not
> > adding any, because that's the easiest and because we did not want to
turn
> > Kafka into a collection of transformations.
> >
> > I really don't want to end up with 135 (or even 20) transformations in
> > Kafka. So either we have a super-clear definition of what belongs and
> what
> > doesn't - or we put in one minimal example and the rest goes into the
> > ecosystem.
> >
> > We can also start by putting transformations on github and just see if
> > there is huge demand for them in Apache. It is easier to add stuff to
the
> > project later than to remove functionality.
> >
> >
> >
> > On Thu, Dec 15, 2016 at 11:59 AM, Shikhar Bhushan 
> > wrote:
> >
> > > I have updated KIP-66
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 66%3A+Single+Message+Transforms+for+Kafka+Connect
> > > with
> > > the changes I proposed in the design.
> > >
> > > Gwen, I think the main downside to not including some transformations
> > with
> > > Kafka Connect is that it seems less user friendly if folks have to
make
> > > sure to have the right transformation(s) on the classpath as well,
> > besides
> > > their connector(s). Additionally by going in with

[GitHub] kafka pull request #2254: KAFKA-4537: StreamPartitionAssignor incorrectly ad...

2016-12-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2254




[jira] [Commented] (KAFKA-4537) StreamPartitionAssignor incorrectly adds standby partitions to the partitionsByHostState map

2016-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15753018#comment-15753018
 ] 

ASF GitHub Bot commented on KAFKA-4537:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2254


> StreamPartitionAssignor incorrectly adds standby partitions to the 
> partitionsByHostState map
> 
>
> Key: KAFKA-4537
> URL: https://issues.apache.org/jira/browse/KAFKA-4537
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Damian Guy
>Assignee: Damian Guy
> Fix For: 0.10.2.0
>
>
> If a KafkaStreams app is using Standby Tasks then StreamPartitionAssignor 
> will add the standby partitions to the partitionsByHostState map for each 
> host. This is incorrect as the partitionHostState map is used to resolve 
> which host is hosting a particular store for a key. 
> The result is that doing metadata lookups for interactive queries can return 
> an incorrect host.





[jira] [Resolved] (KAFKA-4537) StreamPartitionAssignor incorrectly adds standby partitions to the partitionsByHostState map

2016-12-15 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-4537.
--
Resolution: Fixed

Issue resolved by pull request 2254
[https://github.com/apache/kafka/pull/2254]






[jira] [Updated] (KAFKA-4539) StreamThread is not correctly creating StandbyTasks

2016-12-15 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-4539:
-
Description: 
Fails because {{createStandbyTask(..)}} can return null if the topology for the 
{{TaskId}} doesn't have any state stores.

{code}
[2016-12-14 12:20:29,758] ERROR [StreamThread-1] User provided listener 
org.apache.kafka.streams.processor.internals.StreamThread$1 for group 
kafka-music-charts failed on partition assignment 
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
java.lang.NullPointerException
at 
org.apache.kafka.streams.processor.internals.StreamThread$StandbyTaskCreator.createTask(StreamThread.java:1241)
at 
org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1188)
at 
org.apache.kafka.streams.processor.internals.StreamThread.addStandbyTasks(StreamThread.java:915)
at 
org.apache.kafka.streams.processor.internals.StreamThread.access$400(StreamThread.java:72)
at 
org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:239)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:230)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:314)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:278)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:261)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1022)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:987)
at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:568)
at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:357)
{code}

Also fails because the checkpointedOffsets from the newly created 
{{StandbyTask}} aren't added to the offsets map, so the partitions don't get 
assigned. We then get:


  was:
Fails because {{createStandbyTask(..)}} can return null fi the topology for the 
{{TaskId}} doesn't have any state stores.

{code}
[2016-12-14 12:20:29,758] ERROR [StreamThread-1] User provided listener 
org.apache.kafka.streams.processor.internals.StreamThread$1 for group 
kafka-music-charts failed on partition assignment 
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
java.lang.NullPointerException
at 
org.apache.kafka.streams.processor.internals.StreamThread$StandbyTaskCreator.createTask(StreamThread.java:1241)
at 
org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1188)
at 
org.apache.kafka.streams.processor.internals.StreamThread.addStandbyTasks(StreamThread.java:915)
at 
org.apache.kafka.streams.processor.internals.StreamThread.access$400(StreamThread.java:72)
at 
org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:239)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:230)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:314)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:278)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:261)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1022)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:987)
at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:568)
at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:357)
{code}

Also fails because the checkpointedOffsets from the newly created 
{{StandbyTask}} aren't added to the offsets map, so the partitions don't get 
assigned. We then get:



> StreamThread is not correctly creating  StandbyTasks
> 
>
> Key: KAFKA-4539
> URL: https://issues.apache.org/jira/browse/KAFKA-4539
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Damian Guy
>Assignee: Damian Guy
> Fix For: 0.10.2.0
>
>
> Fails because {{createStandbyTask(..)}} can return null if the topology for 
> the {{TaskId}} doesn't have any state stores.
> {code}
> [2016-12-14 12:20:29,758] ERROR [StreamThread-1] User provided listener 
> org.apache.kafka.streams.processo

Re: [DISCUSS] KIP-102 - Add close with timeout for consumers

2016-12-15 Thread Becket Qin
+1 on the idea. We have a ticket about making all the blocking calls have a
timeout in KafkaConsumer. The implementation could be a little tricky, as
Ewen mentioned. But for close it is probably a simpler case, because in the
worst case the consumer will just stop polling and heartbeating and
eventually get kicked out of the group. It is not ideal but maybe less
worrisome.

On Thu, Dec 15, 2016 at 3:08 PM, Rajini Sivaram  wrote:

> Ewen,
>
> Thank you, I will try to prototype a solution early next week to get a
> better understanding of how invasive the changes are.
>
>
>
> On Thu, Dec 15, 2016 at 9:35 PM, Ewen Cheslack-Postava 
> wrote:
>
> > Rajini,
> >
> > Thanks for this KIP, I'd definitely like to see this. Connect has had a
> > long-standing TODO around stopping sink tasks where we can't properly
> > manage the rebalance process (which involves stopping consumers) because
> we
> > lack a timeout here. Not a huge problem in practice, but would be nice to
> > fix. And I know I've seen at least a half dozen requests for this (and
> > other timeouts to be respected in the consumer) on the mailing list.
> >
> > My only concern is that this could be a pretty substantial change to the
> > consumer code. We've had TODO items in the code since Jay wrote the first
> > version that say something about avoiding infinite looping and waiting
> > indefinitely on group membership. If we implement this, I'd hope to get
> it
> > committed in the early part of the release cycle so we have plenty of
> time
> > to stabilize + bug fix.
> >
> > -Ewen
> >
> > On Thu, Dec 15, 2016 at 5:32 AM, Rajini Sivaram 
> > wrote:
> >
> > > Hi all,
> > >
> > > I have just created KIP-102 to add a new close method for consumers
> with
> > a
> > > timeout parameter, making Consumer consistent with Producer:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 102+-+Add+close+with+timeout+for+consumers
> > >
> > > Comments and suggestions are welcome.
> > >
> > > Thank you...
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> >
>
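The bounded close being discussed reduces to a deadline loop around the graceful shutdown steps (commit offsets, leave the group), falling back to a forced close when the deadline passes. A sketch of that pattern with hypothetical stand-in names, not the actual KafkaConsumer internals:

```java
import java.util.concurrent.TimeUnit;

// Sketch of the close-with-timeout pattern: attempt the graceful path, but
// give up after a deadline rather than blocking forever. All names here are
// illustrative stand-ins for this example.
class BoundedClose {
    interface GracefulTask {
        boolean tryComplete(); // one attempt at e.g. committing / leaving the group
    }

    // Returns true if the graceful path completed within the timeout,
    // false if the deadline was hit and the caller must force-close.
    static boolean closeWithTimeout(GracefulTask task, long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        // Busy-wait for brevity; a real implementation would drive network
        // I/O inside this loop instead of spinning.
        while (System.nanoTime() < deadline) {
            if (task.tryComplete()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(closeWithTimeout(() -> true, 1, TimeUnit.SECONDS));        // true
        System.out.println(closeWithTimeout(() -> false, 50, TimeUnit.MILLISECONDS)); // false
    }
}
```

In the worst case Becket describes, the forced path simply stops polling and heartbeating, and the group's session timeout eventually evicts the member.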


[jira] [Resolved] (KAFKA-4521) MirrorMaker should flush all messages before releasing partition ownership during rebalance

2016-12-15 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin resolved KAFKA-4521.
-
Resolution: Fixed

> MirrorMaker should flush all messages before releasing partition ownership 
> during rebalance
> ---
>
> Key: KAFKA-4521
> URL: https://issues.apache.org/jira/browse/KAFKA-4521
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.0
>Reporter: Dong Lin
>Assignee: Dong Lin
> Fix For: 0.10.2.0
>
>
> In order to ensure that messages from a given partition in the source cluster 
> are mirrored to the same partition in the destination cluster in the *same* 
> order, MirrorMaker needs to produce and flush all messages that its consumer 
> has received from the source cluster before giving up ownership of the 
> partition. However, as of the current implementation of Apache Kafka, this is 
> not guaranteed and will cause out-of-order message delivery in the following 
> scenario.
> - mirror maker process 1 fetches messages 1, 2, 3, from source cluster in one 
> FetchRequest
> - mirror maker process 2 starts up and triggers rebalance.
> - `InternalRebalanceListenerForOldConsumer` is executed by zookeeper listener 
> thread, which does producer.flush() and commit offset of this consumer. 
> However, at this moment messages 1, 2, 3 haven't even been produced.
> - consumer of mirror maker process 1 releases ownership of this partition
> - consumer of mirror maker process 2 gets ownership of this partition.
> - mirror maker process 2 fetches messages 4, 5, 6 from source cluster.
> - messages 4, 5, 6 can be produced before messages 1, 2, 3. 
> To fix this problem, the rebalance listener callback function should signal 
> MirrorMakerThread to get all messages from consumer, produce these messages 
> to destination cluster, flush producer, and commit offset. Rebalance listener 
> callback function should wait for MirrorMakerThread to finish these steps 
> before it allows ownership of this partition to be released.
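The proposed sequencing in the rebalance listener, drain, produce, flush, then commit, can be sketched with illustrative stand-in types (not MirrorMaker's real classes):

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Sketch of the ordering fix described above: before a partition's ownership
// is released, drain every fetched-but-unproduced record, flush the producer,
// and only then commit offsets. All types here are illustrative stand-ins.
class DrainBeforeRevoke {
    interface Producer {
        void send(String record);
        void flush();
    }

    private final Queue<String> pending = new ArrayDeque<>();
    private final Producer producer;
    private long producedCount = 0;
    private long committedCount = 0;

    DrainBeforeRevoke(Producer producer) { this.producer = producer; }

    void onFetch(String record) { pending.add(record); }

    // Called from the rebalance callback before giving up the partition.
    void onPartitionsRevoked() {
        String r;
        while ((r = pending.poll()) != null) { // step 1: produce everything fetched
            producer.send(r);
            producedCount++;
        }
        producer.flush();                      // step 2: flush to the destination
        committedCount = producedCount;        // step 3: commit only flushed offsets
    }

    long committed() { return committedCount; }
}
```

Flushing before the commit is what rules out the scenario above, where offsets for messages 1, 2, 3 are committed before those messages have even been produced.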





[jira] [Updated] (KAFKA-4521) MirrorMaker should flush all messages before releasing partition ownership during rebalance

2016-12-15 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin updated KAFKA-4521:

Affects Version/s: 0.10.1.0
Fix Version/s: 0.10.2.0






[jira] [Created] (KAFKA-4549) KafkaLZ4OutputStream outputs an invalid format when close is called without flush

2016-12-15 Thread MURAKAMI Masahiko (JIRA)
MURAKAMI Masahiko created KAFKA-4549:


 Summary: KafkaLZ4OutputStream outputs an invalid format when its 
close method is called without the flush method
 Key: KAFKA-4549
 URL: https://issues.apache.org/jira/browse/KAFKA-4549
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 0.10.0.1
 Environment: java version "1.8.0_74"
Java(TM) SE Runtime Environment (build 1.8.0_74-b02)
Java HotSpot(TM) 64-Bit Server VM (build 25.74-b02, mixed mode)
Reporter: MURAKAMI Masahiko


When KafkaLZ4OutputStream's close method is called without the flush method, 
writeEndMark is not called, so it outputs an invalid format.
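The fix implied here is for close() to guarantee the frame's end mark even when flush() was never called (in the LZ4 frame format, the end mark is a zeroed 32-bit block size). A sketch of that pattern with a generic framed stream, illustrative only and not the actual KafkaLZ4OutputStream code:

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Sketch: a framed stream whose close() appends the frame's end mark even
// when flush() was never called, so the output is always a valid frame.
class FramedOutputStream extends FilterOutputStream {
    // The LZ4 frame format terminates with a 4-byte end mark of zeros.
    private static final byte[] END_MARK = new byte[] {0, 0, 0, 0};
    private boolean finished = false;

    FramedOutputStream(OutputStream out) {
        super(out);
    }

    private void writeEndMark() throws IOException {
        if (!finished) {
            out.write(END_MARK);
            finished = true;
        }
    }

    @Override
    public void close() throws IOException {
        try {
            flush();
            writeEndMark(); // guarantee a valid frame without an explicit flush()
        } finally {
            out.close();
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        FramedOutputStream s = new FramedOutputStream(buf);
        s.write(42);
        s.close(); // no flush() call; end mark is still appended
        System.out.println(buf.toByteArray().length); // 5: one data byte + 4-byte end mark
    }
}
```

The key point is that the end-mark write lives in close() rather than only on the flush() path.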





Re: [DISCUSS] KIP-82 - Add Record Headers

2016-12-15 Thread Jun Rao
Hi, Michael,

Thanks for the response.

100. Is there any other metadata associated with the uuid that APM sends to
the central coordinator? What kind of things could you do once the tracing
is embedded in each message?

103. How do you preserve the per key ordering when switching to a different
DC at IG? Are you doing 2-way mirroring?

105. Got it. So, you don't need to use headers for encryption itself. But
if there is another use case for headers, it's hard to put that info into
the encrypted payload.

106. Embedding all metadata instead of just the producer id per message is
likely more verbose, right?  Similarly, in 100 above, only a uuid is
embedded in each message.

107. Yes, this kind of UUID is proposed in KIP-98 for deduping.

Jun

On Thu, Dec 8, 2016 at 12:12 AM, Michael Pearce 
wrote:

> Hi Jun
>
> 100) each time a transaction exits a jvm for a remote system (HTTP/JMS/
> Hopefully one day kafka) the APM tools stitch in a unique id (though I
> believe it contains the end2end uuid embedded in this id), on receiving the
> message at the receiving JVM the apm code takes this out, and continues its
> tracing on that new thread. Both JVMs (and other languages the APM
> tool supports) send this data async back to the central controllers where
> the stitching together occurs. For this they need some header space for
> them to put this id.
>
> 101) Yes indeed we have a business transaction Id in the payload. Though
> this is a system level tracing, that we need to have marry up. Also as per
> note on end2end encryption we’d be unable to prove the flow if the payload
> is encrypted as we’d not have access to this at certain points of the flow
> through the infrastructure/platform.
>
>
> 103) As said we use this mechanism in IG very successfully, as stated per
> key we guarantee the transaction producing app to handle the transaction of
> a key at one DC unless at point of critical failure where we have to flip
> processing to another. We care about key ordering.
> I disagree on the offset comment for the partition solution: unless you do
> full ISR, or expensive full XA transactions, even with partitions you cannot
> fully guarantee offsets would match.
>
> 105) Very much so, I need to have access at the platform level to the
> other meta data all mentioned, without having to need to have access to the
> encryption keys of the payload.
>
> 106)
> Technically yes for AZ/Region/Cluster, but then we’d need to have a global
> producerId register which would be very hard to enforce/ensure is current
> and correct, just to understand the message origins of its
> region/az/cluster for routing.
> The client wrapper version, producerId can be the same, as obviously the
> producer could upgrade its wrapper, as such we need to know what wrapper
> version the message is created with.
> Likewise the IP address, as stated we can have our producer move, where
> its IP would change.
>
> 107)
> UUID is set on the message by interceptors before actual producer
> transport send. This is for platform level message dedupe guarantee, the
> business payload should be agnostic to this. Please see
> https://activemq.apache.org/artemis/docs/1.5.0/duplicate-detection.html
> note this is not touching business payloads.
>
>
>
> On 06/12/2016, 18:22, "Jun Rao"  wrote:
>
> Hi, Michael,
>
> Thanks for the reply. I find it very helpful.
>
> Data lineage:
> 100. I'd like to understand the APM use case a bit more. It sounds like
> that those APM plugins can generate a transaction id that we could
> potentially put in the header of every message. How would you typically
> make use of such transaction ids? Are there other metadata associated
> with
> the transaction id and if so, how are they propagated downstream?
>
> 101. For the finance use case, if the concept of transaction is
> important,
> wouldn't it be typically included in the message payload instead of as
> an
> optional header field?
>
> 102. The data lineage that Atlas and Navigator support seems to be at
> the
> dataset level, not per record level? So, not sure if per message
> headers
> are relevant there.
>
> Mirroring:
> 103. The benefit of using separate partitions is that it potentially
> makes
> it easy to preserve offsets during mirroring. This will make it easier
> for
> consumer to switch clusters. Currently, the consumers can switch
> clusters
> by using the timestampToOffset() api, but it has to deal with
> duplicates.
> Good point on the issue with log compact and I am not sure how to
> address
> this. However, even if we mirror into the existing partitions, the
> ordering
> for messages generated from different clusters seems non-deterministic
> anyway. So, it seems that the consumers already have to deal with
> that? If
> a topic is compacted, does that mean which messages are preserved is
> also
> non-deterministic across clusters?
>
> 104. Good point on partit

[jira] [Resolved] (KAFKA-4536) Kafka clients throw NullPointerException on poll when delete the relative topic

2016-12-15 Thread mayi_hetu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mayi_hetu resolved KAFKA-4536.
--
   Resolution: Fixed
Fix Version/s: 0.10.1.0

> Kafka clients throw NullPointerException on poll when delete the relative 
> topic
> ---
>
> Key: KAFKA-4536
> URL: https://issues.apache.org/jira/browse/KAFKA-4536
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.0.1
>Reporter: mayi_hetu
> Fix For: 0.10.1.0
>
>
> 1. new KafkaConsumer
>   val groupIdString = "test1"  
>   val props = new Properties();
>   props.put("bootstrap.servers", "99.12.143.240:9093");
>   props.put("group.id", groupIdString);
>   props.put("enable.auto.commit", "false");
>   props.put("auto.offset.reset","earliest");
>   props.put("auto.commit.interval.ms", "5000");
>   props.put("metadata.max.age.ms","30");
>   props.put("session.timeout.ms", "3");
> props.setProperty("key.deserializer", 
> classOf[ByteArrayDeserializer].getName)
> props.setProperty("value.deserializer", 
> classOf[ByteArrayDeserializer].getName)
>   props.setProperty("client.id", groupIdString)
>   new KafkaConsumer[Array[Byte], Array[Byte]](props)
> 2. *subscribe topic through Pattern*
> consumer.subscribe(Pattern.compile(".*\\.sh$"), consumerRebalanceListener)
> 3. use poll(1000) fetching messages
> 4. delete topic test1.sh in Kafka broker
> then the consumer throws NullPointerException
> {color:red}
> Exception in thread "main" java.lang.NullPointerException
>   at java.util.ArrayList.addAll(Unknown Source)
>   at 
> org.apache.kafka.clients.Metadata.getClusterForCurrentTopics(Metadata.java:271)
>   at org.apache.kafka.clients.Metadata.update(Metadata.java:185)
>   at 
> org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.handleResponse(NetworkClient.java:606)
>   at 
> org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.maybeHandleCompletedReceive(NetworkClient.java:583)
>   at 
> org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:450)
>   at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:269)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:201)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:998)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:937)
>   at 
> TestNewConsumer$MyMirrorMakerNewConsumer.receive(TestNewConsumer.scala:146)
>   at TestNewConsumer$.main(TestNewConsumer.scala:38)
>   at TestNewConsumer.main(TestNewConsumer.scala)
> {color}





[jira] [Closed] (KAFKA-4536) Kafka clients throw NullPointerException on poll when delete the relative topic

2016-12-15 Thread mayi_hetu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mayi_hetu closed KAFKA-4536.


> Kafka clients throw NullPointerException on poll when delete the relative 
> topic
> ---
>
> Key: KAFKA-4536
> URL: https://issues.apache.org/jira/browse/KAFKA-4536
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.0.1
>Reporter: mayi_hetu
> Fix For: 0.10.1.0
>
>
> 1. new KafkaConsumer
>   val groupIdString = "test1"  
>   val props = new Properties();
>   props.put("bootstrap.servers", "99.12.143.240:9093");
>   props.put("group.id", groupIdString);
>   props.put("enable.auto.commit", "false");
>   props.put("auto.offset.reset","earliest");
>   props.put("auto.commit.interval.ms", "5000");
>   props.put("metadata.max.age.ms","30");
>   props.put("session.timeout.ms", "3");
> props.setProperty("key.deserializer", 
> classOf[ByteArrayDeserializer].getName)
> props.setProperty("value.deserializer", 
> classOf[ByteArrayDeserializer].getName)
>   props.setProperty("client.id", groupIdString)
>   new KafkaConsumer[Array[Byte], Array[Byte]](props)
> 2. *subscribe topic through Pattern*
> consumer.subscribe(Pattern.compile(".*\\.sh$"), consumerRebalanceListener)
> 3. use poll(1000) fetching messages
> 4. delete topic test1.sh in Kafka broker
> then the consumer throws NullPointerException
> {color:red}
> Exception in thread "main" java.lang.NullPointerException
>   at java.util.ArrayList.addAll(Unknown Source)
>   at 
> org.apache.kafka.clients.Metadata.getClusterForCurrentTopics(Metadata.java:271)
>   at org.apache.kafka.clients.Metadata.update(Metadata.java:185)
>   at 
> org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.handleResponse(NetworkClient.java:606)
>   at 
> org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.maybeHandleCompletedReceive(NetworkClient.java:583)
>   at 
> org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:450)
>   at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:269)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:201)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:998)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:937)
>   at 
> TestNewConsumer$MyMirrorMakerNewConsumer.receive(TestNewConsumer.scala:146)
>   at TestNewConsumer$.main(TestNewConsumer.scala:38)
>   at TestNewConsumer.main(TestNewConsumer.scala)
> {color}





[jira] [Created] (KAFKA-4550) current trunk unstable

2016-12-15 Thread radai rosenblatt (JIRA)
radai rosenblatt created KAFKA-4550:
---

 Summary: current trunk unstable
 Key: KAFKA-4550
 URL: https://issues.apache.org/jira/browse/KAFKA-4550
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.10.2.0
Reporter: radai rosenblatt
 Attachments: run1.log, run2.log, run3.log, run4.log, run5.log

on latest trunk (commit hash 908b6d1148df963d21a70aaa73a7a87571b965a9)
when running the exact same build 5 times, I get:

3 failures (on 3 separate runs):
   kafka.api.SslProducerSendTest > testFlush FAILED java.lang.AssertionError: 
No request is complete
   org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[1] FAILED java.lang.AssertionError: Condition not met within 
timeout 6. Did not receive 1 number of records
   kafka.producer.ProducerTest > testAsyncSendCanCorrectlyFailWithTimeout 
FAILED java.lang.AssertionError: Message set should have 1 message
1 success
1 stall





[jira] [Updated] (KAFKA-4550) current trunk unstable

2016-12-15 Thread radai rosenblatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radai rosenblatt updated KAFKA-4550:

Attachment: run5.log
run4.log
run3.log
run2.log
run1.log

> current trunk unstable
> --
>
> Key: KAFKA-4550
> URL: https://issues.apache.org/jira/browse/KAFKA-4550
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0
>Reporter: radai rosenblatt
> Attachments: run1.log, run2.log, run3.log, run4.log, run5.log
>
>
> on latest trunk (commit hash 908b6d1148df963d21a70aaa73a7a87571b965a9)
> when running the exact same build 5 times, I get:
> 3 failures (on 3 separate runs):
>kafka.api.SslProducerSendTest > testFlush FAILED java.lang.AssertionError: 
> No request is complete
>org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
> queryOnRebalance[1] FAILEDjava.lang.AssertionError: Condition not met within 
> timeout 6. Did not receive 1 number of records
>kafka.producer.ProducerTest > testAsyncSendCanCorrectlyFailWithTimeout 
> FAILED java.lang.AssertionError: Message set should have 1 message
> 1 success
> 1 stall





[GitHub] kafka pull request #2255: KAFKA-4539: StreamThread is not correctly creating...

2016-12-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2255


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4539) StreamThread is not correctly creating StandbyTasks

2016-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15753082#comment-15753082
 ] 

ASF GitHub Bot commented on KAFKA-4539:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2255


> StreamThread is not correctly creating  StandbyTasks
> 
>
> Key: KAFKA-4539
> URL: https://issues.apache.org/jira/browse/KAFKA-4539
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Damian Guy
>Assignee: Damian Guy
> Fix For: 0.10.2.0
>
>
> Fails because {{createStandbyTask(..)}} can return null if the topology for 
> the {{TaskId}} doesn't have any state stores.
> {code}
> [2016-12-14 12:20:29,758] ERROR [StreamThread-1] User provided listener 
> org.apache.kafka.streams.processor.internals.StreamThread$1 for group 
> kafka-music-charts failed on partition assignment 
> (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
> java.lang.NullPointerException
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$StandbyTaskCreator.createTask(StreamThread.java:1241)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1188)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStandbyTasks(StreamThread.java:915)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$400(StreamThread.java:72)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:239)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:230)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:314)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:278)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:261)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1022)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:987)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:568)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:357)
> {code}
> Also fails because the checkpointedOffsets from the newly created 
> {{StandbyTask}} aren't added to the offsets map, so the partitions don't get 
> assigned. We then get:





[jira] [Updated] (KAFKA-4550) current trunk unstable

2016-12-15 Thread radai rosenblatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radai rosenblatt updated KAFKA-4550:

Description: 
on latest trunk (commit hash 908b6d1148df963d21a70aaa73a7a87571b965a9)
when running the exact same build 5 times, I get:

3 failures (on 3 separate runs):
   kafka.api.SslProducerSendTest > testFlush FAILED java.lang.AssertionError: 
No request is complete
   org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[1] FAILED java.lang.AssertionError: Condition not met within 
timeout 6. Did not receive 1 number of records
   kafka.producer.ProducerTest > testAsyncSendCanCorrectlyFailWithTimeout 
FAILED java.lang.AssertionError: Message set should have 1 message
1 success
1 stall (build hung)

  was:
on latest trunk (commit hash 908b6d1148df963d21a70aaa73a7a87571b965a9)
when running the exact same build 5 times, I get:

3 failures (on 3 separate runs):
   kafka.api.SslProducerSendTest > testFlush FAILED java.lang.AssertionError: 
No request is complete
   org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[1] FAILEDjava.lang.AssertionError: Condition not met within 
timeout 6. Did not receive 1 number of records
   kafka.producer.ProducerTest > testAsyncSendCanCorrectlyFailWithTimeout 
FAILED java.lang.AssertionError: Message set should have 1 message
1 success
1 stall


> current trunk unstable
> --
>
> Key: KAFKA-4550
> URL: https://issues.apache.org/jira/browse/KAFKA-4550
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0
>Reporter: radai rosenblatt
> Attachments: run1.log, run2.log, run3.log, run4.log, run5.log
>
>
> on latest trunk (commit hash 908b6d1148df963d21a70aaa73a7a87571b965a9)
> when running the exact same build 5 times, I get:
> 3 failures (on 3 separate runs):
>kafka.api.SslProducerSendTest > testFlush FAILED java.lang.AssertionError: 
> No request is complete
>org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
> queryOnRebalance[1] FAILED java.lang.AssertionError: Condition not met within 
> timeout 6. Did not receive 1 number of records
>kafka.producer.ProducerTest > testAsyncSendCanCorrectlyFailWithTimeout 
> FAILED java.lang.AssertionError: Message set should have 1 message
> 1 success
> 1 stall (build hung)





[jira] [Updated] (KAFKA-4539) StreamThread is not correctly creating StandbyTasks

2016-12-15 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-4539:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Issue resolved by pull request 2255
[https://github.com/apache/kafka/pull/2255]

> StreamThread is not correctly creating  StandbyTasks
> 
>
> Key: KAFKA-4539
> URL: https://issues.apache.org/jira/browse/KAFKA-4539
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Damian Guy
>Assignee: Damian Guy
> Fix For: 0.10.2.0
>
>
> Fails because {{createStandbyTask(..)}} can return null if the topology for 
> the {{TaskId}} doesn't have any state stores.
> {code}
> [2016-12-14 12:20:29,758] ERROR [StreamThread-1] User provided listener 
> org.apache.kafka.streams.processor.internals.StreamThread$1 for group 
> kafka-music-charts failed on partition assignment 
> (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
> java.lang.NullPointerException
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$StandbyTaskCreator.createTask(StreamThread.java:1241)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1188)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStandbyTasks(StreamThread.java:915)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$400(StreamThread.java:72)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:239)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:230)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:314)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:278)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:261)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1022)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:987)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:568)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:357)
> {code}
> Also fails because the checkpointedOffsets from the newly created 
> {{StandbyTask}} aren't added to the offsets map, so the partitions don't get 
> assigned. We then get:





[GitHub] kafka-site issue #34: Fix typo on introduction page

2016-12-15 Thread guozhangwang
Github user guozhangwang commented on the issue:

https://github.com/apache/kafka-site/pull/34
  
Could you close this PR then?




[GitHub] kafka pull request #2259: MINOR: Fix typo on introduction page

2016-12-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2259




[jira] [Updated] (KAFKA-4550) current trunk unstable

2016-12-15 Thread radai rosenblatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radai rosenblatt updated KAFKA-4550:

Issue Type: Sub-task  (was: Bug)
Parent: KAFKA-2054

> current trunk unstable
> --
>
> Key: KAFKA-4550
> URL: https://issues.apache.org/jira/browse/KAFKA-4550
> Project: Kafka
>  Issue Type: Sub-task
>Affects Versions: 0.10.2.0
>Reporter: radai rosenblatt
> Attachments: run1.log, run2.log, run3.log, run4.log, run5.log
>
>
> on latest trunk (commit hash 908b6d1148df963d21a70aaa73a7a87571b965a9)
> when running the exact same build 5 times, I get:
> 3 failures (on 3 separate runs):
>kafka.api.SslProducerSendTest > testFlush FAILED java.lang.AssertionError: 
> No request is complete
>org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
> queryOnRebalance[1] FAILED java.lang.AssertionError: Condition not met within 
> timeout 6. Did not receive 1 number of records
>kafka.producer.ProducerTest > testAsyncSendCanCorrectlyFailWithTimeout 
> FAILED java.lang.AssertionError: Message set should have 1 message
> 1 success
> 1 stall (build hung)





[VOTE] KIP-92 - Add per partition lag metrics to KafkaConsumer

2016-12-15 Thread Becket Qin
Hi,

I want to start a voting thread on KIP-92 which proposes to add per
partition lag metrics to KafkaConsumer. The KIP wiki page is below:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-92+-+Add+per+partition+lag+metrics+to+KafkaConsumer

Thanks,

Jiangjie (Becket) Qin


[jira] [Work started] (KAFKA-4507) The client should send older versions of requests to the broker if necessary

2016-12-15 Thread Colin P. McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on KAFKA-4507 started by Colin P. McCabe.
--
> The client should send older versions of requests to the broker if necessary
> 
>
> Key: KAFKA-4507
> URL: https://issues.apache.org/jira/browse/KAFKA-4507
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>
> The client should send older versions of requests to the broker if necessary.





[GitHub] kafka pull request #2264: Kafka 4507: The client should send older versions ...

2016-12-15 Thread cmccabe
GitHub user cmccabe opened a pull request:

https://github.com/apache/kafka/pull/2264

Kafka 4507: The client should send older versions of requests to the broker 
if necessary

KAFKA-4507

The client should send older versions of requests to the broker if 
necessary.

Note: This builds on top of KAFKA-3600, which has not yet been committed 
yet.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cmccabe/kafka KAFKA-4507

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2264.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2264


commit 1c6d5ceda7a96213db8f65f9b99b7e1f02d7c980
Author: Ashish Singh 
Date:   2016-04-21T21:02:32Z

KAFKA-3600 (not committed yet)

commit 2c2510ddaea5d22493f95b4ef9f1abf05f60c422
Author: Colin P. Mccabe 
Date:   2016-12-16T01:50:57Z

KAFKA-4507. The client should send older versions of requests to the broker 
if necessary






[jira] [Commented] (KAFKA-4507) The client should send older versions of requests to the broker if necessary

2016-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15753134#comment-15753134
 ] 

ASF GitHub Bot commented on KAFKA-4507:
---

GitHub user cmccabe opened a pull request:

https://github.com/apache/kafka/pull/2264

Kafka 4507: The client should send older versions of requests to the broker 
if necessary

KAFKA-4507

The client should send older versions of requests to the broker if 
necessary.

Note: This builds on top of KAFKA-3600, which has not yet been committed 
yet.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cmccabe/kafka KAFKA-4507

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2264.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2264


commit 1c6d5ceda7a96213db8f65f9b99b7e1f02d7c980
Author: Ashish Singh 
Date:   2016-04-21T21:02:32Z

KAFKA-3600 (not committed yet)

commit 2c2510ddaea5d22493f95b4ef9f1abf05f60c422
Author: Colin P. Mccabe 
Date:   2016-12-16T01:50:57Z

KAFKA-4507. The client should send older versions of requests to the broker 
if necessary




> The client should send older versions of requests to the broker if necessary
> 
>
> Key: KAFKA-4507
> URL: https://issues.apache.org/jira/browse/KAFKA-4507
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>
> The client should send older versions of requests to the broker if necessary.





Re: [DISCUSS] KIP-82 - Add Record Headers

2016-12-15 Thread radai
@Matthias - oh.

I think over the course of this thread enough use cases have been presented
for things that can be done/solved with headers that, even if every single
potential use case has a better custom implementation (which I don't
believe), headers are clearly one of the best possible Kafka modifications
in terms of "bang for your buck"/ROI.

On Thu, Dec 15, 2016 at 5:08 PM, Jun Rao  wrote:

> Hi, Michael,
>
> Thanks for the response.
>
> 100. Is there any other metadata associated with the uuid that APM sends to
> the central coordinator? What kind of things could you do once the tracing
> is embedded in each message?
>
> 103. How do you preserve the per key ordering when switching to a different
> DC at IG? Are you doing 2-way mirroring?
>
> 105. Got it. So, you don't need to use headers for encryption itself. But
> if there is another use case for headers, it's hard to put that info into
> the encrypted payload.
>
> 106. Embedding all metadata instead of just the producer id per message is
> likely more verbose, right?  Similarly, in 100 above, only a uuid is
> embedded in each message.
>
> 107. Yes, this kind of UUID is proposed in KIP-98 for deduping.
>
> Jun
>
> On Thu, Dec 8, 2016 at 12:12 AM, Michael Pearce 
> wrote:
>
> > Hi Jun
> >
> > 100) each time a transaction exits a jvm for a remote system (HTTP/JMS/
> > Hopefully one day kafka) the APM tools stitch in a unique id (though I
> > believe it contains the end2end uuid embedded in this id), on receiving
> the
> > message at the receiving JVM the apm code takes this out, and continues
> its
> > tracing on that new thread. Both JVMs (and other languages the APM
> > tool supports) send this data async back to the central controllers where
> > the stitching together occurs. For this they need some header space for
> > them to put this id.
> >
> > 101) Yes indeed we have a business transaction Id in the payload. Though
> > this is a system level tracing, that we need to have marry up. Also as
> per
> > note on end2end encryption we’d be unable to prove the flow if the
> payload
> > is encrypted as we’d not have access to this at certain points of the
> flow
> > through the infrastructure/platform.
> >
> >
> > 103) As said we use this mechanism in IG very successfully, as stated per
> > key we guarantee the transaction producing app to handle the transaction
> of
> > a key at one DC unless at point of critical failure where we have to flip
> > processing to another. We care about key ordering.
> > I disagree on the offset comment for the partition solution unless you do
> > full ISR, or expensive full XA transactions even with partitions you
> cannot
> > fully guarantee offsets would match.
> >
> > 105) Very much so, I need to have access at the platform level to the
> > other meta data all mentioned, without having to need to have access to
> the
> > encryption keys of the payload.
> >
> > 106)
> > Technically yes for AZ/Region/Cluster, but then we’d need to have a
> global
> > producerId register which would be very hard to enforce/ensure is current
> > and correct, just to understand the message origins of its
> > region/az/cluster for routing.
> > The client wrapper version, producerId can be the same, as obviously the
> > producer could upgrade its wrapper, as such we need to know what wrapper
> > version the message is created with.
> > Likewise the IP address, as stated we can have our producer move, where
> > its IP would change.
> >
> > 107)
> > The UUID is set on the message by interceptors before the actual producer
> > transport send. This is for a platform-level message dedupe guarantee; the
> > business payload should be agnostic to this. Please see
> > https://activemq.apache.org/artemis/docs/1.5.0/duplicate-detection.html
> > and note this does not touch business payloads.
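The Artemis-style duplicate detection linked above boils down to a bounded cache of recently seen message ids. A minimal, hypothetical sketch of that idea (all names here are illustrative, not from any Kafka or Artemis API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of Artemis-style duplicate-ID detection: a bounded,
// LRU-evicting cache of recently seen message ids. An interceptor would set
// the id; the receiving side would consult a cache like this one.
public class DuplicateIdCache {
    private final Map<String, Boolean> seen;

    public DuplicateIdCache(final int capacity) {
        // access-order LinkedHashMap that evicts its eldest entry: a simple LRU
        this.seen = new LinkedHashMap<String, Boolean>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> e) {
                return size() > capacity;
            }
        };
    }

    /** Returns true if this id was already seen, i.e. the message is a duplicate. */
    public synchronized boolean isDuplicate(String messageId) {
        return seen.put(messageId, Boolean.TRUE) != null;
    }

    public static void main(String[] args) {
        DuplicateIdCache cache = new DuplicateIdCache(2);
        System.out.println(cache.isDuplicate("uuid-1")); // false: first sighting
        System.out.println(cache.isDuplicate("uuid-1")); // true: duplicate
        cache.isDuplicate("uuid-2");
        cache.isDuplicate("uuid-3");                     // evicts uuid-1
        System.out.println(cache.isDuplicate("uuid-1")); // false: evicted, seen "fresh"
    }
}
```

The eviction in the last step shows why the dedupe window is only a guarantee within the cache's capacity, which is why Artemis makes the cache size configurable.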
> >
> >
> >
> > On 06/12/2016, 18:22, "Jun Rao"  wrote:
> >
> > Hi, Michael,
> >
> > Thanks for the reply. I find it very helpful.
> >
> > Data lineage:
> > 100. I'd like to understand the APM use case a bit more. It sounds like
> > those APM plugins can generate a transaction id that we could potentially
> > put in the header of every message. How would you typically make use of
> > such transaction ids? Are there other metadata associated with the
> > transaction id and, if so, how are they propagated downstream?
> >
> > 101. For the finance use case, if the concept of transaction is
> > important, wouldn't it typically be included in the message payload
> > instead of as an optional header field?
> >
> > 102. The data lineage that Atlas and Navigator support seems to be at
> > the dataset level, not per record level? So, not sure if per-message
> > headers are relevant there.
> >
> > Mirroring:
> > 103. The benefit of using separate partitions is that it potentially
> > makes
> > it easy to preserve offsets during mirroring. This will make it
> easier
> > for
> > consumer t

[GitHub] kafka-site pull request #34: Fix typo on introduction page

2016-12-15 Thread ashishg-qburst
Github user ashishg-qburst closed the pull request at:

https://github.com/apache/kafka-site/pull/34


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #2265: KAFKA-4549: Change to call flush method before wri...

2016-12-15 Thread fossamagna
GitHub user fossamagna opened a pull request:

https://github.com/apache/kafka/pull/2265

KAFKA-4549: Change to call flush method before writeEndMark method in close 
method



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fossamagna/kafka fix-lz4outputstream-close

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2265.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2265


commit 20195e0901951a66e471d02e8c06b4d60856f332
Author: MURAKAMI Masahiko 
Date:   2016-12-16T04:49:59Z

Change to call flush method before writeEndMark method in close method






[jira] [Commented] (KAFKA-4549) KafkaLZ4OutputStream outputs invalid format when close method is called without flush method

2016-12-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15753488#comment-15753488
 ] 

ASF GitHub Bot commented on KAFKA-4549:
---

GitHub user fossamagna opened a pull request:

https://github.com/apache/kafka/pull/2265

KAFKA-4549: Change to call flush method before writeEndMark method in close 
method



> KafkaLZ4OutputStream outputs invalid format when close method is called 
> without flush method
> ---
>
> Key: KAFKA-4549
> URL: https://issues.apache.org/jira/browse/KAFKA-4549
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.0.1
> Environment: java version "1.8.0_74"
> Java(TM) SE Runtime Environment (build 1.8.0_74-b02)
> Java HotSpot(TM) 64-Bit Server VM (build 25.74-b02, mixed mode)
>Reporter: MURAKAMI Masahiko
>
> When KafkaLZ4OutputStream's close method is called without a preceding 
> flush, writeEndMark is not called, so it outputs an invalid format.
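The fix pattern this PR describes can be illustrated with a minimal, hypothetical analogue (deliberately not the real KafkaLZ4BlockOutputStream): a framed stream whose close() must drain any buffered bytes before emitting its end mark, otherwise pending data is lost or lands after the mark and the output is invalid.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sketch of the KAFKA-4549 fix pattern: flush buffered data
// BEFORE writing the end mark in close(). Class and field names are
// illustrative only.
public class FramedOutputStream extends OutputStream {
    static final byte END_MARK = (byte) 0xFF;

    private final OutputStream out;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private boolean closed = false;

    public FramedOutputStream(OutputStream out) { this.out = out; }

    @Override public void write(int b) { buffer.write(b); }

    @Override public void flush() throws IOException {
        buffer.writeTo(out);   // drain the pending block to the underlying stream
        buffer.reset();
        out.flush();
    }

    private void writeEndMark() throws IOException { out.write(END_MARK); }

    @Override public void close() throws IOException {
        if (closed) return;
        flush();          // the fix: drain buffered data first ...
        writeEndMark();   // ... so the end mark is genuinely last
        closed = true;
        out.close();
    }
}
```

Without the flush() call in close(), a caller who never flushed explicitly would get a stream whose buffered bytes never reach the output, which is the invalid-format failure the issue reports.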



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-trunk-jdk8 #1108

2016-12-15 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] KAFKA-4537: StreamPartitionAssignor incorrectly adds standby 
partitions

[wangguoz] KAFKA-4539: StreamThread is not correctly creating StandbyTasks

[wangguoz] MINOR: Fix typo on introduction page

--
[...truncated 32552 lines...]

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetAllTasksWithStoreWhenNotRunning STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetAllTasksWithStoreWhenNotRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartOnceClosed STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartOnceClosed PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup PASSED

org.apache.kafka.streams.KafkaStreamsTest > testStartAndClose STARTED

org.apache.kafka.streams.KafkaStreamsTest > testStartAndClose PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning 
STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice PASSED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegion[0] STARTED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegion[0] PASSED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegion[1] STARTED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegion[1] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryState[0] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryState[0] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable[0] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable[0] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[0] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[0] FAILED
java.lang.AssertionError: Condition not met within timeout 6. Did not 
receive 1 number of records
at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:283)
at 
org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinValuesRecordsReceived(IntegrationTestUtils.java:251)
at 
org.apache.kafka.streams.integration.QueryableStateIntegrationTest.waitUntilAtLeastNumRecordProcessed(QueryableStateIntegrationTest.java:669)
at 
org.apache.kafka.streams.integration.QueryableStateIntegrationTest.queryOnRebalance(QueryableStateIntegrationTest.java:350)

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses[0] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses[0] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryState[1] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryState[1] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable[1] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable[1] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[1] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[1] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses[1] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses[1] PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testShouldReadFromRegexAndNamedTopics STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testShouldReadFromRegexAndNamedTopics PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenCreated STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenCreated PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testMultipleConsumersCanReadFromPartitionedTopic STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testMultipleConsumersCanReadFromPartitionedTopic PASSED

org.apache.kafka.streams.integration.Rege

Re: [DISCUSS] KIP-48 Support for delegation tokens as an authentication mechanism

2016-12-15 Thread Manikumar
Hi,


> Can you add a sample Jaas configuration using delegation tokens to the KIP?
>

Will add sample Jaas configuration.
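Until the KIP carries an official sample, a delegation-token JAAS entry under the SASL/SCRAM proposal might look roughly like the following. The module name and the `tokenauth` option are assumptions about how the final implementation could surface this, not settled API:

```
KafkaClient {
  org.apache.kafka.common.security.scram.ScramLoginModule required
  username="<tokenID>"
  password="<tokenHMAC>"
  tokenauth=true;
};
```

The idea is that the tokenID plays the role of the SCRAM username and the tokenHmac plays the role of the password, with a flag telling the broker to look the credential up in the token store rather than in the regular SCRAM user store.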


> To make sure I have understood correctly, KAFKA-3712 is aimed at enabling a
> superuser to impersonate another (single) user, say alice. A producer using
> impersonation will authenticate with superuser credentials. All requests
> from the producer will be run with the principal alice. But alice is not
> involved in the authentication and alice's credentials are not actually
> provided to the broker?
>
>
Yes, this matches my understanding of the impersonation work. Even in this
approach we have to handle all the producer-related issues you mentioned.
Yes, this is a big change and can be implemented if there is a pressing
need. I hope we are all in agreement that this can be done in a separate
KIP.


I request others to give any suggestions/concerns on this KIP.


Thanks,



>
> On Thu, Dec 15, 2016 at 9:04 AM, Manikumar 
> wrote:
>
> > @Gwen, @Rajini,
> >
> > As mentioned in the KIP, main motivation for this KIP is to reduce load
> on
> > Kerberos
> > server on large kafka deployments with large number of clients.
> >
> > Also it looks like we are combining two overlapping concepts
> > 1. Single client sending requests with multiple users/authentications
> > 2. Impersonation
> >
> > Option 1, is definitely useful in some use cases and can be used to
> > implement workaround for
> > impersonation
> >
> > In Impersonation, a super user can send requests on behalf of another
> > user(Alice) in a secured way.
> > superuser has credentials but user Alice doesn't have any. The requests
> are
> > required
> > to run as user Alice and accesses/ACLs on Broker are required to be done
> as
> > user Alice.
> > It is required that user Alice can connect to the Broker on a connection
> > authenticated with
> > superuser's credentials. In other words superuser is impersonating the
> user
> > Alice.
> >
> > The approach mentioned by Harsha in previous mail is implemented in
> hadoop,
> > storm etc..
> >
> > Some more details here:
> > https://hadoop.apache.org/docs/r2.7.2/hadoop-project-
> > dist/hadoop-common/Superusers.html
> >
> >
> > @Rajini
> >
> > Thanks for your comments on SASL/SCRAM usage. I am thinking to send
> > tokenHmac (salted-hashed version)
> > as password for authentication and tokenID for retrieval of tokenHmac at
> > server side.
> > Does above sound OK?
> >
> >
> > Thanks,
> > Manikumar
> >
> > On Wed, Dec 14, 2016 at 10:33 PM, Harsha Chintalapani 
> > wrote:
> >
> > > @Gwen @Mani  Not sure why we want to authenticate at every request.
> > > Even if the token exchange is cheap, it still requires a few calls that
> > > need to go through a round trip.  Impersonation doesn't require
> > > authentication for every request.
> > >
> > > "So a centralized app can create few producers, do the metadata request
> > and
> > > broker discovery with its own user auth, but then use delegation tokens
> > to
> > > allow performing produce/fetch requests as different users? Instead of
> > > having to re-connect for each impersonated user?"
> > >
> > > Yes. But what we will have is this centralized user acting as an
> > > impersonation user on behalf of other users. When it authenticates
> > > initially we will create a "Subject", and from there onwards the
> > > centralized user can do Subject.doAsPrivileged on behalf of other users.
> > > On the server side, we can retrieve two principals out of this: one is
> > > the authenticated user (the centralized user) and the other is the
> > > impersonated user. We will first check if the authenticated user is
> > > allowed to impersonate, and then move on to check if the user Alice has
> > > access to the topic "X" to read/write.
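The two-step broker-side check described above might be sketched as follows. All class, field, and principal names here are illustrative assumptions for discussion, not existing Kafka API:

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the impersonation authorization flow: first verify
// the authenticated (centralized) user may proxy for the impersonated user,
// then evaluate the topic ACL against the impersonated principal.
public class ImpersonationCheck {
    private final Map<String, Set<String>> allowedImpersonators; // superuser -> users it may proxy
    private final Map<String, Set<String>> topicReadAcls;        // topic -> principals with read access

    public ImpersonationCheck(Map<String, Set<String>> allowedImpersonators,
                              Map<String, Set<String>> topicReadAcls) {
        this.allowedImpersonators = allowedImpersonators;
        this.topicReadAcls = topicReadAcls;
    }

    public boolean authorizeRead(String authenticatedUser, String impersonatedUser, String topic) {
        // Step 1: is the authenticated user allowed to act for the impersonated user?
        Set<String> proxied = allowedImpersonators.get(authenticatedUser);
        if (proxied == null || !proxied.contains(impersonatedUser)) {
            return false;
        }
        // Step 2: does the impersonated user itself have access to the topic?
        Set<String> readers = topicReadAcls.get(topic);
        return readers != null && readers.contains(impersonatedUser);
    }
}
```

Note that authorization never consults the superuser's own topic ACLs: once impersonation is permitted, access is decided entirely as the impersonated user, which is the Hadoop proxy-user model referenced earlier in the thread.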
> > >
> > > @Rajini The intention of this KIP is to support token auth via
> > > SASL/SCRAM, not just with TLS.  What you raised is a good point; let me
> > > take a look and add details.
> > >
> > > It will be easier to add impersonation once we reach agreement on this
> > KIP.
> > >
> > >
> > > On Wed, Dec 14, 2016 at 5:51 AM Ismael Juma  wrote:
> > >
> > > > Hi Rajini,
> > > >
> > > > I think it would definitely be valuable to have a KIP for
> > impersonation.
> > > >
> > > > Ismael
> > > >
> > > > On Wed, Dec 14, 2016 at 4:03 AM, Rajini Sivaram  >
> > > > wrote:
> > > >
> > > > > It would clearly be very useful to enable clients to send requests
> on
> > > > > behalf of multiple users. A separate KIP makes sense, but it may be
> > > worth
> > > > > thinking through some of the implications now, especially if the
> main
> > > > > interest in delegation tokens comes from its potential to enable
> > > > > impersonation.
> > > > >
> > > > > I understand that delegation tokens are only expected to be used
> with
> > > > TLS.
> > > > > But the choice of SASL/SCRAM for authentication must be based on a
> > > > > requirement to protect the tokenHmac - otherwise you could just use
> > > > > SASL/PLAIN. With SASL/SCRAM the tokenHmac is