Build failed in Jenkins: kafka-trunk-jdk8 #2748

2018-06-18 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: provide an example for deserialization exception handler (#5231)

--
[...truncated 944.10 KB...]

kafka.zookeeper.ZooKeeperClientTest > testBlockOnRequestCompletionFromStateChangeHandler PASSED

kafka.zookeeper.ZooKeeperClientTest > testUnresolvableConnectString STARTED

kafka.zookeeper.ZooKeeperClientTest > testUnresolvableConnectString PASSED

kafka.zookeeper.ZooKeeperClientTest > testGetChildrenNonExistentZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testGetChildrenNonExistentZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testPipelinedGetData STARTED

kafka.zookeeper.ZooKeeperClientTest > testPipelinedGetData PASSED

kafka.zookeeper.ZooKeeperClientTest > testZNodeChildChangeHandlerForChildChange STARTED

kafka.zookeeper.ZooKeeperClientTest > testZNodeChildChangeHandlerForChildChange PASSED

kafka.zookeeper.ZooKeeperClientTest > testGetChildrenExistingZNodeWithChildren STARTED

kafka.zookeeper.ZooKeeperClientTest > testGetChildrenExistingZNodeWithChildren PASSED

kafka.zookeeper.ZooKeeperClientTest > testSetDataExistingZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testSetDataExistingZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testMixedPipeline STARTED

kafka.zookeeper.ZooKeeperClientTest > testMixedPipeline PASSED

kafka.zookeeper.ZooKeeperClientTest > testGetDataExistingZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testGetDataExistingZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testDeleteExistingZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testDeleteExistingZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testSessionExpiry STARTED

kafka.zookeeper.ZooKeeperClientTest > testSessionExpiry PASSED

kafka.zookeeper.ZooKeeperClientTest > testSetDataNonExistentZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testSetDataNonExistentZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testDeleteNonExistentZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testDeleteNonExistentZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testExistsExistingZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testExistsExistingZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testZooKeeperStateChangeRateMetrics STARTED

kafka.zookeeper.ZooKeeperClientTest > testZooKeeperStateChangeRateMetrics PASSED

kafka.zookeeper.ZooKeeperClientTest > testZNodeChangeHandlerForDeletion STARTED

kafka.zookeeper.ZooKeeperClientTest > testZNodeChangeHandlerForDeletion PASSED

kafka.zookeeper.ZooKeeperClientTest > testGetAclNonExistentZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testGetAclNonExistentZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testStateChangeHandlerForAuthFailure STARTED

kafka.zookeeper.ZooKeeperClientTest > testStateChangeHandlerForAuthFailure PASSED

kafka.network.SocketServerTest > testGracefulClose STARTED

kafka.network.SocketServerTest > testGracefulClose PASSED

kafka.network.SocketServerTest > testSendActionResponseWithThrottledChannelWhereThrottlingAlreadyDone STARTED

kafka.network.SocketServerTest > testSendActionResponseWithThrottledChannelWhereThrottlingAlreadyDone PASSED

kafka.network.SocketServerTest > controlThrowable STARTED

kafka.network.SocketServerTest > controlThrowable PASSED

kafka.network.SocketServerTest > testRequestMetricsAfterStop STARTED

kafka.network.SocketServerTest > testRequestMetricsAfterStop PASSED

kafka.network.SocketServerTest > testConnectionIdReuse STARTED

kafka.network.SocketServerTest > testConnectionIdReuse PASSED

kafka.network.SocketServerTest > testClientDisconnectionUpdatesRequestMetrics STARTED

kafka.network.SocketServerTest > testClientDisconnectionUpdatesRequestMetrics PASSED

kafka.network.SocketServerTest > testProcessorMetricsTags STARTED

kafka.network.SocketServerTest > testProcessorMetricsTags PASSED

kafka.network.SocketServerTest > testMaxConnectionsPerIp STARTED

kafka.network.SocketServerTest > testMaxConnectionsPerIp PASSED

kafka.network.SocketServerTest > testConnectionId STARTED

kafka.network.SocketServerTest > testConnectionId PASSED

kafka.network.SocketServerTest > testBrokerSendAfterChannelClosedUpdatesRequestMetrics STARTED

kafka.network.SocketServerTest > testBrokerSendAfterChannelClosedUpdatesRequestMetrics PASSED

kafka.network.SocketServerTest > testNoOpAction STARTED

kafka.network.SocketServerTest > testNoOpAction PASSED

kafka.network.SocketServerTest > simpleRequest STARTED

kafka.network.SocketServerTest > simpleRequest PASSED

kafka.network.SocketServerTest > closingChannelException STARTED

kafka.network.SocketServerTest > closingChannelException PASSED

kafka.network.SocketServerTest > testSendActionResponseWithThrottledChannelWhereThrottlingInProgress STARTED

kafka.network.SocketServerTest > testSendActionResponseWithThrottledChannelWhereThrottlingInProgress PASSED

kafka.network.SocketServerTest 

Re: [VOTE] KIP-280: Enhanced log compaction

2018-06-18 Thread Luís Cabral
 Hi Ted / Guozhang / Matthias,

@Ted: I've now added your argument to the "Rejected Alternatives" portion of 
the KIP. Please keep in mind that I would like to keep this as backwards 
compatible as possible, so a lot of decisions are inferred from that intent.

@Guozhang: IMHO, adding expression evaluation to configuration is an incorrect 
approach. If you absolutely insist on having this clear distinction between 
header/key, then I would suggest instead to have a dedicated property for the 
"key" part. Of course, this is your project so I'll just continue whatever 
approach moves this KIP forward...

@Matthias: Sorry, but update the KIP according to what?

Cheers,
Luís

On Monday, June 18, 2018, 2:55:17 AM GMT+2, Matthias J. Sax 
 wrote:  
 
 Well, for "offset" and "timestamp" policy, not communication between
both is required.

Only if headers are used, user A should communicate the corresponding
header key to user B.


@Luis: can you update the KIP accordingly?



-Matthias

On 6/17/18 5:36 PM, Ted Yu wrote:
> My previous reply was just an alternative for consideration.
> 
> bq.  then a second user B can add a header with key "offset" and thus break
> the intention of user A
> 
> I didn't see such scenario after reading the KIP. Maybe add this as
> reasoning for the current approach ?
> 
> I wonder how user B gets to know the intention of user A. Meaning, if user
> B doesn't follow the norm set by user A, there would still be an issue, right?
> 
> 
> On Sun, Jun 17, 2018 at 4:58 PM, Matthias J. Sax 
> wrote:
> 
>> Let me rephrase your answer to make sure I understand what you suggest:
>>
>> If compaction strategy is configured to use "offset", and if there is a
>> header in the record with `key == offset`, then we should use the value
>> of the record header instead of the actual record offset?
>>
>> Do I understand this correctly? If yes, what is the advantage of doing
>> this? From my point of view, it might be problematic, because if user A
>> creates a topic and configures "offset" compaction (with the intent that
>> the record offset should be used), then a second user B can add a header
>> with key "offset" and thus break the intention of user A.
>>
>> Also, if existing topics might have data with record header key
>> "offset", the change would not be backward compatible either.
>>
>>
>> -Matthias
>>
>> On 6/16/18 6:59 PM, Ted Yu wrote:
>>> Pardon the brevity in my previous reply.
>>> I was talking about this bullet:
>>>
>>> bq. When this configuration is set to anything other than "*offset*" or "
>>> *timestamp*", then the record headers are scanned for a key matching this
>>> value.
>>>
>>> My point is that if matching key in the header is found, its value should
>>> take precedence over the value of the configuration.
>>> I understand that such interpretation may have slight performance cost.
>>>
>>> Cheers
>>>
>>> On Sat, Jun 16, 2018 at 6:29 PM, Matthias J. Sax 
>>> wrote:
>>>
 Ted,

 I am also not sure what you mean by "Shouldn't the selection in header
 have higher precedence over the configuration"? What selection do you
 mean? And what configuration?


 About the first point, I think this is actually a valid concern: To
 address this issue, it seems that we would need to change the accepted
 format of the config. Instead of "offset", "timestamp", "<header-key>",
 we could replace the last one with "header=<header-key>".

 WDYT?


 -Matthias

 On 6/15/18 3:06 AM, Ted Yu wrote:
> If selection exists in header, the selection should override the config
 value.
> Cheers
>  Original message From: Luis Cabral
  Date: 6/15/18  1:40 AM  (GMT-08:00) To:
 dev@kafka.apache.org Subject: Re: [VOTE] KIP-280: Enhanced log
>> compaction
> Hi,
>
> bq. Can the value be determined now ? My thinking is that what if there
 is a third compaction strategy proposed in the future ? We should guard
 against users unknowingly choosing the 'future' strategy.
>
> The idea is that the header name to use is flexible, which protects
 current clients that may want to use this from having to adapt their
 already existing header names (they can just specify a new name).
>
> bq. Shouldn't the selection in header have higher precedence over the
 configuration ?
>
> Not sure what you mean here, could you clarify?
>
> bq. Please create JIRA if you haven't already.
>
> Done: https://issues.apache.org/jira/browse/KAFKA-7061
>
> Cheers,
> Luís
>
>> On 11 Jun 2018, at 01:50, Ted Yu  wrote:
>>
>> bq. When this configuration is set to anything other than "*offset*"
>> or
 "
>> *timestamp*", then the record headers are scanned for a key matching
 this
>> value.
>>
>> Can the value be determined now ? My thinking is that what if there
>> is a
>> third compaction strategy proposed in the future ? We should guard
 again

[jira] [Resolved] (KAFKA-7065) Quickstart tutorial fails because of missing brokers

2018-06-18 Thread Holger Brandl (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holger Brandl resolved KAFKA-7065.
--
Resolution: Invalid

I misread the tutorial instructions. There was no bug.
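
For reference, the quickstart step that starts an actual broker (presumably the step that was missed here, since only ZooKeeper is started in the commands quoted below) is:

    bin/kafka-server-start.sh config/server.properties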

> Quickstart tutorial fails because of missing brokers
> 
>
> Key: KAFKA-7065
> URL: https://issues.apache.org/jira/browse/KAFKA-7065
> Project: Kafka
>  Issue Type: Bug
>  Components: website
>Affects Versions: 1.1.0
> Environment: Java 1.8.0_162
> MacOS 10.13.4
>Reporter: Holger Brandl
>Priority: Major
>
> Following the tutorial on [https://kafka.apache.org/quickstart] I've tried to 
> set up a Kafka instance with
> {{wget --no-check-certificate 
> http://apache.lauf-forum.at/kafka/1.1.0/kafka_2.12-1.1.0.tgz}}
> {{tar xvf kafka_2.12-1.1.0.tgz}}
> {{## start the server}}
> {{cd kafka_2.12-1.1.0}}
> {{bin/zookeeper-server-start.sh config/zookeeper.properties}}
>  
> Until here everything is fine, and it is reporting:
>  
> {{[kafka_2.12-1.1.0]$ bin/zookeeper-server-start.sh 
> config/zookeeper.properties}}{{[2018-06-16 10:38:41,238] INFO Reading 
> configuration from: config/zookeeper.properties 
> (org.apache.zookeeper.server.quorum.QuorumPeerConfig)}}{{[2018-06-16 
> 10:38:41,240] INFO autopurge.snapRetainCount set to 3 
> (org.apache.zookeeper.server.DatadirCleanupManager)}}{{[2018-06-16 
> 10:38:41,241] INFO autopurge.purgeInterval set to 0 
> (org.apache.zookeeper.server.DatadirCleanupManager)}}{{[2018-06-16 
> 10:38:41,241] INFO Purge task is not scheduled. 
> (org.apache.zookeeper.server.DatadirCleanupManager)}}{{[2018-06-16 
> 10:38:41,241] WARN Either no config or no quorum defined in config, running  
> in standalone mode 
> (org.apache.zookeeper.server.quorum.QuorumPeerMain)}}{{[2018-06-16 
> 10:38:41,272] INFO Reading configuration from: config/zookeeper.properties 
> (org.apache.zookeeper.server.quorum.QuorumPeerConfig)}}{{[2018-06-16 
> 10:38:41,273] INFO Starting server 
> (org.apache.zookeeper.server.ZooKeeperServerMain)}}{{[2018-06-16 
> 10:38:41,299] INFO Server 
> environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f,
>  built on 03/23/2017 10:13 GMT 
> (org.apache.zookeeper.server.ZooKeeperServer)}}{{[2018-06-16 10:38:41,300] 
> INFO Server environment:host.name=192.168.0.8 
> (org.apache.zookeeper.server.ZooKeeperServer)}}{{[2018-06-16 10:38:41,300] 
> INFO Server environment:java.version=1.8.0_162 
> (org.apache.zookeeper.server.ZooKeeperServer)}}{{[2018-06-16 10:38:41,300] 
> INFO Server environment:java.vendor=Oracle Corporation 
> (org.apache.zookeeper.server.ZooKeeperServer)}}{{[2018-06-16 10:38:41,300] 
> INFO Server 
> environment:java.home=/Library/Java/JavaVirtualMachines/jdk1.8.0_162.jdk/Contents/Home/jre
>  (org.apache.zookeeper.server.ZooKeeperServer)}}{{[2018-06-16 10:38:41,300] 
> INFO Server 
> environment:java.class.path=/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/aopalliance-repackaged-2.5.0-b32.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/argparse4j-0.7.0.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/commons-lang3-3.5.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/connect-api-1.1.0.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/connect-file-1.1.0.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/connect-json-1.1.0.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/connect-runtime-1.1.0.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/connect-transforms-1.1.0.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/guava-20.0.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/hk2-api-2.5.0-b32.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/hk2-locator-2.5.0-b32.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/hk2-utils-2.5.0-b32.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/jackson-annotations-2.9.4.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/jackson-core-2.9.4.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/jackson-databind-2.9.4.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/jackson-jaxrs-base-2.9.4.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/jackson-jaxrs-json-provider-2.9.4.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/jackson-module-jaxb-annotations-2.9.4.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/javassist-3.20.0-GA.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/javassist-3.21.0-GA.jar:/Users/brandl/projects/kotlin/kafka/kafka_2.12-1.1.0/bin/../libs/javax.annotation-api-1.2.jar:/Users/brandl/projects/kot

Re: Error in Kafka Stream

2018-06-18 Thread Amandeep Singh
Hi  Guozhang,

The file system is XFS and the folder is not a temp folder. The issue goes
away when I restart the streams. I forgot to mention I am running 3
instances of the consumer on 3 machines.
Also, this issue seems to be reported by other users too:
https://issues.apache.org/jira/browse/KAFKA-5998



Regards,
Amandeep Singh
+91-7838184964
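
As a general aside (not a fix for the disappearing directory itself), the Streams state directory can be pinned to a durable, non-temp path via state.dir. A minimal sketch, with an illustrative path:

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "UI-Watchlist-ES-App");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    // Durable location for RocksDB state and .checkpoint files (illustrative):
    props.put(StreamsConfig.STATE_DIR_CONFIG, "/var/lib/kafka-streams");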


On Mon, Jun 18, 2018 at 6:45 AM Guozhang Wang  wrote:

> Hello Amandeep,
>
> What file system are you using? Also is `/opt/info` a temp folder that can
> be auto-cleared from time to time?
>
>
> Guozhang
>
> On Fri, Jun 15, 2018 at 6:39 AM, Amandeep Singh 
> wrote:
>
> > Hi,
> >
> >
> >
> > I am getting the below error while processing data with Kafka Streams. The
> > application was running for a couple of hours and the
> > 'WatchlistUpdate-StreamThread-9' thread was assigned to the same partition
> > since the beginning. I am assuming it was able to successfully commit offsets
> > for those couple of hours and the directory
> > '/opt/info/abc/cache/test-abc-kafka-processor/kafka-streams/UI-Watchlist-ES-App/0_2'
> > did exist for that period.
> >
> > And then I start getting the below error after every 30 secs (probably
> > because of the offset commit interval) and messages are being missed from
> > processing.
> >
> > Can you please help?
> >
> >
> > 2018-06-15 08:47:58 [WatchlistUpdate-StreamThread-9] WARN
> > o.a.k.s.p.i.ProcessorStateManager:246 - task [0_2] Failed to write
> > checkpoint file to
> > /opt/info/abc/cache/test-abc-kafka-processor/kafka-streams/UI-Watchlist-ES-App/0_2/.checkpoint:
> >
> > java.io.FileNotFoundException:
> > /opt/info/abc/cache/test-abc-kafka-processor/kafka-streams/UI-Watchlist-ES-App/0_2/.checkpoint.tmp
> > (No such file or directory)
> >
> > at java.io.FileOutputStream.open0(Native Method) ~[na:1.8.0_141]
> > at java.io.FileOutputStream.open(FileOutputStream.java:270) ~[na:1.8.0_141]
> > at java.io.FileOutputStream.<init>(FileOutputStream.java:213) ~[na:1.8.0_141]
> > at java.io.FileOutputStream.<init>(FileOutputStream.java:162) ~[na:1.8.0_141]
> > at org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) ~[kafka-streams-1.0.0.jar:na]
> > at org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:320) ~[kafka-streams-1.0.0.jar:na]
> > at org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:306) [kafka-streams-1.0.0.jar:na]
> > at org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:208) [kafka-streams-1.0.0.jar:na]
> > at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:299) [kafka-streams-1.0.0.jar:na]
> > at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:289) [kafka-streams-1.0.0.jar:na]
> > at org.apache.kafka.streams.processor.internals.AssignedTasks$2.apply(AssignedTasks.java:87) [kafka-streams-1.0.0.jar:na]
> > at org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:451) [kafka-streams-1.0.0.jar:na]
> > at org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:380) [kafka-streams-1.0.0.jar:na]
> > at org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:309) [kafka-streams-1.0.0.jar:na]
> > at org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1018) [kafka-streams-1.0.0.jar:na]
> > at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:835) [kafka-streams-1.0.0.jar:na]
> > at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774) [kafka-streams-1.0.0.jar:na]
> > at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744) [kafka-streams-1.0.0.jar:na]
> >
> >
> > Stream config:
> >
> > 2018-06-15 08:09:28 [main] INFO  o.a.k.c.consumer.ConsumerConfig:223 -
> > ConsumerConfig values:
> >
> > auto.commit.interval.ms = 5000
> >
> > auto.offset.reset = earliest
> >
> > bootstrap.servers = [XYZ]
> >
> > check.crcs = true
> >
> > client.id = WatchlistUpdate-StreamThread-9-consumer
> >
> > connections.max.idle.ms = 54
> >
> > enable.auto.commit = false
> >
> > exclude.internal.topics = true
> >
> > fetch.max.bytes = 52428800
> >
> > fetch.max.wait.

[jira] [Created] (KAFKA-7069) AclCommand does not allow 'create' operation on 'topic'

2018-06-18 Thread Andy Coates (JIRA)
Andy Coates created KAFKA-7069:
--

 Summary: AclCommand does not allow 'create'  operation on 'topic'
 Key: KAFKA-7069
 URL: https://issues.apache.org/jira/browse/KAFKA-7069
 Project: Kafka
  Issue Type: Bug
  Components: core, security
Affects Versions: 2.0.0
Reporter: Andy Coates
Assignee: Andy Coates


KAFKA-6726 saw 
[KIP-277|https://cwiki.apache.org/confluence/display/KAFKA/KIP-277+-+Fine+Grained+ACL+for+CreateTopics+API]
 implemented, which extended the set of operations allowed on the 'topic' 
resource type to include 'create'.

The AclCommand CLI class currently rejects this new operation, e.g. running:

{{bin/kafka-acls --authorizer-properties zookeeper.connect=localhost:2181 --add 
--allow-principal User:KSQL --operation create --topic t1}}

Fails with error:

{{ResourceType Topic only supports operations 
Read,All,AlterConfigs,DescribeConfigs,Delete,Write,Describe}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7069) AclCommand does not allow 'create' operation on 'topic'

2018-06-18 Thread Andy Coates (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Coates resolved KAFKA-7069.

Resolution: Invalid

> AclCommand does not allow 'create'  operation on 'topic'
> 
>
> Key: KAFKA-7069
> URL: https://issues.apache.org/jira/browse/KAFKA-7069
> Project: Kafka
>  Issue Type: Bug
>  Components: core, security
>Affects Versions: 2.0.0
>Reporter: Andy Coates
>Assignee: Andy Coates
>Priority: Major
>
> KAFKA-6726 saw 
> [KIP-277|https://cwiki.apache.org/confluence/display/KAFKA/KIP-277+-+Fine+Grained+ACL+for+CreateTopics+API]
>  implemented, which extended the set of operations allowed on the 'topic' 
> resource type to include 'create'.
> The AclCommand CLI class currently rejects this new operation, e.g. running:
> {{bin/kafka-acls --authorizer-properties zookeeper.connect=localhost:2181 
> --add --allow-principal User:KSQL --operation create --topic t1}}
> Fails with error:
> {{ResourceType Topic only supports operations 
> Read,All,AlterConfigs,DescribeConfigs,Delete,Write,Describe}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-312: Add Overloaded StreamsBuilder Build Method to Accept java.util.Properties

2018-06-18 Thread Bill Bejeck
All, the discussion list for this proposed change has been quiet for a few
days.

If there are no changes or other proposals, I'll start a voting thread
later today.

Thanks,
Bill
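
For readers skimming the thread, a rough sketch of the API shape under discussion; the optimization config name and value below are illustrative, as KIP-312 had not finalized them at this point:

    import java.util.Properties;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "kip-312-demo");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put("topology.optimization", "all"); // hypothetical config name/value

    StreamsBuilder builder = new StreamsBuilder();
    builder.stream("input-topic").to("output-topic");

    // Proposed overload: the same Properties drive both the topology build
    // (so optimizations can be applied) and the runtime.
    KafkaStreams streams = new KafkaStreams(builder.build(props), props);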

On Wed, Jun 13, 2018 at 12:31 PM Guozhang Wang  wrote:

> Thanks for the explanation Bill. Makes sense to me.
>
> On Tue, Jun 12, 2018 at 9:21 AM, Bill Bejeck  wrote:
>
> > > Since there are only two values for the optional optimization config
> > > introduced by KAFKA-6935, I wonder whether the overloaded build method
> > > (with a Properties instance) would make the config unnecessary.
> >
> > Hi Ted, thanks for commenting.  You raise a good point.  But IMHO, yes we
> > still need the config as we want to give users the ability to turn
> > optimizations off/on explicitly and we haven't finalized anything
> > concerning how we'll pass in the parameters.  Additionally, as we release
> > new versions, the config will give users the ability to choose to apply
> all
> > of the latest optimizations or stick with the previous version.
> >
> > Guozhang,
> >
>> if we can hide this from the public API, to, e.g., add an additional
> > function
>> in InternalTopologyBuilder or InternalStreamsBuilder (since in your
> > current
> >> working PR we're reusing InternalStreamsBuilder for the logical plan
> >> generation) which can then be called inside KafkaStreams
> constructors?
> >
> > I like the idea, but as I looked into it, there is an issue concerning
> the
> > fact that users can call Topology.describe() at any point.  So with this
> > approach, we could end up where the first call to Topology.describe()
> > errors out or returns an invalid description, then the second call is
> > successful.  So I don't think we'll be able to pursue this approach.
> >
> >
> > John,
> >
> > I initially liked your suggestion, but I also agree with Matthias as to
> why
> > we should not use that approach either.
> >
> > Thanks to all for the comments.
> >
> > Bill
> >
> >
> > On Mon, Jun 11, 2018 at 4:13 PM John Roesler  wrote:
> >
> > > Thanks Matthias,
> > >
> > > I buy this reasoning.
> > >
> > > -John
> > >
> > > On Mon, Jun 11, 2018 at 12:48 PM, Matthias J. Sax <
> matth...@confluent.io
> > >
> > > wrote:
> > >
> > > > @John: I don't think this is a good idea. `KafkaStreams` executes a
> > > > `Topology` and should be agnostic if the topology was build manually
> or
> > > > via `StreamsBuilder` (at least from my point of view).
> > > >
> > > > -Matthias
> > > >
> > > > On 6/11/18 9:53 AM, Guozhang Wang wrote:
> > > > > Another implementation detail that we can consider: currently the
> > > > > InternalTopologyBuilder#setApplicationId() is used because we do
> not
> > > > have
> > > > > such a mechanism to pass in configs to the topology building
> process.
> > > > Once
> > > > > we add such mechanism we should consider removing this function as
> > > well.
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Mon, Jun 11, 2018 at 9:51 AM, Guozhang Wang  >
> > > > wrote:
> > > > >
> > > > >> Hello Bill,
> > > > >>
> > > > >> While working on https://github.com/apache/kafka/pull/5163 I am
> > > > wondering
> > > > >> if we can hide this from the public API, to e.g. add an additional
> > > > function
> > > > >> in InternalTopologyBuilder of InternalStreamsBuilder (since in
> your
> > > > current
> > > > >> working PR we're reusing InternalStreamsBuilder for the logical
> plan
> > > > >> generation) which can then be called inside KafkaStreams
> > constructors?
> > > > >>
> > > > >>
> > > > >> Guozhang
> > > > >>
> > > > >>
> > > > >> On Mon, Jun 11, 2018 at 9:41 AM, John Roesler 
> > > > wrote:
> > > > >>
> > > > >>> Hi Bill,
> > > > >>>
> > > > >>> Thanks for the KIP.
> > > > >>>
> > > > >>> Just a small thought. This new API will result in calls that look
> > > like
> > > > >>> this:
> > > > >>> new KafkaStreams(builder.build(props), props);
> > > > >>>
> > > > >>> Do you think that's a significant enough eyesore to warrant
> adding
> > a
> > > > new
> > > > >>> KafkaStreams constructor taking a KStreamsBuilder like this:
> > > > >>> new KafkaStreams(builder, props);
> > > > >>>
> > > > >>> such that it would internally call builder.build(props) ?
> > > > >>>
> > > > >>> Thanks,
> > > > >>> -John
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>> On Fri, Jun 8, 2018 at 7:16 PM, Ted Yu 
> > wrote:
> > > > >>>
> > > >  Since there are only two values for the optional optimization config
> > > >  introduced by KAFKA-6935, I wonder whether the overloaded build method
> > > >  (with a Properties instance) would make the config unnecessary.
> > > > 
> > > >  nit:
> > > >  * @return @return the {@link Topology} that represents the
> > specified
> > > >  processing logic
> > > > 
> > > >  Double @return above.
> > > > 
> > > >  Cheers
> > > > 
> > > >  On Fri, Jun 8, 2018 at 3:20 PM, Bill Bejeck 
> > > > wrote:
> > > > 
> > > > > All,
> > > > >
> > > > > I'd like to

Re: [EXTERNAL] [DISCUSS] KIP-310: Add a Kafka Source Connector to Kafka Connect

2018-06-18 Thread McCaig, Rhys
Hi Stephane,

Thanks for your feedback and apologies for the delay in my response. 

> Are there any performance benchmarks against Mirror Maker available? I'm
> interested to know if this is more performant / scalable.
> Regarding the implementation, here's some feedback:


Currently I don’t have any performance benchmarks, but I think this is a great 
idea. I'll see if I can set up something in the next week or so. 

> - I think it's worth mentioning that this solution does not rely on
> consumer groups, and therefore tracking progress may be tricky. Can you
> think of a way to expose that?

This is a reasonable concern. I’m not sure how to track this other than looking 
at the Kafka Connect offsets. Once a message is passed to the framework, I'm 
unaware of a way to get at the commit offsets on the producer side. Any 
thoughts?
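
One way to peek at those Connect offsets, for what it's worth, is to read the worker's offset storage topic directly (its name comes from the worker's offset.storage.topic setting, commonly "connect-offsets"):

    bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
      --topic connect-offsets --from-beginning \
      --property print.key=true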

> - Some code can be in config Validator I believe:
> https://github.com/Comcast/MirrorTool-for-Kafka-Connect/blob/master/src/main/java/com/comcast/kafka/connect/kafka/KafkaSourceConnector.java#L47
> 
> - I think your kip mentions `source.admin.` and `source.consumer.` but I
> don't see it reflected yet in the code
> 
> - Is there a way to be flexible and merge list and regex, or offer the two
> simultaneously ? source_topics=my_static_topic,prefix.* ?

Agree on all of the above - I will incorporate these into the code later this 
week as I'll get some time back to work on this.

Cheers,
Rhys



> On Jun 6, 2018, at 7:16 PM, Stephane Maarek  
> wrote:
> 
> Hi Rhys,
> 
> I think this will be a great addition.
> 
> Are there any performance benchmarks against Mirror Maker available? I'm
> interested to know if this is more performant / scalable.
> Regarding the implementation, here's some feedback:
> 
> - I think it's worth mentioning that this solution does not rely on
> consumer groups, and therefore tracking progress may be tricky. Can you
> think of a way to expose that?
> 

> - Some code can be in config Validator I believe:
> https://github.com/Comcast/MirrorTool-for-Kafka-Connect/blob/master/src/main/java/com/comcast/kafka/connect/kafka/KafkaSourceConnector.java#L47
> 
> - I think your kip mentions `source.admin.` and `source.consumer.` but I
> don't see it reflected yet in the code
> 
> - Is there a way to be flexible and merge list and regex, or offer the two
> simultaneously ? source_topics=my_static_topic,prefix.* ?
> 
> Hope that helps
> Stephane
> 
> Kind regards,
> Stephane
> 
> 
> Stephane Maarek | Developer
> 
> +61 416 575 980
> steph...@simplemachines.com.au
> simplemachines.com.au
> Level 2, 145 William Street, Sydney NSW 2010
> 
> On 5 June 2018 at 09:04, McCaig, Rhys  wrote:
> 
>> Hi All,
>> 
>> As I didn’t get any comment on this KIP and there has since been an
>> additional 2 KIP’s created numbered 308 since, I'm bumping this and
>> renaming the KIP to 310 to remove the duplication:
>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> 310%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
>> 
>> Let me know if you have any comments or feedback, would love to hear them.
>> 
>> Cheers,
>> Rhys
>> 
>>> On May 28, 2018, at 10:23 PM, McCaig, Rhys 
>> wrote:
>>> 
>>> Sorry for the bad link to the KIP, here it is: https://cwiki.apache.org/
>> confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+
>> Connector+to+Kafka+Connect
>>> 
 On May 28, 2018, at 10:19 PM, McCaig, Rhys 
>> wrote:
 
 Hi All,
 
 I added a KIP to include a Kafka Source Connector with Kafka Connect.
 Here is the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> 308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
 
 Looking forward to your feedback and suggestions.
 
 Cheers,
 Rhys
 
 
>>> 
>> 
>> 



[jira] [Created] (KAFKA-7070) KafkaConsumer#committed might unexpectedly shift consumer offset

2018-06-18 Thread JIRA
Jan Lukavský created KAFKA-7070:
---

 Summary: KafkaConsumer#committed might unexpectedly shift consumer 
offset
 Key: KAFKA-7070
 URL: https://issues.apache.org/jira/browse/KAFKA-7070
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 1.1.0
Reporter: Jan Lukavský


When a client uses manual partition assignment (e.g. {{KafkaConsumer#assign}}) 
but then accidentally calls {{KafkaConsumer#committed}} (for whatever reason, 
most probably a bug in user code), the offset gets shifted to latest, possibly 
skipping unconsumed messages or producing duplicates. The reason is that the 
call to {{KafkaConsumer#committed}} invokes the AbstractCoordinator, which 
tries to fetch the committed offset but doesn't find a {{group.id}} (it will 
probably even be empty). This might cause the Fetcher to receive an invalid 
offset for the partition and reset it to the latest offset.

Although this is primarily a bug in user code, it is very hard to track down. 
The call to {{KafkaConsumer#committed}} should probably throw an exception 
when called on a client without automatic partition assignment.
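
A minimal sketch of the misuse pattern described above, assuming a consumer constructed without a group.id (topic and names are illustrative):

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;

    Properties props = new Properties();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
              "org.apache.kafka.common.serialization.StringDeserializer");
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
              "org.apache.kafka.common.serialization.StringDeserializer");
    // Note: no group.id is set -- manual assignment does not require one.

    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    TopicPartition tp = new TopicPartition("my-topic", 0);
    consumer.assign(Collections.singletonList(tp));

    // The accidental call: with no group there is no committed offset to
    // fetch, and per this report the fetch position can end up shifted.
    OffsetAndMetadata committed = consumer.committed(tp);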



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-4061) Apache Kafka failover is not working

2018-06-18 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-4061.
--
Resolution: Cannot Reproduce

This is mostly due to the health of the consumer offsets topic. The replication 
factor of the "__consumer_offsets" topic should be greater than 1 for higher 
availability. Please reopen if you think the issue still exists.
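
The relevant broker setting, applied when the offsets topic is first auto-created:

    # server.properties: replication factor for the auto-created
    # __consumer_offsets topic (must be set before the topic first exists)
    offsets.topic.replication.factor=3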

> Apache Kafka failover is not working
> 
>
> Key: KAFKA-4061
> URL: https://issues.apache.org/jira/browse/KAFKA-4061
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
> Environment: Linux
>Reporter: Sebastian Bruckner
>Priority: Major
>
> We have a 3 node cluster (kafka1 to kafka3) on 0.10.0.0
> When I shut down the node kafka1 i can see in the debug logs of my consumers 
> the following:
> {code}
> Sending coordinator request for group f49dc74f-3ccb-4fef-bafc-a7547fe26bc8 to 
> broker kafka3:9092 (id: 3 rack: null)
> Received group coordinator response 
> ClientResponse(receivedTimeMs=1471511333843, disconnected=false, 
> request=ClientRequest(expectResponse=true, 
> callback=org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler@3892b449,
>  
> request=RequestSend(header={api_key=10,api_version=0,correlation_id=118,client_id=f49dc74f-3ccb-4fef-bafc-a7547fe26bc8},
>  body={group_id=f49dc74f-3ccb-4fef-bafc-a7547fe26bc8}), 
> createdTimeMs=1471511333794, sendTimeMs=1471511333794), 
> responseBody={error_code=0,coordinator={node_id=1,host=kafka1,port=9092}})
> {code}
> So the problem is that kafka3 answers with an response telling the consumer 
> that the coordinator is kafka1 (which is shut down).
> This then happens over and over again.
> When i restart the consumer i can see the following:
> {code}
> Updated cluster metadata version 1 to Cluster(nodes = [kafka2:9092 (id: -2 
> rack: null), kafka1:9092 (id: -1 rack: null), kafka3:9092 (id: -3 rack: 
> null)], partitions = [])
> ... responseBody={error_code=15,coordinator={node_id=-1,host=,port=-1}})
> {code}
> The difference is now that it answers with error code 15 
> (GROUP_COORDINATOR_NOT_AVAILABLE). 
> Somehow kafka doesn't elect a new group coordinator. 
> In a local setup with 2 brokers and 1 ZooKeeper it works fine.
> Can you help me debugging this?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-3791) Broken tools -- need better way to get offsets and other info

2018-06-18 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-3791.
--
Resolution: Fixed

The kafka-consumer-offset-checker.sh tool has been removed. Use 
kafka-consumer-groups.sh to get consumer group details. Tools related to the 
old clients will be removed in 2.0.0. Please reopen if the issue still exists 
in other tools.
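
For example (bootstrap server and group name are placeholders):

    bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
      --describe --group my-group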

> Broken tools -- need better way to get offsets and other info
> -
>
> Key: KAFKA-3791
> URL: https://issues.apache.org/jira/browse/KAFKA-3791
> Project: Kafka
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.10.0.0
>Reporter: Greg Zoller
>Priority: Major
>
> Whenever I run included tools like kafka-consumer-offset-checker.sh I get 
> deprecation warnings and it doesn't work for me (offsets not returned).  
> These need to be fixed.  The suggested class in the deprecation warning is 
> not documented clearly in the docs.
> In general it would be nice to streamline and simplify the tool scripts.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7071) specify number of partitions when using repartition logic

2018-06-18 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-7071:
--

 Summary: specify number of partitions when using repartition logic
 Key: KAFKA-7071
 URL: https://issues.apache.org/jira/browse/KAFKA-7071
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Boyang Chen
Assignee: Boyang Chen


Hey there, I was wondering whether it makes sense to specify the number of 
partitions of the output topic from a repartition operation like groupBy, 
flatMap, etc. The current DSL doesn't support adding a customized repartition 
topic with a customized number of partitions. For example, if I want to reduce 
the input topic from 8 partitions to 1, there is no easy solution but to 
create a new topic instead.
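
A sketch of the workaround described above, assuming a topic "repartition-1p" has been created out of band with a single partition (topic names and types are illustrative; through() is the 1.x-era DSL call for routing via an explicit topic):

    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.KStream;

    StreamsBuilder builder = new StreamsBuilder();
    KStream<String, String> input = builder.stream("input-8-partitions");

    // Route through a manually created 1-partition topic, since the DSL
    // cannot size its own repartition topics (as of this report).
    KStream<String, String> narrowed = input.through("repartition-1p");

    narrowed.groupByKey().count().toStream().to("counts-output");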



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] KIP-280: Enhanced log compaction

2018-06-18 Thread Guozhang Wang
Hello Luís,

I agree that having an expression evaluation as a config value is not the
best approach; if there are better ideas to allow users to specify a
header key which happens to be the same as one of the reserved config values
"offset" and "timestamp" (although the likelihood may be small; as Ted
mentioned, more reserved config values may be added in the future),
then I'd happily follow the suggestions. For example, we could have the
config value for header keys be "header-<key>"? Is that what you've
suggested? Or do you suggest using two configs instead, with the second
config specifying the key name and only being considered if the first
(i.e. currently proposed) config's value is `header`, and otherwise ignored?


Guozhang
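
To make the two options concrete, a hypothetical sketch; the property names below are illustrative only, not the names proposed in the KIP:

    # Option A (single config): reserved values plus a free-form header key
    compaction.strategy=offset          # or "timestamp", or any header key

    # Option B (two configs): the second is read only when the first is "header"
    compaction.strategy=header
    compaction.strategy.header=my-header-key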


On Mon, Jun 18, 2018 at 12:20 AM, Luís Cabral  wrote:

>  Hi Ted / Guozhang / Matthias,
>
> @Ted: I've now added your argument to the "Rejected Alternatives" portion
> of the KIP. Please keep in mind that I would like to keep this as backwards
> compatible as possible, so a lot of decisions are inferred from that intent.
>
> @Guozhang: IMHO, adding expression evaluation to configuration is an
> incorrect approach. If you absolutely insist on having this clear
> distinction between header/key, then I would suggest instead to have a
> dedicated property for the "key" part. Of course, this is your project so
> I'll just continue whatever approach moves this KIP forward...
>
> @Matthias: Sorry, but update the KIP according to what?
>
> Cheers,
> Luís
>
> On Monday, June 18, 2018, 2:55:17 AM GMT+2, Matthias J. Sax <
> matth...@confluent.io> wrote:
>
>  Well, for "offset" and "timestamp" policy, not communication between
> both is required.
>
> Only if headers are used, user A should communicate the corresponding
> header key to user B.
>
>
> @Luis: can you update the KIP accordingly?
>
>
>
> -Matthias
>
> On 6/17/18 5:36 PM, Ted Yu wrote:
> > My previous reply was just an alternative for consideration.
> >
> > bq.  then a second user B can add a header with key "offset" and thus
> break
> > the intention of user A
> >
> > I didn't see such scenario after reading the KIP. Maybe add this as
> > reasoning for the current approach ?
> >
> > I wonder how user B gets to know the intention of user A. Meaning, if
> user
> > B doesn't follow the norm set by user A, there still would be issue,
> right ?
> >
> >
> > On Sun, Jun 17, 2018 at 4:58 PM, Matthias J. Sax 
> > wrote:
> >
> >> Let me rephrase your answer to make sure I understand what you suggest:
> >>
> >> If compaction strategy is configured to use "offset", and if there is a
> >> header in the record with `key == offset`, then we should use the value
> >> of the record header instead of the actual record offset?
> >>
> >> Do I understand this correctly? If yes, what is the advantage of doing
> >> this? From my point of view, it might be problematic, because if user A
> >> creates a topic and configures "offset" compaction (with the intent that
> >> the record offset should be used), then a second user B can add a header
> >> with key "offset" and thus break the intention of user A.
> >>
> >> Also, if existing topics might have data with record header key
> >> "offset", the change would not be backward compatible either.
> >>
> >>
> >> -Matthias
> >>
> >> On 6/16/18 6:59 PM, Ted Yu wrote:
> >>> Pardon the brevity in my previous reply.
> >>> I was talking about this bullet:
> >>>
> >>> bq. When this configuration is set to anything other than "*offset*"
> or "
> >>> *timestamp*", then the record headers are scanned for a key matching
> this
> >>> value.
> >>>
> >>> My point is that if matching key in the header is found, its value
> should
> >>> take precedence over the value of the configuration.
> >>> I understand that such interpretation may have slight performance cost.
> >>>
> >>> Cheers
> >>>
> >>> On Sat, Jun 16, 2018 at 6:29 PM, Matthias J. Sax <
> matth...@confluent.io>
> >>> wrote:
> >>>
>  Ted,
> 
>  I am also not sure what you mean by "Shouldn't the selection in header
>  have higher precedence over the configuration"? What selection do you
>  mean? And what configuration?
> 
> 
>  About the first point, I think this is actually a valid concern: To
>  address this issue, it seems that we would need to change the accepted
>  format of the config. Instead of "offset", "timestamp", "<header-key>",
>  we could replace the last one with "header=<header-key>".
> 
>  WDYT?
> 
> 
>  -Matthias
> 
>  On 6/15/18 3:06 AM, Ted Yu wrote:
> > If selection exists in header, the selection should override the
> config
>  value.
> > Cheers
> >  Original message From: Luis Cabral
>   Date: 6/15/18  1:40 AM  (GMT-08:00)
> To:
>  dev@kafka.apache.org Subject: Re: [VOTE] KIP-280: Enhanced log
> >> compaction
> > Hi,
> >
> > bq. Can the value be determined now ? My thinking is that what if
> there
>  is a third compaction strategy proposed

[jira] [Resolved] (KAFKA-4870) A question about broker down , the server is doing partition master election,the client producer may send msg fail . How the producer deal with the situation ??

2018-06-18 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-4870.
--
Resolution: Information Provided

If a produce request fails, the producer automatically retries based on the 
retries config for any retriable exceptions. The producer also updates its 
metadata on such exceptions, or if any partition does not have a leader, etc.

Post these kinds of queries to the 
[us...@kafka.apache.org|mailto:us...@kafka.apache.org] mailing list 
([http://kafka.apache.org/contact]) for quicker responses.
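
A minimal sketch of the producer settings referred to above; the values are illustrative:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.ProducerConfig;

    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ProducerConfig.RETRIES_CONFIG, 5);            // retry retriable send errors
    props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 100); // pause between retries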

> A question about broker down , the server is doing partition master 
> election,the client producer may send msg fail . How the producer deal with 
> the situation ??
> 
>
> Key: KAFKA-4870
> URL: https://issues.apache.org/jira/browse/KAFKA-4870
> Project: Kafka
>  Issue Type: Test
>  Components: clients
> Environment: java client 
>Reporter: zhaoziyan
>Priority: Minor
>
> A broker goes down. The Kafka cluster is doing partition leader election, and a 
> producer sending ordered or normal messages may fail to send. How does the 
> client update its metadata and deal with the failed sends?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[VOTE] KIP-316: Command-line overrides for ConnectDistributed worker properties

2018-06-18 Thread Kevin Lafferty
Hi all,

I got a couple of notes of interest on the discussion thread and no
objections, so I'd like to kick off a vote. This is a very small change.
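
For context, a hypothetical invocation in the spirit of the proposal (the exact flag and syntax are defined in the KIP itself; kafka-server-start.sh already supports an analogous --override flag):

    bin/connect-distributed.sh config/connect-distributed.properties \
      --override group.id=my-connect-cluster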

KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-316%3A+Command-line+overrides+for+ConnectDistributed+worker+properties

Jira: https://issues.apache.org/jira/browse/KAFKA-7060

GitHub PR: https://github.com/apache/kafka/pull/5234

-Kevin


[jira] [Resolved] (KAFKA-5237) SimpleConsumerShell logs terminating message to stdout instead of stderr

2018-06-18 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-5237.
--
Resolution: Auto Closed

Tools related to the old consumer are deprecated and will be removed in KAFKA-2983.

> SimpleConsumerShell logs terminating message to stdout instead of stderr
> 
>
> Key: KAFKA-5237
> URL: https://issues.apache.org/jira/browse/KAFKA-5237
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.10.2.1
>Reporter: Daniel Einspanjer
>Priority: Trivial
>
> The SimpleConsumerShell has one big advantage over the standard 
> kafka-console-consumer client, it supports the --no-wait-at-logend parameter 
> which lets you script its use without having to rely on a timeout or dealing 
> with the exception and stacktrace thrown by said timeout.
> Unfortunately, when you use this option, it will write a termination message 
> to stdout when it is finished.  This means if you are using it to dump the 
> contents of a topic to a file, you get an extra line.
> This pull request just changes that one line to call System.err.println 
> instead of println.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[VOTE] KIP-291: Have separate queues for control requests and data requests

2018-06-18 Thread Lucas Wang
Hi All,

I've addressed a couple of comments in the discussion thread for KIP-291,
and
got no objections after making the changes. Therefore I would like to start
the voting thread.

KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-291%3A+Have+separate+queues+for+control+requests+and+data+requests

Thanks for your time!
Lucas


[DISCUSS] - KIP-314: KTable to GlobalKTable Bi-directional Join

2018-06-18 Thread Adam Bellemare
Hi All

I created KIP-314 and I would like to initiate a discussion on it.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-314%3A+KTable+to+GlobalKTable+Bi-directional+Join

The primary goal of this KIP is to improve the way that Kafka can deal with
relational data at scale. This KIP would alter the way that GlobalKTables
can be used in relation to KTables. I believe that this would be a very
useful change but I need some eyes on the technical aspects to validate or
refute the strategy.
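
For context, a sketch of what the DSL supports today: joins against a GlobalKTable are driven from a KStream only, as below (topic names, key mapping, and value joining are illustrative); the KIP proposes richer KTable/GlobalKTable interplay:

    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.GlobalKTable;
    import org.apache.kafka.streams.kstream.KStream;

    StreamsBuilder builder = new StreamsBuilder();
    KStream<String, String> orders = builder.stream("orders");
    GlobalKTable<String, String> customers = builder.globalTable("customers");

    // Existing one-way join: each order record looks up its customer by a
    // key derived from the order value.
    KStream<String, String> enriched = orders.join(
        customers,
        (orderKey, orderValue) -> orderValue.split(",")[0], // map to customer id
        (orderValue, customerValue) -> orderValue + " | " + customerValue);

    enriched.to("orders-enriched");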

Thanks

Adam Bellemare


RE: [VOTE] KIP-280: Enhanced log compaction

2018-06-18 Thread Luís Cabral
Hi Guozhang,

Yes, that is what I meant (separate configs).
Though I would still prefer to keep it as it is, as it's a much simpler and 
cleaner approach; I’m not so sure that a potential client would really be so 
inconvenienced by having to use “_offset_” or “_timestamp_” as a header.

Cheers,
Luís


From: Guozhang Wang
Sent: 18 June 2018 19:35
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-280: Enhanced log compaction

Hello Luís,

I agree that having an expression evaluation as a config value is not the
best approach; if there are better ideas to allow users to specify a
header key which happens to be the same as one of the reserved config values
"offset" and "timestamp" (although the likelihood may be small; as Ted
mentioned, more reserved config values may be added in the future),
then I'd happily follow the suggestions. For example, we could have the
config value for header keys be "header-<key>"? Is that what you've
suggested? Or do you suggest using two configs instead, with the second
config specifying the key name and only being considered if the first
(i.e. currently proposed) config's value is `header`, and otherwise ignored?


Guozhang


On Mon, Jun 18, 2018 at 12:20 AM, Luís Cabral  wrote:

>  Hi Ted / Guozhang / Matthias,
>
> @Ted: I've now added your argument to the "Rejected Alternatives" portion
> of the KIP. Please keep in mind that I would like to keep this as backwards
> compatible as possible, so a lot of decisions are inferred from that intent.
>
> @Guozhang: IMHO, adding expression evaluation to configuration is an
> incorrect approach. If you absolutely insist on having this clear
> distinction between header/key, then I would suggest instead to have a
> dedicated property for the "key" part. Of course, this is your project so
> I'll just continue whatever approach moves this KIP forward...
>
> @Matthias: Sorry, but update the KIP according to what?
>
> Cheers,
> Luís
>
> On Monday, June 18, 2018, 2:55:17 AM GMT+2, Matthias J. Sax <
> matth...@confluent.io> wrote:
>
>  Well, for "offset" and "timestamp" policy, not communication between
> both is required.
>
> Only if headers are used, user A should communicate the corresponding
> header key to user B.
>
>
> @Luis: can you update the KIP accordingly?
>
>
>
> -Matthias
>
> On 6/17/18 5:36 PM, Ted Yu wrote:
> > My previous reply was just an alternative for consideration.
> >
> > bq.  then a second user B can add a header with key "offset" and thus
> break
> > the intention of user A
> >
> > I didn't see such scenario after reading the KIP. Maybe add this as
> > reasoning for the current approach ?
> >
> > I wonder how user B gets to know the intention of user A. Meaning, if
> user
> > B doesn't follow the norm set by user A, there still would be issue,
> right ?
> >
> >
> > On Sun, Jun 17, 2018 at 4:58 PM, Matthias J. Sax 
> > wrote:
> >
> >> Let me rephrase your answer to make sure I understand what you suggest:
> >>
> >> If compaction strategy is configured to use "offset", and if there is a
> >> header in the record with `key == offset`, then we should use the value
> >> of the record header instead of the actual record offset?
> >>
> >> Do I understand this correctly? If yes, what is the advantage of doing
> >> this? From my point of view, it might be problematic, because if user A
> >> creates a topic and configures "offset" compaction (with the intent that
> >> the record offset should be used), then a second user B can add a header
> >> with key "offset" and thus break the intention of user A.
> >>
> >> Also, if existing topics might have data with record header key
> >> "offset", the change would not be backward compatible either.
> >>
> >>
> >> -Matthias
> >>
> >> On 6/16/18 6:59 PM, Ted Yu wrote:
> >>> Pardon the brevity in my previous reply.
> >>> I was talking about this bullet:
> >>>
> >>> bq. When this configuration is set to anything other than "*offset*"
> or "
> >>> *timestamp*", then the record headers are scanned for a key matching
> this
> >>> value.
> >>>
> >>> My point is that if matching key in the header is found, its value
> should
> >>> take precedence over the value of the configuration.
> >>> I understand that such interpretation may have slight performance cost.
> >>>
> >>> Cheers
> >>>
> >>> On Sat, Jun 16, 2018 at 6:29 PM, Matthias J. Sax <
> matth...@confluent.io>
> >>> wrote:
> >>>
>  Ted,
> 
>  I am also not sure what you mean by "Shouldn't the selection in header
>  have higher precedence over the configuration"? What selection do you
>  mean? And what configuration?
> 
> 
>  About the first point, I think this is actually a valid concern: To
>  address this issue, it seems that we would need to change the accepted
>  format of the config. Instead of "offset", "timestamp", "<header-key>",
>  we could replace the last one with "header=<header-key>".
> 
>  WDYT?
> 
> 
>  -Matthias
> 
>  On 6/15/18 3:06 AM, Ted Yu wrote:
> > If selection exists 

Re: [VOTE] KIP-291: Have separate queues for control requests and data requests

2018-06-18 Thread Ted Yu
+1

On Mon, Jun 18, 2018 at 1:04 PM, Lucas Wang  wrote:

> Hi All,
>
> I've addressed a couple of comments in the discussion thread for KIP-291,
> and
> got no objections after making the changes. Therefore I would like to start
> the voting thread.
>
> KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 291%3A+Have+separate+queues+for+control+requests+and+data+requests
>
> Thanks for your time!
> Lucas
>


Re: [VOTE] KIP-280: Enhanced log compaction

2018-06-18 Thread Guozhang Wang
How about making the reserved values "_offset_" and "_timestamp_"
then? Currently in the KIP they are reserved as "offset" and "timestamp".


Guozhang

On Mon, Jun 18, 2018 at 1:40 PM, Luís Cabral 
wrote:

> Hi Guozhang,
>
> Yes, that is what I meant (separate configs).
> Though I would still prefer to keep it as it is, as it's a much simpler and
> cleaner approach; I’m not so sure that a potential client would really be
> so inconvenienced by having to use “_offset_” or “_timestamp_” as a header
>
> Cheers,
> Luís
>
>
> From: Guozhang Wang
> Sent: 18 June 2018 19:35
> To: dev@kafka.apache.org
> Subject: Re: [VOTE] KIP-280: Enhanced log compaction
>
> Hello Luís,
>
> I agree that having an expression evaluation as a config value is not the
> best approach; if there are better ideas to allow users to specify a
> header key which happens to be the same as one of the reserved config values
> "offset" and "timestamp" (although the likelihood may be small; as Ted
> mentioned, more reserved config values may be added in the future),
> then I'd happily follow the suggestions. For example, we could have the
> config value for header keys be "header-<key>"? Is that what you've
> suggested? Or do you suggest using two configs instead, with the second
> config specifying the key name and only being considered if the first
> (i.e. currently proposed) config's value is `header`, and otherwise ignored?
>
>
> Guozhang
>
>
> On Mon, Jun 18, 2018 at 12:20 AM, Luís Cabral
>  > wrote:
>
> >  Hi Ted / Guozhang / Matthias,
> >
> > @Ted: I've now added your argument to the "Rejected Alternatives" portion
> > of the KIP. Please keep in mind that I would like to keep this as
> backwards
> > compatible as possible, so a lot of decisions are inferred from that
> intent.
> >
> > @Guozhang: IMHO, adding expression evaluation to configuration is an
> > incorrect approach. If you absolutely insist on having this clear
> > distinction between header/key, then I would suggest instead to have a
> > dedicated property for the "key" part. Of course, this is your project so
> > I'll just continue whatever approach moves this KIP forward...
> >
> > @Matthias: Sorry, but update the KIP according to what?
> >
> > Cheers,
> > Luís
> >
> > On Monday, June 18, 2018, 2:55:17 AM GMT+2, Matthias J. Sax <
> > matth...@confluent.io> wrote:
> >
> >  Well, for "offset" and "timestamp" policy, not communication between
> > both is required.
> >
> > Only if headers are used, user A should communicate the corresponding
> > header key to user B.
> >
> >
> > @Luis: can you update the KIP accordingly?
> >
> >
> >
> > -Matthias
> >
> > On 6/17/18 5:36 PM, Ted Yu wrote:
> > > My previous reply was just an alternative for consideration.
> > >
> > > bq.  then a second user B can add a header with key "offset" and thus
> > break
> > > the intention of user A
> > >
> > > I didn't see such scenario after reading the KIP. Maybe add this as
> > > reasoning for the current approach ?
> > >
> > > I wonder how user B gets to know the intention of user A. Meaning, if
> > user
> > > B doesn't follow the norm set by user A, there still would be issue,
> > right ?
> > >
> > >
> > > On Sun, Jun 17, 2018 at 4:58 PM, Matthias J. Sax <
> matth...@confluent.io>
> > > wrote:
> > >
> > >> Let me rephrase your answer to make sure I understand what you
> suggest:
> > >>
> > >> If compaction strategy is configured to use "offset", and if there is a
> > >> header in the record with `key == offset`, then we should use the value
> > >> of the record header instead of the actual record offset?
> > >>
> > >> Do I understand this correctly? If yes, what is the advantage of doing
> > >> this? From my point of view, it might be problematic, because if user A
> > >> creates a topic and configures "offset" compaction (with the intent that
> > >> the record offset should be used), then a second user B can add a header
> > >> with key "offset" and thus break the intention of user A.
> > >>
> > >> Also, if existing topics might have data with record header key
> > >> "offset", the change would not be backward compatible either.
> > >>
> > >>
> > >> -Matthias
> > >>
> > >> On 6/16/18 6:59 PM, Ted Yu wrote:
> > >>> Pardon the brevity in my previous reply.
> > >>> I was talking about this bullet:
> > >>>
> > >>> bq. When this configuration is set to anything other than "*offset*"
> > or "
> > >>> *timestamp*", then the record headers are scanned for a key matching
> > this
> > >>> value.
> > >>>
> > >>> My point is that if matching key in the header is found, its value
> > should
> > >>> take precedence over the value of the configuration.
> > >>> I understand that such interpretation may have slight performance
> cost.
> > >>>
> > >>> Cheers
> > >>>
> > >>> On Sat, Jun 16, 2018 at 6:29 PM, Matthias J. Sax <
> > matth...@confluent.io>
> > >>> wrote:
> > >>>
> >  Ted,
> > 
> >  I am also not sure what you mean by "Shouldn't the selection in
> header
> >  have higher pre

[jira] [Resolved] (KAFKA-7071) specify number of partitions when using repartition logic

2018-06-18 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-7071.

Resolution: Duplicate

Duplicate of KAFKA-6037.

> specify number of partitions when using repartition logic
> -
>
> Key: KAFKA-7071
> URL: https://issues.apache.org/jira/browse/KAFKA-7071
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> Hey there, I was wondering whether it makes sense to specify the number of 
> partitions of the output topic from a repartition operation like groupBy, 
> flatMap, etc. The current DSL doesn't support adding a customized repartition 
> topic with a customized number of partitions. For example, if I want to 
> reduce the input topic from 8 partitions to 1, there is no easy solution but 
> to create a new topic instead (see the sketch below).
> cc [~guozhang] [~liquanpei]
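
A minimal sketch of the workaround described above ("create a new topic 
instead"). The topic names "input-8p" and "narrow-1p" are hypothetical, and 
"narrow-1p" is assumed to be pre-created with a single partition; 
`KStream#through()` then routes the stream via that manually created topic:

{code:java}
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

public class RepartitionWorkaround {
    public static void main(String[] args) {
        final StreamsBuilder builder = new StreamsBuilder();
        final KStream<String, String> input = builder.stream("input-8p");

        // Write to, and re-consume from, the manually created 1-partition
        // topic so all downstream processing runs as a single task.
        final KStream<String, String> narrowed = input.through("narrow-1p");

        narrowed.foreach((key, value) -> System.out.println(key + " -> " + value));
    }
}
{code}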



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


RE: [VOTE] KIP-280: Enhanced log compaction

2018-06-18 Thread Luís Cabral
I’m ok with that...

Ted / Matthias?


From: Guozhang Wang
Sent: 18 June 2018 22:49
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-280: Enhanced log compaction

How about making the reserved values "_offset_" and "_timestamp_"
then? Currently in the KIP they are reserved as "offset" and "timestamp".


Guozhang

On Mon, Jun 18, 2018 at 1:40 PM, Luís Cabral 
wrote:

> Hi Guozhang,
>
> Yes, that is what I meant (separate configs).
> Though I would still prefer to keep it as it is, as it’s a much simpler and
> cleaner approach – I’m not so sure that a potential client would really be
> so inconvenienced by having to use “_offset_” or “_timestamp_” as a header
>
> Cheers,
> Luís
>
>
> From: Guozhang Wang
> Sent: 18 June 2018 19:35
> To: dev@kafka.apache.org
> Subject: Re: [VOTE] KIP-280: Enhanced log compaction
>
> Hello Luís,
>
> I agree that having an expression evaluation as a config value is not the
> best approach; if there are better ideas to allow users to specify the
> header key which happens to be the same as the reserved config values
> "offset" and "timestamp" (although the likelihood may be small, as Ted
> mentioned, there may be more reserved config values added in the future),
> then I'd happily follow the suggestions. For example, we could have the
> config value for header keys as "header-<key>"? Is that what you've
> suggested? Or do you suggest using two configs instead, with the second
> config specifying the key name, which will only be considered if the first
> (i.e. currently proposed) config's value is `header`, and otherwise be ignored?
>
>
> Guozhang
>
>
> On Mon, Jun 18, 2018 at 12:20 AM, Luís Cabral
>  > wrote:
>
> >  Hi Ted / Guozhang / Matthias,
> >
> > @Ted: I've now added your argument to the "Rejected Alternatives" portion
> > of the KIP. Please keep in mind that I would like to keep this as
> backwards
> > compatible as possible, so a lot of decisions are inferred from that
> intent.
> >
> > @Guozhang: IMHO, adding expression evaluation to configuration is an
> > incorrect approach. If you absolutely insist on having this clear
> > distinction between header/key, then I would suggest instead to have a
> > dedicated property for the "key" part. Of course, this is your project so
> > I'll just continue whatever approach moves this KIP forward...
> >
> > @Matthias: Sorry, but update the KIP according to what?
> >
> > Cheers,
> > Luís
> >
> > On Monday, June 18, 2018, 2:55:17 AM GMT+2, Matthias J. Sax <
> > matth...@confluent.io> wrote:
> >
> >  Well, for the "offset" and "timestamp" policies, no communication between
> > both users is required.
> >
> > Only if headers are used, user A should communicate the corresponding
> > header key to user B.
> >
> >
> > @Luis: can you update the KIP accordingly?
> >
> >
> >
> > -Matthias
> >
> > On 6/17/18 5:36 PM, Ted Yu wrote:
> > > My previous reply was just an alternative for consideration.
> > >
> > > bq.  then a second user B can add a header with key "offset" and thus
> > break
> > > the intention of user A
> > >
> > > I didn't see such scenario after reading the KIP. Maybe add this as
> > > reasoning for the current approach ?
> > >
> > > I wonder how user B gets to know the intention of user A. Meaning, if
> > user
> > > B doesn't follow the norm set by user A, there would still be an issue,
> > right?
> > >
> > >
> > > On Sun, Jun 17, 2018 at 4:58 PM, Matthias J. Sax <
> matth...@confluent.io>
> > > wrote:
> > >
> > >> Let me rephrase your answer to make sure I understand what you
> suggest:
> > >>
> > >> If compaction strategy is configured to use "offset", and if there is
> a
> > >> header in the record with `key == offset`, then we should use the
> value
> > >> of the record header instead of the actual record offset?
> > >>
> > >> Do I understand this correctly? If yes, what is the advantage of doing
> > >> this? From my point of view, it might be problematic, because if user
> A
> > >> creates a topic and configures "offset" compaction (with the intent
> that
> > >> the record offset should be used), then a second user B can add a
> header
> > >> with key "offset" and thus break the intention of user A.
> > >>
> > >> Also, if existing topics might have data with record header key
> > >> "offset", the change would not be backward compatible either.
> > >>
> > >>
> > >> -Matthias
> > >>
> > >> On 6/16/18 6:59 PM, Ted Yu wrote:
> > >>> Pardon the brevity in my previous reply.
> > >>> I was talking about this bullet:
> > >>>
> > >>> bq. When this configuration is set to anything other than "*offset*"
> > or "
> > >>> *timestamp*", then the record headers are scanned for a key matching
> > this
> > >>> value.
> > >>>
> > >>> My point is that if a matching key in the header is found, its value
> > should
> > >>> take precedence over the value of the configuration.
> > >>> I understand that such an interpretation may have a slight performance
> cost.
> > >>>
> > >>> Cheers
> > >>>
> > >>> On Sat, Jun 16, 2018 at 6:29 PM, Matthias J. Sax <
> > matth...@con

Re: [VOTE] KIP-280: Enhanced log compaction

2018-06-18 Thread Matthias J. Sax
Luis,

I meant updating the "Rejected Alternatives" section, which you have
done already. Thx.

Originally, I also had the idea of a second config, but thought it
might be easier to just change the allowed values to be `offset`,
`timestamp`, and `header=<key>`. (We try to keep the number of configs small
if possible, as more configs are more confusing to users.)

I don't think that using `_offset_`, `_timestamp_`, and `<key>` solves
the problem, because users still might use `_something_` as a header key --
and if we want to introduce a new compaction strategy "something" later,
we face the same issues as without the underscores. We only reduce the
likelihood that it happens.

Using `header=<key>` as a prefix, or introducing a second config that is only
effective if the strategy is set to `header`, seems to be a cleaner solution.

@Luis: why do you think that using `header=<key>` is an "incorrect
approach"?

> Though I would still prefer to keep it as it is, as it’s a much simpler and
> cleaner approach – I’m not so sure that a potential client would
> really be so inconvenienced by having to use “_offset_” or
> “_timestamp_” as a header

I don't think that it's about the issue that people cannot use
`_offset_` or `_timestamp_` in their header (by "use" I mean for
compaction). With the current KIP, they cannot use `offset` or
`timestamp` either. The issue is that we cannot introduce a new
system-supported compaction strategy in the future without potentially
breaking something, as we basically assign the whole space of Strings (minus
`offset` and `timestamp`) as valid configs to enable header-based compaction.

Personally, I prefer either adding a config or going with
`header=<key>`. Using `_timestamp_`, `_offset_`, and `<key>` might be
good enough (even if this is the solution I like least). For this case,
we should state explicitly that the whole space of `_*_` is reserved
and users are not allowed to set those for header compaction. In fact, I
would also add a check for the config that only allows for `_offset_`
and `_timestamp_` and throws an exception for all other `_*_` values (a
sketch of such a check follows below).
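
A minimal sketch of such a check, assuming the strategy is passed around as a
plain String; the class, method name, and exception type are illustrative,
not KIP-280's actual code:

public class CompactionStrategyCheck {

    // Rejects reserved `_*_` values other than the two system-supported
    // strategies; anything else is interpreted as a header key.
    static String validate(final String value) {
        final boolean underscored = value.length() > 2
                && value.startsWith("_") && value.endsWith("_");
        if (underscored && !value.equals("_offset_") && !value.equals("_timestamp_")) {
            throw new IllegalArgumentException("'" + value + "' falls into the"
                    + " reserved `_*_` namespace; only _offset_ and _timestamp_"
                    + " are allowed there.");
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(validate("_offset_")); // ok: system strategy
        System.out.println(validate("version"));  // ok: treated as a header key
        System.out.println(validate("_foo_"));    // throws
    }
}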


-Matthias


On 6/18/18 2:03 PM, Luís Cabral wrote:
> I’m ok with that...
> 
> Ted / Matthias?
> 
> 
> From: Guozhang Wang
> Sent: 18 June 2018 22:49
> To: dev@kafka.apache.org
> Subject: Re: [VOTE] KIP-280: Enhanced log compaction
> 
> How about making the reserved values "_offset_" and "_timestamp_"
> then? Currently in the KIP they are reserved as "offset" and "timestamp".
> 
> 
> Guozhang
> 
> On Mon, Jun 18, 2018 at 1:40 PM, Luís Cabral 
> wrote:
> 
>> Hi Guozhang,
>>
>> Yes, that is what I meant (separate configs).
>> Though I would still prefer to keep it as it is, as it’s a much simpler and
>> cleaner approach – I’m not so sure that a potential client would really be
>> so inconvenienced by having to use “_offset_” or “_timestamp_” as a header
>>
>> Cheers,
>> Luís
>>
>>
>> From: Guozhang Wang
>> Sent: 18 June 2018 19:35
>> To: dev@kafka.apache.org
>> Subject: Re: [VOTE] KIP-280: Enhanced log compaction
>>
>> Hello Luís,
>>
>> I agree that having an expression evaluation as a config value is not the
>> best approach; if there are better ideas to allow users to specify the
>> header key which happens to be the same as the reserved config values
>> "offset" and "timestamp" (although the likelihood may be small, as Ted
>> mentioned, there may be more reserved config values added in the future),
>> then I'd happily follow the suggestions. For example, we could have the
>> config value for header keys as "header-<key>"? Is that what you've
>> suggested? Or do you suggest using two configs instead, with the second
>> config specifying the key name, which will only be considered if the first
>> (i.e. currently proposed) config's value is `header`, and otherwise be ignored?
>>
>>
>> Guozhang
>>
>>
>> On Mon, Jun 18, 2018 at 12:20 AM, Luís Cabral
>> >> wrote:
>>
>>>  Hi Ted / Guozhang / Matthias,
>>>
>>> @Ted: I've now added your argument to the "Rejected Alternatives" portion
>>> of the KIP. Please keep in mind that I would like to keep this as
>> backwards
>>> compatible as possible, so a lot of decisions are inferred from that
>> intent.
>>>
>>> @Guozhang: IMHO, adding expression evaluation to configuration is an
>>> incorrect approach. If you absolutely insist on having this clear
>>> distinction between header/key, then I would suggest instead to have a
>>> dedicated property for the "key" part. Of course, this is your project so
>>> I'll just continue whatever approach moves this KIP forward...
>>>
>>> @Matthias: Sorry, but update the KIP according to what?
>>>
>>> Cheers,
>>> Luís
>>>
>>> On Monday, June 18, 2018, 2:55:17 AM GMT+2, Matthias J. Sax <
>>> matth...@confluent.io> wrote:
>>>
>>>  Well, for the "offset" and "timestamp" policies, no communication between
>>> both users is required.
>>>
>>> Only if headers are used, user A should communicate the corresponding
>>> header key to user B.
>>>
>>>
>>> @Luis: can you update the KIP accordingly?
>>>
>>>
>>>
>>> -Matthi

[jira] [Created] (KAFKA-7072) Kafka Streams may drop rocksdb window segments before they expire

2018-06-18 Thread John Roesler (JIRA)
John Roesler created KAFKA-7072:
---

 Summary: Kafka Streams may drop rocksdb window segments before they 
expire
 Key: KAFKA-7072
 URL: https://issues.apache.org/jira/browse/KAFKA-7072
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 1.0.1, 1.1.0, 0.11.0.2, 1.0.0, 0.11.0.1, 0.11.0.0, 2.0.0
Reporter: John Roesler
Assignee: John Roesler
 Fix For: 2.1.0


The current implementation of Segments used by the RocksDB Session and Time window 
stores is in conflict with our current timestamp management model.

The current segmentation approach allows configuration of a fixed number of 
segments (let's say *4*) and a fixed retention time. We essentially divide up 
the retention time into the available number of segments:
{quote}{{<--------|------------------------------------------|}}
{{   expiration date                 right now}}
{{          \------------retention time------------/}}
{{          |  seg 0  |  seg 1  |  seg 2  |  seg 3  |}}
{quote}
Note that we keep one extra segment so that we can record new events, while 
some events in seg 0 are actually expired (but we only drop whole segments, so 
they just get to hang around).
{quote}{{<------------|------------------------------------------|}}
{{       expiration date                 right now}}
{{              \------------retention time------------/}}
{{          |  seg 0  |  seg 1  |  seg 2  |  seg 3  |}}
{quote}
When it's time to provision segment 4, we know that segment 0 is completely 
expired, so we drop it:
{quote}{{<------------------|------------------------------------------|}}
{{             expiration date                 right now}}
{{                    \------------retention time------------/}}
{{                    |  seg 1  |  seg 2  |  seg 3  |  seg 4  |}}
{quote}
 

However, the current timestamp management model allows for records from the 
future. Namely, because we define stream time as the minimum buffered timestamp 
(nondecreasing), we can have a buffer like this: [ 5, 2, 6 ], and our stream 
time will be 2, but we'll handle a record with timestamp 5 next. Referring to 
the example, this means we could wind up having to provision segment 4 before 
segment 0 expires!

Let's say "f" is our future event:
{quote}{{<------------------|------------------------------------------|--f}}
{{             expiration date                 right now}}
{{                    \------------retention time------------/}}
{{                    |  seg 1  |  seg 2  |  seg 3  |  seg 4  |}}
{quote}
Should we drop segment 0 prematurely? Or should we crash and refuse to 
process "f"?

Today, we do the former, and this is probably the better choice. If we refuse 
to process "f", then we cannot make progress ever again.

Dropping segment 0 prematurely is a bummer, but users could also set the 
retention time high enough that they don't think they'll actually get any 
events late enough to need segment 0. Worst case, we can have many future 
events without advancing stream time, each sparse enough to require its own 
segment; provisioning those segments would eat deeply into the retention time, 
dropping many segments that should still be live. (A simplified model of this 
follows below.)
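
To make the failure mode concrete, here is a simplified, self-contained model 
of the segment arithmetic sketched above (the constants are invented for 
illustration and this is not the actual Segments implementation):

{code:java}
public class SegmentModel {
    public static void main(String[] args) {
        final long numSegments = 4;
        final long retentionMs = 4_000;
        // Retention time divided across the available segments.
        final long segmentIntervalMs = retentionMs / (numSegments - 1); // 1333 ms

        final long streamTimeMs = 4_500;  // "right now"
        final long futureEventMs = 6_200; // the future event "f"

        final long newestSegment = streamTimeMs / segmentIntervalMs;    // 3
        final long futureSegment = futureEventMs / segmentIntervalMs;   // 4
        final long oldestLiveSegment = newestSegment - numSegments + 1; // 0

        // Provisioning segment 4 for "f" means dropping segment 0 early.
        System.out.printf("'f' needs segment %d while segment %d has not expired yet%n",
                futureSegment, oldestLiveSegment);
    }
}
{code}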



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] KIP-280: Enhanced log compaction

2018-06-18 Thread Guozhang Wang
Hi Matthias,

Yes, we are effectively assigning the whole space of Strings minus the
current reserved ones as header keys; honestly I think in practice users
wanting to use `_something_` would be very rare, but I admit it may still
be possible in theory.

I think Luis' point about "header=<key>" is that having an expression
evaluation as the config value is a bit weird, and thinking about it twice,
it is still not flawless: we can still argue that we are effectively
assigning the whole sub-space "header=*" of Strings for headers, and
what if users want to use a reserved value falling into that sub-space
(again, it should not really happen in practice, just being paranoid here)?

It seems that two configs are the common choice that everyone is happy with;
a configuration sketch follows below.
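
To make that concrete, a sketch of the two-config shape (both property names
below are placeholders, not finalized KIP-280 config names):

import java.util.Properties;

public class TwoConfigSketch {
    public static void main(String[] args) {
        final Properties props = new Properties();
        // The first config picks the strategy: "offset", "timestamp", or "header".
        props.put("log.cleaner.compaction.strategy", "header");
        // The second config names the header key; it is only consulted when
        // the first config is set to "header".
        props.put("log.cleaner.compaction.strategy.header", "version");
        System.out.println(props);
    }
}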

Guozhang


On Mon, Jun 18, 2018 at 2:35 PM, Matthias J. Sax 
wrote:

> Luis,
>
> I meant to update the "Rejected Alternative" sections, what you have
> done already. Thx.
>
> Originally, I also had the idea of a second config, but thought it
> might be easier to just change the allowed values to be `offset`,
> `timestamp`, and `header=<key>`. (We try to keep the number of configs small
> if possible, as more configs are more confusing to users.)
>
> I don't think that using `_offset_`, `_timestamp_`, and `<key>` solves
> the problem, because users still might use `_something_` as a header key --
> and if we want to introduce a new compaction strategy "something" later,
> we face the same issues as without the underscores. We only reduce the
> likelihood that it happens.
>
> Using `header=<key>` as a prefix, or introducing a second config that is only
> effective if the strategy is set to `header`, seems to be a cleaner
> solution.
>
> @Luis: why do you think that using `header=<key>` is an "incorrect
> approach"?
>
> > Though I would still prefer to keep it as it is, as it’s a much simpler
> > and cleaner approach – I’m not so sure that a potential client would
> > really be so inconvenienced by having to use “_offset_” or
> > “_timestamp_” as a header
>
> I don't think that it's about the issue that people cannot use
> `_offset_` or `_timestamp_` in their header (by "use" I mean for
> compaction). With the current KIP, they cannot use `offset` or
> `timestamp` either. The issue is that we cannot introduce a new
> system-supported compaction strategy in the future without potentially
> breaking something, as we basically assign the whole space of Strings (minus
> `offset` and `timestamp`) as valid configs to enable header-based compaction.
>
> Personally, I prefer either adding a config or going with
> `header=<key>`. Using `_timestamp_`, `_offset_`, and `<key>` might be
> good enough (even if this is the solution I like least). For this case,
> we should state explicitly that the whole space of `_*_` is reserved
> and users are not allowed to set those for header compaction. In fact, I
> would also add a check for the config that only allows for `_offset_`
> and `_timestamp_` and throws an exception for all other `_*_` values.
>
>
> -Matthias
>
>
> On 6/18/18 2:03 PM, Luís Cabral wrote:
> > I’m ok with that...
> >
> > Ted / Matthias?
> >
> >
> > From: Guozhang Wang
> > Sent: 18 June 2018 22:49
> > To: dev@kafka.apache.org
> > Subject: Re: [VOTE] KIP-280: Enhanced log compaction
> >
> > How about making the reserved values "_offset_" and "_timestamp_"
> > then? Currently in the KIP they are reserved as "offset" and "timestamp".
> >
> >
> > Guozhang
> >
> > On Mon, Jun 18, 2018 at 1:40 PM, Luís Cabral
> 
> > wrote:
> >
> >> Hi Guozhang,
> >>
> >> Yes, that is what I meant (separate configs).
> >> Though I would still prefer to keep it as it is, as it’s a much simpler
> and
> >> cleaner approach – I’m not so sure that a potential client would really
> be
> >> so inconvenienced by having to use “_offset_” or “_timestamp_” as a
> header
> >>
> >> Cheers,
> >> Luís
> >>
> >>
> >> From: Guozhang Wang
> >> Sent: 18 June 2018 19:35
> >> To: dev@kafka.apache.org
> >> Subject: Re: [VOTE] KIP-280: Enhanced log compaction
> >>
> >> Hello Luís,
> >>
> >> I agree that having an expression evaluation as a config value is not
> the
> >> best approach; if there are better ideas to allow users to specify the
> >> header key which happens to be the same as the reserved config values
> >> "offset" and "timestamp" (although the likelihood may be small, as Ted
> >> mentioned, there may be more reserved config values added in the
> future),
> >> then I'd happily follow the suggestions. For example, we could have
> the
> >> config value for header keys as "header-<key>"? Is that what you've
> >> suggested? Or do you suggest using two configs instead, with the second
> >> config specifying the key name, which will only be considered if the first
> >> (i.e. currently proposed) config's value is `header`, and otherwise be
> ignored?
> >>
> >>
> >> Guozhang
> >>
> >>
> >> On Mon, Jun 18, 2018 at 12:20 AM, Luís Cabral
> >>  >>> wrote:
> >>
> >>>  Hi Ted / Guozhang / Matthias,
> >>>
> >>> @Ted: I've now added your argument to the "Rejecte

[jira] [Resolved] (KAFKA-7067) ConnectRestApiTest fails assertion

2018-06-18 Thread Jason Gustafson (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-7067.

Resolution: Fixed

> ConnectRestApiTest fails assertion
> --
>
> Key: KAFKA-7067
> URL: https://issues.apache.org/jira/browse/KAFKA-7067
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.0.0
>Reporter: Magesh kumar Nandakumar
>Assignee: Magesh kumar Nandakumar
>Priority: Blocker
> Fix For: 2.0.0
>
>
> ConnectRestApiTest fails assertion for the test_rest_api. The test needs to 
> be updated to include the new configs added in 2.0 in the expected result.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[DISCUSS] KIP-317: Transparent Data Encryption

2018-06-18 Thread Sönke Liebau
Hi everybody,

I've created a draft version of KIP-317 which describes the addition
of transparent data encryption functionality to Kafka.

Please consider this as a basis for discussion - I am aware that this
is not at a level of detail sufficient for implementation, but I
wanted to get some feedback from the community on the general idea
before spending more time on this.

Link to the KIP is:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-317%3A+Add+transparent+data+encryption+functionality

Best regards,
Sönke


Jenkins build is back to normal : kafka-trunk-jdk8 #2749

2018-06-18 Thread Apache Jenkins Server
See 




Re: Error in Kafka Stream

2018-06-18 Thread Guozhang Wang
Interesting, it indeed seems like a lurking issue in Kafka Streams.

Which Kafka version are you using?


Guozhang

On Mon, Jun 18, 2018 at 12:32 AM, Amandeep Singh 
wrote:

> Hi  Guozhang,
>
> The file system is XFS and the folder is not a temp folder. The issue goes
> away when I restart the streams. I forgot to mention I am running 3
> instances of the consumer on 3 machines.
> Also, this issue seems to be reported by other users too:
> https://issues.apache.org/jira/browse/KAFKA-5998
>
>
>
> Regards,
> Amandeep Singh
> +91-7838184964
>
>
> On Mon, Jun 18, 2018 at 6:45 AM Guozhang Wang  wrote:
>
> > Hello Amandeep,
> >
> > What file system are you using? Also is `/opt/info` a temp folder that
> can
> > be auto-cleared from time to time?
> >
> >
> > Guozhang
> >
> > On Fri, Jun 15, 2018 at 6:39 AM, Amandeep Singh 
> > wrote:
> >
> > > Hi,
> > >
> > >
> > >
> > >  I am getting the below error while processing data with Kafka Streams.
> > > The application was running for a couple of hours and the
> > > 'WatchlistUpdate-StreamThread-9' thread was assigned to the same
> > > partition since the beginning. I am assuming it was able to successfully
> > > commit offsets for those couple of hours and the directory
> > > '/opt/info/abc/cache/test-abc-kafka-processor/kafka-streams/UI-Watchlist-ES-App/0_2'
> > > did exist for that period.
> > >
> > >  And then I started getting the below error every 30 secs (probably
> > > because of the offset commit interval) and messages are being missed from
> > > processing.
> > >
> > > Can you please help?
> > >
> > >
> > > 2018-06-15 08:47:58 [WatchlistUpdate-StreamThread-9] WARN
> > > o.a.k.s.p.i.ProcessorStateManager:246 - task [0_2] Failed to write
> > > checkpoint file to
> > > /opt/info/abc/cache/test-abc-kafka-processor/kafka-streams/UI-Watchlist-ES-App/0_2/.checkpoint:
> > >
> > > java.io.FileNotFoundException:
> > > /opt/info/abc/cache/test-abc-kafka-processor/kafka-streams/UI-Watchlist-ES-App/0_2/.checkpoint.tmp
> > > (No such file or directory)
> > >         at java.io.FileOutputStream.open0(Native Method) ~[na:1.8.0_141]
> > >         at java.io.FileOutputStream.open(FileOutputStream.java:270) ~[na:1.8.0_141]
> > >         at java.io.FileOutputStream.<init>(FileOutputStream.java:213) ~[na:1.8.0_141]
> > >         at java.io.FileOutputStream.<init>(FileOutputStream.java:162) ~[na:1.8.0_141]
> > >         at org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) ~[kafka-streams-1.0.0.jar:na]
> > >         at org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:320) ~[kafka-streams-1.0.0.jar:na]
> > >         at org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:306) [kafka-streams-1.0.0.jar:na]
> > >         at org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:208) [kafka-streams-1.0.0.jar:na]
> > >         at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:299) [kafka-streams-1.0.0.jar:na]
> > >         at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:289) [kafka-streams-1.0.0.jar:na]
> > >         at org.apache.kafka.streams.processor.internals.AssignedTasks$2.apply(AssignedTasks.java:87) [kafka-streams-1.0.0.jar:na]
> > >         at org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:451) [kafka-streams-1.0.0.jar:na]
> > >         at org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:380) [kafka-streams-1.0.0.jar:na]
> > >         at org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:309) [kafka-streams-1.0.0.jar:na]
> > >         at org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1018) [kafka-streams-1.0.0.jar:na]
> > >         at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:835) [kafka-streams-1.0.0.jar:na]
> > >         at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774) [kafka-streams-1.0.0.jar:na]
> > >         at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744) [kafka-streams-1.0.0.jar:na]
> > >
> > >
> > > Stream config:
> > >
> > > 2018-06-

[jira] [Resolved] (KAFKA-7023) Kafka Streams RocksDB bulk loading config may not be honored with customized RocksDBConfigSetter

2018-06-18 Thread Guozhang Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-7023.
--
   Resolution: Fixed
Fix Version/s: 2.0.0

> Kafka Streams RocksDB bulk loading config may not be honored with customized 
> RocksDBConfigSetter 
> -
>
> Key: KAFKA-7023
> URL: https://issues.apache.org/jira/browse/KAFKA-7023
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.1.0
>Reporter: Liquan Pei
>Assignee: Liquan Pei
>Priority: Major
> Fix For: 2.0.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We observed frequent L0 -> L1 compaction during Kafka Streams state recovery. 
> Some sample log:
> {code:java}
> 2018/06/08-00:04:50.892331 7f8a6d7fa700 (Original Log Time 
> 2018/06/08-00:04:50.892298) [db/compaction_picker_universal.cc:270] [default] 
> Universal: sorted runs files(6): files[3 0 0 0 1 1 38] max score 1.00
> 2018/06/08-00:04:50.892336 7f8a6d7fa700 (Original Log Time 
> 2018/06/08-00:04:50.892300) [db/compaction_picker_universal.cc:655] [default] 
> Universal: First candidate file 134[0] to reduce size amp.
> 2018/06/08-00:04:50.892338 7f8a6d7fa700 (Original Log Time 
> 2018/06/08-00:04:50.892302) [db/compaction_picker_universal.cc:686] [default] 
> Universal: size amp not needed. newer-files-total-size 13023497 
> earliest-file-size 2541530372
> 2018/06/08-00:04:50.892339 7f8a6d7fa700 (Original Log Time 
> 2018/06/08-00:04:50.892303) [db/compaction_picker_universal.cc:473] [default] 
> Universal: Possible candidate file 134[0].
> 2018/06/08-00:04:50.892341 7f8a6d7fa700 (Original Log Time 
> 2018/06/08-00:04:50.892304) [db/compaction_picker_universal.cc:525] [default] 
> Universal: Skipping file 134[0] with size 1007 (compensated size 1287)
> 2018/06/08-00:04:50.892343 7f8a6d7fa700 (Original Log Time 
> 2018/06/08-00:04:50.892306) [db/compaction_picker_universal.cc:473] [default] 
> Universal: Possible candidate file 133[1].
> 2018/06/08-00:04:50.892344 7f8a6d7fa700 (Original Log Time 
> 2018/06/08-00:04:50.892307) [db/compaction_picker_universal.cc:525] [default] 
> Universal: Skipping file 133[1] with size 4644 (compensated size 16124)
> 2018/06/08-00:04:50.892346 7f8a6d7fa700 (Original Log Time 
> 2018/06/08-00:04:50.892307) [db/compaction_picker_universal.cc:473] [default] 
> Universal: Possible candidate file 126[2].
> 2018/06/08-00:04:50.892348 7f8a6d7fa700 (Original Log Time 
> 2018/06/08-00:04:50.892308) [db/compaction_picker_universal.cc:525] [default] 
> Universal: Skipping file 126[2] with size 319764 (compensated size 319764)
> 2018/06/08-00:04:50.892349 7f8a6d7fa700 (Original Log Time 
> 2018/06/08-00:04:50.892309) [db/compaction_picker_universal.cc:473] [default] 
> Universal: Possible candidate level 4[3].
> 2018/06/08-00:04:50.892351 7f8a6d7fa700 (Original Log Time 
> 2018/06/08-00:04:50.892310) [db/compaction_picker_universal.cc:525] [default] 
> Universal: Skipping level 4[3] with size 2815574 (compensated size 2815574)
> 2018/06/08-00:04:50.892352 7f8a6d7fa700 (Original Log Time 
> 2018/06/08-00:04:50.892311) [db/compaction_picker_universal.cc:473] [default] 
> Universal: Possible candidate level 5[4].
> 2018/06/08-00:04:50.892357 7f8a6d7fa700 (Original Log Time 
> 2018/06/08-00:04:50.892311) [db/compaction_picker_universal.cc:525] [default] 
> Universal: Skipping level 5[4] with size 9870748 (compensated size 9870748)
> 2018/06/08-00:04:50.892358 7f8a6d7fa700 (Original Log Time 
> 2018/06/08-00:04:50.892313) [db/compaction_picker_universal.cc:473] [default] 
> Universal: Possible candidate level 6[5].
> {code}
> In our customized RocksDBConfigSetter, we set 
> {code:java}
> level0_file_num_compaction_trigger=6 {code}
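
The ticket does not show the setter itself, but a customized
RocksDBConfigSetter of the kind referred to above would plausibly look like
this sketch (using the public interface; the class name is made up):

{code:java}
import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.Options;

public class CustomRocksDBConfig implements RocksDBConfigSetter {
    @Override
    public void setConfig(final String storeName, final Options options,
                          final Map<String, Object> configs) {
        // Re-enables early L0 -> L1 compactions by overriding the (1 << 30)
        // value that the bulk-loading options set during state recovery.
        options.setLevel0FileNumCompactionTrigger(6);
    }
}
{code}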
> During bulk loading, the following options are set: 
> [https://github.com/facebook/rocksdb/blob/master/options/options.cc] 
> {code:java}
> Options*
> Options::PrepareForBulkLoad()
> {
> // never slowdown ingest.
> level0_file_num_compaction_trigger = (1<<30);
> level0_slowdown_writes_trigger = (1<<30);
> level0_stop_writes_trigger = (1<<30);
> soft_pending_compaction_bytes_limit = 0;
> hard_pending_compaction_bytes_limit = 0;
> // no auto compactions please. The application should issue a
> // manual compaction after all data is loaded into L0.
> disable_auto_compactions = true;
> // A manual compaction run should pick all files in L0 in
> // a single compaction run.
> max_compaction_bytes = (static_cast<uint64_t>(1) << 60);
> // It is better to have only 2 levels, otherwise a manual
> // compaction would compact at every possible level, thereby
> // increasing the total time needed for compactions.
> num_levels = 2;
> // Need to allow more write buffers to allow more parallelism
> // of flushes.
> max_write_buffer_number

Build failed in Jenkins: kafka-trunk-jdk10 #232

2018-06-18 Thread Apache Jenkins Server
See 


Changes:

[jason] KAFKA-7067; Include new connector configs in system test assertion

--
[...truncated 1.57 MB...]
kafka.log.BrokerCompressionTest > testBrokerSideCompression[18] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[18] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[19] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[19] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerConfigUpdateTest[0] STARTED

kafka.log.LogCleanerIntegrationTest > cleanerConfigUpdateTest[0] PASSED

kafka.log.LogCleanerIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[0] STARTED

kafka.log.LogCleanerIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[0] PASSED

kafka.log.LogCleanerIntegrationTest > 
testCleaningNestedMessagesWithMultipleVersions[0] STARTED

kafka.log.LogCleanerIntegrationTest > 
testCleaningNestedMessagesWithMultipleVersions[0] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[0] STARTED

kafka.log.LogCleanerIntegrationTest > cleanerTest[0] PASSED

kafka.log.LogCleanerIntegrationTest > testCleanerWithMessageFormatV0[0] STARTED

kafka.log.LogCleanerIntegrationTest > testCleanerWithMessageFormatV0[0] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerConfigUpdateTest[1] STARTED

kafka.log.LogCleanerIntegrationTest > cleanerConfigUpdateTest[1] PASSED

kafka.log.LogCleanerIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[1] STARTED

kafka.log.LogCleanerIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[1] PASSED

kafka.log.LogCleanerIntegrationTest > 
testCleaningNestedMessagesWithMultipleVersions[1] STARTED

kafka.log.LogCleanerIntegrationTest > 
testCleaningNestedMessagesWithMultipleVersions[1] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[1] STARTED

kafka.log.LogCleanerIntegrationTest > cleanerTest[1] PASSED

kafka.log.LogCleanerIntegrationTest > testCleanerWithMessageFormatV0[1] STARTED

kafka.log.LogCleanerIntegrationTest > testCleanerWithMessageFormatV0[1] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerConfigUpdateTest[2] STARTED

kafka.log.LogCleanerIntegrationTest > cleanerConfigUpdateTest[2] PASSED

kafka.log.LogCleanerIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[2] STARTED

kafka.log.LogCleanerIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[2] PASSED

kafka.log.LogCleanerIntegrationTest > 
testCleaningNestedMessagesWithMultipleVersions[2] STARTED

kafka.log.LogCleanerIntegrationTest > 
testCleaningNestedMessagesWithMultipleVersions[2] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[2] STARTED

kafka.log.LogCleanerIntegrationTest > cleanerTest[2] PASSED

kafka.log.LogCleanerIntegrationTest > testCleanerWithMessageFormatV0[2] STARTED

kafka.log.LogCleanerIntegrationTest > testCleanerWithMessageFormatV0[2] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerConfigUpdateTest[3] STARTED

kafka.log.LogCleanerIntegrationTest > cleanerConfigUpdateTest[3] PASSED

kafka.log.LogCleanerIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[3] STARTED

kafka.log.LogCleanerIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[3] PASSED

kafka.log.LogCleanerIntegrationTest > 
testCleaningNestedMessagesWithMultipleVersions[3] STARTED

kafka.log.LogCleanerIntegrationTest > 
testCleaningNestedMessagesWithMultipleVersions[3] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[3] STARTED

kafka.log.LogCleanerIntegrationTest > cleanerTest[3] PASSED

kafka.log.LogCleanerIntegrationTest > testCleanerWithMessageFormatV0[3] STARTED

kafka.log.LogCleanerIntegrationTest > testCleanerWithMessageFormatV0[3] PASSED

kafka.log.ProducerStateManagerTest > testCoordinatorFencing STARTED

kafka.log.ProducerStateManagerTest > testCoordinatorFencing PASSED

kafka.log.ProducerStateManagerTest > testTruncate STARTED

kafka.log.ProducerStateManagerTest > testTruncate PASSED

kafka.log.ProducerStateManagerTest > testLoadFromTruncatedSnapshotFile STARTED

kafka.log.ProducerStateManagerTest > testLoadFromTruncatedSnapshotFile PASSED

kafka.log.ProducerStateManagerTest > testRemoveExpiredPidsOnReload STARTED

kafka.log.ProducerStateManagerTest > testRemoveExpiredPidsOnReload PASSED

kafka.log.ProducerStateManagerTest > 
testOutOfSequenceAfterControlRecordEpochBump STARTED

kafka.log.ProducerStateManagerTest > 
testOutOfSequenceAfterControlRecordEpochBump PASSED

kafka.log.ProducerStateManagerTest > testFirstUnstableOffsetAfterTruncation 
STARTED

kafka.log.ProducerStateManagerTest > testFirstUnstableOffsetAfterTruncation 
PASSED

kafka.log.ProducerStateManagerTest > testTakeSnapshot STARTED

kafka.log.ProducerStateManagerTest > testTakeSnapshot PASSED

kafka.log.ProducerStateManagerTest > testDeleteSnapshotsBefore STARTED

kafka.log.ProducerStateManagerTest > testDeleteSnapshotsBefore PASSED

kafka.log.ProducerStateManagerTest > 
testNonMatch

Re: [DISCUSS] - KIP-314: KTable to GlobalKTable Bi-directional Join

2018-06-18 Thread Matthias J. Sax
Adam,

thanks a lot for the KIP. I agree that this would be a valuable feature
to add. It's a very complex one though. You correctly pointed out that
the GlobalKTable (or global stores in general) cannot be the "driver"
atm and are passively updated only. This is by design. Are you familiar
with the KIP discussion of KIP-99?
(https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=67633649)
It would be worth revisiting that discussion to understand the tradeoffs and
design decisions.

It's unclear to me what the impact will be if we want to change the
current design. Even if no GlobalKTable is used, it might have an impact on
performance and certainly on code complexity. Overall, it seems that a
POC might be required before we can consider adding this (with the
danger that it does not get accepted in the end).

Are you aware of KIP-213:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-213+Support+non-key+joining+in+KTable

It suggests adding non-key joins, and a lot of issues around how to implement
this were already discussed. As a KTable-GlobalKTable join is a non-key
join, too, it seems that those discussions apply to your KIP as well.

Hope this helps to make the next steps.


-Matthias


On 6/18/18 1:15 PM, Adam Bellemare wrote:
> Hi All
> 
> I created KIP-314 and I would like to initiate a discussion on it.
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-314%3A+KTable+to+GlobalKTable+Bi-directional+Join
> 
> The primary goal of this KIP is to improve the way that Kafka can deal with
> relational data at scale. This KIP would alter the way that GlobalKTables
> can be used in relation to KTables. I believe that this would be a very
> useful change but I need some eyes on the technical aspects to validate or
> refute the strategy.
> 
> Thanks
> 
> Adam Bellemare
> 



signature.asc
Description: OpenPGP digital signature


Re: [DISCUSS] KIP-317: Transparent Data Encryption

2018-06-18 Thread Stephane Maarek
Hi Sonke

Very much needed feature and discussion. FYI the image links seem broken.

My 2 cents (if I understood correctly): you say "This process will be
implemented after Serializer and Interceptors are done with the message
right before it is added to the batch to be sent, in order to ensure that
existing serializers and interceptors keep working with encryption just
like without it."

I think encryption should happen AFTER a batch is created, right before it
is sent. The reason is that if we still want to keep the advantage of
compression, encryption needs to happen after it (and I believe compression
happens at the batch level).
So to me, for a producer: serializer / interceptors => batching =>
compression => encryption => send,
and the inverse for a consumer. (A minimal sketch of this ordering follows
below.)
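
A minimal, self-contained sketch of that ordering using only the JDK (this is
an assumption about how such a pipeline could be wired, not Kafka's actual
producer internals):

import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class CompressThenEncrypt {
    public static void main(String[] args) throws Exception {
        final byte[] batch = "record-1|record-2|record-3".getBytes(StandardCharsets.UTF_8);

        // 1. Compress the plaintext batch while its redundancy is still visible.
        final Deflater deflater = new Deflater();
        deflater.setInput(batch);
        deflater.finish();
        final byte[] buf = new byte[batch.length * 2];
        final int compressedLen = deflater.deflate(buf);

        // 2. Encrypt the compressed bytes right before the "send"; encrypting
        //    first would leave the compressor with incompressible ciphertext.
        final SecretKey key = KeyGenerator.getInstance("AES").generateKey();
        final Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        final byte[] wire = cipher.doFinal(buf, 0, compressedLen);

        System.out.println("batch=" + batch.length + "B, compressed=" + compressedLen
                + "B, encrypted=" + wire.length + "B");
    }
}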

Regards
Stephane

On 19 June 2018 at 06:46, Sönke Liebau 
wrote:

> Hi everybody,
>
> I've created a draft version of KIP-317 which describes the addition
> of transparent data encryption functionality to Kafka.
>
> Please consider this as a basis for discussion - I am aware that this
> is not at a level of detail sufficient for implementation, but I
> wanted to get some feedback from the community on the general idea
> before spending more time on this.
>
> Link to the KIP is:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 317%3A+Add+transparent+data+encryption+functionality
>
> Best regards,
> Sönke
>