[jira] [Commented] (KAFKA-8315) Cannot pass Materialized into a join operation - hence cant set retention period independent of grace
[ https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832325#comment-16832325 ] Andrew commented on KAFKA-8315: --- The fudge factor is the \{windowstore.changelog.additional.retention.ms} which is set to 24h by default, so that seems to add up to 5 days (120 hours). You understand my use case well. The choice of limiting the grace was to reduce the performance overhead during the large historical processing. I definitely have lots of data beyond this that should join, and the joins and aggregation within the latest 120 hours seem to be correct and complete. This seems to be just related to retention. I will experiment with increasing \{windowstore.changelog.additional.retention.ms} to see if it brings in more data. One oddity of our left stream is that it contains records in batches from different devices. Each batch is about 1000 records and contiguous within the stream. Within a batch the records are in increasing timestamp order. Subsequent batches from different devices will be within 1-2 days of the previous batch (we don't have, say, a batch for 1/1/2019 followed by a batch for 4/1/2019 or a batch for 24/12/2019, for example). The right stream contains single records, not batches, with similar date ordering (i.e. subsequent records should be within 1-2 days of each other). My understanding is that the windows are closed when streamtime - windowstart > windowsize + grace. So, as stream time increases when newer batches arrive, joins/aggregations should continue until a batch arrives after windowsize + grace. However, we are not seeing this. > Cannot pass Materialized into a join operation - hence cant set retention > period independent of grace > - > > Key: KAFKA-8315 > URL: https://issues.apache.org/jira/browse/KAFKA-8315 > Project: Kafka > Issue Type: Bug >Reporter: Andrew >Assignee: John Roesler >Priority: Major > > The documentation says to use `Materialized` not `JoinWindows.until()` > ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), > but there is no where to pass a `Materialized` instance to the join > operation, only to the group operation is supported it seems. > > Slack conversation here : > [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300] > [Additional] > From what I understand, the retention period should be independent of the > grace period, so I think this is more than a documentation fix (see comments > below) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
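For context, a minimal sketch of how the window size, the grace period, and the changelog "fudge factor" fit together on the current API (the join itself still accepts no Materialized, so grace is the only retention knob on the join stores). Topic names, serdes, and durations below are placeholders, not the application discussed above:
{code:java}
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.Joined;
import org.apache.kafka.streams.kstream.KStream;

public class JoinRetentionSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "join-retention-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Raises the extra retention added to the internal JOINTHIS/JOINOTHER
        // changelog topics when they are created (default is 24h).
        props.put(StreamsConfig.WINDOW_STORE_CHANGE_LOG_ADDITIONAL_RETENTION_MS_CONFIG,
                  Duration.ofHours(840).toMillis());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> left =
                builder.stream("left-topic", Consumed.with(Serdes.String(), Serdes.String()));
        KStream<String, String> right =
                builder.stream("right-topic", Consumed.with(Serdes.String(), Serdes.String()));

        // Window size and grace are set on JoinWindows (until() is deprecated);
        // the join-store retention follows size + grace.
        JoinWindows windows = JoinWindows.of(Duration.ofDays(2))
                                         .grace(Duration.ofDays(3));

        left.join(right,
                  (l, r) -> l + "|" + r,
                  windows,
                  Joined.with(Serdes.String(), Serdes.String(), Serdes.String()))
            .to("joined-topic");

        // new KafkaStreams(builder.build(), props).start();
    }
}
{code}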
[jira] [Commented] (KAFKA-7697) Possible deadlock in kafka.cluster.Partition
[ https://issues.apache.org/jira/browse/KAFKA-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832406#comment-16832406 ] Rajini Sivaram commented on KAFKA-7697: --- [~jwesteen] [~nsnmurthy] Thank you! The deadlock was fixed in 2.2.0 and 2.1.1. We are keen to see if there are other similar issues still remaining in these two releases, so that we can fix them before the next release. If experiencing this issue in 2.1.0, please upgrade to 2.1.1. > Possible deadlock in kafka.cluster.Partition > > > Key: KAFKA-7697 > URL: https://issues.apache.org/jira/browse/KAFKA-7697 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Gian Merlino >Assignee: Rajini Sivaram >Priority: Blocker > Fix For: 2.2.0, 2.1.1 > > Attachments: kafka.log, threaddump.txt > > > After upgrading a fairly busy broker from 0.10.2.0 to 2.1.0, it locked up > within a few minutes (by "locked up" I mean that all request handler threads > were busy, and other brokers reported that they couldn't communicate with > it). I restarted it a few times and it did the same thing each time. After > downgrading to 0.10.2.0, the broker was stable. I attached a thread dump from > the last attempt on 2.1.0 that shows lots of kafka-request-handler- threads > trying to acquire the leaderIsrUpdateLock lock in kafka.cluster.Partition. > It jumps out that there are two threads that already have some read lock > (can't tell which one) and are trying to acquire a second one (on two > different read locks: 0x000708184b88 and 0x00070821f188): > kafka-request-handler-1 and kafka-request-handler-4. Both are handling a > produce request, and in the process of doing so, are calling > Partition.fetchOffsetSnapshot while trying to complete a DelayedFetch. At the > same time, both of those locks have writers from other threads waiting on > them (kafka-request-handler-2 and kafka-scheduler-6). Neither of those locks > appear to have writers that hold them (if only because no threads in the dump > are deep enough in inWriteLock to indicate that). > ReentrantReadWriteLock in nonfair mode prioritizes waiting writers over > readers. Is it possible that kafka-request-handler-1 and > kafka-request-handler-4 are each trying to read-lock the partition that is > currently locked by the other one, and they're both parked waiting for > kafka-request-handler-2 and kafka-scheduler-6 to get write locks, which they > never will, because the former two threads own read locks and aren't giving > them up? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
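For readers following the thread dump, a self-contained sketch of the suspected interleaving; this is not Kafka code, and the thread and lock names are only stand-ins for the request handlers and partition locks named above:
{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Two "request handler" threads each hold one read lock and then try to take the
// other; once a writer is queued on each lock, a non-fair ReentrantReadWriteLock
// parks the new readers behind the queued writers, and nothing can proceed.
public class ReadLockDeadlockSketch {

    public static void main(String[] args) throws InterruptedException {
        ReentrantReadWriteLock lockA = new ReentrantReadWriteLock(); // stand-in for partition P1's leaderIsrUpdateLock
        ReentrantReadWriteLock lockB = new ReentrantReadWriteLock(); // stand-in for partition P2's leaderIsrUpdateLock
        CountDownLatch firstLocksHeld = new CountDownLatch(2);
        CountDownLatch writersQueued = new CountDownLatch(1);

        Runnable handler1 = () -> {
            lockA.readLock().lock();          // handling a produce for P1
            firstLocksHeld.countDown();
            awaitQuietly(writersQueued);
            lockB.readLock().lock();          // completing a DelayedFetch for P2: parks behind the queued writer
        };
        Runnable handler4 = () -> {
            lockB.readLock().lock();          // handling a produce for P2
            firstLocksHeld.countDown();
            awaitQuietly(writersQueued);
            lockA.readLock().lock();          // completing a DelayedFetch for P1: parks behind the queued writer
        };
        new Thread(handler1, "kafka-request-handler-1").start();
        new Thread(handler4, "kafka-request-handler-4").start();

        firstLocksHeld.await();
        // Writers arrive (like kafka-request-handler-2 / kafka-scheduler-6) and queue
        // on both locks; they can never acquire because each lock stays read-held.
        new Thread(() -> lockA.writeLock().lock(), "writer-A").start();
        new Thread(() -> lockB.writeLock().lock(), "writer-B").start();
        Thread.sleep(500);                    // crude: give the writers time to reach the wait queues
        writersQueued.countDown();            // readers now attempt their second read lock -> deadlock
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try { latch.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
{code}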
[jira] [Commented] (KAFKA-8316) Remove deprecated usage of Slf4jRequestLog, SslContextFactory
[ https://issues.apache.org/jira/browse/KAFKA-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832415#comment-16832415 ] ASF GitHub Bot commented on KAFKA-8316: --- dongjinleekr commented on pull request #6668: KAFKA-8316: Remove deprecated usage of Slf4jRequestLog, SslContextFactory URL: https://github.com/apache/kafka/pull/6668 This PR resolves [KAFKA-8316](https://issues.apache.org/jira/browse/KAFKA-8316) following the method described [here](https://github.com/eclipse/jetty.project/issues/3502), along with [KAFKA-8308](https://issues.apache.org/jira/browse/KAFKA-8308). ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Remove deprecated usage of Slf4jRequestLog, SslContextFactory > - > > Key: KAFKA-8316 > URL: https://issues.apache.org/jira/browse/KAFKA-8316 > Project: Kafka > Issue Type: Improvement >Reporter: Ismael Juma >Assignee: Lee Dongjin >Priority: Major > Labels: newbie > > Jetty recently deprecated a few classes we use. The following commit > suppresses the deprecation warnings: > https://github.com/apache/kafka/commit/e66bc6255b2ee42481b54b7fd1d256b9e4ff5741 > We should remove the suppressions and use the suggested alternatives. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
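For reference, the shape of the migration suggested in the linked Jetty issue looks roughly like the sketch below. This is not the actual patch in the PR; the logger name and keystore path are placeholders:
{code:java}
import org.eclipse.jetty.server.CustomRequestLog;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;
import org.eclipse.jetty.server.Slf4jRequestLogWriter;
import org.eclipse.jetty.util.ssl.SslContextFactory;

public class JettyMigrationSketch {
    static void configure(Server server) {
        // Before: new Slf4jRequestLog() (deprecated in Jetty 9.4.x).
        // After: CustomRequestLog backed by an Slf4jRequestLogWriter.
        Slf4jRequestLogWriter writer = new Slf4jRequestLogWriter();
        writer.setLoggerName("request.logger"); // placeholder logger name
        server.setRequestLog(new CustomRequestLog(writer, CustomRequestLog.EXTENDED_NCSA_FORMAT));

        // Before: new SslContextFactory() (deprecated).
        // After: the server/client split is explicit.
        SslContextFactory.Server sslFactory = new SslContextFactory.Server();
        sslFactory.setKeyStorePath("/path/to/keystore.jks"); // placeholder path
        server.addConnector(new ServerConnector(server, sslFactory));
    }
}
{code}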
[jira] [Assigned] (KAFKA-8316) Remove deprecated usage of Slf4jRequestLog, SslContextFactory
[ https://issues.apache.org/jira/browse/KAFKA-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lee Dongjin reassigned KAFKA-8316: -- Assignee: Lee Dongjin > Remove deprecated usage of Slf4jRequestLog, SslContextFactory > - > > Key: KAFKA-8316 > URL: https://issues.apache.org/jira/browse/KAFKA-8316 > Project: Kafka > Issue Type: Improvement >Reporter: Ismael Juma >Assignee: Lee Dongjin >Priority: Major > Labels: newbie > > Jetty recently deprecated a few classes we use. The following commit > suppresses the deprecation warnings: > https://github.com/apache/kafka/commit/e66bc6255b2ee42481b54b7fd1d256b9e4ff5741 > We should remove the suppressions and use the suggested alternatives. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KAFKA-8308) Update jetty for security vulnerability CVE-2019-10241
[ https://issues.apache.org/jira/browse/KAFKA-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lee Dongjin reassigned KAFKA-8308: -- Assignee: Lee Dongjin > Update jetty for security vulnerability CVE-2019-10241 > -- > > Key: KAFKA-8308 > URL: https://issues.apache.org/jira/browse/KAFKA-8308 > Project: Kafka > Issue Type: Task > Components: core >Affects Versions: 2.2.0 >Reporter: Di Shang >Assignee: Lee Dongjin >Priority: Major > Labels: security > > Kafka 2.2 uses jetty-*-9.4.14.v20181114 which is marked vulnerable > [https://github.com/apache/kafka/blob/2.2/gradle/dependencies.gradle#L58] > > [https://nvd.nist.gov/vuln/detail/CVE-2019-10241] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (KAFKA-8315) Cannot pass Materialized into a join operation - hence cant set retention period independent of grace
[ https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832325#comment-16832325 ] Andrew edited comment on KAFKA-8315 at 5/3/19 1:03 PM: --- The fudge factor is the \{windowstore.changelog.additional.retention.ms} which is set to 24h by default, so that seems to add up to 5 days (120 hours). You understand my use case well. The choice of limiting the grace was to reduce the performance overhead during the large historical processing. I definitely have lots of data beyond this that should join and the joins and aggregation within the latest 120 hours seem to be correct and complete. This seems to be just related to retention. I will experiment with increasing \{windowstore.changelog.additional.retention.ms} to see if it brings in more data. One oddity of our left stream is that it contains records in batches from different devices. Each batch is about 1000 records and contiguous within the stream. Within a batch the records are in increasing timestamp order. Subsequent batches from different devices will be within 1-2 days of the previous batch (we don't have, say, a batch for 1/1/2019 followed by a batch for 4/1/2019 or a batch for 24/12/2019 for example. The right stream is single records not batches with similar date ordering (i.e. subsequent records should be within 1-2 days of each other. My understanding is that the windows are closed when streamtime - windowstart > windowsize + grace. So as stream time increases as newer batches arrive, joins/aggregations should continue, until a batch arrives after windowsize + grace. However, we are not seeing this. [Correction] I think I meant : streamtime - windowend > grace was (Author: the4thamigo_uk): The fudge factor is the \{windowstore.changelog.additional.retention.ms} which is set to 24h by default, so that seems to add up to 5 days (120 hours). You understand my use case well. The choice of limiting the grace was to reduce the performance overhead during the large historical processing. I definitely have lots of data beyond this that should join and the joins and aggregation within the latest 120 hours seem to be correct and complete. This seems to be just related to retention. I will experiment with increasing \{windowstore.changelog.additional.retention.ms} to see if it brings in more data. One oddity of our left stream is that it contains records in batches from different devices. Each batch is about 1000 records and contiguous within the stream. Within a batch the records are in increasing timestamp order. Subsequent batches from different devices will be within 1-2 days of the previous batch (we don't have, say, a batch for 1/1/2019 followed by a batch for 4/1/2019 or a batch for 24/12/2019 for example. The right stream is single records not batches with similar date ordering (i.e. subsequent records should be within 1-2 days of each other. My understanding is that the windows are closed when streamtime - windowstart > windowsize + grace. So as stream time increases as newer batches arrive, joins/aggregations should continue, until a batch arrives after windowsize + grace. However, we are not seeing this. 
> Cannot pass Materialized into a join operation - hence cant set retention > period independent of grace > - > > Key: KAFKA-8315 > URL: https://issues.apache.org/jira/browse/KAFKA-8315 > Project: Kafka > Issue Type: Bug >Reporter: Andrew >Assignee: John Roesler >Priority: Major > > The documentation says to use `Materialized` not `JoinWindows.until()` > ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), > but there is no where to pass a `Materialized` instance to the join > operation, only to the group operation is supported it seems. > > Slack conversation here : > [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300] > [Additional] > From what I understand, the retention period should be independent of the > grace period, so I think this is more than a documentation fix (see comments > below) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8238) Log how many bytes and messages were read from __consumer_offsets
[ https://issues.apache.org/jira/browse/KAFKA-8238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832523#comment-16832523 ] ASF GitHub Bot commented on KAFKA-8238: --- vamossagar12 commented on pull request #6669: KAFKA-8238: Adding Number of messages/bytes read URL: https://github.com/apache/kafka/pull/6669 This PR includes changes to count the number of messages/bytes read while loading offsets from TopicPartition. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Log how many bytes and messages were read from __consumer_offsets > - > > Key: KAFKA-8238 > URL: https://issues.apache.org/jira/browse/KAFKA-8238 > Project: Kafka > Issue Type: Improvement >Reporter: Colin P. McCabe >Assignee: Sagar Rao >Priority: Minor > Labels: newbie > > We should log how many bytes and messages were read from __consumer_offsets. > Currently we only log how long it took. Example: > {code} > [GroupMetadataManager brokerId=2] Finished loading offsets and group metadata > from __consumer_offsets-22 in 23131 milliseconds. > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
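As a rough illustration of the requested change (the real GroupMetadataManager code is Scala and reads the log directly; this sketch uses consumer-level types, and the partition name in the log line is a placeholder):
{code:java}
import java.time.Duration;
import java.util.Collections;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Tracks message and byte counts while reading a partition so they can be logged
// alongside the elapsed time, instead of logging the duration alone.
public class OffsetsLoadLoggingSketch {
    private static final Logger log = LoggerFactory.getLogger(OffsetsLoadLoggingSketch.class);

    static void loadAndLog(KafkaConsumer<byte[], byte[]> consumer) {
        consumer.subscribe(Collections.singletonList("__consumer_offsets"));
        long startMs = System.currentTimeMillis();
        long messagesRead = 0;
        long bytesRead = 0;

        ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofSeconds(1));
        for (ConsumerRecord<byte[], byte[]> record : records) {
            messagesRead++;
            // serialized sizes are -1 for null keys/values, so clamp at zero
            bytesRead += Math.max(record.serializedKeySize(), 0)
                       + Math.max(record.serializedValueSize(), 0);
            // ... apply the offset/group-metadata record to the cache here ...
        }

        log.info("Finished loading offsets and group metadata from {} in {} milliseconds ({} messages, {} bytes).",
                 "__consumer_offsets-22", System.currentTimeMillis() - startMs, messagesRead, bytesRead);
    }
}
{code}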
[jira] [Commented] (KAFKA-8240) Source.equals() can fail with NPE
[ https://issues.apache.org/jira/browse/KAFKA-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832539#comment-16832539 ] ASF GitHub Bot commented on KAFKA-8240: --- bbejeck commented on pull request #6589: KAFKA-8240: Fix NPE in Source.equals() URL: https://github.com/apache/kafka/pull/6589 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Source.equals() can fail with NPE > - > > Key: KAFKA-8240 > URL: https://issues.apache.org/jira/browse/KAFKA-8240 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.2.0, 2.1.1 >Reporter: Matthias J. Sax >Assignee: Matthias J. Sax >Priority: Major > Labels: beginner, easy-fix, newbie > > Reported on an PR: > [https://github.com/apache/kafka/pull/5284/files/1df6208f48b6b72091fea71323d94a16102ffd13#r270607795] > InternalTopologyBuilder#Source.equals() might fail with NPE if > `topicPattern==null`. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
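A null-safe equals()/hashCode() along the lines discussed in the PR could look like the sketch below; the class and field names mirror the description but are not the actual InternalTopologyBuilder code:
{code:java}
import java.util.Objects;
import java.util.Set;
import java.util.regex.Pattern;

class SourceSketch {
    private final String name;
    private final Set<String> topics;     // null when a pattern subscription is used
    private final Pattern topicPattern;   // null when explicit topics are used

    SourceSketch(String name, Set<String> topics, Pattern topicPattern) {
        this.name = name;
        this.topics = topics;
        this.topicPattern = topicPattern;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SourceSketch)) return false;
        SourceSketch other = (SourceSketch) o;
        // Pattern has no value-based equals(), so compare the compiled expression;
        // Objects.equals() short-circuits the null cases that caused the NPE.
        String thisPattern = topicPattern == null ? null : topicPattern.pattern();
        String otherPattern = other.topicPattern == null ? null : other.topicPattern.pattern();
        return Objects.equals(name, other.name)
            && Objects.equals(topics, other.topics)
            && Objects.equals(thisPattern, otherPattern);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, topics, topicPattern == null ? null : topicPattern.pattern());
    }
}
{code}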
[jira] [Commented] (KAFKA-8315) Cannot pass Materialized into a join operation - hence cant set retention period independent of grace
[ https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832543#comment-16832543 ] Andrew commented on KAFKA-8315: --- [~vvcephei] Ive been struggling with this today. * I have logging on every join, and I can see that joins do not occur before the 6 day period, except in a few cases (see below) * Ive checked the data and I can definitely see that the vast majority of records should have joins throughout the whole period. * I have added `windowstore.changelog.additional.retention.ms` so that the auto-generated JOINTHIS/JOINOTHER intermediate topics now have a retention.ms of 840 hours. * I have removed my previous call to `until()` and only set the window size to 2 days, and increased the grace to 3 days. None of the above seems to make any difference. What I have observed is that I get a few joins before the 6 day window for a single partition and this partition is the first to complete as it has the fewest records. Both topics are partitioned using murmur2 into 20 partitions (Ive checked we have the same keys in the corresponding partitions ofleft and right topics). We are running 4 instances of the streams application, and we do not explicitly set the `num.stream.threads`. My understanding is that a task is created for each partition (i.e. 20 tasks) and the work for these tasks is distributed out to the stream threads in our 4 streams applications. My assumption is that stream time is per-task (i.e. per partition). Is this correct? Is there _any_ possibility that the stream time of partition 13 is somehow shared with any of the other tasks, such that windows might be closed before the join-able data is read on the other partitions. Id like to understand some more about how stream-time increases. I imagine (probably naively) that within a task, stream time increases as the latest timestamp read from a partition, and that both left and right streams have their own stream time. I also assume that during a join, the left stream is read, up until just after the current right stream-time, then the right stream is read up until the latest left stream-time, so that data is pulled off both streams to minimize the difference in times between the latest records read off the topics. Is this near the mark? I will next try the join using a very large grace period to see if it makes a difference. One other thing I might try is to cap my end time using a stream filter to see if I manage to join to earlier records. > Cannot pass Materialized into a join operation - hence cant set retention > period independent of grace > - > > Key: KAFKA-8315 > URL: https://issues.apache.org/jira/browse/KAFKA-8315 > Project: Kafka > Issue Type: Bug >Reporter: Andrew >Assignee: John Roesler >Priority: Major > > The documentation says to use `Materialized` not `JoinWindows.until()` > ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), > but there is no where to pass a `Materialized` instance to the join > operation, only to the group operation is supported it seems. > > Slack conversation here : > [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300] > [Additional] > From what I understand, the retention period should be independent of the > grace period, so I think this is more than a documentation fix (see comments > below) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (KAFKA-8315) Cannot pass Materialized into a join operation - hence cant set retention period independent of grace
[ https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832543#comment-16832543 ] Andrew edited comment on KAFKA-8315 at 5/3/19 2:38 PM: --- [~vvcephei] Ive been struggling with this today. * I have logging on every join, and I can see that joins do not occur before the 6 day period, except in a few cases (see below) * Ive checked the data and I can definitely see that the vast majority of records should have joins throughout the whole period. * I have added `windowstore.changelog.additional.retention.ms` so that the auto-generated JOINTHIS/JOINOTHER intermediate topics now have a retention.ms of 840 hours. * I have removed my previous call to `until()` and only set the join window size to 2 days, and increased the join grace to 3 days. None of the above seems to make any difference. What I have observed is that I do get a few joins before the 6 day window for a single partition (13) and this partition is the first to complete by far, as it has the fewest records. Both topics are partitioned using murmur2 into 20 partitions (Ive checked we have the same keys in the corresponding partitions of the left and right topics). We are running 4 instances of the streams application, and we do not explicitly set the `num.stream.threads`. My understanding is that a task is created for each partition (i.e. 20 tasks) and the work for these tasks is distributed out to the stream threads in our 4 streams applications. My assumption is that stream time is per-task (i.e. per partition). Is this correct? Is there _any_ possibility that the stream time of partition 13 is somehow shared with any of the other tasks, such that windows might be closed before the join-able data is read on the other partitions. Id like to understand some more about how stream-time increases. I imagine (probably naively) that within a task, stream time increases as the latest timestamp read from a partition, and that both left and right streams have their own stream time. I also assume that during a join, the left stream is read, up until just after the current right stream-time, then the right stream is read up until the latest left stream-time, so that data is pulled off both streams to minimize the difference in times between the latest records read off the topics. Is this near the mark? I will next try the join using a very large grace period to see if it makes a difference. One other thing I might try is to cap my end time using a stream filter to see if I manage to join to earlier records. was (Author: the4thamigo_uk): [~vvcephei] Ive been struggling with this today. * I have logging on every join, and I can see that joins do not occur before the 6 day period, except in a few cases (see below) * Ive checked the data and I can definitely see that the vast majority of records should have joins throughout the whole period. * I have added `windowstore.changelog.additional.retention.ms` so that the auto-generated JOINTHIS/JOINOTHER intermediate topics now have a retention.ms of 840 hours. * I have removed my previous call to `until()` and only set the join window size to 2 days, and increased the join grace to 3 days. None of the above seems to make any difference. What I have observed is that I do get a few joins before the 6 day window for a single partition (13) and this partition is the first to complete by far, as it has the fewest records. 
Both topics are partitioned using murmur2 into 20 partitions (Ive checked we have the same keys in the corresponding partitions ofleft and right topics). We are running 4 instances of the streams application, and we do not explicitly set the `num.stream.threads`. My understanding is that a task is created for each partition (i.e. 20 tasks) and the work for these tasks is distributed out to the stream threads in our 4 streams applications. My assumption is that stream time is per-task (i.e. per partition). Is this correct? Is there _any_ possibility that the stream time of partition 13 is somehow shared with any of the other tasks, such that windows might be closed before the join-able data is read on the other partitions. Id like to understand some more about how stream-time increases. I imagine (probably naively) that within a task, stream time increases as the latest timestamp read from a partition, and that both left and right streams have their own stream time. I also assume that during a join, the left stream is read, up until just after the current right stream-time, then the right stream is read up until the latest left stream-time, so that data is pulled off both streams to minimize the difference in times between the latest records read off the topics. Is this near the mark? I will next try the join using a very large grace period to see if it makes a difference. One other thing I migh
[jira] [Comment Edited] (KAFKA-8315) Cannot pass Materialized into a join operation - hence cant set retention period independent of grace
[ https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832543#comment-16832543 ] Andrew edited comment on KAFKA-8315 at 5/3/19 2:38 PM: --- [~vvcephei] Ive been struggling with this today. * I have logging on every join, and I can see that joins do not occur before the 6 day period, except in a few cases (see below) * Ive checked the data and I can definitely see that the vast majority of records should have joins throughout the whole period. * I have added `windowstore.changelog.additional.retention.ms` so that the auto-generated JOINTHIS/JOINOTHER intermediate topics now have a retention.ms of 840 hours. * I have removed my previous call to `until()` and only set the join window size to 2 days, and increased the join grace to 3 days. None of the above seems to make any difference. What I have observed is that I do get a few joins before the 6 day window for a single partition (13) and this partition is the first to complete by far, as it has the fewest records. Both topics are partitioned using murmur2 into 20 partitions (Ive checked we have the same keys in the corresponding partitions ofleft and right topics). We are running 4 instances of the streams application, and we do not explicitly set the `num.stream.threads`. My understanding is that a task is created for each partition (i.e. 20 tasks) and the work for these tasks is distributed out to the stream threads in our 4 streams applications. My assumption is that stream time is per-task (i.e. per partition). Is this correct? Is there _any_ possibility that the stream time of partition 13 is somehow shared with any of the other tasks, such that windows might be closed before the join-able data is read on the other partitions. Id like to understand some more about how stream-time increases. I imagine (probably naively) that within a task, stream time increases as the latest timestamp read from a partition, and that both left and right streams have their own stream time. I also assume that during a join, the left stream is read, up until just after the current right stream-time, then the right stream is read up until the latest left stream-time, so that data is pulled off both streams to minimize the difference in times between the latest records read off the topics. Is this near the mark? I will next try the join using a very large grace period to see if it makes a difference. One other thing I might try is to cap my end time using a stream filter to see if I manage to join to earlier records. was (Author: the4thamigo_uk): [~vvcephei] Ive been struggling with this today. * I have logging on every join, and I can see that joins do not occur before the 6 day period, except in a few cases (see below) * Ive checked the data and I can definitely see that the vast majority of records should have joins throughout the whole period. * I have added `windowstore.changelog.additional.retention.ms` so that the auto-generated JOINTHIS/JOINOTHER intermediate topics now have a retention.ms of 840 hours. * I have removed my previous call to `until()` and only set the window size to 2 days, and increased the grace to 3 days. None of the above seems to make any difference. What I have observed is that I get a few joins before the 6 day window for a single partition and this partition is the first to complete as it has the fewest records. 
Both topics are partitioned using murmur2 into 20 partitions (Ive checked we have the same keys in the corresponding partitions ofleft and right topics). We are running 4 instances of the streams application, and we do not explicitly set the `num.stream.threads`. My understanding is that a task is created for each partition (i.e. 20 tasks) and the work for these tasks is distributed out to the stream threads in our 4 streams applications. My assumption is that stream time is per-task (i.e. per partition). Is this correct? Is there _any_ possibility that the stream time of partition 13 is somehow shared with any of the other tasks, such that windows might be closed before the join-able data is read on the other partitions. Id like to understand some more about how stream-time increases. I imagine (probably naively) that within a task, stream time increases as the latest timestamp read from a partition, and that both left and right streams have their own stream time. I also assume that during a join, the left stream is read, up until just after the current right stream-time, then the right stream is read up until the latest left stream-time, so that data is pulled off both streams to minimize the difference in times between the latest records read off the topics. Is this near the mark? I will next try the join using a very large grace period to see if it makes a difference. One other thing I might try is to cap my end time
[jira] [Created] (KAFKA-8318) Session Window Aggregations generate an extra tombstone
John Roesler created KAFKA-8318: --- Summary: Session Window Aggregations generate an extra tombstone Key: KAFKA-8318 URL: https://issues.apache.org/jira/browse/KAFKA-8318 Project: Kafka Issue Type: Improvement Components: streams Reporter: John Roesler See the discussion https://github.com/apache/kafka/pull/6654#discussion_r280231439 The session merging logic generates a tombstone in addition to an update when the session window already exists. It's not a correctness issue, just a small performance hit, because that tombstone is immediately invalidated by the update. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (KAFKA-8315) Cannot pass Materialized into a join operation - hence cant set retention period independent of grace
[ https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832543#comment-16832543 ] Andrew edited comment on KAFKA-8315 at 5/3/19 3:03 PM: --- [~vvcephei] Ive been struggling with this today. * I have logging on every join, and I can see that joins do not occur before the 6 day period, except in a few cases (see below) * Ive checked the data and I can definitely see that the vast majority of records should have joins throughout the whole period. * I have added `windowstore.changelog.additional.retention.ms` so that the auto-generated JOINTHIS/JOINOTHER intermediate topics now have a retention.ms of 840 hours. * I have removed my previous call to `until()` and only set the join window size to 2 days, and increased the join grace to 3 days. None of the above seems to make any difference. What I have observed is that I do get a few joins before the 6 day window for a single partition (13) and this partition is the first to complete by far, as it has the fewest records. Both topics are partitioned using murmur2 into 20 partitions (Ive checked we have the same keys in the corresponding partitions of the left and right topics). We are running 4 instances of the streams application, and we do not explicitly set the `num.stream.threads`. My understanding is that a task is created for each partition (i.e. 20 tasks) and the work for these tasks is distributed out to the stream threads in our 4 streams application instances (processes). All instances share exactly the same configuration and, in particular, the same application.id. (My assumption is that stream time is per-task (i.e. per partition). Is this correct? Is there _any_ possibility that the stream time of partition 13 is somehow shared with any of the other tasks, such that windows might be closed before the join-able data is read on the other partitions. Id like to understand some more about how stream-time increases. I imagine (probably naively) that within a task, stream time increases as the latest timestamp read from a partition, and that both left and right streams have their own stream time. I also assume that during a join, the left stream is read, up until just after the current right stream-time, then the right stream is read up until the latest left stream-time, so that data is pulled off both streams to minimize the difference in times between the latest records read off the topics. Is this near the mark? I will next try the join using a very large grace period to see if it makes a difference. One other thing I might try is to cap my end time using a stream filter to see if I manage to join to earlier records. was (Author: the4thamigo_uk): [~vvcephei] Ive been struggling with this today. * I have logging on every join, and I can see that joins do not occur before the 6 day period, except in a few cases (see below) * Ive checked the data and I can definitely see that the vast majority of records should have joins throughout the whole period. * I have added `windowstore.changelog.additional.retention.ms` so that the auto-generated JOINTHIS/JOINOTHER intermediate topics now have a retention.ms of 840 hours. * I have removed my previous call to `until()` and only set the join window size to 2 days, and increased the join grace to 3 days. None of the above seems to make any difference. 
What I have observed is that I do get a few joins before the 6 day window for a single partition (13) and this partition is the first to complete by far, as it has the fewest records. Both topics are partitioned using murmur2 into 20 partitions (Ive checked we have the same keys in the corresponding partitions of the left and right topics). We are running 4 instances of the streams application, and we do not explicitly set the `num.stream.threads`. My understanding is that a task is created for each partition (i.e. 20 tasks) and the work for these tasks is distributed out to the stream threads in our 4 streams applications. My assumption is that stream time is per-task (i.e. per partition). Is this correct? Is there _any_ possibility that the stream time of partition 13 is somehow shared with any of the other tasks, such that windows might be closed before the join-able data is read on the other partitions. Id like to understand some more about how stream-time increases. I imagine (probably naively) that within a task, stream time increases as the latest timestamp read from a partition, and that both left and right streams have their own stream time. I also assume that during a join, the left stream is read, up until just after the current right stream-time, then the right stream is read up until the latest left stream-time, so that data is pulled off both streams to minimize the difference in times between the latest records read off the topics. Is this near
[jira] [Comment Edited] (KAFKA-8315) Cannot pass Materialized into a join operation - hence cant set retention period independent of grace
[ https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832543#comment-16832543 ] Andrew edited comment on KAFKA-8315 at 5/3/19 3:05 PM: --- [~vvcephei] Ive been struggling with this today. * I have logging on every join, and I can see that joins do not occur before the 6 day period, except in a few cases (see below) * Ive checked the data and I can definitely see that the vast majority of records should have joins throughout the whole period. * I have added `windowstore.changelog.additional.retention.ms` so that the auto-generated JOINTHIS/JOINOTHER intermediate topics now have a retention.ms of 840 hours. * I have removed my previous call to `until()` and only set the join window size to 2 days, and increased the join grace to 3 days. None of the above seems to make any difference. What I have observed is that I do get a few joins before the 6 day window for a single partition (13) and this partition is the first to complete by far, as it has the fewest records. Both topics are partitioned using murmur2 into 20 partitions (Ive checked we have the same keys in the corresponding partitions of the left and right topics). We are running 4 instances of the streams application, and we do not explicitly set the `num.stream.threads`. My understanding is that a task is created for each partition (i.e. 20 tasks) and the work for these tasks is distributed out to the stream threads in our 4 streams application instances (processes). All instances share exactly the same configuration and, in particular, the same application.id. (My assumption is that stream time is per-task (i.e. per partition). Is this correct? Is there _any_ possibility that the stream time of partition 13 is somehow shared with any of the other tasks, such that windows might be closed before the join-able data is read on the other partitions. Id like to understand some more about how stream-time increases. I imagine (probably naively) that within a task, stream time increases as the latest timestamp read from a partition, and that both left and right streams have their own stream time. I also assume that during a join, the left stream is read, up until just after the current right stream-time, then the right stream is read up until the latest left stream-time, so that data is pulled off both streams to minimize the difference in times between the latest records read off the topics. Is this near the mark? I will next try the join using a very large grace period to see if it makes a difference. One other thing I might try is to cap my end time using a stream filter to see if I manage to join to earlier records. P.S. we are using the following versions 2.2.0 5.2.1 was (Author: the4thamigo_uk): [~vvcephei] Ive been struggling with this today. * I have logging on every join, and I can see that joins do not occur before the 6 day period, except in a few cases (see below) * Ive checked the data and I can definitely see that the vast majority of records should have joins throughout the whole period. * I have added `windowstore.changelog.additional.retention.ms` so that the auto-generated JOINTHIS/JOINOTHER intermediate topics now have a retention.ms of 840 hours. * I have removed my previous call to `until()` and only set the join window size to 2 days, and increased the join grace to 3 days. None of the above seems to make any difference. 
What I have observed is that I do get a few joins before the 6 day window for a single partition (13) and this partition is the first to complete by far, as it has the fewest records. Both topics are partitioned using murmur2 into 20 partitions (Ive checked we have the same keys in the corresponding partitions of the left and right topics). We are running 4 instances of the streams application, and we do not explicitly set the `num.stream.threads`. My understanding is that a task is created for each partition (i.e. 20 tasks) and the work for these tasks is distributed out to the stream threads in our 4 streams application instances (processes). All instances share exactly the same configuration and, in particular, the same application.id. (My assumption is that stream time is per-task (i.e. per partition). Is this correct? Is there _any_ possibility that the stream time of partition 13 is somehow shared with any of the other tasks, such that windows might be closed before the join-able data is read on the other partitions. Id like to understand some more about how stream-time increases. I imagine (probably naively) that within a task, stream time increases as the latest timestamp read from a partition, and that both left and right streams have their own stream time. I also assume that during a join, the left stream is read, up until just after the current right stream-time, then the right st
[jira] [Updated] (KAFKA-8289) KTable<Windowed<String>, Long> can't be suppressed
[ https://issues.apache.org/jira/browse/KAFKA-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Roesler updated KAFKA-8289: Fix Version/s: (was: 2.2.2) 2.2.1 > KTable, Long> can't be suppressed > --- > > Key: KAFKA-8289 > URL: https://issues.apache.org/jira/browse/KAFKA-8289 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.1.0, 2.2.0, 2.1.1 > Environment: Broker on a Linux, stream app on my win10 laptop. > I add one row log.message.timestamp.type=LogAppendTime to my broker's > server.properties. stream app all default config. >Reporter: Xiaolin Jia >Assignee: John Roesler >Priority: Blocker > Fix For: 2.3.0, 2.1.2, 2.2.1 > > > I write a simple stream app followed official developer guide [Stream > DSL|[https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#window-final-results]]. > but I got more than one [Window Final > Results|https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#id31] > from a session time window. > time ticker A -> (4,A) / 25s, > time ticker B -> (4, B) / 25s all send to the same topic > below is my stream app code > {code:java} > kstreams[0] > .peek((k, v) -> log.info("--> ping, k={},v={}", k, v)) > .groupBy((k, v) -> v, Grouped.with(Serdes.String(), Serdes.String())) > .windowedBy(SessionWindows.with(Duration.ofSeconds(100)).grace(Duration.ofMillis(20))) > .count() > .suppress(Suppressed.untilWindowCloses(BufferConfig.unbounded())) > .toStream().peek((k, v) -> log.info("window={},k={},v={}", k.window(), > k.key(), v)); > {code} > {{here is my log print}} > {noformat} > 2019-04-24 20:00:26.142 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:00:47.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556106587744, > endMs=1556107129191},k=A,v=20 > 2019-04-24 20:00:51.071 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:16.065 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:41.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:06.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:31.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:56.208 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:21.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:46.078 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:04.684 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:04:11.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:19.371 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107426409},k=B,v=9 > 2019-04-24 20:04:19.372 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107445012, > endMs=1556107445012},k=A,v=1 > 2019-04-24 20:04:29.604 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:04:36.067 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:49.715 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107451397},k=B,v=10 > 2019-04-24 20:04:49.716 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107445012, > endMs=1556107469935},k=A,v=2 > 2019-04-24 20:04:54.593 
INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:05:01.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:05:19.599 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:05:20.045 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107476398},k=B,v=11 > 2019-04-24 20:05:20.047 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107501398},k=B,v=12 > 2019-04-24 20:05:26.075 INFO --- [-StreamThread-1] c.g.k.AppStreams
[jira] [Commented] (KAFKA-8289) KTable<Windowed<String>, Long> can't be suppressed
[ https://issues.apache.org/jira/browse/KAFKA-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832575#comment-16832575 ] John Roesler commented on KAFKA-8289: - [~vahid], can you explain why this critical bug doesn't block the release? > KTable, Long> can't be suppressed > --- > > Key: KAFKA-8289 > URL: https://issues.apache.org/jira/browse/KAFKA-8289 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.1.0, 2.2.0, 2.1.1 > Environment: Broker on a Linux, stream app on my win10 laptop. > I add one row log.message.timestamp.type=LogAppendTime to my broker's > server.properties. stream app all default config. >Reporter: Xiaolin Jia >Assignee: John Roesler >Priority: Blocker > Fix For: 2.3.0, 2.1.2, 2.2.1 > > > I write a simple stream app followed official developer guide [Stream > DSL|[https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#window-final-results]]. > but I got more than one [Window Final > Results|https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#id31] > from a session time window. > time ticker A -> (4,A) / 25s, > time ticker B -> (4, B) / 25s all send to the same topic > below is my stream app code > {code:java} > kstreams[0] > .peek((k, v) -> log.info("--> ping, k={},v={}", k, v)) > .groupBy((k, v) -> v, Grouped.with(Serdes.String(), Serdes.String())) > .windowedBy(SessionWindows.with(Duration.ofSeconds(100)).grace(Duration.ofMillis(20))) > .count() > .suppress(Suppressed.untilWindowCloses(BufferConfig.unbounded())) > .toStream().peek((k, v) -> log.info("window={},k={},v={}", k.window(), > k.key(), v)); > {code} > {{here is my log print}} > {noformat} > 2019-04-24 20:00:26.142 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:00:47.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556106587744, > endMs=1556107129191},k=A,v=20 > 2019-04-24 20:00:51.071 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:16.065 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:41.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:06.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:31.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:56.208 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:21.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:46.078 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:04.684 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:04:11.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:19.371 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107426409},k=B,v=9 > 2019-04-24 20:04:19.372 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107445012, > endMs=1556107445012},k=A,v=1 > 2019-04-24 20:04:29.604 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:04:36.067 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:49.715 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107451397},k=B,v=10 > 2019-04-24 20:04:49.716 INFO --- [-StreamThread-1] c.g.k.AppStreams > 
: window=Window{startMs=1556107445012, > endMs=1556107469935},k=A,v=2 > 2019-04-24 20:04:54.593 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:05:01.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:05:19.599 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:05:20.045 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107476398},k=B,v=11 > 2019-04-24 20:05:20.047 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107501398},k=B,v=12 >
[jira] [Commented] (KAFKA-8315) Cannot pass Materialized into a join operation - hence cant set retention period independent of grace
[ https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832577#comment-16832577 ] Andrew commented on KAFKA-8315: --- [~vvcephei] Update: I ran using grace = 20D and I see records from around 23D prior to the latest data in the topics. So it seems that we definitely can join the records, but that the grace period is the critical factor in doing this for a historical ingest. So, for a 2-year ingestion, do I have to set my grace to 2 years? That seems a bit strange, and maybe a bit inefficient? > Cannot pass Materialized into a join operation - hence cant set retention > period independent of grace > - > > Key: KAFKA-8315 > URL: https://issues.apache.org/jira/browse/KAFKA-8315 > Project: Kafka > Issue Type: Bug >Reporter: Andrew >Assignee: John Roesler >Priority: Major > > The documentation says to use `Materialized` not `JoinWindows.until()` > ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), > but there is no where to pass a `Materialized` instance to the join > operation, only to the group operation is supported it seems. > > Slack conversation here : > [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300] > [Additional] > From what I understand, the retention period should be independent of the > grace period, so I think this is more than a documentation fix (see comments > below) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
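For reference, the two workarounds discussed in this thread — a grace period spanning the whole backlog, and a filter capping how far forward the input is allowed to run — would look roughly like the sketch below. The type parameters and the getEventTimeMs() accessor on the value are hypothetical placeholders, not part of the original application:
{code:java}
import java.time.Duration;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.KStream;

class HistoricalJoinSketch {
    static JoinWindows historicalWindows() {
        // Retention of the join stores follows size + grace, so covering a 2-year
        // backlog this way also means roughly 2 years of window-store retention.
        return JoinWindows.of(Duration.ofDays(2)).grace(Duration.ofDays(730));
    }

    static <V extends TimestampedValue> KStream<String, V> capEndTime(KStream<String, V> stream,
                                                                      long maxTimestampMs) {
        // Keep stream time from racing ahead of the records that still need to join.
        return stream.filter((key, value) -> value.getEventTimeMs() <= maxTimestampMs);
    }

    interface TimestampedValue {
        long getEventTimeMs();   // hypothetical accessor for the event time embedded in the value
    }
}
{code}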
[jira] [Comment Edited] (KAFKA-8315) Cannot pass Materialized into a join operation - hence cant set retention period independent of grace
[ https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832577#comment-16832577 ] Andrew edited comment on KAFKA-8315 at 5/3/19 3:26 PM: --- [~vvcephei] Update I ran using grace = 20D and I see records from around 23D prior to the latest data in the topics. So it seems that we definitely can join the records, but that the grace period is the critical factor in doing this for a historical ingest. So, for a 2 year ingestion do I have to set my grace to 2 years? Seems a bit strange, and maybe sounds a bit inefficient? For reference, I am attaching the row key and timestamp for all the records in our left and right topics. was (Author: the4thamigo_uk): [~vvcephei] Update I ran using grace = 20D and I see records from around 23D prior to the latest data in the topics. So it seems that we definitely can join the records, but that the grace period is the critical factor in doing this for a historical ingest. So, for a 2 year ingestion do I have to set my grace to 2 years? Seems a bit strange, and maybe sounds a bit inefficient? > Cannot pass Materialized into a join operation - hence cant set retention > period independent of grace > - > > Key: KAFKA-8315 > URL: https://issues.apache.org/jira/browse/KAFKA-8315 > Project: Kafka > Issue Type: Bug >Reporter: Andrew >Assignee: John Roesler >Priority: Major > > The documentation says to use `Materialized` not `JoinWindows.until()` > ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), > but there is no where to pass a `Materialized` instance to the join > operation, only to the group operation is supported it seems. > > Slack conversation here : > [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300] > [Additional] > From what I understand, the retention period should be independent of the > grace period, so I think this is more than a documentation fix (see comments > below) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-8319) Flaky Test KafkaStreamsTest.statefulTopologyShouldCreateStateDirectory
Bill Bejeck created KAFKA-8319: -- Summary: Flaky Test KafkaStreamsTest.statefulTopologyShouldCreateStateDirectory Key: KAFKA-8319 URL: https://issues.apache.org/jira/browse/KAFKA-8319 Project: Kafka Issue Type: Bug Components: streams Reporter: Bill Bejeck Assignee: Bill Bejeck -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8289) KTable<Windowed<String>, Long> can't be suppressed
[ https://issues.apache.org/jira/browse/KAFKA-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832582#comment-16832582 ] Vahid Hashemian commented on KAFKA-8289: [~vvcephei] Since, according to your earlier comment, there seems to be a workaround for the reported issue. That's why I didn't think this was blocking the bug fix release, and can be included in a follow-up release. In any case, it seems that the fix is merged to trunk. Is there a reason this ticket is not resolved yet? > KTable, Long> can't be suppressed > --- > > Key: KAFKA-8289 > URL: https://issues.apache.org/jira/browse/KAFKA-8289 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.1.0, 2.2.0, 2.1.1 > Environment: Broker on a Linux, stream app on my win10 laptop. > I add one row log.message.timestamp.type=LogAppendTime to my broker's > server.properties. stream app all default config. >Reporter: Xiaolin Jia >Assignee: John Roesler >Priority: Blocker > Fix For: 2.3.0, 2.1.2, 2.2.1 > > > I write a simple stream app followed official developer guide [Stream > DSL|[https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#window-final-results]]. > but I got more than one [Window Final > Results|https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#id31] > from a session time window. > time ticker A -> (4,A) / 25s, > time ticker B -> (4, B) / 25s all send to the same topic > below is my stream app code > {code:java} > kstreams[0] > .peek((k, v) -> log.info("--> ping, k={},v={}", k, v)) > .groupBy((k, v) -> v, Grouped.with(Serdes.String(), Serdes.String())) > .windowedBy(SessionWindows.with(Duration.ofSeconds(100)).grace(Duration.ofMillis(20))) > .count() > .suppress(Suppressed.untilWindowCloses(BufferConfig.unbounded())) > .toStream().peek((k, v) -> log.info("window={},k={},v={}", k.window(), > k.key(), v)); > {code} > {{here is my log print}} > {noformat} > 2019-04-24 20:00:26.142 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:00:47.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556106587744, > endMs=1556107129191},k=A,v=20 > 2019-04-24 20:00:51.071 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:16.065 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:41.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:06.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:31.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:56.208 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:21.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:46.078 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:04.684 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:04:11.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:19.371 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107426409},k=B,v=9 > 2019-04-24 20:04:19.372 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107445012, > endMs=1556107445012},k=A,v=1 > 2019-04-24 20:04:29.604 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:04:36.067 INFO --- [-StreamThread-1] 
c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:49.715 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107451397},k=B,v=10 > 2019-04-24 20:04:49.716 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107445012, > endMs=1556107469935},k=A,v=2 > 2019-04-24 20:04:54.593 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:05:01.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:05:19.599 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:05:20.045 INFO --- [-StreamThread-1] c.g.k.AppStreams
[jira] [Commented] (KAFKA-8240) Source.equals() can fail with NPE
[ https://issues.apache.org/jira/browse/KAFKA-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832585#comment-16832585 ] Vahid Hashemian commented on KAFKA-8240: [~mjsax] I see the PR is merged. Can this ticket be resolved? > Source.equals() can fail with NPE > - > > Key: KAFKA-8240 > URL: https://issues.apache.org/jira/browse/KAFKA-8240 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.2.0, 2.1.1 >Reporter: Matthias J. Sax >Assignee: Matthias J. Sax >Priority: Major > Labels: beginner, easy-fix, newbie > > Reported on an PR: > [https://github.com/apache/kafka/pull/5284/files/1df6208f48b6b72091fea71323d94a16102ffd13#r270607795] > InternalTopologyBuilder#Source.equals() might fail with NPE if > `topicPattern==null`. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
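A minimal sketch of the null-safe comparison technique this report points at (a hypothetical stand-in class, not the actual InternalTopologyBuilder code): guarding the Pattern dereference and using Objects.equals avoids the NPE when `topicPattern==null`.
{code:java}
import java.util.Objects;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical stand-in for a Source node, for illustration only.
// Objects.equals() and the explicit null guards tolerate a null topicPattern,
// whereas dereferencing topicPattern directly would throw an NPE.
final class Source {
    private final Set<String> topics;   // may be null when a pattern subscription is used
    private final Pattern topicPattern; // may be null when fixed topics are used

    Source(final Set<String> topics, final Pattern topicPattern) {
        this.topics = topics;
        this.topicPattern = topicPattern;
    }

    @Override
    public boolean equals(final Object o) {
        if (this == o) return true;
        if (!(o instanceof Source)) return false;
        final Source other = (Source) o;
        // Pattern has no value-based equals, so compare the pattern strings,
        // guarding against null on either side.
        final String thisPattern = topicPattern == null ? null : topicPattern.pattern();
        final String otherPattern = other.topicPattern == null ? null : other.topicPattern.pattern();
        return Objects.equals(topics, other.topics)
            && Objects.equals(thisPattern, otherPattern);
    }

    @Override
    public int hashCode() {
        return Objects.hash(topics, topicPattern == null ? null : topicPattern.pattern());
    }
}
{code}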
[jira] [Comment Edited] (KAFKA-8289) KTable, Long> can't be suppressed
[ https://issues.apache.org/jira/browse/KAFKA-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832582#comment-16832582 ] Vahid Hashemian edited comment on KAFKA-8289 at 5/3/19 3:39 PM: [~vvcephei] Since, according to your earlier comment, there seems to be a workaround for the reported issue, I didn't think this was blocking the bug fix release, and thought it could be included in a follow-up release. In any case, it seems that the fix is merged to trunk. Is there a reason this ticket is not resolved yet? was (Author: vahid): [~vvcephei] Since, according to your earlier comment, there seems to be a workaround for the reported issue. That's why I didn't think this was blocking the bug fix release, and can be included in a follow-up release. In any case, it seems that the fix is merged to trunk. Is there a reason this ticket is not resolved yet? > KTable, Long> can't be suppressed > --- > > Key: KAFKA-8289 > URL: https://issues.apache.org/jira/browse/KAFKA-8289 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.1.0, 2.2.0, 2.1.1 > Environment: Broker on a Linux, stream app on my win10 laptop. > I add one row log.message.timestamp.type=LogAppendTime to my broker's > server.properties. stream app all default config. >Reporter: Xiaolin Jia >Assignee: John Roesler >Priority: Blocker > Fix For: 2.3.0, 2.1.2, 2.2.1 > > > I write a simple stream app followed official developer guide [Stream > DSL|[https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#window-final-results]]. > but I got more than one [Window Final > Results|https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#id31] > from a session time window. > time ticker A -> (4,A) / 25s, > time ticker B -> (4, B) / 25s all send to the same topic > below is my stream app code > {code:java} > kstreams[0] > .peek((k, v) -> log.info("--> ping, k={},v={}", k, v)) > .groupBy((k, v) -> v, Grouped.with(Serdes.String(), Serdes.String())) > .windowedBy(SessionWindows.with(Duration.ofSeconds(100)).grace(Duration.ofMillis(20))) > .count() > .suppress(Suppressed.untilWindowCloses(BufferConfig.unbounded())) > .toStream().peek((k, v) -> log.info("window={},k={},v={}", k.window(), > k.key(), v)); > {code} > {{here is my log print}} > {noformat} > 2019-04-24 20:00:26.142 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:00:47.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556106587744, > endMs=1556107129191},k=A,v=20 > 2019-04-24 20:00:51.071 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:16.065 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:41.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:06.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:31.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:56.208 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:21.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:46.078 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:04.684 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:04:11.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:19.371 INFO --- [-StreamThread-1] 
c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107426409},k=B,v=9 > 2019-04-24 20:04:19.372 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107445012, > endMs=1556107445012},k=A,v=1 > 2019-04-24 20:04:29.604 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:04:36.067 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:49.715 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107451397},k=B,v=10 > 2019-04-24 20:04:49.716 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107445012, > endMs=1556107469935},k=A,v=2 > 2019-04-24 20:04:54.593 INFO
[jira] [Commented] (KAFKA-8254) Suppress incorrectly passes a null topic to the serdes
[ https://issues.apache.org/jira/browse/KAFKA-8254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832593#comment-16832593 ] John Roesler commented on KAFKA-8254: - 2.2.1 has been proposed: https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+2.2.1 > Suppress incorrectly passes a null topic to the serdes > -- > > Key: KAFKA-8254 > URL: https://issues.apache.org/jira/browse/KAFKA-8254 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.1.0, 2.2.0, 2.1.1 >Reporter: John Roesler >Assignee: John Roesler >Priority: Major > Fix For: 2.3.0, 2.1.2, 2.2.1 > > > For example, in KTableSuppressProcessor, we do: > {noformat} > final Bytes serializedKey = Bytes.wrap(keySerde.serializer().serialize(null, > key)); > {noformat} > This violates the contract of Serializer (and Deserializer), and breaks > integration with known Serde implementations. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
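To see why a null topic breaks known Serde implementations, consider a hypothetical topic-aware serializer (illustrative only; schema-registry style serdes, for example, use the topic to look up a subject). Calling serialize(null, key) as shown in the snippet above throws inside a serde like this one.
{code:java}
import java.nio.charset.StandardCharsets;
import java.util.Map;
import org.apache.kafka.common.serialization.Serializer;

// Hypothetical serializer that legitimately relies on the topic name,
// as the Serializer contract allows. Passing topic = null breaks it.
public class TopicAwareSerializer implements Serializer<String> {

    @Override
    public void configure(final Map<String, ?> configs, final boolean isKey) { }

    @Override
    public byte[] serialize(final String topic, final String data) {
        if (data == null) {
            return null;
        }
        // Throws NullPointerException when topic is null.
        return (topic + ":" + data).getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public void close() { }
}
{code}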
[jira] [Commented] (KAFKA-8204) Streams may flush state stores in the incorrect order
[ https://issues.apache.org/jira/browse/KAFKA-8204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832592#comment-16832592 ] John Roesler commented on KAFKA-8204: - 2.2.1 has been proposed: https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+2.2.1 > Streams may flush state stores in the incorrect order > - > > Key: KAFKA-8204 > URL: https://issues.apache.org/jira/browse/KAFKA-8204 > Project: Kafka > Issue Type: Improvement > Components: streams >Affects Versions: 1.1.0, 1.1.1, 2.0.0, 2.0.1, 2.1.0, 2.2.0, 2.1.1 >Reporter: John Roesler >Assignee: John Roesler >Priority: Blocker > Fix For: 2.3.0, 2.2.1 > > > Cached state stores may forward records during a flush call, so Streams > should flush the stores in topological order. Otherwise, Streams may flush a > downstream store before an upstream one, resulting in sink results being > committed without the corresponding state changelog updates being committed. > This behavior is partly responsible for the bug reported in KAFKA-7895 . > The fix is simply to flush the stores in topological order, then when the > upstream store forwards records to a downstream stateful processor, the > corresponding state changes will be correctly flushed as well. > An alternative would be to repeatedly call flush on all state stores until > they report there is nothing left to flush, but this requires a public API > change to enable state stores to report whether they need a flush or not. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
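A hedged sketch of the flush-ordering idea, using an illustrative interface rather than the real Streams store abstraction: if stores are flushed by topological position, any records an upstream cached store forwards during its flush reach the downstream store before that downstream store is itself flushed, so its changelog writes land in the same commit.
{code:java}
import java.util.Comparator;
import java.util.List;

// Illustrative only, not the actual Streams code.
interface OrderedStateStore {
    int topologicalOrder();   // position of the owning node in the topology (assumption)
    void flush();
}

final class StoreFlusher {
    // Flush upstream stores first, so records they forward during flush
    // are absorbed (and then flushed) by downstream stores.
    static void flushAll(final List<OrderedStateStore> stores) {
        stores.stream()
              .sorted(Comparator.comparingInt(OrderedStateStore::topologicalOrder))
              .forEach(OrderedStateStore::flush);
    }
}
{code}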
[jira] [Resolved] (KAFKA-8240) Source.equals() can fail with NPE
[ https://issues.apache.org/jira/browse/KAFKA-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bejeck resolved KAFKA-8240. Resolution: Fixed > Source.equals() can fail with NPE > - > > Key: KAFKA-8240 > URL: https://issues.apache.org/jira/browse/KAFKA-8240 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.2.0, 2.1.1 >Reporter: Matthias J. Sax >Assignee: Matthias J. Sax >Priority: Major > Labels: beginner, easy-fix, newbie > > Reported on an PR: > [https://github.com/apache/kafka/pull/5284/files/1df6208f48b6b72091fea71323d94a16102ffd13#r270607795] > InternalTopologyBuilder#Source.equals() might fail with NPE if > `topicPattern==null`. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8289) KTable, Long> can't be suppressed
[ https://issues.apache.org/jira/browse/KAFKA-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832653#comment-16832653 ] John Roesler commented on KAFKA-8289: - Last I heard [~bbejeck] was working on merging to 2.2 and 2.1 . I think there was just some flaky test failure. Then, the ticket will be resolved. The workaround was just to get the reporter unstuck. There is a larger correctness issue here, which affects any session aggregation, and for which there is no workaround. I see the confusion. Thanks. > KTable, Long> can't be suppressed > --- > > Key: KAFKA-8289 > URL: https://issues.apache.org/jira/browse/KAFKA-8289 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.1.0, 2.2.0, 2.1.1 > Environment: Broker on a Linux, stream app on my win10 laptop. > I add one row log.message.timestamp.type=LogAppendTime to my broker's > server.properties. stream app all default config. >Reporter: Xiaolin Jia >Assignee: John Roesler >Priority: Blocker > Fix For: 2.3.0, 2.1.2, 2.2.1 > > > I write a simple stream app followed official developer guide [Stream > DSL|[https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#window-final-results]]. > but I got more than one [Window Final > Results|https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#id31] > from a session time window. > time ticker A -> (4,A) / 25s, > time ticker B -> (4, B) / 25s all send to the same topic > below is my stream app code > {code:java} > kstreams[0] > .peek((k, v) -> log.info("--> ping, k={},v={}", k, v)) > .groupBy((k, v) -> v, Grouped.with(Serdes.String(), Serdes.String())) > .windowedBy(SessionWindows.with(Duration.ofSeconds(100)).grace(Duration.ofMillis(20))) > .count() > .suppress(Suppressed.untilWindowCloses(BufferConfig.unbounded())) > .toStream().peek((k, v) -> log.info("window={},k={},v={}", k.window(), > k.key(), v)); > {code} > {{here is my log print}} > {noformat} > 2019-04-24 20:00:26.142 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:00:47.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556106587744, > endMs=1556107129191},k=A,v=20 > 2019-04-24 20:00:51.071 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:16.065 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:41.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:06.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:31.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:56.208 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:21.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:46.078 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:04.684 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:04:11.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:19.371 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107426409},k=B,v=9 > 2019-04-24 20:04:19.372 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107445012, > endMs=1556107445012},k=A,v=1 > 2019-04-24 20:04:29.604 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:04:36.067 INFO --- 
[-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:49.715 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107451397},k=B,v=10 > 2019-04-24 20:04:49.716 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107445012, > endMs=1556107469935},k=A,v=2 > 2019-04-24 20:04:54.593 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:05:01.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:05:19.599 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:05:20.045 INFO --- [-StreamThread-1] c.g.k.
[jira] [Comment Edited] (KAFKA-8315) Cannot pass Materialized into a join operation - hence cant set retention period independent of grace
[ https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832577#comment-16832577 ] Andrew edited comment on KAFKA-8315 at 5/3/19 5:32 PM: --- [~vvcephei] Update I ran using grace = 20D and I see records from around 23D prior to the latest data in the topics. So it seems that we definitely can join the records, but that the grace period is the critical factor in doing this for a historical ingest. So, for a 2 year ingestion do I have to set my grace to 2 years? Seems a bit strange, and maybe sounds a bit inefficient? For reference, I have uploaded the rowkey and timestamps of all the records in both of our streams : https://github.com/the4thamigo-uk/join_data was (Author: the4thamigo_uk): [~vvcephei] Update I ran using grace = 20D and I see records from around 23D prior to the latest data in the topics. So it seems that we definitely can join the records, but that the grace period is the critical factor in doing this for a historical ingest. So, for a 2 year ingestion do I have to set my grace to 2 years? Seems a bit strange, and maybe sounds a bit inefficient? For reference, I am attaching the row key and timestamp for all the records in our left and right topics. > Cannot pass Materialized into a join operation - hence cant set retention > period independent of grace > - > > Key: KAFKA-8315 > URL: https://issues.apache.org/jira/browse/KAFKA-8315 > Project: Kafka > Issue Type: Bug >Reporter: Andrew >Assignee: John Roesler >Priority: Major > > The documentation says to use `Materialized` not `JoinWindows.until()` > ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), > but there is no where to pass a `Materialized` instance to the join > operation, only to the group operation is supported it seems. > > Slack conversation here : > [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300] > [Additional] > From what I understand, the retention period should be independent of the > grace period, so I think this is more than a documentation fix (see comments > below) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
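For context, a minimal sketch of what the grace setting looks like with the 2.2 join API (topic names, serdes and durations are illustrative, not taken from this ticket). Records arriving further behind stream time than windowSize + grace are dropped, which is why a long historical re-ingest currently needs a correspondingly long grace.
{code:java}
import java.time.Duration;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.Joined;
import org.apache.kafka.streams.kstream.KStream;

// Sketch only: topic names and durations are examples, not from the ticket.
public class JoinWithGraceSketch {
    static KStream<String, String> buildJoin(final StreamsBuilder builder) {
        final KStream<String, String> left =
                builder.stream("left-topic", Consumed.with(Serdes.String(), Serdes.String()));
        final KStream<String, String> right =
                builder.stream("right-topic", Consumed.with(Serdes.String(), Serdes.String()));

        final JoinWindows windows = JoinWindows
                .of(Duration.ofHours(120))    // join window size
                .grace(Duration.ofDays(20));  // how far behind stream time a record may arrive

        return left.join(
                right,
                (l, r) -> l + "," + r,
                windows,
                Joined.with(Serdes.String(), Serdes.String(), Serdes.String()));
    }
}
{code}
Note there is no join overload that accepts a Materialized, so the join-store retention cannot be set independently of the grace period, which is the gap this ticket describes.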
[jira] [Commented] (KAFKA-8289) KTable, Long> can't be suppressed
[ https://issues.apache.org/jira/browse/KAFKA-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832685#comment-16832685 ] ASF GitHub Bot commented on KAFKA-8289: --- vvcephei commented on pull request #6670: KAFKA-8289: Fix Session Expiration and Suppression (#6654) URL: https://github.com/apache/kafka/pull/6670 Fix two problems in Streams: * Session windows expired prematurely (off-by-one error), since the window end is inclusive, unlike other windows * Suppress duration for sessions incorrectly waited only the grace period, but session windows aren't closed until gracePeriod + sessionGap Update the tests accordingly Reviewers: A. Sophie Blee-Goldman , Boyang Chen , Matthias J. Sax , Bill Bejeck , Guozhang Wang *More detailed description of your change, if necessary. The PR title and PR message become the squashed commit message, so use a separate comment to ping reviewers.* *Summary of testing strategy (including rationale) for the feature or bug fix. Unit and/or integration tests are expected for any behaviour change and system tests should be considered for larger changes.* ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > KTable, Long> can't be suppressed > --- > > Key: KAFKA-8289 > URL: https://issues.apache.org/jira/browse/KAFKA-8289 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.1.0, 2.2.0, 2.1.1 > Environment: Broker on a Linux, stream app on my win10 laptop. > I add one row log.message.timestamp.type=LogAppendTime to my broker's > server.properties. stream app all default config. >Reporter: Xiaolin Jia >Assignee: John Roesler >Priority: Blocker > Fix For: 2.3.0, 2.1.2, 2.2.1 > > > I write a simple stream app followed official developer guide [Stream > DSL|[https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#window-final-results]]. > but I got more than one [Window Final > Results|https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#id31] > from a session time window. 
> time ticker A -> (4,A) / 25s, > time ticker B -> (4, B) / 25s all send to the same topic > below is my stream app code > {code:java} > kstreams[0] > .peek((k, v) -> log.info("--> ping, k={},v={}", k, v)) > .groupBy((k, v) -> v, Grouped.with(Serdes.String(), Serdes.String())) > .windowedBy(SessionWindows.with(Duration.ofSeconds(100)).grace(Duration.ofMillis(20))) > .count() > .suppress(Suppressed.untilWindowCloses(BufferConfig.unbounded())) > .toStream().peek((k, v) -> log.info("window={},k={},v={}", k.window(), > k.key(), v)); > {code} > {{here is my log print}} > {noformat} > 2019-04-24 20:00:26.142 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:00:47.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556106587744, > endMs=1556107129191},k=A,v=20 > 2019-04-24 20:00:51.071 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:16.065 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:41.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:06.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:31.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:56.208 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:21.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:46.078 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:04.684 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:04:11.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:19.371 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473,
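A worked example of the close-time arithmetic described in the PR, using the gap and grace from the reporter's app (illustrative arithmetic only, not the actual Streams internals):
{code:java}
import java.time.Duration;

// Per the fix description: suppression must wait for windowEnd + inactivityGap + grace,
// not just windowEnd + grace, and the session window end is inclusive, which is
// where the off-by-one lived.
public class SessionCloseTimeSketch {
    public static void main(final String[] args) {
        final long windowEndMs = 1_556_107_129_191L;             // endMs from the log above
        final long gapMs = Duration.ofSeconds(100).toMillis();   // SessionWindows.with(100s)
        final long graceMs = 20L;                                // .grace(20ms)

        final long prematureCloseMs = windowEndMs + graceMs;         // old (buggy) behaviour
        final long correctCloseMs = windowEndMs + gapMs + graceMs;   // behaviour per the fix

        System.out.println("buggy close time:   " + prematureCloseMs);
        System.out.println("correct close time: " + correctCloseMs);
    }
}
{code}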
[jira] [Resolved] (KAFKA-8308) Update jetty for security vulnerability CVE-2019-10241
[ https://issues.apache.org/jira/browse/KAFKA-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma resolved KAFKA-8308. Resolution: Fixed Assignee: Ismael Juma (was: Lee Dongjin) Fix Version/s: 2.3.0 This was fixed by https://github.com/apache/kafka/pull/6665 which was merged today (coincidentally). > Update jetty for security vulnerability CVE-2019-10241 > -- > > Key: KAFKA-8308 > URL: https://issues.apache.org/jira/browse/KAFKA-8308 > Project: Kafka > Issue Type: Task > Components: core >Affects Versions: 2.2.0 >Reporter: Di Shang >Assignee: Ismael Juma >Priority: Major > Labels: security > Fix For: 2.3.0 > > > Kafka 2.2 uses jetty-*-9.4.14.v20181114 which is marked vulnerable > [https://github.com/apache/kafka/blob/2.2/gradle/dependencies.gradle#L58] > > [https://nvd.nist.gov/vuln/detail/CVE-2019-10241] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8289) KTable, Long> can't be suppressed
[ https://issues.apache.org/jira/browse/KAFKA-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832703#comment-16832703 ] ASF GitHub Bot commented on KAFKA-8289: --- vvcephei commented on pull request #6671: KAFKA-8289: Fix Session Expiration and Suppression (#6654) URL: https://github.com/apache/kafka/pull/6671 Fix two problems in Streams: * Session windows expired prematurely (off-by-one error), since the window end is inclusive, unlike other windows * Suppress duration for sessions incorrectly waited only the grace period, but session windows aren't closed until gracePeriod + sessionGap Update the tests accordingly Reviewers: A. Sophie Blee-Goldman , Boyang Chen , Matthias J. Sax , Bill Bejeck , Guozhang Wang *More detailed description of your change, if necessary. The PR title and PR message become the squashed commit message, so use a separate comment to ping reviewers.* *Summary of testing strategy (including rationale) for the feature or bug fix. Unit and/or integration tests are expected for any behaviour change and system tests should be considered for larger changes.* ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > KTable, Long> can't be suppressed > --- > > Key: KAFKA-8289 > URL: https://issues.apache.org/jira/browse/KAFKA-8289 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.1.0, 2.2.0, 2.1.1 > Environment: Broker on a Linux, stream app on my win10 laptop. > I add one row log.message.timestamp.type=LogAppendTime to my broker's > server.properties. stream app all default config. >Reporter: Xiaolin Jia >Assignee: John Roesler >Priority: Blocker > Fix For: 2.3.0, 2.1.2, 2.2.1 > > > I write a simple stream app followed official developer guide [Stream > DSL|[https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#window-final-results]]. > but I got more than one [Window Final > Results|https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#id31] > from a session time window. 
> time ticker A -> (4,A) / 25s, > time ticker B -> (4, B) / 25s all send to the same topic > below is my stream app code > {code:java} > kstreams[0] > .peek((k, v) -> log.info("--> ping, k={},v={}", k, v)) > .groupBy((k, v) -> v, Grouped.with(Serdes.String(), Serdes.String())) > .windowedBy(SessionWindows.with(Duration.ofSeconds(100)).grace(Duration.ofMillis(20))) > .count() > .suppress(Suppressed.untilWindowCloses(BufferConfig.unbounded())) > .toStream().peek((k, v) -> log.info("window={},k={},v={}", k.window(), > k.key(), v)); > {code} > {{here is my log print}} > {noformat} > 2019-04-24 20:00:26.142 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:00:47.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556106587744, > endMs=1556107129191},k=A,v=20 > 2019-04-24 20:00:51.071 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:16.065 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:41.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:06.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:31.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:56.208 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:21.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:46.078 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:04.684 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:04:11.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:19.371 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473,
[jira] [Comment Edited] (KAFKA-8315) Cannot pass Materialized into a join operation - hence cant set retention period independent of grace
[ https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832577#comment-16832577 ] Andrew edited comment on KAFKA-8315 at 5/3/19 6:21 PM: --- [~vvcephei] Update I ran using grace = 20D and I see records from around 23D prior to the latest data in the topics. So it seems that we definitely can join the records, but that the grace period is the critical factor in doing this for a historical ingest. So, for a 2 year ingestion do I have to set my grace to 2 years? Seems a bit strange, and maybe sounds a bit inefficient? For reference, I have uploaded the rowkey and timestamps of the kind of records we have in our streams : [https://github.com/the4thamigo-uk/join_data] was (Author: the4thamigo_uk): [~vvcephei] Update I ran using grace = 20D and I see records from around 23D prior to the latest data in the topics. So it seems that we definitely can join the records, but that the grace period is the critical factor in doing this for a historical ingest. So, for a 2 year ingestion do I have to set my grace to 2 years? Seems a bit strange, and maybe sounds a bit inefficient? For reference, I have uploaded the rowkey and timestamps of all the records in both of our streams : https://github.com/the4thamigo-uk/join_data > Cannot pass Materialized into a join operation - hence cant set retention > period independent of grace > - > > Key: KAFKA-8315 > URL: https://issues.apache.org/jira/browse/KAFKA-8315 > Project: Kafka > Issue Type: Bug >Reporter: Andrew >Assignee: John Roesler >Priority: Major > > The documentation says to use `Materialized` not `JoinWindows.until()` > ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), > but there is no where to pass a `Materialized` instance to the join > operation, only to the group operation is supported it seems. > > Slack conversation here : > [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300] > [Additional] > From what I understand, the retention period should be independent of the > grace period, so I think this is more than a documentation fix (see comments > below) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (KAFKA-8315) Cannot pass Materialized into a join operation - hence cant set retention period independent of grace
[ https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832577#comment-16832577 ] Andrew edited comment on KAFKA-8315 at 5/3/19 6:34 PM: --- [~vvcephei] Update I ran using grace = 20D and I see records from around 23D prior to the latest data in the topics. So it seems that we definitely can join the records, but that the grace period is the critical factor in doing this for a historical ingest. So, for a 2 year ingestion do I have to set my grace to 2 years? Seems a bit strange, and maybe sounds a bit inefficient? -For reference, I have uploaded the rowkey and timestamps of the kind of records we have in our streams : [https://github.com/the4thamigo-uk/join_data ]-[ |https://github.com/the4thamigo-uk/join_data]I think the data I put here is incorrect let me fix.[ |https://github.com/the4thamigo-uk/join_data] was (Author: the4thamigo_uk): [~vvcephei] Update I ran using grace = 20D and I see records from around 23D prior to the latest data in the topics. So it seems that we definitely can join the records, but that the grace period is the critical factor in doing this for a historical ingest. So, for a 2 year ingestion do I have to set my grace to 2 years? Seems a bit strange, and maybe sounds a bit inefficient? For reference, I have uploaded the rowkey and timestamps of the kind of records we have in our streams : [https://github.com/the4thamigo-uk/join_data] > Cannot pass Materialized into a join operation - hence cant set retention > period independent of grace > - > > Key: KAFKA-8315 > URL: https://issues.apache.org/jira/browse/KAFKA-8315 > Project: Kafka > Issue Type: Bug >Reporter: Andrew >Assignee: John Roesler >Priority: Major > > The documentation says to use `Materialized` not `JoinWindows.until()` > ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), > but there is no where to pass a `Materialized` instance to the join > operation, only to the group operation is supported it seems. > > Slack conversation here : > [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300] > [Additional] > From what I understand, the retention period should be independent of the > grace period, so I think this is more than a documentation fix (see comments > below) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (KAFKA-8315) Cannot pass Materialized into a join operation - hence cant set retention period independent of grace
[ https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832577#comment-16832577 ] Andrew edited comment on KAFKA-8315 at 5/3/19 6:54 PM: --- [~vvcephei] Update I ran using grace = 20D and I see records from around 23D prior to the latest data in the topics. So it seems that we definitely can join the records, but that the grace period is the critical factor in doing this for a historical ingest. So, for a 2 year ingestion do I have to set my grace to 2 years? Seems a bit strange, and maybe sounds a bit inefficient? | | | | was (Author: the4thamigo_uk): [~vvcephei] Update I ran using grace = 20D and I see records from around 23D prior to the latest data in the topics. So it seems that we definitely can join the records, but that the grace period is the critical factor in doing this for a historical ingest. So, for a 2 year ingestion do I have to set my grace to 2 years? Seems a bit strange, and maybe sounds a bit inefficient? -For reference, I have uploaded the rowkey and timestamps of the kind of records we have in our streams : [https://github.com/the4thamigo-uk/join_data ]-[ |https://github.com/the4thamigo-uk/join_data]I think the data I put here is incorrect let me fix.[ |https://github.com/the4thamigo-uk/join_data] > Cannot pass Materialized into a join operation - hence cant set retention > period independent of grace > - > > Key: KAFKA-8315 > URL: https://issues.apache.org/jira/browse/KAFKA-8315 > Project: Kafka > Issue Type: Bug >Reporter: Andrew >Assignee: John Roesler >Priority: Major > > The documentation says to use `Materialized` not `JoinWindows.until()` > ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), > but there is no where to pass a `Materialized` instance to the join > operation, only to the group operation is supported it seems. > > Slack conversation here : > [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300] > [Additional] > From what I understand, the retention period should be independent of the > grace period, so I think this is more than a documentation fix (see comments > below) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (KAFKA-8315) Cannot pass Materialized into a join operation - hence cant set retention period independent of grace
[ https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832577#comment-16832577 ] Andrew edited comment on KAFKA-8315 at 5/3/19 6:54 PM: --- [~vvcephei] Update I ran using grace = 20D and I see records from around 23D prior to the latest data in the topics. So it seems that we definitely can join the records, but that the grace period is the critical factor in doing this for a historical ingest. So, for a 2 year ingestion do I have to set my grace to 2 years? Seems a bit strange, and maybe sounds a bit inefficient? was (Author: the4thamigo_uk): [~vvcephei] Update I ran using grace = 20D and I see records from around 23D prior to the latest data in the topics. So it seems that we definitely can join the records, but that the grace period is the critical factor in doing this for a historical ingest. So, for a 2 year ingestion do I have to set my grace to 2 years? Seems a bit strange, and maybe sounds a bit inefficient? | | | | > Cannot pass Materialized into a join operation - hence cant set retention > period independent of grace > - > > Key: KAFKA-8315 > URL: https://issues.apache.org/jira/browse/KAFKA-8315 > Project: Kafka > Issue Type: Bug >Reporter: Andrew >Assignee: John Roesler >Priority: Major > > The documentation says to use `Materialized` not `JoinWindows.until()` > ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), > but there is no where to pass a `Materialized` instance to the join > operation, only to the group operation is supported it seems. > > Slack conversation here : > [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300] > [Additional] > From what I understand, the retention period should be independent of the > grace period, so I think this is more than a documentation fix (see comments > below) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (KAFKA-8240) Source.equals() can fail with NPE
[ https://issues.apache.org/jira/browse/KAFKA-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bejeck reopened KAFKA-8240: This needs to get cherry-picked back to 2.2 and 2.1 so reopening. I'll resolve once that happens. > Source.equals() can fail with NPE > - > > Key: KAFKA-8240 > URL: https://issues.apache.org/jira/browse/KAFKA-8240 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.2.0, 2.1.1 >Reporter: Matthias J. Sax >Assignee: Matthias J. Sax >Priority: Major > Labels: beginner, easy-fix, newbie > > Reported on an PR: > [https://github.com/apache/kafka/pull/5284/files/1df6208f48b6b72091fea71323d94a16102ffd13#r270607795] > InternalTopologyBuilder#Source.equals() might fail with NPE if > `topicPattern==null`. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8289) KTable, Long> can't be suppressed
[ https://issues.apache.org/jira/browse/KAFKA-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832801#comment-16832801 ] ASF GitHub Bot commented on KAFKA-8289: --- bbejeck commented on pull request #6670: KAFKA-8289: Fix Session Expiration and Suppression (#6654) URL: https://github.com/apache/kafka/pull/6670 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > KTable, Long> can't be suppressed > --- > > Key: KAFKA-8289 > URL: https://issues.apache.org/jira/browse/KAFKA-8289 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.1.0, 2.2.0, 2.1.1 > Environment: Broker on a Linux, stream app on my win10 laptop. > I add one row log.message.timestamp.type=LogAppendTime to my broker's > server.properties. stream app all default config. >Reporter: Xiaolin Jia >Assignee: John Roesler >Priority: Blocker > Fix For: 2.3.0, 2.1.2, 2.2.1 > > > I write a simple stream app followed official developer guide [Stream > DSL|[https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#window-final-results]]. > but I got more than one [Window Final > Results|https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#id31] > from a session time window. > time ticker A -> (4,A) / 25s, > time ticker B -> (4, B) / 25s all send to the same topic > below is my stream app code > {code:java} > kstreams[0] > .peek((k, v) -> log.info("--> ping, k={},v={}", k, v)) > .groupBy((k, v) -> v, Grouped.with(Serdes.String(), Serdes.String())) > .windowedBy(SessionWindows.with(Duration.ofSeconds(100)).grace(Duration.ofMillis(20))) > .count() > .suppress(Suppressed.untilWindowCloses(BufferConfig.unbounded())) > .toStream().peek((k, v) -> log.info("window={},k={},v={}", k.window(), > k.key(), v)); > {code} > {{here is my log print}} > {noformat} > 2019-04-24 20:00:26.142 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:00:47.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556106587744, > endMs=1556107129191},k=A,v=20 > 2019-04-24 20:00:51.071 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:16.065 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:41.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:06.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:31.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:56.208 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:21.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:46.078 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:04.684 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:04:11.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:19.371 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107426409},k=B,v=9 > 2019-04-24 20:04:19.372 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107445012, > endMs=1556107445012},k=A,v=1 > 2019-04-24 20:04:29.604 INFO --- [-StreamThread-1] c.g.k.AppStreams > 
: --> ping, k=4,v=A > 2019-04-24 20:04:36.067 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:49.715 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107451397},k=B,v=10 > 2019-04-24 20:04:49.716 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107445012, > endMs=1556107469935},k=A,v=2 > 2019-04-24 20:04:54.593 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:05:01.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:05:19.599 INFO --- [-StreamThread-
[jira] [Commented] (KAFKA-8284) Enable static membership on KStream
[ https://issues.apache.org/jira/browse/KAFKA-8284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832849#comment-16832849 ] ASF GitHub Bot commented on KAFKA-8284: --- abbccdda commented on pull request #6673: KAFKA-8284: enable static membership on KStream URL: https://github.com/apache/kafka/pull/6673 Part of KIP-345 effort. The strategy is to extract user passed in `group.instance.id` config and pass it in with given thread-id (because consumer is currently per-thread level). ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Enable static membership on KStream > --- > > Key: KAFKA-8284 > URL: https://issues.apache.org/jira/browse/KAFKA-8284 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Boyang Chen >Assignee: Boyang Chen >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
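A hedged sketch of the strategy described in the PR (names are illustrative, not the actual KafkaStreams internals): Streams runs one consumer per stream thread, so the single user-supplied `group.instance.id` has to be made unique per thread, for example by combining it with a thread index.
{code:java}
import java.util.HashMap;
import java.util.Map;

// Illustrative only: the method and suffixing scheme are assumptions, not the PR's code.
public class StaticMembershipIdSketch {
    static Map<String, Object> consumerConfigFor(final String userInstanceId, final int threadIdx) {
        final Map<String, Object> config = new HashMap<>();
        if (userInstanceId != null) {
            // Make the user-supplied id unique for each StreamThread's consumer.
            config.put("group.instance.id", userInstanceId + "-" + threadIdx);
        }
        return config;
    }

    public static void main(final String[] args) {
        // e.g. {group.instance.id=my-app-instance-1-0}, one per stream thread
        System.out.println(consumerConfigFor("my-app-instance-1", 0));
    }
}
{code}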
[jira] [Created] (KAFKA-8320) Connect Error handling is using the RetriableException from common package
Magesh kumar Nandakumar created KAFKA-8320: -- Summary: Connect Error handling is using the RetriableException from common package Key: KAFKA-8320 URL: https://issues.apache.org/jira/browse/KAFKA-8320 Project: Kafka Issue Type: Bug Components: KafkaConnect Affects Versions: 2.0.0 Reporter: Magesh kumar Nandakumar Assignee: Magesh kumar Nandakumar When a SourceConnector throws org.apache.kafka.connect.errors.RetriableException during the poll, connect runtime is supposed to ignore the error and retry per [https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect] . When the connectors throw the exception it is not handled gracefully. WorkerSourceTask is catching the exception from the wrong package `org.apache.kafka.common.errors`. -- This message was sent by Atlassian JIRA (v7.6.3#76005)

[jira] [Commented] (KAFKA-8289) KTable, Long> can't be suppressed
[ https://issues.apache.org/jira/browse/KAFKA-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832869#comment-16832869 ] John Roesler commented on KAFKA-8289: - Hi [~vahid], the fix has been merged to 2.2 . Sorry for the delay. Thanks, -John > KTable, Long> can't be suppressed > --- > > Key: KAFKA-8289 > URL: https://issues.apache.org/jira/browse/KAFKA-8289 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.1.0, 2.2.0, 2.1.1 > Environment: Broker on a Linux, stream app on my win10 laptop. > I add one row log.message.timestamp.type=LogAppendTime to my broker's > server.properties. stream app all default config. >Reporter: Xiaolin Jia >Assignee: John Roesler >Priority: Blocker > Fix For: 2.3.0, 2.1.2, 2.2.1 > > > I write a simple stream app followed official developer guide [Stream > DSL|[https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#window-final-results]]. > but I got more than one [Window Final > Results|https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#id31] > from a session time window. > time ticker A -> (4,A) / 25s, > time ticker B -> (4, B) / 25s all send to the same topic > below is my stream app code > {code:java} > kstreams[0] > .peek((k, v) -> log.info("--> ping, k={},v={}", k, v)) > .groupBy((k, v) -> v, Grouped.with(Serdes.String(), Serdes.String())) > .windowedBy(SessionWindows.with(Duration.ofSeconds(100)).grace(Duration.ofMillis(20))) > .count() > .suppress(Suppressed.untilWindowCloses(BufferConfig.unbounded())) > .toStream().peek((k, v) -> log.info("window={},k={},v={}", k.window(), > k.key(), v)); > {code} > {{here is my log print}} > {noformat} > 2019-04-24 20:00:26.142 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:00:47.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556106587744, > endMs=1556107129191},k=A,v=20 > 2019-04-24 20:00:51.071 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:16.065 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:01:41.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:06.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:31.066 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:02:56.208 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:21.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:03:46.078 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:04.684 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:04:11.069 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:19.371 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107426409},k=B,v=9 > 2019-04-24 20:04:19.372 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107445012, > endMs=1556107445012},k=A,v=1 > 2019-04-24 20:04:29.604 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:04:36.067 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:04:49.715 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107451397},k=B,v=10 > 2019-04-24 20:04:49.716 INFO --- [-StreamThread-1] 
c.g.k.AppStreams > : window=Window{startMs=1556107445012, > endMs=1556107469935},k=A,v=2 > 2019-04-24 20:04:54.593 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:05:01.070 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=B > 2019-04-24 20:05:19.599 INFO --- [-StreamThread-1] c.g.k.AppStreams > : --> ping, k=4,v=A > 2019-04-24 20:05:20.045 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107476398},k=B,v=11 > 2019-04-24 20:05:20.047 INFO --- [-StreamThread-1] c.g.k.AppStreams > : window=Window{startMs=1556107226473, > endMs=1556107501398},k=B,
[jira] [Created] (KAFKA-8321) Flaky Test kafka.server.DynamicConfigTest.shouldFailWhenChangingClientIdUnknownConfig
Bill Bejeck created KAFKA-8321: -- Summary: Flaky Test kafka.server.DynamicConfigTest.shouldFailWhenChangingClientIdUnknownConfig Key: KAFKA-8321 URL: https://issues.apache.org/jira/browse/KAFKA-8321 Project: Kafka Issue Type: Bug Components: core Affects Versions: 2.3.0 Reporter: Bill Bejeck [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/4253/testReport/junit/kafka.server/DynamicConfigTest/shouldFailWhenChangingClientIdUnknownConfig/] {noformat} Error Message kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING Stacktrace kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING at kafka.zookeeper.ZooKeeperClient.$anonfun$waitUntilConnected$3(ZooKeeperClient.scala:268) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12) at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251) at kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:264) at kafka.zookeeper.ZooKeeperClient.(ZooKeeperClient.scala:97) at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:1694) at kafka.zk.ZooKeeperTestHarness.setUp(ZooKeeperTestHarness.scala:59) at jdk.internal.reflect.GeneratedMethodAccessor138.invoke(Unknown Source) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66) at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51) at jdk.internal.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32) at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93) at com.sun.proxy.$Proxy2.processTestClass(Unknown Source) at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117) at jdk.internal.reflect.GeneratedMethodAccessor15.invoke(Unknown Source) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) at org.gradle.
[jira] [Updated] (KAFKA-8320) Connect Error handling is using the RetriableException from common package
[ https://issues.apache.org/jira/browse/KAFKA-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Magesh kumar Nandakumar updated KAFKA-8320: --- Description: When a SourceConnector throws org.apache.kafka.connect.errors.RetriableException during the poll, connect runtime is supposed to ignore the error and retry per [https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect] . When the conenctors throw the execption its not handled gracefully. WorkerSourceTask is catching the execption from wrog package `org.apache.kafka.common.errors`. It is not clear from the API standpoint as to which package the connect farmework supports. The safest thing would be to support both the packages eventhough its less desirable. was: When a SourceConnector throws org.apache.kafka.connect.errors.RetriableException during the poll, connect runtime is supposed to ignore the error and retry per [https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect] . When the conenctors throw the execption its not handled gracefully. WorkerSourceTask is catching the execption from wrog package `org.apache.kafka.common.errors`. > Connect Error handling is using the RetriableException from common package > -- > > Key: KAFKA-8320 > URL: https://issues.apache.org/jira/browse/KAFKA-8320 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 2.0.0 >Reporter: Magesh kumar Nandakumar >Assignee: Magesh kumar Nandakumar >Priority: Major > > When a SourceConnector throws > org.apache.kafka.connect.errors.RetriableException during the poll, connect > runtime is supposed to ignore the error and retry per > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect] > . When the conenctors throw the execption its not handled gracefully. > WorkerSourceTask is catching the execption from wrog package > `org.apache.kafka.common.errors`. It is not clear from the API standpoint as > to which package the connect farmework supports. The safest thing would be to > support both the packages eventhough its less desirable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-8320) Connect Error handling is using the RetriableException from common package
[ https://issues.apache.org/jira/browse/KAFKA-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Magesh kumar Nandakumar updated KAFKA-8320: --- Description: When a SourceConnector throws org.apache.kafka.connect.errors.RetriableException during the poll, connect runtime is supposed to ignore the error and retry per [https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect] . When the conenctors throw the execption its not handled gracefully. WorkerSourceTask is catching the exception from wrong package `org.apache.kafka.common.errors`. It is not clear from the API standpoint as to which package the connect framework supports. The safest thing would be to support both the packages even though it's less desirable. was: When a SourceConnector throws org.apache.kafka.connect.errors.RetriableException during the poll, connect runtime is supposed to ignore the error and retry per [https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect] . When the conenctors throw the execption its not handled gracefully. WorkerSourceTask is catching the execption from wrog package `org.apache.kafka.common.errors`. It is not clear from the API standpoint as to which package the connect farmework supports. The safest thing would be to support both the packages eventhough its less desirable. > Connect Error handling is using the RetriableException from common package > -- > > Key: KAFKA-8320 > URL: https://issues.apache.org/jira/browse/KAFKA-8320 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 2.0.0 >Reporter: Magesh kumar Nandakumar >Assignee: Magesh kumar Nandakumar >Priority: Major > > When a SourceConnector throws > org.apache.kafka.connect.errors.RetriableException during the poll, connect > runtime is supposed to ignore the error and retry per > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect] > . When the conenctors throw the execption its not handled gracefully. > WorkerSourceTask is catching the exception from wrong package > `org.apache.kafka.common.errors`. It is not clear from the API standpoint as > to which package the connect framework supports. The safest thing would be to > support both the packages even though it's less desirable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8320) Connect Error handling is using the RetriableException from common package
[ https://issues.apache.org/jira/browse/KAFKA-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832886#comment-16832886 ] ASF GitHub Bot commented on KAFKA-8320: --- mageshn commented on pull request #6675: KAFKA-8320 : fix retriable exception package for source connectors URL: https://github.com/apache/kafka/pull/6675 WorkerSourceTask is catching the exception from wrong package `org.apache.kafka.common.errors`. It is not clear from the API standpoint as to which package the connect framework supports - the one from common or connect. The safest thing would be to support both the packages even though it's less desirable. ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Connect Error handling is using the RetriableException from common package > -- > > Key: KAFKA-8320 > URL: https://issues.apache.org/jira/browse/KAFKA-8320 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 2.0.0 >Reporter: Magesh kumar Nandakumar >Assignee: Magesh kumar Nandakumar >Priority: Major > > When a SourceConnector throws > org.apache.kafka.connect.errors.RetriableException during the poll, connect > runtime is supposed to ignore the error and retry per > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect] > . When the conenctors throw the execption its not handled gracefully. > WorkerSourceTask is catching the exception from wrong package > `org.apache.kafka.common.errors`. It is not clear from the API standpoint as > to which package the connect framework supports. The safest thing would be to > support both the packages even though it's less desirable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
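A hedged sketch of the retry-handling shape the PR describes (not the actual WorkerSourceTask code): the runtime must treat the connect-package RetriableException as a signal to retry the poll, and per the PR the safest option is to accept the common-package exception as well.
{code:java}
import java.util.List;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.source.SourceTask;

// Illustrative only: the class and method are assumptions, not the framework's code.
public class PollWithRetrySketch {
    static List<SourceRecord> pollOnce(final SourceTask task) throws InterruptedException {
        try {
            return task.poll();
        } catch (final org.apache.kafka.connect.errors.RetriableException e) {
            return null; // connector asked to be retried on the next iteration
        } catch (final org.apache.kafka.common.errors.RetriableException e) {
            return null; // tolerate the common-package exception too, per the PR
        }
    }
}
{code}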
[jira] [Created] (KAFKA-8322) Flaky test: SslTransportLayerTest.testListenerConfigOverride
Dhruvil Shah created KAFKA-8322: --- Summary: Flaky test: SslTransportLayerTest.testListenerConfigOverride Key: KAFKA-8322 URL: https://issues.apache.org/jira/browse/KAFKA-8322 Project: Kafka Issue Type: Test Components: core, unit tests Reporter: Dhruvil Shah
java.lang.AssertionError: expected: but was:
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.failNotEquals(Assert.java:835)
at org.junit.Assert.assertEquals(Assert.java:120)
at org.junit.Assert.assertEquals(Assert.java:146)
at org.apache.kafka.common.network.NetworkTestUtils.waitForChannelClose(NetworkTestUtils.java:111)
at org.apache.kafka.common.network.SslTransportLayerTest.testListenerConfigOverride(SslTransportLayerTest.java:319)
[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/4250/testReport/junit/org.apache.kafka.common.network/SslTransportLayerTest/testListenerConfigOverride/] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-8322) Flaky test: SslTransportLayerTest.testListenerConfigOverride
[ https://issues.apache.org/jira/browse/KAFKA-8322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dhruvil Shah updated KAFKA-8322: Description:
java.lang.AssertionError: expected: but was:
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.failNotEquals(Assert.java:835)
at org.junit.Assert.assertEquals(Assert.java:120)
at org.junit.Assert.assertEquals(Assert.java:146)
at org.apache.kafka.common.network.NetworkTestUtils.waitForChannelClose(NetworkTestUtils.java:111)
[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/4250/testReport/junit/org.apache.kafka.common.network/SslTransportLayerTest/testListenerConfigOverride/]
was:
java.lang.AssertionError: expected: but was:
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.failNotEquals(Assert.java:835)
at org.junit.Assert.assertEquals(Assert.java:120)
at org.junit.Assert.assertEquals(Assert.java:146)
at org.apache.kafka.common.network.NetworkTestUtils.waitForChannelClose(NetworkTestUtils.java:111)
at org.apache.kafka.common.network.SslTransportLayerTest.testListenerConfigOverride(SslTransportLayerTest.java:319)
[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/4250/testReport/junit/org.apache.kafka.common.network/SslTransportLayerTest/testListenerConfigOverride/]
> Flaky test: SslTransportLayerTest.testListenerConfigOverride > > > Key: KAFKA-8322 > URL: https://issues.apache.org/jira/browse/KAFKA-8322 > Project: Kafka > Issue Type: Test > Components: core, unit tests >Reporter: Dhruvil Shah >Priority: Major > >
> java.lang.AssertionError: expected: but was:
> at org.junit.Assert.fail(Assert.java:89)
> at org.junit.Assert.failNotEquals(Assert.java:835)
> at org.junit.Assert.assertEquals(Assert.java:120)
> at org.junit.Assert.assertEquals(Assert.java:146)
> at org.apache.kafka.common.network.NetworkTestUtils.waitForChannelClose(NetworkTestUtils.java:111)
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/4250/testReport/junit/org.apache.kafka.common.network/SslTransportLayerTest/testListenerConfigOverride/] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-8323) Memory leak of BloomFilter Rocks object
Sophie Blee-Goldman created KAFKA-8323: -- Summary: Memory leak of BloomFilter Rocks object Key: KAFKA-8323 URL: https://issues.apache.org/jira/browse/KAFKA-8323 Project: Kafka Issue Type: Bug Components: streams Affects Versions: 2.3.0, 2.2.1 Reporter: Sophie Blee-Goldman Assignee: Sophie Blee-Goldman Any RocksJava object that inherits from org.rocksdb.AbstractNativeReference must be closed explicitly in order to free up the memory of the backing C++ object. The BloomFilter extends RocksObject (which implements AbstractNativeReference) and should also be closed in RocksDBStore#close to avoid leaking memory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
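As background on the leak described in KAFKA-8323, here is a minimal standalone sketch of the close pattern the ticket asks for. The class, field names and path argument are hypothetical (this is not the actual RocksDBStore code); it only shows that a BloomFilter handed to RocksDB still has to be closed by whoever constructed it:
{code:java}
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.BloomFilter;
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

// Minimal sketch, not RocksDBStore itself: the BloomFilter and Options are
// RocksObjects backed by native C++ memory, so close() must release them
// explicitly alongside the database handle.
public class BloomFilterLifecycleSketch {
    private final BloomFilter filter = new BloomFilter();
    private final Options options = new Options().setCreateIfMissing(true);
    private RocksDB db;

    public void open(final String path) throws RocksDBException {
        final BlockBasedTableConfig tableConfig = new BlockBasedTableConfig();
        tableConfig.setFilter(filter);              // hand the filter to RocksDB
        options.setTableFormatConfig(tableConfig);
        db = RocksDB.open(options, path);
    }

    public void close() {
        if (db != null) {
            db.close();
        }
        options.close();
        filter.close();                             // without this call the native filter leaks
    }
}
{code}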
[jira] [Updated] (KAFKA-8323) Memory leak of BloomFilter Rocks object
[ https://issues.apache.org/jira/browse/KAFKA-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sophie Blee-Goldman updated KAFKA-8323: --- Affects Version/s: (was: 2.2.1) (was: 2.3.0) 2.2.0 Fix Version/s: 2.2.1 2.3.0 > Memory leak of BloomFilter Rocks object > --- > > Key: KAFKA-8323 > URL: https://issues.apache.org/jira/browse/KAFKA-8323 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.2.0 >Reporter: Sophie Blee-Goldman >Assignee: Sophie Blee-Goldman >Priority: Blocker > Fix For: 2.3.0, 2.2.1 > > > Any RocksJava object that inherits from org.rocksdb.AbstractNativeReference > must be closed explicitly in order to free up the memory of the backing C++ > object. The BloomFilter extends RocksObject (which implements > AbstractNativeReference) and should also be closed in RocksDBStore#close > to avoid leaking memory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-8324) User constructed RocksObjects leak memory
Sophie Blee-Goldman created KAFKA-8324: -- Summary: User constructed RocksObjects leak memory Key: KAFKA-8324 URL: https://issues.apache.org/jira/browse/KAFKA-8324 Project: Kafka Issue Type: Bug Reporter: Sophie Blee-Goldman Some of the RocksDB options a user can set when extending RocksDBConfigSetter take Rocks objects as parameters. Many of these, including potentially large objects like Cache and Filter, inherit from AbstractNativeReference and must be closed explicitly in order to free the memory of the backing C++ object. However the user has no way of closing any objects they have created in RocksDBConfigSetter, and we do not ever close them for them. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
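For illustration, a sketch of the pattern KAFKA-8324 describes, with made-up cache sizes and assuming a RocksJava version that exposes LRUCache and BlockBasedTableConfig#setBlockCache: the objects created inside setConfig are backed by native memory, and because RocksDBConfigSetter has no close hook here, nothing ever frees them:
{code:java}
import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.BloomFilter;
import org.rocksdb.LRUCache;
import org.rocksdb.Options;

// Illustrative sketch of the leak: the LRUCache and BloomFilter below are
// RocksObjects with native backing memory, but the user who constructs them in
// setConfig() has no callback in which to close them.
public class LeakyConfigSetterSketch implements RocksDBConfigSetter {
    @Override
    public void setConfig(final String storeName, final Options options, final Map<String, Object> configs) {
        final BlockBasedTableConfig tableConfig = new BlockBasedTableConfig();
        tableConfig.setBlockCache(new LRUCache(16 * 1024 * 1024L)); // 16 MB cache, never closed
        tableConfig.setFilter(new BloomFilter());                   // bloom filter, never closed
        options.setTableFormatConfig(tableConfig);
    }
}
{code}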
[jira] [Comment Edited] (KAFKA-7946) Flaky Test DeleteConsumerGroupsTest#testDeleteNonEmptyGroup
[ https://issues.apache.org/jira/browse/KAFKA-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832206#comment-16832206 ] Vahid Hashemian edited comment on KAFKA-7946 at 5/3/19 11:42 PM: - This has been fixed by [https://github.com/apache/kafka/pull/6312]. Resolving the ticket. was (Author: vahid): Removed Fix Version 2.2.1 as this issue is not blocking that release. > Flaky Test DeleteConsumerGroupsTest#testDeleteNonEmptyGroup > --- > > Key: KAFKA-7946 > URL: https://issues.apache.org/jira/browse/KAFKA-7946 > Project: Kafka > Issue Type: Bug > Components: admin, unit tests >Affects Versions: 2.2.0 >Reporter: Matthias J. Sax >Assignee: Gwen Shapira >Priority: Critical > Labels: flaky-test > Fix For: 2.3.0, 2.2.2 > > > To get stable nightly builds for `2.2` release, I create tickets for all > observed test failures. > [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/17/] > {quote}java.lang.NullPointerException at > kafka.admin.DeleteConsumerGroupsTest.testDeleteNonEmptyGroup(DeleteConsumerGroupsTest.scala:96){quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KAFKA-7946) Flaky Test DeleteConsumerGroupsTest#testDeleteNonEmptyGroup
[ https://issues.apache.org/jira/browse/KAFKA-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vahid Hashemian resolved KAFKA-7946. Resolution: Fixed Fix Version/s: 2.2.1 > Flaky Test DeleteConsumerGroupsTest#testDeleteNonEmptyGroup > --- > > Key: KAFKA-7946 > URL: https://issues.apache.org/jira/browse/KAFKA-7946 > Project: Kafka > Issue Type: Bug > Components: admin, unit tests >Affects Versions: 2.2.0 >Reporter: Matthias J. Sax >Assignee: Gwen Shapira >Priority: Critical > Labels: flaky-test > Fix For: 2.3.0, 2.2.1, 2.2.2 > > > To get stable nightly builds for `2.2` release, I create tickets for all > observed test failures. > [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/17/] > {quote}java.lang.NullPointerException at > kafka.admin.DeleteConsumerGroupsTest.testDeleteNonEmptyGroup(DeleteConsumerGroupsTest.scala:96){quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8323) Memory leak of BloomFilter Rocks object
[ https://issues.apache.org/jira/browse/KAFKA-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832934#comment-16832934 ] ASF GitHub Bot commented on KAFKA-8323: --- bbejeck commented on pull request #6672: KAFKA-8323: Close RocksDBStore's BloomFilter URL: https://github.com/apache/kafka/pull/6672 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Memory leak of BloomFilter Rocks object > --- > > Key: KAFKA-8323 > URL: https://issues.apache.org/jira/browse/KAFKA-8323 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.2.0 >Reporter: Sophie Blee-Goldman >Assignee: Sophie Blee-Goldman >Priority: Blocker > Fix For: 2.3.0, 2.2.1 > > > Any RocksJava object that inherits from org.rocksdb.AbstractNativeReference > must be closed explicitly in order to free up the memory of the backing C++ > object. The BloomFilter extends RocksObject (which implements > AbstractNativeReference) and should also be closed in RocksDBStore#close > to avoid leaking memory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KAFKA-8323) Memory leak of BloomFilter Rocks object
[ https://issues.apache.org/jira/browse/KAFKA-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bejeck resolved KAFKA-8323. Resolution: Fixed Cherry-picked to 2.2 as well. > Memory leak of BloomFilter Rocks object > --- > > Key: KAFKA-8323 > URL: https://issues.apache.org/jira/browse/KAFKA-8323 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.2.0 >Reporter: Sophie Blee-Goldman >Assignee: Sophie Blee-Goldman >Priority: Blocker > Fix For: 2.3.0, 2.2.1 > > > Any RocksJava object that inherits from org.rocksdb.AbstractNativeReference > must be closed explicitly in order to free up the memory of the backing C++ > object. The BloomFilter extends RocksObject (which implements > AbstractNativeReference) and should also be closed in RocksDBStore#close > to avoid leaking memory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8323) Memory leak of BloomFilter Rocks object
[ https://issues.apache.org/jira/browse/KAFKA-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832941#comment-16832941 ] ASF GitHub Bot commented on KAFKA-8323: --- ableegoldman commented on pull request #6676: KAFKA-8323: Should close filter in RocksDBStoreTest as well URL: https://github.com/apache/kafka/pull/6676 Forgot to also close the filter in RocksDBStoreTest in time. Thanks @bbejeck for merging (too!) quickly 😄 ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Memory leak of BloomFilter Rocks object > --- > > Key: KAFKA-8323 > URL: https://issues.apache.org/jira/browse/KAFKA-8323 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.2.0 >Reporter: Sophie Blee-Goldman >Assignee: Sophie Blee-Goldman >Priority: Blocker > Fix For: 2.3.0, 2.2.1 > > > Any RocksJava object that inherits from org.rocksdb.AbstractNativeReference > must be closed explicitly in order to free up the memory of the backing C++ > object. The BloomFilter extends RocksObject (which implements > AbstractNativeReference) and should also be closed in RocksDBStore#close > to avoid leaking memory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-8324) User constructed RocksObjects leak memory
[ https://issues.apache.org/jira/browse/KAFKA-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sophie Blee-Goldman updated KAFKA-8324: --- Description: Some of the RocksDB options a user can set when extending RocksDBConfigSetter take Rocks objects as parameters. Many of these – including potentially large objects like Cache and Filter – inherit from AbstractNativeReference and must be closed explicitly in order to free the memory of the backing C++ object. However the user has no way of closing any objects they have created in RocksDBConfigSetter, and we do not ever close them for them. was: Some of the RocksDB options a user can set when extending RocksDBConfigSetter take Rocks objects as parameters. Many of these--including potentially large objects like Cache and Filter-- inherit from AbstractNativeReference and must be closed explicitly in order to free the memory of the backing C++ object. However the user has no way of closing any objects they have created in RocksDBConfigSetter, and we do not ever close them for them. > User constructed RocksObjects leak memory > - > > Key: KAFKA-8324 > URL: https://issues.apache.org/jira/browse/KAFKA-8324 > Project: Kafka > Issue Type: Bug >Reporter: Sophie Blee-Goldman >Priority: Major > > Some of the RocksDB options a user can set when extending RocksDBConfigSetter > take Rocks objects as parameters. Many of these – including potentially large > objects like Cache and Filter – inherit from AbstractNativeReference and must > be closed explicitly in order to free the memory of the backing C++ object. > However the user has no way of closing any objects they have created in > RocksDBConfigSetter, and we do not ever close them for them. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KAFKA-8289) KTable, Long> can't be suppressed
[ https://issues.apache.org/jira/browse/KAFKA-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vahid Hashemian resolved KAFKA-8289. Resolution: Fixed
> KTable, Long> can't be suppressed
> ---
>
> Key: KAFKA-8289
> URL: https://issues.apache.org/jira/browse/KAFKA-8289
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 2.1.0, 2.2.0, 2.1.1
> Environment: Broker on Linux, stream app on my Win10 laptop. I added one line, log.message.timestamp.type=LogAppendTime, to my broker's server.properties. The stream app uses all default config.
> Reporter: Xiaolin Jia
> Assignee: John Roesler
> Priority: Blocker
> Fix For: 2.3.0, 2.1.2, 2.2.1
>
> I wrote a simple stream app following the official developer guide [Stream DSL|https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#window-final-results], but I got more than one [Window Final Results|https://kafka.apache.org/22/documentation/streams/developer-guide/dsl-api.html#id31] record from a single session window.
> Time ticker A -> (4, A) every 25s, time ticker B -> (4, B) every 25s; both send to the same topic.
> Below is my stream app code:
> {code:java}
> kstreams[0]
>     .peek((k, v) -> log.info("--> ping, k={},v={}", k, v))
>     .groupBy((k, v) -> v, Grouped.with(Serdes.String(), Serdes.String()))
>     .windowedBy(SessionWindows.with(Duration.ofSeconds(100)).grace(Duration.ofMillis(20)))
>     .count()
>     .suppress(Suppressed.untilWindowCloses(BufferConfig.unbounded()))
>     .toStream().peek((k, v) -> log.info("window={},k={},v={}", k.window(), k.key(), v));
> {code}
> Here is my log output:
> {noformat}
> 2019-04-24 20:00:26.142 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> ping, k=4,v=B
> 2019-04-24 20:00:47.070 INFO --- [-StreamThread-1] c.g.k.AppStreams : window=Window{startMs=1556106587744, endMs=1556107129191},k=A,v=20
> 2019-04-24 20:00:51.071 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> ping, k=4,v=B
> 2019-04-24 20:01:16.065 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> ping, k=4,v=B
> 2019-04-24 20:01:41.066 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> ping, k=4,v=B
> 2019-04-24 20:02:06.069 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> ping, k=4,v=B
> 2019-04-24 20:02:31.066 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> ping, k=4,v=B
> 2019-04-24 20:02:56.208 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> ping, k=4,v=B
> 2019-04-24 20:03:21.070 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> ping, k=4,v=B
> 2019-04-24 20:03:46.078 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> ping, k=4,v=B
> 2019-04-24 20:04:04.684 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> ping, k=4,v=A
> 2019-04-24 20:04:11.069 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> ping, k=4,v=B
> 2019-04-24 20:04:19.371 INFO --- [-StreamThread-1] c.g.k.AppStreams : window=Window{startMs=1556107226473, endMs=1556107426409},k=B,v=9
> 2019-04-24 20:04:19.372 INFO --- [-StreamThread-1] c.g.k.AppStreams : window=Window{startMs=1556107445012, endMs=1556107445012},k=A,v=1
> 2019-04-24 20:04:29.604 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> ping, k=4,v=A
> 2019-04-24 20:04:36.067 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> ping, k=4,v=B
> 2019-04-24 20:04:49.715 INFO --- [-StreamThread-1] c.g.k.AppStreams : window=Window{startMs=1556107226473, endMs=1556107451397},k=B,v=10
> 2019-04-24 20:04:49.716 INFO --- [-StreamThread-1] c.g.k.AppStreams : window=Window{startMs=1556107445012, endMs=1556107469935},k=A,v=2
> 2019-04-24 20:04:54.593 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> ping, k=4,v=A
> 2019-04-24 20:05:01.070 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> ping, k=4,v=B
> 2019-04-24 20:05:19.599 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> ping, k=4,v=A
> 2019-04-24 20:05:20.045 INFO --- [-StreamThread-1] c.g.k.AppStreams : window=Window{startMs=1556107226473, endMs=1556107476398},k=B,v=11
> 2019-04-24 20:05:20.047 INFO --- [-StreamThread-1] c.g.k.AppStreams : window=Window{startMs=1556107226473, endMs=1556107501398},k=B,v=12
> 2019-04-24 20:05:26.075 INFO --- [-StreamThread-1] c.g.k.AppStreams : --> pi