[jira] [Resolved] (KAFKA-7756) Leader: -1 after topic delete
[ https://issues.apache.org/jira/browse/KAFKA-7756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhws resolved KAFKA-7756. - Resolution: Fixed > Leader: -1 after topic delete > - > > Key: KAFKA-7756 > URL: https://issues.apache.org/jira/browse/KAFKA-7756 > Project: Kafka > Issue Type: Bug >Reporter: zhws >Priority: Blocker > Attachments: image-2018-12-19-17-03-42-912.png, > image-2018-12-19-17-07-27-850.png, image-2018-12-19-17-10-25-784.png > > > 1. When I first deleted the topic "deleteTestTwo", it succeeded. I could see the > delete log, and the ZooKeeper node was deleted too. > !image-2018-12-19-17-03-42-912.png! > > 2. But when I created this topic and deleted it again: > !image-2018-12-19-17-07-27-850.png! > I only saw the file-delete log. > ZooKeeper still had this node, and when I ran the describe command I saw the following: > !image-2018-12-19-17-10-25-784.png! > > If anyone knows the reason, please tell me. Thanks. > Kafka version: 2.0 > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-7770) Add method that gives 100% guarantee that topic has been created
Tomasz Szlek created KAFKA-7770: --- Summary: Add method that gives 100% guarantee that topic has been created Key: KAFKA-7770 URL: https://issues.apache.org/jira/browse/KAFKA-7770 Project: Kafka Issue Type: Improvement Reporter: Tomasz Szlek The current Kafka client API provides a method to create topics, but its documentation notes that "It may take several seconds after \{{CreateTopicsResult}} returns success for all the brokers to become aware that the topics have been created". If possible, it would be good to have another method that gives a 100% guarantee that the topic has been created and all brokers are aware of it. Just for reference, I raised the same issue [here|https://stackoverflow.com/questions/53910783/kafka-producer-throws-received-unknown-topic-or-partition-error-when-sending-t]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
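Until such a method exists, a common interim workaround is to poll the cluster metadata after the CreateTopicsResult future completes until the topic is visible. The following is a minimal sketch using the public AdminClient API; the bootstrap address, topic name, partition/replication settings, and timeouts are illustrative assumptions, not values from the ticket.
{code:java}
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.TimeUnit;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.errors.UnknownTopicOrPartitionException;

public class CreateTopicAndWait {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical address

        try (AdminClient admin = AdminClient.create(props)) {
            String topic = "example-topic"; // illustrative name
            admin.createTopics(Collections.singletonList(new NewTopic(topic, 3, (short) 1)))
                 .all().get(30, TimeUnit.SECONDS);

            // Success of createTopics() does not guarantee that every broker has the
            // metadata yet, so poll describeTopics() until the topic is visible or we give up.
            long deadline = System.currentTimeMillis() + 30_000;
            while (true) {
                try {
                    admin.describeTopics(Collections.singletonList(topic)).all().get(5, TimeUnit.SECONDS);
                    break; // topic metadata is now visible
                } catch (Exception e) {
                    boolean unknownTopic = e.getCause() instanceof UnknownTopicOrPartitionException;
                    if (!unknownTopic || System.currentTimeMillis() > deadline) {
                        throw e;
                    }
                    Thread.sleep(200); // not propagated yet, retry
                }
            }
        }
    }
}
{code}
Note that this only confirms visibility from the brokers the admin client happens to query, so it narrows the window rather than closing it completely, which is why a first-class API as requested here would still be valuable.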
[jira] [Resolved] (KAFKA-7625) Kafka Broker node JVM crash - kafka.coordinator.transaction.TransactionCoordinator.$anonfun$handleEndTransaction
[ https://issues.apache.org/jira/browse/KAFKA-7625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma resolved KAFKA-7625. Resolution: Not A Bug Closing since it's not a Kafka issue. Someone else reported that changing the GC from G1 also made the issue go away. In any case, upgrading to the latest version of JDK 8 is the recommended path. > Kafka Broker node JVM crash - > kafka.coordinator.transaction.TransactionCoordinator.$anonfun$handleEndTransaction > > > Key: KAFKA-7625 > URL: https://issues.apache.org/jira/browse/KAFKA-7625 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 2.0.0 > Environment: environment:os.version=2.6.32-754.2.1.el6.x86_64 > java.version=1.8.0_92 > environment:zookeeper.version=3.4.13-2d71af4dbe22557fda74f9a9b4309b15a7487f03, > built on 06/29/2018 00:39 GMT (org.apache.zookeeper.ZooKeeper) > Kafka commitId : 3402a8361b734732 >Reporter: Sebastian Puzoń >Priority: Critical > Attachments: hs_err_pid10238.log, hs_err_pid15119.log, > hs_err_pid19131.log, hs_err_pid19405.log, hs_err_pid20124.log, > hs_err_pid22373.log, hs_err_pid22386.log, hs_err_pid22633.log, > hs_err_pid24681.log, hs_err_pid25513.log, hs_err_pid25701.log, > hs_err_pid26844.log, hs_err_pid27156.log, hs_err_pid27290.log, > hs_err_pid4194.log, hs_err_pid4299.log > > > I observe broker node JVM crashes with same problematic frame: > {code:java} > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x7ff4a2588261, pid=24681, tid=0x7ff3b9bb1700 > # > # JRE version: Java(TM) SE Runtime Environment (8.0_92-b14) (build > 1.8.0_92-b14) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.92-b14 mixed mode > linux-amd64 compressed oops) > # Problematic frame: > # J 9736 C1 > kafka.coordinator.transaction.TransactionCoordinator.$anonfun$handleEndTransaction$7(Lkafka/coordinator/transaction/TransactionCoordinator;Ljava/lang/String;JSLorg/apache/kafka/common/requests/TransactionResult;Lkafka/coordinator/transaction/TransactionMetadata;)Lscala/util/Either; > (518 bytes) @ 0x7ff4a2588261 [0x7ff4a25871a0+0x10c1] > # > # Failed to write core dump. Core dumps have been disabled. 
To enable core > dumping, try "ulimit -c unlimited" before starting Java again > # > # If you would like to submit a bug report, please visit: > # http://bugreport.java.com/bugreport/crash.jsp > # > --- T H R E A D --- > Current thread (0x7ff4b356f800): JavaThread "kafka-request-handler-3" > daemon [_thread_in_Java, id=24781, > stack(0x7ff3b9ab1000,0x7ff3b9bb2000)] > {code} > {code:java} > Stack: [0x7ff3b9ab1000,0x7ff3b9bb2000], sp=0x7ff3b9bafca0, free > space=1019k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native > code) > J 9736 C1 > kafka.coordinator.transaction.TransactionCoordinator.$anonfun$handleEndTransaction$7(Lkafka/coordinator/transaction/TransactionCoordinator;Ljava/lang/String;JSLorg/apache/kafka/common/requests/TransactionResult;Lkafka/coordinator/transaction/TransactionMetadata;)Lscala/util/Either; > (518 bytes) @ 0x7ff4a2588261 [0x7ff4a25871a0+0x10c1] > J 10456 C2 > kafka.coordinator.transaction.TransactionCoordinator.$anonfun$handleEndTransaction$5(Lkafka/coordinator/transaction/TransactionCoordinator;Ljava/lang/String;JSLorg/apache/kafka/common/requests/TransactionResult;ILscala/Option;)Lscala/util/Either; > (192 bytes) @ 0x7ff4a1d413f0 [0x7ff4a1d41240+0x1b0] > J 9303 C1 > kafka.coordinator.transaction.TransactionCoordinator$$Lambda$1107.apply(Ljava/lang/Object;)Ljava/lang/Object; > (32 bytes) @ 0x7ff4a245f55c [0x7ff4a245f3c0+0x19c] > J 10018 C2 > scala.util.Either$RightProjection.flatMap(Lscala/Function1;)Lscala/util/Either; > (43 bytes) @ 0x7ff4a1f242c4 [0x7ff4a1f24260+0x64] > J 9644 C1 > kafka.coordinator.transaction.TransactionCoordinator.sendTxnMarkersCallback$1(Lorg/apache/kafka/common/protocol/Errors;Ljava/lang/String;JSLorg/apache/kafka/common/requests/TransactionResult;Lscala/Function1;ILkafka/coordinator/transaction/TxnTransitMetadata;)V > (251 bytes) @ 0x7ff4a1ef6254 [0x7ff4a1ef5120+0x1134] > J 9302 C1 > kafka.coordinator.transaction.TransactionCoordinator$$Lambda$1106.apply(Ljava/lang/Object;)Ljava/lang/Object; > (40 bytes) @ 0x7ff4a24747ec [0x7ff4a24745a0+0x24c] > J 10125 C2 > kafka.coordinator.transaction.TransactionStateManager.updateCacheCallback$1(Lscala/collection/Map;Ljava/lang/String;ILkafka/coordinator/transaction/TxnTransitMetadata;Lscala/Function1;Lscala/Function1;Lorg/apache/kafka/common/TopicPartition;)V > (892 bytes) @ 0x7ff4a27045ec [0x7ff4a2703c60+0x98c] > J 10051 C2 > k
Re: KIP-213 - Scalable/Usable Foreign-Key KTable joins - Rebooted.
Hi All, Sorry for the delay - holidays and all. I have since updated the KIP with John's original suggestion and have pruned a number of the no longer relevant diagrams. Any more comments would be welcomed, otherwise I will look to kick off the vote again shortly. Thanks Adam On Mon, Dec 17, 2018 at 7:06 PM Adam Bellemare wrote: > Hi John and Guozhang > > Ah yes, I lost that in the mix! Thanks for the convergent solutions - I do > think that the attachment that John included makes for a better design. It > should also help with overall performance, as very high-cardinality foreign-keyed > data (say millions of events with the same entity) will be able to > leverage the multiple nodes for join functionality instead of having it all > performed in one node. There is still a bottleneck in the right table > having to propagate all those events, but with slimmer structures, less IO > and no need to perform the join, I think the throughput will be much higher > in those scenarios. > > Okay, I am convinced. I will update the KIP accordingly to a Gliffy > version of John's diagram and ensure that the example flow matches > correctly. Then I can go back to working on the PR to match the diagram. > > Thanks both of you for all the help - very much appreciated. > > Adam > > > > > > > > On Mon, Dec 17, 2018 at 6:39 PM Guozhang Wang wrote: > >> Hi John, >> >> Just made a pass on your diagram (nice hand-drawing btw!), and obviously >> we >> are thinking about the same thing :) A neat difference that I like is >> that >> in the pre-join repartition topic we can still send messages in the format >> of `K=k, V=(i=2)` while using "i" as the partition key in >> StreamsPartition, >> this way we do not need to even augment the key for the repartition topic, >> but just do a projection on the foreign key part and trim all other >> fields: >> as long as we still materialize the store as `A-2` co-located with the >> right KTable, that is fine. >> >> As I mentioned in my previous email, I also think this has a few >> advantages >> in saving over-the-wire bytes as well as disk bytes. >> >> Guozhang >> >> >> On Mon, Dec 17, 2018 at 3:17 PM John Roesler wrote: >> >> > Hi Guozhang, >> > >> > Thanks for taking a look! I think Adam's already addressed your >> questions >> > as well as I could have. >> > >> > Hi Adam, >> > >> > Thanks for updating the KIP. It looks great, especially how all the >> > need-to-know information is right at the top, followed by the details. >> > >> > Also, thanks for that high-level diagram. Actually, now that I'm looking >> > at it, I think part of my proposal got lost in translation, although I >> do >> > think that what you have there is also correct. >> > >> > I sketched up a crude diagram based on yours and attached it to the KIP >> > (I'm not sure if attached or inline images work on the mailing list): >> > >> https://cwiki.apache.org/confluence/download/attachments/74684836/suggestion.png >> > . It's also attached to this email for convenience. >> > >> > Hopefully, you can see how it's intended to line up, and which parts are >> > modified. >> > At a high level, instead of performing the join on the right-hand side, >> > we're essentially just registering interest, like "LHS key A wishes to >> > receive updates for RHS key 2". Then, when there is a new "interest" or >> any >> > updates to the RHS records, it "broadcasts" its state back to the LHS >> > records who are interested in it.
>> > >> > Thus, instead of sending the LHS values to the RHS joiner workers and >> then >> > sending the join results back to the LHS worker to be co-partitioned and >> > validated, we instead only send the LHS *keys* to the RHS workers and >> then >> > only the RHS k/v back to be joined by the LHS worker. >> > >> > I've been considering both your diagram and mine, and I *think* what I'm >> > proposing has a few advantages. >> > >> > Here are some points of interest as you look at the diagram: >> > * When we extract the foreign key and send it to the Pre-Join >> Repartition >> > Topic, we can send only the FK/PK pair. There's no need to worry about >> > custom partitioner logic, since we can just use the foreign key plainly >> as >> > the repartition record key. Also, we save on transmitting the LHS value, >> > since we only send its key in this step. >> > * We also only need to store the RHSKey:LHSKey mapping in the >> > MaterializedSubscriptionStore, saving on disk. We can use the same rocks >> > key format you proposed and the same algorithm involving range scans >> when >> > the RHS records get updated. >> > * Instead of joining on the right side, all we do is compose a >> > re-repartition record so we can broadcast the RHS k/v pair back to the >> > original LHS partition. (this is what the "rekey" node is doing) >> > * Then, there is a special kind of Joiner that's co-resident in the same >> > StreamTask as the LHS table, subscribed to the Post-Join Repartition >> Topic. >> > ** This Joiner is *not* triggered directl
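For readers following along, the user-facing shape of the feature under discussion is roughly a KTable-to-KTable join keyed by a foreign key extracted from the left-hand value. The sketch below is hedged: it assumes the API eventually lands as a KTable#join overload that takes a foreign-key extractor (which is how KIP-213 later shipped), and the topic names, serdes, and value formats are purely illustrative.
{code:java}
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;

public class ForeignKeyJoinSketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // LHS: orders keyed by orderId, value = "customerId|amount" (illustrative encoding)
        KTable<String, String> orders =
                builder.table("orders", Consumed.with(Serdes.String(), Serdes.String()));

        // RHS: customers keyed by customerId (illustrative)
        KTable<String, String> customers =
                builder.table("customers", Consumed.with(Serdes.String(), Serdes.String()));

        // Foreign-key join: the extractor pulls the RHS key (customerId) out of each LHS value.
        // Internally this corresponds to the subscription/response repartition topics discussed
        // in this thread: the LHS registers interest in an RHS key, and RHS updates are
        // broadcast back to the interested LHS partitions for the actual join.
        KTable<String, String> enriched = orders.join(
                customers,
                orderValue -> orderValue.split("\\|")[0],                 // foreign-key extractor
                (orderValue, customerValue) -> orderValue + " -> " + customerValue);

        enriched.toStream().to("orders-enriched");
        // The resulting topology would then be passed to new KafkaStreams(builder.build(), props).
    }
}
{code}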
[jira] [Resolved] (KAFKA-7707) Some code is not necessary
[ https://issues.apache.org/jira/browse/KAFKA-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-7707. Resolution: Not A Problem > Some code is not necessary > -- > > Key: KAFKA-7707 > URL: https://issues.apache.org/jira/browse/KAFKA-7707 > Project: Kafka > Issue Type: Improvement >Reporter: huangyiming >Priority: Minor > Attachments: image-2018-12-05-18-01-46-886.png > > > In the trunk branch, in > [BufferPool.java|https://github.com/apache/kafka/blob/578205cadd0bf64d671c6c162229c4975081a9d6/clients/src/main/java/org/apache/kafka/clients/producer/internals/BufferPool.java#L174], > I think this code can be cleaned up; it is not necessary and will never execute: > {code:java} > if (!(this.nonPooledAvailableMemory == 0 && this.free.isEmpty()) && > !this.waiters.isEmpty()) > this.waiters.peekFirst().signal(); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Kafka tests on a remote cluster
+dev@kafka.apache.org On Wed, Dec 26, 2018 at 8:53 PM Parviz deyhim wrote: > Thanks, fair points. Probably best if I simplify the question: How does > the Kafka community run tests besides using mocked local Kafka components? > Surely there are tests to confirm different failure scenarios such as > losing a broker in a real clustered environment (multi-node cluster with > IPs, ports, hostnames, etc.). The answer would be a good starting point for > me. > > On Wed, Dec 26, 2018 at 6:11 PM Stephen Powis > wrote: > >> Without looking into how the integration tests work, my best guess is that >> within >> the context they were written to run in, it doesn't make sense to run them >> against a remote cluster. The "internal" cluster is running the same >> code, >> so why require having to coordinate with an external dependency? >> >> For the use case you gave, and I'm not sure if tests exist that cover this >> behavior or not -- running the brokers locally in the context of the tests >> means that those tests have control over the brokers (i.e. shut them off, >> restart them, etc. programmatically) and can validate behavior. To >> coordinate >> these operations on a remote broker would be significantly more difficult. >> >> Not sure this helps...but perhaps you're either asking the wrong questions >> or trying to go about solving your problem using the wrong set of tools? >> My gut feeling says if you want to do a full-scale multi-server load / HA >> test, Kafka's test suite is not the best place to start. >> >> Stephen >> >> >> >> On Thu, Dec 27, 2018 at 10:53 AM Parviz deyhim wrote: >> >> > Hi, >> > >> > I'm looking to see who has done this before and get some guidance. On a >> > frequent basis I like to run basic tests on a remote Kafka cluster while >> > some random chaos/faults are being performed. In other words, I like to >> run >> > chaos engineering tasks (network outage, disk outage, etc.) and see how >> > Kafka behaves. For example: >> > >> > 1) bring some random broker node down >> > 2) send 2000 messages >> > 3) consume messages >> > 4) confirm there's no data loss >> > >> > My question: I'm pretty sure most of the scenarios I'm looking to test >> > have been covered under Kafka's integration, unit and other existing >> tests. >> > What I cannot figure out is how to run those tests on a remote cluster >> vs. >> > a local one, which the tests seem to run on. For example, I'd like to run >> the >> > following command but have the tests executed on a remote cluster of my >> > choice: >> > >> > ./gradlew cleanTest integrationTest >> > >> > Any guidance/help would be appreciated. >> > >> > Thanks >> > >> >
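For the basic scenario in the quoted message (bring a broker down, send 2000 messages, consume them, confirm no loss), one lightweight option is to script the produce/consume/validate loop directly with the Java clients against the remote cluster, rather than trying to repoint the in-tree integration tests. The sketch below is a minimal, hedged example; the bootstrap address, topic, and message count are illustrative assumptions.
{code:java}
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProduceConsumeValidate {
    public static void main(String[] args) {
        String bootstrap = "remote-broker-1:9092"; // hypothetical remote cluster address
        String topic = "chaos-test";               // illustrative topic name
        int expected = 2000;

        // Produce the test messages with acks=all so broker failures surface as send errors.
        Properties p = new Properties();
        p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrap);
        p.put(ProducerConfig.ACKS_CONFIG, "all");
        p.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        p.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
            for (int i = 0; i < expected; i++) {
                producer.send(new ProducerRecord<>(topic, Integer.toString(i), "msg-" + i));
            }
            producer.flush();
        }

        // Consume everything back and count, with an overall deadline instead of an idle timeout.
        Properties c = new Properties();
        c.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrap);
        c.put(ConsumerConfig.GROUP_ID_CONFIG, "chaos-validator");
        c.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        c.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        c.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        int seen = 0;
        long deadline = System.currentTimeMillis() + 60_000;
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
            consumer.subscribe(Collections.singletonList(topic));
            while (seen < expected && System.currentTimeMillis() < deadline) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                seen += records.count();
            }
        }
        System.out.println(seen == expected ? "OK: no data loss" : "FAIL: saw " + seen + " of " + expected);
    }
}
{code}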
[jira] [Created] (KAFKA-7771) Group/Transaction coordinators should update assignment based on current partition count
Jason Gustafson created KAFKA-7771: -- Summary: Group/Transaction coordinators should update assignment based on current partition count Key: KAFKA-7771 URL: https://issues.apache.org/jira/browse/KAFKA-7771 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson In GroupMetadataManager and TransactionStateManager, we cache the number of partitions assigned to __consumer_offsets and __transaction_state respectively. This is used to compute the expected partition for a given group.id or transactional.id. The value is computed only once when the broker starts up and it is based on the number of partitions if the topic exists or the value configured by `offsets.topic.num.partitions` in the case of the group coordinator (or `transaction.state.log.num.partitions` for the transaction coordinator). The problem is that Kafka supports manual creation of these topics as well and the number of partitions doesn't have to match the value configured. If the topic is created with a mismatching number of partitions, then the cached values will not be correct and coordinator lookup may fail. To fix this, when brokers receive metadata updates from the controller, they should pass the current number of partitions for these internal topics to the respective coordinators so that the count can be updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
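To make the failure mode concrete: the group coordinator locates a group's partition of __consumer_offsets by hashing the group.id modulo the cached partition count (the transaction coordinator does the analogous thing for __transaction_state). The snippet below is a hedged Java illustration of that mapping, not the actual broker code (which lives in Scala in GroupMetadataManager); it shows how a cached count that differs from the topic's real partition count sends lookups to the wrong partition.
{code:java}
public class CoordinatorPartitionMismatch {
    // Roughly how the coordinator maps a group.id onto a partition of __consumer_offsets:
    // a non-negative hash of the id modulo the partition count (sign bit masked here as an
    // approximation of the broker's non-negative hash).
    static int partitionFor(String groupId, int offsetsTopicPartitionCount) {
        return (groupId.hashCode() & 0x7fffffff) % offsetsTopicPartitionCount;
    }

    public static void main(String[] args) {
        String groupId = "my-group"; // illustrative group.id
        int cachedCount = 50;        // default offsets.topic.num.partitions, cached at broker startup
        int actualCount = 16;        // __consumer_offsets later created manually with 16 partitions

        // Because the count is cached once at startup (falling back to the configured value when
        // the topic does not exist yet), the broker and the actual topic can disagree on which
        // partition owns the group, so coordinator lookup can fail.
        System.out.println("partition using cached count: " + partitionFor(groupId, cachedCount));
        System.out.println("partition using actual count: " + partitionFor(groupId, actualCount));
    }
}
{code}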
[jira] [Created] (KAFKA-7772) Dynamically adjust log level in Connect workers
Arjun Satish created KAFKA-7772: --- Summary: Dynamically adjust log level in Connect workers Key: KAFKA-7772 URL: https://issues.apache.org/jira/browse/KAFKA-7772 Project: Kafka Issue Type: Improvement Components: KafkaConnect Reporter: Arjun Satish Assignee: Arjun Satish Currently, Kafka provides a JMX interface to dynamically modify log levels of different active loggers. It would be good to have a similar interface for Connect as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
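For context, the broker-side mechanism referenced above is exposed over JMX. The sketch below is a hedged illustration of driving it from a standalone Java client; the JMX port and the assumption that the MBean is registered as kafka:type=kafka.Log4jController with a setLogLevel(logger, level) operation should be verified against the running broker before relying on it.
{code:java}
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class SetKafkaLogLevel {
    public static void main(String[] args) throws Exception {
        // Hypothetical JMX endpoint; the broker must be started with remote JMX enabled
        // (e.g. JMX_PORT=9999 in the broker environment).
        JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://broker-host:9999/jmxrmi");

        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();

            // Assumed MBean name and operation exposed by the broker's Log4jController.
            ObjectName log4jController = new ObjectName("kafka:type=kafka.Log4jController");
            Object result = conn.invoke(
                    log4jController,
                    "setLogLevel",
                    new Object[] {"kafka.request.logger", "DEBUG"},
                    new String[] {"java.lang.String", "java.lang.String"});
            System.out.println("setLogLevel returned: " + result);
        }
    }
}
{code}
A Connect-side equivalent, as this ticket proposes, would presumably expose the same capability through the worker itself (for example via its REST interface) rather than requiring operators to attach over JMX.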
[jira] [Created] (KAFKA-7773) Use verifiable consumer in system tests to avoid reliance on console consumer idle timeout
Jason Gustafson created KAFKA-7773: -- Summary: Use verifiable consumer in system tests to avoid reliance on console consumer idle timeout Key: KAFKA-7773 URL: https://issues.apache.org/jira/browse/KAFKA-7773 Project: Kafka Issue Type: Improvement Components: system tests Reporter: Jason Gustafson Assignee: Jason Gustafson We have a few system tests which use `ProduceConsumeValidateTest`, which relies on the console consumer's idle timeout for shutdown. We should convert these system tests to use the verifiable consumer instead so that we can control shutdown more precisely and have more validation options. -- This message was sent by Atlassian JIRA (v7.6.3#76005)