kaikai.hou created KAFKA-9385:
---------------------------------

             Summary: Connect cluster: connector task repeat like a splitbrain cluster problem
                 Key: KAFKA-9385
                 URL: https://issues.apache.org/jira/browse/KAFKA-9385
             Project: Kafka
          Issue Type: Bug
          Components: KafkaConnect
            Reporter: kaikai.hou
         Attachments: 12_31_d8c7j_1.jpg

I am using Debezium and have found a duplicated-task problem (see [DBZ-1573|https://issues.redhat.com/browse/DBZ-1573?jql=key%20in%20watchedIssues()]).

1. I pushed the Debezium image to our private image repository.

2. I deployed the Connect cluster with the following *Deployment Config*:
{code:yaml}
apiVersion: apps.openshift.io/v1
kind: DeploymentConfig
metadata:
  annotations:
    openshift.io/generated-by: OpenShiftWebConsole
  creationTimestamp: '2019-10-14T07:45:41Z'
  generation: 29
  labels:
    app: debezium-test-cloud
  name: debezium-test-cloud
  namespace: test
  resourceVersion: '168496156'
  selfLink: >-
    /apis/apps.openshift.io/v1/namespaces/test/deploymentconfigs/debezium-test-cloud
  uid: 9f4f8f4d-ee56-11e9-a5a1-00163e0e008f
spec:
  replicas: 2
  selector:
    app: debezium-test-cloud
    deploymentconfig: debezium-test-cloud
  strategy:
    activeDeadlineSeconds: 21600
    resources: {}
    rollingParams:
      intervalSeconds: 1
      maxSurge: 25%
      maxUnavailable: 25%
      timeoutSeconds: 600
      updatePeriodSeconds: 1
    type: Rolling
  template:
    metadata:
      annotations:
        openshift.io/generated-by: OpenShiftWebConsole
      creationTimestamp: null
      labels:
        app: debezium-test-cloud
        deploymentconfig: debezium-test-cloud
    spec:
      containers:
        - env:
            - name: BOOTSTRAP_SERVERS
              value: '192.168.100.228:9092'
            - name: GROUP_ID
              value: test-cloud
            - name: CONFIG_STORAGE_TOPIC
              value: base.test-cloud.config
            - name: OFFSET_STORAGE_TOPIC
              value: base.test-cloud.offset
            - name: STATUS_STORAGE_TOPIC
              value: base.test-cloud.status
            - name: CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE
              value: 'true'
            - name: CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE
              value: 'true'
            - name: CONNECT_PRODUCER_MAX_REQUEST_SIZE
              value: '20971520'
            - name: CONNECT_DATABASE_HISTORY_KAFKA_RECOVERY_POLL_INTERVAL_MS
              value: '1000'
            - name: HEAP_OPTS
              value: '-XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0'
          image: 'registry.cn-hangzhou.aliyuncs.com/eshine/debeziumconnect:1.0.0.Beta2'
          imagePullPolicy: IfNotPresent
          name: debezium-test-cloud
          ports:
            - containerPort: 8083
              protocol: TCP
            - containerPort: 8778
              protocol: TCP
            - containerPort: 9092
              protocol: TCP
            - containerPort: 9779
              protocol: TCP
          resources:
            limits:
              cpu: 400m
              memory: 1Gi
            requests:
              cpu: 200m
              memory: 1Gi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /kafka/config
              name: debezium-test-cloud-1
            - mountPath: /kafka/data
              name: debezium-test-cloud-2
            - mountPath: /kafka/logs
              name: debezium-test-cloud-3
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        - emptyDir: {}
          name: debezium-test-cloud-1
        - emptyDir: {}
          name: debezium-test-cloud-2
        - emptyDir: {}
          name: debezium-test-cloud-3
  test: false
  triggers:
    - type: ConfigChange
status:
  availableReplicas: 2
  conditions:
    - lastTransitionTime: '2019-11-25T06:44:30Z'
      lastUpdateTime: '2019-11-25T06:44:44Z'
      message: replication controller "debezium-test-cloud-15" successfully rolled out
      reason: NewReplicationControllerAvailable
      status: 'True'
      type: Progressing
    - lastTransitionTime: '2019-12-31T10:06:23Z'
      lastUpdateTime: '2019-12-31T10:06:23Z'
      message: Deployment config has minimum availability.
      status: 'True'
      type: Available
  details:
    causes:
      - type: Manual
    message: manual change
  latestVersion: 15
  observedGeneration: 29
  readyReplicas: 2
  replicas: 2
  unavailableReplicas: 0
  updatedReplicas: 2
{code}

3. The Connect cluster runs in OpenShift as one service with two pods.

4. What happened:
a) task_connector_1_0 and task_connector_3_0 were running in PodA; task_connector_2_0 was running in PodB.
b) Then the PodA console showed the error log in the attachment "12_31_d8c7j_1.jpg".
c) Then a rebalance started.
d) However, all tasks (task_connector_1_0, task_connector_2_0, task_connector_3_0) are now running in PodB, while PodA is still running task_connector_1_0 and task_connector_3_0.
e) So duplicate tasks appeared: task_connector_1_0 and task_connector_3_0 are each running in both pods.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)