[ https://issues.apache.org/jira/browse/FLINK-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15409138#comment-15409138 ]
ASF GitHub Bot commented on FLINK-4035:
---------------------------------------

Github user tzulitai commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2231#discussion_r73656446

    --- Diff: flink-streaming-connectors/flink-connector-kafka-0.10/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer010.java ---
    @@ -0,0 +1,259 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements. See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License. You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.streaming.connectors.kafka;
    +
    +import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
    +import org.apache.flink.streaming.api.functions.AssignerWithPunctuatedWatermarks;
    +import org.apache.flink.streaming.api.operators.StreamingRuntimeContext;
    +import org.apache.flink.streaming.connectors.kafka.internal.Kafka010Fetcher;
    +import org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcher;
    +import org.apache.flink.streaming.connectors.kafka.internals.KafkaTopicPartition;
    +import org.apache.flink.streaming.util.serialization.DeserializationSchema;
    +import org.apache.flink.streaming.util.serialization.KeyedDeserializationSchema;
    +import org.apache.flink.streaming.util.serialization.KeyedDeserializationSchemaWrapper;
    +import org.apache.flink.util.SerializedValue;
    +
    +import org.apache.kafka.clients.consumer.ConsumerConfig;
    +import org.apache.kafka.clients.consumer.KafkaConsumer;
    +import org.apache.kafka.common.PartitionInfo;
    +import org.apache.kafka.common.serialization.ByteArrayDeserializer;
    +
    +import org.slf4j.Logger;
    +import org.slf4j.LoggerFactory;
    +
    +import java.util.ArrayList;
    +import java.util.Collections;
    +import java.util.List;
    +import java.util.Properties;
    +
    +import static org.apache.flink.util.Preconditions.checkNotNull;
    +
    +/**
    + * The Flink Kafka Consumer is a streaming data source that pulls a parallel data stream from
    + * Apache Kafka 0.10.x. The consumer can run in multiple parallel instances, each of which will pull
    + * data from one or more Kafka partitions.
    + *
    + * <p>The Flink Kafka Consumer participates in checkpointing and guarantees that no data is lost
    + * during a failure, and that the computation processes elements "exactly once".
    + * (Note: These guarantees naturally assume that Kafka itself does not lose any data.)</p>
    + *
    + * <p>Please note that Flink snapshots the offsets internally as part of its distributed checkpoints. The offsets
    + * committed to Kafka / ZooKeeper are only to bring the outside view of progress in sync with Flink's view
    + * of the progress. That way, monitoring and other jobs can get a view of how far the Flink Kafka consumer
    + * has consumed a topic.</p>
    + *
    + * <p>Please refer to Kafka's documentation for the available configuration properties:
    + * http://kafka.apache.org/documentation.html#newconsumerconfigs</p>
    + *
    + * <p><b>NOTE:</b> The implementation currently accesses partition metadata when the consumer
    + * is constructed. That means that the client that submits the program needs to be able to
    + * reach the Kafka brokers or ZooKeeper.</p>
    + */
    +public class FlinkKafkaConsumer010<T> extends FlinkKafkaConsumerBase<T> {
    +
    +    private static final long serialVersionUID = 2324564345203409112L;
    +
    +    private static final Logger LOG = LoggerFactory.getLogger(FlinkKafkaConsumer010.class);
    +
    +    /** Configuration key to change the polling timeout. */
    +    public static final String KEY_POLL_TIMEOUT = "flink.poll-timeout";
    +
    +    /** Boolean configuration key to disable metrics tracking. */
    +    public static final String KEY_DISABLE_METRICS = "flink.disable-metrics";
    --- End diff --

    This is redundant. It's already declared in `FlinkKafkaConsumerBase`.


> Bump Kafka producer in Kafka sink to Kafka 0.10.0.0
> ---------------------------------------------------
>
>                 Key: FLINK-4035
>                 URL: https://issues.apache.org/jira/browse/FLINK-4035
>             Project: Flink
>          Issue Type: Bug
>          Components: Kafka Connector
>    Affects Versions: 1.0.3
>            Reporter: Elias Levy
>            Priority: Minor
>
> Kafka 0.10.0.0 introduced protocol changes related to the producer. Published messages now include timestamps, and compressed messages now include relative offsets. As it stands, brokers must decompress publisher-compressed messages, assign offsets to them, and recompress them, which is wasteful and makes it less likely that compression will be used at all.
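
For readers following the review, a minimal usage sketch of the class under discussion. The broker address, consumer group, and topic name are hypothetical, and it assumes the 0.10 consumer keeps the (topic, schema, properties) constructor of its 0.9 predecessor:

    import java.util.Properties;

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
    import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

    public class Kafka010ConsumerSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker address
            props.setProperty("group.id", "flink-demo-group");        // assumed consumer group

            // Connector-specific key from the file above. Note that KEY_DISABLE_METRICS
            // does not need to be redeclared in the subclass, since FlinkKafkaConsumerBase
            // already declares it, as the review comment points out.
            props.setProperty(FlinkKafkaConsumer010.KEY_POLL_TIMEOUT, "100");

            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.addSource(new FlinkKafkaConsumer010<>("events", new SimpleStringSchema(), props))
               .print();
            env.execute("Kafka 0.10 consumer sketch");
        }
    }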
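
On the producer side that the ticket asks to bump, a hedged sketch of what a 0.10-based sink could look like. The FlinkKafkaProducer010 name and constructor are assumptions modeled on the 0.9 producer, since this discussion predates the final API; broker address and topic name are likewise placeholders:

    import java.util.Properties;

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer010;
    import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

    public class Kafka010ProducerSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStream<String> stream = env.fromElements("a", "b", "c");

            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker address
            // With the 0.10 message format, compressed batches carry relative offsets,
            // so brokers no longer need to decompress, re-assign offsets, and recompress,
            // which is the waste the issue description calls out.
            props.setProperty("compression.type", "snappy");

            stream.addSink(new FlinkKafkaProducer010<>("events", new SimpleStringSchema(), props));
            env.execute("Kafka 0.10 producer sketch");
        }
    }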