Hi,

When trying to set up a demo where the Flink SQL client reads an Avro topic
from Kafka, I run into problems with the additional dependencies.
The spark-shell has a --packages option which automatically resolves any
additional required jars (transitively) from the provided Maven coordinates.
So far, I could not find an equivalent for Flink. Am I missing something?
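
For reference, this is the kind of invocation I mean on the Spark side (the
coordinates are just an example):

spark-shell --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.1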

Instead, I tried to set this up manually and first downloaded the additional
jars (for Flink 1.14.1, Scala 2.12) which are mentioned here:
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/table/kafka/
and
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/table/formats/avro-confluent/

wget https://repo1.maven.org/maven2/org/apache/flink/flink-connector-kafka_2.12/1.14.4/flink-connector-kafka_2.12-1.14.4.jar -P lib/
wget https://repo1.maven.org/maven2/org/apache/kafka/kafka-clients/3.0.0/kafka-clients-3.0.0.jar -P lib/
wget https://repo1.maven.org/maven2/org/apache/flink/flink-avro-confluent-registry/1.14.4/flink-avro-confluent-registry-1.14.4.jar -P lib/
wget https://repo1.maven.org/maven2/org/apache/flink/flink-avro/1.14.4/flink-avro-1.14.4.jar -P lib/

I still fail to get them loaded, even though they are placed in the default
lib path.
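
(One thing I was unsure about: the linked documentation pages also offer "SQL
Client" fat jars, i.e. flink-sql-connector-kafka and
flink-sql-avro-confluent-registry, which bundle their transitive dependencies.
If those are what the SQL client actually expects, the downloads would look
roughly like this instead, but I have not verified this:

wget https://repo1.maven.org/maven2/org/apache/flink/flink-sql-connector-kafka_2.12/1.14.4/flink-sql-connector-kafka_2.12-1.14.4.jar -P lib/
wget https://repo1.maven.org/maven2/org/apache/flink/flink-sql-avro-confluent-registry/1.14.4/flink-sql-avro-confluent-registry-1.14.4.jar -P lib/
)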
When starting a local cluster:

./bin/start-cluster.sh local

and the SQL client:

./bin/sql-client.sh

Passing the jars explicitly via ./bin/sql-client.sh -j, or the lib folder via
./bin/sql-client.sh -l (pointing at the additional jars which were added
before), fails for the same reason.
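
The invocations I tried look roughly like this (paths abbreviated, so treat
this as a sketch rather than the exact commands):

./bin/sql-client.sh -j lib/flink-avro-1.14.4.jar -j lib/flink-avro-confluent-registry-1.14.4.jar
./bin/sql-client.sh -l lib/

In every case it fails with: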

Caused by: java.lang.ClassNotFoundException:
org.apache.avro.SchemaParseException

when trying to execute:
CREATE TABLE foo (foo string) WITH (
    'connector' = 'kafka',
    'topic' = 'foo',
    'scan.startup.mode' = 'earliest-offset',
    'format' = 'avro-confluent',
    'avro-confluent.schema-registry.url' = 'http://localhost:8081/',
    'properties.group.id' = 'flink-test-001',
    'properties.bootstrap.servers' = 'localhost:9092'
);
SELECT * FROM foo;

This happens irrespective of whether any dummy data has been loaded. That said,
I do have dummy data available in the topic, generated with the Kafka Connect
dummy data generator from the following Avro schema (and serialized as Avro
into the Kafka topic):

{
  "type": "record",
  "name": "commercialrating",
  "fields": [
    {
      "name": "brand",
      "type": {
        "type": "string",
        "arg.properties": {
          "options": ["Acme", "Globex"]
        }
      }
    },
    {
      "name": "duration",
      "type": {
        "type": "int",
        "arg.properties": {
          "options": [30, 45, 60]
        }
      }
    },
    {
      "name": "rating",
      "type": {
        "type": "int",
        "arg.properties": {
          "range": { "min": 1, "max": 5 }
        }
      }
    }
  ]
}


Questions:

1) Can I somehow specify Maven coordinates directly (for the naive approach of
using the SQL client like the spark-shell) to simplify the setup of the
required jars?

2) Given that I have manually downloaded the jars into the lib folder of the
Flink installation, why are they not loaded by default? What needs to change so
that the additional (required) jars for Avro + Confluent Schema Registry +
Kafka are loaded by the Flink SQL client?

Best,
Georg
