Handling Schema Variability and Applying Regex Patterns in Flink Job Configuration

arjun s Mon, 06 Nov 2023 04:50:51 -0800

Hi team,
I'm using the Table API in my Flink job to read records from CSV files
located in a source directory. To obtain the file names, I create a table
and specify its schema using the Table API. When the schema of the files
matches the table definition, the job submits and executes as intended.
When the schema does not match, the job fails to submit.
Since the schema of the files in the source directory is unpredictable,
I'm looking for a way to handle this situation.
Create table query
=============
CREATE TABLE sample (
  col1 STRING, col2 STRING, col3 STRING, col4 STRING,
  `file.path` STRING NOT NULL METADATA
) WITH (
  'connector' = 'filesystem', 'path' = 'file:///home/techuser/inputdata',
  'format' = 'csv', 'source.monitor-interval' = '10000'
)
=============


I also have a second question: is there a way to read files from the
source directory based on a specific regex pattern? This matters in our
case because only files whose names match a particular pattern should be
processed by the Flink job.
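
To illustrate the kind of file-name filtering I have in mind, here is a
minimal plain-Java sketch. It is not a Flink API: the class name
FileNameFilter, the method filterByFileName, and the example pattern
(orders_ followed by eight digits and .csv) are all hypothetical and only
show the matching behaviour I would like applied to the source directory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flink: keeps only the paths whose
// file name (the last path segment) fully matches the given regex.
class FileNameFilter {

    static List<String> filterByFileName(List<String> paths, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> matched = new ArrayList<>();
        for (String path : paths) {
            // Take everything after the last '/'; if there is no '/',
            // lastIndexOf returns -1 and the whole string is used.
            String fileName = path.substring(path.lastIndexOf('/') + 1);
            // matches() requires the entire file name to match the pattern.
            if (pattern.matcher(fileName).matches()) {
                matched.add(path);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Example paths are assumptions built on the directory used above.
        List<String> paths = Arrays.asList(
                "/home/techuser/inputdata/orders_20231106.csv",
                "/home/techuser/inputdata/orders_20231106.csv.tmp",
                "/home/techuser/inputdata/notes.txt");
        System.out.println(filterByFileName(paths, "orders_\\d{8}\\.csv"));
    }
}
```

With this pattern, only the first of the three example paths would be kept.
What I am after is an equivalent of this check applied by the source itself,
so that non-matching files are never read.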

Thanks and Regards,
Arjun
