Thanks for your response. How should we handle the unpredictable file schema (Table API) in the source directory, as I mentioned in my earlier email?
Thanks and regards,
Arjun

On Mon, 6 Nov 2023 at 20:56, Chen Yu <yuchen.e...@gmail.com> wrote:

> Hi Arjun,
>
> If you can filter files by a regex pattern, I think the config
> `source.path.regex-pattern`[1] may be what you want.
>
> 'source.path.regex-pattern' = '...', -- optional: regex pattern to filter files to read under the
>                                      -- directory of `path` option. This regex pattern should be
>                                      -- matched with the absolute file path. If this option is set,
>                                      -- the connector will recurse through all files under the
>                                      -- directory of `path` option
>
> Best,
> Yu Chen
>
> [1] https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/table/filesystem/
>
> ------------------------------
> *From:* arjun s <arjunjoice...@gmail.com>
> *Sent:* 6 November 2023, 20:50
> *To:* user@flink.apache.org <user@flink.apache.org>
> *Subject:* Handling Schema Variability and Applying Regex Patterns in Flink
> Job Configuration
>
> Hi team,
> I'm currently using the Table API in my Flink job to read records from
> CSV files located in a source directory. To obtain the file names, I'm
> creating a table and specifying the schema using the Table API. When the
> schema matches, my Flink job submits and executes as intended. However,
> when the schema does not match, the job fails to submit. Given that the
> schema of the files in the source directory is unpredictable, I'm looking
> for a way to handle this situation.
>
> Create table query
> =============
> CREATE TABLE sample (
>   col1 STRING,
>   col2 STRING,
>   col3 STRING,
>   col4 STRING,
>   `file.path` STRING NOT NULL METADATA
> ) WITH (
>   'connector' = 'filesystem',
>   'path' = 'file:///home/techuser/inputdata',
>   'format' = 'csv',
>   'source.monitor-interval' = '10000'
> )
> =============
>
> Furthermore, I have a question about whether there's a way to read files
> from the source directory based on a specific regex pattern. This is
> relevant in our situation because only file names that match a particular
> pattern need to be processed by the Flink job.
>
> Thanks and Regards,
> Arjun
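
For reference, a minimal sketch of how the `source.path.regex-pattern` option from the quoted reply might be added to the table definition in the original mail. The `orders_` file-name prefix is hypothetical and used only for illustration, and the sketch assumes a Flink release whose filesystem connector documents this option (the linked page is for 1.18). It only covers the file-filtering question, not the schema variability.

```sql
-- Sketch only: the table from the original mail plus the
-- 'source.path.regex-pattern' option described in the quoted reply.
CREATE TABLE sample (
  col1 STRING,
  col2 STRING,
  col3 STRING,
  col4 STRING,
  `file.path` STRING NOT NULL METADATA
) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///home/techuser/inputdata',
  'format' = 'csv',
  'source.monitor-interval' = '10000',
  -- Hypothetical pattern: keep only files named orders_<anything>.csv.
  -- The quoted reply says the regex is matched against the absolute file
  -- path, so the pattern starts with '.*/' to allow any parent directory.
  'source.path.regex-pattern' = '.*/orders_[^/]*[.]csv'
);
```

The dot before `csv` is written as `[.]` rather than `\.` so the pattern does not rely on how backslashes are handled inside the SQL string literal.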