[ https://issues.apache.org/jira/browse/FLINK-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flink Jira Bot updated FLINK-19903:
-----------------------------------
      Labels: auto-deprioritized-major pull-request-available  (was: pull-request-available stale-major)
    Priority: Minor  (was: Major)

This issue was labeled "stale-major" 7 days ago and has not received any updates so it is being deprioritized. If this ticket is actually Major, please raise the priority and ask a committer to assign you the issue or revive the public discussion.

> Allow to read metadata in filesystem connector
> ----------------------------------------------
>
>                 Key: FLINK-19903
>                 URL: https://issues.apache.org/jira/browse/FLINK-19903
>             Project: Flink
>          Issue Type: New Feature
>          Components: Connectors / FileSystem, Table SQL / Ecosystem
>            Reporter: Ruben Laguna
>            Priority: Minor
>              Labels: auto-deprioritized-major, pull-request-available
>         Attachments: image-2020-11-03-08-53-03-714.png
>
>
> Use case:
> I have a dataset (200k files) where some information is embedded in the
> filenames, and I need to extract it as a new column.
> In Spark I could write
> `.withColumn("id",f.split(f.reverse(f.split(f.input_file_name(),'/'))[0],'\.')[0])`
> but I don't see how I can do the same with Flink.
>
> Apparently there is
> [FLIP-107|https://cwiki.apache.org/confluence/display/FLINK/FLIP-107%3A+Handling+of+metadata+in+SQL+connectors]
> which would allow SQL connectors and formats to expose metadata.
>
> So it would be great for the Filesystem SQL connector to expose the path.
> Ideally for me the path would be exposed via a function that reads the
> metadata, so I could write something akin to `SELECT input_file_name(),*
> FROM table1`
>
>
> [1]: https://cwiki.apache.org/confluence/display/FLINK/FLIP-107%3A+Handling+of+metadata+in+SQL+connectors
> [2]: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Can-I-get-the-filename-as-a-column-td39096.html

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
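
For readers unfamiliar with the Spark expression quoted in the issue, here is a minimal plain-Python sketch of what it appears to compute: take the last `/`-separated component of the input file path, then keep the text before the first dot. The function name `id_from_path` and the sample paths are hypothetical and only for illustration. This is neither Flink nor Spark API, just the string logic the reporter wants applied per file.

```python
def id_from_path(path: str) -> str:
    """Return the file name's leading segment (text before the first dot).

    Hypothetical helper mirroring the quoted Spark expression:
    split on '/', take the last element, split on '.', take the first.
    """
    # Last path component, e.g. "abc123.csv" from "file:///data/in/abc123.csv".
    file_name = path.split("/")[-1]
    # Everything before the first ".", e.g. "abc123".
    return file_name.split(".")[0]


if __name__ == "__main__":
    # Sample paths are made up for illustration.
    for sample in ("file:///data/in/abc123.csv", "s3://bucket/dir/run-7.tar.gz"):
        print(sample, "->", id_from_path(sample))
```

Note that a name with several dots (such as `run-7.tar.gz`) yields only the first segment, which matches taking element `[0]` of the dot split in the Spark snippet.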