kyle n created FLINK-37296:
------------------------------

             Summary: Add ability to add File Level Header in the File Sink 
forRowFormat
                 Key: FLINK-37296
                 URL: https://issues.apache.org/jira/browse/FLINK-37296
             Project: Flink
          Issue Type: Improvement
          Components: API / DataStream, Connectors / FileSystem
            Reporter: kyle n


Our team would like to use forRowFormat to process streaming data. We roll 
output files on a size limit and on a time interval, and sink them to an S3 
bucket.

We would like the ability to add a header to these files. The header will be 
static across all part files.

I think this can be accomplished by injecting a string whenever we open a new 
part file, before returning that bucketWriter for further processing.

 

We have created a working prototype by modifying the method 
`[org.apache.flink.connector.file.sink.writer.FileWriterBucket#rollPartFile|https://github.com/apache/flink/blob/release-1.18/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/sink/writer/FileWriterBucket.java#L249-L261]`
so that the bucketWriter writes a string whenever it opens a new part file.

We have also added some pass-through arguments and a method in FileSink that 
allow the header to be set.
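
To make the idea concrete, here is a minimal, self-contained sketch of the behaviour described above. It is not the prototype and it is not Flink code: the class HeaderedPartWriterSketch, its constructor arguments, and the in-memory "part files" are hypothetical stand-ins, and only the name rollPartFile is borrowed from FileWriterBucket. It assumes size-based rolling only and a newline after the header and after each row.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a row-format bucket; illustrates the idea only. */
class HeaderedPartWriterSketch {
    private final byte[] header;      // static header, identical for every part
    private final long maxPartSize;   // assumed size-based rolling threshold, in bytes
    private final List<ByteArrayOutputStream> parts = new ArrayList<>();
    private ByteArrayOutputStream current; // in-progress part, null until first write

    HeaderedPartWriterSketch(String header, long maxPartSize) {
        this.header = (header + "\n").getBytes(StandardCharsets.UTF_8);
        this.maxPartSize = maxPartSize;
    }

    /** Opens a new part and injects the header before any row is written. */
    private void rollPartFile() {
        current = new ByteArrayOutputStream();
        current.write(header, 0, header.length);
        parts.add(current);
    }

    /** Appends one row, rolling to a new part first if the row would not fit. */
    void write(String row) {
        byte[] bytes = (row + "\n").getBytes(StandardCharsets.UTF_8);
        if (current == null || current.size() + bytes.length > maxPartSize) {
            rollPartFile();
        }
        current.write(bytes, 0, bytes.length);
    }

    /** Returns the content of every part written so far, in creation order. */
    List<String> partContents() {
        List<String> out = new ArrayList<>();
        for (ByteArrayOutputStream part : parts) {
            out.add(new String(part.toByteArray(), StandardCharsets.UTF_8));
        }
        return out;
    }
}
```

In this sketch the header bytes count toward the part size, so a size limit smaller than the header plus one row yields one row per part. Whether the real sink should count the header against the rolling policy is an open design question for the proposal.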



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
