Hi Dan,

The SQL sink adds the UUID by default to cover the case where users
execute multiple bounded SQL jobs that append to the same directory
(e.g. a Hive table). The UUID keeps a later job from overwriting the
output of an earlier one.

The DataStream API can be viewed as the lower-level API, so it does
not add the UUID automatically. As you pointed out, users can
implement the same behavior themselves with OutputFileConfig.
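
For illustration, here is a minimal sketch of that approach in plain Java. The class PartFileNames and its methods are hypothetical helpers, not part of Flink; they only build a per-run prefix and mirror the "part-{uuid string}-0-0" naming from your mail. The Flink calls appear in a comment only, and assume the OutputFileConfig builder's withPartPrefix and the sink builder's withOutputFileConfig, so please check them against the Flink version you use.

```java
import java.util.UUID;

// Hypothetical helper, not part of Flink: builds a per-run part-file prefix.
final class PartFileNames {
    private PartFileNames() {}

    // Returns "part-<uuid>". Generate it once per job submission so that all
    // files written by one run share the same prefix.
    static String uuidPrefix() {
        return "part-" + UUID.randomUUID();
    }

    // Mirrors the naming seen in this thread: <prefix>-<subtask index>-<part counter>.
    static String partFileName(String prefix, int subtaskIndex, long partCounter) {
        return prefix + "-" + subtaskIndex + "-" + partCounter;
    }

    public static void main(String[] args) {
        String prefix = uuidPrefix();
        // With Flink on the classpath, the prefix would be handed to the sink
        // roughly like this (assumed usage, verify against your Flink version):
        //   OutputFileConfig config =
        //       OutputFileConfig.builder().withPartPrefix(prefix).build();
        //   StreamingFileSink.forRowFormat(path, encoder)
        //       .withOutputFileConfig(config)
        //       .build();
        System.out.println(partFileName(prefix, 0, 0));
    }
}
```

Because the prefix is created when the job is submitted, two bounded jobs writing to the same directory would get different prefixes and so would not collide on "part-0-0".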

Best,
 Yun


------------------ Original Mail ------------------
Sender: Dan Hill <quietgol...@gmail.com>
Send Date: Mon Feb 8 07:40:36 2021
Recipients: user <user@flink.apache.org>
Subject: UUID in part files

Hi.

Context
I'm migrating my Flink SQL job to DataStream.  When switching to 
StreamingFileSink, I noticed that the part files now do not have a uuid in 
them.  "part-0-0" vs "part-{uuid string}-0-0".  This is easy to add with 
OutputFileConfig.

Question
Is there a reason why the base OutputFileConfig doesn't add the uuid 
automatically?  Is this just a legacy issue?  Or do most people not have the 
uuid in the file outputs?
