Hi Robert,
Because of the performance issues with the State Processor API, I will probably
give it up, and I am now trying StreamingFileSink in a streaming job instead.
However, I need to store the files on GCS, and I encountered the error below.
It looks like Flink doesn't support GCS for
Hi Robert,
Due to some concerns, we planned to use the State Processor API to achieve our
goal. Now we will reevaluate using the DataStream API for the job while
exploring the possibility of implementing a custom FileOutputFormat. Thanks for
your comments!
Best wishes,
Chen-Che Huang
On 2021/0
Hi,
I assumed you were using the DataStream API because you mentioned the
streaming sink. But you also mentioned the state processor API (which I
ignored a bit).
I wonder why you are using the State Processor API. Can't you also use the
streaming job that created the state to write it to files
Hi Robert,
Thanks for your code. It's really helpful!
However, with the readKeyedState API of the State Processor API, we get a
DataSet rather than a DataStream, and it seems DataSet doesn't support
StreamingFileSink (it has no addSink method like DataStream does). If not, I
need to transform the dataset
Hey Chen-Che Huang,
I guess the StreamingFileSink is what you are looking for. It is documented
here:
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html
I drafted a short example (not production-ready) that does roughly what you
are asking for:
htt
Hi all,
We're going to use the State Processor API to write our keyed state data to
different files based on the keys. More specifically, we want our data to be
written to files key1.txt, key2.txt, ..., and keyn.txt, where values with the
same key are stored in the same file. In each file,