[ https://issues.apache.org/jira/browse/FLINK-11378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Martijn van de Grift updated FLINK-11378:
-----------------------------------------
    Description: 
At a client we're using Flink jobs to read data from Kafka and write to GCS. In earlier versions we used `BucketingFileSink` for this, but we want to switch to the newer `StreamingFileSink`.

Since we're running Flink on Google's Dataproc, we're using the Hadoop-compatible GCS [connector|https://github.com/GoogleCloudPlatform/bigdata-interop] made by Google. This currently doesn't work on Flink, because Flink checks for the HDFS scheme in 'HadoopRecoverableWriter'.

We've successfully run our jobs by building a custom Flink distribution that has the HDFS scheme check removed.

  was:
At a client we're using Flink jobs to read data from Kafka and writing it to GCS. In earlier versions, we've used `BucketingFileSink` for this, but we want to switch to the newer `StreamingFileSink`. Since we're running Flink on Google's DataProc, we're using the Hadoop compatible GCS [connector|https://github.com/GoogleCloudPlatform/bigdata-interop] made by Google. This currently doesn't work on Flink, because Flink checks for a HDFS scheme at 'HadoopRecoverableWriter'. We've successfully ran our jobs by creating a custom Flink Distro which has the hdfs scheme check removed.


> Allow HadoopRecoverableWriter to write to Hadoop compatible Filesystems
> ------------------------------------------------------------------------
>
>                 Key: FLINK-11378
>                 URL: https://issues.apache.org/jira/browse/FLINK-11378
>             Project: Flink
>          Issue Type: Improvement
>          Components: FileSystem
>            Reporter: Martijn van de Grift
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> At a client we're using Flink jobs to read data from Kafka and write to GCS. In earlier versions we used `BucketingFileSink` for this, but we want to switch to the newer `StreamingFileSink`.
>
> Since we're running Flink on Google's Dataproc, we're using the Hadoop-compatible GCS [connector|https://github.com/GoogleCloudPlatform/bigdata-interop] made by Google. This currently doesn't work on Flink, because Flink checks for the HDFS scheme in 'HadoopRecoverableWriter'.
>
> We've successfully run our jobs by building a custom Flink distribution that has the HDFS scheme check removed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
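
For readers hitting the same wall: below is a minimal sketch of the kind of job described in the issue, assuming the Flink 1.7-era `StreamingFileSink` row-format API. It is not the reporter's actual code; the bucket name is hypothetical, the checkpoint interval is illustrative, and the Kafka source is stubbed with a placeholder since its configuration isn't relevant to this issue.

{code:java}
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

public class KafkaToGcsSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // StreamingFileSink only finalizes in-progress files on checkpoints,
        // so checkpointing must be enabled for output to become visible.
        env.enableCheckpointing(60_000);

        // Placeholder for the Kafka source (a FlinkKafkaConsumer in practice).
        DataStream<String> records = env.fromElements("example-record");

        // The gs:// scheme is served by Google's Hadoop-compatible GCS
        // connector; Flink resolves it through its Hadoop file system bridge,
        // which is where HadoopRecoverableWriter comes into play.
        StreamingFileSink<String> sink = StreamingFileSink
                .forRowFormat(new Path("gs://some-bucket/output"),
                        new SimpleStringEncoder<String>("UTF-8"))
                .build();

        records.addSink(sink);
        env.execute("kafka-to-gcs");
    }
}
{code}

With an unmodified Flink distribution, a job like this fails as soon as the sink tries to create a recoverable writer for the gs:// path.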
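The check in question sits in the constructor of 'HadoopRecoverableWriter' in flink-hadoop-fs. Below is a standalone paraphrase, not the verbatim Flink source: the class and method names are hypothetical, chosen only so the snippet compiles on its own. The real constructor (as of Flink 1.7.x) additionally requires Hadoop 2.7+, since recovery relies on FileSystem truncate support.

{code:java}
import org.apache.hadoop.fs.FileSystem;

// Hypothetical standalone paraphrase of the guard in the
// HadoopRecoverableWriter constructor; not the verbatim Flink source.
final class SchemeGuardSketch {

    static void checkSchemeSupported(FileSystem fs) {
        // The GCS connector implements the Hadoop FileSystem API but reports
        // the scheme "gs", so this hard-coded comparison rejects it.
        if (!"hdfs".equalsIgnoreCase(fs.getScheme())) {
            throw new UnsupportedOperationException(
                    "Recoverable writers on Hadoop are only supported for HDFS");
        }
    }
}
{code}

Removing or relaxing that comparison, as the custom distribution described in the issue does, is what allows the gs:// file system to be used with `StreamingFileSink`.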