> On 18 Oct 2015, at 03:23, vonnagy <i...@vadio.com> wrote:
> 
> Has anyone tried to go from streaming directly to GCS or S3 and overcome the
> unacceptable performance? It can never keep up.

The problem here is that they aren't really filesystems (certainly not S3 via the 
s3n & s3a clients): flush() is a no-op, and it's only on close() that 
there's a bulk upload. For bonus fun, anything that does a rename() usually 
forces a download/re-upload of the source files.
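
To make the cost concrete, here is a rough toy model of that behaviour in Python. It is only a sketch: ToyObjectStore and BufferedObjectStream are hypothetical names invented for illustration, the store is an in-memory dict, and none of this is the actual s3n/s3a client code. It just mimics the three points above: writes are buffered locally, flush() does nothing, the upload happens in close(), and rename() moves the whole object's bytes again.

```python
class ToyObjectStore:
    """Hypothetical in-memory stand-in for an object store (not a real S3/GCS client)."""

    def __init__(self):
        self.objects = {}           # key -> bytes
        self.bytes_transferred = 0  # bytes moved between client and store

    def put(self, key, data):
        # Whole-object upload: an object only exists once this completes.
        self.objects[key] = bytes(data)
        self.bytes_transferred += len(data)

    def rename(self, src, dst):
        # Assumption for this sketch: no native rename, so the client reads
        # the source object back and writes it again under the new key.
        data = self.objects[src]
        self.bytes_transferred += len(data)  # download of the source
        self.put(dst, data)                  # re-upload under the new key
        del self.objects[src]


class BufferedObjectStream:
    """Hypothetical output stream that only uploads when closed."""

    def __init__(self, store, key):
        self.store = store
        self.key = key
        self.buffer = bytearray()

    def write(self, data):
        self.buffer.extend(data)  # buffered locally, nothing sent yet

    def flush(self):
        pass  # no-op: nothing becomes visible in the store

    def close(self):
        self.store.put(self.key, self.buffer)  # the single bulk upload


store = ToyObjectStore()
out = BufferedObjectStream(store, "tmp/part-0000")
out.write(b"x" * 1000)
out.flush()
visible_after_flush = "tmp/part-0000" in store.objects  # still not in the store
out.close()
store.rename("tmp/part-0000", "final/part-0000")
total_bytes = store.bytes_transferred  # upload + download + re-upload
```

In this toy model a 1000-byte write followed by a rename moves 3000 bytes in total, and nothing is visible in the store until close() returns, which is why a write-to-temp-then-rename commit pattern hurts so much here.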

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
