> On 18 Oct 2015, at 03:23, vonnagy <i...@vadio.com> wrote:
>
> Has anyone tried to go from streaming directly to GCS or S3 and overcome the
> unacceptable performance. It can never keep up.

The problem here is that they aren't really filesystems (certainly S3 via the s3n & s3a clients): flush() is a no-op, and it's only on the close() that there's a bulk upload. For bonus fun, anything that does a rename() usually forces a download/re-upload of the source files.
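
To make that concrete, here is a toy Python model of the behaviour described above. It is purely in-memory and only a sketch: ToyObjectStore and ToyObjectOutputStream are made-up names for illustration, not the real s3n/s3a code, and the byte counters just stand in for network traffic. It assumes rename is done as read, re-write under the new key, then delete.

```python
import io


class ToyObjectStore:
    """Hypothetical in-memory stand-in for a blob store (not a real S3/GCS client)."""

    def __init__(self):
        self.objects = {}          # key -> bytes
        self.bytes_uploaded = 0    # stands in for upload traffic
        self.bytes_downloaded = 0  # stands in for download traffic

    def put(self, key, data):
        self.objects[key] = bytes(data)
        self.bytes_uploaded += len(data)

    def get(self, key):
        data = self.objects[key]
        self.bytes_downloaded += len(data)
        return data

    def rename(self, src, dst):
        # Assumption: no native rename, so the data is read back,
        # written again under the new key, and the old key is deleted.
        data = self.get(src)
        self.put(dst, data)
        del self.objects[src]


class ToyObjectOutputStream:
    """Hypothetical output stream: buffers locally, uploads only on close()."""

    def __init__(self, store, key):
        self.store = store
        self.key = key
        self.buffer = io.BytesIO()
        self.closed = False

    def write(self, data):
        self.buffer.write(data)  # nothing reaches the store yet

    def flush(self):
        pass  # no-op: data stays in the local buffer

    def close(self):
        if not self.closed:
            # The single bulk upload happens here.
            self.store.put(self.key, self.buffer.getvalue())
            self.closed = True


store = ToyObjectStore()
out = ToyObjectOutputStream(store, "tmp/part-0000")
out.write(b"x" * 1000)
out.flush()
visible_after_flush = "tmp/part-0000" in store.objects  # still not in the store
out.close()
store.rename("tmp/part-0000", "final/part-0000")
```

In this model the 1000 bytes are uploaded twice and downloaded once for a single write-then-rename, and nothing is visible in the store until close(). That is the pattern a "write to a temp path, then rename into place" commit hits on every output file.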