Hello,
We had the same problem. I've written a blog post with the detailed
explanation and workaround:
http://labs.totango.com/spark-read-file-with-colon/
Greetings,
Romi K.
On Tue, Aug 25, 2015 at 2:47 PM Gourav Sengupta wrote:
I am not quite sure about this, but shouldn't the notation be
s3n://redactedbucketname/*
instead of
s3a://redactedbucketname/*
The best way is to use s3://<>/<>/*
Regards,
Gourav
On Tue, Aug 25, 2015 at 10:35 AM, Akhil Das wrote:
You can change the names; whatever program is pushing the records must
follow the naming conventions. Try replacing the : with _ or something similar.
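A minimal sketch of the rename suggested above, as a Python helper (the function name and the example file name are hypothetical; the thread's real bucket and keys are redacted). Applying this to each key before upload, or when copying existing objects to new names, avoids the colon entirely:

```python
def sanitize_key(key: str) -> str:
    """Replace the colon, which Hadoop-style path parsing treats as a
    URI scheme separator, with an underscore, as suggested above."""
    return key.replace(":", "_")

# Example: an ISO-8601 timestamped log name, a common source of colons.
print(sanitize_key("2015-08-18T04:38:34.567Z.log"))
# -> 2015-08-18T04_38_34.567Z.log
```

For objects already in S3 there is no in-place rename; the usual approach is to copy each object to the sanitized key and delete the original.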
Thanks
Best Regards
On Tue, Aug 18, 2015 at 10:20 AM, Brian Stempin wrote:
Hi,
I'm running Spark on Amazon EMR (Spark 1.4.1, Hadoop 2.6.0). I'm seeing
the exception below when encountering file names that contain colons. Any
idea on how to get around this?
scala> val files = sc.textFile("s3a://redactedbucketname/*")
2015-08-18 04:38:34,567 INFO [main] storage.MemoryS
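For context on why a colon in a file name trips Spark up: Hadoop's Path is built on java.net.URI, which reads the text before the first colon in the first path segment as a URI scheme. Python's urlsplit shows the same ambiguity (this is an illustration of the parsing rule, not Hadoop's actual code; the file name is hypothetical):

```python
from urllib.parse import urlsplit

# A bare file name containing a colon, as a relative path.
name = "report:v2.log"

# The text before the colon is parsed as a scheme, not as part of
# the file name, which is the same ambiguity Hadoop's URI-backed
# Path runs into.
parts = urlsplit(name)
print(parts.scheme)  # report
print(parts.path)    # v2.log
```

This is why renaming the offending keys, as suggested above, is the simplest workaround.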