I am reading from a single file: df = spark.read.text("s3a://test-bucket/testfile.csv")
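
For context, a minimal sketch of how a session can be wired up for a single s3a object read; the endpoint, credentials, and path-style setting below are placeholders for an S3-compatible store, not the actual configuration in use here:

    from pyspark.sql import SparkSession

    # Session with the S3A connector configured; the endpoint/credential
    # values are placeholders, not real settings.
    spark = (
        SparkSession.builder
        .appName("read-single-s3-object")
        .config("spark.hadoop.fs.s3a.endpoint", "http://object-store:9000")
        .config("spark.hadoop.fs.s3a.access.key", "ACCESS_KEY")
        .config("spark.hadoop.fs.s3a.secret.key", "SECRET_KEY")
        .config("spark.hadoop.fs.s3a.path.style.access", "true")
        .getOrCreate()
    )

    # Point the reader at the single object; the path names one key, not a prefix.
    df = spark.read.text("s3a://test-bucket/testfile.csv")
    df.show(5, truncate=False)
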
On Fri, May 31, 2024 at 5:26 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Tell Spark to read from a single file
>
> data = spark.read.text("s3a://test-bucket/testfile.csv")
>
> This clarifies to Spark that you are dealing with a single file and avoids
> any bucket-like interpretation.
>
> HTH
>
> Mich Talebzadeh,
> Technologist | Architect | Data Engineer | Generative AI | FinCrime
> PhD, Imperial College London
> London, United Kingdom
>
> https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/
> https://en.everybodywiki.com/Mich_Talebzadeh
>
> *Disclaimer:* The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed. It is essential to note
> that, as with any advice, "one test result is worth one-thousand
> expert opinions" (Wernher von Braun).
>
>
> On Fri, 31 May 2024 at 09:53, Amin Mosayyebzadeh <mosayyebza...@gmail.com>
> wrote:
>
>> I will work on the first two possible causes.
>> For the third one, which I guess is the real problem: Spark treats the
>> testfile.csv object at the URL s3a://test-bucket/testfile.csv as a bucket
>> and tries to access _spark_metadata at the URL
>> s3a://test-bucket/testfile.csv/_spark_metadata.
>> testfile.csv is an object and should not be treated as a bucket, but I am
>> not sure how to prevent Spark from doing that.
>>
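
As a sanity check on the third cause above, given an existing SparkSession `spark`, one can ask the Hadoop FileSystem directly how s3a classifies the key and whether anything exists under the _spark_metadata path Spark is probing. This is only a debugging sketch: it goes through PySpark's private _jsc/_jvm handles rather than a supported API.

    # Query the Hadoop FileSystem behind s3a for the object in question.
    hadoop_conf = spark._jsc.hadoopConfiguration()
    Path = spark._jvm.org.apache.hadoop.fs.Path

    target = Path("s3a://test-bucket/testfile.csv")
    fs = target.getFileSystem(hadoop_conf)

    # Confirm the store reports the key as a file, not a directory/prefix.
    status = fs.getFileStatus(target)
    print("is file:", status.isFile(), "length:", status.getLen())

    # The path Spark probes under the object key; on a plain object this
    # should not exist.
    print("has _spark_metadata:",
          fs.exists(Path("s3a://test-bucket/testfile.csv/_spark_metadata")))
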