Hi users,
Does anyone here have experience writing Spark code that reads just the
last line of each text file in a directory, S3 bucket, etc.?
I am looking for a solution that doesn't require reading the whole file. I
basically wonder whether you can create a DataFrame/RDD using a file seek.
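For reference, here is one way such a thing could be sketched (not a definitive
implementation): list the files with the Hadoop FileSystem API, seek close to
the end of each file, decode only the tail bytes, and turn the collected
(path, last line) pairs into a DataFrame. This assumes line-delimited UTF-8
files whose last line fits in the tail buffer; the bucket path, buffer size,
and column names below are made up.

import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("last-lines").getOrCreate()
import spark.implicits._

// Seek near the end of the file and decode only the final bytes.
// Assumes UTF-8 text and that the last line fits inside `tailBytes`.
def lastLine(fs: FileSystem, file: Path, tailBytes: Int = 64 * 1024): String = {
  val len = fs.getFileStatus(file).getLen
  val start = math.max(0L, len - tailBytes)
  val in = fs.open(file)
  try {
    in.seek(start)
    val buf = new Array[Byte]((len - start).toInt)
    in.readFully(buf)
    new String(buf, "UTF-8").split("\n").map(_.trim)
      .filter(_.nonEmpty).lastOption.getOrElse("")
  } finally in.close()
}

// Hypothetical location; s3a:// paths work the same way once the S3A
// connector and credentials are configured.
val dir = new Path("s3a://my-bucket/logs/")
val fs  = dir.getFileSystem(spark.sparkContext.hadoopConfiguration)

// Driver-side loop over the listing; for a very large number of files
// the same per-file logic could be pushed into mapPartitions instead.
val rows = fs.listStatus(dir).filter(_.isFile)
  .map(s => (s.getPath.toString, lastLine(fs, s.getPath)))

val df = rows.toSeq.toDF("path", "last_line")
df.show(truncate = false)

The point is that only a getFileStatus call plus a small tail read happens per
file, never a full scan of the file contents.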
Dear Spark folks,
Is there a guideline somewhere on the density tipping point at which it makes
more sense to use a Spark ML dense vector vs. a sparse vector, with regard to
memory usage, for fairly large (image-processing) vectors?
My google-fu didn't turn up anything useful.
Thanks in advance
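For what it's worth, a rough rule of thumb rather than an official guideline:
a dense vector costs about 8 bytes per slot, while a sparse vector costs about
12 bytes per stored non-zero (4-byte index + 8-byte double), so sparse only
saves memory when the density is below roughly 2/3. Vector.compressed in
org.apache.spark.ml.linalg applies a heuristic along these lines, so (assuming
a reasonably recent Spark version where it is available) you can let Spark pick
the cheaper representation. The vector below is a made-up stand-in for an
image feature.

import org.apache.spark.ml.linalg.{Vector, Vectors}

// Mostly-zero million-element vector standing in for an image feature.
val values = Array.fill(1000000)(0.0)
values(42) = 1.0
val v: Vector = Vectors.dense(values)

// Rough cost model: dense ~ 8 * size bytes, sparse ~ 12 * nnz bytes,
// so sparse only wins below roughly 2/3 density.
val nnz = v.numNonzeros
val sparseWins = 12L * nnz < 8L * v.size

// `compressed` returns whichever representation uses less storage.
val packed = v.compressed
println(s"nnz = $nnz, sparse cheaper = $sparseWins, picked = ${packed.getClass.getSimpleName}")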