Please find the answers inline.
1) Can I apply predicate pushdown filters if I have data stored in S3, or
should they be used only while reading from DBs?
It can be applied on S3 if you store the data in Parquet, CSV, JSON, or Avro
format. It does not depend on the DB; it's supported in object store li
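
As a rough illustration of the answer above, here is a minimal PySpark sketch of a filtered Parquet read from S3. The helper name `read_filtered`, the `year` column, and the path are hypothetical, and it assumes an existing SparkSession that is already configured with an S3 connector. For Parquet, the scan node in the plan printed by `explain()` lists the condition under `PushedFilters` when pushdown applies.

```python
def read_filtered(spark, path, year):
    """Read a Parquet dataset and keep only the rows for one year.

    `spark` is an existing SparkSession; `path` and the `year` column
    are hypothetical placeholders used only for illustration.
    """
    df = spark.read.parquet(path)             # lazy: no data is read yet
    filtered = df.filter(df["year"] == year)  # simple comparison, a candidate for pushdown
    filtered.explain()                        # prints the physical plan
    return filtered
```

It could be called as `read_filtered(spark, "s3a://example-bucket/events/", 2020)`, where the bucket name is made up.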
Hi Tufan,
Thanks for the answers. However, by the second point, I meant to ask where
my code would reside. Will it be copied to all the executors, since the code
size would be small, or will it be maintained on the driver's side? I know
that the driver converts the code to a DAG and when an action is calle
Code is always distributed for any operation on a DataFrame or RDD. The size
of your code is irrelevant, except with respect to JVM memory limits. For most
jobs the entire application jar and all dependencies are put on the classpath
of every executor.
There are some exceptions but generally you should thi
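
To make the idea concrete on the Python side, here is a minimal PySpark sketch; note that it is not the JVM jar mechanism described above. In PySpark, a function passed to `map` is serialized on the driver and sent to the executors along with the tasks, so the function body runs there and not on the driver. The names `double` and `doubled_sum` are hypothetical, and `sc` is assumed to be an existing SparkContext.

```python
def double(x):
    # Plain function defined in the driver program; PySpark serializes it
    # and ships it to the executors that run the map tasks.
    return x * 2


def doubled_sum(sc, values):
    rdd = sc.parallelize(values)   # distribute the input across partitions
    return rdd.map(double).sum()   # map runs on executors; sum() is the action
```

Nothing is computed until `sum()` is called, which matches the point in the question that work starts only when an action is invoked.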