Hi all, I am running a program which connects to Amazon RDS and generates some data from S3 into an RDD. When I call rdd.collect and insert the results into RDS using JDBC, I get a "communication link failure" error. I tried inserting results into RDS from the master machine using both Python and the mysql client, and both worked. However, when I used Spark, the insertion was not successful. My questions are:
1) When I establish the connection with RDS before the RDD is generated, is this done on the master?
2) When I call rdd.collect, is the returned array on the master or on the slave nodes?
3) When I insert the results of rdd.collect, where does the insertion happen?

Thanks!
Bill