Hey Afshin,
Your approach 1 is far faster than approach 2.
Load speed improves further if you choose the DISTKEY and SORTKEY properly
on the tables being loaded.
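As a rough sketch of what setting those keys can look like: the snippet below only builds the CREATE TABLE text and does not connect to Redshift. The table name, the columns, and the choice of keys are all made up for illustration. Which column actually suits DISTKEY or SORTKEY depends on how your tables are joined and filtered.

```python
def build_create_table(table, columns, dist_key, sort_keys):
    """Build a Redshift CREATE TABLE statement with a DISTKEY and SORTKEY.

    columns is a list of (name, type) pairs; every key must name a column.
    """
    names = [name for name, _ in columns]
    for key in [dist_key, *sort_keys]:
        if key not in names:
            raise ValueError(f"{key!r} is not a column of {table}")
    col_defs = ",\n".join(f"    {name} {col_type}" for name, col_type in columns)
    return (
        f"CREATE TABLE {table} (\n{col_defs}\n)\n"
        f"DISTKEY ({dist_key})\n"
        f"SORTKEY ({', '.join(sort_keys)});"
    )


# Hypothetical table: distribute on the column used in joins,
# sort on the column used in range filters.
ddl = build_create_table(
    "page_events",
    [("user_id", "BIGINT"), ("event_time", "TIMESTAMP"), ("url", "VARCHAR(2048)")],
    dist_key="user_id",
    sort_keys=["event_time"],
)
print(ddl)
```

The generated statement would be run once, before the load, with whatever SQL client you already use.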
Thanks,
Aakash.
https://www.linkedin.com/in/aakash-basu-5278b363
On 24-Apr-2017 10:37 PM, "Afshin, Bardia"
wrote:
Redshift COPY is immensely faster than running INSERT statements. I
did some rough testing of loading data with INSERT and with COPY, and COPY is
so vastly superior that if speed matters at all to your
process you shouldn't even consider using INSERT.
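For reference, a minimal sketch of the kind of COPY statement this refers to. It only assembles the SQL text. The bucket, prefix, table name, and IAM role ARN are placeholders, and Parquet is assumed as the file format only because it is a common Spark output. Pointing COPY at a prefix that holds several files lets Redshift load them in parallel.

```python
def build_copy(table, s3_prefix, iam_role, file_format="PARQUET"):
    """Build a Redshift COPY statement that loads every file under an S3 prefix.

    Illustration only: values are interpolated as-is, with no escaping.
    """
    if not s3_prefix.startswith("s3://"):
        raise ValueError("s3_prefix must start with 's3://'")
    if file_format not in ("PARQUET", "CSV"):
        raise ValueError(f"unsupported format in this sketch: {file_format}")
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        f"FORMAT AS {file_format};"
    )


# All names below are hypothetical placeholders.
copy_sql = build_copy(
    "page_events",
    "s3://example-bucket/page_events/",
    "arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole",
)
print(copy_sql)
```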
On Mon, Apr 24, 2017 at 11:07
I wanted to reach out to the community to get an understanding of everyone's
experience with maximizing performance, that is, decreasing load time,
when loading multiple large datasets into Redshift.
Two approaches:
1. Spark writes files to S3, then Redshift COPY loads them from the S3 bucket.
2.