1. If you join on some specific field (e.g. user id, account id, or whatever), you can try partitioning the parquet files by that field; the join will then be more efficient (see the first sketch after this list).
2. Look at the Spark metrics for that particular join: how it performs, how many partitions it uses, and how large the shuffle is (second sketch below).
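A minimal PySpark sketch of point 1. The column name `account_id` and the paths are just placeholders, not from your setup; the idea is to write the data partitioned by the join/filter key so the reader can prune partitions later.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical input; replace the path with your own.
events = spark.read.parquet("/data/events_raw")

# Write the data partitioned by the join key so Spark can prune
# partitions when you later filter or join on that column.
events.write.partitionBy("account_id").mode("overwrite").parquet("/data/events_by_account")

# Reading back: only partitions matching the filter are scanned.
subset = spark.read.parquet("/data/events_by_account").where("account_id = 42")
```

And for point 2, one quick way to see what a given join is actually doing, again with made-up table names and the same assumed `account_id` key:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical inputs; adjust names/paths to your data.
events = spark.read.parquet("/data/events_by_account")
users = spark.read.parquet("/data/users")

joined = events.join(users, on="account_id")

# Physical plan: shows whether Spark chose a broadcast hash join or a
# sort-merge join with an Exchange (shuffle) step.
joined.explain(True)

# How many partitions the shuffle will produce (default is 200);
# per-stage shuffle read/write sizes are visible in the Spark UI.
print(spark.conf.get("spark.sql.shuffle.partitions"))
```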
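The explain output plus the shuffle read/write numbers in the Spark UI will tell you whether the join itself is the bottleneck or whether you are just short on resources.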
Can you share your approximate data size? These should all be valid use cases for Spark; I'm wondering whether you are providing enough resources.
Also - do you have specific expectations in terms of performance? What does "slow down" mean concretely?
For this use case I would personally favor Parquet over a DB, and the SQL/DataFrame API.