As I understand it, all transformations are thread-safe because a DataFrame is just an immutable description of the computation, so the case above should be fine. Just be careful with actions.
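To illustrate the reasoning, here is a toy model in plain Python. It is not Spark code: `Plan`, `filter`, `map` and `collect` are made-up names standing in for a DataFrame, two transformations and an action. The sketch only shows why an immutable description can be shared: each transformation returns a new object and leaves the shared one untouched, so two threads can derive different results from the same base.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Plan:
    """Hypothetical stand-in for a DataFrame: an immutable list of pending steps."""
    source: Tuple[int, ...]
    steps: Tuple[Callable, ...] = ()

    def filter(self, pred):
        # Transformation: builds a NEW plan, never mutates self.
        step = lambda rows: [r for r in rows if pred(r)]
        return Plan(self.source, self.steps + (step,))

    def map(self, fn):
        # Transformation: same idea, a new plan with one more step.
        step = lambda rows: [fn(r) for r in rows]
        return Plan(self.source, self.steps + (step,))

    def collect(self):
        # Action: only here is the described computation actually run.
        rows = list(self.source)
        for step in self.steps:
            rows = step(rows)
        return rows


# Created once in the main thread, then shared by both workers.
base = Plan(tuple(range(10)))


def evens(plan):
    return plan.filter(lambda x: x % 2 == 0).collect()


def squares(plan):
    return plan.map(lambda x: x * x).collect()


with ThreadPoolExecutor(max_workers=2) as pool:
    f1 = pool.submit(evens, base)
    f2 = pool.submit(squares, base)
    evens_result, squares_result = f1.result(), f2.result()
```

After both threads finish, `base` still has no steps attached, since neither thread could modify it. The model deliberately leaves out anything like unpersist or checkpointing, which change shared state and are where I would expect the care to be needed.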
On Sun, Feb 12, 2017 at 4:06 PM, Mendelson, Assaf <assaf.mendel...@rsa.com> wrote:
> Hi,
>
> I was wondering if dataframe is considered thread safe. I know the spark
> session and spark context are thread safe (and actually have tools to
> manage jobs from different threads) but the question is, can I use the same
> dataframe in both threads.
>
> The idea would be to create a dataframe in the main thread and then in two
> sub threads do different transformations and actions on it.
>
> I understand that some things might not be thread safe (e.g. if I
> unpersist in one thread it would affect the other. Checkpointing would
> cause similar issues), however, I can’t find any documentation as to what
> operations (if any) are thread safe.
>
> Thanks,
>
> Assaf.