Hi pesudo,

I posted a blog entry on this earlier, spark-dataframe-introduction
<http://litaotao.github.io/spark-dataframe-introduction?s=gmail>. In my own
work, I use Spark DataFrames (or RDDs) to run the computation over the full
datasets, then convert the result into a pandas DataFrame and do the data
visualization from there. Depending on the chart, you may also need
matplotlib or seaborn.
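
A rough sketch of that workflow, in case it helps. The column names (`symbol`, `price`) and the helper names (`aggregate_on_spark`, `plot_summary`, `demo`) are made up for illustration and are not from the blog post. The idea is to aggregate on the Spark side so that only a small result is handed to `toPandas()`, which collects everything onto the driver.

```python
import pandas as pd


def aggregate_on_spark(spark_df):
    """Do the heavy computation in Spark and return a small pandas DataFrame.

    Assumes spark_df has the hypothetical columns 'symbol' and 'price'.
    """
    summary = (
        spark_df.groupBy("symbol")
        .avg("price")  # Spark names the result column 'avg(price)'
        .withColumnRenamed("avg(price)", "avg_price")
    )
    # toPandas() collects the result to the driver, so keep it small.
    return summary.toPandas()


def plot_summary(pdf: pd.DataFrame, path: str = "avg_price.png") -> str:
    """Draw a bar chart from the pandas result and save it to 'path'."""
    import matplotlib

    matplotlib.use("Agg")  # headless backend, no display needed
    import matplotlib.pyplot as plt

    ax = pdf.sort_values("symbol").plot.bar(x="symbol", y="avg_price", legend=False)
    ax.set_ylabel("average price")
    ax.figure.tight_layout()
    ax.figure.savefig(path)
    plt.close(ax.figure)
    return path


def demo():
    """End-to-end run on a local Spark session (requires pyspark)."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[*]")
        .appName("spark-to-pandas")
        .getOrCreate()
    )
    rows = [("AAA", 10.0), ("AAA", 12.0), ("BBB", 20.0)]
    sdf = spark.createDataFrame(rows, ["symbol", "price"])
    pdf = aggregate_on_spark(sdf)
    plot_summary(pdf)
    spark.stop()
```

Calling `demo()` runs the whole path on a toy dataset. On real data, only the `aggregate_on_spark` step touches the cluster; seaborn can take the place of the pandas plotting call if you prefer its chart types.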
--
*___________________*
Quant | Engineer | Boy
*___________________*
*blog*: http://litaotao.github.io
<http://litaotao.github.io?utm_source=spark_mail>
*github*: www.github.com/litaotao