I ran into a "No space left on device" error in Zeppelin Spark when I tried to run the following:

```sql
CACHE TABLE temp_tbl AS
SELECT *
FROM (
  SELECT *, rank() OVER (PARTITION BY id ORDER BY year DESC) AS rank
  FROM table1
) v
WHERE v.rank = 1
```
`table1` is very large. I set `SPARK_LOCAL_DIRS` in spark-env.sh, `spark.local.dir` in spark-defaults.conf, and `spark.local.dir` in the Zeppelin Spark interpreter settings, all pointing to the /apps/spark_tmp folder. I can see some folders being created under /apps/spark_tmp, but `df -h` still shows /tmp running out of space. What am I missing? Thanks
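For reference, this is roughly what the settings look like (a sketch of my setup; the path is the one mentioned above, and the exact interpreter property name in the Zeppelin UI may differ by version):

```shell
# spark-env.sh — environment-variable form, read by the Spark launch scripts
export SPARK_LOCAL_DIRS=/apps/spark_tmp

# spark-defaults.conf — property form, read at SparkConf creation
# spark.local.dir /apps/spark_tmp

# Zeppelin Spark interpreter — property set in the interpreter settings UI:
#   spark.local.dir = /apps/spark_tmp
```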