Dear all, I have released a first version of an open-source library that uses Apache POI to read and write Excel files on Hadoop, Spark, etc.: https://snippetessay.wordpress.com/2017/01/08/readingwriting-excel-documents-with-the-hadoopoffice-library-on-hadoop-and-spark-first-release/
Feel free to comment or make suggestions via GitHub issues. Thank you!
Best regards