Hi All,

At my workplace we are starting to use Datasets in 1.6.1, and even more so with Spark 2.0, in place of DataFrames. I looked at the 1.6.1 documentation and then the 2.0 documentation, and it looks like not much time has been spent writing a Dataset guide/tutorial.

Preview docs: https://home.apache.org/~pwendell/spark-releases/spark-2.0.0-preview-docs/sql-programming-guide.html#creating-datasets
Spark master docs: https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md

I would like to contribute an improvement to those docs with more in-depth examples of creating and using Datasets (e.g. using $ to select columns). Is this of value, and if so, what should my next step be to get this going (create a JIRA, etc.)?

--
Pedro Rodriguez
PhD Student in Distributed Machine Learning | CU Boulder
R&D Data Science Intern at Oracle Data Cloud
UC Berkeley AMPLab Alumni

ski.rodrig...@gmail.com | pedrorodriguez.io | 909-353-4423
Github: github.com/EntilZha | LinkedIn: https://www.linkedin.com/in/pedrorodriguezscience