GitHub user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/146#issuecomment-37751624

Hey Michael, I really like the docs and API for this! I tried this out in spark-shell, though, and saw a few errors:

* The built-in SQL seems to be case-sensitive -- it complained about "select * from foo" but not "SELECT * FROM foo".
* When trying to do `[ExecutedQuery].rdd.collect()`, I got `NotSerializableException: SqlContext` (a guess at the cause is sketched below).

Maybe these are just due to the shell environment; I'm not sure whether they'd happen in standalone jobs.

Also, I have some comments on the API to make it more consistent with our other stuff (more later):

* Capitalize SQL in SQLContext, to match some of the other names where we capitalize acronyms.
* Instead of making registerAsTable an implicit conversion on RDDs, make it a method of SQLContext, so that the API looks the same in Java and Python. (There may be other places where we can do this.) A rough sketch of this shape follows below.
* Do we really want loadFile to infer the file type? It might be better to have loadParquetFile and allow other types in the future.

Finally, add ScalaDoc comments to all the public methods on things like SQLContext.
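For context on the second error, here's a minimal sketch of one common cause -- `ExecutedQuery`'s shape here is my assumption, not code from this PR. Anything a query result references gets pulled into the task closure when `.rdd.collect()` runs, so a back-reference to a non-serializable context fails serialization; marking that reference `@transient` is one usual way out:

```scala
// Hypothetical sketch -- ExecutedQuery's fields are assumptions, not the
// actual class in this PR. A back-reference to the (non-serializable)
// SQL context would be captured when tasks are shipped to executors;
// @transient keeps it out of the serialized closure.
class ExecutedQuery(
    @transient val context: AnyRef // assumed back-reference to the SQL context
) extends Serializable {
  // plan, output schema, rdd, etc. would live here
}
```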
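To make the registerAsTable and loadFile suggestions concrete, here's a rough sketch of the surface I have in mind -- the names (`registerRDDAsTable`, `loadParquetFile`) are placeholders, not a claim about what this PR actually defines:

```scala
import org.apache.spark.rdd.RDD

// Hypothetical sketch of the suggested API surface; names and signatures
// are placeholders rather than the PR's actual code.
class SQLContext(/* SparkContext, configuration, ... */) {
  /** Register an RDD under a table name so SQL queries can reference it.
    * A plain method (instead of an implicit conversion on RDD) keeps the
    * call identical in Scala, Java, and Python. */
  def registerRDDAsTable(rdd: RDD[_], tableName: String): Unit = ???

  /** Load a Parquet file explicitly rather than inferring the format;
    * other formats can get their own loaders later. */
  def loadParquetFile(path: String): RDD[_] = ???
}
```

With the methods on the context, the Java and Python wrappers can forward straight to them instead of each needing its own implicit-conversion machinery.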