Both of these make sense to add. Can you submit a pull request?

On Sun, Feb 7, 2016 at 3:29 PM, sim <s...@swoop.com> wrote:

> The more Spark code I write, the more I hit the same use cases where the
> Scala APIs feel a bit awkward. I'd love to understand if there are
> historical reasons for these and whether there is opportunity + interest to
> improve the APIs. Here are my top two:
> 1. registerTempTable() returns Unit, so it can't be chained
>
> def cachedDF(path: String, tableName: String) = {
>   val df = sqlContext.read.load(path).cache()
>   df.registerTempTable(tableName)
>   df
> }
>
> // vs. the chained form, which today returns Unit instead of the DataFrame:
>
> def cachedDF(path: String, tableName: String) =
>   sqlContext.read.load(path).cache().registerTempTable(tableName)
>
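Until registerTempTable itself returns the DataFrame, an implicit enrichment would give the same chaining. A minimal sketch of the idea, using a hypothetical stand-in DataFrame rather than the real Spark class (the names RichDataFrame and registerTempTableAs are illustrative, not Spark API):

```scala
object Chainable {
  // Hypothetical stand-in for org.apache.spark.sql.DataFrame,
  // just enough to show the chaining problem and the fix.
  case class DataFrame(path: String) {
    var registeredAs: Option[String] = None
    // Returns Unit, as the real registerTempTable does today
    def registerTempTable(name: String): Unit = { registeredAs = Some(name) }
  }

  // Enrichment: register the table, then return the DataFrame
  // so call sites can keep chaining.
  implicit class RichDataFrame(df: DataFrame) {
    def registerTempTableAs(name: String): DataFrame = {
      df.registerTempTable(name)
      df
    }
  }
}
```

With that in scope, the one-liner version of cachedDF works: load, cache, register, and still hand back the DataFrame.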
> 2. No toDF() implicit for creating a DataFrame from an RDD + schema
>
> val schema: StructType = ...
> val rdd = sc.textFile(...)
>   .map(...)
>   .aggregate(...)
> val df = sqlContext.createDataFrame(rdd, schema)
>
> // vs. the chained form, which needs a toDF(schema) implicit that doesn't exist yet:
>
> val schema: StructType = ...
> val df = sc.textFile(...)
>   .map(...)
>   .aggregate(...)
>   .toDF(schema)
>
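The missing implicit is a thin forwarder to createDataFrame. A sketch of its shape, again with hypothetical stand-ins for RDD, StructType, and DataFrame (and a local createDataFrame in place of the real sqlContext.createDataFrame):

```scala
object SchemaImplicits {
  // Hypothetical stand-ins for Spark's classes, not the real API.
  case class StructType(fieldNames: Seq[String])
  case class RDD[T](data: Seq[T])
  case class DataFrame(rows: Seq[Any], schema: StructType)

  // Stand-in for sqlContext.createDataFrame(rdd, schema)
  def createDataFrame[T](rdd: RDD[T], schema: StructType): DataFrame =
    DataFrame(rdd.data, schema)

  // The proposed sugar: rdd.toDF(schema) just forwards to createDataFrame,
  // so the call site stays a single method chain.
  implicit class RichRDD[T](rdd: RDD[T]) {
    def toDF(schema: StructType): DataFrame = createDataFrame(rdd, schema)
  }
}
```

In Spark itself this would live next to the existing toDF() in SQLContext's implicits, as an overload taking an explicit schema.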
> Have you encountered other examples where small, low-risk API tweaks could
> make common use cases more consistent + simpler to code?
>
> /Sim
> ------------------------------
> View this message in context: Scala API: simplifying common patterns
> <http://apache-spark-developers-list.1001551.n3.nabble.com/Scala-API-simplifying-common-patterns-tp16238.html>
> Sent from the Apache Spark Developers List mailing list archive
> <http://apache-spark-developers-list.1001551.n3.nabble.com/> at
> Nabble.com.
>
>
