Quickly reviewing the latest SQL Programming Guide
<https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md>
(on GitHub), I had a couple of quick questions:
1) Do we still need to instantiate the SQLContext as per
// sc is an existing SparkContext.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
Within the Spark 1.3 shell, sqlContext is already available, so we probably
do not need to make this call.
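(For a standalone application, outside the shell, the guide's snippet would still apply; a minimal sketch, assuming Spark 1.3 and a hypothetical app name:)

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// In the 1.3 spark-shell, `sc` and `sqlContext` are pre-created.
// In a standalone app you still construct them yourself:
val conf = new SparkConf().setAppName("sql-example")  // app name is illustrative
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
```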
2) Importing org.apache.spark.sql._ should bring in the SQL data types
(StructType, StructField, etc.) as well as Row:
// Import Spark SQL data types and Row.
import org.apache.spark.sql._
Currently with Spark 1.3 RC1, it appears org.apache.spark.sql._ only brings
in Row:
scala> import org.apache.spark.sql._
import org.apache.spark.sql._
scala> val schema =
     |   StructType(
     |     schemaString.split(" ").map(fieldName =>
     |       StructField(fieldName, StringType, true)))
<console>:25: error: not found: value StructType
         StructType(
But if I also import org.apache.spark.sql.types._:
scala> import org.apache.spark.sql.types._
import org.apache.spark.sql.types._
scala> val schema =
     |   StructType(
     |     schemaString.split(" ").map(fieldName =>
     |       StructField(fieldName, StringType, true)))
schema: org.apache.spark.sql.types.StructType =
  StructType(StructField(DeviceMake,StringType,true),
             StructField(Country,StringType,true))
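So, putting it together, the combination that appears to work under 1.3 RC1 is the following (schemaString's column names are taken from the transcript above):

```scala
import org.apache.spark.sql._        // brings in Row, SQLContext, DataFrame
import org.apache.spark.sql.types._  // brings in StructType, StructField, StringType

// Column names as seen in the REPL output above
val schemaString = "DeviceMake Country"

val schema = StructType(
  schemaString.split(" ").map(fieldName =>
    StructField(fieldName, StringType, nullable = true)))
```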
Wondering if this is by design, or whether a quick documentation / package
update is warranted.