We just merged a feature into master that lets you print the schema or view
it as a string (printSchema() and schemaTreeString on SchemaRDD).
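
For the Parquet case in the question below, a minimal sketch of how this might look. It assumes a spark-shell session (so `sc` already exists) running a build of master that includes the new method, and it reuses the "people.parquet" path from the original message:

```scala
// Sketch only: assumes spark-shell (where `sc` is predefined) on a build
// of master that contains the newly merged printSchema() method.
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)

// Parquet files carry their own schema, so the Person case class is not
// needed to load the data back.
val loadedPeopleRDD = sqlContext.parquetFile("people.parquet")

// Prints the column names and types to stdout.
loadedPeopleRDD.printSchema()
```

The string variant (schemaTreeString, as named above) should give the same text as a value instead of printing it. I have left it out of the sketch because the name may still change before a release.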

There is also this JIRA targeting 1.1 for presenting a nice programmatic API
for this information: https://issues.apache.org/jira/browse/SPARK-2179


On Wed, Jun 18, 2014 at 10:36 AM, Kevin Jung <itsjb.j...@samsung.com> wrote:

> Can I get schema information from SchemaRDD?
> For example,
>
> case class Person(name:String, Age:Int, Gender:String, Birth:String)
> val peopleRDD = sc.textFile("/sample/sample.csv").map(_.split(","))
>   .map(p => Person(p(0).toString, p(1).toInt, p(2).toString, p(3).toString))
> peopleRDD.saveAsParquetFile("people.parquet")
>
> (a few days later...)
>
> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
> import sqlContext._
> val loadedPeopleRDD = sqlContext.parquetFile("people.parquet")
> loadedPeopleRDD.registerAsTable("peopleTable")
>
> Someone who doesn't know the Person class can't know what columns and types
> this table has.
> Maybe they want to get schema information from loadedPeopleRDD.
> How can I do this?
>
