Hi list,

val dfs = spark
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map(
    "cluster" -> "helloCassandra",
    "spark.cassandra.connection.host" -> "127.0.0.1",
    "spark.cassandra.input.fetch.size_in_rows" -> "100000",
    "spark.cassandra.input.consistency.level" -> "ONE",
    "table" -> "cliks",
    "keyspace" -> "test",
    "pushdown" -> "true"))
  .load()
  .select("kafka")

(Note: the option key is "spark.cassandra.input.fetch.size_in_rows", not
"spark.input.fetch.size_in_rows", and I dropped "header"/"inferSchema",
which are CSV reader options with no effect on the Cassandra source.)

dfs.printSchema()

Is there any way to get this schema as JSON?
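For what it's worth, `StructType` can serialize itself to JSON, so `dfs.schema.json` gives the DataFrame's schema as a JSON string. If the goal is instead the schema of the JSON documents stored inside the kafka column, a minimal sketch, assuming Spark 2.2+ (where `spark.read.json` accepts a `Dataset[String]`) and the `dfs` and `spark` values from the snippet above:

```scala
import spark.implicits._

// The DataFrame's own schema, serialized as JSON:
val schemaJson   = dfs.schema.json        // compact form
val schemaPretty = dfs.schema.prettyJson  // indented form

// Infer a schema from the JSON strings stored in the kafka column:
// read.json can consume a Dataset[String] directly in Spark 2.2+.
val inferred = spark.read.json(dfs.select("kafka").as[String])
inferred.printSchema()
println(inferred.schema.json)
```

Note that this inference pass scans the data, so on a large table it may be worth sampling first (e.g. `dfs.limit(n)`) before calling `read.json`.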

Thanks in advance

On 22-01-2018 17:51, Sathish Kumaran Vairavelu wrote:
> You have to register a Cassandra table in spark as dataframes
>
>
> https://github.com/datastax/spark-cassandra-connector/blob/master/doc/14_data_frames.md
>
>
> Thanks
>
> Sathish
> On Mon, Jan 22, 2018 at 7:43 AM Conconscious <conconsci...@gmail.com
> <mailto:conconsci...@gmail.com>> wrote:
>
>     Hi list,
>
>     I have a Cassandra table with two fields; id bigint, kafka text
>
>     My goal is to read only the kafka field (that is a JSON) and infer the
>     schema
>
>     I have this skeleton code (not working):
>
>     sc.stop
>     import org.apache.spark._
>     import com.datastax.spark._
>     import org.apache.spark.sql.functions.get_json_object
>
>     import org.apache.spark.sql.functions.to_json
>     import org.apache.spark.sql.functions.from_json
>     import org.apache.spark.sql.types._
>
>     val conf = new SparkConf(true)
>     .set("spark.cassandra.connection.host", "127.0.0.1")
>     .set("spark.cassandra.auth.username", "cassandra")
>     .set("spark.cassandra.auth.password", "cassandra")
>     val sc = new SparkContext(conf)
>
>     val sqlContext = new org.apache.spark.sql.SQLContext(sc)
>     val df = sqlContext.sql("SELECT kafka from table1")
>     df.printSchema()
>
>     I think I have at least two problems: the keyspace is missing, the
>     table is not recognized, and it is certainly not going to infer the
>     schema from the text field.
>
>     I have a working solution for json files, but I can't "translate" this
>     to Cassandra:
>
>     import org.apache.spark.sql.SparkSession
>     val spark = SparkSession.builder().appName("Spark SQL basic
>     example").getOrCreate()
>     import spark.implicits._
>     val redf = spark.read.json("/usr/local/spark/examples/cqlsh_r.json")
>     redf.printSchema
>     redf.count
>     redf.show
>     redf.createOrReplaceTempView("clicks")
>     val clicksDF = spark.sql("SELECT * FROM clicks")
>     clicksDF.show()
>
>     My Spark version is 2.2.1 and Cassandra version is 3.11.1
>
>     Thanks in advance
>
>
>
>     ---------------------------------------------------------------------
>     To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>     <mailto:user-unsubscr...@spark.apache.org>
>