Hi Raymond,
Is your problem passing those 100 fields to the .toDF() method?
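If so, one common approach in Spark 1.6 is to avoid .toDF() with explicit field names entirely and let the spark-csv package (com.databricks:spark-csv) infer the schema. A minimal sketch, assuming a hypothetical file path and that the CSV has a header row:

```scala
import org.apache.spark.sql.SQLContext

// sc is the existing SparkContext (e.g. from spark-shell)
val sqlContext = new SQLContext(sc)

// spark-csv reads the file into a DataFrame directly; with
// header=true the 100 column names come from the first row,
// and inferSchema=true samples the data to pick column types.
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load("hdfs:///path/to/big.csv")   // hypothetical path

df.printSchema()
```

This keeps the driver from having to enumerate all 100 fields by hand, and the read itself is distributed, so file size is limited by the cluster, not the driver.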
Original message
From: Raymond Xie
Date: 31/12/2016 10:46 (GMT+08:00)
To: user@spark.apache.org
Subject: How to load a big csv to dataframe in Spark 1.6
Hello,
I s
Hi,
We can use CombineByKey to achieve this.
val finalRDD = tempRDD.combineByKey(
  (x: (Any, Any)) => x,                                 // createCombiner
  (acc: (Any, Any), x: (Any, Any)) => (acc, x),         // mergeValue
  (acc1: (Any, Any), acc2: (Any, Any)) => (acc1, acc2)) // mergeCombiners
finalRDD.collect.foreach(println)
(amazon,((book1, tech),(book2,tech)))
(barns&noble, (book,tech))
(eBa
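The nested-tuple accumulator above works but produces an awkward shape. A variant that accumulates each key's values into a List is usually easier to consume downstream; this is a sketch with made-up sample data standing in for tempRDD:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CombineByKeyExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("combineByKey").setMaster("local[*]"))

    // Hypothetical stand-in for tempRDD: (seller, (title, category))
    val tempRDD = sc.parallelize(Seq(
      ("amazon", ("book1", "tech")),
      ("amazon", ("book2", "tech")),
      ("barns&noble", ("book", "tech"))))

    // Collect all (title, category) pairs per key into a List:
    //  - createCombiner wraps the first value in a one-element List
    //  - mergeValue prepends within a partition
    //  - mergeCombiners concatenates lists across partitions
    val grouped = tempRDD.combineByKey(
      (v: (String, String)) => List(v),
      (acc: List[(String, String)], v: (String, String)) => v :: acc,
      (a: List[(String, String)], b: List[(String, String)]) => a ::: b)

    grouped.collect.foreach(println)
    sc.stop()
  }
}
```

Each output row is then a (key, List(...)) pair of uniform type, instead of a tuple whose nesting depth depends on how many values the key happened to have.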