Hi
Please use explode, which is designed to solve exactly this problem.
Consider the example below:
>>> s = ["ERN~58XX7~^EPN~5X551~|1000"]
>>> df = sc.parallelize(s).map(lambda t: t.split('|')).toDF(['phone','id'])
>>> df.registerTempTable("t")
>>> resDF = sqlContext.sql("select id,explode(phone)
Hi Pralabh,
Thanks for your help.
val xx = columnList.map(x => x -> 0).toMap
val opMap = dataFrame.rdd.flatMap { row =>
  columnList.foldLeft(xx) { case (y, col) =>
    val s = row.getAs[String](col).split("\\^").length
    if (y(col) < s)
      y.updated(col, s)
    else
      y
  }.toList
}
val colMaxSizeMap = opMap.group
Hi Nayan
Please find below a solution to your problem that works on Spark 2.
val spark =
SparkSession.builder().appName("practice").enableHiveSupport().getOrCreate()
val sc = spark.sparkContext
val sqlContext = spark.sqlContext
import spark.implicits._
val dataFrame =
sc.parallelize(List("ERN
If I have 2-3 values in a column, then I can easily separate them and create new
columns with the withColumn option.
But I am trying to do this in a loop, dynamically generating as many new columns
as the ^-separated values in the column require.
Can it be achieved this way?
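
The dynamic approach described above amounts to two passes: first find the largest number of ^-separated values in each column, then emit that many numbered columns per row. The sketch below illustrates the idea in plain Python on a list of dicts, not on a Spark DataFrame. The helper names max_parts and widen are hypothetical, and padding short rows with None is an assumption. In Spark the same two passes could presumably be written with split and getItem inside a loop of withColumn calls.

```python
def max_parts(rows, columns, sep="^"):
    """Pass 1: largest number of sep-separated values seen in each column."""
    return {
        c: max((len(r[c].split(sep)) for r in rows), default=0)
        for c in columns
    }


def widen(rows, columns, sep="^"):
    """Pass 2: replace each listed column with numbered columns (phone1, phone2, ...)."""
    sizes = max_parts(rows, columns, sep)
    out = []
    for r in rows:
        # Keep every column that is not being split.
        new_row = {k: v for k, v in r.items() if k not in columns}
        for c in columns:
            parts = r[c].split(sep)
            for i in range(sizes[c]):
                # Assumption: rows with fewer values are padded with None.
                new_row[f"{c}{i + 1}"] = parts[i] if i < len(parts) else None
        out.append(new_row)
    return out


# Sample row taken from the original question.
rows = [{"phone": "ERN~58XX7~^EPN~5X551~", "contact": "C~MXXX~MSO~^CAxxE~~3XXX5"}]
wide = widen(rows, ["phone", "contact"])
```

Here wide[0] has the keys phone1, phone2, contact1 and contact2, which matches the column layout asked for in the original question.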
> On 17-Jul-2017, at 3:29
You are looking for the explode function.
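
Note that explode works row-wise: it turns an array-valued column into one output row per element, so it increases the number of rows rather than the number of columns. The plain-Python sketch below only mimics that behaviour on a list of dicts. explode_rows is a hypothetical helper, not a Spark API, and in Spark the string would first have to be split into an array (with ^ escaped, since split takes a regular expression).

```python
def explode_rows(rows, column, sep="^"):
    """Yield one output row per sep-separated value in `column`."""
    for row in rows:
        for part in row[column].split(sep):
            # Copy the other fields unchanged; only `column` differs per output row.
            yield {**row, column: part}


# Sample values taken from the first reply in this thread.
rows = [{"id": "1000", "phone": "ERN~58XX7~^EPN~5X551~"}]
exploded = list(explode_rows(rows, "phone"))
```

The single input row becomes two rows that share the same id, one per phone value.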
On Mon, 17 Jul 2017 at 4:25 am, nayan sharma
wrote:
> I have a DataFrame in which some columns contain multiple values, always
> separated by ^
>
> phone|contact|
> ERN~58XX7~^EPN~5X551~|C~MXXX~MSO~^CAxxE~~3XXX5|
>
> phone1|phone2|contact1|contac