I am not sure why you need to create an RDD first. You can create a
DataFrame directly from the CSV file, for instance:
spark.read.format("csv").option("header","true").schema(yourSchema).load(ftpUrl)
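
A slightly fuller sketch of the same idea, in case it helps. Only the "id" column is visible in your schema, so the "name" column here is made up for illustration, as are the object name and the local master setting. Adjust all of them to your job.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object ReadCsvDirectly {
  // Hypothetical schema: "id" comes from the thread, "name" is an assumption.
  val yourSchema: StructType = StructType(Seq(
    StructField("id", StringType, nullable = true),
    StructField("name", StringType, nullable = true)
  ))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("read-csv-directly")
      .master("local[*]") // assumption: local run, drop this when using spark-submit
      .getOrCreate()

    // Pass the FTP (or any other supported) URL as the first argument.
    val ftpUrl = args(0)

    // The CSV reader parses lines and fields itself, so no RDD step is needed.
    val df = spark.read
      .format("csv")
      .option("header", "true") // treat the first line of each file as column names
      .schema(yourSchema)
      .load(ftpUrl)

    df.printSchema()
    spark.stop()
  }
}
```

Whether the ftp:// scheme is readable this way depends on the Hadoop filesystem configuration of your cluster, so treat that part as something to verify.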
-- ND
On 8/5/21 3:14 AM, igyu wrote:
val ftpUrl ="ftp://test:test@ip:21/upload/test/_temporary/
Maybe this link will help you.
https://stackoverflow.com/questions/41898144/convert-rddstring-to-rddrow-to-dataframe-spark-scala
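
If you do want to keep the RDD route, the conversion could look roughly like the sketch below. It is only an illustration under a few assumptions: the helper name toRows, the two-column schema, and the local master are invented here, and the naive split on "," does not handle quoted fields. Note that wholeTextFiles yields one (path, content) pair per file, so the content has to be split into lines before it is split into fields.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object RddToDataFrame {
  // Hypothetical helper: split one file's content into lines, then each line into a Row.
  // The -1 limit keeps trailing empty fields, so the column count stays stable.
  def toRows(content: String): Seq[Row] =
    content.split("\\r?\\n").toSeq
      .filter(_.nonEmpty)
      .map(line => Row.fromSeq(line.split(",", -1).toSeq))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rdd-to-dataframe")
      .master("local[*]") // assumption: local run for illustration
      .getOrCreate()

    // Hypothetical schema; every Row must have as many fields as the schema has columns.
    val schema = StructType(Seq(
      StructField("id", StringType, nullable = true),
      StructField("name", StringType, nullable = true)
    ))

    // wholeTextFiles returns (path, content) pairs; only the content is used here.
    val rows = spark.sparkContext
      .wholeTextFiles(args(0))
      .flatMap { case (_, content) => toRows(content) }

    val df = spark.createDataFrame(rows, schema)
    df.show()
    spark.stop()
  }
}
```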
On Thu, Aug 5, 2021 at 12:46 PM igyu wrote:
> val ftpUrl =
> "ftp://test:test@ip:21/upload/test/_temporary/0/_temporary/task_2019124756_0002_m_00_0/*";
> val
val ftpUrl =
"ftp://test:test@ip:21/upload/test/_temporary/0/_temporary/task_2019124756_0002_m_00_0/*";
val rdd = spark.sparkContext.wholeTextFiles(ftpUrl)
val value = rdd.map(_._2).map(csv=>csv.split(",").toSeq)
val schemas = StructType(List(
new StructField("id", DataTypes.Strin