Re: Entire XML data as one of the column in DataFrame

2016-08-21 Thread Hyukjin Kwon
I can't say this is the best way to do it, but my first thought is as follows. Create two DataFrames:
sc.hadoopConfiguration.set(XmlInputFormat.START_TAG_KEY, s"")
sc.hadoopConfiguration.set(XmlInputFormat.END_TAG_KEY, s"")
sc.hadoopConfiguration.set(XmlInputFormat.ENCODING_KEY, "UTF-8")
val strXmlDf = sc
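
The reply is cut off in this archive view, so the following is only a rough sketch of the underlying idea, not the author's code. It uses plain Scala strings rather than Spark or spark-xml, and shows how a start tag and an end tag can delimit each record so that the raw XML is kept next to the fields extracted from it. The `<record>`, `<id>`, and `<name>` tags, the `Row` case class, and the helper names are all assumptions made for illustration.

```scala
import java.util.regex.Pattern

// Plain-Scala illustration only; nothing here is Spark or spark-xml API.
object RawXmlColumnSketch {
  // Hypothetical row shape: two extracted fields plus the untouched record text.
  final case class Row(id: String, name: String, rawXml: String)

  // Collect every substring that begins with startTag and ends with endTag.
  def splitRecords(xml: String, startTag: String, endTag: String): List[String] = {
    @annotation.tailrec
    def loop(from: Int, acc: List[String]): List[String] = {
      val start = xml.indexOf(startTag, from)
      if (start < 0) acc.reverse
      else {
        val end = xml.indexOf(endTag, start + startTag.length)
        if (end < 0) acc.reverse // unterminated record: stop scanning
        else {
          val stop = end + endTag.length
          loop(stop, xml.substring(start, stop) :: acc)
        }
      }
    }
    loop(0, Nil)
  }

  // Text of the first <tag>...</tag> element in the record, or "" if absent.
  def field(record: String, tag: String): String = {
    val quoted = Pattern.quote(tag)
    val pattern = s"(?s)<$quoted>(.*?)</$quoted>".r
    pattern.findFirstMatchIn(record).map(_.group(1).trim).getOrElse("")
  }

  // Pair each record's extracted fields with its full raw XML.
  def toRows(xml: String): List[Row] =
    splitRecords(xml, "<record>", "</record>")
      .map(r => Row(field(r, "id"), field(r, "name"), r))

  def main(args: Array[String]): Unit = {
    // Hypothetical sample input.
    val sample =
      "<records>" +
        "<record><id>1</id><name>a</name></record>" +
        "<record><id>2</id><name>b</name></record>" +
        "</records>"
    toRows(sample).foreach(println)
  }
}
```

In a Spark job the same pairing would more likely come from joining a DataFrame of raw record strings with the parsed DataFrame on a unique key such as id, but that part of the reply is not visible here.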

Entire XML data as one of the column in DataFrame

2016-08-21 Thread srikanth.jella
Hello Experts, I’m using the spark-xml package, which automatically infers my schema and creates a DataFrame. I’m extracting a few fields such as id and name (which are unique) from the XML below, but my requirement is to store the entire XML in one of the columns as well. I’m writing this data to AVRO hive