Re: NumberFormatException while reading and split the file

2018-04-04 Thread utkarsh_deep
Response to the 1st approach: When you do spark.read.text("/xyz/a/b/filename") it returns a DataFrame, and calling .rdd on it gives you an RDD[Row]. So when you use map, your function gets a Row as its parameter, i.e. ip in your code. Therefore you must use the Row methods to access its membe

Re: NumberFormatException: For input string: "0.00000"

2016-09-19 Thread Hyukjin Kwon
This does not seem to be an issue in Spark. Does "CSVParser" work fine without Spark on this data? BTW, there seems to be something wrong with your email address, so I am sending this again. On 20 Sep 2016 8:32 a.m., "Hyukjin Kwon" wrote: > This does not seem to be an issue in Spark. Does "CSVParser" work fine without

Re: NumberFormatException: For input string: "0.00000"

2016-09-19 Thread Hyukjin Kwon
This does not seem to be an issue in Spark. Does "CSVParser" work fine without Spark on this data? On 20 Sep 2016 2:15 a.m., "Mohamed ismail" wrote: > Hi all > > I am trying to read: > > sc.textFile(DataFile).mapPartitions(lines => { > val parser = new CSVParser(",") >

Re: NumberFormatException

2014-12-16 Thread Imran Rashid
Wow, really weird. My intuition is the same as everyone else's: some unprintable character. Here are a couple more debugging tricks I've used in the past: //set up an accumulator to catch the bad rows as a side-effect val nBadRows = sc.accumulator(0) val nGoodRows = sc.accumulator(0) val badRows =
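The idea of catching bad rows as a side effect can be sketched without a cluster. The following is a minimal plain-Scala sketch, not the code from the message above (which is cut off): it uses ordinary collections in place of Spark accumulators, and the names BadRowReport, parsePair and separate are hypothetical, introduced only for illustration.

```scala
import scala.util.{Failure, Success, Try}

object BadRowReport {
  // Parse one "a,b" line into a pair of Ints; any exception is captured in the Try.
  def parsePair(line: String): Try[(Int, Int)] = Try {
    val fields = line.split(",")
    (fields(0).trim.toInt, fields(1).trim.toInt)
  }

  // Return the successfully parsed pairs together with the raw lines that failed,
  // so the failing input can be inspected instead of killing the whole job.
  def separate(lines: Seq[String]): (Seq[(Int, Int)], Seq[String]) = {
    val results = lines.map(line => (line, parsePair(line)))
    val good = results.collect { case (_, Success(pair)) => pair }
    val bad = results.collect { case (line, Failure(_)) => line }
    (good, bad)
  }
}
```

Calling BadRowReport.separate on a small sample of the input gives both the parsed values and the offending lines, which is usually enough to see what the parser is choking on.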

Re: NumberFormatException

2014-12-15 Thread Akhil Das
There could be some other character, such as a space or ^M. You could try the following to see the actual row: val newstream = datastream.map(row => { try{ val strArray = row.trim().split(",") (strArray(0).toInt, strArray(1).toInt) //Instead try this //*(strArray(0).trim(
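To illustrate why a trailing ^M (carriage return) matters, here is a small self-contained sketch in plain Scala. The sample line and the names TrimDemo, parseStrict and parseTrimmed are assumptions made for the example; they are not taken from the thread.

```scala
import scala.util.Try

object TrimDemo {
  // Hypothetical input: a line from a file with Windows line endings
  // that was split on "\n" only, leaving a trailing carriage return.
  val rawLine = "12,34\r"

  // Converts the fields as-is; "34\r".toInt throws NumberFormatException.
  def parseStrict(line: String): Try[(Int, Int)] = Try {
    val f = line.split(",")
    (f(0).toInt, f(1).toInt)
  }

  // Trims each field first, which removes the trailing carriage return.
  def parseTrimmed(line: String): Try[(Int, Int)] = Try {
    val f = line.split(",").map(_.trim)
    (f(0).toInt, f(1).toInt)
  }
}
```

Here parseStrict(rawLine) fails while parseTrimmed(rawLine) succeeds. Note that trim only strips characters up to U+0020, so it will not help with characters such as a byte-order mark or a non-breaking space.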

Re: NumberFormatException

2014-12-15 Thread Harihar Nahak
Hi Yu, Try this: val data = csv.map( line => line.split(",").map(elem => elem.trim)) //lines in rows data.map( rec => (rec(0).toInt, rec(1).toInt)) to convert to integers. On 16 December 2014 at 10:49, yu [via Apache Spark User List] < ml-node+s1001560n20694...@n3.nabble.com> wrote: > > He

Re: NumberFormatException

2014-12-15 Thread Sean Owen
That certainly looks surprising. Are you sure there are no unprintable characters in the file? On Mon, Dec 15, 2014 at 9:49 PM, yu wrote: > The exception info is: > 14/12/15 15:35:03 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 > (TID 0, h3): java.lang.NumberFormatException: For inpu
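One way to check for unprintable characters is to make them visible before parsing. The sketch below is an assumption-laden illustration in plain Scala: the helper name reveal and the U+XXXX rendering are choices made for this example, not something proposed in the thread.

```scala
object ShowHidden {
  // Replace every character outside printable ASCII with a visible <U+XXXX> tag,
  // so that carriage returns, byte-order marks and similar show up in logs.
  def reveal(s: String): String =
    s.map { c =>
      if (c >= ' ' && c <= '~') c.toString
      else "<U+%04X>".format(c.toInt)
    }.mkString
}
```

Printing ShowHidden.reveal(field) for the field named in the exception message shows exactly which characters the string contains, including ones a terminal would otherwise hide.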