Good to hear it helps.

On Fri, Nov 18, 2016 at 1:52 AM, Muhammad Rezaul Karim <reza_cse...@yahoo.com> wrote:
> Hi Shim,
>
> Now it works perfectly. Thank you so much. Actually, I come from a Java
> background and am learning Scala.
>
> Thanks and Regards,
> ---------------------------------
> *Md. Rezaul Karim*
> PhD Researcher, Insight Centre for Data Analytics
> National University of Ireland Galway
> *E-mail:* rezaul.ka...@insight-centre.org
> *Web:* www.insight-centre.org
> *Phone:* +353892311519
>
>
> On Thursday, November 17, 2016 2:00 PM, Hyung Sung Shim <hss...@nflabs.com> wrote:
>
> Hello Muhammad.
>
> Please check your bank-full.csv file first; you can also filter out rows by
> item length in your Scala code, for example:
>
> *val bank = bankText.map(s => s.split(";")).filter(s => s.size > 5).filter(s => s(0) != "\"age\"")*
>
> Hope this helps.
>
>
> 2016-11-17 21:26 GMT+09:00 Dayong <will...@gmail.com>:
>
> Try to debug your code in an IDE. You should look at your array s, since it
> complains about an array index.
>
> Thanks,
> Wd
>
>
> On Nov 16, 2016, at 10:44 PM, Muhammad Rezaul Karim <reza_cse...@yahoo.com> wrote:
>
> Hi All,
>
> I have the following Scala code (taken from
> https://zeppelin.apache.org/docs/0.6.2/quickstart/tutorial.html#data-retrieval)
> that deals with the sample Bank-details data:
>
> ----------------------------------------------------------------------
>
> val bankText = sc.textFile("/home/asif/zeppelin-0.6.2-bin-all/bin/bank-full.csv")
>
> case class Bank(age: Integer, job: String, marital: String, education: String, balance: Integer)
>
> // split each line, filter out the header (starts with "age"), and map it into the Bank case class
> val bank = bankText.map(s => s.split(";")).filter(s => s(0) != "\"age\"").map(
>   s => Bank(s(0).toInt,
>     s(1).replaceAll("\"", ""),
>     s(2).replaceAll("\"", ""),
>     s(3).replaceAll("\"", ""),
>     s(5).replaceAll("\"", "").toInt
>   )
> )
>
> // convert to DataFrame and create a temporary table
> bank.toDF().registerTempTable("bank")
>
> ----------------------------------------------------------------------
>
> The above code segment runs successfully. However, when I try to execute the
> following line of code: *bank.collect()*, I get the following error:
>
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 1
> in stage 6.0 failed 1 times, most recent failure: Lost task 1.0 in stage
> 6.0 (TID 7, localhost): java.lang.ArrayIndexOutOfBoundsException: 2
>
> at $anonfun$3.apply(<console>:91)
> at $anonfun$3.apply(<console>:89)
>
>
> Moreover, I cannot execute the SQL queries below; I get the same error
> message (i.e., ArrayIndexOutOfBoundsException: 2):
>
> 1. %sql select age, count(1) from bank where age < 30 group by age order by age
> 2. %sql select age, count(1) from bank where age < ${maxAge=30} group by age order by age
> 3. %sql select age, count(1) from bank where marital="${marital=single,single|divorced|married}" group by age order by age
>
> *Note: However, when I execute the following SQL statements, I do not get any error:*
>
> 1. %sql select age from bank
> 2. %sql select * from bank
>
> I don't understand what I am doing wrong here! Could someone please help me
> resolve it?
>
>
> Thanks and Regards,
> ---------------------------------
> *Md. 
Rezaul Karim*
> PhD Researcher, Insight Centre for Data Analytics
> National University of Ireland Galway
> *E-mail:* rezaul.karim@insight-centre.org
> *Web:* www.insight-centre.org
> *Phone:* +353892311519
>
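
A minimal sketch of the fix Shim suggests above, combining the length filter with the tutorial's pipeline. It assumes the semicolon-delimited bank-full.csv from the Zeppelin tutorial and the Zeppelin Spark interpreter (which already provides sc and the toDF() implicits); the file path is only a placeholder.

----------------------------------------------------------------------

case class Bank(age: Integer, job: String, marital: String, education: String, balance: Integer)

// sc is provided by the Zeppelin Spark interpreter
val bankText = sc.textFile("/path/to/bank-full.csv")  // placeholder path

val bank = bankText
  .map(s => s.split(";"))
  .filter(s => s.size > 5)            // skip malformed rows too short to index s(5)
  .filter(s => s(0) != "\"age\"")     // skip the header row
  .map(s => Bank(
    s(0).toInt,
    s(1).replaceAll("\"", ""),
    s(2).replaceAll("\"", ""),
    s(3).replaceAll("\"", ""),
    s(5).replaceAll("\"", "").toInt
  ))

// with the length guard in place, bank.collect() and the %sql queries over
// the "bank" temp table no longer hit ArrayIndexOutOfBoundsException
bank.toDF().registerTempTable("bank")

----------------------------------------------------------------------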