Re: UnicodeDecodeError in zeppelin 0.7.1

2017-06-19 Thread Meethu Mathew
_csv(training_data, header=None, delimiter=delimiter, error_bad_lines=False, usecols=[label_column, text_column], names=['label','msg']).dropna() > - new_training['processed_msg'] = textPreProcessor(new_training['msg']) > > This python co

Re: UnicodeDecodeError in zeppelin 0.7.1

2017-04-20 Thread Meethu Mathew
the unicode function. Hope the problem is clear now. Regards, Meethu Mathew On Fri, Apr 21, 2017 at 3:07 AM, Felix Cheung wrote: > And are they running with the same Python version? What is the Python > version? > > _ > From: moon soo Lee > Sent:

Re: UnicodeDecodeError in zeppelin 0.7.1

2017-04-20 Thread Felix Cheung
And are they running with the same Python version? What is the Python version? From: moon soo Lee <m...@apache.org> Sent: Thursday, April 20, 2017 11:53 AM Subject: Re: UnicodeDecodeError in zeppelin 0.7.1 To: <users@zeppelin.apache.org>

Re: UnicodeDecodeError in zeppelin 0.7.1

2017-04-20 Thread moon soo Lee
Hi, 0.7.1 didn't change any encoding type as far as I know. One difference is that the 0.7.1 official artifact was built with JDK8 while 0.7.0 was built with JDK7 (we'll use JDK7 to build the upcoming 0.7.2 binary). But I'm not sure that can change the encoding type in pyspark and spark. Do you have exactly

UnicodeDecodeError in zeppelin 0.7.1

2017-04-19 Thread Meethu Mathew
Hi, I just migrated from zeppelin 0.7.0 to zeppelin 0.7.1 and I am facing this error while creating an RDD (in pyspark). UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: invalid start byte. I was able to create the RDD without any error after adding use_unicode=False as fo
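
For readers following the thread, here is a minimal sketch in plain Python (no Spark or Zeppelin needed) of why byte 0x80 at position 0 triggers this error, and why keeping the data as raw bytes avoids it. It is not the poster's code: the sample bytes and the helper name try_decode are made up for illustration. In PySpark, SparkContext.textFile accepts a use_unicode argument that skips the UTF-8 decoding step; which call the poster actually changed is cut off in the snippet above.

```python
# Hypothetical sample data: 0x80 is a UTF-8 continuation byte,
# so it cannot legally begin a character sequence.
raw = b"\x80abc"


def try_decode(data):
    """Attempt a strict UTF-8 decode and report the failure reason if any."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return "error: " + exc.reason


# Strict decoding fails, mirroring the error reported in the thread.
decode_result = try_decode(raw)

# Option 1: leave the data as bytes (roughly the effect of
# use_unicode=False in PySpark) and decode later, if at all.
kept_as_bytes = raw

# Option 2: decode leniently, replacing undecodable bytes with U+FFFD.
lenient = raw.decode("utf-8", errors="replace")

# Option 3: decode with a single-byte codec if the data is known to use one.
# latin-1 maps every byte to a code point, so it never raises.
latin1 = raw.decode("latin-1")

print(decode_result)
print(repr(kept_as_bytes), repr(lenient), repr(latin1))
```

Which option fits depends on what the input really contains. If the file is binary or in a non-UTF-8 encoding, keeping bytes and decoding explicitly with the correct codec is usually safer than replacing characters.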