new_training = pd.read_csv(training_data, header=None, delimiter=delimiter,
                           error_bad_lines=False,
                           usecols=[label_column, text_column],
                           names=['label', 'msg']).dropna()
new_training['processed_msg'] = textPreProcessor(new_training['msg'])

This python code ... the unicode function.
Hope the problem is clear now.
Regards,
Meethu Mathew
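(Side note, not part of the original mail: the decode failure in the traceback can be reproduced in a plain Python 2 shell with the same 0x80 byte. The snippet below is only an illustration of why a unicode() call fails on such input.)

    # Illustrative reproduction (Python 2): a lone 0x80 byte is not valid
    # UTF-8, so decoding it raises the same error reported in this thread.
    try:
        unicode('\x80', 'utf8')   # the unicode() builtin attempts a utf-8 decode
    except UnicodeDecodeError as e:
        print(e)   # 'utf8' codec can't decode byte 0x80 in position 0: invalid start byte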
On Fri, Apr 21, 2017 at 3:07 AM, Felix Cheung wrote:
> And are they running with the same Python version? What is the Python
> version?
And are they running with the same Python version? What is the Python version?
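(Illustrative, not from the thread: one quick way to answer the Python-version question from inside a Zeppelin pyspark paragraph is sketched below; sc is the SparkContext that Zeppelin pre-creates for the paragraph.)

    # Print the Python used by the pyspark interpreter and the version
    # PySpark itself records on the SparkContext.
    import sys
    print(sys.version)
    print(sc.pythonVer)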
_
From: moon soo Lee <m...@apache.org>
Sent: Thursday, April 20, 2017 11:53 AM
Subject: Re: UnicodeDecodeError in zeppelin 0.7.1
To: <users@zeppelin.apache.org>
Hi,
0.7.1 didn't change any encoding type as far as I know.
One difference is that the 0.7.1 official artifact was built with JDK8 while
0.7.0 was built with JDK7 (we'll use JDK7 to build the upcoming 0.7.2 binary).
But I'm not sure that could change the pyspark and spark encoding behavior.
Do you have exactly
Hi,
I just migrated from zeppelin 0.7.0 to zeppelin 0.7.1 and I am facing this
error while creating an RDD (in pyspark):

    UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: invalid start byte

I was able to create the RDD without any error after adding
use_unicode=False as follows:
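(The original snippet is truncated in the archive; below is only a rough sketch of what the use_unicode=False change typically looks like, assuming the RDD comes from sc.textFile and using an illustrative path.)

    # Illustrative sketch: with use_unicode=False, sc.textFile yields byte
    # strings instead of unicode objects, so PySpark skips the utf-8 decode
    # that raises the error above.
    rdd = sc.textFile("/data/training_data.csv", use_unicode=False)
    print(rdd.first())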