RE: removing header from csv file

Mishra, Abhishek Tue, 26 Apr 2016 23:05:28 -0700

You should be doing something like this:


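# sc is the SparkContext (already defined in the pyspark shell)
data = sc.textFile('file:///path1/path/test1.csv')
header = data.first()  # extract the header line
# print(header)
data = data.filter(lambda x: x != header)  # drop every line equal to the header
# print(data.take(5))

Hope it helps.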

Sincerely,
Abhishek
+91-7259028700

From: nihed mbarek [mailto:nihe...@gmail.com]
Sent: Wednesday, April 27, 2016 11:29 AM
To: Divya Gehlot
Cc: Ashutosh Kumar; user @spark
Subject: Re: removing header from csv file

You can add a filter on a string that you are sure appears only in the header.
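
A minimal sketch of that idea, assuming the header contains a column name that never occurs in the data rows. The marker "customer_id" and the helper name is_data_line are made up for illustration; the Spark call is left commented out because it needs a live SparkContext `sc`.

```python
# Hypothetical marker: a string assumed to occur only in the header line.
HEADER_MARKER = "customer_id"


def is_data_line(line, marker=HEADER_MARKER):
    """Return True for lines that do not contain the header-only marker."""
    return marker not in line


# Usage with an existing SparkContext `sc` (not created here):
# data = sc.textFile('file:///path1/path/test1.csv').filter(is_data_line)
```

Note that this drops any row containing the marker, so it is only safe when the marker truly cannot appear in the data.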

On Wednesday, 27 April 2016, Divya Gehlot <divya.htco...@gmail.com> wrote:
Yes, you can remove the header by removing the first row.

You can use first() or head() to do that.
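
One possible way to drop only the first row, sketched in PySpark terms: skip one line in the first partition. This assumes a single input file, so that the header sits at the start of partition 0. The function name is made up for illustration, and the Spark calls are commented out because they need a live SparkContext `sc`.

```python
from itertools import islice


def drop_first_line_of_first_partition(index, lines):
    """Skip one line in partition 0; pass every other partition through.

    Intended as the function argument to RDD.mapPartitionsWithIndex,
    which calls it with (partition index, iterator over the partition).
    """
    return islice(lines, 1, None) if index == 0 else lines


# Usage with an existing SparkContext `sc` (not created here):
# data = sc.textFile('file:///path1/path/test1.csv')
# data = data.mapPartitionsWithIndex(drop_first_line_of_first_partition)
```

Unlike comparing each line with the header, this removes exactly one line and keeps any later row that happens to equal the header.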


Thanks,
Divya

On 27 April 2016 at 13:24, Ashutosh Kumar <kmr.ashutos...@gmail.com> wrote:
I see there is a library, spark-csv, which can be used for removing the header
and processing CSV files, but it seems to work with SQLContext only. Is there a
way to remove the header from CSV files without SQLContext?
Thanks
Ashutosh



--

M'BAREK Med Nihed,
Fedora Ambassador, TUNISIA, Northern Africa
http://www.nihed.com

