Re: How to properly read the first number lines of file into a RDD

2015-11-17 Thread Zhiliang Zhu
Thanks a lot for your reply. I have also worked it out in some other ways. In fact, at first I was thinking about using filter to do it, but that failed.

On Monday, November 9, 2015 9:52 PM, Akhil Das wrote:
There are multiple ways to achieve this: 1. Read the N lines from the driver and th

Re: How to properly read the first number lines of file into a RDD

2015-11-09 Thread Akhil Das
There are multiple ways to achieve this:
1. Read the N lines from the driver and then do a sc.parallelize(nlines) to create an RDD out of them.
2. Create an RDD from the whole file (N+M lines), do a take(N) on it, and then broadcast or parallelize the returned list.
3. Something like this if the file is in HDFS: val n_f =
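
For reference, options 1 and 2 above could be sketched roughly as follows. This is only a sketch under assumptions, not the code from the original message: the object name FirstLines and the helpers viaDriver and viaTake are made up for illustration, and option 1 assumes the file is readable from the driver's local filesystem.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import scala.io.Source

// Hypothetical helper object; the names here are illustrative only.
object FirstLines {

  // Option 1: read the first n lines on the driver, then distribute them.
  // Assumes `path` points to a file on the driver's local filesystem.
  def viaDriver(sc: SparkContext, path: String, n: Int): RDD[String] = {
    val src = Source.fromFile(path)
    try {
      // toList forces the lines to be read before the source is closed.
      val lines = src.getLines().take(n).toList
      sc.parallelize(lines)
    } finally {
      src.close()
    }
  }

  // Option 2: load the whole file as an RDD, take(n) back to the driver,
  // then parallelize the returned array into a new, smaller RDD.
  def viaTake(sc: SparkContext, path: String, n: Int): RDD[String] = {
    val firstN: Array[String] = sc.textFile(path).take(n)
    sc.parallelize(firstN.toSeq)
  }
}
```

Both helpers bring the N lines through the driver, so they are probably only reasonable when N is small enough to fit comfortably in driver memory.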