Yep Michael Quinlan, it's working as suggested by Hao Ren.
Thanks to you and Hao Ren.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/removing-first-record-from-RDD-String-tp20834p20840.html
Sent from the Apache Spark User List mailing list archive at Nabble.
Hafiz,
You can probably use the RDD.mapPartitionsWithIndex method.
Mike
On Tue, Dec 23, 2014 at 8:35 AM, Hafiz Mujadid [via Apache Spark User List]
wrote:
>
> Hi all,
>
> Is there an efficient way to drop the first line of an RDD[String]?
>
> Any suggestions?
>
> Thanks
>
> -
There is also a lazy implementation:
http://erikerlandson.github.io/blog/2014/07/29/deferring-spark-actions-to-lazy-transforms-with-the-promise-rdd/
I opened a PR for it. There was also an alternate proposal to publish it
as a library on the new Spark Packages site:
http://databricks.com/bl
That's nice if it works.
--
Hi,
Maybe the drop function is helpful for you (this is probably more than
you need, but it is still an interesting read):
http://erikerlandson.github.io/blog/2014/07/27/some-implications-of-supporting-the-scala-drop-method-for-spark-rdds/
Joerg
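
For comparison, here is a rough sketch of a general "drop the first n records" helper that uses only methods available on a stock RDD (zipWithIndex, filter, map). This is not the implementation from the linked post. The names DropSketch and dropFirstN are made up for illustration. Note that zipWithIndex has to run a Spark job when the RDD has more than one partition, so this version is not fully lazy.

```scala
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

object DropSketch {
  // dropFirstN is a hypothetical helper name, not part of the Spark API.
  def dropFirstN[T: ClassTag](rdd: RDD[T], n: Long): RDD[T] =
    rdd.zipWithIndex()                       // pair each record with its global index
      .filter { case (_, idx) => idx >= n }  // keep records at index n or later
      .map { case (record, _) => record }    // strip the index again
}
```

Assuming src is an RDD[String], DropSketch.dropFirstN(src, 1) would remove only the first record.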
On Tue, Dec 23, 2014 at 5:45 PM, Hao Ren wrote:
> H
Hi,
I guess you would like to remove the header of a CSV file.
You can play with partitions. =)
// src is your RDD[String]
val noHeader = src.mapPartitionsWithIndex(
  (i, iterator) =>
    // The first record lives in partition 0, so skip one element there only.
    if (i == 0 && iterator.hasNext) {
      iterator.next()
      iterator
    } else iterator)
Thus, you don't need to
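
To try the partition-based approach end to end, a minimal self-contained sketch might look like the following. The local master, the app name, the sample data, and the names DropHeaderExample and dropHeader are all assumptions made for illustration. It also assumes partition 0 is non-empty, which is normally the case for a file that starts with a header line.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object DropHeaderExample {
  // dropHeader is a hypothetical helper name wrapping the snippet above.
  def dropHeader(src: RDD[String]): RDD[String] =
    src.mapPartitionsWithIndex { (i, iterator) =>
      // Partition 0 holds the first record, so drop one element there only.
      if (i == 0) iterator.drop(1) else iterator
    }

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for a standalone test run.
    val conf = new SparkConf().setMaster("local[2]").setAppName("drop-header-sketch")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample data standing in for a CSV file with a header line.
      val src = sc.parallelize(Seq("id,name", "1,alice", "2,bob", "3,carol"), numSlices = 2)
      dropHeader(src).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Here iterator.drop(1) replaces the explicit hasNext/next pair. It has the same effect and is also safe on an empty partition.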