Is there a way to get an iterator from an RDD? Something like rdd.collect(), but returning a lazy sequence rather than a single array.
Context: I need to gzip processed data and upload it to Amazon S3. Since the archive should be a single file, I want to iterate over the RDD, writing each line to a local .gz file. The data is small enough to fit on local disk, but too large to fit in memory.
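
To make the goal concrete, here is a rough sketch of the writing side in Python, assuming some lazy iterator of lines is available. The helper name write_lines_gz and the output path are hypothetical. PySpark's RDD.toLocalIterator() looks like a candidate source for such an iterator, but its use here is an assumption and is only shown in a comment, not run.

```python
import gzip
from typing import Iterable


def write_lines_gz(lines: Iterable[str], path: str) -> int:
    """Stream lines into a gzip file one at a time; return the number written."""
    count = 0
    # "wt" opens the gzip stream in text mode, so str lines are encoded on write.
    with gzip.open(path, "wt", encoding="utf-8") as out:
        for line in lines:
            # Only the current line is held in memory, never the whole data set.
            out.write(line)
            out.write("\n")
            count += 1
    return count


# Hypothetical usage with Spark (assumption, not executed here):
#   write_lines_gz(rdd.toLocalIterator(), "/tmp/output.gz")
```

The function accepts any iterable, so it can be tried with a plain generator before wiring it to Spark. The resulting single .gz file can then be uploaded to S3 as a separate step.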