Hi

Is it possible to elaborate a little more?

To consume a fixed-width file, the standard process would be:

1. Write a map function that takes a line as a string and applies the file
spec to return a tuple of fields.
2. Load the files using sc.textFile (which reads each line as a string).
3. Pass the lines to map and get back an RDD of fields.
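
A minimal sketch of those three steps in Python, in case it helps. The field
names and offsets in FIELD_SPEC are made up for illustration (they are not
taken from the NOAA format document), and parse_line and load_fields are
hypothetical helper names. It assumes you already have a SparkContext named sc.

```python
# Hypothetical layout: (name, start, end) character offsets, end exclusive.
# These offsets are invented for illustration; take the real ones from the
# format document for your files.
FIELD_SPEC = [
    ("station", 0, 6),
    ("date", 6, 14),
    ("temp", 14, 19),
]


def parse_line(line):
    """Step 1: slice one fixed-width line into a tuple of stripped strings."""
    return tuple(line[start:end].strip() for _, start, end in FIELD_SPEC)


def load_fields(sc, path):
    """Steps 2 and 3: read lines as strings, then map each line to a tuple."""
    return sc.textFile(path).map(parse_line)
```

Calling load_fields(sc, path) with your own path should give back an RDD of
tuples. The fields stay as strings here; converting them to numbers or dates
would be a further map step.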

Ayan

On Mon, Nov 9, 2015 at 3:20 PM, Hitoshi Ozawa <[email protected]> wrote:

> There's a document describing the format of files in the parent directory.
> It seems like a fixed width file.
> ftp://ftp.ncdc.noaa.gov/pub/data/noaa/ish-format-document.pdf


-- 
Best Regards,
Ayan Guha
