Just use Spark CSV; all other ways of splitting and working with the data are just
reinventing the wheel and a monumental waste of time.
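For illustration, a minimal sketch of that route (assuming Spark 2.x's built-in CSV reader; the column names below are made up, since the file has no header):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("csv-filter").getOrCreate()
import spark.implicits._

val df = spark.read
  .option("inferSchema", "true")   // let Spark infer Int / String / Double types
  .csv("/tmp/myfile.txt")
  .toDF("id", "ts", "value")       // illustrative names for _c0, _c1, _c2

df.filter($"value" > 50.0).show()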
Regards,
Gourav
On Mon, Sep 5, 2016 at 1:48 PM, Ashok Kumar wrote:
> Hi,
>
> I have a text file as below that I read in
>
> 74,20160905-133143,98.11218069128827594148
sc.textFile("filename").map(_.split(",")).filter(arr => arr.length == 3 &&
arr(2).toDouble > 50).collect this will give you a Array[Array[String]] do
as you may wish with it. And please read through abt RDD
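If malformed rows are a concern, a slightly defensive variant of the same one-liner (the Try-based guard is an assumption about how you want bad rows handled):

import scala.util.Try

val filtered = sc.textFile("filename")
  .map(_.split(","))
  .filter(arr => arr.length == 3 && Try(arr(2).toDouble).toOption.exists(_ > 50))
  .collect()   // Array[Array[String]]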
On 5 Sep 2016 8:51 pm, "Ashok Kumar" wrote:
> Thanks everyone.
>
> I am not skilled like you gentlemen
Thanks everyone.
I am not skilled like you gentlemen.

This is what I did:

1) Read the text file

val textFile = sc.textFile("/tmp/myfile.txt")

2) That produces an RDD of String.

3) Create a DF after splitting the file into an Array:

val df = textFile.map(line => line.split(",")).map(x => (x(0).toInt, x
Then you need to refer to the third element of the array, convert it to your
desired data type, and then use filter.
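For example, a sketch of that step, completing the truncated snippet above (the column names are assumptions; run in spark-shell, where the toDF implicits are in scope):

val df = textFile
  .map(_.split(","))
  .map(x => (x(0).toInt, x(1), x(2).toDouble))
  .toDF("id", "ts", "value")      // hypothetical column names

df.filter($"value" > 50.0).show()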
On Tue, Sep 6, 2016 at 12:14 AM, Ashok Kumar wrote:
> Hi,
> I want to filter them for values.
>
> This is what is in array
>
> 74,20160905-133143,98.11218069128827594148
>
> I want to filter anything > 50.0 in the third column
Ask yourself how to access the third element in an array in Scala.
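As a hint, Scala arrays are zero-based, so the third element is arr(2):

val arr = "74,20160905-133143,98.112".split(",")
val third = arr(2).toDouble    // 98.112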
On 05.09.2016 at 16:14, Ashok Kumar wrote:
Hi,
I want to filter them by value.
This is what is in the array:
74,20160905-133143,98.11218069128827594148
I want to filter anything > 50.0 in the third column
Thanks
On Monday, 5 September 2016, 15:07, ayan guha wrote:
Hi
x.split returns an array. So, after first map, you will get RDD of arrays.
What is your expected outcome of the second map?
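To illustrate (types only; the tuple layout is one plausible choice):

val rows: org.apache.spark.rdd.RDD[Array[String]] =
  sc.textFile("/tmp/myfile.txt").map(_.split(","))

// a second map usually turns each Array[String] into a typed tuple
val typed = rows.map(a => (a(0).toInt, a(1), a(2).toDouble))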
On Mon, Sep 5, 2016 at 11:30 PM, Ashok Kumar wrote:
> Thank you sir.
>
> This is what I get
>
> scala> textFile.map(x=> x.split(","))
> res52: org.apache.spark.rdd.RDD[Array[String]] = MapPartitionsRDD[27] at map at <console>:27
Please have a look at the documentation for information on how to work with
RDDs. Start with this: http://spark.apache.org/docs/latest/quick-start.html
On 5 Sep 2016 7:00 pm, "Ashok Kumar" wrote:
> Thank you sir.
>
> This is what I get
>
> scala> textFile.map(x=> x.split(","))
> res52: org.apache.spark.rdd.RDD[Array[String]] = MapPartitionsRDD[27] at map at <console>:27
Thank you sir.
This is what I get:

scala> textFile.map(x => x.split(","))
res52: org.apache.spark.rdd.RDD[Array[String]] = MapPartitionsRDD[27] at map at <console>:27

How can I work on individual columns? I understand they are strings.

scala> textFile.map(x => x.split(",")).map(x => (x.getString(0)))
Basic error: you get back an RDD from transformations like map.

sc.textFile("filename").map(x => x.split(","))
On 5 Sep 2016 6:19 pm, "Ashok Kumar" wrote:
> Hi,
>
> I have a text file as below that I read in
>
> 74,20160905-133143,98.11218069128827594148
> 75,20160905-133143,49.52776998815916807742
Hi,
I have a text file as below that I read in
74,20160905-133143,98.11218069128827594148
75,20160905-133143,49.52776998815916807742
76,20160905-133143,56.08029957123980984556
77,20160905-133143,46.636895265444075228
78,20160905-133143,84.88227141164402181551
79,20160905-133143,68.72408602520662115000