How to extract complex JSON structures using Apache Spark 1.4.0 Data Frames

2015-06-24 Thread Gustavo Arjones
Hi All, I am using the new Apache Spark version 1.4.0 DataFrame API to extract information from Twitter's Status JSON, mostly focused on the Entities object. The part relevant to this question is shown below: { ... ... "entities": {

Re: Poor performance writing to S3

2014-10-01 Thread Gustavo Arjones
load method() When I ran count() on my dataset before trying to save it to S3, I could identify the input bottleneck. - gustavo On Sep 30, 2014, at 10:03 PM, Gustavo Arjones wrote: > Hi, > I’m trying to save about a million lines containing statistics data, > somet

Poor performance writing to S3

2014-09-30 Thread Gustavo Arjones
Hi, I’m trying to save about a million lines containing statistics data, something like: 233815212529_10152316612422530 233815212529_10152316612422530 1328569332 1404691200 1404691200 1402316275 46 0 0 7 0 0 0 233815212529_101523166124