Hello Spark community,
We have a project where we want to use Spark as a computation engine to perform
calculations and return the results via REST services.
Working with Spark, we have learned how to make it run faster and
finally optimized our code to produce results in an acceptable time (1
F and RDD...
>
> On Nov 4, 2015 7:54 PM, "Aliaksei Tsyvunchyk" <atsyvunc...@exadel.com> wrote:
> Hello Romi,
>
> Do you mean that in my particular case I’m causing a computation on the
> DataFrame, or is that the regular behavior of DataFrame.toJavaRDD?
> If
program ?
> On Nov 4, 2015, at 12:34 PM, Romi Kuntsman wrote:
>
> I noticed that toJavaRDD causes a computation on the DataFrame, so is it
> considered an action, even though logically it's a transformation?
>
> On Nov 4, 2015 6:51 PM, "Aliaksei Tsyvunchyk" <m
Hello folks,
Recently I noticed unexpectedly heavy network traffic between the driver program
and a worker node.
While debugging, I found that it is caused by the following block of
code:
---- Java ----
DataFrame etpvRecords = context.sql(" SOME SQL query here");
Mapper m = new Mapp
Hello all community members,
I need the opinion of people who have used Spark before and can share their
experience to help me select a technical approach.
I have a project in the proof-of-concept phase, where we are evaluating the
possibility of using Spark for our use case.
Here is a brief task description.