Re: [GraphFrame, Pyspark] Weighted Edge in PageRank

2016-12-01 Thread Weiwei Zhang
Thanks Felix. Anyone know when this feature will be rolled out in GraphFrame?

Best Regards,
Weiwei

On Thu, Dec 1, 2016 at 5:22 PM, Felix Cheung wrote:
> That's correct - currently GraphFrame does not compute PageRank with
> weighted edges.

[GraphFrame, Pyspark] Weighted Edge in PageRank

2016-12-01 Thread Weiwei Zhang
Hi guys,

I am trying to compute the PageRank for the locations in the following dummy dataframe,

src  des  shared_gas_stations
A    B    2
A    C    10
C    E    3
D    E    12
E    G    5
...

I have tried the funct
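For readers following this thread, here is a minimal sketch of what weighted PageRank means on the dummy data above. It is plain Python, not GraphFrame or Spark code, and it is a generic textbook formulation rather than anything GraphFrame implements: each node splits its rank among its out-edges in proportion to edge weight, and rank held by nodes with no out-edges is spread uniformly. The function name `weighted_pagerank`, the damping factor 0.85, and the fixed iteration count are all assumptions made for illustration.

```python
def weighted_pagerank(edges, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration.

    edges: iterable of (src, dst, weight) tuples with positive weights.
    Returns a dict mapping node -> rank; the ranks sum to 1.
    """
    edges = list(edges)
    nodes = sorted({n for s, d, _ in edges for n in (s, d)})
    n = len(nodes)

    # Total outgoing weight per node, used to normalise each edge's share.
    out_weight = {node: 0.0 for node in nodes}
    for s, _, w in edges:
        out_weight[s] += w

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-edges is redistributed uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0.0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        # Each edge carries a share of its source's rank proportional to its weight.
        for s, d, w in edges:
            new_rank[d] += damping * rank[s] * w / out_weight[s]
        rank = new_rank
    return rank


# The dummy dataframe from the post, as (src, des, shared_gas_stations) rows.
edges = [("A", "B", 2), ("A", "C", 10), ("C", "E", 3), ("D", "E", 12), ("E", "G", 5)]
ranks = weighted_pagerank(edges)
for node, score in sorted(ranks.items()):
    print(node, round(score, 4))
```

In this sketch C ends up ranked above B because A sends 10/12 of its outgoing weight to C and only 2/12 to B; with unweighted edges the two would tie.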

Configure Spark Resource on AWS CLI Not Working

2016-02-26 Thread Weiwei Zhang
Hi there, I am trying to configure memory for spark using AWS CLI. However, I got the following message: *A client error (ValidationException) occurred when calling the RunJobFlow operation: Cannot specify args for application 'Spark' when release label is used.* In the aws 'create-cluster' comm
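A hedged note on the error above: on EMR clusters created with a release label, Spark settings are generally supplied as a configuration classification (for example `spark-defaults`) through the `--configurations` option of `aws emr create-cluster`, rather than as application args. The sketch below only builds that JSON document; the helper name and the memory values are hypothetical, and the rest of the poster's command is not known from this snippet.

```python
import json


def spark_defaults_configuration(executor_memory, driver_memory):
    """Build an EMR configuration list using the 'spark-defaults' classification.

    The helper itself is hypothetical; the property names are standard Spark settings.
    """
    return [
        {
            "Classification": "spark-defaults",
            "Properties": {
                "spark.executor.memory": executor_memory,
                "spark.driver.memory": driver_memory,
            },
        }
    ]


# Example values only; choose sizes that fit the instance types in use.
config_json = json.dumps(spark_defaults_configuration("4g", "2g"), indent=2)
print(config_json)
```

The printed JSON can be saved to a file and passed as `--configurations file://<path>` when creating the cluster.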

Re: Behind the scene of RDD to DataFrame

2016-02-21 Thread Weiwei Zhang
to catalyst types) are involved but no
> shuffling.
>
> Hemant Bhanawat <https://www.linkedin.com/in/hemant-bhanawat-92a3811>
> www.snappydata.io
>
> On Sun, Feb 21, 2016 at 11:48 AM, Weiwei Zhang
> wrote:
>
>> Hi there,
>>
>> Could someon

Behind the scene of RDD to DataFrame

2016-02-20 Thread Weiwei Zhang
Hi there, Could someone explain to me what is behind the scene of rdd.toDF()? More importantly, will this step involve a lot of shuffles and cause the surge of the size of intermediate files? Thank you. Best Regards, Vivian

Pyspark SQL Join Failure

2015-12-19 Thread Weiwei Zhang
Hi all,

I got this error when I tried to use the 'join' function to left outer join two data frames in pyspark 1.4.1. Please kindly point out the places where I made mistakes. Thank you.

Traceback (most recent call last):
  File "/Users/wz/PycharmProjects/PysparkTraining/Airbnb/src/driver.py", li

Is Feature Transformations supported by Spark export to PMML

2015-10-15 Thread Weiwei Zhang
Hi Folks, I am trying to find out if the Spark export to PMML has support for feature transformations. I know in R, I need to specify local transformations and attributes using the "pmml" and "pmmlTransformation" libraries. The example I read for Spark simply applies the "toPMML" function and it generat