VertexId type in GraphX

2015-01-13 Thread Madhu
Are there any plans to generalize the type of VertexId in GraphX? Our keys are particularly long. We could use the hashCode() trick, but the chance of collisions is not acceptable. Given our data volume, we have encountered hashCode() collisions more than once. I see this Jira, but it is specific
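The collision risk Madhu describes is easy to quantify with the birthday bound. A minimal sketch (pure Python, no Spark; the function name and the example key counts are illustrative, not from the thread) showing why a 32-bit hashCode() is risky at scale while even a 64-bit id space is not collision-free:

```python
import math

def collision_probability(n_keys: int, bits: int) -> float:
    """Birthday-bound approximation of the probability of at least one
    collision when hashing n_keys uniformly into a space of 2**bits ids."""
    space = 2.0 ** bits
    # P(collision) ~= 1 - exp(-n(n-1) / (2 * 2**bits))
    return 1.0 - math.exp(-n_keys * (n_keys - 1) / (2.0 * space))

# Java's String.hashCode() is 32-bit: even 100k distinct keys make a
# collision more likely than not (~0.69).
p32 = collision_probability(100_000, 32)

# A 64-bit Long VertexId: a billion keys still carry a small but
# non-negligible collision chance (~0.027), matching Madhu's concern.
p64 = collision_probability(1_000_000_000, 64)
```

This is why hashing arbitrary keys down to a fixed-width VertexId is unsafe at large data volumes; assigning ids explicitly (e.g. via a zip-with-unique-id pass) avoids the problem entirely.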

Re: DBSCAN for MLlib

2015-01-13 Thread Muhammad Ali A'råby
I have to say, I have created a Jira task for it: [SPARK-5226] Add DBSCAN Clustering Algorithm to MLlib - ASF JIRA. MLlib is all k-means now, and I think we should add some new clustering algorithms to it

DBSCAN for MLlib

2015-01-13 Thread Muhammad Ali A'råby
Dear all, I think MLlib needs more clustering algorithms and DBSCAN is my first candidate. I am starting to implement it. Any advice? Muhammad-Ali
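For reference, the algorithm being proposed can be sketched in a few lines. This is a minimal single-machine, quadratic-time version (pure Python, names and structure my own, not MLlib code); a real MLlib implementation would need a distributed neighbor search:

```python
import math

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise.

    min_pts counts the point itself, since a point is in its own
    eps-neighborhood.
    """
    def neighbors(i):
        # Brute-force range query; the O(n^2) hot spot of naive DBSCAN.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)   # None = unvisited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # noise, may be claimed as border later
            continue
        cluster += 1                # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            js = neighbors(j)
            if len(js) >= min_pts:   # only core points expand the cluster
                queue.extend(js)
    return labels
```

For example, two tight blobs plus an outlier yield two clusters and one noise point: `dbscan([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)], eps=2, min_pts=3)` returns `[0, 0, 0, 1, 1, 1, -1]`.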

Re: create a SchemaRDD from a custom datasource

2015-01-13 Thread Reynold Xin
If it is a small collection of them on the driver, you can just use sc.parallelize to create an RDD. On Tue, Jan 13, 2015 at 7:56 AM, Malith Dhanushka wrote: > Hi Reynold, > > Thanks for the response. I am just wondering, let's say we have a set of Row > objects. Isn't there a straightforward way

Re: Use of MapConverter, ListConverter in python to java object conversion

2015-01-13 Thread Davies Liu
It's not necessary, I will create a PR to remove them. For larger dicts/lists/tuples, the pickle approach may make fewer RPC calls and give better performance. Davies On Tue, Jan 13, 2015 at 4:53 AM, Meethu Mathew wrote: > Hi all, > > In the python object to java conversion done in the method _py2java in >
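Davies's point about RPC calls can be illustrated without Spark at all. A sketch (the nested structure below is invented for illustration; `SerDe.loads` on the JVM side is taken from the thread):

```python
import pickle

# A nested structure of the kind _py2java has to ship to the JVM.
obj = {"weights": [0.2, 0.8], "means": [(0.0, 1.0), (5.0, 2.0)]}

# Per-element MapConverter/ListConverter conversion costs roughly one
# Py4J round trip per entry; pickling the whole object sends a single
# byte payload instead, which the JVM side decodes in one call
# (sc._jvm.SerDe.loads in the thread above).
payload = bytearray(pickle.dumps(obj))
restored = pickle.loads(bytes(payload))
assert restored == obj
```

The trade-off is one serialization pass and one RPC versus many small RPCs, which is why the pickle path wins for larger collections.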

Re: Python to Java object conversion of numpy array

2015-01-13 Thread Davies Liu
On Mon, Jan 12, 2015 at 8:14 PM, Meethu Mathew wrote: > Hi, > This is the function defined in PythonMLLibAPI.scala: > def findPredict(data: JavaRDD[Vector], wt: Object, mu: Array[Object], si: Array[Object]): RDD[Array[Double]] = { } > So the parameter mu sho

Fwd: [ NOTICE ] Service Downtime Notification - R/W git repos

2015-01-13 Thread Patrick Wendell
FYI our git repo may be down for a few hours today. -- Forwarded message -- From: "Tony Stevenson" Date: Jan 13, 2015 6:49 AM Subject: [ NOTICE ] Service Downtime Notification - R/W git repos To: Cc: Folks, Please note that on Thursday 15th at 20:00 UTC the Infrastructure team wi

Unable to find configuration file at location scalastyle-config.xml

2015-01-13 Thread Zhiwei Chan
Hi everyone, I am new to Spark, and am trying to package spark-core with some modifications. I used IDEA to package spark-core_2.10 of Spark 1.1.1. When I encountered the following error, I checked the website http://www.scalastyle.org/maven.html, and its suggested configuration is to modify the spark
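That error usually means the scalastyle-maven-plugin cannot resolve the path given in its configLocation. A hedged sketch of the relevant pom.xml fragment, in the style shown at scalastyle.org/maven.html (the version number and paths below are illustrative, not Spark's actual build config):

```xml
<plugin>
  <groupId>org.scalastyle</groupId>
  <artifactId>scalastyle-maven-plugin</artifactId>
  <version>0.4.0</version>
  <configuration>
    <failOnViolation>true</failOnViolation>
    <includeTestSourceDirectory>false</includeTestSourceDirectory>
    <sourceDirectory>${basedir}/src/main/scala</sourceDirectory>
    <!-- "Unable to find configuration file" means this lookup failed:
         point it at a path that exists relative to the module being built,
         e.g. the parent directory when building a submodule like core. -->
    <configLocation>${basedir}/scalastyle-config.xml</configLocation>
  </configuration>
</plugin>
```

When packaging a single submodule (as with spark-core in IDEA), `${basedir}` resolves to the submodule directory, not the project root, which is a common cause of this lookup failing.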

Use of MapConverter, ListConverter in python to java object conversion

2015-01-13 Thread Meethu Mathew
Hi all, In the python object to java conversion done in the method _py2java in spark/python/pyspark/mllib/common.py, why are we doing individual conversions using MapConverter and ListConverter? The same can be achieved using bytearray(PickleSerializer().dumps(obj)) obj = sc._jvm.SerDe.loads(by

Re: create a SchemaRDD from a custom datasource

2015-01-13 Thread Reynold Xin
Depends on what the other side is doing. You can create your own RDD implementation by subclassing RDD, or it might work if you use sc.parallelize(1 to n, n).mapPartitionsWithIndex( /* code to read the data and return an iterator */ ) where n is the number of partitions. On Tue, Jan 13, 2015 at 12
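Reynold's pattern can be simulated locally to see the shape of it: `sc.parallelize(1 to n, n)` yields one element per partition, and `mapPartitionsWithIndex` maps each partition index to a chunk read from the external source. A Spark-free sketch (the `read_chunk` reader and its row format are hypothetical stand-ins for whatever the custom datasource exposes):

```python
def read_chunk(partition_index, rows_per_chunk=3):
    # Hypothetical reader: returns the rows belonging to one partition
    # of the external source. In the real pattern this body would call
    # the custom datasource API.
    start = partition_index * rows_per_chunk
    return [{"id": i} for i in range(start, start + rows_per_chunk)]

def simulate_parallel_read(n_partitions):
    # Locally mimics mapPartitionsWithIndex over n single-element
    # partitions: each index produces its own iterator of rows, and the
    # union of all partitions is the full dataset.
    rows = []
    for index in range(n_partitions):
        rows.extend(read_chunk(index))
    return rows
```

In real Spark code the loop disappears: each partition's `read_chunk(index)` runs on an executor, so the reads happen in parallel and no single machine has to hold the whole dataset.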

create a SchemaRDD from a custom datasource

2015-01-13 Thread Niranda Perera
Hi, We have a custom datasources API, which connects to various data sources and exposes them out as a common API. We are now trying to implement the Spark datasources API released in 1.2.0 to connect Spark for analytics. Looking at the sources API, we figured out that we should extend a scan cla