Re: Storage of RDDs created via sc.parallelize

2015-03-22 Thread Akhil Das
You can use sc.newAPIHadoopFile with CSVInputFormat so that it will read the CSV file properly.

Thanks, Best Regards

On Sat, Mar 21, 2015 at 12:39 AM, Karlson wrote: …
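For context, a minimal sketch of the naive approach the suggestion above improves on. This assumes a plain comma-separated file with no quoted fields; the file name is a placeholder, and CSVInputFormat itself is a third-party Hadoop InputFormat, not part of Spark:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CsvReadSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("csv-read"))

    // Naive approach: read lines and split on commas. This breaks on
    // quoted fields that contain commas or embedded newlines, which is
    // why the reply recommends a proper CSVInputFormat via
    // sc.newAPIHadoopFile instead.
    val rows = sc.textFile("data.csv").map(_.split(",", -1))

    rows.take(5).foreach(r => println(r.mkString(" | ")))
    sc.stop()
  }
}
```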

Re: Directly broadcasting (sort of) RDDs

2015-03-22 Thread Sean Owen
In a sentence, is this the idea of collecting an RDD to memory on each executor directly?

On Sun, Mar 22, 2015 at 10:56 PM, Sandy Ryza wrote:
> Hi Guillaume,
>
> I've long thought something like this would be useful - i.e. the ability to
> broadcast RDDs directly without first pulling data through the driver. …

Re: Directly broadcasting (sort of) RDDs

2015-03-22 Thread Sandy Ryza
Hi Guillaume,

I've long thought something like this would be useful, i.e. the ability to broadcast RDDs directly without first pulling data through the driver. If I understand correctly, your requirement to "block" a matrix up and only fetch the needed parts could be implemented on top of this b…
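For reference, the conventional pattern this thread wants to avoid, which pulls the RDD's data through the driver before broadcasting it, looks roughly like this (the data and names are hypothetical, just to make the bottleneck concrete):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BroadcastViaDriver {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("broadcast-rdd"))

    val smallRdd = sc.parallelize(Seq((1, "a"), (2, "b")))

    // collectAsMap() pulls every element to the driver first; for a
    // large RDD this is exactly the step a direct executor-to-executor
    // broadcast would eliminate.
    val bc = sc.broadcast(smallRdd.collectAsMap())

    val big = sc.parallelize(1 to 10)
    val looked = big.map(i => (i, bc.value.get(i)))
    looked.collect().foreach(println)
    sc.stop()
  }
}
```

Blocking a matrix and fetching only the needed parts, as Guillaume asks, goes a step further: each executor would pull just the blocks it needs rather than the whole collected value.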

Re: lower&upperBound not working / Spark 1.3

2015-03-22 Thread Michael Armbrust
I have not heard this reported yet, but your invocation looks correct to me. Can you open a JIRA?

On Sun, Mar 22, 2015 at 8:39 AM, Marek Wiewiorka wrote:
> Hi All - I try to use the new SQLContext API for populating DataFrame from
> jdbc data source.
> like this:
>
> val jdbcDF = sqlContext.jdbc(…

lower&upperBound not working / Spark 1.3

2015-03-22 Thread Marek Wiewiorka
Hi All - I'm trying to use the new SQLContext API for populating a DataFrame from a JDBC data source, like this:

val jdbcDF = sqlContext.jdbc(
  url = "jdbc:postgresql://localhost:5430/dbname?user=user&password=111",
  table = "se_staging.exp_table3",
  columnName = "cs_id",
  lowerBound = 1,
  upperBound = 1,
  numPart…
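For comparison, a sketch of the same Spark 1.3 call with distinct bounds. The connection string and the upper bound of 100000 are placeholders; note that lowerBound/upperBound only control how the cs_id range is split across partitions, they do not filter rows, so passing the same value for both collapses the partitioning:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setAppName("jdbc-read"))
val sqlContext = new SQLContext(sc)

val jdbcDF = sqlContext.jdbc(
  url = "jdbc:postgresql://localhost:5430/dbname?user=user&password=111",
  table = "se_staging.exp_table3",
  columnName = "cs_id",          // numeric column used to split the range
  lowerBound = 1L,
  upperBound = 100000L,          // should span the actual cs_id range
  numPartitions = 8
)
jdbcDF.show()
```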