Re: Spark-shell's performance

2017-04-19 Thread Yan Facai
Hi, Hanson. Perhaps I'm digressing here. If I'm wrong or mistaken, please correct me. SPARK_WORKER_* is the configuration for the whole cluster, and it's fine to write those global variables in spark-env.sh. However, SPARK_DRIVER_* and SPARK_EXECUTOR_* are configuration for the application (your code), p

Re: how to add new column using regular expression within pyspark dataframe

2017-04-19 Thread Yan Facai
How about using `withColumn` and UDF? example: + https://gist.github.com/zoltanctoth/2deccd69e3d1cde1dd78 + https://ragrawal.wordpress.com/2015/10/02/spark-custom-udf-example/ On Mon, Apr 17, 2017 at 8:25 PM, Zeming Yu wrote: > I've g
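The `withColumn` + UDF approach from the linked gists can be sketched like this. The column name "description" and the price pattern are invented examples, not from the original question; the Spark-specific wiring is shown as comments since it needs a live SparkSession.

```python
import re

# A plain-Python extractor that a pyspark UDF would wrap.
def extract_price(s):
    """Pull the first $-prefixed number out of a string, or None."""
    if s is None:
        return None
    m = re.search(r"\$(\d+(?:\.\d+)?)", s)
    return m.group(1) if m else None

# With Spark available, wrap it and attach the result as a new column:
# from pyspark.sql.functions import udf
# from pyspark.sql.types import StringType
# price_udf = udf(extract_price, StringType())
# df = df.withColumn("price", price_udf(df["description"]))
```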

JDBC write error of Pyspark dataframe

2017-04-19 Thread Cinyoung Hur
Hi, I'm trying to write a dataframe to MariaDB. I got this error message, but I have no clue. Please give some advice. Py4JJavaError Traceback (most recent call last) in () > 1 result1.filter(result1["gnl_nm_set"] == "").count() /usr/local/linewalks/spark/spark/python/pyspark/sql/dataframe.pyc
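For reference, a JDBC write to MariaDB typically looks like the sketch below. The host, database, table, and credentials are placeholders. One frequent cause of an opaque Py4JJavaError in this situation is a missing JDBC driver jar, which must be on both the driver and executor classpaths (e.g. via `--jars`).

```python
# Connection details (all placeholder assumptions, not from the thread).
url = "jdbc:mariadb://dbhost:3306/mydb"
properties = {
    "user": "spark",
    "password": "secret",
    # The MariaDB Connector/J driver class; its jar must be on the classpath.
    "driver": "org.mariadb.jdbc.Driver",
}

# With a live SparkSession (not runnable here):
# result1.write.jdbc(url=url, table="results", mode="append", properties=properties)
```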

Re: Problem with Java and Scala interoperability // streaming

2017-04-19 Thread kant kodali
works now! thanks much! On Wed, Apr 19, 2017 at 2:05 PM, kant kodali wrote: > oops my bad. I see it now! sorry. > > On Wed, Apr 19, 2017 at 1:56 PM, Marcelo Vanzin > wrote: > >> I see a bunch of getOrCreate methods in that class. They were all >> added in SPARK-6752, a long time ago. >> >> On W

Re: Problem with Java and Scala interoperability // streaming

2017-04-19 Thread kant kodali
oops my bad. I see it now! sorry. On Wed, Apr 19, 2017 at 1:56 PM, Marcelo Vanzin wrote: > I see a bunch of getOrCreate methods in that class. They were all > added in SPARK-6752, a long time ago. > > On Wed, Apr 19, 2017 at 1:51 PM, kant kodali wrote: > > There is no getOrCreate for JavaStream

Re: Problem with Java and Scala interoperability // streaming

2017-04-19 Thread Marcelo Vanzin
I see a bunch of getOrCreate methods in that class. They were all added in SPARK-6752, a long time ago. On Wed, Apr 19, 2017 at 1:51 PM, kant kodali wrote: > There is no getOrCreate for JavaStreamingContext however I do use > JavaStreamingContext inside createStreamingContext() from my code in th

Re: Problem with Java and Scala interoperability // streaming

2017-04-19 Thread kant kodali
There is no *getOrCreate* for JavaStreamingContext; however, I do use JavaStreamingContext inside createStreamingContext() in my code from the previous email. On Wed, Apr 19, 2017 at 1:46 PM, Marcelo Vanzin wrote: > Why are you not using JavaStreamingContext if you're writing Java? > > On Wed, Apr

Re: Problem with Java and Scala interoperability // streaming

2017-04-19 Thread Marcelo Vanzin
Why are you not using JavaStreamingContext if you're writing Java? On Wed, Apr 19, 2017 at 1:42 PM, kant kodali wrote: > Hi All, > > I get the following errors whichever way I try, either lambda or generics. I > am using > Spark 2.1 and Scala 2.11.8 > > > StreamingContext ssc = StreamingContext.g

Problem with Java and Scala interoperability // streaming

2017-04-19 Thread kant kodali
Hi All, I get the following errors whichever way I try, either lambda or generics. I am using Spark 2.1 and Scala 2.11.8. StreamingContext ssc = StreamingContext.getOrCreate(hdfsCheckpointDir, () -> {return createStreamingContext();}, null, false); ERROR StreamingContext ssc = StreamingContext.
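The resolution reached later in the thread is to call `JavaStreamingContext.getOrCreate(checkpointDir, () -> createStreamingContext())`, whose Java-friendly signature accepts a plain lambda instead of a Scala `Function0`/`ClassTag`. The contract behind getOrCreate is worth seeing in miniature; this is a pure-Python illustration of the pattern, not Spark code:

```python
# The getOrCreate contract in miniature: the second argument is a zero-arg
# factory that is invoked only when no prior context exists for the
# checkpoint directory.
_checkpoints = {}

def get_or_create(checkpoint_dir, factory):
    if checkpoint_dir in _checkpoints:
        return _checkpoints[checkpoint_dir]  # recover the existing context
    ctx = factory()                          # factory runs only on first call
    _checkpoints[checkpoint_dir] = ctx
    return ctx
```

The factory must be side-effect free until invoked, which is why both the Scala and Java APIs take a function rather than an already-constructed context.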

Re: java.lang.java.lang.UnsupportedOperationException

2017-04-19 Thread Nicholas Hakobian
CDH 5.5 only provides Spark 1.5. Are you managing your pySpark install separately? For something like your example, you will get significantly better performance using coalesce with a lit, like so: from pyspark.sql.functions import lit, coalesce def replace_empty(icol): return coalesce(col(i
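The suggestion above replaces a Python UDF with the built-in `coalesce` plus `lit`, which stays inside the JVM and avoids Python serialization overhead. The semantics of SQL COALESCE (first non-null argument) can be mirrored in plain Python; the Spark column version is shown as a comment since it needs a SparkSession:

```python
# Pure-Python mirror of SQL COALESCE semantics: return the first
# argument that is not null/None.
def coalesce_py(*values):
    for v in values:
        if v is not None:
            return v
    return None

# The Spark column equivalent of the suggestion above (sketch):
# from pyspark.sql.functions import coalesce, col, lit
# df = df.withColumn("c", coalesce(col("c"), lit("")))
```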

Re: Handling skewed data

2017-04-19 Thread Richard Siebeling
I'm also interested in this; does anyone know? On 17 April 2017 at 17:17, Vishnu Viswanath wrote: > Hello All, > > Does anyone know if the skew handling code mentioned in this talk > https://www.youtube.com/watch?v=bhYV0JOPd9Y was added to spark? > > If so can I know where to look for more info,
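Whether or not that code landed, a common manual workaround for skewed joins and aggregations is key salting: spread a hot key across several synthetic sub-keys so no single partition receives all of its rows. This is an illustration of the general technique only, not necessarily the mechanism described in the talk:

```python
import random

# Number of sub-keys to spread each hot key across (an arbitrary choice).
NUM_SALTS = 4

def salt_key(key, num_salts=NUM_SALTS):
    # Append a random salt so rows with the same hot key land in
    # different partitions.
    return (key, random.randrange(num_salts))

def unsalt_key(salted):
    # Drop the salt to recover the original key when re-aggregating.
    key, _ = salted
    return key
```

After a first aggregation on the salted key, a second, much smaller aggregation on the unsalted key combines the partial results.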

Re: java.lang.java.lang.UnsupportedOperationException

2017-04-19 Thread issues solution
Pyspark 1.6 on Cloudera 5.5 (YARN) 2017-04-19 13:42 GMT+02:00 issues solution : > Hi, > can someone tell me why I get the following error when applying a udf like this: > > def replaceCempty(x): > if x is None : > return "" > else : > return x.encode('utf-8') > udf_replaceCe

Real time incremental Update to Spark Graphs.

2017-04-19 Thread Siddharth Ubale
Hi, We have a scenario where we want to build a graph and perform analytics on it. However, once the graph is built and cached in Spark memory, analytics are only possible on the already-present graph. We would like the graph to be updated incrementally in real time as and when we receive new edges and vertices for th
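One caveat worth noting: GraphX graphs are immutable, so "incremental" updates are usually implemented by buffering incoming edges and vertices and periodically rebuilding (or unioning) the cached graph. A minimal bookkeeping sketch of that buffering, in plain Python rather than Spark:

```python
# Accumulate incoming edges/vertices between graph rebuilds.
class GraphBuffer:
    def __init__(self):
        self.vertices = set()
        self.edges = set()

    def add_edge(self, src, dst):
        # New endpoints become vertices automatically.
        self.vertices.update((src, dst))
        self.edges.add((src, dst))

    def snapshot(self):
        # In Spark this is where you would build a new Graph from the
        # accumulated vertex/edge RDDs and re-cache it.
        return frozenset(self.vertices), frozenset(self.edges)
```

The trade-off is rebuild frequency versus staleness; truly streaming graph updates are outside what GraphX offers out of the box.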

java.lang.java.lang.UnsupportedOperationException

2017-04-19 Thread issues solution
Hi, can someone tell me why I get the following error when applying a udf like this: def replaceCempty(x): if x is None : return "" else : return x.encode('utf-8') udf_replaceCempty = F.udf(replaceCempty,StringType()) dfTotaleNormalize53 = dfTotaleNormalize52.select([i if i not
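A defensive rewrite of that UDF body is shown below. On old Spark versions this kind of UnsupportedOperationException often comes from the UDF receiving a non-string value, so the sketch guards the input type; the `x.encode('utf-8')` in the original is a Python 2 idiom (unicode to str) and is dropped here. The registration lines are comments because they need pyspark.

```python
# Null-safe, type-safe version of the UDF body (pure Python).
def replace_c_empty(x):
    if x is None:
        return ""
    if not isinstance(x, str):
        # Guard against non-string inputs instead of raising.
        x = str(x)
    return x

# Registration sketch (needs pyspark):
# from pyspark.sql import functions as F
# from pyspark.sql.types import StringType
# udf_replace = F.udf(replace_c_empty, StringType())
```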