kflowId")
>>> .parquet("/here/is/my/dir")
>>>
>>> I want to run more jobs that will produce new partitions or add more
>>> files to existing partitions.
>>> What is the right way to do it?
--
//with Best Regards
--Denis Bolshakov
e-mail: bolshakov.de...@gmail.com
Hello Serega,
https://spark.apache.org/docs/latest/sql-programming-guide.html
Please try SaveMode.Append option. Does it work for you?
On Sat, Mar 17, 2018, 15:19, Serega Sheypak wrote:
> Hi, I', using spark-sql to process my data and store result as parquet
> partitioned by several columns
>
> ds.wr
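Denis's SaveMode.Append suggestion, applied to the truncated snippet above, might look like the sketch below. The dataset name and partition column are assumptions, since the original lines are cut off:

```scala
import org.apache.spark.sql.SaveMode

// Append adds new files/partitions instead of failing or overwriting;
// existing partition directories are left intact.
ds.write
  .mode(SaveMode.Append)
  .partitionBy("workflowId")   // assumed partition column
  .parquet("/here/is/my/dir")
```

One caveat: with Append, rerunning the same job writes the same rows again; Spark does not deduplicate on append.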
>>> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Pasting-into-spark-shell-doesn-t-work-for-Databricks-example-tp28113p28116.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>> ---------
>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> :46: error: not found: type Row
>        override def evaluate(buffer: Row): Any = {
>                                     ^
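The `not found: type Row` error usually means the `Row` type was never brought into scope in the shell session; spark-shell does not import it automatically. A plausible fix, assuming the UDAF is pasted via `:paste`:

```scala
import org.apache.spark.sql.Row

// With the import in scope, the pasted method compiles:
// override def evaluate(buffer: Row): Any = { ... }
```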
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Pasting-into-spark-shell-doesn-t-work-for-Databricks-example-tp28113.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
on `cluster B`, currently it's turned off.
>>> 5. The Spark app is built on top of the RDD API and does not depend on spark-sql.
>>>
>>> Does anybody know how to write data using the RDD API to a remote
>>> cluster which is running with Kerberos?
Have a look here:
http://www.slideshare.net/cloudera/top-5-mistakes-to-avoid-when-writing-apache-spark-applications
It will probably help a bit.
Best regards,
Denis
On Oct 11, 2016, 23:49, "Xiaoye Sun" wrote:
> Hi,
>
> Currently, I am running Spark using the standalone scheduler with 3
> m
Try to build a flat (uber) jar which includes all dependencies.
On Oct 11, 2016, 22:11, "doruchiulan" wrote:
> Hi,
>
> I have a problem that's been bothering me for a few days, and I'm pretty
> much out of ideas.
>
> I built a Spark docker container where Spark runs in standalone mode. Both
>
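The "flat (uber) jar" advice above can be sketched with the sbt-assembly plugin; the version numbers below are illustrative, not taken from the thread:

```scala
// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")

// build.sbt: mark Spark itself as "provided" so that only the
// application's own dependencies get bundled into the fat jar
// produced by `sbt assembly`
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.0.0" % "provided"
)
```

The resulting single jar can then be handed to spark-submit without classpath surprises inside the docker container.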
You need to have spark-sql on the classpath; right now you are missing it.
On Oct 7, 2016, 11:12, "kant kodali" wrote:
> Here are the jar files on my classpath after doing a grep for spark jars.
>
> org.apache.spark/spark-core_2.11/2.0.0/c4d04336c142f10eb7e172155f022f86b6d11dd3/spark-core_2.11-2.0.0.jar
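Adding the missing spark-sql module in sbt might look like this; the version is chosen only to match the spark-core 2.0.0 jar listed above:

```scala
// build.sbt: spark-core alone does not provide SparkSession, DataFrame,
// or the rest of the SQL API -- that lives in the spark-sql module
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0"
```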
In a few words, you cannot ignore thread safety if you use more than 1 core
per executor. A year ago I faced a race condition issue with
SimpleDateFormat, and I solved it using ThreadLocal.
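A minimal sketch of the ThreadLocal approach described above; the date pattern is an assumption:

```scala
import java.text.SimpleDateFormat
import java.util.Date

// SimpleDateFormat is not thread-safe; give each executor thread its
// own instance instead of sharing one formatter across tasks.
object SafeDateFormat {
  private val fmt = new ThreadLocal[SimpleDateFormat] {
    override def initialValue(): SimpleDateFormat =
      new SimpleDateFormat("yyyy-MM-dd")
  }

  def format(millis: Long): String = fmt.get().format(new Date(millis))
}
```

Each thread lazily gets its own formatter on first use, so concurrent tasks on a multi-core executor never share mutable parser state.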
On Oct 5, 2016, 20:12, "Sean Owen" wrote:
> I don't think this is guaranteed and don't think I'd
I think you are wrong about the port for the HDFS file; as I remember, the
default value is 8020, not 9000.
On Oct 4, 2016, 17:29, "Hafiz Mujadid" wrote:
> Hi,
>
> I am trying an example of structured streaming in Spark using the
> following piece of code,
>
> val spark = SparkSession
> .builder
> .a
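The quoted snippet is cut off; a minimal sketch of that kind of setup, using the 8020 default NameNode port mentioned in the reply (the app name, host, and path are assumptions):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder
  .appName("StructuredStreamingExample")  // assumed app name
  .getOrCreate()

// Streaming text source on HDFS; note the 8020 default NameNode
// port rather than 9000
val lines = spark.readStream
  .format("text")
  .load("hdfs://namenode:8020/input")
```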
>
> hence many elements with different keys fall into a single partition at
> times.
>
> Thanks,
> Sujeet
>
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-tasks-blockes-randomly-on-standalone-cluster-tp27693.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
Hello,
I would also set java opts for driver.
Best regards,
Denis
On Sep 4, 2016, 0:31, "Sourav Mazumder" <sourav.mazumde...@gmail.com> wrote:
> Hi,
>
> I am trying to create a RDD by using swebhdfs to a remote hadoop cluster
> which is protected by Knox and uses SSL.
>
> The code
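"Set java opts for driver" for an SSL-protected (Knox/swebhdfs) endpoint might look like the sketch below; the truststore path and password are placeholders:

```scala
import org.apache.spark.sql.SparkSession

// Point the driver JVM at a truststore that contains the Knox
// gateway's certificate, so the SSL handshake succeeds
val spark = SparkSession.builder
  .config("spark.driver.extraJavaOptions",
    "-Djavax.net.ssl.trustStore=/path/to/truststore.jks " +
    "-Djavax.net.ssl.trustStorePassword=changeit")
  .getOrCreate()
```

Note that in client mode the driver JVM is already running by the time this code executes, so the option would instead be passed on the command line via `spark-submit --conf spark.driver.extraJavaOptions=...`.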
==
>
> logData RDD takes *2.1 KB*
>
> errors RDD takes *1.3 KB*
>
> Regards
>
> Rohit Kumar Prusty
>
> +91-9884070075
>
>
>
A03031, OED-Employment Dev (031),
> 1979-10-24T00:00:00, 56705.00, 54135.44))
>
> Expecting Output:
>
> Need elements from the WrappedArray
>
> Below you can find the attachment of .json file
>
>
> ---------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
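Pulling elements out of a WrappedArray column is typically done with `explode`; a sketch assuming the arrays live in a column named `employees` (the column name and file path are assumptions, since the original message is truncated):

```scala
import org.apache.spark.sql.functions.{col, explode}

val df = spark.read.json("/path/to/file.json")  // path assumed

// explode turns each element of the WrappedArray into its own row;
// nested struct fields can then be selected with dot syntax
val exploded = df.select(explode(col("employees")).as("emp"))
exploded.select("emp.*").show()
```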