kflowId")
>>> .parquet("/here/is/my/dir")
>>>
>>> I want to run more jobs that will produce new partitions or add more
>>> files to existing partitions.
>>> What is the right way to do it?
--
//with Best Regards
--Denis Bolshakov
e-mail: bolshakov.de...@gmail.com
Hello Serega,
https://spark.apache.org/docs/latest/sql-programming-guide.html
Please try SaveMode.Append option. Does it work for you?
On Sat, Mar 17, 2018, 15:19, Serega Sheypak wrote:
> Hi, I', using spark-sql to process my data and store result as parquet
> partitioned by several columns
>
> ds.wr
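Denis's SaveMode.Append suggestion, applied to the truncated snippet above, might look like the sketch below. The dataset name and partition column are assumptions, since the original lines are cut off:

```scala
import org.apache.spark.sql.SaveMode

// Append adds new files/partitions instead of failing or overwriting;
// existing partition directories are left intact.
ds.write
  .mode(SaveMode.Append)
  .partitionBy("workflowId")   // assumed partition column
  .parquet("/here/is/my/dir")
```

One caveat: with Append, rerunning the same job writes the same rows again; Spark does not deduplicate on append.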
>>> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Pasting-into-spark-shell-doesn-t-work-for-Databricks-example-tp28113p28116.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>> ---------
>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> :46: error: not found: type Row
>        override def evaluate(buffer: Row): Any = {
>                                     ^
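The `not found: type Row` error usually means the `Row` type was never brought into scope in the shell session; spark-shell does not import it automatically. A plausible fix, assuming the UDAF is pasted via `:paste`:

```scala
import org.apache.spark.sql.Row

// With the import in scope, the pasted method compiles:
// override def evaluate(buffer: Row): Any = { ... }
```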
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Pasting-into-spark-shell-doesn-t-work-for-Databricks-example-tp28113.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
on `cluster B`, currently it's turned off.
>>> 5. The Spark app is built on top of the RDD API and does not depend on spark-sql.
>>>
>>> Does anybody know how to write data using the RDD API to a remote
>>> cluster which is running with Kerberos?
Have a look here:
http://www.slideshare.net/cloudera/top-5-mistakes-to-avoid-when-writing-apache-spark-applications
It will probably help a bit.
Best regards,
Denis
On Oct 11, 2016, 23:49, "Xiaoye Sun" wrote:
> Hi,
>
> Currently, I am running Spark using the standalone scheduler with 3
> m
Try to build a flat (uber) jar which includes all dependencies.
On Oct 11, 2016, 22:11, "doruchiulan" wrote:
> Hi,
>
> I have a problem that's been bothering me for a few days, and I'm pretty
> much out of ideas.
>
> I built a Spark docker container where Spark runs in standalone mode. Both
>
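The "flat (uber) jar" advice above can be sketched with the sbt-assembly plugin; the version numbers below are illustrative, not taken from the thread:

```scala
// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")

// build.sbt: mark Spark itself as "provided" so that only the
// application's own dependencies get bundled into the fat jar
// produced by `sbt assembly`
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.0.0" % "provided"
)
```

The resulting single jar can then be handed to spark-submit without classpath surprises inside the docker container.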
You need to have spark-sql on the classpath; right now you are missing it.
On Oct 7, 2016, 11:12, "kant kodali" wrote:
> Here are the jar files on my classpath after doing a grep for spark jars.
>
> org.apache.spark/spark-core_2.11/2.0.0/c4d04336c142f10eb7e172155f022f86b6d11dd3/spark-core_2.11-2.0.0.jar
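Adding the missing spark-sql module in sbt might look like this; the version is chosen only to match the spark-core 2.0.0 jar listed above:

```scala
// build.sbt: spark-core alone does not provide SparkSession, DataFrame,
// or the rest of the SQL API -- that lives in the spark-sql module
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0"
```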
In a few words, you cannot ignore thread safety if you use more than 1 core
per executor. A year ago I faced a race condition issue with
SimpleDateFormat, and I solved it using ThreadLocal.
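A minimal sketch of the ThreadLocal approach described above; the date pattern is an assumption:

```scala
import java.text.SimpleDateFormat
import java.util.Date

// SimpleDateFormat is not thread-safe; give each executor thread its
// own instance instead of sharing one formatter across tasks.
object SafeDateFormat {
  private val fmt = new ThreadLocal[SimpleDateFormat] {
    override def initialValue(): SimpleDateFormat =
      new SimpleDateFormat("yyyy-MM-dd")
  }

  def format(millis: Long): String = fmt.get().format(new Date(millis))
}
```

Each thread lazily gets its own formatter on first use, so concurrent tasks on a multi-core executor never share mutable parser state.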
On Oct 5, 2016, 20:12, "Sean Owen" wrote:
> I don't think this is guaranteed and don't think I'd
I think you are wrong about the port for the HDFS file; as I remember, the
default value is 8020, not 9000.
On Oct 4, 2016, 17:29, "Hafiz Mujadid" wrote:
> Hi,
>
> I am trying an example of structured streaming in Spark using the
> following piece of code,
>
> val spark = SparkSession
> .builder
> .a
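The quoted snippet is cut off; a minimal sketch of that kind of setup, using the 8020 default NameNode port mentioned in the reply (the app name, host, and path are assumptions):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder
  .appName("StructuredStreamingExample")  // assumed app name
  .getOrCreate()

// Streaming text source on HDFS; note the 8020 default NameNode
// port rather than 9000
val lines = spark.readStream
  .format("text")
  .load("hdfs://namenode:8020/input")
```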
>
> hence many elements with different keys fall into a single partition at
> times.
>
> Thanks,
> Sujeet
>
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-tasks-blockes-randomly-on-standalone-cluster-tp27693.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
Hello,
I would also set java opts for driver.
Best regards,
Denis
On Sep 4, 2016, 0:31, "Sourav Mazumder" <sourav.mazumde...@gmail.com> wrote:
> Hi,
>
> I am trying to create a RDD by using swebhdfs to a remote hadoop cluster
> which is protected by Knox and uses SSL.
>
> The code
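"Set java opts for driver" for an SSL-protected (Knox/swebhdfs) endpoint might look like the sketch below; the truststore path and password are placeholders:

```scala
import org.apache.spark.sql.SparkSession

// Point the driver JVM at a truststore that contains the Knox
// gateway's certificate, so the SSL handshake succeeds
val spark = SparkSession.builder
  .config("spark.driver.extraJavaOptions",
    "-Djavax.net.ssl.trustStore=/path/to/truststore.jks " +
    "-Djavax.net.ssl.trustStorePassword=changeit")
  .getOrCreate()
```

Note that in client mode the driver JVM is already running by the time this code executes, so the option would instead be passed on the command line via `spark-submit --conf spark.driver.extraJavaOptions=...`.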
==
>
> logData RDD takes *2.1 KB*
>
> errors RDD takes *1.3 KB*
>
> Regards
>
> Rohit Kumar Prusty
>
> +91-9884070075
>
>
>
A03031, OED-Employment Dev (031),
> 1979-10-24T00:00:00, 56705.00, 54135.44))
>
> Expecting Output:
>
> Need elements from the WrappedArray
>
> Below you can find the attachment of .json file
>
>
> ---------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
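Pulling elements out of a WrappedArray column is typically done with `explode`; a sketch assuming the arrays live in a column named `employees` (the column name and file path are assumptions, since the original message is truncated):

```scala
import org.apache.spark.sql.functions.{col, explode}

val df = spark.read.json("/path/to/file.json")  // path assumed

// explode turns each element of the WrappedArray into its own row;
// nested struct fields can then be selected with dot syntax
val exploded = df.select(explode(col("employees")).as("emp"))
exploded.select("emp.*").show()
```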