Re: Help me learn about JOB TASK and DAG in Apache Spark

2023-04-01 Thread Mich Talebzadeh
Good stuff Khalid. I have created a section in the Apache Spark Community Slack called spark-foundation (spark-foundation - Apache Spark Community - Slack). I invite you to add your weblink to that section. HTH

Re: Help me learn about JOB TASK and DAG in Apache Spark

2023-04-01 Thread Khalid Mammadov
Hey AN-TRUONG, I have some articles about this subject that should help, e.g. https://khalidmammadov.github.io/spark/spark_internals_rdd.html. Also check other Spark internals articles on the web. Regards, Khalid

Re: Help me learn about JOB TASK and DAG in Apache Spark

2023-03-31 Thread Mich Talebzadeh
Yes, history refers to completed jobs; 4040 covers the running jobs. You should have screenshots for executors and stages as well. HTH Mich Talebzadeh, Lead Solutions Architect/Engineering Lead, Palantir Technologies Limited. View my LinkedIn profile

Re: Help me learn about JOB TASK and DAG in Apache Spark

2023-03-31 Thread AN-TRUONG Tran Phan
Thank you for your information. I have tracked the Spark history server on port 18080 and the Spark UI on port 4040; I see the results of these two tools as similar, right? I want to know what each Task ID (e.g. Task ID 0, 1, 3, 4, 5, ...) in the images does; is that possible? https://i.stack.img
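For readers of this thread: each Task ID in those screenshots is one unit of work, namely one partition of one stage of a job. A minimal Scala sketch (hypothetical data) of how the numbering arises:

```scala
// One action = one job; each shuffle boundary starts a new stage;
// each stage runs one task per partition, numbered consecutively.
import org.apache.spark.sql.SparkSession

object TaskIdDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("TaskIdDemo").getOrCreate()
    val sc = spark.sparkContext

    // 4 partitions => the first stage runs tasks 0..3, one per partition
    val words = sc.parallelize(Seq("a", "b", "a", "c"), numSlices = 4)

    // reduceByKey adds a shuffle, so a second stage follows; its tasks
    // take the next Task IDs shown in the UI on port 4040
    val counts = words.map(w => (w, 1)).reduceByKey(_ + _)
    counts.collect().foreach(println)  // collect() is the action that submits the job

    spark.stop()
  }
}
```

Clicking a stage in the UI lists its tasks; the history server on 18080 shows the same breakdown for completed applications.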

Re: Help me learn about JOB TASK and DAG in Apache Spark

2023-03-31 Thread Mich Talebzadeh
Are you familiar with the Spark GUI, by default on port 4040? Have a look. HTH Mich Talebzadeh, Lead Solutions Architect/Engineering Lead, Palantir Technologies Limited. View my LinkedIn profile https://en.everybodywiki.com/Mich_Talebzadeh

Unusual bug, please help me, I can do nothing!!!

2022-03-30 Thread spark User
"Failed to initialize Spark session. org.apache.spark.SparkException: Invalid Spark URL: spark://HeartbeatReceiver@x.168.137.41:49963". When I add "x.168.137.41" to '/etc/hosts' it works fine, but after a "Ctrl+C" it cannot start normally again. Please help me
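A note on this error: an Invalid Spark URL for HeartbeatReceiver usually means the driver's hostname cannot be parsed or resolved, which is why the /etc/hosts edit helps. A hedged Scala sketch of pinning the driver address in configuration instead (the IP below is a stand-in, not the poster's masked address):

```scala
// Not a confirmed fix, just the usual workaround: give the driver an
// address Spark can always parse and resolve, so the internal
// spark://HeartbeatReceiver@host:port URL stays valid across restarts.
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

val conf = new SparkConf()
  .setAppName("HeartbeatUrlWorkaround")
  .set("spark.driver.host", "10.0.0.5")        // stand-in IP: use your machine's real address
  .set("spark.driver.bindAddress", "0.0.0.0")  // listen on all interfaces

val spark = SparkSession.builder.config(conf).getOrCreate()
```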

error bug, please help me!!!

2022-03-20 Thread spark User
"Failed to initialize Spark session. org.apache.spark.SparkException: Invalid Spark URL: spark://HeartbeatReceiver@x.168.137.41:49963". When I add "x.168.137.41" to '/etc/hosts' it works fine, but after a "Ctrl+C" it cannot start normally again. Please help me

please help me: when I write code to connect Kafka with Spark using Python and run it in Jupyter, an error is displayed

2018-09-16 Thread hager
I wrote code to connect Kafka with Spark using Python, and I run the code in Jupyter. My code: import os #os.environ['PYSPARK_SUBMIT_ARGS'] = '--jars /home/hadoop/Desktop/spark-program/kafka/spark-streaming-kafka-0-8-assembly_2.10-2.0.0-preview.jar pyspark-shell' os.environ['PYSPARK_SUBMIT_ARGS'] = "--pack
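Since the excerpt is cut off, here is a hedged Scala sketch of the same wiring with the 0-8 connector that the jar name suggests; the ZooKeeper address, group id and topic name are assumptions:

```scala
// Receiver-based Kafka stream from spark-streaming-kafka-0-8,
// matching the assembly jar referenced in the question.
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf().setAppName("KafkaDemo").setMaster("local[2]")
val ssc = new StreamingContext(conf, Seconds(10))

// topics map: topic name -> number of consumer threads
val stream = KafkaUtils.createStream(
  ssc, "localhost:2181", "demo-group", Map("test-topic" -> 1))

stream.map(_._2).print()  // print message values for each batch
ssc.start()
ssc.awaitTermination()
```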

Spark streaming giving me a bunch of WARNINGS, please help me understand them

2017-07-09 Thread shyla deshpande
WARN Use an existing SparkContext, some configuration may not take effect. I wanted to restart the Spark streaming app, so I stopped the running one and issued a new spark-submit. Why and how would it use an existing SparkContext? WARN Spark is not running in local mode, therefore the
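The first warning is about reuse of a live SparkContext inside the same JVM; a minimal sketch of the behaviour (illustrative, not the poster's code):

```scala
// SparkContext.getOrCreate returns the context that already exists in this
// JVM; a SparkConf passed on the second call is silently ignored, which is
// exactly what "some configuration may not take effect" warns about.
import org.apache.spark.{SparkConf, SparkContext}

val first = SparkContext.getOrCreate(
  new SparkConf().setAppName("app").setMaster("local[2]"))

val second = SparkContext.getOrCreate(
  new SparkConf().setAppName("other").set("spark.ui.port", "4050"))

assert(first eq second)  // same context; the new settings were dropped
```

A fresh spark-submit starts a new JVM, so seeing the warning there usually means the app itself reuses a context that is already up, for example one recreated from a streaming checkpoint.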

the function of countByValueAndWindow and foreachRDD in DStream, would you help me understand it please?

2017-06-27 Thread ??????????
Hi all, I have code like below: Logger.getLogger("org.apache.spark").setLevel(Level.ERROR) //Logger.getLogger("org.apache.spark.streaming.dstream").setLevel(Level.DEBUG) val conf = new SparkConf().setAppName("testDstream").setMaster("local[4]") //val sc = SparkContext.getOrCrea
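A hedged sketch of the two operators being asked about (assumed socket source; window and batch sizes are illustrative):

```scala
// countByValueAndWindow counts each distinct value over a sliding window;
// foreachRDD is the generic output operation that exposes the RDD behind
// each micro-batch.
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("testDstream").setMaster("local[4]")
val ssc = new StreamingContext(conf, Seconds(5))
ssc.checkpoint("/tmp/ckpt")  // windowed counting needs a checkpoint directory

val words = ssc.socketTextStream("localhost", 9999).flatMap(_.split(" "))

// counts per word over the last 30s, recomputed every 10s
val counted = words.countByValueAndWindow(Seconds(30), Seconds(10))

// runs on the driver once per batch; actions inside it run on the executors
counted.foreachRDD { rdd =>
  rdd.take(10).foreach { case (word, n) => println(s"$word -> $n") }
}

ssc.start()
ssc.awaitTermination()
```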

Re: the compile of Spark stopped without any hints, would you help me please?

2017-06-25 Thread Ted Yu
> repository\org\fusesource\leveldbjni\leveldbjni-all\1.8\leveldbjni-all-1.8.jar;C:\Users\shaof\.m2\repository\org\apache\commons\commons-lang3\3.5\commons-lang3-3.5.jar;C:\Users\shaof\.m2\repository\com\fasterxml\jackson\core\jackson-databind\2.6.5\jackson-databind-2.6.5.jar;C:\Users\shaof\.m2\rep

the compile of Spark stopped without any hints, would you help me please?

2017-06-25 Thread ??????????
\resources [INFO] Copying 3 resources [INFO] [INFO] --- scala-maven-plugin:3.2.2:compile (scala-compile-first) @ spark-network-common_2.11 --- <it stopped here for more than 30 minutes; I killed it and retried, and it stopped at the same point>

Please help me out!!!! Getting an error while trying to run a Hive Java generic UDF in Spark

2017-01-17 Thread Sirisha Cheruvu
Hi everyone, I am getting the error below while running a Hive Java UDF from the SQL context: org.apache.spark.sql.AnalysisException: No handler for Hive udf class com.nexr.platform.hive.udf.GenericUDFNVL2 because: com.nexr.platform.hive.udf.GenericUDFNVL2.; line 1 pos 26 at org.apache.spark.sql.hive.HiveFu
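For this class of error, the usual registration route looks like the sketch below (hedged: it assumes the UDF jar was shipped with --jars, and whether this particular class is accepted depends on why the analyzer rejected it):

```scala
// Register a Hive GenericUDF under a temporary name, then call it from SQL.
// "some_table" and the nvl2 arguments are hypothetical.
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)  // sc: an existing SparkContext

hiveContext.sql(
  "CREATE TEMPORARY FUNCTION nvl2 AS 'com.nexr.platform.hive.udf.GenericUDFNVL2'")
hiveContext.sql("SELECT nvl2(col1, 'not null', 'was null') FROM some_table").show()
```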

Re: Help me! Spark WebUI is corrupted!

2015-12-31 Thread Aniket Bhatnagar
of active jobs. It seems there is something missing in my operating system. I googled it but found nothing. Could anybody help me?

Help me! Spark WebUI is corrupted!

2015-12-31 Thread LinChen
Screenshot1 (Normal WebUI) Screenshot2 (Corrupted WebUI) As Screenshot2 shows, the format of my Spark WebUI looks strange and I cannot click the description of active jobs. It seems there is something missing in my operating system. I googled it but found nothing. Could anybody help me

anyone who can help me out with this error please

2015-12-04 Thread Mich Talebzadeh
Hi, I am trying to make Hive work with Spark. I have been told that I need to use Spark 1.3 and build it from source code WITHOUT HIVE libraries. I have built it as follows: ./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.4,parquet-p

Re: Please help me understand TF-IDF Vector structure

2015-03-14 Thread Xi Shen
> 48576,[0,4,7,8,10,13,17,21],[...some float numbers...]) ... I think it is a tuple with 3 elements. - I have no idea what the 1st element is... - I think the 2nd element is a list of the words - I think the 3rd element is a list of the tf-idf values of the words in the previous list. Please help me understand this structure. Thanks, David

Please help me understand TF-IDF Vector structure

2015-03-14 Thread Xi Shen
tuple with 3 elements. - I have no idea what the 1st element is... - I think the 2nd element is a list of the words - I think the 3rd element is a list of the tf-idf values of the words in the previous list. Please help me understand this structure. Thanks, David
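For what it is worth, the printed value is MLlib's SparseVector rather than a tuple: its toString is (size,[indices],[values]). The first element is the vector length (HashingTF's default feature dimension is 1048576, which matches the "...48576" in the truncated excerpt above), the second the positions of the non-zero terms, the third their tf-idf weights. A small sketch with made-up weights:

```scala
// Rebuilding a vector of the printed shape by hand; the weights are invented.
import org.apache.spark.mllib.linalg.Vectors

val v = Vectors.sparse(
  1048576,                                        // size: default numFeatures of HashingTF
  Array(0, 4, 7, 8, 10, 13, 17, 21),              // indices: hashed term positions
  Array(0.3, 1.2, 0.8, 0.1, 2.0, 0.5, 0.9, 1.1))  // values: tf-idf weights

println(v)  // (1048576,[0,4,7,8,10,13,17,21],[0.3,1.2,0.8,0.1,2.0,0.5,0.9,1.1])
```

So the second element is not the words themselves but the hash buckets they map to.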

Re: Help me understand the partition, parallelism in Spark

2015-02-26 Thread Yana Kadiyska
> I think lowering the core is to exchange lower memory usage vs speed. Hope my understanding is correct. Thanks Yong -- Date: Thu, 26 Feb 2015 17:03:20 -0500 Subject: Re: Help me understand the partition, parallelism in Spark From: yana.kadiy...@gm

RE: Help me understand the partition, parallelism in Spark

2015-02-26 Thread java8964
m". I think lowering the core is to exchange lower memory usage vs speed. Hope my understanding is correct. Thanks Yong Date: Thu, 26 Feb 2015 17:03:20 -0500 Subject: Re: Help me understand the partition, parallelism in Spark From: yana.kadiy...@gmail.com To: iras...@cloudera.com CC: java8...@hotm

Re: Help me understand the partition, parallelism in Spark

2015-02-26 Thread Zhan Zhang
Here is my understanding. When running on top of YARN, the cores setting means the number of tasks that can run in one executor, but all these cores are located in the same JVM. Parallelism typically controls the balance of tasks. For example, if you have 200 cores but only 50 partitions, there will be 150
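A small sketch of the imbalance described here; the numbers and the path are illustrative:

```scala
// With more task slots than partitions, the extra slots sit idle, so either
// repartition or pass an explicit partition count to the wide operation.
val data = sc.textFile("hdfs:///some/input")   // hypothetical input

// Suppose the cluster offers 200 slots but the file yields 50 partitions:
// only 50 tasks can run at once and 150 slots stay idle.
val wide = data.repartition(200)               // shuffle into 200 partitions
println(wide.partitions.size)                  // 200
```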

Re: Help me understand the partition, parallelism in Spark

2015-02-26 Thread Yana Kadiyska
Imran, I have also observed the phenomenon of reducing the cores helping with OOM. I wanted to ask this (hopefully without straying off topic): we can specify the number of cores and the executor memory. But we don't get to specify _how_ the cores are spread among executors. Is it possible that wi

Re: Help me understand the partition, parallelism in Spark

2015-02-26 Thread Imran Rashid
Hi Yong, mostly correct except for: "Since we are doing reduceByKey, shuffling will happen. Data will be shuffled into 1000 partitions, as we have 1000 unique keys." No, you will not get 1000 partitions. Spark has to decide how many partitions to use before it even knows how many
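A sketch of that point (path and numbers illustrative): the shuffle's partition count is fixed before Spark ever sees the keys, so it comes from an explicit argument, the parent RDD, or spark.default.parallelism, never from the number of unique keys.

```scala
// reduceByKey picks its partitioner up front, independent of key cardinality.
val pairs = sc.textFile("hdfs:///some/input")
  .map(line => (line.split("\t")(0), 1))

val byDefault = pairs.reduceByKey(_ + _)        // parent's count or default parallelism
println(byDefault.partitions.size)

val explicit = pairs.reduceByKey(_ + _, 1000)   // exactly 1000 partitions, whatever the keys
println(explicit.partitions.size)               // 1000
```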

RE: Help me understand the partition, parallelism in Spark

2015-02-26 Thread java8964
Can anyone share any thoughts related to my questions? Thanks. From: java8...@hotmail.com To: user@spark.apache.org Subject: Help me understand the partition, parallelism in Spark Date: Wed, 25 Feb 2015 21:58:55 -0500 Hi, Sparkers: I come from the Hadoop MapReduce world, and try to understand

Help me understand the partition, parallelism in Spark

2015-02-25 Thread java8964
Hi, Sparkers: I come from the Hadoop MapReduce world and am trying to understand some of Spark's internals. From the web and this list, I keep seeing people talking about increasing the parallelism if you get an OOM error. I have tried to read the documentation as much as possible to understand the RDD pa

Re: Please help me get started on Apache Spark

2014-11-20 Thread Guibert. J Tchinde
For Spark, you can start with a new book like: https://www.safaribooksonline.com/library/view/learning-spark/9781449359034/ch01.html I think the paper book is out now. You can also have a look at the tutorials and documentation guides available at: https://spark.apache.org/docs/1.1.0/mllib-guide.html Th

Re: Please help me get started on Apache Spark

2014-11-20 Thread Darin McBeath
Take a look at the O'Reilly Learning Spark (Early Release) book.  I've found this very useful. Darin. From: Saurabh Agrawal To: "user@spark.apache.org" Sent: Thursday, November 20, 2014 9:04 AM Subject: Please help me get started on Apache Spark   Friends,

Please help me get started on Apache Spark

2014-11-20 Thread Saurabh Agrawal
Friends, I am pretty new to Spark, as much as to Scala, MLlib and the entire Hadoop stack!! It would be of so much help if I could be pointed to some good books on Spark and MLlib. Further, does MLlib support any algorithms for B2B cross-sell/upsell or customer retention (out of the box, preferably)

Re: Can anyone help me set memory for standalone cluster?

2014-06-01 Thread Aaron Davidson
0.jar" > "-Xms512M" "-Xmx512M" > "org.apache.spark.executor.CoarseGrainedExecutorBackend" > > The memory seems to be the default number, not 1600M. > I don't how to make SPARK_WORKER_MEMORY work. > Can anyone help me? > Many thanks in advance. > > Yunmeng >

Can anyone help me set memory for standalone cluster?

2014-06-01 Thread Yunmeng Ban
"-cp" ":/~path/spark/conf:/~path/spark/assembly/target/scala-2.10/spark-assembly_2.10-0.9.1-hadoop2.2.0.jar" "-Xms512M" "-Xmx512M" "org.apache.spark.executor.CoarseGrainedExecutorBackend" The memory seems to be the default number, not 1600M. I don

help me: Out of memory when spark streaming

2014-05-16 Thread Francis . Hu
Of Memory after a few moments. I tried to adjust the JVM GC arguments to speed up the GC process. It made a small difference in performance, but the workers eventually hit OOM. Is there any way to resolve this? It would be appreciated if anyone could help me get it fixed! Thanks

Re: help me

2014-05-03 Thread Chris Fregly
sometimes Spark switches into node_local mode from process_local and it becomes 10x faster. I am very confused. scala> val a = sc.textFile("/user/exobrain/batselem/LUBM1000") scala> f.count() Long = 137805557 took 130.809661618 s
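One knob that bears on the PROCESS_LOCAL versus NODE_LOCAL behaviour discussed here, offered as a hedged guess rather than a diagnosis: spark.locality.wait controls how long the scheduler holds a task hoping for a better-locality slot before falling back a level.

```scala
// Values are illustrative, not recommendations.
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("LocalityDemo")
  .set("spark.locality.wait", "3000")      // ms to wait per locality level
  .set("spark.locality.wait.node", "0")    // 0 = don't wait at the node level

val sc = new SparkContext(conf)
val a = sc.textFile("/user/exobrain/batselem/LUBM1000")
println(a.count())
```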

Re: help me

2014-05-02 Thread Mayur Rustagi
er/exobrain/batselem/LUBM1000") scala> f.count() Long = 137805557 took 130.809661618 s

help me

2014-04-22 Thread Joe L
= 137805557 took 130.809661618 s -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/help-me-tp4598.html Sent from the Apache Spark User List mailing list archive at Nabble.com.