Re: Re: EXT: Dual Write to HDFS and MinIO in faster way

2024-05-21 Thread eab...@163.com
Hi, I think you should write to HDFS and then copy the files (Parquet or ORC) from HDFS to MinIO. eabour From: Prem Sahoo Date: 2024-05-22 00:38 To: Vibhor Gupta; user Subject: Re: EXT: Dual Write to HDFS and MinIO in faster way On Tue, May 21, 2024 at 6:58 AM Prem Sahoo wrote: Hello Vibhor, Th
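A minimal sketch of that write-then-copy approach, using the Hadoop FileSystem/FileUtil API; the `df`/`spark` handles, paths, bucket name, and any s3a endpoint/credential settings are assumptions, not part of the original mail:

```scala
import java.net.URI
import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}

// write once to HDFS, then copy the committed files to MinIO over s3a
df.write.mode("overwrite").parquet("hdfs:///warehouse/events")

val conf  = spark.sparkContext.hadoopConfiguration
val srcFs = FileSystem.get(URI.create("hdfs:///"), conf)
val dstFs = FileSystem.get(URI.create("s3a://my-bucket/"), conf)
FileUtil.copy(srcFs, new Path("hdfs:///warehouse/events"),
              dstFs, new Path("s3a://my-bucket/warehouse/events"),
              false /* deleteSource */, conf)
```

The point of the suggestion is to pay the write path only once and treat the MinIO side as a bulk file transfer rather than a second full write.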

Re: Re: [EXTERNAL] Re: Spark-submit without access to HDFS

2023-11-15 Thread eab...@163.com
hdfs-site.xml, for instance, fs.oss.impl, etc. eabour From: Eugene Miretsky Date: 2023-11-16 09:58 To: eab...@163.com CC: Eugene Miretsky; user @spark Subject: Re: [EXTERNAL] Re: Spark-submit without access to HDFS Hey! Thanks for the response. We are getting the error because there is no ne

Re: Spark-submit without access to HDFS

2023-11-15 Thread eab...@163.com
Hi Eugene, I think you should check whether the HDFS service is running properly. From the logs, it appears that there are two datanodes in HDFS, but none of them are healthy. Please investigate why the datanodes are not functioning properly. It seems that the issue might be due t

Re: Re: jackson-databind version mismatch

2023-11-02 Thread eab...@163.com
.jar 2023/09/09 10:08 513,968 jackson-module-scala_2.12-2.15.2.jar eabour From: Bjørn Jørgensen Date: 2023-11-02 16:40 To: eab...@163.com CC: user @spark; Saar Barhoom; moshik.vitas Subject: Re: jackson-databind version mismatch [SPARK-43225][BUILD][SQL] Remove jackson-core-asl and

Re: jackson-databind version mismatch

2023-11-01 Thread eab...@163.com
Hi, Please check the versions of the jar files starting with "jackson-". Make sure all versions are consistent. Jackson jar list in spark-3.3.0: 2022/06/10 04:37 75,714 jackson-annotations-2.13.3.jar 2022/06/10 04:37 374,895 jackson-core-2.13.3.jar 2022/06/
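If it is unclear which copy of a Jackson class actually wins on the classpath, a quick check (a sketch; run it in spark-shell) is to ask the JVM where the class was loaded from:

```scala
// prints the jar that jackson-databind's ObjectMapper was actually loaded from
val loc = classOf[com.fasterxml.jackson.databind.ObjectMapper]
  .getProtectionDomain.getCodeSource.getLocation
println(loc)
```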

[Resolved] Re: spark.stop() cannot stop spark connect session

2023-10-25 Thread eab...@163.com
d. eabour From: eab...@163.com Date: 2023-10-20 15:56 To: user @spark Subject: spark.stop() cannot stop spark connect session Hi, my code: from pyspark.sql import SparkSession spark = SparkSession.builder.remote("sc://172.29.190.147").getOrCreate() import pandas as pd # create pa

submitting tasks failed in Spark standalone mode due to missing failureaccess jar file

2023-10-23 Thread eab...@163.com
Hi Team. I use Spark 3.5.0 to start a Spark cluster with start-master.sh and start-worker.sh. When I use ./bin/spark-shell --master spark://LAPTOP-TC4A0SCV.:7077 I get error logs: ``` 23/10/24 12:00:46 ERROR TaskSchedulerImpl: Lost an executor 1 (already removed): Command exited with code

spark.stop() cannot stop spark connect session

2023-10-20 Thread eab...@163.com
Hi, my code: from pyspark.sql import SparkSession spark = SparkSession.builder.remote("sc://172.29.190.147").getOrCreate() import pandas as pd # create a pandas dataframe pdf = pd.DataFrame({ "name": ["Alice", "Bob", "Charlie"], "age": [25, 30, 35], "gender": ["F", "M", "M"] }) # convert the pandas

Re: Re: Running Spark Connect Server in Cluster Mode on Kubernetes

2023-10-19 Thread eab...@163.com
start the spark connect server as a service for client tests. So, I believe that by configuring the spark.plugins and starting the Spark cluster on Kubernetes, clients can utilize sc://ip:port to connect to the remote server. Let me give it a try. eabour From: eab...@163.com Date: 2023

Re: Re: Running Spark Connect Server in Cluster Mode on Kubernetes

2023-10-18 Thread eab...@163.com
Hi all, Has the functionality for running the Spark Connect server on k8s been implemented? From: Nagatomi Yasukazu Date: 2023-09-05 17:51 To: user Subject: Re: Running Spark Connect Server in Cluster Mode on Kubernetes Dear Spark Community, I've been exploring the capabilities of the Spark Conn

Unsubscribe

2023-07-08 Thread yixu2...@163.com
Unsubscribe yixu2...@163.com

UNSUBSCRIBE

2022-12-13 Thread yixu2...@163.com
UNSUBSCRIBE

[no subject]

2022-12-13 Thread yixu2...@163.com
UNSUBSCRIBE yixu2...@163.com

Running 30 Spark applications at the same time is slower than one on average

2022-10-26 Thread eab...@163.com
Hi All, I have a CDH 5.16.2 Hadoop cluster with 1+3 nodes (64C/128G, 1 NN/RM + 3 DN/NM), and YARN with 192C/240G. I used the following test scenario: 1. Spark app resources of 2G driver memory / 2C driver vcores / 1 executor / 2G executor memory / 2C executor vcores. 2. One Spark app will use 5G4C on yar

Re: Re: [how to]RDD using JDBC data source in PySpark

2022-09-19 Thread javaca...@163.com
D javaca...@163.com From: Bjørn Jørgensen Date: 2022-09-19 18:34 To: javaca...@163.com CC: Xiao, Alton; user@spark.apache.org Subject: Re: Re: [how to]RDD using JDBC data source in PySpark https://www.projectpro.io/recipes/save-dataframe-mysql-pyspark and https://towardsdatascience.com/pyspark

Re: Re: [how to]RDD using JDBC data source in PySpark

2022-09-19 Thread javaca...@163.com
Thank you for the answer, Alton. But I see that it uses Scala to implement it. I know Java/Scala can get data from MySQL using JdbcRDD fairly well, but I want to do the same in PySpark. Could you give me more advice? Many thanks. javaca...@163.com From: Xiao, Alton Date: 2022-09-19 18

[how to]RDD using JDBC data source in PySpark

2022-09-19 Thread javaca...@163.com
...@163.com

Which manufacturers' GPUs support Spark?

2022-02-16 Thread 15927907...@163.com
e GPU supported by Spark (https://spark.apache.org/docs/3.2.1/configuration.html#custom-resource-scheduling-and-configuration-overview). So, can Spark also support GPUs from other manufacturers, such as AMD? Looking forward to your reply. 15927907...@163.com
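For context, the custom resource scheduling linked above is vendor-neutral: Spark only runs a discovery script and hands the reported device addresses to tasks. A sketch of the relevant settings; the script path and amounts below are examples, not defaults:

```scala
import org.apache.spark.sql.SparkSession

// custom resource scheduling: Spark does not care about the GPU vendor,
// it only executes the discovery script and schedules the reported addresses
val spark = SparkSession.builder()
  .appName("gpu-scheduling-example")
  .config("spark.executor.resource.gpu.amount", "1")
  .config("spark.task.resource.gpu.amount", "1")
  .config("spark.executor.resource.gpu.discoveryScript", "/opt/spark/getGpus.sh")
  .getOrCreate()
```

Whether a particular acceleration library (e.g., the RAPIDS plugin, which is NVIDIA-specific) works on AMD hardware is a separate question from the scheduling itself.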

Execution efficiency slows down as the number of CPU cores increases

2022-02-10 Thread 15927907...@163.com
ses, they are all consistent. Or why is the execution efficiency almost unchanged after the number of cores increases beyond a certain point? Looking forward to your reply. 15927907...@163.com

Spark usage help

2021-09-01 Thread yinghua...@163.com
Hi: I found that the following methods are used when setting parameters to create a SparkSession that accesses a Hive table: 1) hive.execution.engine: spark spark = SparkSession.builder() .appName("get data from hive") .config("hive.execution.engine", "spark") .enableHiveSupport() .getOrCreate()

Re: How can I config hive.metastore.warehouse.dir

2021-08-11 Thread eab...@163.com
-ff7ed3cd4076 2021-08-12 09:21:22 INFO SessionState:641 - Created HDFS directory: /tmp/hive/hdfs/8bc342dd-aa0b-407b-b9ad-ff7ed3cd4076/_tmp_space.db === eab...@163.com From: igyu Date: 2021-08-12 11:33 To: user Subject: How can I config hive.metastore.warehouse.dir
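Worth noting for this thread: since Spark 2.0, spark.sql.warehouse.dir supersedes hive.metastore.warehouse.dir for databases Spark creates. A minimal sketch (the path is an example):

```scala
import org.apache.spark.sql.SparkSession

// set the warehouse location before the first SparkSession is created;
// hive.metastore.warehouse.dir is ignored in favor of this key since Spark 2.0
val spark = SparkSession.builder()
  .config("spark.sql.warehouse.dir", "hdfs:///user/hive/warehouse")
  .enableHiveSupport()
  .getOrCreate()
```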

Flink 1.11.3: help with extracting data from Kafka to Hive

2021-02-03 Thread yinghua...@163.com
I have not seen an example of going from Kafka to Hive in Flink SQL on the official site. I looked through the test code, but it requires special handling in your own code. Could anyone give me an example of using the Hive connector directly in Flink SQL? yinghua...@163.com

Is there a better way to read kerberized impala tables by spark jdbc?

2020-12-07 Thread eab...@163.com
is way required an absolute jaas.conf file and keytab file; in other words, these files must be placed at the same path on each node. Is there a better way? Please help. Regards eab...@163.com

spark jar version problem

2019-05-13 Thread wangyongqiang0...@163.com
s the only client jar for Hive and the Spark Thrift Server. Is there any problem? 2. Is there any plan to provide hive-jdbc-2.3.3.spark2.jar in spark-2.3.1 for the corresponding hive-2.3.3? wangyongqiang0...@163.com

[Spark SQL] was it correct that only one executor was used to shuffle the data for reduce task?

2018-06-25 Thread des...@163.com
ze / Records' value (about 100GB), others were zero. I saw it as there being only one reduce (shuffle) task reading and writing the data. Is this correct? (If the information is not enough, please tell me.) By Luo Hai Best Wishes! des...@163.com

How to persist a database/table created in a SparkSession

2017-12-04 Thread 163
Hi, How can I persist a database/table created in a Spark application? object TestPersistentDB { def main(args: Array[String]): Unit = { val spark = SparkSession.builder() .appName("Create persistent table") .config("spark.
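A minimal sketch of the usual answer, assuming a persistent Hive metastore is available: build the session with enableHiveSupport() and write with saveAsTable, so the table outlives the application:

```scala
import org.apache.spark.sql.SparkSession

// tables survive the application only when the catalog is backed by a real
// metastore; with the default in-memory catalog they vanish on exit
val spark = SparkSession.builder()
  .appName("Create persistent table")
  .enableHiveSupport()
  .getOrCreate()

spark.sql("CREATE DATABASE IF NOT EXISTS mydb")
spark.range(10).write.saveAsTable("mydb.numbers") // persisted in the metastore
```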

SparkSQL not support CharType

2017-11-22 Thread 163
Hi, when I use a DataFrame with this table schema, it goes wrong: val test_schema = StructType(Array( StructField("id", IntegerType, false), StructField("flag", CharType(1), false), StructField("time", DateType, false))); val df = spark.read.format("com.databricks.spark.csv") .schema(test_s
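Indeed, CharType is not a type users can put in a data-source schema. A sketch of the common workaround, swapping in StringType (pad or trim in a query afterwards if the fixed width matters):

```scala
import org.apache.spark.sql.types._

// StringType in place of CharType(1); Spark SQL data-source schemas
// do not accept CharType directly
val test_schema = StructType(Array(
  StructField("id", IntegerType, false),
  StructField("flag", StringType, false), // was CharType(1)
  StructField("time", DateType, false)))
```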

Re: Re: GraphFrame not init vertices when load edges

2016-12-18 Thread zjp_j...@163.com
Graph from specified edges, which I think is a good way, but the GraphLoader.edgeListFile load format does not allow setting an edge attribute in the edge file. So I want to know whether GraphFrames has any plan about it, or better ways. Thanks zjp_j...@163.com From: Felix Cheung Date: 2016-12-19 1

GraphFrame not init vertices when load edges

2016-12-18 Thread zjp_j...@163.com
t;b", "follow"), ("f", "c", "follow"), ("e", "f", "follow"), ("e", "d", "friend"), ("d", "a", "friend"), ("a", "e", "friend") )).toDF("src", "dst", "relationship") zjp_j...@163.com

Re: Java to show struct field from a Dataframe

2016-12-17 Thread zjp_j...@163.com
I think the cause is your invalid Double data. Have you checked your data? zjp_j...@163.com From: Richard Xin Date: 2016-12-17 23:28 To: User Subject: Java to show struct field from a Dataframe let's say I have a DataFrame with a schema of the following: root |-- name: string (nul

How to load edge with properties file useing GraphX

2016-12-15 Thread zjp_j...@163.com
t that can load an edge file with no properties and then join vertex properties to create a Graph. So the issue is how to then attach edge properties. Thanks. zjp_j...@163.com
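A sketch of loading edges with properties directly, sidestepping GraphLoader: parse the file into Edge objects and let Graph.fromEdges derive the vertices. The "srcId dstId attr" whitespace-separated format is an assumption:

```scala
import org.apache.spark.graphx.{Edge, Graph}

// each line: "srcId dstId attr"; the third field becomes the edge property
val edges = sc.textFile("hdfs:///edges.txt").map { line =>
  val Array(src, dst, attr) = line.split("\\s+")
  Edge(src.toLong, dst.toLong, attr)
}

// vertices are derived from the edge endpoints, with a default attribute
val graph = Graph.fromEdges(edges, defaultValue = "unknown")
```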

Re: Re: how to tuning spark shuffle

2016-07-18 Thread lizhenm...@163.com
Hi, Can you print out the environment tab on your UI? By default spark-sql runs in local mode, which means that you only have one driver and one executor in one JVM. Did you increase the executor memory through SET spark.executor.memory=xG? Adjust it and run the SQL again. HTH Dr Mich Taleb

how to convert an RDD[Array[Double]] to RDD[Double]

2016-03-14 Thread lizhenm...@163.com
it to RDD[Double]. Thanks. lizhenm...@163.com
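A one-liner answers this: flatMap flattens each array into its elements. A minimal sketch for spark-shell:

```scala
import org.apache.spark.rdd.RDD

val arrays: RDD[Array[Double]] =
  sc.parallelize(Seq(Array(1.0, 2.0), Array(3.0, 4.0)))

// flatten RDD[Array[Double]] into RDD[Double]
val doubles: RDD[Double] = arrays.flatMap(a => a)
```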

Re: RE: How to compile Spark with private build of Hadoop

2016-03-08 Thread fightf...@163.com
customized Hadoop jar and the related pom.xml to a Nexus repository. Check the link for reference: https://books.sonatype.com/nexus-book/reference/staging-deployment.html fightf...@163.com From: Lu, Yingqi Date: 2016-03-08 15:23 To: fightf...@163.com; user Subject: RE: How to compile Spark

Re: How to compile Spark with private build of Hadoop

2016-03-07 Thread fightf...@163.com
I think you can establish your own Maven repository and deploy your modified Hadoop binary jar with your modified version number. Then you can add your repository to Spark's pom.xml and use mvn -Dhadoop.version= fightf...@163.com From: Lu, Yingqi Date: 2016-03-08 15:09 To: user

Re: spark 1.6 Not able to start spark

2016-02-22 Thread fightf...@163.com
I think this may be some permission issue. Check your Spark conf for the Hadoop-related settings. fightf...@163.com From: Arunkumar Pillai Date: 2016-02-23 14:08 To: user Subject: spark 1.6 Not able to start spark Hi, when I try to start spark-shell I'm getting the following error: Exception in thread

Re: Re: About cache table performance in spark sql

2016-02-04 Thread fightf...@163.com
Oh, thanks. Makes sense to me. Best, Sun. fightf...@163.com From: Takeshi Yamamuro Date: 2016-02-04 16:01 To: fightf...@163.com CC: user Subject: Re: Re: About cache table performance in spark sql Hi, Parquet data is columnar and highly compressed, so the size of the deserialized rows in

Re: Re: About cache table performance in spark sql

2016-02-03 Thread fightf...@163.com
? From Impala I get the overall Parquet file size of about 24.59GB. Would be good to have some correction on this. Best, Sun. fightf...@163.com From: Prabhu Joseph Date: 2016-02-04 14:35 To: fightf...@163.com CC: user Subject: Re: About cache table performance in spark sql Sun, When

About cache table performance in spark sql

2016-02-03 Thread fightf...@163.com
age cannot hold the 24.59GB+ table in memory. But why is the performance so different and even so bad? Best, Sun. fightf...@163.com

Re: Re: clear cache using spark sql cli

2016-02-03 Thread fightf...@163.com
...@163.com From: Ted Yu Date: 2016-02-04 11:49 To: fightf...@163.com CC: user Subject: Re: Re: clear cache using spark sql cli In spark-shell, I can do: scala> sqlContext.clearCache() Is that not the case for you? On Wed, Feb 3, 2016 at 7:35 PM, fightf...@163.com wrote: Hi, Ted Yes. I had s

Re: Re: clear cache using spark sql cli

2016-02-03 Thread fightf...@163.com
Hi, Ted. Yes, I had seen that issue. But it seems that the spark-sql CLI cannot run a command like sqlContext.clearCache(). Is this right? In the spark-sql CLI I can only run SQL queries. So I want to see if there are any available options to achieve this. Best, Sun. fightf...@163.com

clear cache using spark sql cli

2016-02-03 Thread fightf...@163.com
Hi, How can I clear the cache (execute a SQL query without any cache) using the Spark SQL CLI? Is there any command available? Best, Sun. fightf...@163.com
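For reference, a sketch of the programmatic calls from this thread, with the SQL-statement equivalents that later Spark versions accept directly in the spark-sql CLI (`t` is a placeholder table name):

```scala
// cache control from a SQLContext (spark-shell)
sqlContext.cacheTable("t")   // SQL equivalent: CACHE TABLE t;
sqlContext.uncacheTable("t") // SQL equivalent: UNCACHE TABLE t;
sqlContext.clearCache()      // SQL equivalent: CLEAR CACHE;
```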

Re: Re: spark dataframe jdbc read/write using dbcp connection pool

2016-01-20 Thread fightf...@163.com
377,769 milliseconds ago. The last packet sent successfully to the server was 377,790 milliseconds ago. Do I need to increase the partitions? Or shall I write a Parquet file for each partition in an iterative way? Thanks a lot for your advice. Best, Sun. fightf...@163.com From: 刘虓 Date

Re: Re: spark dataframe jdbc read/write using dbcp connection pool

2016-01-20 Thread fightf...@163.com
sfully. Do I need to increase the partitions? Or are there any other alternatives I can choose to tune this? Best, Sun. fightf...@163.com From: fightf...@163.com Date: 2016-01-20 15:06 To: 刘虓 CC: user Subject: Re: Re: spark dataframe jdbc read/write using dbcp connection pool Hi, Thanks a lot

Re: Re: spark dataframe jdbc read/write using dbcp connection pool

2016-01-19 Thread fightf...@163.com
4") The added_year column in mysql table contains range of (1985-2015), and I pass the numPartitions property to get the partition purpose. Is this what you recommend ? Can you advice a little more implementation on this ? Best, Sun. fightf...@163.com From: 刘虓 Date: 2016-01-20 11:26

spark dataframe jdbc read/write using dbcp connection pool

2016-01-19 Thread fightf...@163.com
1 in stage 0.0 (TID 2) com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure fightf...@163.com

spark dataframe read large mysql table running super slow

2016-01-06 Thread fightf...@163.com
rTempTable("video_test") sqlContext.sql("select count(1) from video_test").show() Overally the load process would stuck and get connection timeout. Mysql table hold about 100 million records. Would be happy to provide more usable info. Best, Sun. fightf...@163.com

Re: Spark 1.5.2 compatible spark-cassandra-connector

2015-12-29 Thread fightf...@163.com
Hi, Vivek M. I have tried the 1.5.x spark-cassandra-connector and indeed encountered some classpath issues, mainly with the guava dependency. I believe that can be solved by some Maven config, but I have not tried that yet. Best, Sun. fightf...@163.com From: vivek.meghanat...@wipro.com Date

Re: How can I get the column data based on a specific column name and then store these data in an array or list?

2015-12-25 Thread fightf...@163.com
Emm... I think you can do a df.map and store each column value in your list. fightf...@163.com From: zml张明磊 Date: 2015-12-25 15:33 To: user@spark.apache.org CC: dev-subscr...@spark.apache.org Subject: How can I get the column data based on specific column name and then stored these data in array

Re: Re: Spark assembly in Maven repo?

2015-12-11 Thread fightf...@163.com
Agree with you that an assembly jar is not good to publish. However, what he really needs is to fetch an updatable Maven jar file. fightf...@163.com From: Mark Hamstra Date: 2015-12-11 15:34 To: fightf...@163.com CC: Xiaoyong Zhu; Jeff Zhang; user; Zhaomin Xu; Joe Zhang (SDE) Subject: Re: RE

Re: RE: Spark assembly in Maven repo?

2015-12-10 Thread fightf...@163.com
Using Maven to download the assembly jar is fine. I would recommend deploying this assembly jar to your local Maven repo, e.g. a Nexus repo, or more likely a snapshot repository. fightf...@163.com From: Xiaoyong Zhu Date: 2015-12-11 15:10 To: Jeff Zhang CC: user@spark.apache.org; Zhaomin Xu

count distinct in spark sql aggregation

2015-12-09 Thread fightf...@163.com
t and got the daily distinct count. However, I am not sure whether this implementation is an efficient workaround. Hope someone can shed a little light on this. Best, Sun. fightf...@163.com
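For reference, a sketch of the two usual shapes: an exact per-day COUNT(DISTINCT ...) and the cheaper approximate variant. The table and column names are placeholders:

```scala
import org.apache.spark.sql.functions.approxCountDistinct

// exact: shuffles the distinct keys for each day
sqlContext.sql(
  "SELECT dt, COUNT(DISTINCT user_id) AS uv FROM events GROUP BY dt").show()

// approximate: HyperLogLog-based, much cheaper at scale
sqlContext.table("events")
  .groupBy("dt")
  .agg(approxCountDistinct("user_id").as("uv_approx"))
  .show()
```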

Re: Re: About Spark On Hbase

2015-12-09 Thread fightf...@163.com
using this. fightf...@163.com From: censj Date: 2015-12-09 15:44 To: fightf...@163.com CC: user@spark.apache.org Subject: Re: About Spark On Hbase So, how do I get this jar? I use an sbt package project; I did not find the sbt lib. On 2015-12-09 at 15:42, fightf...@163.com wrote: I don't think it really nee

Re: Re: About Spark On Hbase

2015-12-08 Thread fightf...@163.com
I don't think it really needs the CDH component. Just use the API. fightf...@163.com From: censj Date: 2015-12-09 15:31 To: fightf...@163.com CC: user@spark.apache.org Subject: Re: About Spark On Hbase But this is dependent on CDH. I did not install CDH. On 2015-12-09 at 15:18, fightf...@163.com wrote: Act

Re: About Spark On Hbase

2015-12-08 Thread fightf...@163.com
Actually you can refer to https://github.com/cloudera-labs/SparkOnHBase Also, HBASE-13992 already integrates that feature into the HBase side, but that feature has not been released. Best, Sun. fightf...@163.com From: censj Date: 2015-12-09 15:04 To: user@spark.apache.org Subject: About

Re: Re: spark sql cli query results written to file ?

2015-12-03 Thread fightf...@163.com
Well, sorry for the late response and thanks a lot for pointing out the clue. fightf...@163.com From: Akhil Das Date: 2015-12-03 14:50 To: Sahil Sareen CC: fightf...@163.com; user Subject: Re: spark sql cli query results written to file ? Oops, 3 mins late. :) Thanks Best Regards On Thu, Dec 3

spark sql cli query results written to file ?

2015-12-02 Thread fightf...@163.com
Hi, How can I save the results of queries run in the Spark SQL CLI and write them to some local file? Is there any available command? Thanks, Sun. fightf...@163.com

Re: New to Spark

2015-12-01 Thread fightf...@163.com
hive config, that would help to locate the root cause of the problem. Best, Sun. fightf...@163.com From: Ashok Kumar Date: 2015-12-01 18:54 To: user@spark.apache.org Subject: New to Spark Hi, I am new to Spark. I am trying to use spark-sql with SPARK CREATED and HIVE CREATED tables. I have

Re: RE: error while creating HiveContext

2015-11-27 Thread fightf...@163.com
Could you provide your hive-site.xml file info? Best, Sun. fightf...@163.com From: Chandra Mohan, Ananda Vel Murugan Date: 2015-11-27 17:04 To: fightf...@163.com; user Subject: RE: error while creating HiveContext Hi, I verified and I could see hive-site.xml in the spark conf directory

Re: error while creating HiveContext

2015-11-26 Thread fightf...@163.com
Hi, I think you just need to put hive-site.xml in the spark/conf directory, and it will be loaded onto the Spark classpath. Best, Sun. fightf...@163.com From: Chandra Mohan, Ananda Vel Murugan Date: 2015-11-27 15:04 To: user Subject: error while creating HiveContext Hi, I am building a

Re: Spark Thrift doesn't start

2015-11-10 Thread fightf...@163.com
I think the exception info makes it clear that you may be missing some Tez-related jar on the Spark Thrift Server classpath. fightf...@163.com From: DaeHyun Ryu Date: 2015-11-11 14:47 To: user Subject: Spark Thrift doesn't start Hi folks, I configured Tez as the execution engine of Hive. After

Re: Re: OLAP query using spark dataframe with cassandra

2015-11-09 Thread fightf...@163.com
Hi, Have you ever considered Cassandra as a replacement? We now have almost the same usage as your engine, e.g. using MySQL to store the initial aggregated data. Can you share more about your kind of cube queries? We are very interested in that architecture too :) Best, Sun. fightf...@163.com

Re: Re: OLAP query using spark dataframe with cassandra

2015-11-09 Thread fightf...@163.com
prompt response. fightf...@163.com From: tsh Date: 2015-11-10 02:56 To: fightf...@163.com; user; dev Subject: Re: OLAP query using spark dataframe with cassandra Hi, I'm in the same position right now: we are going to implement something like OLAP BI + Machine Learning explorations on the

Re: Re: OLAP query using spark dataframe with cassandra

2015-11-08 Thread fightf...@163.com
of OLAP architecture. And we are happy to hear more use cases from this community. Best, Sun. fightf...@163.com From: Jörn Franke Date: 2015-11-09 14:40 To: fightf...@163.com CC: user; dev Subject: Re: OLAP query using spark dataframe with cassandra Is there any distributor supporting

OLAP query using spark dataframe with cassandra

2015-11-08 Thread fightf...@163.com
-apache-cassandra-and-spark fightf...@163.com

Re: Re: Spark RDD cache persistence

2015-11-05 Thread r7raul1...@163.com
You can try http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html#Archival_Storage_SSD__Memory . Hive temp tables use this feature to speed up jobs. https://issues.apache.org/jira/browse/HIVE-7313 r7raul1...@163.com From: Christian Date: 2015-11-06 13:50 To

Re: spark to hbase

2015-10-27 Thread fightf...@163.com
Hi, I notice that you configured the following: configuration.set("hbase.master", "192.168.1:6"); Did you mistype the host IP? Best, Sun. fightf...@163.com From: jinhong lu Date: 2015-10-27 17:22 To: spark users Subject: spark to hbase Hi, I write my result to hd
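For comparison, a sketch of the usual client setup: the HBase client is normally pointed at ZooKeeper rather than at hbase.master, and the host/port values below are placeholders:

```scala
import org.apache.hadoop.hbase.HBaseConfiguration

// the client discovers the master through ZooKeeper
val conf = HBaseConfiguration.create()
conf.set("hbase.zookeeper.quorum", "192.168.1.6")
conf.set("hbase.zookeeper.property.clientPort", "2181")
```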

how to use Trees and ensembles: class probabilities

2015-10-21 Thread r7raul1...@163.com
How to use trees and ensembles: class probabilities in Spark 1.5.0. Any example or documentation? r7raul1...@163.com

Re: Re: How to fix some WARN when submit job on spark 1.5 YARN

2015-09-24 Thread r7raul1...@163.com
Thank you r7raul1...@163.com From: Sean Owen Date: 2015-09-24 16:18 To: r7raul1...@163.com CC: user Subject: Re: How to fix some WARN when submit job on spark 1.5 YARN You can ignore all of these. Various libraries can take advantage of native acceleration if libs are available but it'

How to fix some WARN when submit job on spark 1.5 YARN

2015-09-23 Thread r7raul1...@163.com
1 WARN netlib.BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS 2 WARN netlib.BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS 3 WARN Unable to load native-hadoop library for your platform r7raul1...@163.com

Re: Re: Spark standalone/Mesos on top of Ceph

2015-09-22 Thread fightf...@163.com
Gateway S3 REST API; agreed about such inconvenience and some incompatibilities. However, we have not yet researched and tested radosgw much. But we have some small requirements for using the gateway in some use cases. Hoping for more consideration and discussion. Best, Sun. fightf...@163.com From: Jerry

Spark standalone/Mesos on top of Ceph

2015-09-22 Thread fightf...@163.com
progress? Best, Sun. fightf...@163.com

Re: RE: spark sql hook

2015-09-16 Thread r7raul1...@163.com
Thank you r7raul1...@163.com From: Cheng, Hao Date: 2015-09-17 12:32 To: r7raul1...@163.com; user Subject: RE: RE: spark sql hook Probably a workable solution is: create your own SQLContext by extending the class HiveContext, override the `analyzer`, and add your own rule to do the

Re: RE: spark sql hook

2015-09-16 Thread r7raul1...@163.com
Example: select * from test.table changed to select * from production.table r7raul1...@163.com From: Cheng, Hao Date: 2015-09-17 11:05 To: r7raul1...@163.com; user Subject: RE: spark sql hook The Catalyst TreeNode is a very fundamental API; not sure what kind of hook you need. Any concrete

spark sql hook

2015-09-16 Thread r7raul1...@163.com
I want to modify some SQL tree nodes before execution. I can do this with a Hive hook in Hive. Does Spark support such a hook? Any advice? r7raul1...@163.com

Re: intellij14 compiling spark-1.3.1 got error: assertion failed: com.google.protobuf.InvalidProtocolBufferException

2015-08-09 Thread longda...@163.com
The stack trace is below: Error:scalac: while compiling: /home/xiaoju/data/spark-1.3.1/core/src/main/scala/org/apache/spark/SparkContext.scala during phase: typer library version: version 2.10.4 compiler version: version 2.10.4 reconstructed args: -nobootcp -javabootclasspa

intellij14 compiling spark-1.3.1 got error: assertion failed: com.google.protobuf.InvalidProtocolBufferException

2015-08-09 Thread longda...@163.com
Hi all, I compiled spark-1.3.1 on Linux using IntelliJ 14 and got the error "assertion failed: com.google.protobuf.InvalidProtocolBufferException". How can I solve the problem? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/intellij14-compiling-spark-1-3-1-got-

Re: Re: How RDD lineage works

2015-07-31 Thread bit1...@163.com
Thanks TD, I have got some understanding now. bit1...@163.com From: Tathagata Das Date: 2015-07-31 13:45 To: bit1...@163.com CC: yuzhihong; user Subject: Re: Re: How RDD lineage works https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/FailureSuite.scala This may

Re: Re: How RDD lineage works

2015-07-30 Thread bit1...@163.com
that partition. Thus, lost data can be recovered, often quite quickly, without requiring costly replication. bit1...@163.com From: bit1...@163.com Date: 2015-07-31 13:11 To: Tathagata Das; yuzhihong CC: user Subject: Re: Re: How RDD lineage works Thanks TD and Zhihong for the guide. I will

Re: Re: How RDD lineage works

2015-07-30 Thread bit1...@163.com
Thanks TD and Zhihong for the guide. I will check it bit1...@163.com From: Tathagata Das Date: 2015-07-31 12:27 To: Ted Yu CC: bit1...@163.com; user Subject: Re: How RDD lineage works You have to read the original Spark paper to understand how RDD lineage works. https://www.cs.berkeley.edu

How RDD lineage works

2015-07-30 Thread bit1...@163.com
Hi, I don't have a good understanding of how RDD lineage works, so I would ask whether Spark provides a unit test in the code base to illustrate it. If there is, what's the class name? Thanks! bit1...@163.com
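Besides the FailureSuite test mentioned in the replies above, a quick way to see lineage directly is RDD.toDebugString; a minimal sketch for spark-shell:

```scala
// toDebugString renders the lineage (dependency chain) of an RDD
val base = sc.parallelize(1 to 100)
val derived = base.map(_ * 2).filter(_ > 50)
println(derived.toDebugString) // shows the chain back to the parallelized collection
```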

Re: PermGen Space Error

2015-07-29 Thread fightf...@163.com
Hi, Sarath. Did you try to use and increase spark.executor.extraJavaOptions with -XX:PermSize= -XX:MaxPermSize=? fightf...@163.com From: Sarath Chandra Date: 2015-07-29 17:39 To: user@spark.apache.org Subject: PermGen Space Error Dear All, I'm using - => Spark 1.2.0 => Hive 0.13.

A question about spark checkpoint

2015-07-28 Thread bit1...@163.com
rdd = sc.parallelize(List(1, 2, 3, 4, 5)) val heavyOpRDD = rdd.map(squareWithHeavyOp) heavyOpRDD.checkpoint() heavyOpRDD.foreach(println) println("Job 0 has been finished, press ENTER to do job 1") readLine() heavyOpRDD.foreach(println) } } bit1...@163.com
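A runnable variant of the snippet above (a sketch; the squaring lambda stands in for squareWithHeavyOp). Note that checkpoint() requires a checkpoint directory to be set first, and the data is only materialized by the next action:

```scala
sc.setCheckpointDir("hdfs:///tmp/checkpoints") // required before checkpoint()

val rdd = sc.parallelize(List(1, 2, 3, 4, 5))
val heavyOpRDD = rdd.map(x => x * x)  // stand-in for squareWithHeavyOp
heavyOpRDD.checkpoint()               // only marked here ...
heavyOpRDD.foreach(println)           // ... written during this first action
println(heavyOpRDD.isCheckpointed)    // true once the action has completed
```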

Re: Functions in Spark SQL

2015-07-27 Thread fightf...@163.com
Hi there, I tested with sqlContext.sql("select funcName(param1, param2, ...) from tableName") and it just worked fine. Would you like to paste your test code here? And which version of Spark are you using? Best, Sun. fightf...@163.com From: vinod kumar Date: 2015-07-27 15:04 To: User Subject
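For completeness, the function has to be registered before SQL can see it; a minimal sketch where funcName, the lambda, and the table/column names are placeholders:

```scala
// register a UDF so it is callable from SQL
sqlContext.udf.register("funcName", (a: Int, b: Int) => a + b)
sqlContext.sql("SELECT funcName(col1, col2) FROM tableName").show()
```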

Stop condition Spark reading from Kafka with ReliableKafkaReceiver

2015-07-24 Thread cas...@163.com
other words, is there any API or method I can use to check whether all the messages are consumed? Can anyone help me? Any suggestions are welcome! Best regards, cast cas...@163.com

Re: Re: Need help in setting up spark cluster

2015-07-22 Thread fightf...@163.com
suggest you first deploy a Spark standalone cluster to run some integration tests; you can also consider running Spark on YARN for the later development use cases. Best, Sun. fightf...@163.com From: Jeetendra Gangele Date: 2015-07-23 13:39 To: user Subject: Re: Need help in setting

What if request cores are not satisfied

2015-07-22 Thread bit1...@163.com
with fewer cores, but I didn't get a chance to try/test it. Thanks. bit1...@163.com

Re: Re: Application jar file not found exception when submitting application

2015-07-06 Thread bit1...@163.com
Thanks Shixiong for the reply. Yes, I confirm that the file exists there; simply checked with ls -l /data/software/spark-1.3.1-bin-2.4.0/applications/pss.am.core-1.0-SNAPSHOT-shaded.jar bit1...@163.com From: Shixiong Zhu Date: 2015-07-06 18:41 To: bit1...@163.com CC: user Subject: Re

Application jar file not found exception when submitting application

2015-07-06 Thread bit1...@163.com
1.run(DriverRunner.scala:72) bit1...@163.com

Explanation of the numbers on Spark Streaming UI

2015-06-30 Thread bit1...@163.com
received records are many more than the processed records; I can't understand why the total delay or scheduling delay is not obvious (5 secs) here. Can someone help explain what clues to draw from this UI? Thanks. bit1...@163.com

Re: spark streaming HDFS file issue

2015-06-29 Thread bit1...@163.com
What do you mean by "new file"? Do you upload an already existing file onto HDFS, or create a new one locally and then upload it to HDFS? bit1...@163.com From: ravi tella Date: 2015-06-30 09:59 To: user Subject: spark streaming HDFS file issue I am running a spark streaming ex

when cached RDD will unpersist its data

2015-06-23 Thread bit1...@163.com
I am kind of confused about when a cached RDD will unpersist its data. I know we can explicitly unpersist it with RDD.unpersist, but can it be unpersisted automatically by the Spark framework? Thanks. bit1...@163.com

How to figure out how many records received by individual receiver

2015-06-23 Thread bit1...@163.com
Hi, I am using spark 1.3.1 and have 2 receivers. On the web UI, I can only see the total records received by both receivers, but I can't figure out the records received by each individual receiver. Not sure whether the information is shown on the UI in spark 1.4. bit1...@163.com

Re: Re: What does [Stage 0:> (0 + 2) / 2] mean on the console

2015-06-23 Thread bit1...@163.com
Hi, Akhil, Thank you for the explanation! bit1...@163.com From: Akhil Das Date: 2015-06-23 16:29 To: bit1...@163.com CC: user Subject: Re: What does [Stage 0:> (0 + 2) / 2] mean on the console Well, you could say that (Stage information) is an ASCII representation of the WebUI (running on p

What does [Stage 0:> (0 + 2) / 2] mean on the console

2015-06-23 Thread bit1...@163.com
how to suppress it bit1...@163.com

Re: RE: Spark or Storm

2015-06-19 Thread bit1...@163.com
tics; if it is set to "largest", then it will be at-most-once semantics? bit1...@163.com From: Haopu Wang Date: 2015-06-19 18:47 To: Enno Shioji; Tathagata Das CC: prajod.vettiyat...@wipro.com; Cody Koeninger; bit1...@163.com; Jordan Pilat; Will Briggs; Ashish Soni; ayan guha; user@spark

Re: RE: Build spark application into uber jar

2015-06-19 Thread bit1...@163.com
Sure, thanks Prajod for the detailed steps! bit1...@163.com From: prajod.vettiyat...@wipro.com Date: 2015-06-19 16:56 To: bit1...@163.com; ak...@sigmoidanalytics.com CC: user@spark.apache.org Subject: RE: RE: Build spark application into uber jar Multiple Maven profiles may be the ideal way

Re: RE: Build spark application into uber jar

2015-06-19 Thread bit1...@163.com
ClusterRun provided bit1...@163.com From: prajod.vettiyat...@wipro.com Date: 2015-06-19 15:22 To: bit1...@163.com; ak...@sigmoidanalytics.com CC: user@spark.apache.org Subject: RE: Re: Build spark application into uber jar Hi, When running inside Eclipse IDE, I use another maven target

Re: RE: Build spark application into uber jar

2015-06-19 Thread bit1...@163.com
Thank you for the reply. "Run the application locally" means that I run the application in my IDE with master as local[*]. When spark stuff is marked as provided, then I can't run it because the spark stuff is missing. So, how do you work around this? Thanks! bit1...
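The Maven answer in this thread is a profile that flips the Spark dependencies between provided and compile scope. For anyone building with sbt instead, a sketch of the equivalent documented trick: keep Spark provided for the assembly, but put it back on the classpath for sbt run:

```scala
// build.sbt (sbt 0.13 syntax, with the sbt-assembly plugin)
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.1" % "provided"

// re-add provided dependencies to the run classpath so `sbt run`
// (and IDE runs delegated to sbt) work with master = local[*]
run in Compile := Defaults.runTask(
  fullClasspath in Compile,
  mainClass in (Compile, run),
  runner in (Compile, run)
).evaluated
```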

Build spark application into uber jar

2015-06-18 Thread bit1...@163.com
! bit1...@163.com

  1   2   >