Where is the SparkSQL Specification?

2016-07-21 Thread Linyuxin
Hi All, newbie here. My Spark version is 1.5.1, and I want to know where I can find the Spark SQL specification, so I can check whether syntax such as a LIKE '%b_xx' is supported.
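One way to answer this empirically is to just try the predicate in spark-shell; LIKE with '%' (any sequence) and '_' (exactly one character) wildcards is standard SQL and is supported. A minimal sketch against the 1.5.x API, with a hypothetical table and column:

    import org.apache.spark.sql.SQLContext

    // `sc` is the SparkContext that spark-shell provides.
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Hypothetical one-column table to exercise the predicate.
    val df = sc.parallelize(Seq("ab1xx", "zzz")).toDF("a")
    df.registerTempTable("t")

    // '%' matches any character sequence, '_' exactly one character,
    // so this returns the row "ab1xx".
    sqlContext.sql("SELECT a FROM t WHERE a LIKE '%b_xx'").show()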

Any reference of performance tuning on SparkSQL?

2016-07-28 Thread Linyuxin
Hi All, is there any reference on performance tuning for Spark SQL? I can only find material on tuning Spark Core at http://spark.apache.org/
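For what it's worth, the Spark SQL programming guide has its own Performance Tuning section, separate from the core tuning page. A few commonly cited knobs, with illustrative values only (Scala, 1.5.x API):

    // Post-shuffle parallelism for joins and aggregations (default 200).
    sqlContext.setConf("spark.sql.shuffle.partitions", "200")
    // Size threshold (bytes) below which a join side is broadcast.
    sqlContext.setConf("spark.sql.autoBroadcastJoinThreshold", "10485760")
    // Columnar in-memory cache for a frequently reused table.
    sqlContext.cacheTable("some_table")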

How to avoid sql injection on SparkSQL?

2016-08-04 Thread Linyuxin
Hi All, I want to know how to avoid SQL injection in Spark SQL. Is there any common pattern for this, e.g. a useful tool or code snippet, or do I have to reinvent the wheel on top of Spark SQL myself? Thanks.
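One common pattern, since Spark SQL 1.x has no parameterized-query API: avoid splicing user input into SQL strings at all and go through the DataFrame API, where the input is bound as a value rather than parsed as SQL text. A minimal sketch; the table and column names are hypothetical:

    import org.apache.spark.sql.functions.{col, lit}

    // Unsafe: user input concatenated into SQL text can rewrite the query.
    //   sqlContext.sql(s"SELECT * FROM users WHERE name = '$userInput'")

    // Safer: the DataFrame API treats the input strictly as data.
    val userInput = "O'Brien' OR '1'='1"  // hostile-looking input stays inert
    val result = sqlContext.table("users").filter(col("name") === lit(userInput))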

submit spark task on yarn asynchronously via java?

2016-12-21 Thread Linyuxin
Hi All, Version: Spark 1.5.1, Hadoop 2.7.2. Is there any way to submit and monitor a Spark task on YARN via Java asynchronously?

Reply: submit spark task on yarn asynchronously via java?

2016-12-22 Thread Linyuxin
Hi, could anybody help? From: Linyuxin Sent: December 22, 2016 14:18 To: user Subject: submit spark task on yarn asynchronously via java? Hi All, Version: Spark 1.5.1, Hadoop 2.7.2. Is there any way to submit and monitor a Spark task on YARN via Java asynchronously?

Reply: Reply: submit spark task on yarn asynchronously via java?

2016-12-25 Thread Linyuxin
Thanks. From: Naveen [mailto:hadoopst...@gmail.com] Sent: December 25, 2016 0:33 To: Linyuxin Cc: user Subject: Re: Reply: submit spark task on yarn asynchronously via java? Hi, please use the SparkLauncher API class and invoke the threads asynchronously using Futures. Using SparkLauncher, you can mention
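For reference, a sketch of that suggestion in Scala; the paths, class name, and queue are placeholders. Note that in Spark 1.5.x SparkLauncher.launch() returns a plain java.lang.Process; the richer startApplication()/SparkAppHandle API only arrived in 1.6:

    import org.apache.spark.launcher.SparkLauncher
    import scala.concurrent.{ExecutionContext, Future}
    import ExecutionContext.Implicits.global

    // Launch the app without blocking the caller.
    val process: Process = new SparkLauncher()
      .setAppResource("/path/to/app.jar")
      .setMainClass("com.example.Main")
      .setMaster("yarn-cluster")
      .setConf("spark.yarn.queue", "default")
      .launch()

    // Wait for completion off the calling thread; poll YARN (e.g. the
    // ResourceManager REST API) if finer-grained progress is needed.
    val exitCode: Future[Int] = Future { process.waitFor() }
    exitCode.foreach(code => println(s"Spark app finished with exit code $code"))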

can UDF accept "Any"/"AnyVal"/"AnyRef"(java Object) as parameter or as return type ?

2017-01-03 Thread Linyuxin
Hi all, with Spark 1.5.1 I want to implement an Oracle DECODE function (like decode(col1,1,'xxx','p2','yyy',0)), and the code looks like this: sqlContext.udf.register("any_test", (s: AnyVal) => { if (s == null) null else s }) The error shows: Exception in thread "mai
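The failure is expected: Spark derives the UDF's Catalyst input and output types from the Scala function's type at registration time, and Any/AnyVal has no Catalyst mapping. Two possible workarounds, sketched with hypothetical names:

    // 1) Register one UDF per concrete type; String is already nullable,
    //    so the null check works without AnyVal.
    sqlContext.udf.register("decode_str", (s: String) =>
      if (s == null) "default" else s)

    // 2) Skip the UDF and express Oracle's DECODE as CASE WHEN; note that,
    //    unlike DECODE, all result branches must share one common type.
    sqlContext.sql(
      "SELECT CASE col1 WHEN 1 THEN 'xxx' WHEN 2 THEN 'yyy' ELSE 'other' END FROM t")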

[SparkSQL] pre-check syntax before running spark job?

2017-02-20 Thread Linyuxin
Hi All, is there any tool/API to check the SQL syntax without actually running a Spark job? Like SiddhiQL on Storm here: SiddhiManagerService.validateExecutionPlan https://github.com/wso2/siddhi/blob/master/modules/siddhi-core/src/main/java/org/wso2/siddhi/core/SiddhiManagerService.java it can

Reply: [SparkSQL] pre-check syntax before running spark job?

2017-02-21 Thread Linyuxin
Actually, I want a standalone jar so I can check the syntax without a Spark execution environment. From: Irving Duran [mailto:irving.du...@gmail.com] Sent: February 21, 2017 23:29 To: Yong Zhang Cc: Jacek Laskowski; Linyuxin; user Subject: Re: [SparkSQL] pre-check syntax before running spark job? You can

Reply: [SparkSQL] pre-check syntax before running spark job?

2017-02-21 Thread Linyuxin
Hi Gurdit Singh, thanks. It is very helpful. From: Gurdit Singh [mailto:gurdit.si...@bitwiseglobal.com] Sent: February 22, 2017 13:31 To: Linyuxin; Irving Duran; Yong Zhang Cc: Jacek Laskowski; user Subject: RE: [SparkSQL] pre-check syntax before running spark job? Hi, you can use the Spark SQL Antlr
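For later readers: in Spark 2.x this standalone check needs only the spark-catalyst jar, whose Antlr-based parser is exposed as CatalystSqlParser (Spark 1.5 shipped a different parser, so this does not apply there as-is). Parsing builds an unresolved plan without touching a cluster, so it catches syntax errors but not missing tables or columns. A minimal sketch:

    import org.apache.spark.sql.catalyst.parser.{CatalystSqlParser, ParseException}

    // Returns the parser's error message on bad syntax, Unit on success.
    def checkSyntax(sql: String): Either[String, Unit] =
      try { CatalystSqlParser.parsePlan(sql); Right(()) }
      catch { case e: ParseException => Left(e.getMessage) }

    checkSyntax("SELECT a FROM t WHERE a LIKE '%b_xx'")  // Right(())
    checkSyntax("SELEC a FRM t")                         // Left(parse error...)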

Reply: GroupBy in Spark / Scala without Agg functions

2018-05-29 Thread Linyuxin
Hi, why not group by first and then join? BTW, I don't think there is any difference between 'distinct' and 'group by'. Source code of 2.1: def distinct(): Dataset[T] = dropDuplicates() … def dropDuplicates(colNames: Seq[String]): Dataset[T] = withTypedPlan { … Aggregate(groupCols, aggCols, logicalPlan) }
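A sketch of the "group by first, then join" suggestion in the 2.1 DataFrame API; the column names and the max-timestamp tie-breaker are hypothetical:

    import org.apache.spark.sql.functions.max

    // Reduce to one row per key first...
    val grouped = df.groupBy("key").agg(max("ts").as("ts"))
    // ...then join back to recover the non-grouped columns.
    val result = grouped.join(df, Seq("key", "ts"))

    // And per the 2.1 source quoted above, distinct() is just
    // dropDuplicates(), which plans the same Aggregate node that a
    // GROUP BY over all columns would produce.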