Using Spark SQL in Spark 1.2.1 with Hive 0.14

2015-05-15 Thread smazumder
Hi, I'm trying to execute queries from beeline to Hive 0.14 from Spark SQL (1.2.1). A simple query like 'show tables' or 'create schema ' does not return at all. Do I need to upgrade to Spark 1.3 for this to work with 0.14? Are there any other alternatives? Regards, Sourav -- View this m

Spark sql and csv data processing question

2015-05-15 Thread Mike Frampton
Hi, I'm getting the following error when trying to process a CSV-based data file. Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 10.0 failed 4 times, most recent failure: Lost task 1.3 in stage 10.0 (TID 262, hc2r1m3.semtech-solution

Why association with remote system has failed when set master in Spark programmatically

2015-05-15 Thread Yi.Zhang
Hi all, I ran start-master.sh to start standalone Spark at spark://192.168.1.164:7077. Then I used the command below, and it worked: ./bin/spark-shell --master spark://192.168.1.164:7077 The console printed the correct messages, and the Spark context was initialised correctly. However, when I run

[spark sql] $ and === can't be recognised in IntelliJ

2015-05-15 Thread Yi.Zhang
Hi all, I wanted to join two data frames using Spark SQL in IntelliJ, and wrote these lines of code: df1.as('first).join(df2.as('second), $"first._1" === $"second._1") IntelliJ reported errors for $ and === in red. I found that $ and === are defined as implicit conversions in org.apa
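
A minimal sketch of the usual fix, assuming Spark 1.3-style APIs: the $ interpolator and the symbol/column conversions come from the implicits on the SQLContext, so IntelliJ only resolves them once sqlContext.implicits._ is imported in the same scope (=== itself is a method on Column).

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    val sc = new SparkContext(new SparkConf().setAppName("join-sketch").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)

    // This import provides StringToColumn (the $"..." syntax) and the RDD-to-DataFrame
    // conversions; without it the IDE marks $ as unresolved.
    import sqlContext.implicits._

    val df1 = sc.parallelize(Seq((1, "a"), (2, "b"))).toDF("_1", "_2")
    val df2 = sc.parallelize(Seq((1, "x"), (3, "y"))).toDF("_1", "_2")

    val joined = df1.as('first).join(df2.as('second), $"first._1" === $"second._1")
    joined.show()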

How to reshape RDD/Spark DataFrame

2015-05-15 Thread macwanjason
Hi all, I am a student trying to learn Spark and I had a question regarding converting rows to columns (data pivot/reshape). I have some data in the following format (either RDD or Spark DataFrame): from pyspark.sql import SQLContext sqlContext = SQLContext(sc) rdd = sc.parallelize(

Join Issue in IntelliJ Idea

2015-05-15 Thread Yi Zhang
Hi all, I wanted to join two data frames using Spark SQL in IntelliJ, and wrote these lines of code: df1.as('first).join(df2.as('second), $"first._1" === $"second._1") IntelliJ reported errors for $ and === in red. I found that $ and === are defined as implicit conversions in org.apa

RE: Spark's Guava pieces cause exceptions in non-trivial deployments

2015-05-15 Thread Anton Brazhnyk
For me it wouldn't help, I guess, because those newer classes would still be loaded by a different classloader. What did work for me with 1.3.1 was removing those classes from Spark's jar completely, so they get loaded from the external Guava (the version I prefer) and by the classloader I expect. Tha

Re: Broadcast variables can be rebroadcast?

2015-05-15 Thread ayan guha
Hi, broadcast variables are shipped to the executors used by a transformation the first time they are accessed in that transformation. The value will NOT be updated subsequently, even if it has changed. However, the new value will be shipped to any new executor that comes into play after the value has changed.
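
A small sketch of the pattern this implies (names are illustrative): since a broadcast value is effectively immutable once shipped, picking up changed data means unpersisting the old broadcast, creating a new one, and having later jobs read the new reference.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.broadcast.Broadcast

    // Placeholder loader for the lookup data that changes over time.
    def loadLookup(): Map[String, String] = Map("k1" -> "v1")

    val sc = new SparkContext(new SparkConf().setAppName("rebroadcast-sketch").setMaster("local[*]"))

    var lookup: Broadcast[Map[String, String]] = sc.broadcast(loadLookup())

    // ... later, after the underlying data has changed:
    lookup.unpersist()                   // drop the stale copies held by executors
    lookup = sc.broadcast(loadLookup())  // ship the fresh value under a new broadcast id

    // Jobs submitted after the reassignment read the new value via lookup.value.
    val hits = sc.parallelize(Seq("k1", "k2")).map(k => lookup.value.getOrElse(k, "n/a")).collect()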

Re: Broadcast variables can be rebroadcast?

2015-05-15 Thread Ilya Ganelin
Nope. It will just work when you call x.value. On Fri, May 15, 2015 at 5:39 PM N B wrote: > Thanks Ilya. Does one have to call broadcast again once the underlying > data is updated in order to get the changes visible on all nodes? > > Thanks > NB > > > On Fri, May 15, 2015 at 5:29 PM, Ilya Ganelin

Re: Broadcast variables can be rebroadcast?

2015-05-15 Thread N B
Thanks Ilya. Does one have to call broadcast again once the underlying data is updated in order to get the changes visible on all nodes? Thanks NB On Fri, May 15, 2015 at 5:29 PM, Ilya Ganelin wrote: > The broadcast variable is like a pointer. If the underlying data changes > then the changes

Re: Broadcast variables can be rebroadcast?

2015-05-15 Thread Ilya Ganelin
The broadcast variable is like a pointer. If the underlying data changes then the changes will be visible throughout the cluster. On Fri, May 15, 2015 at 5:18 PM NB wrote: > Hello, > > Once a broadcast variable is created using sparkContext.broadcast(), can it > ever be updated again? The use cas

Broadcast variables can be rebroadcast?

2015-05-15 Thread NB
Hello, Once a broadcast variable is created using sparkContext.broadcast(), can it ever be updated again? The use case is for something like the underlying lookup data changing over time. Thanks NB -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Broadcast

Re: Best practice to avoid ambiguous columns in DataFrame.join

2015-05-15 Thread Justin Yip
Thanks Michael, this is very helpful. I have a follow-up question related to NaFunctions. Usually after a left outer join we get lots of null values that we need to handle before further processing. In the following piece of code, the "_1" column is duplicated and crashes the .na.fill func

Re: Best practice to avoid ambiguous columns in DataFrame.join

2015-05-15 Thread Michael Armbrust
There are several ways to solve this ambiguity: *1. Use the DataFrames to get the attribute so it's already "resolved" and not just a string we need to map to a DataFrame.* df.join(df2, df("_1") === df2("_1")) *2. Use aliases* df.as('a).join(df2.as('b), $"a._1" === $"b._1") *3. rename the colum
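
Put together, a sketch of the three options (assuming a spark-shell-style sc plus sqlContext, with its implicits imported for $ and the 'symbol syntax; the toy frames stand in for the ones in the thread):

    import sqlContext.implicits._

    val df  = sc.parallelize(Seq((1, "a"))).toDF("_1", "_2")
    val df2 = sc.parallelize(Seq((1, "b"))).toDF("_1", "_2")

    // 1. Resolve each side through its parent DataFrame, so the attribute is already bound.
    val joined1 = df.join(df2, df("_1") === df2("_1"))

    // 2. Alias both sides and qualify the columns by alias.
    val joined2 = df.as('a).join(df2.as('b), $"a._1" === $"b._1")

    // 3. Rename the column on one side before joining, so the names no longer collide.
    val df2Renamed = df2.withColumnRenamed("_1", "key2")
    val joined3 = df.join(df2Renamed, df("_1") === df2Renamed("key2"))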

Best practice to avoid ambiguous columns in DataFrame.join

2015-05-15 Thread Justin Yip
Hello, I would like to know if there are recommended ways of preventing ambiguous columns when joining dataframes. When we join dataframes, it often happens that we join on columns with identical names. I could rename the columns on the right data frame, as described in the following code. Is the

Re: Using groupByKey with Spark SQL

2015-05-15 Thread Michael Armbrust
Perhaps you are looking for GROUP BY and collect_set, which would allow you to stay in SQL. I'll add that in Spark 1.4 you can get access to items of a row by name. On Fri, May 15, 2015 at 10:48 AM, Edward Sargisson wrote: > Hi all, > This might be a question to be answered or feedback for a po
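
A rough sketch of that suggestion, assuming a HiveContext (collect_set is a Hive UDAF in the 1.3/1.4 line) and illustrative table/column names:

    import org.apache.spark.sql.hive.HiveContext

    val hc = new HiveContext(sc)
    import hc.implicits._

    // Toy events keyed by entity id; registerTempTable exposes them to SQL.
    sc.parallelize(Seq(("id1", "created"), ("id1", "updated"), ("id2", "created")))
      .toDF("entity_id", "event")
      .registerTempTable("events")

    // collect_set gathers the distinct events for each entity into an array column.
    val sessions = hc.sql(
      "SELECT entity_id, collect_set(event) AS events FROM events GROUP BY entity_id")
    sessions.show()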

Re: Error communicating with MapOutputTracker

2015-05-15 Thread Thomas Gerber
Hi Imran, Thanks for the advice; tweaking some Akka parameters helped. See below. Now, we noticed that we get Java heap OOM exceptions on the output tracker when we have too many tasks. I wonder: 1. where does the map output tracker live? The driver? The master (when those are not the same)?

Re: Custom Aggregate Function for DataFrame

2015-05-15 Thread Justin Yip
Hi Ayan, I have a DF constructed from the following case class Event: case class State(attr1: String) case class Event(userId: String, time: Long, state: State) I would like to generate a DF which contains the latest state of each userId. I could first compute the latest t
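
One way to sketch this without a custom aggregate, using the case classes and column names from the question (the max-then-join approach; a grouping UDAF would be the tidier long-term answer):

    import org.apache.spark.sql.functions.max

    case class State(attr1: String)
    case class Event(userId: String, time: Long, state: State)

    import sqlContext.implicits._

    val events = sc.parallelize(Seq(
      Event("u1", 1L, State("a")), Event("u1", 5L, State("b")), Event("u2", 2L, State("c")))).toDF()

    // Latest timestamp per user, then join back to pick up the matching state row.
    val latestTime = events.groupBy("userId").agg(max("time").as("maxTime"))
    val latest = events
      .join(latestTime,
        events("userId") === latestTime("userId") && events("time") === latestTime("maxTime"))
      .select(events("userId"), events("time"), events("state"))
    latest.show()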

Re: Spark's Guava pieces cause exceptions in non-trivial deployments

2015-05-15 Thread Marcelo Vanzin
On Fri, May 15, 2015 at 2:35 PM, Thomas Dudziak wrote: > I've just been through this exact case with shaded guava in our Mesos > setup and that is how it behaves there (with Spark 1.3.1). > If that's the case, it's a bug in the Mesos backend, since the spark.* options should behave exactly the s

Re: Spark's Guava pieces cause exceptions in non-trivial deployments

2015-05-15 Thread Thomas Dudziak
I've just been through this exact case with shaded guava in our Mesos setup and that is how it behaves there (with Spark 1.3.1). cheers, Tom On Fri, May 15, 2015 at 12:04 PM, Marcelo Vanzin wrote: > On Fri, May 15, 2015 at 11:56 AM, Thomas Dudziak wrote: > >> Actually the extraClassPath settin

Re: SaveAsTextFile brings down data nodes with IO Exceptions

2015-05-15 Thread Puneet Kapoor
I am seeing this on Hadoop 2.4.0. Thanks for your suggestions, I will try those and let you know if they help! On Sat, May 16, 2015 at 1:57 AM, Steve Loughran wrote: > What version of Hadoop are you seeing this on? > > > On 15 May 2015, at 20:03, Puneet Kapoor > wrote: > > Hey, > >

Re: Forbidded : Error Code: 403

2015-05-15 Thread Mohammad Tariq
Thanks for the suggestion Steve. I'll try that out. Read the long story last night while struggling with this :). I made sure that I don't have any '/' in my key. On Saturday, May 16, 2015, Steve Loughran wrote: > > > On 15 May 2015, at 21:20, Mohammad Tariq > wrote: > > > > Thank you Ayan and

Re: Forbidded : Error Code: 403

2015-05-15 Thread Steve Loughran
> On 15 May 2015, at 21:20, Mohammad Tariq wrote: > > Thank you Ayan and Ted for the prompt response. It isn't working with s3n > either. > > And I am able to download the file. In fact I am able to read the same file > using s3 API without any issue. > sounds like an S3n config problem.
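
For reference, a minimal Scala sketch of the s3n configuration side (the property names are the standard Hadoop ones; the bucket/path and the use of spark-avro's 1.3-style load call are illustrative):

    // Credentials must be visible to the Hadoop filesystem layer; a 403 usually means
    // they are missing or wrong here, or the secret key contains characters s3n mishandles.
    sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", sys.env("AWS_ACCESS_KEY_ID"))
    sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", sys.env("AWS_SECRET_ACCESS_KEY"))

    // Reading the Avro file through the spark-avro data source.
    val df = sqlContext.load("s3n://bucket-name/path/to/file.avro", "com.databricks.spark.avro")
    df.printSchema()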

Re: SaveAsTextFile brings down data nodes with IO Exceptions

2015-05-15 Thread Steve Loughran
What version of Hadoop are you seeing this on? On 15 May 2015, at 20:03, Puneet Kapoor wrote: Hey, Did you find any solution for this issue, we are seeing similar logs in our Data node logs. Appreciate any help. 2015-05-15 10:51:43,615 ERROR org.apache.

Re: Forbidded : Error Code: 403

2015-05-15 Thread Mohammad Tariq
Thank you Ayan and Ted for the prompt response. It isn't working with s3n either. And I am able to download the file. In fact I am able to read the same file using s3 API without any issue. On Friday, May 15, 2015, Ted Yu wrote: > Have you verified that you can download the file from bucket-nam

Re: Spark's Guava pieces cause exceptions in non-trivial deployments

2015-05-15 Thread Marcelo Vanzin
On Fri, May 15, 2015 at 11:56 AM, Thomas Dudziak wrote: > Actually the extraClassPath settings put the extra jars at the end of the > classpath so they won't help. Only the deprecated SPARK_CLASSPATH puts them > at the front. > That's definitely not the case for YARN: https://github.com/apache/s

Re: SaveAsTextFile brings down data nodes with IO Exceptions

2015-05-15 Thread Puneet Kapoor
Hey, Did you find any solution for this issue, we are seeing similar logs in our Data node logs. Appreciate any help. 2015-05-15 10:51:43,615 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: NttUpgradeDN1:50010:DataXceiver error processing WRITE_BLOCK operation src: /192.168.112.190:46253

Re: Spark's Guava pieces cause exceptions in non-trivial deployments

2015-05-15 Thread Thomas Dudziak
Actually the extraClassPath settings put the extra jars at the end of the classpath so they won't help. Only the deprecated SPARK_CLASSPATH puts them at the front. cheers, Tom On Fri, May 15, 2015 at 11:54 AM, Marcelo Vanzin wrote: > Ah, I see. yeah, it sucks that Spark has to expose Optional (

Re: Spark's Guava pieces cause exceptions in non-trivial deployments

2015-05-15 Thread Marcelo Vanzin
Ah, I see. Yeah, it sucks that Spark has to expose Optional (and things it depends on), but removing that would break the public API, so... One last thing you could try is to add your newer Guava jar to "spark.driver.extraClassPath" and "spark.executor.extraClassPath". Those settings will place yo

Re: Problem with current spark

2015-05-15 Thread Shixiong Zhu
Could you provide the full driver log? Looks like a bug. Thank you! Best Regards, Shixiong Zhu 2015-05-13 14:02 GMT-07:00 Giovanni Paolo Gibilisco : > Hi, > I'm trying to run an application that uses a Hive context to perform some > queries over JSON files. > The code of the application is here

Re: Spark's Guava pieces cause exceptions in non-trivial deployments

2015-05-15 Thread Thomas Dudziak
This is still a problem in 1.3. Optional is both used in several shaded classes within Guava (e.g. the Immutable* classes) and itself uses shaded classes (e.g. AbstractIterator). This causes problems in application code. The only reliable way we've found around this is to shade Guava ourselves for

Re: Spark Fair Scheduler for Spark Streaming - 1.2 and beyond

2015-05-15 Thread Mark Hamstra
If you don't send jobs to different pools, then they will all end up in the default pool. If you leave the intra-pool scheduling policy as the default FIFO, then this will effectively be the same thing as using the default FIFO scheduling. Depending on what you are trying to accomplish, you need
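
A short sketch of what "sending jobs to different pools" looks like in practice (the pool name and allocation-file path are placeholders; the pools themselves are defined in fairscheduler.xml):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("fair-pools-sketch")
      .set("spark.scheduler.mode", "FAIR")
      .set("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml")  // defines the pools
    val sc = new SparkContext(conf)

    // Jobs submitted from this thread go to "pool1"; without this call they land in the default pool.
    sc.setLocalProperty("spark.scheduler.pool", "pool1")
    sc.parallelize(1 to 1000).count()

    // Reset to the default pool for subsequent jobs from this thread.
    sc.setLocalProperty("spark.scheduler.pool", null)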

Re: Spark Fair Scheduler for Spark Streaming - 1.2 and beyond

2015-05-15 Thread Richard Marscher
It's not a Spark Streaming app, so sorry I'm not sure of the answer to that. I would assume it should work. On Fri, May 15, 2015 at 2:22 PM, Evo Eftimov wrote: > Ok thanks a lot for clarifying that – btw was your application a Spark > Streaming App – I am also looking for confirmation that FAIR

Re: how to use rdd.countApprox

2015-05-15 Thread Du Li
Hi TD, Just to let you know, the job group and cancellation worked after I switched to Spark 1.3.1. I set a group id for rdd.countApprox() and cancelled it, then set another group id for the remaining job of the foreachRDD but let it complete. As a by-product, I use the group id to indicate what the job doe
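
A sketch of the pattern described above, with illustrative group ids and a toy RDD:

    val rdd = sc.parallelize(1 to 1000000)

    // Tag the approximate count with its own group so it can be cancelled independently.
    sc.setJobGroup("approx-count", "approximate count", interruptOnCancel = true)
    val approx = rdd.countApprox(timeout = 2000L, confidence = 0.90)
    sc.cancelJobGroup("approx-count")

    // The remaining work runs under a different group and is left to complete.
    sc.setJobGroup("main-work", "remaining processing")
    val processed = rdd.map(_ * 2).count()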

Re: store hive metastore on persistent store

2015-05-15 Thread Yana Kadiyska
My point was more about how to verify that properties are picked up from the hive-site.xml file. You don't really need hive.metastore.uris if you're not running against an external metastore. I just did an experiment with warehouse.dir. My hive-site.xml looks like this: hive.metastore

RE: Spark Fair Scheduler for Spark Streaming - 1.2 and beyond

2015-05-15 Thread Evo Eftimov
Ok, thanks a lot for clarifying that. By the way, was your application a Spark Streaming app? I am also looking for confirmation that FAIR scheduling is supported for Spark Streaming apps. From: Richard Marscher [mailto:rmarsc...@localytics.com] Sent: Friday, May 15, 2015 7:20 PM To: Evo Eftimov Cc

Re: Spark Fair Scheduler for Spark Streaming - 1.2 and beyond

2015-05-15 Thread Richard Marscher
The doc is a bit confusing IMO, but at least for my application I had to use a fair pool configuration to get my stages to be scheduled with FAIR. On Fri, May 15, 2015 at 2:13 PM, Evo Eftimov wrote: > No pools for the moment – for each of the apps using the straightforward > way with the spark c

RE: Spark Fair Scheduler for Spark Streaming - 1.2 and beyond

2015-05-15 Thread Evo Eftimov
No pools for the moment; for each of the apps I am using the straightforward way, with the Spark conf param for scheduling = FAIR. Spark is running in standalone mode. Are you saying that configuring pools is mandatory to get FAIR scheduling working? From the docs it seemed optional to

Re: spark log field clarification

2015-05-15 Thread yanwei
Can anybody shed some light on this for me? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-log-field-clarification-tp22892p22904.html Sent from the Apache Spark User List mailing list archive at Nabble.com. --

Re: store hive metastore on persistent store

2015-05-15 Thread Tamas Jambor
Thanks for the reply. I am trying to use it without a Hive setup (Spark standalone), so it prints something like this: hive_ctx.sql("show tables").collect() 15/05/15 17:59:03 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore 15/05/15 17:59

Using groupByKey with Spark SQL

2015-05-15 Thread Edward Sargisson
Hi all, This might be a question to be answered, or feedback for a possible new feature, depending: our source data is events about the state changes of an entity (identified by an ID), represented as nested JSON. We wanted to sessionize this data so that we had a collection of all the even

Re: Spark Fair Scheduler for Spark Streaming - 1.2 and beyond

2015-05-15 Thread Tathagata Das
How are you configuring the fair scheduler pools? On Fri, May 15, 2015 at 8:33 AM, Evo Eftimov wrote: > I have run / submitted a few Spark Streaming apps configured with Fair > scheduling on Spark Streaming 1.2.0, however they still run in a FIFO mode. > Is FAIR scheduling supported at all for S

Re: Spark Job execution time

2015-05-15 Thread SamyaMaiti
It does depend on the network IO within your cluster and the CPU usage. That said, the difference in running time should not be huge (assuming you are not running any other job in the cluster in parallel). -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Jo

Re: [SparkSQL 1.4.0] groupBy columns are always nullable?

2015-05-15 Thread Olivier Girardot
yes, please do and send me the link. @rxin I have trouble building master, but the code is done... Le ven. 15 mai 2015 à 01:27, Haopu Wang a écrit : > Thank you, should I open a JIRA for this issue? > > > -- > > *From:* Olivier Girardot [mailto:ssab...@gmail.com] >

Re: multiple hdfs folder & files input to PySpark

2015-05-15 Thread Oleg Ruchovets
Hello, I used the approach that you suggested: lines = sc.textFile("/input/lprs/2015_05_15/file4.csv, /input/lprs/2015_05_14/file3.csv, /input/lprs/2015_05_13/file2.csv, /input/lprs/2015_05_12/file1.csv") but it doesn't work for me: py4j.protocol.Py4JJavaError: An error occurred
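
For reference, a Scala sketch of the same call; the comma-separated list passed to textFile should be a single string with no spaces or line breaks around the commas (each entry may also be a directory or a glob), which is one common reason such calls fail:

    // Building the path list programmatically avoids stray whitespace in the string.
    val paths = Seq(
      "/input/lprs/2015_05_15/file4.csv",
      "/input/lprs/2015_05_14/file3.csv",
      "/input/lprs/2015_05_13/file2.csv",
      "/input/lprs/2015_05_12/file1.csv").mkString(",")

    val lines = sc.textFile(paths)
    println(lines.count())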

Re: SPARK-4412 regressed?

2015-05-15 Thread Yana Kadiyska
Thanks Sean, with the added permissions I do now have this extra option. On Fri, May 15, 2015 at 11:20 AM, Sean Owen wrote: > (I made you a Contributor in JIRA -- your yahoo-related account of the > two -- so maybe that will let you do so.) > > On Fri, May 15, 2015 at 4:19 PM, Yana Kadiyska >

Spark Fair Scheduler for Spark Streaming - 1.2 and beyond

2015-05-15 Thread Evo Eftimov
I have run / submitted a few Spark Streaming apps configured with Fair scheduling on Spark Streaming 1.2.0, however they still run in a FIFO mode. Is FAIR scheduling supported at all for Spark Streaming apps and from what release / version - e.g. 1.3.1 -- View this message in context: http://a

Re: SPARK-4412 regressed?

2015-05-15 Thread Sean Owen
(I made you a Contributor in JIRA -- your yahoo-related account of the two -- so maybe that will let you do so.) On Fri, May 15, 2015 at 4:19 PM, Yana Kadiyska wrote: > Hi, two questions > > 1. Can regular JIRA users reopen bugs -- I can open a new issue but it does > not appear that I can reopen

SPARK-4412 regressed?

2015-05-15 Thread Yana Kadiyska
Hi, two questions 1. Can regular JIRA users reopen bugs -- I can open a new issue but it does not appear that I can reopen issues. What is the proper protocol to follow if we discover regressions? 2. I believe SPARK-4412 regressed in Spark 1.3.1, according to this SO thread possibly even in 1.3.0

Re: Why association with remote system has failed when set master in Spark programmatically

2015-05-15 Thread Yi Zhang
I debugged it, and the remote actor can be fetched in  the tryRegisterAllMasters() method in AppClient:    def tryRegisterAllMasters() {      for (masterAkkaUrl <- masterAkkaUrls) {        logInfo("Connecting to master " + masterAkkaUrl + "...")        val actor = context.actorSelection(masterA

FetchFailedException and MetadataFetchFailedException

2015-05-15 Thread rok
I am trying to sort a collection of key,value pairs (between several hundred million and a few billion) and have recently been getting lots of "FetchFailedException" errors that seem to occur when one of the executors cannot find a temporary shuffle file on disk. E.g.: org.apache.spar

Hive Skew flag?

2015-05-15 Thread Denny Lee
Just wondering if we have any timeline on when the hive skew flag will be included within SparkSQL? Thanks! Denny

Re: Grouping and storing unordered time series data stream to HDFS

2015-05-15 Thread ayan guha
Hi, Do you have a cut-off time, i.e. how "late" an event can be? Otherwise, you may consider a different persistent store like Cassandra/HBase and delegate the "update" part to it. On Fri, May 15, 2015 at 8:10 PM, Nisrina Luthfiyati < nisrina.luthfiy...@gmail.com> wrote: > > Hi all, > I have a stream

Re: store hive metastore on persistent store

2015-05-15 Thread Yana Kadiyska
This should work. Which version of Spark are you using? Here is what I do -- make sure hive-site.xml is in the conf directory of the machine you're using the driver from. Now let's run spark-shell from that machine: scala> val hc= new org.apache.spark.sql.hive.HiveContext(sc) hc: org.apache.spark.

Re: Custom Aggregate Function for DataFrame

2015-05-15 Thread ayan guha
Can you kindly elaborate on this? It should be possible to write UDAFs along similar lines to sum/min etc. On Fri, May 15, 2015 at 5:49 AM, Justin Yip wrote: > Hello, > > May I know if there is a way to implement an aggregate function for grouped > data in a DataFrame? I dug into the doc but didn't find a

Re: Worker Spark Port

2015-05-15 Thread James King
I think this answers my question "executors, on the other hand, are bound with an application, ie spark context. Thus you modify executor properties through a context." Many Thanks. jk On Fri, May 15, 2015 at 3:23 PM, ayan guha wrote: > Hi > > I think you are mixing things a bit. > > Worker i

Re: Worker Spark Port

2015-05-15 Thread ayan guha
Hi, I think you are mixing things up a bit. The worker is part of the cluster, so it is governed by the cluster manager. If you are running a standalone cluster, you can modify spark-env and configure SPARK_WORKER_PORT. Executors, on the other hand, are bound to an application, i.e. a Spark context. Thus y

Re: Forbidded : Error Code: 403

2015-05-15 Thread Ted Yu
Have you verified that you can download the file from bucket-name without using Spark ? Seems like permission issue. Cheers > On May 15, 2015, at 5:09 AM, Mohammad Tariq wrote: > > Hello list, > > Scenario : I am trying to read an Avro file stored in S3 and create a > DataFrame out of it

Forbidded : Error Code: 403

2015-05-15 Thread Mohammad Tariq
Hello list, *Scenario:* I am trying to read an Avro file stored in S3 and create a DataFrame out of it using the *Spark-Avro* library, but I am unable to do so. This is the code which I am using: public class S3DataFrame { public static void main(String[] args

Re: Spark on Mesos vs Yarn

2015-05-15 Thread Iulian Dragoș
Hi Ankur, Just to add a thought to Tim's excellent answer: Spark on Mesos is very important to us at Typesafe and is the recommended deployment for our customers. Thanks for pointing to your PR; I see Tim already went through a round of reviews. It seems very useful, I'll give it a try as well.

Re: kafka + Spark Streaming with checkPointing fails to start with

2015-05-15 Thread Alexander Krasheninnikov
I had the same problem. The solution I found was to use: JavaStreamingContext streamingContext = JavaStreamingContext.getOrCreate("checkpoint_dir", contextFactory); ALL configuration should be performed inside the contextFactory. If you try to configure the streaming context after getOrCreate, you recei
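
The same pattern in Scala, as a sketch (checkpoint path, batch interval, and the placeholder socket source are assumptions): everything, including the DStream wiring and output operations, goes inside the factory, because the factory is skipped entirely when the context is recovered from the checkpoint.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val checkpointDir = "hdfs:///checkpoints/my-app"

    def createContext(): StreamingContext = {
      val conf = new SparkConf().setAppName("checkpointed-streaming-app")
      val ssc = new StreamingContext(conf, Seconds(10))
      ssc.checkpoint(checkpointDir)
      // Placeholder input; in the real app this would be the Kafka DStream.
      val lines = ssc.socketTextStream("localhost", 9999)
      lines.count().print()
      ssc
    }

    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()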

Grouping and storing unordered time series data stream to HDFS

2015-05-15 Thread Nisrina Luthfiyati
Hi all, I have a stream of data from Kafka that I want to process and store in hdfs using Spark Streaming. Each data has a date/time dimension and I want to write data within the same time dimension to the same hdfs directory. The data stream might be unordered (by time dimension). I'm wondering w
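
A rough sketch of one approach (all names illustrative): key each record by its date dimension, then, per batch, write each date's records under a date-specific directory. Late data simply lands in its date's directory in a later batch, at the cost of one sub-directory per batch, so a downstream compaction step may still be needed.

    import org.apache.spark.streaming.dstream.DStream

    // Placeholder: extract the date/time dimension from a record.
    def dateOf(line: String): String = line.split(",")(0)

    def writeByDate(records: DStream[String], baseDir: String): Unit = {
      records.foreachRDD { (rdd, time) =>
        val byDate = rdd.keyBy(dateOf).cache()
        val dates = byDate.keys.distinct().collect()
        dates.foreach { d =>
          byDate.filter(_._1 == d).values
            .saveAsTextFile(s"$baseDir/date=$d/batch-${time.milliseconds}")
        }
        byDate.unpersist()
      }
    }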

Re: Worker Spark Port

2015-05-15 Thread James King
So I'm using code like this to use specific ports: val conf = new SparkConf() .setMaster(master) .setAppName("namexxx") .set("spark.driver.port", "51810") .set("spark.fileserver.port", "51811") .set("spark.broadcast.port", "51812") .set("spark.replClassServer.port", "51813"

Reply: Re: how to delete data from table in sparksql

2015-05-15 Thread luohui20001
Got it, thank you. Thanks & best regards! San.Luo ----- Original Message ----- From: Michael Armbrust To: Denny Lee Cc: 罗辉, user Subject: Re: how to delete data from table in sparksql Date: 2015-05-15 01:49 The list of unsupported hive features should mention that it implicitly

Why association with remote system has failed when set master in Spark programmatically

2015-05-15 Thread Yi Zhang
Hi all, I ran start-master.sh to start standalone Spark at spark://192.168.1.164:7077. Then I used the command below, and it worked: ./bin/spark-shell --master spark://192.168.1.164:7077 The console printed the correct messages, and the Spark context was initialised correctly. However, when I run

Re: Spark on Mesos vs Yarn

2015-05-15 Thread Ankur Chauhan
Hi Tim, Thanks for such a detailed email. I am excited to hear about the new features. I had a pull request going for adding "attribute based filtering in the mesos scheduler" but it hasn't received much love - https://github.com/apache/spark/pull/556

Re: Spark on Mesos vs Yarn

2015-05-15 Thread Tim Chen
Hi Ankur, This is a great question, as I've heard similar concerns about Spark on Mesos. At the time I started to contribute to Spark on Mesos, approximately half a year ago, the Mesos scheduler and related code hadn't really gotten much attention from anyone and it was pretty much in maintenance mode. A

Re: What's the advantage features of Spark SQL(JDBC)

2015-05-15 Thread Yi Zhang
OK. Thanks. On Friday, May 15, 2015 3:35 PM, "Cheng, Hao" wrote:

RE: What's the advantage features of Spark SQL(JDBC)

2015-05-15 Thread Cheng, Hao
Yes. From: Yi Zhang [mailto:zhangy...@yahoo.com] Sent: Friday, May 15, 2015 2:51 PM To: Cheng, Hao; User Subject: Re: What's the advantage features of Spark SQL(JDBC) @Hao, As you said, there is no advantage feature for JDBC; it just provides a unified API to support different data sources. Is it