Re: A Spark Compilation Question

2015-02-15 Thread vha14
In IntelliJ:
- Open View -> Tool Windows -> Maven Projects
- Right click on Spark Project External Flume Sink
- Click Generate Sources and Update Folders
This should generate source code from sparkflume.avdl.
Vu~
-- View this message in context: http://apache-spark-developers-list.1001551.n3

Re: renaming SchemaRDD -> DataFrame

2015-02-12 Thread vha14
Matei wrote (Jan 26, 2015; 5:31pm): "The intent of Spark SQL though is to be more than a SQL server -- it's meant to be a library for manipulating structured data." I think this is an important but nuanced point. There are engineers who for various reasons associate the term "SQL" with business an

Re: Spark SQL value proposition in batch pipelines

2015-02-12 Thread vha14
This is super helpful, thanks Evan and Reynold!
-- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-SQL-value-proposition-in-batch-pipelines-tp10607p10610.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

Spark SQL value proposition in batch pipelines

2015-02-12 Thread vha14
My team is building a batch data processing pipeline using the Spark API and is trying to understand whether Spark SQL can help us. Below is what we have found so far:
- SQL's declarative style may be more readable in some cases (e.g. joining more than two RDDs), although some devs prefer the fluent style rega