Hi Mohammed,

Can you provide more information about the service you developed?

On Jun 20, 2015 7:59 AM, "Mohammed Guller" <moham...@glassbeam.com> wrote:
> Hi Matthew,
>
> It looks fine to me. I have built a similar service that allows a user to
> submit a query from a browser and returns the result in JSON format.
>
> Another alternative is to leave a Spark shell or one of the notebooks
> (Spark Notebook, Zeppelin, etc.) session open and run queries from there.
> This model works only if people give you the queries to execute.
>
> Mohammed
>
> *From:* Matthew Johnson [mailto:matt.john...@algomi.com]
> *Sent:* Friday, June 19, 2015 2:20 AM
> *To:* user@spark.apache.org
> *Subject:* Code review - Spark SQL command-line client for Cassandra
>
> Hi all,
>
> I have been struggling with Cassandra’s lack of adhoc query support (I
> know this is an anti-pattern of Cassandra, but sometimes management come
> over and ask me to run stuff and it’s impossible to explain that it will
> take me a while when it would take about 10 seconds in MySQL) so I have put
> together the following code snippet that bundles DataStax’s Cassandra Spark
> connector and allows you to submit Spark SQL to it, outputting the results
> in a text file.
>
> Does anyone spot any obvious flaws in this plan??
> (I have a lot more error handling etc. in my code, but removed it here
> for brevity)
>
> private void run(String sqlQuery) {
>     // conf is the SparkConf passed to the JavaSparkSQL constructor
>     SparkContext scc = new SparkContext(conf);
>     CassandraSQLContext csql = new CassandraSQLContext(scc);
>     DataFrame sql = csql.sql(sqlQuery);
>     String folderName = "/tmp/output_" + System.currentTimeMillis();
>     LOG.info("Attempting to save SQL results in folder: " + folderName);
>     sql.rdd().saveAsTextFile(folderName);
>     LOG.info("SQL results saved");
> }
>
> public static void main(String[] args) {
>     String sparkMasterUrl = args[0];
>     String sparkHost = args[1];
>     String sqlQuery = args[2];
>
>     SparkConf conf = new SparkConf();
>     conf.setAppName("Java Spark SQL");
>     conf.setMaster(sparkMasterUrl);
>     conf.set("spark.cassandra.connection.host", sparkHost);
>
>     JavaSparkSQL app = new JavaSparkSQL(conf);
>
>     // run() takes only the query
>     app.run(sqlQuery);
> }
>
> I can then submit this to Spark with ‘spark-submit’:
>
> ./spark-submit --class com.algomi.spark.JavaSparkSQL \
>     --master spark://sales3:7077 \
>     spark-on-cassandra-0.0.1-SNAPSHOT-jar-with-dependencies.jar \
>     spark://sales3:7077 sales3 "select * from mykeyspace.operationlog"
>
> It seems to work pretty well, so I’m pretty happy, but wondering why this
> isn’t common practice (at least I haven’t been able to find much about it
> on Google) – is there something terrible that I’m missing?
>
> Thanks!
>
> Matthew
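
The quoted main method reads args[0], args[1] and args[2] directly, since the error handling was removed for brevity. As a rough illustration of what such a check might look like, here is a minimal sketch in plain Java. The JobArgs class and its field names are hypothetical and are not part of Matthew's code. The second field is called cassandraHost only because the quoted code passes that value to spark.cassandra.connection.host.

```java
// Hypothetical helper, not part of the original JavaSparkSQL class.
// It validates the three positional arguments used in the quoted main():
//   <sparkMasterUrl> <cassandraHost> <sqlQuery>
final class JobArgs {
    final String sparkMasterUrl;
    final String cassandraHost;
    final String sqlQuery;

    private JobArgs(String sparkMasterUrl, String cassandraHost, String sqlQuery) {
        this.sparkMasterUrl = sparkMasterUrl;
        this.cassandraHost = cassandraHost;
        this.sqlQuery = sqlQuery;
    }

    // Rejects a wrong argument count and blank values before any Spark
    // objects are created, so a typo fails fast with a usage message.
    static JobArgs parse(String[] args) {
        if (args == null || args.length != 3) {
            throw new IllegalArgumentException(
                "Usage: JavaSparkSQL <sparkMasterUrl> <cassandraHost> <sqlQuery>");
        }
        for (String arg : args) {
            if (arg == null || arg.trim().isEmpty()) {
                throw new IllegalArgumentException("Arguments must not be blank");
            }
        }
        return new JobArgs(args[0].trim(), args[1].trim(), args[2].trim());
    }

    public static void main(String[] args) {
        JobArgs parsed = parse(args);
        System.out.println("master=" + parsed.sparkMasterUrl
            + ", cassandraHost=" + parsed.cassandraHost
            + ", query=" + parsed.sqlQuery);
    }
}
```

In the quoted main method, the three args[...] reads could then be replaced by a single call to JobArgs.parse(args), with the parsed fields fed into the SparkConf as before.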