Hi Michael, I want to cache an RDD and define get() and set() operators on it, basically like memcached. Is it possible to build a memcached-like distributed cache using Spark SQL? If not, what do you suggest we use for such operations?
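Roughly the kind of usage I have in mind, as a minimal untested sketch against the plain RDD API (lookup() seems to be the closest built-in to a get(); the names and values here are just illustrative):

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.SparkContext._  // pair-RDD implicits (Spark 1.0)
  import org.apache.spark.rdd.RDD

  object RddCacheSketch {
    def main(args: Array[String]): Unit = {
      val sc = new SparkContext(new SparkConf().setAppName("rdd-cache-sketch"))

      // "cache": keep a key-value RDD in memory across queries
      var store: RDD[(String, String)] =
        sc.parallelize(Seq("k1" -> "v1", "k2" -> "v2")).cache()

      // "get": lookup() scans the cached partitions for a key
      val v: Seq[String] = store.lookup("k1")

      // "set": RDDs are immutable, so an update has to build
      // and cache a replacement RDD rather than mutate in place
      store = store.filter(_._1 != "k2")
        .union(sc.parallelize(Seq("k2" -> "v2b")))
        .cache()

      sc.stop()
    }
  }

That rebuild-on-write step is what makes me doubt an RDD is the right fit, hence the question.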
Thanks.
Deb

On Fri, Jul 18, 2014 at 1:00 PM, Michael Armbrust <mich...@databricks.com> wrote:

> You can do insert into. As with other SQL-on-HDFS systems, there is no
> updating of data.
>
> On Jul 17, 2014 1:26 AM, "Akhil Das" <ak...@sigmoidanalytics.com> wrote:
>
>> Is this what you are looking for?
>>
>> https://spark.apache.org/docs/1.0.0/api/java/org/apache/spark/sql/parquet/InsertIntoParquetTable.html
>>
>> According to the doc: "Operator that acts as a sink for queries
>> on RDDs and can be used to store the output inside a directory of Parquet
>> files. This operator is similar to Hive's INSERT INTO TABLE operation in
>> the sense that one can choose to either overwrite or append to a directory.
>> Note that consecutive insertions to the same table must have compatible
>> (source) schemas."
>>
>> Thanks
>> Best Regards
>>
>> On Thu, Jul 17, 2014 at 11:42 AM, Hu, Leo <leo.h...@sap.com> wrote:
>>
>>> Hi
>>>
>>> As for Spark 1.0, can we insert into and update a table with Spark SQL,
>>> and how?
>>>
>>> Thanks
>>> Best Regards
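For reference, a minimal untested sketch of the append-only pattern Michael describes, against the Spark 1.0 API; the Record case class and the /tmp path are just placeholders:

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.sql.SQLContext

  case class Record(key: Int, value: String)

  object ParquetInsertSketch {
    def main(args: Array[String]): Unit = {
      val sc = new SparkContext(new SparkConf().setAppName("parquet-insert-sketch"))
      val sqlContext = new SQLContext(sc)
      import sqlContext.createSchemaRDD  // implicit RDD -> SchemaRDD conversion

      // Write a directory of Parquet files and register it as a table
      val rdd = sc.parallelize(Seq(Record(1, "a"), Record(2, "b")))
      rdd.saveAsParquetFile("/tmp/records.parquet")
      sqlContext.parquetFile("/tmp/records.parquet").registerAsTable("records")

      // Append new rows; there is no UPDATE, only insert/overwrite
      val more = sc.parallelize(Seq(Record(3, "c")))
      more.insertInto("records")

      sc.stop()
    }
  }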