-user

Thank you for this. One small but important point about the use of the Spark name: please take a look at https://spark.apache.org/trademarks.html. Specifically, this should reference "Apache Spark" at least once, prominently, with a link to the project. It's also advisable to avoid using "Spark" in a project or product name entirely. "Oracle Translator for Apache Spark" or something like that would be more in line with trademark guidance.

On Thu, Jan 13, 2022 at 6:50 PM Harish Butani <rhbutani.sp...@gmail.com> wrote:

> Spark on Oracle is now available as an open source Apache licensed github
> repo <https://github.com/oracle/spark-oracle>. Build and deploy it as an
> extension jar in your Spark clusters.
>
> Use it to combine Apache Spark programs with data in your existing Oracle
> databases without expensive data copying or query time data movement.
>
> The core capability is Optimizer extensions that collapse SQL operator
> sub-graphs to an OraScan that executes equivalent SQL in Oracle. Physical
> plan parallelism
> <https://github.com/oracle/spark-oracle/wiki/Query-Splitting> can be
> controlled to split Spark tasks to operate on Oracle data block ranges, or
> on resultset pages or on table partitions.
>
> We pushdown large parts of Spark SQL to Oracle, for example 95 of 99 TPCDS
> queries are completely pushed to Oracle.
> <https://github.com/oracle/spark-oracle/wiki/TPCDS-Queries>
>
> With Spark SQL macros
> <https://github.com/oracle/spark-oracle/wiki/Spark_SQL_macros> you can
> write custom Spark UDFs that get translated and pushed as Oracle SQL
> expressions.
>
> With DML pushdown
> <https://github.com/oracle/spark-oracle/wiki/DML-Support> inserts in
> Spark SQL get pushed as transactionally consistent inserts/updates on
> Oracle tables.
>
> See Quick Start Guide
> <https://github.com/oracle/spark-oracle/wiki/Quick-Start-Guide> on how
> to set up an Oracle free tier ADW instance, load it with TPCDS data and try
> out the Spark on Oracle Demo
> <https://github.com/oracle/spark-oracle/wiki/Demo> on your Spark
> cluster.
>
> More details can be found in our blog
> <https://hbutani.github.io/blogs/blog/Spark_on_Oracle_Blog.html> and the
> project wiki. <https://github.com/oracle/spark-oracle/wiki>
>
> regards,
> Harish Butani