How is this different from materialized views?

On Sun, Feb 24, 2019 at 3:44 PM Daoyuan Wang <m...@daoyuan.wang> wrote:
> Hi everyone,
>
> We'd like to discuss our proposal for a Spark relational cache in this
> thread. Spark has a native command for RDD caching, but the CACHE command
> in Spark SQL is limited: a cache cannot be used across sessions, and we
> have to rewrite queries ourselves to make use of an existing cache.
> To resolve this, we have done some initial work to do the following:
>
> 1. Allow users to persist a cache on HDFS in Parquet format.
> 2. Rewrite user queries in Catalyst to use any existing cache (on HDFS,
>    or defined as in-memory in the current session) where possible.
>
> I have created a JIRA ticket
> (https://issues.apache.org/jira/browse/SPARK-26764) for this and attached
> an official SPIP document.
>
> Thanks for taking a look at the proposal.
>
> Best Regards,
> Daoyuan