It's not completely transparent, but you can do something like the following today:
CACHE TABLE hotData AS SELECT columns, I, care, about FROM fullTable

On Sun, Feb 1, 2015 at 3:03 AM, Mick Davies <michael.belldav...@gmail.com> wrote:
> I have been working a lot recently with denormalised tables with lots of
> columns, nearly 600. We are using this form to avoid joins.
>
> I have tried to use CACHE TABLE with this data, but it proves too expensive,
> as it seems to try to cache all of the data in the table.
>
> For data sets such as the one I am using, you find that certain columns are
> hot, referenced frequently in queries, while others are used very
> infrequently.
>
> It would therefore be great if caches could be column-based. I realise that
> this may not be optimal for all use cases, but I think it could be quite a
> common need. Has something like this been considered?
>
> Thanks
> Mick
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/Caching-tables-at-column-level-tp10377.html
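To spell the workaround out a little: a sketch in Spark SQL of caching only a projection of the wide table (the table and column names here are illustrative, not from Mick's actual schema):

```sql
-- Materialize just the hot columns in the in-memory columnar cache,
-- instead of caching all ~600 columns of the wide table.
CACHE TABLE hotData AS
SELECT user_id, event_time, status   -- hypothetical hot columns
FROM fullTable;

-- Frequent queries then run against the narrow cached table:
SELECT status, COUNT(*) FROM hotData GROUP BY status;

-- Release the memory when the cache is no longer needed:
UNCACHE TABLE hotData;
```

The caveat is that `hotData` is a separate cached relation, not the original table, so queries have to reference it explicitly; queries against `fullTable` itself won't pick up the cached columns.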