Hi all,
I am having a hard time understanding the caching concepts in Spark.
I have a Hive table ("person") that is cached in Spark.
sqlContext.sql("create table person (name string, age int)") // Create a new table
//Add some values to the table
...
...
//Cache the table in Spark
sqlContext.cacheTable("person")
sqlContext.isCached("person") //Returns true
sqlContext.sql("insert into table person values ('Foo', 25)") // Insert another row into the table
//Check caching status again
sqlContext.isCached("person") //Returns true
Here sqlContext is a *HiveContext*.
Will the entries inserted after the *cacheTable("person")* call be cached?
In other words, is the ("Foo", 25) entry cached in Spark or not?
If not, how can I cache only the entries inserted later? I don't want to
uncache the whole table first and then cache it again.
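
To make the first question concrete, here is a minimal sketch of how the behaviour could be checked empirically. It is only an illustration under assumptions: it assumes Spark 1.3 or later (for DataFrame.explain), that the "person" table already exists, and that the job is launched with spark-submit. The object name CacheCheck and the app name are made up for this example.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical helper object, not part of my actual job.
object CacheCheck {
  def main(args: Array[String]): Unit = {
    // Master URL is assumed to be supplied by spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("cache-check"))
    val sqlContext = new HiveContext(sc)

    // Mark the existing table as cached, then run a query against it
    // so the cache has a chance to be populated.
    sqlContext.cacheTable("person")
    val before = sqlContext.table("person").count()

    // Insert one more row after the table has been cached.
    sqlContext.sql("insert into table person values ('Foo', 25)")

    // If the counts differ by one, the new row is at least visible
    // to queries that go through the cached table.
    val after = sqlContext.table("person").count()
    println(s"rows before insert: $before, rows after insert: $after")

    // Print the physical plan; an in-memory scan node in the output
    // would suggest the query is being served from the cache.
    sqlContext.sql("select * from person where name = 'Foo'").explain()

    sc.stop()
  }
}
```

This only shows whether the new row is visible and whether the plan reads from the in-memory cache; it does not tell me whether Spark rebuilt the whole cache or added just the new row, which is the part I am unsure about.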
Any relevant links or information would be appreciated.
- Anjali Chadha