Have you seen https://issues.apache.org/jira/browse/SPARK-6910? I opened https://issues.apache.org/jira/browse/SPARK-6984, which I think is related to this as well. There are a bunch of issues attached to it, but basically yes: Spark's interactions with the metastore are bad, and very bad if your metastore is large.
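
For reference, here is a minimal sketch of the df.printSchema() suggestion from the quoted thread below. It assumes `sql_context` is a Spark 1.4 HiveContext (or SQLContext); the helper name `print_table_schema` is made up for illustration. Looking the table up still goes through the metastore, so on a table with many partitions this may be slow for the reasons in the JIRAs above.

```python
def print_table_schema(sql_context, table_name):
    """Print a table's column names and types without running DESCRIBE.

    Assumption: `sql_context` is a Spark 1.4 HiveContext (or SQLContext)
    and `table_name` is a table it can resolve. This helper is a
    hypothetical wrapper, not part of Spark.
    """
    # table() looks the table up in the catalog and returns a DataFrame.
    df = sql_context.table(table_name)
    # printSchema() prints the column names and types as a tree to stdout.
    df.printSchema()
    # Return the column names too, so callers can use them programmatically.
    return df.columns
```

From a PySpark shell this would be called as print_table_schema(sqlContext, "my_table"), where "my_table" is a placeholder table name.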

On Sun, Jul 12, 2015 at 11:39 PM, Jerrick Hoang <[email protected]> wrote:

> Sorry all for not being clear. I'm using spark 1.4 and the table is a hive
> table, and the table is partitioned.
>
> On Sun, Jul 12, 2015 at 6:36 PM, Yin Huai <[email protected]> wrote:
>
>> Jerrick,
>>
>> Let me ask a few clarification questions. What is the version of Spark?
>> Is the table a hive table? What is the format of the table? Is the table
>> partitioned?
>>
>> Thanks,
>>
>> Yin
>>
>> On Sun, Jul 12, 2015 at 6:01 PM, ayan guha <[email protected]> wrote:
>>
>>> Describe computes statistics, so it will try to query the table. The one
>>> you are looking for is df.printSchema()
>>>
>>> On Mon, Jul 13, 2015 at 10:03 AM, Jerrick Hoang <[email protected]>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I'm new to Spark and this question may be trivial or has already been
>>>> answered, but when I do a 'describe table' from SparkSQL CLI it seems to
>>>> try looking at all records at the table (which takes a really long time for
>>>> big table) instead of just giving me the metadata of the table. Would
>>>> appreciate if someone can give me some pointers, thanks!
>>>>
>>>
>>> --
>>> Best Regards,
>>> Ayan Guha
>>>
>>
>
