(apologies for sending this twice, first via nabble; didn't realize it
wouldn't get forwarded)

Hey, I know it's not officially released yet, but I'm trying to understand
(and run) the Thrift-based JDBC server, in order to enable remote JDBC
access to our dev cluster.

Before asking about details, is my understanding of this correct?
`sbin/start-thriftserver.sh` starts a JDBC/Hive server that doesn't require
running a Hive+MR cluster (i.e. just Spark/Spark+YARN)?

Assuming yes, I have hope that it all basically works, just that some
documentation needs to be cleaned up:

- I found a release page implying that 1.1 will be released "pretty
soon-ish": https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage
- I can find recent activity (within the last 30 days or so) with promising
titles: ["Updated Spark SQL README to include the hive-thriftserver
module"](https://github.com/apache/spark/pull/1867), ["[SPARK-2410][SQL]
Merging Hive Thrift/JDBC server (with Maven profile fix)"](https://github.com/apache/spark/pull/1620)

Am I following all the right email threads, issue trackers, and whatnot?

Specifically, I tried:

1. Building off of `branch-1.1`, synced as of ~today (2014 Aug 25)
2. Running `sbin/start-thriftserver.sh` in `yarn-client` mode
3. I can see the process running and the Spark context/app created in the
YARN logs, and I can connect to the Thrift server on the default port of
10000 using `bin/beeline`
4. However, when I try to find out what that cluster has via `show
tables;`, I see in the logs a connection error to some (what I assume to
be) random port.
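
In shell terms, steps 2 to 4 amount to roughly the following. This is a sketch rather than a verbatim transcript: the host `localhost`, passing the master via `--master yarn-client`, and the helper names `start_server` and `connect_beeline` are assumptions for illustration, and it expects to be run from the root of a Spark checkout built from `branch-1.1`.

```shell
#!/usr/bin/env bash
# Sketch only; nothing is started until one of the functions is called.

THRIFT_HOST="localhost"   # ASSUMPTION: server runs on the local machine
THRIFT_PORT=10000         # default Thrift server port mentioned above
JDBC_URL="jdbc:hive2://${THRIFT_HOST}:${THRIFT_PORT}"

# Step 2: start the Thrift server in yarn-client mode.
# ASSUMPTION: the master is supplied with --master, as for spark-submit.
start_server() {
  sbin/start-thriftserver.sh --master yarn-client
}

# Step 3: open a beeline session against the server.
# Step 4: at the beeline prompt, type:  show tables;
connect_beeline() {
  bin/beeline -u "${JDBC_URL}"
}

echo "JDBC URL: ${JDBC_URL}"
```

Calling `start_server` and then `connect_beeline` reproduces the sequence; the connection error shows up in the server logs once `show tables;` is issued.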

So what service am I forgetting/too ignorant to run? Or did I misunderstand
and we do need a live Hive instance to back thriftserver? Or is this a
YARN-specific issue?

Only recently started learning the ecosystem and community, so apologies
for the longer post and lots of questions. :)

Matt
