I am able to connect to the MySQL Hive metastore from the client cluster
machine.
-sh-4.1$ mysql --user=hiveuser --password=pass --host=hostname.vip.company.com
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 9417286
Server version: 5.5.12-eb-5.5.12-log MySQL-eb 5.5.12, Revision 3492
Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights
reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input
statement.
mysql> use eBayHDB;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show tables;
+---------------------------+
| Tables_in_HDB |
+---------------------------+
Regards,
Deepak
On Sat, Mar 28, 2015 at 12:35 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
Yes, I am using yarn-cluster and I did add it via --files. I get a
"No suitable driver found" error.
Please share the spark-submit command that shows the MySQL jar
containing the driver class used to connect to the Hive MySQL metastore.
Even after including it through
--driver-class-path
/home/dvasthimal/spark1.3/mysql-connector-java-5.1.34.jar
OR (AND)
--jars /home/dvasthimal/spark1.3/mysql-connector-java-5.1.34.jar
I keep getting "No suitable driver found for jdbc:mysql://hostname:3306/HDB"
Command
========
./bin/spark-submit -v --master yarn-cluster \
  --driver-class-path /home/dvasthimal/spark1.3/mysql-connector-java-5.1.34.jar:/apache/hadoop/share/hadoop/common/hadoop-common-2.4.1-EBAY-2.jar:/apache/hadoop/lib/hadoop-lzo-0.6.0.jar:/apache/hadoop-2.4.1-2.1.3.0-2-EBAY/share/hadoop/yarn/lib/guava-11.0.2.jar \
  --jars /home/dvasthimal/spark1.3/spark-avro_2.10-1.0.0.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar,/home/dvasthimal/spark1.3/mysql-connector-java-5.1.34.jar \
  --files $SPARK_HOME/conf/hive-site.xml --num-executors 1 \
  --driver-memory 4g --driver-java-options "-XX:MaxPermSize=2G" \
  --executor-memory 2g --executor-cores 1 --queue hdmi-express \
  --class com.ebay.ep.poc.spark.reporting.SparkApp \
  spark_reporting-1.0-SNAPSHOT.jar startDate=2015-02-16 \
  endDate=2015-02-16 \
  input=/user/dvasthimal/epdatasets/successdetail1/part-r-00000.avro \
  subcommand=successevents2 \
  output=/user/dvasthimal/epdatasets/successdetail2
Logs
====
Caused by: java.sql.SQLException: No suitable driver found for jdbc:mysql://hostname:3306/HDB
at java.sql.DriverManager.getConnection(DriverManager.java:596)
at java.sql.DriverManager.getConnection(DriverManager.java:187)
at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:361)
at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:416)
... 68 more
...
...
15/03/27 23:56:08 INFO yarn.Client: Uploading resource file:/home/dvasthimal/spark1.3/mysql-connector-java-5.1.34.jar -> hdfs://apollo-NN:8020/user/dvasthimal/.sparkStaging/application_1426715280024_119815/mysql-connector-java-5.1.34.jar
...
...
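The exception above is raised by java.sql.DriverManager when no registered JDBC driver accepts the URL. A minimal Java sketch of a diagnostic that could be run with the same class path as the failing JVM is shown here. The class JdbcDriverProbe and its method names are hypothetical and are not part of Spark, Hive, or the application in this thread; the sketch only reports what DriverManager can see and does not change any Spark configuration.

```java
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic class (an assumption for illustration only).
class JdbcDriverProbe {

    // Class names of the JDBC drivers that DriverManager exposes to this caller.
    static List<String> registeredDrivers() {
        List<String> names = new ArrayList<>();
        for (Driver d : Collections.list(DriverManager.getDrivers())) {
            names.add(d.getClass().getName());
        }
        return names;
    }

    // True if the named class can be loaded from the current class path.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Attempts a connection; returns the failure message, or null on success.
    static String tryConnect(String url) {
        try (Connection c = DriverManager.getConnection(url)) {
            return null;
        } catch (SQLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // The default URL mirrors the one in the log; pass another as the first argument.
        String url = args.length > 0 ? args[0] : "jdbc:mysql://hostname:3306/HDB";
        System.out.println("com.mysql.jdbc.Driver loadable: "
                + isLoadable("com.mysql.jdbc.Driver"));
        System.out.println("registered drivers: " + registeredDrivers());
        System.out.println("connect result: " + tryConnect(url));
    }
}
```

If the driver class is reported as not loadable, the jar is missing from that JVM's class path; if it is loadable but absent from the registered list, the driver was probably loaded by a class loader that DriverManager does not consult for this caller.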
-sh-4.1$ jar -tvf ../mysql-connector-java-5.1.34.jar | grep Driver
    61 Fri Oct 17 08:05:36 GMT-07:00 2014 META-INF/services/java.sql.Driver
  3396 Fri Oct 17 08:05:22 GMT-07:00 2014 com/mysql/fabric/jdbc/FabricMySQLDriver.class
   692 Fri Oct 17 08:05:22 GMT-07:00 2014 com/mysql/jdbc/Driver.class
  1562 Fri Oct 17 08:05:20 GMT-07:00 2014 com/mysql/jdbc/NonRegisteringDriver$ConnectionPhantomReference.class
 17817 Fri Oct 17 08:05:20 GMT-07:00 2014 com/mysql/jdbc/NonRegisteringDriver.class
   690 Fri Oct 17 08:05:24 GMT-07:00 2014 com/mysql/jdbc/NonRegisteringReplicationDriver.class
   731 Fri Oct 17 08:05:24 GMT-07:00 2014 com/mysql/jdbc/ReplicationDriver.class
   336 Fri Oct 17 08:05:24 GMT-07:00 2014 org/gjt/mm/mysql/Driver.class
-sh-4.1$ cat conf/hive-site.xml | grep Driver
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
-sh-4.1$
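The two manual checks in this session (listing the jar entries and grepping the driver name from hive-site.xml) can also be scripted. The following is a rough Java sketch using only java.util.jar; the class JarDriverCheck and its methods are hypothetical names introduced for illustration, and the default jar path and driver class in main are simply the values quoted in this thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.zip.ZipEntry;

// Hypothetical helper (an assumption for illustration only).
class JarDriverCheck {

    // Service file that JDBC 4 drivers use to announce themselves.
    static final String SERVICE_ENTRY = "META-INF/services/java.sql.Driver";

    // True if the jar has a .class entry for the given fully qualified class name.
    static boolean containsClass(String jarPath, String className) throws IOException {
        String entry = className.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entry) != null;
        }
    }

    // Driver class names listed in the service file; empty if the file is absent.
    static List<String> declaredDrivers(String jarPath) throws IOException {
        List<String> drivers = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            ZipEntry service = jar.getEntry(SERVICE_ENTRY);
            if (service == null) {
                return drivers;
            }
            try (InputStream in = jar.getInputStream(service);
                 BufferedReader reader = new BufferedReader(
                         new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // Strip '#' comments and blank lines, as in service files.
                    int hash = line.indexOf('#');
                    if (hash >= 0) {
                        line = line.substring(0, hash);
                    }
                    line = line.trim();
                    if (!line.isEmpty()) {
                        drivers.add(line);
                    }
                }
            }
        }
        return drivers;
    }

    public static void main(String[] args) throws IOException {
        // Defaults are the jar name and driver class quoted in this thread.
        String jar = args.length > 0 ? args[0] : "mysql-connector-java-5.1.34.jar";
        String driver = args.length > 1 ? args[1] : "com.mysql.jdbc.Driver";
        System.out.println("contains " + driver + ": " + containsClass(jar, driver));
        System.out.println("declared drivers: " + declaredDrivers(jar));
    }
}
```

A positive result only shows that the jar itself is complete, which matches the listing above; it says nothing about whether the jar reaches the class path of the JVM that opens the metastore connection.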
--
Deepak
On Sat, Mar 28, 2015 at 1:06 AM, Michael Armbrust <mich...@databricks.com> wrote:
Are you running on yarn?
- If you are running in yarn-client mode, set HADOOP_CONF_DIR
to /etc/hive/conf/ (or the directory where your hive-site.xml
is located).
- If you are running in yarn-cluster mode, the easiest thing
to do is to add --files=/etc/hive/conf/hive-site.xml (or the
path for your hive-site.xml) to your spark-submit script.
On Fri, Mar 27, 2015 at 5:42 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
I can recreate the tables, but what about the data? This looks
like an obvious feature that Spark SQL must have.
People will want to transform tons of data stored in HDFS
through Hive from Spark SQL.
The Spark programming guide suggests it is possible:
"Spark SQL also supports reading and writing data stored in
Apache Hive <http://hive.apache.org/>. ... Configuration
of Hive is done by placing your hive-site.xml file in
conf/."
https://spark.apache.org/docs/1.3.0/sql-programming-guide.html#hive-tables
For some reason it is not working.
On Fri, Mar 27, 2015 at 3:35 PM, Arush Kharbanda <ar...@sigmoidanalytics.com> wrote:
It seems Spark SQL accesses some more columns apart from
those created by Hive.
You can always recreate the tables; you would need to
execute the table creation scripts, but it would be
good to avoid recreation.
On Fri, Mar 27, 2015 at 3:20 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
I did copy hive-conf.xml from the Hive installation
into spark-home/conf. It does have all the metastore
connection details: host, username, password,
driver, and others.
Snippet
======
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://host.vip.company.com:3306/HDB</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC
metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hiveuser</value>
<description>username to use against metastore
database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>some-password</value>
<description>password to use against metastore
database</description>
</property>
<property>
<name>hive.metastore.local</name>
<value>false</value>
<description>controls whether to connect to remove
metastore server or open a new metastore server in
Hive Client JVM</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the
warehouse</description>
</property>
......
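To confirm which metastore settings a given hive-site.xml actually carries, the property list can be read programmatically. This is a minimal Java sketch using the JDK's built-in DOM parser; HiveSiteReader is a hypothetical name, the default path conf/hive-site.xml is an assumption, and this is not how Spark or Hive load the file internally.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (an assumption for illustration only).
class HiveSiteReader {

    // Parses Hadoop-style <property><name/><value/></property> entries in file order.
    static Map<String, String> readProperties(InputStream xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(xml);
        Map<String, String> props = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element property = (Element) nodes.item(i);
            String name = childText(property, "name");
            if (name != null) {
                props.put(name, childText(property, "value"));
            }
        }
        return props;
    }

    // Trimmed text of the first child element with the given tag, or null if absent.
    private static String childText(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? null : list.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Assumed default location; pass the real path as the first argument.
        String path = args.length > 0 ? args[0] : "conf/hive-site.xml";
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            for (Map.Entry<String, String> e : readProperties(in).entrySet()) {
                String key = e.getKey();
                // Print only the JDO connection settings and never the password.
                if (key.startsWith("javax.jdo.option.")
                        && !key.endsWith("ConnectionPassword")) {
                    System.out.println(key + " = " + e.getValue());
                }
            }
        }
    }
}
```

Running it against the copy under the Spark conf directory shows whether the ConnectionURL and ConnectionDriverName values match the ones in the Hive installation.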
When I attempt to read a Hive table, it does not
work: dw_bid does not exist.
I am sure there is a way to read tables stored in
HDFS (Hive) from Spark SQL. Otherwise, how would
anyone do analytics, since the source tables are
always persisted either directly on HDFS or
through Hive?
On Fri, Mar 27, 2015 at 1:15 PM, Arush Kharbanda <ar...@sigmoidanalytics.com> wrote:
Hive and Spark SQL both use HDFS and the Hive
metastore internally, so the only thing you want to
change is the processing engine. You can try
to bring your hive-site.xml to
%SPARK_HOME%/conf/hive-site.xml (ensure that
the hive-site.xml captures the metastore
connection details).
It's a hack and I haven't tried it, but I have played
around with the metastore and it should work.
On Fri, Mar 27, 2015 at 12:04 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
I have a few tables that were created in
Hive. I want to transform the data stored in
these Hive tables using Spark SQL. Is this
even possible?
So far I have seen that I can create new
tables using the Spark SQL dialect. However,
when I run show tables or desc
hive_table, it says table not found.
I am now wondering whether this is supported
in Spark SQL.
--
Deepak
--
Sigmoid Analytics
Arush Kharbanda || Technical Teamlead
ar...@sigmoidanalytics.com || www.sigmoidanalytics.com