Hello Hive User Mailing List,

I'm trying to debug a custom InputFormat that I'm using in Hive. I'm using version 0.12.0 of Hive and Hadoop 2.4.1.

I'm having trouble attaching a debugger to my InputFormat class inside the Hive server. My session looks like this:

$ ./hive-0.12.0/bin/hive --debug
Listening for transport dt_socket at address: 8000

(I attach a debugger from IntelliJ at this point; all seems to be going well.)

15/02/11 23:28:01 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
15/02/11 23:28:01 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
15/02/11 23:28:01 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
15/02/11 23:28:01 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
15/02/11 23:28:01 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
15/02/11 23:28:01 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
15/02/11 23:28:01 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative

Logging initialized using configuration in ...
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [......]
SLF4J: Found binding in [......]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2015-02-11 23:28:02.016 java[2237:86664] Unable to load realm info from SCDynamicStore

Now I'm trying to exercise my custom InputFormat class:

hive> select * from messages;

The debugger attaches, I can step through, and everything is still going great. The trouble starts when I run anything other than a plain "select * from table", i.e. any query that launches a MapReduce job. For example:

hive> select field from messages;

Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Error occurred during initialization of VM
ERROR: Cannot load this JVM TI agent twice, check your java command line for duplicate jdwp options.
agent library failed to init: jdwp
Execution failed with exit status: 1
Obtaining error information

Task failed!
Task ID:
  Stage-1

Logs:

/tmp/luke/hive.log
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

I haven't explicitly set any environment variables like HADOOP_OPTS or HIVE_OPTS. I'm relying on the --debug flag to do this for me when I launch Hive. However, I do notice the following if I run "set" from the Hive shell:

env:HADOOP_OPTS= -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/.../hadoop-2.4.1/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/.../hadoop-2.4.1 -Dhadoop.id.str=luke -Dhadoop.root.logger=INFO,console -Djava.library.path=/.../hadoop-2.4.1/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx512m -XX:+UseParallelGC -agentlib:jdwp=transport=dt_socket,server=y,address=8000,suspend=y -Dhadoop.security.logger=INFO,NullAppender
env:HIVE_MAIN_CLIENT_DEBUG_OPTS= -XX:+UseParallelGC -agentlib:jdwp=transport=dt_socket,server=y,address=8000,suspend=y
env:HIVE_CHILD_CLIENT_DEBUG_OPTS= -XX:+UseParallelGC -agentlib:jdwp=transport=dt_socket,server=y,suspend=n
env:HADOOP_CLIENT_OPTS=-Xmx512m -XX:+UseParallelGC -agentlib:jdwp=transport=dt_socket,server=y,address=8000,suspend=y
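If I'm reading that dump right, both HADOOP_OPTS and HADOOP_CLIENT_OPTS carry the same -agentlib:jdwp flag, so any JVM launched with both sets concatenated onto its command line would request the agent twice, which matches the error. A trimmed-down sketch of my reading (the values below are just the debug-relevant parts of the env dump; most flags and paths are omitted):

```shell
# Trimmed-down versions of the two variables from the env dump above
# (only the debug-relevant flags kept; this is my reconstruction, not a log):
HADOOP_OPTS='-Xmx512m -XX:+UseParallelGC -agentlib:jdwp=transport=dt_socket,server=y,address=8000,suspend=y'
HADOOP_CLIENT_OPTS='-Xmx512m -XX:+UseParallelGC -agentlib:jdwp=transport=dt_socket,server=y,address=8000,suspend=y'

# If a child JVM gets both sets concatenated, the jdwp agent appears
# twice on its command line; count the occurrences:
echo "$HADOOP_OPTS $HADOOP_CLIENT_OPTS" | grep -o 'agentlib:jdwp' | wc -l
# two matches -> "Cannot load this JVM TI agent twice"
```

So my working theory is that the agent flag needs to be stripped from whichever set the child MR task inherits, but I don't know where that composition happens.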

Also, per https://issues.apache.org/jira/browse/HIVE-3936, I've commented out line 217 of bin/hive, so that block now looks like this:

# Starting at line 210:
if [ "$DEBUG" ]; then
  if [ "$HELP" ]; then
    debug_help
    exit 0
  else
    get_debug_params "$DEBUG"
    export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS $HIVE_MAIN_CLIENT_DEBUG_OPTS"
#    export HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
  fi
fi

As I said, this fix works fine when no MR tasks need to be launched, but I keep hitting the same jdwp error as soon as I try anything non-trivial.

Any help is appreciated. Thank you for your time.

Luke
