Sorry I missed the last part mentioning that it messes up logs of "show tables" after "set". That's strange, I tried on latest trunk and I don't see that happening, show tables still shows the "perf" logs.
On Wed, Jun 11, 2014 at 1:06 PM, Szehon Ho <sze...@cloudera.com> wrote: > I think that's expected. SQL Operations like "show tables" will reach > Driver, which has perf and detailed logs about execution. > > Other operations like "set" or "add" are not SQL Operations, so in HS2 > they don't hit the Driver and don't generate the logs. They are pretty > simple ops that just set some state. Did those show in HS1? If so, maybe > the implementation changed. > > Thanks > Szehon > > > On Wed, Jun 11, 2014 at 4:40 AM, Dima Machlin <dima.mach...@pursway.com> > wrote: > >> Any change somebody has a clue about this? >> >> From: Dima Machlin [mailto:dima.mach...@pursway.com] >> Sent: Sunday, May 25, 2014 1:54 PM >> To: u...@hive.apache.org >> Subject: RE: HiveServer2 VS HiveServer1 Logging >> >> I’ve made some progress in investigating this. >> It seems that this behavior happens on certain conditions. >> >> As long as i’m running any query that isn’t “set” or “add” command the >> logging is fine. >> For example “show table” : >> >> 14/05/25 13:47:17 INFO cli.CLIService: SessionHandle >> [2db07453-2235-4f22-ab72-4a27c1b1457d]: openSession() >> 14/05/25 13:47:17 INFO cli.CLIService: SessionHandle >> [2db07453-2235-4f22-ab72-4a27c1b1457d]: getInfo() >> 14/05/25 13:47:18 INFO ql.Driver: <PERFLOG method=Driver.run> >> 14/05/25 13:47:18 INFO ql.Driver: <PERFLOG method=TimeToSubmit> >> 14/05/25 13:47:18 INFO ql.Driver: <PERFLOG method=compile> >> 14/05/25 13:47:18 INFO ql.Driver: <PERFLOG method=parse> >> 14/05/25 13:47:18 INFO parse.ParseDriver: Parsing command: show tables >> 14/05/25 13:47:18 INFO parse.ParseDriver: Parse Completed >> 14/05/25 13:47:18 INFO ql.Driver: </PERFLOG method=parse >> start=1401014838047 end=1401014838376 duration=329> >> 14/05/25 13:47:18 INFO ql.Driver: <PERFLOG method=semanticAnalyze> >> 14/05/25 13:47:18 INFO ql.Driver: Semantic Analysis Completed >> 14/05/25 13:47:18 INFO ql.Driver: </PERFLOG method=semanticAnalyze >> start=1401014838376 end=1401014838453 duration=77> >> 14/05/25 13:47:18 INFO exec.ListSinkOperator: Initializing Self 0 OP >> 14/05/25 13:47:18 INFO exec.ListSinkOperator: Operator 0 OP initialized >> 14/05/25 13:47:18 INFO exec.ListSinkOperator: Initialization Done 0 OP >> 14/05/25 13:47:18 INFO ql.Driver: Returning Hive schema: >> Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from >> deserializer)], properties:null) >> 14/05/25 13:47:18 INFO ql.Driver: </PERFLOG method=compile >> start=1401014838011 end=1401014838521 duration=510> >> 14/05/25 13:47:18 INFO ql.Driver: <PERFLOG method=Driver.execute> >> 14/05/25 13:47:18 INFO ql.Driver: Starting command: show tables >> 14/05/25 13:47:18 INFO ql.Driver: </PERFLOG method=TimeToSubmit >> start=1401014838011 end=1401014838531 duration=520> >> 14/05/25 13:47:18 INFO ql.Driver: <PERFLOG method=runTasks> >> 14/05/25 13:47:18 INFO ql.Driver: <PERFLOG method=task.DDL.Stage-0> >> 14/05/25 13:47:18 INFO hive.metastore: Trying to connect to metastore >> with URI thrift://localhost:9083 >> 14/05/25 13:47:18 INFO hive.metastore: Waiting 1 seconds before next >> connection attempt. >> 14/05/25 13:47:19 INFO hive.metastore: Connected to metastore. >> 14/05/25 13:47:19 INFO ql.Driver: </PERFLOG method=task.DDL.Stage-0 >> start=1401014838531 end=1401014839627 duration=1096> >> 14/05/25 13:47:19 INFO ql.Driver: </PERFLOG method=runTasks >> start=1401014838531 end=1401014839627 duration=1096> >> 14/05/25 13:47:19 INFO ql.Driver: </PERFLOG method=Driver.execute >> start=1401014838521 end=1401014839627 duration=1106> >> OK >> 14/05/25 13:47:19 INFO ql.Driver: OK >> 14/05/25 13:47:19 INFO ql.Driver: <PERFLOG method=releaseLocks> >> 14/05/25 13:47:19 INFO ql.Driver: </PERFLOG method=releaseLocks >> start=1401014839627 end=1401014839627 duration=0> >> 14/05/25 13:47:19 INFO ql.Driver: </PERFLOG method=Driver.run >> start=1401014838011 end=1401014839627 duration=1616> >> 14/05/25 13:47:19 INFO cli.CLIService: SessionHandle >> [2db07453-2235-4f22-ab72-4a27c1b1457d]: executeStatement() >> 14/05/25 13:47:19 INFO cli.CLIService: OperationHandle >> [opType=EXECUTE_STATEMENT, >> getHandleIdentifier()=0628b8f8-01de-4397-8279-a314cf553d7f]: >> getResultSetMetadata() >> 14/05/25 13:47:19 WARN snappy.LoadSnappy: Snappy native library not loaded >> 14/05/25 13:47:19 INFO mapred.FileInputFormat: Total input paths to >> process : 1 >> 14/05/25 13:47:19 INFO cli.CLIService: OperationHandle >> [opType=EXECUTE_STATEMENT, >> getHandleIdentifier()=0628b8f8-01de-4397-8279-a314cf553d7f]: fetchResults() >> 14/05/25 13:47:19 INFO cli.CLIService: OperationHandle >> [opType=EXECUTE_STATEMENT, >> getHandleIdentifier()=0628b8f8-01de-4397-8279-a314cf553d7f]: fetchResults() >> 14/05/25 13:47:19 INFO exec.ListSinkOperator: 0 finished. closing... >> 14/05/25 13:47:19 INFO exec.ListSinkOperator: 0 forwarded 0 rows >> 14/05/25 13:47:19 INFO ql.Driver: <PERFLOG method=releaseLocks> >> 14/05/25 13:47:19 INFO ql.Driver: </PERFLOG method=releaseLocks >> start=1401014839857 end=1401014839857 duration=0> >> 14/05/25 13:47:19 INFO cli.CLIService: OperationHandle >> [opType=EXECUTE_STATEMENT, >> getHandleIdentifier()=0628b8f8-01de-4397-8279-a314cf553d7f]: closeOperation >> >> >> Now running : “set hive.enforce.bucketing = true;” >> >> 14/05/25 13:48:07 INFO operation.Operation: Putting temp output to file >> /tmp/hadoop/2db07453-2235-4f22-ab72-4a27c1b1457d2566159976359370628.pipeout >> 14/05/25 13:48:07 INFO cli.CLIService: SessionHandle >> [2db07453-2235-4f22-ab72-4a27c1b1457d]: executeStatement() >> 14/05/25 13:48:07 INFO cli.CLIService: OperationHandle >> [opType=EXECUTE_STATEMENT, >> getHandleIdentifier()=7b13a3e2-e0ea-4dae-b693-0d456519fc66]: >> getOperationStatus() >> >> First thing that happens is : “Putting temp output to file” and from now >> on, nothing is shown in the console. >> >> Running again “show tables” generates no logging at all in the console. >> >> The hiveserver is started like this : >> /opt/mapr/hive/hive-0.12/bin/hive --service hiveserver2 --hiveconf >> hive.server2.thrift.port=26006 >> >> The connection is made from JDBC. >> >> Executing the same procedure from the CLI generates all the expected >> logging. >> >> >> From: Dima Machlin [mailto:dima.mach...@pursway.com] >> Sent: Wednesday, May 21, 2014 5:06 PM >> To: u...@hive.apache.org<mailto:u...@hive.apache.org> >> Subject: HiveServer2 VS HiveServer1 Logging >> >> >> Hi all, >> >> It seems that for some reason HS2 outputs far less logging than HS1 in >> hive 0.12 for example, starting HS1 in the following way : hive --service >> hiveserver >> and executing show tables produces this : >> >> 14/04/30 17:14:16 [pool-1-thread-2] INFO service.HiveServer: Running the >> query: show tables >> 14/04/30 17:14:16 [pool-1-thread-2] INFO exec.ListSinkOperator: 1 >> finished. closing... >> 14/04/30 17:14:16 [pool-1-thread-2] INFO exec.ListSinkOperator: 1 >> forwarded 0 rows >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG >> method=Driver.run> >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG >> method=TimeToSubmit> >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG >> method=compile> >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG method=parse> >> 14/04/30 17:14:16 [pool-1-thread-2] INFO parse.ParseDriver: Parsing >> command: show tables >> 14/04/30 17:14:16 [pool-1-thread-2] INFO parse.ParseDriver: Parse >> Completed >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: </PERFLOG >> method=parse start=1398867256379 end=1398867256380 duration=1> >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG >> method=semanticAnalyze> >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: Semantic Analysis >> Completed >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: </PERFLOG >> method=semanticAnalyze start=1398867256380 end=1398867256381 duration=1> >> 14/04/30 17:14:16 [pool-1-thread-2] INFO exec.ListSinkOperator: >> Initializing Self 2 OP >> 14/04/30 17:14:16 [pool-1-thread-2] INFO exec.ListSinkOperator: Operator >> 2 OP initialized >> 14/04/30 17:14:16 [pool-1-thread-2] INFO exec.ListSinkOperator: >> Initialization Done 2 OP >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: Returning Hive >> schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, >> comment:from deserializer)], properties:null) >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: </PERFLOG >> method=compile start=1398867256379 end=1398867256384 duration=5> >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG >> method=Driver.execute> >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: Starting command: >> show tables >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: </PERFLOG >> method=TimeToSubmit start=1398867256378 end=1398867256384 duration=6> >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG >> method=runTasks> >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG >> method=task.DDL.Stage-0> >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: </PERFLOG >> method=task.DDL.Stage-0 start=1398867256384 end=1398867256398 duration=14> >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: </PERFLOG >> method=runTasks start=1398867256384 end=1398867256398 duration=14> >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: </PERFLOG >> method=Driver.execute start=1398867256384 end=1398867256398 duration=14> OK >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: OK >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: <PERFLOG >> method=releaseLocks> >> 14/04/30 17:14:16 [pool-1-thread-2] INFO ql.Driver: </PERFLOG >> method=releaseLocks start=1398867256398 end=1398867256398 duration=0> >> >> while when running on HS2 (hive --service hiveserver2), the only logging >> data is from the initialization of the connection: >> >> 14/04/30 17:18:54 [pool-2-thread-1] INFO cli.CLIService: SessionHandle >> [4b0a96cb-626b-45ae-ae0d-6a959e84e5a4]: openSession() >> 14/04/30 17:18:55 [pool-2-thread-1] INFO operation.Operation: Putting >> temp output to file >> /tmp/hadoop/4b0a96cb-626b-45ae-ae0d-6a959e84e5a44568051681296760332.pipeout >> 14/04/30 17:18:55 [pool-2-thread-1] INFO cli.CLIService: SessionHandle >> [4b0a96cb-626b-45ae-ae0d-6a959e84e5a4]: executeStatement() >> 14/04/30 17:18:55 [pool-2-thread-1] INFO cli.CLIService: OperationHandle >> [opType=EXECUTE_STATEMENT, >> getHandleIdentifier()=01a12804-8f77-4e8f-b17a-de4c39e481c8]: >> getOperationStatus() >> >> And no sign for the executed query. >> The show tables is just an example, it also doesn’t show any queries >> (both mapreduce and direct) >> >> Settings : >> hive.root.logger in hive-log4j.properties is set to INFO,console >> >> Any reason for this? >> >> >> >> >> >> >> ************************************************************************************ >> This footnote confirms that this email message has been scanned by >> PineApp Mail-SeCure for the presence of malicious code, vandals & >> computer viruses. >> >> ************************************************************************************ >> >> >> >> >> >> ************************************************************************************ >> This footnote confirms that this email message has been scanned by >> PineApp Mail-SeCure for the presence of malicious code, vandals & >> computer viruses. >> >> ************************************************************************************ >> > >