[ 
https://issues.apache.org/jira/browse/HIVE-20190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20190:
-------------------------------
    Description: 
https://github.com/apache/hive/blob/e7d1781ec4662e088dcd6ffbe3f866738792ad9b/service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java#L320

A misbehaving client can sometimes knock an HS2 instance offline by opening 
many simultaneous connections and consuming all of its resources.  It would 
help to log the source IP address of each connection along with the "Client 
protocol version" information, which would make it much easier to pinpoint 
the problematic client.  Extra credit for logging the Kerberos principal name 
as well.

Currently, the log output for a client connecting looks something like:

{code}
2018-07-16 09:40:44,939  INFO  org.apache.hive.service.cli.thrift.ThriftCLIService: [HiveServer2-Handler-Pool: Thread-290000]: Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V7
2018-07-16 09:40:44,941  INFO  hive.metastore: [HiveServer2-Handler-Pool: Thread-290000]: Trying to connect to metastore with URI thrift://host:9083
2018-07-16 09:40:44,942  INFO  hive.metastore: [HiveServer2-Handler-Pool: Thread-290000]: Opened a connection to metastore, current connections: 40
2018-07-16 09:40:44,943  INFO  hive.metastore: [HiveServer2-Handler-Pool: Thread-290000]: Connected to metastore.
2018-07-16 09:40:44,950  INFO  org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool: Thread-290000]: Created local directory: /tmp/d88e17d3-ac42-4de5-8043-9a9e2097ef8d_resources
2018-07-16 09:40:44,953  INFO  org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool: Thread-290000]: Created HDFS directory: /tmp/hive/user/d88e17d3-ac42-4de5-8043-9a9e2097ef8d
2018-07-16 09:40:44,954  INFO  org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool: Thread-290000]: Created local directory: /tmp/hive/d88e17d3-ac42-4de5-8043-9a9e2097ef8d
2018-07-16 09:40:44,957  INFO  org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool: Thread-290000]: Created HDFS directory: /tmp/hive/user/d88e17d3-ac42-4de5-8043-9a9e2097ef8d/_tmp_space.db
2018-07-16 09:40:44,958  INFO  org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool: Thread-290000]: No Tez session required at this point. hive.execution.engine=mr.
2018-07-16 09:40:44,958  INFO  org.apache.hive.service.cli.session.HiveSessionImpl: [HiveServer2-Handler-Pool: Thread-290000]: Operation log session directory is created: /tmp/hive/operation_logs/d88e17d3-ac42-4de5-8043-9a9e2097ef8d
2018-07-16 09:40:44,959  INFO  org.apache.hive.service.cli.thrift.ThriftCLIService: [HiveServer2-Handler-Pool: Thread-290000]: Opened a session, current sessions: 883
{code}
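
To illustrate the request, here is a minimal sketch of what an enriched 
"Client protocol version" message could look like.  Everything in it is 
hypothetical: the class ClientLogSketch, the method describeClient, the 
message layout, and the sample address 192.0.2.10 are assumptions made for 
illustration and are not taken from the Hive code base.  How HS2 would 
actually obtain the client address or the Kerberos principal is deliberately 
left out.

```java
public class ClientLogSketch {

    /**
     * Builds a single log message that carries the protocol version,
     * the client IP address, and the Kerberos principal (hypothetical layout).
     * Missing values are reported as "unknown" instead of being dropped,
     * so every connection produces a line with the same shape.
     */
    public static String describeClient(String protocolVersion,
                                        String clientIp,
                                        String principal) {
        return "Client protocol version: " + protocolVersion
            + ", client IP: " + orUnknown(clientIp)
            + ", principal: " + orUnknown(principal);
    }

    /** Returns the value itself, or "unknown" when it is null or empty. */
    public static String orUnknown(String value) {
        return (value == null || value.isEmpty()) ? "unknown" : value;
    }

    public static void main(String[] args) {
        // Example: an unauthenticated client, so no principal is available.
        System.out.println(
            describeClient("HIVE_CLI_SERVICE_PROTOCOL_V7", "192.0.2.10", null));
    }
}
```

Keeping all three fields on one line means a single grep for the message 
prefix is enough to count connections per client address.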


> Report Client IP Address When Opening New Session
> -------------------------------------------------
>
>                 Key: HIVE-20190
>                 URL: https://issues.apache.org/jira/browse/HIVE-20190
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 3.0.0, 2.3.2, 4.0.0
>            Reporter: BELUGA BEHR
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
