[ https://issues.apache.org/jira/browse/HIVE-20190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
BELUGA BEHR updated HIVE-20190:
-------------------------------
    Description:
https://github.com/apache/hive/blob/e7d1781ec4662e088dcd6ffbe3f866738792ad9b/service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java#L320

There are times when a misbehaving client can knock an HS2 instance offline because it opens many simultaneous connections and takes up all of the resources. It would be nice if we could log the source IP address of each connection along with the "Client protocol version" information. That would make it much easier to pinpoint the problematic client. Extra credit for logging the Kerberos principal name as well.

The current logging of a client connecting is something like:

{code}
2018-07-16 09:40:44,939 INFO org.apache.hive.service.cli.thrift.ThriftCLIService: [HiveServer2-Handler-Pool: Thread-290000]: Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V7
2018-07-16 09:40:44,941 INFO hive.metastore: [HiveServer2-Handler-Pool: Thread-290000]: Trying to connect to metastore with URI thrift://host:9083
2018-07-16 09:40:44,942 INFO hive.metastore: [HiveServer2-Handler-Pool: Thread-290000]: Opened a connection to metastore, current connections: 40
2018-07-16 09:40:44,943 INFO hive.metastore: [HiveServer2-Handler-Pool: Thread-290000]: Connected to metastore.
2018-07-16 09:40:44,950 INFO org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool: Thread-290000]: Created local directory: /tmp/d88e17d3-ac42-4de5-8043-9a9e2097ef8d_resources
2018-07-16 09:40:44,953 INFO org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool: Thread-290000]: Created HDFS directory: /tmp/hive/user/d88e17d3-ac42-4de5-8043-9a9e2097ef8d
2018-07-16 09:40:44,954 INFO org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool: Thread-290000]: Created local directory: /tmp/hive/d88e17d3-ac42-4de5-8043-9a9e2097ef8d
2018-07-16 09:40:44,957 INFO org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool: Thread-290000]: Created HDFS directory: /tmp/hive/user/d88e17d3-ac42-4de5-8043-9a9e2097ef8d/_tmp_space.db
2018-07-16 09:40:44,958 INFO org.apache.hadoop.hive.ql.session.SessionState: [HiveServer2-Handler-Pool: Thread-290000]: No Tez session required at this point. hive.execution.engine=mr.
2018-07-16 09:40:44,958 INFO org.apache.hive.service.cli.session.HiveSessionImpl: [HiveServer2-Handler-Pool: Thread-290000]: Operation log session directory is created: /tmp/hive/operation_logs/d88e17d3-ac42-4de5-8043-9a9e2097ef8d
2018-07-16 09:40:44,959 INFO org.apache.hive.service.cli.thrift.ThriftCLIService: [HiveServer2-Handler-Pool: Thread-290000]: Opened a session, current sessions: 883
{code}

  was:
https://github.com/apache/hive/blob/e7d1781ec4662e088dcd6ffbe3f866738792ad9b/service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java#L320

There are times when a misbehaving client can knock an HS2 instance offline because it opens many simultaneous connections and takes up all of the resources. It would be nice if we could log the source IP address of each connection along with the "Client protocol version" information. That would make it much easier to pinpoint the problematic client. Extra credit for logging the Kerberos principal name as well.
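To make the request concrete, here is a minimal sketch of what the enriched "Client protocol version" message might look like. It is not taken from the Hive code base: the class `SessionOpenLog`, its `format` method, the field labels, and the sample IP address and principal are all hypothetical, and how HiveServer2 would actually obtain the client address or principal is left out.

```java
import java.util.Objects;

/**
 * Hypothetical helper (not part of Hive) that builds the session-open log
 * message with the client's source IP and, when known, its Kerberos principal.
 */
final class SessionOpenLog {
    private SessionOpenLog() {}

    /**
     * @param protocolVersion client protocol version, e.g. HIVE_CLI_SERVICE_PROTOCOL_V7
     * @param clientIp        source IP of the connection; may be null or blank if unknown
     * @param principal       Kerberos principal; may be null or blank on unsecured clusters
     */
    static String format(String protocolVersion, String clientIp, String principal) {
        Objects.requireNonNull(protocolVersion, "protocolVersion");
        StringBuilder sb = new StringBuilder("Client protocol version: ")
                .append(protocolVersion)
                .append(", client IP: ")
                .append(isBlank(clientIp) ? "unknown" : clientIp);
        // The principal is optional ("extra credit"), so omit the field when absent.
        if (!isBlank(principal)) {
            sb.append(", principal: ").append(principal);
        }
        return sb.toString();
    }

    private static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        System.out.println(format("HIVE_CLI_SERVICE_PROTOCOL_V7", "10.0.0.12", "alice@EXAMPLE.COM"));
    }
}
```

Keeping the existing "Client protocol version:" prefix at the start of the message means log searches that already match on it would keep working, and grouping the resulting lines by the IP field would show which host is opening the most sessions.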
> Report Client IP Address When Opening New Session
> -------------------------------------------------
>
>                 Key: HIVE-20190
>                 URL: https://issues.apache.org/jira/browse/HIVE-20190
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 3.0.0, 2.3.2, 4.0.0
>            Reporter: BELUGA BEHR
>            Priority: Major