ABHISHEK KUMAR GUPTA created SPARK-24627:
--------------------------------------------

             Summary: [Spark 2.3.0] After HDFS delegation token expires, beeline cannot submit jobs even after kinit
                 Key: SPARK-24627
                 URL: https://issues.apache.org/jira/browse/SPARK-24627
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.3.0
         Environment: OS: SUSE11

Spark Version: 2.3.0 

Hadoop: 2.8.3
            Reporter: ABHISHEK KUMAR GUPTA


Steps (a beeline session was active throughout; a condensed reproduction sketch follows the failure transcript below):

1. Launch spark-beeline.
2. create table alt_s1 (time timestamp, name string, isright boolean, datetoday date, num binary, height double, score float, decimaler decimal(10,0), id tinyint, age int, license bigint, length smallint) row format delimited fields terminated by ',';
3. load data local inpath '/opt/typeddata60.txt' into table alt_s1;
4. show tables; (the table is listed successfully)
5. select * from alt_s1;

Step 5 throws an HDFS_DELEGATION_TOKEN exception:


0: jdbc:hive2://10.18.18.214:23040/default> select * from alt_s1;
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 
1 in stage 22.0 failed 4 times, most recent failure: Lost task 1.3 in stage 
22.0 (TID 106, blr1000023110, executor 1): 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 token (HDFS_DELEGATION_TOKEN token 7 for spark) can't be found in cache
        at org.apache.hadoop.ipc.Client.call(Client.java:1475)
        at org.apache.hadoop.ipc.Client.call(Client.java:1412)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
        at com.sun.proxy.$Proxy15.getBlockLocations(Unknown Source)
        at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:255)
        at sun.reflect.GeneratedMethodAccessor44.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy16.getBlockLocations(Unknown Source)
        at 
org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1226)
        at 
org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1213)
        at 
org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1201)
        at 
org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:306)
        at 
org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:272)
        at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:264)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1526)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:304)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:299)
        at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:312)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
        at 
org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:109)
        at 
org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
        at 
org.apache.spark.rdd.HadoopRDD$$anon$1.liftedTree1$1(HadoopRDD.scala:257)
        at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:256)
        at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:214)
        at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:94)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
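
For reference, a condensed shell reproduction of the session above (the keytab path and principal are placeholders, not taken from this cluster):

  # Authenticate and connect to the Spark JDBC/thrift server.
  kinit -kt /path/to/spark.keytab spark/<host>@<REALM>   # placeholder credentials
  beeline -u "jdbc:hive2://10.18.18.214:23040/default"
  # In beeline: run steps 2-5 above. The SELECT fails once the server's cached
  # HDFS delegation token has exceeded dfs.namenode.delegation.token.max-lifetime,
  # and re-running kinit on the client does not recover the session.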

**Note: Even after kinit, the spark/hadoop delegation token is not renewed.**
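
kinit refreshes the client's Kerberos TGT, but the long-running thrift server and its executors keep using the HDFS delegation token they were launched with, so a new TGT cannot replace the expired token 7 seen above. A possible mitigation sketch, assuming the thrift server accepts the usual spark-submit security options (principal and keytab path are placeholders): start it with a login keytab so Spark can obtain fresh delegation tokens itself:

  # Start the JDBC/thrift server with a principal and keytab so Spark can
  # re-obtain HDFS delegation tokens before the old ones expire.
  sbin/start-thriftserver.sh \
    --principal spark/<host>@<REALM> \
    --keytab /path/to/spark.keytab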



A newly launched spark-sql session succeeds at the same query, because the fresh session obtains a new HDFS delegation token (token 42 in the log below) from the current Kerberos TGT:
1. Launch spark-sql
2. spark-sql> select * from alt_s1;
2018-06-22 14:24:04 INFO  HiveMetaStore:746 - 0: get_table : db=test_one 
tbl=alt_s1
2018-06-22 14:24:04 INFO  audit:371 - ugi=spark/had...@hadoop.com       
ip=unknown-ip-addr      cmd=get_table : db=test_one tbl=alt_s1
2018-06-22 14:24:04 INFO  SQLStdHiveAccessController:95 - Created 
SQLStdHiveAccessController for session context : HiveAuthzSessionContext 
[sessionString=2cf6aac4-91c6-4c2d-871b-4d7620d91f43, clientType=HIVECLI]
2018-06-22 14:24:04 INFO  metastore:291 - Mestastore configuration 
hive.metastore.filter.hook changed from 
org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to 
org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook
2018-06-22 14:24:04 INFO  HiveMetaStore:746 - 0: Shutting down the object 
store...
2018-06-22 14:24:04 INFO  audit:371 - ugi=spark/had...@hadoop.com       
ip=unknown-ip-addr      cmd=Shutting down the object store...
2018-06-22 14:24:04 INFO  HiveMetaStore:746 - 0: Metastore shutdown complete.
2018-06-22 14:24:04 INFO  audit:371 - ugi=spark/had...@hadoop.com       
ip=unknown-ip-addr      cmd=Metastore shutdown complete.
2018-06-22 14:24:04 INFO  MemoryStore:54 - Block broadcast_0 stored as values 
in memory (estimated size 437.9 KB, free 365.9 MB)
2018-06-22 14:24:04 INFO  MemoryStore:54 - Block broadcast_0_piece0 stored as 
bytes in memory (estimated size 33.6 KB, free 365.8 MB)
2018-06-22 14:24:04 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in 
memory on blr1000023111:45858 (size: 33.6 KB, free: 366.3 MB)
2018-06-22 14:24:04 INFO  SparkContext:54 - Created broadcast 0 from
2018-06-22 14:24:05 INFO  DFSClient:1033 - Created HDFS_DELEGATION_TOKEN token 
42 for spark on ha-hdfs:hacluster
2018-06-22 14:24:05 INFO  TokenCache:125 - Got dt for hdfs://hacluster; Kind: 
HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hacluster, Ident: 
(HDFS_DELEGATION_TOKEN token 42 for spark)
2018-06-22 14:24:05 INFO  FileInputFormat:249 - Total input paths to process : 1
2018-06-22 14:24:05 INFO  SparkContext:54 - Starting job: processCmd at 
CliDriver.java:376
2018-06-22 14:24:05 INFO  DAGScheduler:54 - Got job 0 (processCmd at 
CliDriver.java:376) with 1 output partitions
2018-06-22 14:24:05 INFO  DAGScheduler:54 - Final stage: ResultStage 0 
(processCmd at CliDriver.java:376)
2018-06-22 14:24:05 INFO  DAGScheduler:54 - Parents of final stage: List()
2018-06-22 14:24:05 INFO  DAGScheduler:54 - Missing parents: List()
2018-06-22 14:24:05 INFO  DAGScheduler:54 - Submitting ResultStage 0 
(MapPartitionsRDD[4] at processCmd at CliDriver.java:376), which has no missing 
parents
2018-06-22 14:24:05 INFO  MemoryStore:54 - Block broadcast_1 stored as values 
in memory (estimated size 10.6 KB, free 365.8 MB)
2018-06-22 14:24:05 INFO  MemoryStore:54 - Block broadcast_1_piece0 stored as 
bytes in memory (estimated size 5.3 KB, free 365.8 MB)
2018-06-22 14:24:05 INFO  BlockManagerInfo:54 - Added broadcast_1_piece0 in 
memory on blr1000023111:45858 (size: 5.3 KB, free: 366.3 MB)
2018-06-22 14:24:05 INFO  SparkContext:54 - Created broadcast 1 from broadcast 
at DAGScheduler.scala:1039
2018-06-22 14:24:05 INFO  ContextCleaner:54 - Cleaned accumulator 0
2018-06-22 14:24:05 INFO  DAGScheduler:54 - Submitting 1 missing tasks from 
ResultStage 0 (MapPartitionsRDD[4] at processCmd at CliDriver.java:376) (first 
15 tasks are for partitions Vector(0))
2018-06-22 14:24:05 INFO  TaskSchedulerImpl:54 - Adding task set 0.0 with 1 
tasks
2018-06-22 14:24:05 INFO  TaskSetManager:54 - Starting task 0.0 in stage 0.0 
(TID 0, localhost, executor driver, partition 0, ANY, 7923 bytes)
2018-06-22 14:24:05 INFO  Executor:54 - Running task 0.0 in stage 0.0 (TID 0)
2018-06-22 14:24:05 INFO  HadoopRDD:54 - Input split: 
hdfs://hacluster/user/sparkhive/warehouse/test_one.db/alt_s1/typeddata60.txt:0+4802
2018-06-22 14:24:05 INFO  CodeGenerator:54 - Code generated in 232.938395 ms
2018-06-22 14:24:06 INFO  Executor:54 - Finished task 0.0 in stage 0.0 (TID 0). 
2221 bytes result sent to driver
2018-06-22 14:24:06 INFO  TaskSetManager:54 - Finished task 0.0 in stage 0.0 
(TID 0) in 660 ms on localhost (executor driver) (1/1)
2018-06-22 14:24:06 INFO  TaskSchedulerImpl:54 - Removed TaskSet 0.0, whose 
tasks have all completed, from pool
2018-06-22 14:24:06 INFO  DAGScheduler:54 - ResultStage 0 (processCmd at 
CliDriver.java:376) finished in 0.769 s
2018-06-22 14:24:06 INFO  DAGScheduler:54 - Job 0 finished: processCmd at 
CliDriver.java:376, took 0.829488 s
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  1
2015-11-13 12:10:22     jaie    true    2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  2
2014-09-11 22:10:22     suss    false   2013-01-21      8       2200.0  19.0    
232344559       7       777     777777  3
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  4
2015-11-13 12:10:22     jaie    false   2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  5
2014-09-11 22:10:22     suss    false   2003-01-21      8       2200.0  19.0    
24444559        1       111     11111   6
2014-09-11 22:10:22     suss    false   2013-01-21      8       2200.0  19.0    
232344559       7       777     777777  7
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  8
2015-11-13 12:10:22     jaie    false   2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  9
2014-09-11 22:10:22     suss    false   2003-01-21      8       2200.0  19.0    
24444559        1       111     11111   10
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  11
2015-11-13 12:10:22     jaie    true    2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  12
2014-09-11 22:10:22     suss    false   2013-01-21      8       2200.0  19.0    
232344559       7       777     777777  13
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  14
2015-11-13 12:10:22     jaie    false   2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  15
2014-09-11 22:10:22     suss    false   2003-01-21      8       2200.0  19.0    
24444559        1       111     11111   16
2014-09-11 22:10:22     suss    false   2013-01-21      8       2200.0  19.0    
232344559       7       777     777777  17
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  18
2015-11-13 12:10:22     jaie    false   2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  19
2014-09-11 22:10:22     suss    false   2003-01-21      8       2200.0  19.0    
24444559        1       111     11111   20
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  21
2015-11-13 12:10:22     jaie    true    2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  22
2014-09-11 22:10:22     suss    false   2013-01-21      8       2200.0  19.0    
232344559       7       777     777777  23
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  24
2015-11-13 12:10:22     jaie    false   2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  25
2014-09-11 22:10:22     suss    false   2003-01-21      8       2200.0  19.0    
24444559        1       111     11111   26
2014-09-11 22:10:22     suss    false   2013-01-21      8       2200.0  19.0    
232344559       7       777     777777  27
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  28
2015-11-13 12:10:22     jaie    false   2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  29
2014-09-11 22:10:22     suss    false   2003-01-21      8       2200.0  19.0    
24444559        1       111     11111   30
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  31
2015-11-13 12:10:22     jaie    true    2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  32
2014-09-11 22:10:22     suss    false   2013-01-21      8       2200.0  19.0    
232344559       7       777     777777  33
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  34
2015-11-13 12:10:22     jaie    false   2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  35
2014-09-11 22:10:22     suss    false   2003-01-21      8       2200.0  19.0    
24444559        1       111     11111   36
2014-09-11 22:10:22     suss    false   2013-01-21      8       2200.0  19.0    
232344559       7       777     777777  37
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  38
2015-11-13 12:10:22     jaie    false   2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  39
2014-09-11 22:10:22     suss    false   2003-01-21      8       2200.0  19.0    
24444559        1       111     11111   40
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  41
2015-11-13 12:10:22     jaie    true    2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  42
2014-09-11 22:10:22     suss    false   2013-01-21      8       2200.0  19.0    
232344559       7       777     777777  43
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  44
2015-11-13 12:10:22     jaie    false   2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  45
2014-09-11 22:10:22     suss    false   2003-01-21      8       2200.0  19.0    
24444559        1       111     11111   46
2014-09-11 22:10:22     suss    false   2013-01-21      8       2200.0  19.0    
232344559       7       777     777777  47
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  48
2015-11-13 12:10:22     jaie    false   2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  49
2014-09-11 22:10:22     suss    false   2003-01-21      8       2200.0  19.0    
24444559        1       111     11111   50
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  51
2015-11-13 12:10:22     jaie    true    2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  52
2014-09-11 22:10:22     suss    false   2013-01-21      8       2200.0  19.0    
232344559       7       777     777777  53
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  54
2015-11-13 12:10:22     jaie    false   2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  55
2014-09-11 22:10:22     suss    false   2003-01-21      8       2200.0  19.0    
24444559        1       111     11111   56
2014-09-11 22:10:22     suss    false   2013-01-21      8       2200.0  19.0    
232344559       7       777     777777  57
2016-10-12 12:00:00     abcd    true    2015-10-10      5       1600.0  26.0    
123456789       9       999     999999  58
2015-11-13 12:10:22     jaie    false   2016-11-11      7       1800.0  29.0    
455756789       8       888     888888  59
2014-09-11 22:10:22     suss    false   2003-01-21      8       2200.0  19.0    
24444559        1       111     11111   60

Below is the relevant delegation-token configuration in hdfs-site.xml:
  <property>
    <name>dfs.namenode.delegation.key.update-interval</name>
    <value>18000000</value>
  </property>
  <property>
    <name>dfs.namenode.delegation.token.max-lifetime</name>
    <value>60000000</value>
  </property>
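
With these values the NameNode rolls its delegation master key every 18,000,000 ms (5 hours), and a delegation token lives at most 60,000,000 ms (roughly 16.7 hours); dfs.namenode.delegation.token.renew-interval is not shown here. A quick way to sanity-check a token's effective lifetime (token file path and credentials are placeholders):

  kinit -kt /path/to/spark.keytab spark/<host>@<REALM>   # placeholder credentials
  hdfs fetchdt --renewer spark /tmp/spark.token          # fetch a fresh HDFS delegation token
  hdfs fetchdt --print /tmp/spark.token                  # print its kind, service, and expiry time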

An INSERT from the same beeline session also fails, with a "Job aborted" exception:

0: jdbc:hive2://10.18.18.214:23040/default> show tables;
+-----------+------------+--------------+--+
| database  | tableName  | isTemporary  |
+-----------+------------+--------------+--+
| test_one  | alt_s1     | false        |
| test_one  | csvtable   | false        |
| test_one  | csvtable1  | false        |
| test_one  | csvtable2  | false        |
| test_one  | emp        | false        |
| test_one  | tab1       | false        |
| test_one  | tab2       | false        |
+-----------+------------+--------------+--+
7 rows selected (0.049 seconds)
0: jdbc:hive2://10.18.18.214:23040/default> insert into emp values( 34 );
Error: org.apache.spark.SparkException: Job aborted. (state=,code=0)
0: jdbc:hive2://10.18.18.214:23040/default>


JDBC server log snapshot (the root cause is the same cached token 7):

Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 token (HDFS_DELEGATION_TOKEN token 7 for spark) can't be found in cache
        at org.apache.hadoop.ipc.Client.call(Client.java:1475)
        at org.apache.hadoop.ipc.Client.call(Client.java:1412)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
        at com.sun.proxy.$Proxy15.create(Unknown Source)
        at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:296)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy16.create(Unknown Source)
        at 
org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1648)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1689)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1624)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
        at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:804)
        at 
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter(HiveIgnoreKeyTextOutputFormat.java:80)
        at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:261)
        at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:246)
        ... 16 more
2018-06-22 14:53:22,557 | ERROR | [HiveServer2-Background-Pool: Thread-2966] | 
Error running hive query:  | 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:179)
org.apache.hive.service.cli.HiveSQLException: org.apache.spark.SparkException: 
Job aborted.
        at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:269)
        at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175)
        at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)



 


