I set this up with the application name as follows:

    // Checkpoint directory
    val hdfsDir = "hdfs://rhes564:9000/user/hduser/checkpoint/" + this.getClass.getSimpleName.trim
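Put together, the relevant pieces look roughly like the sketch below. It is a sketch only: the SparkConf settings are assumptions, the 2-second batch interval is borrowed from later in this thread, and the Twitter DStream itself is omitted.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object TwitterAnalyzer {
      def main(args: Array[String]): Unit = {
        // Assumed SparkConf; in practice master and app name come from spark-submit.
        val sparkConf = new SparkConf().setAppName("TwitterAnalyzer")
        val ssc = new StreamingContext(sparkConf, Seconds(2))

        // Per-application checkpoint directory derived from the class name,
        // so two streaming jobs never end up sharing the same checkpoint path.
        val hdfsDir = "hdfs://rhes564:9000/user/hduser/checkpoint/" +
          this.getClass.getSimpleName.trim

        ssc.checkpoint(hdfsDir)

        // ... create the Twitter DStream and act on it here ...

        ssc.start()
        ssc.awaitTermination()
      }
    }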
And then use that directory for checkpointing:

    ssc.checkpoint(hdfsDir)

It creates it OK, as follows, with the application name as the sub-directory name. BTW, does anyone know why it adds a "$" sign at the end of the class name?

    hdfs dfs -ls /user/hduser/checkpoint/TwitterAnalyzer$
    16/06/03 23:45:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Found 13 items
    drwxr-xr-x   - hduser supergroup       0 2016-06-03 23:38 /user/hduser/checkpoint/TwitterAnalyzer$/9796158c-574a-466d-86a4-d02ed7dd22a0
    -rw-r--r--   2 hduser supergroup    5199 2016-06-03 23:39 /user/hduser/checkpoint/TwitterAnalyzer$/checkpoint-1464993554000
    -rw-r--r--   2 hduser supergroup    5199 2016-06-03 23:39 /user/hduser/checkpoint/TwitterAnalyzer$/checkpoint-1464993556000
    -rw-r--r--   2 hduser supergroup    5202 2016-06-03 23:39 /user/hduser/checkpoint/TwitterAnalyzer$/checkpoint-1464993556000.bk
    -rw-r--r--   2 hduser supergroup    5199 2016-06-03 23:39 /user/hduser/checkpoint/TwitterAnalyzer$/checkpoint-1464993558000
    -rw-r--r--   2 hduser supergroup    5202 2016-06-03 23:39 /user/hduser/checkpoint/TwitterAnalyzer$/checkpoint-1464993558000.bk
    -rw-r--r--   2 hduser supergroup    5199 2016-06-03 23:39 /user/hduser/checkpoint/TwitterAnalyzer$/checkpoint-1464993560000
    -rw-r--r--   2 hduser supergroup    5202 2016-06-03 23:39 /user/hduser/checkpoint/TwitterAnalyzer$/checkpoint-1464993560000.bk
    -rw-r--r--   2 hduser supergroup    5199 2016-06-03 23:39 /user/hduser/checkpoint/TwitterAnalyzer$/checkpoint-1464993562000
    -rw-r--r--   2 hduser supergroup    5202 2016-06-03 23:39 /user/hduser/checkpoint/TwitterAnalyzer$/checkpoint-1464993562000.bk
    -rw-r--r--   2 hduser supergroup    5202 2016-06-03 23:39 /user/hduser/checkpoint/TwitterAnalyzer$/checkpoint-1464993564000
    drwxr-xr-x   - hduser supergroup       0 2016-06-03 23:38 /user/hduser/checkpoint/TwitterAnalyzer$/receivedBlockMetadata
    -rw-r--r--   2 hduser supergroup    5199 2016-06-03 23:39 /user/hduser/checkpoint/TwitterAnalyzer$/temp

It works fine. HTH

Dr Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com


On 3 June 2016 at 22:06, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Sure, I am trying to use
>
>     SparkContext.setCheckpointDir(directory: String)
>
> to set it up.
>
> I agree that once one starts creating subdirectories like
> "~/checkpoints/${APPLICATION_NAME}/${USERNAME}!" it becomes a bit messy.
>
> Cheers
>
> Dr Mich Talebzadeh
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> http://talebzadehmich.wordpress.com
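For what it is worth, SparkContext.setCheckpointDir() governs RDD checkpointing, while a streaming job's checkpoint directory is set on the StreamingContext; StreamingContext.checkpoint() appears to pass the same path down to the SparkContext as well. A minimal sketch of keeping the directory per-application without hand-built subdirectories (the application name, host and port are placeholders reused from this thread):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object PerAppCheckpoint {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("TwitterAnalyzer")
        val ssc  = new StreamingContext(conf, Seconds(2))

        // Build the per-application path from spark.app.name instead of
        // hand-assembling something like "~/checkpoints/${APPLICATION_NAME}/${USERNAME}".
        val appDir = "hdfs://rhes564:9000/user/hduser/checkpoint/" + conf.get("spark.app.name")

        // StreamingContext.checkpoint() is the streaming-side setting; it appears
        // to pass the same path to SparkContext.setCheckpointDir() as well, so a
        // separate RDD-side call is not normally needed for a streaming job.
        ssc.checkpoint(appDir)
      }
    }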
> On 3 June 2016 at 21:52, David Newberger <david.newber...@wandcorp.com> wrote:
>
>> Hi Mich,
>>
>> My gut says you are correct that each application should have its own
>> checkpoint directory. Though honestly I'm a bit fuzzy on checkpointing
>> still, as I've not worked with it much yet.
>>
>> Cheers,
>>
>> David Newberger
>>
>> From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
>> Sent: Friday, June 3, 2016 3:40 PM
>> To: David Newberger
>> Cc: user @spark
>> Subject: Re: Twitter streaming error : No lease on /user/hduser/checkpoint/temp (inode 806125): File does not exist.
>>
>> Hi David,
>>
>> Yes, they do.
>>
>> The first streaming job does:
>>
>>     val ssc = new StreamingContext(sparkConf, Seconds(2))
>>     ssc.checkpoint("checkpoint")
>>
>> And the Twitter one does:
>>
>>     /** Returns the HDFS URL */
>>     def getCheckpointDirectory(): String = {
>>       try {
>>         val name: String = Seq("bash", "-c", "curl -s http://169.254.169.254/latest/meta-data/hostname") !! ;
>>         println("Hostname = " + name)
>>         "hdfs://" + name.trim + ":9000/checkpoint/"
>>       } catch {
>>         case e: Exception => {
>>           "./checkpoint/"
>>         }
>>       }
>>     }
>>
>> I need to change one of these.
>>
>> Actually, a better alternative would be that each application has its own checkpoint?
>>
>> Thanks
>>
>> Dr Mich Talebzadeh
>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>> http://talebzadehmich.wordpress.com
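As an aside, for that helper to compile on its own it needs scala.sys.process in scope for the !! operator. A self-contained sketch, wrapped in a throwaway CheckpointDir object just so it compiles (the metadata URL only answers from inside an EC2 instance; the port and fallback path are carried over from the snippet above):

    import scala.sys.process._

    object CheckpointDir {
      /** Returns the HDFS checkpoint URL, falling back to a local path
        * when the EC2 metadata service is not reachable. */
      def getCheckpointDirectory(): String = {
        try {
          // 169.254.169.254 only resolves from inside an EC2 instance.
          val name: String =
            Seq("bash", "-c", "curl -s http://169.254.169.254/latest/meta-data/hostname").!!
          println("Hostname = " + name)
          "hdfs://" + name.trim + ":9000/checkpoint/"
        } catch {
          case e: Exception => "./checkpoint/"
        }
      }
    }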
>> On 3 June 2016 at 21:23, David Newberger <david.newber...@wandcorp.com> wrote:
>>
>> I was going to ask if you had 2 jobs running. If the checkpointing for
>> both is set up to look at the same location, I could see an error like this
>> happening. Do both Spark jobs have a reference to a checkpointing dir?
>>
>> David Newberger
>>
>> From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
>> Sent: Friday, June 3, 2016 3:20 PM
>> To: user @spark
>> Subject: Re: Twitter streaming error : No lease on /user/hduser/checkpoint/temp (inode 806125): File does not exist.
>>
>> OK
>>
>> I was running two Spark Streaming jobs, one using streaming data from
>> Kafka and another from Twitter, in local mode on the same node.
>>
>> It is possible that the directory /user/hduser/checkpoint/temp is shared
>> by both Spark Streaming jobs.
>>
>> Any experience of this, please?
>>
>> Thanks
>>
>> Dr Mich Talebzadeh
>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>> http://talebzadehmich.wordpress.com
>>
>> On 3 June 2016 at 20:48, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>
>> Hi,
>>
>> Just started seeing these errors:
>>
>>     16/06/03 20:30:01 ERROR DFSClient: Failed to close inode 806125
>>     org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /user/hduser/checkpoint/temp (inode 806125): File does not exist. [Lease.  Holder: DFSClient_NONMAPREDUCE_-907736468_1, pendingcreates: 1]
>>             at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3516)
>>             at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3313)
>>             at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3169)
>>             at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
>>             at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
>>             at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>>             at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>>
>> Sounds like a connection is left open, but I cannot establish why!
>>
>> Thanks
>>
>> Dr Mich Talebzadeh
>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>> http://talebzadehmich.wordpress.com