I did some back-of-the-envelope math when implementing txids, and
determined that overflow is never going to happen in practice... A "busy"
namenode does about 1000 write transactions/second (2^10). MAX_LONG is
roughly 2^63, so we have about 2^63 transactions to burn through. A year is
about 2^25 seconds. So, at 1k tps, you can run your namenode for
2^(63-10-25) = 2^28, or roughly 268 million years.
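
A quick sanity check of that arithmetic without the power-of-two rounding
(just a throwaway snippet, not HDFS code) lands in the same ballpark:

  // Rough estimate of years until a signed 64-bit txid overflows
  // at a sustained 1,000 write transactions per second.
  public class TxidOverflowEstimate {
      public static void main(String[] args) {
          double txPerSecond = 1000.0;                 // assumed "busy" NN rate
          double secondsPerYear = 365.25 * 24 * 3600;  // ~3.16e7
          double years = Long.MAX_VALUE / (txPerSecond * secondsPerYear);
          System.out.printf("~%.0f million years to overflow%n", years / 1e6);
          // prints roughly: ~292 million years to overflow
      }
  }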

Hadoop is great software and I'm sure it will be around for years to come,
but if it's still running in 268 million years, that will be a pretty
depressing rate of technological progress!

-Todd

On Tue, Jun 25, 2013 at 6:14 AM, Harsh J <ha...@cloudera.com> wrote:

> Yes, it logically can if there have been that many transactions (it's a
> very, very large number to reach, though).
>
> Long.MAX_VALUE is (2^63 - 1) or 9223372036854775807.
>
> I manually hacked up my local NN's txids to be very large (close to
> the max) to see whether this causes any harm. I basically bumped up
> the freshly formatted starting txid to 9223372036854775805 (and
> ensured the image references the same):
>
> ➜  current  ls
> VERSION
> fsimage_9223372036854775805.md5
> fsimage_9223372036854775805
> seen_txid
> ➜  current  cat seen_txid
> 9223372036854775805
>
> NameNode started up as expected.
>
> 13/06/25 18:30:08 INFO namenode.FSImage: Image file of size 129 loaded
> in 0 seconds.
> 13/06/25 18:30:08 INFO namenode.FSImage: Loaded image for txid
> 9223372036854775805 from
> /temp-space/tmp-default/dfs-cdh4/name/current/fsimage_9223372036854775805
> 13/06/25 18:30:08 INFO namenode.FSEditLog: Starting log segment at
> 9223372036854775806
>
> I could create a bunch of files and do regular ops (with the txid
> incrementing well past the long max). I created over 100 files, just to
> push it well over Long.MAX_VALUE.
>
> Quitting NameNode and restarting fails though, with the following error:
>
> 13/06/25 18:31:08 FATAL namenode.NameNode: Exception in namenode join
> java.io.IOException: Gap in transactions. Expected to be able to read
> up until at least txid 9223372036854775806 but unable to find any edit
> logs containing txid -9223372036854775808
>
> So it looks like it cannot currently handle an overflow.
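>
> The negative txid in that message is just the long wrapping around: in
> Java, incrementing past Long.MAX_VALUE silently overflows to
> Long.MIN_VALUE. A standalone snippet (not HDFS code) shows the wrap:
>
>   // Demonstrates the silent wrap-around of a signed 64-bit counter.
>   public class TxidWrap {
>       public static void main(String[] args) {
>           long txid = Long.MAX_VALUE;                  // 9223372036854775807
>           long next = txid + 1;                        // wraps, no exception
>           System.out.println(next);                    // -9223372036854775808
>           System.out.println(next == Long.MIN_VALUE);  // true
>       }
>   }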
>
> I've filed https://issues.apache.org/jira/browse/HDFS-4936 to discuss
> this. I don't think this is of immediate concern though, so we should
> be able to address it in the future (unless there are parts of the code
> that already prevent reaching this number in the first place - please
> do correct me if there is such a part).
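>
> Just as a strawman for the JIRA discussion (a sketch of one possible
> guard, not something HDFS does today and not necessarily what the
> ticket will propose), the increment could fail fast instead of
> silently wrapping:
>
>   // Hypothetical guard (illustration only): refuse to hand out the next
>   // txid once the counter would wrap past Long.MAX_VALUE.
>   public class TxIdGuard {
>       static long nextTxId(long lastAppliedTxId) {
>           if (lastAppliedTxId == Long.MAX_VALUE) {
>               throw new IllegalStateException("transaction ID space exhausted");
>           }
>           return lastAppliedTxId + 1;
>       }
>   }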
>
> On Tue, Jun 25, 2013 at 3:09 PM, Azuryy Yu <azury...@gmail.com> wrote:
> > Hi dear All,
> >
> > The txid is currently a long type.
> >
> > FSImage.java:
> >
> > boolean loadFSImage(FSNamesystem target, MetaRecoveryContext recovery)
> >     throws IOException {
> >   // ... (other loading logic elided)
> >   editLog.setNextTxId(lastAppliedTxId + 1L);
> >   // ...
> > }
> >
> > Is it possible for (lastAppliedTxId + 1L) to exceed Long.MAX_VALUE?
>
>
>
> --
> Harsh J
>



-- 
Todd Lipcon
Software Engineer, Cloudera
