Hi Colin,

Please feel free to file JIRAs if you see unit test failures.
Let's continue the immutable file discussion on HDFS-3154.

Nicholas

________________________________
From: Colin McCabe <cmcc...@alumni.cmu.edu>
To: hdfs-dev@hadoop.apache.org; Tsz Wo Sze <szets...@yahoo.com>
Sent: Monday, March 26, 2012 2:31 PM
Subject: Re: [DISCUSS] Remove append?

On Mon, Mar 26, 2012 at 1:55 PM, Tsz Wo Sze <szets...@yahoo.com> wrote:
>> Just one comment: If we do decide to keep append in, we should get it
>> to be actually stable and usable. In my opinion, this should
>> definitely happen before adding any new operations.
>
> @Colin, append is currently stable and, of course, usable. Many people in
> different organizations have tested it in small and large scale. However,
> it is not yet in a stable release and so it is not yet heavily used.

The append unit test failed on me recently on Jenkins. It's possible
that this was due to a Jenkins timeout, or something, but I assumed it
was due to instability at the time. If it happens again, I'll be sure
to check the backtrace and file a JIRA if needed.

>> I agree that the notion of an immutable file is useful since it lets the
>> system and tools optimize certain things. A Xerox PARC file system in the
>> 80s had this feature that the system exploited. I would support adding the
>> notion of an immutable file to Hadoop.

I think Eli was hoping that making files immutable would make the
system simpler, and hopefully, less buggy. You won't get that benefit
if only certain files are immutable. In fact, quite the contrary--
you'll just be adding more complexity.

I'd also like to see what the "certain things" are that having certain
files, but not others, be immutable would allow you to optimize. The
thread you linked to from the JIRA has no information on this.

I am aware of at least two "filesystems" (in the loose sense of the
word) that have immutable files. One is Venti from Plan 9, and the
other is git, by Linus Torvalds. Both of them are significantly
simpler because of their invariant that files cannot change. However,
both of them are append-only, meaning that files can never be deleted.
This seems unsuitable for the HDFS use case, and in fact, I see no
reason to believe that having some, but not all, files be immutable
would provide any benefit. Feel free to prove me wrong if you think
of something, though!

cheers,
Colin

>
> @Sanjay, I filed HDFS-3154.
>
> @Eli and others, it turns out that the discussion is very useful! Thanks.
>
> Nicholas