from:"Erez Zadok"

Re: [PATCH 0/9] overlay filesystem: request for inclusion (v17)

2013-03-15 Thread Erez Zadok

I tend to agree with Al's and Linus's POV regarding whiteouts. There are three 
general techniques to implementing whiteouts:

1. namespace: special file names, hard/symlinks, or special "hidden" dot files.
2. extended attributes.
3. DT_WHT dirent flags.
(there's actually a 4th method I've tried before that I won't discuss below: 
implementing your own data structures on a raw partition — way too cumbersome.)

The namespace techniques require lower file systems to support hard/symlinks 
and sometimes need long names. A plus: they work on most file systems (but not 
all).  But they cause all sorts of namespace ugliness, where you have to hide 
the special file names, avoid them, ensure atomic updates for ops that involve 
whiteouts.  It's all doable, as had been demonstrated by several 
implementations.  But it's still icky in terms of namespace pollution.

The extended attributes technique, I think, is better than the namespace one in 
that you don't pollute the namespace; plus, I think the EA technique minimizes 
atomicity issues that show up with the namespace method.  Yet, it still 
requires EA support in lower file systems, so it won't work unless lower file 
systems support xattr ops.  Plus it could fail for file systems that have 
limited xattr support (e.g., number of EAs per inode).

The DT_WHT technique is the cleanest in the long run, and the best of the three 
IMHO.  It's well understood and has been done in BSD a long time ago.  It 
doesn't have the namespace pollution as seen in technique #1 above.  And I 
believe it also minimize atomicity issues.  Plus you won't have issues running 
out of EAs.

A while back I've looked at the unionmount code for DT_WHT support for 
ext2/tmpfs, and it was small, clean, and mostly additive.  I even had a 
prototype port of unionfs using unionmount's DT_WHT support: it was relatively 
easy to port unionfs to use DT_WHT instead of namespace techniques. Plus, I was 
able to reduce the amount of code devoted to whiteout support by quite a bit.

So I think it'll be easy to port overlayfs to use DT_WHT.  Given that most 
people use unioning with ext* and tmpfs, minimal DT_WHT support in those would 
get most users happy initially. And we can then let other file systems support 
DT_WHT on their own, in whatever way they deem suitable (as Al suggested, this 
is really best deferred to the F/S to implement).

Lastly, what I'm not sure is what API to use for whiteouts: should every f/s 
implement some new methods to add/remove/query a whiteout, or should the upper 
f/s and VFS directly check DT_WHT flags with S_ISWHT.  The generic f/s methods 
may allow file systems to implement whiteouts in arbitrary ways, not 
necessarily as a dirent flag.

Cheers,
Erez.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: unionfs_copy_attr_times oopses

2008-02-01 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Hugh Dickins writes:
> Hi Erez,
> 
> Aside from the occasional "unionfs: new lower inode mtime" messages
> on directories (which I've got into the habit of ignoring now), the
> only problem I'm still suffering with unionfs over tmpfs (not tested
> any other fs's below it recently) is oops in unionfs_copy_attr_times.
[...]

Thanks for the report and patch(es).  I'll start looking at this over the
weekend.  Could you do me a favor and open a bugzilla report
(https://bugzilla.filesystems.org/), upload your patches, etc.?  It'd be
easier for us to exchange info/patches and track the problem that way.

Thanks,
Erez.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [UNIONFS] 00/29 Unionfs and related patches pre-merge review (v2)

2008-02-02 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Al Viro writes:
> On Sat, Jan 26, 2008 at 12:08:30AM -0500, Erez Zadok wrote:

[concerns about lower directories moving around...]

> You are thinking about non-interesting case.  _Files_ are not much
> of a problem.  Directory tree is.  The real problems with all unionfs and
> stacking implementations I've seen so far, all way back to Heidemann et.al.
> start when topology of the underlying layer changes.

OK, so if I understand you, your concerns center around the fact that lower
directories can be moved around (i.e., topology changes), what happens then
for operations that go through the stackable f/s, and what should users
expect to see.

> If you have clear
> semantics for unionfs behaviour in presence of such things, by all means,
> publish it - as far as I know *nobody* had done that; not even on the
> "what should we see when..." level, nevermind the implementation.

Since stacking and NFS have some similarities, I first checked w/ the NFS
people to see what are their semantics in a similar scenario: an NFS client
could be validating a directory, then issue, say, a ->create; but in
between, the server could have moved the directory that was validated.  In
NFS, the ->create operation succeeds, and creates the file in the new
location of the directory which was validated.

Unionfs's behavior is similar: the newly created file will be successfully
created in the moved directory.  The only exception is that if a lower
branch is marked readonly by unionfs, a copyup will take place.

This had not been a problem for unionfs users to date.  The main reason is
that when unionfs users modify lower files, they often do so while there's
little to no activity going through the union itself.  And while it doesn't
prevent directories from being moved around, this common usage mode does
reduce the frequency in which topology changes can be an issue for unionfs
users.

I'll submit a patch to document this behavior.

> > Perhaps this general topic is a good one to discuss at more length at LSF?
> > Suggestions are welcome.
> 
> It would; I honestly do not know if the problem is solvable with the
> (lack of) constraints you apparently want.  Again, the real PITA begins
> when you start dealing with pieces of underlying trees getting moved
> around, changing parents, etc.  Cross-directory rename() certainly rates
> very high on the list of "WTF had they been smoking in UCB?" misfeatures,
> but it's there and it has to be dealt with.

Well, it was UCB, UCLA, and Sun.  I don't think back in the early 90s they
were too concerned about topology changes; f/s stacking was a new idea and
they wanted to explore what can be done with it conceptually, not produce
commercial-grade software (still, remind me to tell you the first-hand story
I've learned about of how full-blown stacking _almost_ made it into Solaris
2.0 :-)

The only known reference to try and address this coherency problem was
Heidemann's SOSP'95 paper titled "Performance of cache coherence in
stackable filing."  The paper advocated changing the whole VFS and the
caches (page cache + dnlc) to create a "unified cache manager" that was
aware of complex layering topologies (including fan-out and fan-in).  It was
even able to handle compression layers, where file data offsets change b/t
the layers (a nasty problem).  Code for this unified cache manager was never
released AFAIK.  I think Heidemann's approach was elegant, but I don't think
it was practical as it required radical VFS/VM surgery.  Ironically, MS
Windows has a single I/O cache manager that all storage and filesystem
modules talk to directly (they're not allowed to pass IRPs directly b/t
layers): so Windows can handle such coherency better than most Unix systems
can today.

I've always thought of a different way to allow users to write to lower
branches -- through the union.  This is similar to what an old AT&T
unioning-like file system named "3DFS" did.  3DFS introduced a new directory
called "..." so if you cd to /mntpt/... then you got to the next level down
the stack (as if you popped the top one and now you see how the union looks
like without the top layer).  And if you "cd /mntpt/.../..." then you see
the view without the top two layers, etc.

So my idea is similar: to introduce virtual directory views that restrict
access to a single lower branch within the union.  So if someone does a "cd
/mnt/unionfs/.1" then they get access to branch 1; "cd /mnt/unionfs/.2" gets
access to branch 2; etc.  While this technique will waste a few names, it's
probably worth the savings in terms of cache-coherency pains (plus, the
actual virtual directory names can be configurable at mount time to allow
users to choose a non-conflicting dir name).  W

[PATCH] unhide CONFIG_DEBUG_SECTION_MISMATCH

2008-02-14 Thread Erez Zadok

Using: v2.6.25-rc1-120-ge760e71

In a normal compilation, I might this message:

...
  MODPOST vmlinux.o
WARNING: modpost: Found 4 section mismatch(es).
To see full details build your kernel with:
'make CONFIG_DEBUG_SECTION_MISMATCH=y'
...

I can indeed try to re-make, passing CONFIG_DEBUG_SECTION_MISMATCH=y on the
command line, but I can't turn on the option in my .config.  That's because
the option depends on "UNDEFINED".  (Was that an attempt to "hide" the
option?  Why?)  The following small patch allows me to set the option in my
.config.

Cheers,
Erez.



Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index a370fe8..3399d9d 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -81,7 +81,6 @@ config HEADERS_CHECK
 
 config DEBUG_SECTION_MISMATCH
bool "Enable full Section mismatch analysis"
-   depends on UNDEFINED
help
  The section mismatch analysis checks if there are illegal
  references from one section to another section.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] unhide CONFIG_DEBUG_SECTION_MISMATCH

2008-02-14 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Sam Ravnborg writes:
> On Thu, Feb 14, 2008 at 03:54:04PM -0500, Erez Zadok wrote:
> > Using: v2.6.25-rc1-120-ge760e71
> > 
> > In a normal compilation, I might this message:
> > 
> > ...
> >   MODPOST vmlinux.o
> > WARNING: modpost: Found 4 section mismatch(es).
> > To see full details build your kernel with:
> > 'make CONFIG_DEBUG_SECTION_MISMATCH=y'
> > ...
> > 
> > I can indeed try to re-make, passing CONFIG_DEBUG_SECTION_MISMATCH=y on the
> > command line, but I can't turn on the option in my .config.  That's because
> > the option depends on "UNDEFINED".  (Was that an attempt to "hide" the
> > option?  Why?)  The following small patch allows me to set the option in my
> > .config.
> 
> It was done so on purpose.
> The rationale here is that we yet have too many section mismatch for
> a typical build to pass my "good stomach" test.
> So until we get down on an acceptable level we should not be
> too noisy.
> And I wanted it to be turned off also for allyesconfig builds.
> 
> I hope to spend time on the reaming warnings soon but the
> option will not be enabled until latest next mergewindow.
> 
> You could argue that I could have doen this in better ways
> The present version was just simple.
> 
>   Sam

Thanks, Sam.  That explains it.  Perhaps a small comment next to the Kconfig
entry might help explain why the option is "hidden" for now, so others won't
bother unhiding it again.

Of course, forcing it on might help people to fix those warnings sooner. :-)

Erez.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH -mmotm] one more reject fix in arch/x86/Kconfig

2008-02-16 Thread Erez Zadok

Using mmotm-2008-02-15-11-03, I get

$ make ARCH=i386
  HOSTCC  scripts/kconfig/conf.o
  HOSTCC  scripts/kconfig/kxgettext.o
  HOSTCC  scripts/kconfig/zconf.tab.o
  HOSTLD  scripts/kconfig/conf
scripts/kconfig/conf -s arch/x86/Kconfig
arch/x86/Kconfig:24: unknown option "HEAD"
arch/x86/Kconfig:30: unknown option "FETCH_HEAD"
make[2]: *** [silentoldconfig] Error 1
make[1]: *** [silentoldconfig] Error 2
make: *** No rule to make target `include/config/auto.conf', needed by 
`include/config/kernel.release'.  Stop.

The following small patch fixes it for me, but may not be the right fix (I
chose to resolve the conflict by including both parts).

Cheers,
Erez.


diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 6d6150b..dd4d64d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -21,13 +21,9 @@ config X86
select HAVE_IDE
select HAVE_OPROFILE
select HAVE_KPROBES
-<<< HEAD:arch/x86/Kconfig
select HAVE_KVM
select HAVE_ARCH_KGDB
-
-===
select HAVE_FTRACE
->>> FETCH_HEAD:arch/x86/Kconfig
 
 config GENERIC_LOCKBREAK
def_bool n
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH -mmotm] fs/sysfs/file.c d_path fix

2008-02-16 Thread Erez Zadok

Using mmotm-2008-02-15-11-03, I get

  CC  fs/sysfs/file.o
fs/sysfs/file.c: In function 'sysfs_open_file':
fs/sysfs/file.c:334: warning: passing argument 1 of 'd_path' from incompatible 
pointer type
fs/sysfs/file.c:334: warning: passing argument 2 of 'd_path' from incompatible 
pointer type
fs/sysfs/file.c:334: warning: passing argument 3 of 'd_path' makes integer from 
pointer without a cast
fs/sysfs/file.c:334: error: too many arguments to function 'd_path'
make[2]: *** [fs/sysfs/file.o] Error 1
make[1]: *** [fs/sysfs] Error 2
make: *** [fs] Error 2

The following small patch fixes it for me.

Cheers,
Erez.


Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 02223e2..a57b024 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -329,9 +329,11 @@ static int sysfs_open_file(struct inode *inode, struct 
file *file)
struct sysfs_ops *ops;
int error = -EACCES;
char *p;
+   struct path sysfs_path;
 
-   p = d_path(file->f_dentry, sysfs_mount, last_sysfs_file,
-  sizeof(last_sysfs_file));
+   sysfs_path.dentry = file->f_dentry;
+   sysfs_path.mnt = sysfs_mount;
+   p = d_path(&sysfs_path, last_sysfs_file, sizeof(last_sysfs_file));
if (p)
memmove(last_sysfs_file, p, strlen(p) + 1);
 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[GIT PULL -mm] 00/17 Unionfs updates/fixes/cleanups

2008-02-16 Thread Erez Zadok


The following is a series of patchsets related to Unionfs.  The most
significant changes are several fixes to races/locking.  This release also
supports newer APIs in 2.6.25-rc: not using iget/read_inode, using the
changed d_path, using the revised nameidata which embeds a struct path, and
using path_get/put.

These patches were tested (where appropriate) on 2.6.25, MM, as well as the
backports to 2.6.{24,23,22,21,20,19,18,9} on ext2/3/4, xfs, reiserfs,
nfs2/3/4, jffs2, ramfs, tmpfs, cramfs, and squashfs (where available).  Also
tested with LTP-full and with a continuous parallel kernel compile (while
forcing cache flushing, manipulating lower branches, etc.).  See
http://unionfs.filesystems.org/ to download back-ported unionfs code.

Please pull from the 'master' branch of
git://git.kernel.org/pub/scm/linux/kernel/git/ezk/unionfs.git

to receive the following:

Andrew Morton (1):
  Unionfs: embed a struct path into struct nameidata instead of nd dentrymnt

David Howells (1):
  Unionfs: stop using iget() and read_inode()

Erez Zadok (14):
  Unionfs: grab lower super_block references
  Unionfs: ensure consistent lower inodes types
  Unionfs: document behavior when the lower topology changes
  Unionfs: uninline unionfs_copy_attr_times and unionfs_copy_attr_all
  Unionfs: initialize path_save variable
  Unionfs: extend dentry branch configuration lock in open
  Unionfs: follow_link locking fixes
  Unionfs: improve debugging in copy_attr_times
  Unionfs: revalidation code cleanup and refactoring
  Unionfs: factor out revalidation routine
  Unionfs: lock parents' branch configuration fixes
  Unionfs: branch management/configuration fixes
  Unionfs: use dget_parent in revalidation code
  VFS/Unionfs: use generic path_get/path_put functions

Jan Blunck (1):
  Unionfs: use the new path_put

 Documentation/filesystems/unionfs/concepts.txt |   13 +
 fs/unionfs/commonfops.c|   54 ---
 fs/unionfs/copyup.c|3 
 fs/unionfs/dentry.c|  182 ++---
 fs/unionfs/fanout.h|   50 --
 fs/unionfs/inode.c |   78 +++---
 fs/unionfs/lookup.c|   13 +
 fs/unionfs/main.c  |   26 +--
 fs/unionfs/mmap.c  |   17 --
 fs/unionfs/rename.c|7 
 fs/unionfs/subr.c  |   56 +++
 fs/unionfs/super.c |   72 ++---
 fs/unionfs/union.h |   13 +
 fs/unionfs/unlink.c|   11 +
 include/linux/namei.h  |   12 -
 15 files changed, 366 insertions(+), 241 deletions(-)

Thanks.
---
Erez Zadok
[EMAIL PROTECTED]
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 01/17] Unionfs: grab lower super_block references

2008-02-16 Thread Erez Zadok

This prevents the lower super_block from being destroyed too early, when a
lower file system is being unmounted with MNT_FORCE or MNT_DETACH.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/main.c  |3 +++
 fs/unionfs/super.c |   14 ++
 fs/unionfs/union.h |2 +-
 3 files changed, 18 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index 23c18f7..ba3471d 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -636,6 +636,7 @@ static int unionfs_read_super(struct super_block *sb, void 
*raw_data,
sbend(sb) = bend = lower_root_info->bend;
for (bindex = bstart; bindex <= bend; bindex++) {
struct dentry *d = lower_root_info->lower_paths[bindex].dentry;
+   atomic_inc(&d->d_sb->s_active);
unionfs_set_lower_super_idx(sb, bindex, d->d_sb);
}
 
@@ -711,6 +712,8 @@ out_dput:
dput(d);
/* initializing: can't use unionfs_mntput here */
mntput(m);
+   /* drop refs we took earlier */
+   atomic_dec(&d->d_sb->s_active);
}
kfree(lower_root_info->lower_paths);
kfree(lower_root_info);
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index 986c980..175840f 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -116,6 +116,14 @@ static void unionfs_put_super(struct super_block *sb)
}
BUG_ON(leaks != 0);
 
+   /* decrement lower super references */
+   for (bindex = bstart; bindex <= bend; bindex++) {
+   struct super_block *s;
+   s = unionfs_lower_super_idx(sb, bindex);
+   unionfs_set_lower_super_idx(sb, bindex, NULL);
+   atomic_dec(&s->s_active);
+   }
+
kfree(spd->data);
kfree(spd);
sb->s_fs_info = NULL;
@@ -729,6 +737,12 @@ out_no_change:
 */
purge_sb_data(sb);
 
+   /* grab new lower super references; release old ones */
+   for (i = 0; i < new_branches; i++)
+   atomic_inc(&new_data[i].sb->s_active);
+   for (i = 0; i < new_branches; i++)
+   atomic_dec(&UNIONFS_SB(sb)->data[i].sb->s_active);
+
/* copy new vectors into their correct place */
tmp_data = UNIONFS_SB(sb)->data;
UNIONFS_SB(sb)->data = new_data;
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 4b4d6c9..786c4be 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -127,7 +127,7 @@ struct unionfs_dentry_info {
 
 /* These are the pointers to our various objects. */
 struct unionfs_data {
-   struct super_block *sb;
+   struct super_block *sb; /* lower super_block */
atomic_t open_files;/* number of open files on branch */
int branchperms;
int branch_id;  /* unique branch ID at re/mount time */
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 03/17] Unionfs: document behavior when the lower topology changes

2008-02-16 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 Documentation/filesystems/unionfs/concepts.txt |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/Documentation/filesystems/unionfs/concepts.txt 
b/Documentation/filesystems/unionfs/concepts.txt
index bed69bd..8d9a1c5 100644
--- a/Documentation/filesystems/unionfs/concepts.txt
+++ b/Documentation/filesystems/unionfs/concepts.txt
@@ -210,4 +210,17 @@ there's a lot of concurrent activity on both the upper and 
lower objects,
 for the same file(s).  Lastly, this delayed time attribute detection is
 similar to how NFS clients operate (e.g., acregmin).
 
+Finally, there is no way currently in Linux to prevent lower directories
+from being moved around (i.e., topology changes); there's no way to prevent
+modifications to directory sub-trees of whole file systems which are mounted
+read-write.  It is therefore possible for in-flight operations in unionfs to
+take place, while a lower directory is being moved around.  Therefore, if
+you try to, say, create a new file in a directory through unionfs, while the
+directory is being moved around directly, then the new file may get created
+in the new location where that directory was moved to.  This is a somewhat
+similar behaviour in NFS: an NFS client could be creating a new file while
+th NFS server is moving th directory around; the file will get successfully
+created in the new location.  (The one exception in unionfs is that if the
+branch is marked read-only by unionfs, then a copyup will take place.)
+
 For more information, see <http://unionfs.filesystems.org/>.
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 07/17] Unionfs: follow_link locking fixes

2008-02-16 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/inode.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 8d939dc..6377533 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -820,7 +820,11 @@ static void *unionfs_follow_link(struct dentry *dentry, 
struct nameidata *nd)
err = 0;
 
 out:
-   unionfs_check_dentry(dentry);
+   if (!err) {
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
+   unionfs_check_dentry(dentry);
+   unionfs_unlock_dentry(dentry);
+   }
unionfs_check_nd(nd);
return ERR_PTR(err);
 }
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 09/17] Unionfs: revalidation code cleanup and refactoring

2008-02-16 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dentry.c |   55 --
 1 files changed, 35 insertions(+), 20 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index cd15243..afa2120 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -18,6 +18,39 @@
 
 #include "union.h"
 
+
+static inline void __dput_lowers(struct dentry *dentry, int start, int end)
+{
+   struct dentry *lower_dentry;
+   int bindex;
+
+   if (start < 0)
+   return;
+   for (bindex = start; bindex <= end; bindex++) {
+   lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   if (!lower_dentry)
+   continue;
+   unionfs_set_lower_dentry_idx(dentry, bindex, NULL);
+   dput(lower_dentry);
+   }
+}
+
+static inline void __iput_lowers(struct inode *inode, int start, int end)
+{
+   struct inode *lower_inode;
+   int bindex;
+
+   if (start < 0)
+   return;
+   for (bindex = start; bindex <= end; bindex++) {
+   lower_inode = unionfs_lower_inode_idx(inode, bindex);
+   if (!lower_inode)
+   continue;
+   unionfs_set_lower_inode_idx(inode, bindex, NULL);
+   iput(lower_inode);
+   }
+}
+
 /*
  * Revalidate a single dentry.
  * Assume that dentry's info node is locked.
@@ -72,15 +105,7 @@ static bool __unionfs_d_revalidate_one(struct dentry 
*dentry,
/* Free the pointers for our inodes and this dentry. */
bstart = dbstart(dentry);
bend = dbend(dentry);
-   if (bstart >= 0) {
-   struct dentry *lower_dentry;
-   for (bindex = bstart; bindex <= bend; bindex++) {
-   lower_dentry =
-   unionfs_lower_dentry_idx(dentry,
-bindex);
-   dput(lower_dentry);
-   }
-   }
+   __dput_lowers(dentry, bstart, bend);
set_dbstart(dentry, -1);
set_dbend(dentry, -1);
 
@@ -90,17 +115,7 @@ static bool __unionfs_d_revalidate_one(struct dentry 
*dentry,
 
bstart = ibstart(dentry->d_inode);
bend = ibend(dentry->d_inode);
-   if (bstart >= 0) {
-   struct inode *lower_inode;
-   for (bindex = bstart; bindex <= bend;
-bindex++) {
-   lower_inode =
-   unionfs_lower_inode_idx(
-   dentry->d_inode,
-   bindex);
-   iput(lower_inode);
-   }
-   }
+   __iput_lowers(dentry->d_inode, bstart, bend);
kfree(UNIONFS_I(dentry->d_inode)->lower_inodes);
UNIONFS_I(dentry->d_inode)->lower_inodes = NULL;
ibstart(dentry->d_inode) = -1;
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 02/17] Unionfs: ensure consistent lower inodes types

2008-02-16 Thread Erez Zadok

When looking up a lower object in multiple branches, especially for
directories, ignore any existing entries whose type is different than the
type of the first found object (otherwise we'll be trying to, say, call
readdir on a non-dir inode).

Signed-off-by: Himanshu Kanda <[EMAIL PROTECTED]>
Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/lookup.c |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/fs/unionfs/lookup.c b/fs/unionfs/lookup.c
index b9ee072..755158e 100644
--- a/fs/unionfs/lookup.c
+++ b/fs/unionfs/lookup.c
@@ -256,6 +256,19 @@ struct dentry *unionfs_lookup_backend(struct dentry 
*dentry,
continue;
}
 
+   /*
+* If we already found at least one positive dentry
+* (dentry_count is non-zero), then we skip all remaining
+* positive dentries if their type is a non-dir.  This is
+* because only directories are allowed to stack on multiple
+* branches, but we have to skip non-dirs (to avoid, say,
+* calling readdir on a regular file).
+*/
+   if (!S_ISDIR(lower_dentry->d_inode->i_mode) && dentry_count) {
+   dput(lower_dentry);
+   continue;
+   }
+
/* number of positive dentries */
dentry_count++;
 
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 04/17] Unionfs: uninline unionfs_copy_attr_times and unionfs_copy_attr_all

2008-02-16 Thread Erez Zadok

This reduces text size by about 6k.

Cc: Hugh Dickins <[EMAIL PROTECTED]>

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/fanout.h |   50 --
 fs/unionfs/subr.c   |   50 ++
 fs/unionfs/union.h  |2 ++
 3 files changed, 52 insertions(+), 50 deletions(-)

diff --git a/fs/unionfs/fanout.h b/fs/unionfs/fanout.h
index 4d9a45f..29d42fb 100644
--- a/fs/unionfs/fanout.h
+++ b/fs/unionfs/fanout.h
@@ -313,54 +313,4 @@ static inline void verify_locked(struct dentry *d)
BUG_ON(!mutex_is_locked(&UNIONFS_D(d)->lock));
 }
 
-/* copy a/m/ctime from the lower branch with the newest times */
-static inline void unionfs_copy_attr_times(struct inode *upper)
-{
-   int bindex;
-   struct inode *lower;
-
-   if (!upper || ibstart(upper) < 0)
-   return;
-   for (bindex = ibstart(upper); bindex <= ibend(upper); bindex++) {
-   lower = unionfs_lower_inode_idx(upper, bindex);
-   if (!lower)
-   continue; /* not all lower dir objects may exist */
-   if (unlikely(timespec_compare(&upper->i_mtime,
- &lower->i_mtime) < 0))
-   upper->i_mtime = lower->i_mtime;
-   if (unlikely(timespec_compare(&upper->i_ctime,
- &lower->i_ctime) < 0))
-   upper->i_ctime = lower->i_ctime;
-   if (unlikely(timespec_compare(&upper->i_atime,
- &lower->i_atime) < 0))
-   upper->i_atime = lower->i_atime;
-   }
-}
-
-/*
- * A unionfs/fanout version of fsstack_copy_attr_all.  Uses a
- * unionfs_get_nlinks to properly calcluate the number of links to a file.
- * Also, copies the max() of all a/m/ctimes for all lower inodes (which is
- * important if the lower inode is a directory type)
- */
-static inline void unionfs_copy_attr_all(struct inode *dest,
-const struct inode *src)
-{
-   dest->i_mode = src->i_mode;
-   dest->i_uid = src->i_uid;
-   dest->i_gid = src->i_gid;
-   dest->i_rdev = src->i_rdev;
-
-   unionfs_copy_attr_times(dest);
-
-   dest->i_blkbits = src->i_blkbits;
-   dest->i_flags = src->i_flags;
-
-   /*
-* Update the nlinks AFTER updating the above fields, because the
-* get_links callback may depend on them.
-*/
-   dest->i_nlink = unionfs_get_nlinks(dest);
-}
-
 #endif /* not _FANOUT_H */
diff --git a/fs/unionfs/subr.c b/fs/unionfs/subr.c
index 0a0fce9..68a6280 100644
--- a/fs/unionfs/subr.c
+++ b/fs/unionfs/subr.c
@@ -240,3 +240,53 @@ char *alloc_whname(const char *name, int len)
 
return buf;
 }
+
+/* copy a/m/ctime from the lower branch with the newest times */
+void unionfs_copy_attr_times(struct inode *upper)
+{
+   int bindex;
+   struct inode *lower;
+
+   if (!upper || ibstart(upper) < 0)
+   return;
+   for (bindex = ibstart(upper); bindex <= ibend(upper); bindex++) {
+   lower = unionfs_lower_inode_idx(upper, bindex);
+   if (!lower)
+   continue; /* not all lower dir objects may exist */
+   if (unlikely(timespec_compare(&upper->i_mtime,
+ &lower->i_mtime) < 0))
+   upper->i_mtime = lower->i_mtime;
+   if (unlikely(timespec_compare(&upper->i_ctime,
+ &lower->i_ctime) < 0))
+   upper->i_ctime = lower->i_ctime;
+   if (unlikely(timespec_compare(&upper->i_atime,
+ &lower->i_atime) < 0))
+   upper->i_atime = lower->i_atime;
+   }
+}
+
+/*
+ * A unionfs/fanout version of fsstack_copy_attr_all.  Uses a
+ * unionfs_get_nlinks to properly calcluate the number of links to a file.
+ * Also, copies the max() of all a/m/ctimes for all lower inodes (which is
+ * important if the lower inode is a directory type)
+ */
+void unionfs_copy_attr_all(struct inode *dest,
+  const struct inode *src)
+{
+   dest->i_mode = src->i_mode;
+   dest->i_uid = src->i_uid;
+   dest->i_gid = src->i_gid;
+   dest->i_rdev = src->i_rdev;
+
+   unionfs_copy_attr_times(dest);
+
+   dest->i_blkbits = src->i_blkbits;
+   dest->i_flags = src->i_flags;
+
+   /*
+* Update the nlinks AFTER updating the above fields, because the
+* get_links callback may depend on them.
+*/
+   dest->i_nlink = unionfs_get_nlinks(d

[PATCH 17/17] VFS/Unionfs: use generic path_get/path_put functions

2008-02-16 Thread Erez Zadok

Remove unionfs's versions thereof.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/main.c |   10 +-
 fs/unionfs/super.c|   27 ++-
 include/linux/namei.h |   12 
 3 files changed, 19 insertions(+), 30 deletions(-)

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index 3585b29..8f59fb5 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -230,11 +230,11 @@ void unionfs_reinterpose(struct dentry *dentry)
 int check_branch(struct nameidata *nd)
 {
/* XXX: remove in ODF code -- stacking unions allowed there */
-   if (!strcmp(nd->dentry->d_sb->s_type->name, UNIONFS_NAME))
+   if (!strcmp(nd->path.dentry->d_sb->s_type->name, UNIONFS_NAME))
return -EINVAL;
-   if (!nd->dentry->d_inode)
+   if (!nd->path.dentry->d_inode)
return -ENOENT;
-   if (!S_ISDIR(nd->dentry->d_inode->i_mode))
+   if (!S_ISDIR(nd->path.dentry->d_inode->i_mode))
return -ENOTDIR;
return 0;
 }
@@ -375,8 +375,8 @@ static int parse_dirs_option(struct super_block *sb, struct 
unionfs_dentry_info
goto out;
}
 
-   lower_root_info->lower_paths[bindex].dentry = nd.dentry;
-   lower_root_info->lower_paths[bindex].mnt = nd.mnt;
+   lower_root_info->lower_paths[bindex].dentry = nd.path.dentry;
+   lower_root_info->lower_paths[bindex].mnt = nd.path.mnt;
 
set_branchperms(sb, bindex, perms);
set_branch_count(sb, bindex, 0);
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index 773623e..fba1598 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -231,8 +231,8 @@ static noinline int do_remount_mode_option(char *optarg, 
int cur_branches,
goto out;
}
for (idx = 0; idx < cur_branches; idx++)
-   if (nd.mnt == new_lower_paths[idx].mnt &&
-   nd.dentry == new_lower_paths[idx].dentry)
+   if (nd.path.mnt == new_lower_paths[idx].mnt &&
+   nd.path.dentry == new_lower_paths[idx].dentry)
break;
path_put(&nd.path); /* no longer needed */
if (idx == cur_branches) {
@@ -274,8 +274,8 @@ static noinline int do_remount_del_option(char *optarg, int 
cur_branches,
goto out;
}
for (idx = 0; idx < cur_branches; idx++)
-   if (nd.mnt == new_lower_paths[idx].mnt &&
-   nd.dentry == new_lower_paths[idx].dentry)
+   if (nd.path.mnt == new_lower_paths[idx].mnt &&
+   nd.path.dentry == new_lower_paths[idx].dentry)
break;
path_put(&nd.path); /* no longer needed */
if (idx == cur_branches) {
@@ -358,8 +358,8 @@ static noinline int do_remount_add_option(char *optarg, int 
cur_branches,
goto out;
}
for (idx = 0; idx < cur_branches; idx++)
-   if (nd.mnt == new_lower_paths[idx].mnt &&
-   nd.dentry == new_lower_paths[idx].dentry)
+   if (nd.path.mnt == new_lower_paths[idx].mnt &&
+   nd.path.dentry == new_lower_paths[idx].dentry)
break;
path_put(&nd.path); /* no longer needed */
if (idx == cur_branches) {
@@ -425,10 +425,10 @@ found_insertion_point:
memmove(&new_lower_paths[idx+1], &new_lower_paths[idx],
(cur_branches - idx) * sizeof(struct path));
}
-   new_lower_paths[idx].dentry = nd.dentry;
-   new_lower_paths[idx].mnt = nd.mnt;
+   new_lower_paths[idx].dentry = nd.path.dentry;
+   new_lower_paths[idx].mnt = nd.path.mnt;
 
-   new_data[idx].sb = nd.dentry->d_sb;
+   new_data[idx].sb = nd.path.dentry->d_sb;
atomic_set(&new_data[idx].open_files, 0);
new_data[idx].branchperms = perms;
new_data[idx].branch_id = ++*high_branch_id; /* assign new branch ID */
@@ -577,7 +577,7 @@ static int unionfs_remount_fs(struct super_block *sb, int 
*flags,
memcpy(tmp_lower_paths, UNIONFS_D(sb->s_root)->lower_paths,
   cur_branches * sizeof(struct path));
for (i = 0; i < cur_branches; i++)
-   pathget(&tmp_lower_paths[i]); /* drop refs at end of fxn */
+   path_get(&tmp_lower_paths[i]); /* drop refs at end of fxn */
 
/***
 * For each branch command, do path_lookup on the requested branch,
@@ -1008,9 +1008,10 @@ static int unionfs_show_options(struct seq_file *m, 
struct vfsmount *mnt)
 
seq_printf(m, ",dirs=");
for (bindex = bstart; bindex <= bend; bindex++) {
-   path = d_path

[PATCH 16/17] Unionfs: use the new path_put

2008-02-16 Thread Erez Zadok

From: Jan Blunck <[EMAIL PROTECTED]>

* Add path_put() functions for releasing a reference to the dentry and
  vfsmount of a struct path in the right order

* Switch from path_release(nd) to path_put(&nd->path)

* Rename dput_path() to path_put_conditional()

Signed-off-by: Jan Blunck <[EMAIL PROTECTED]>
Signed-off-by: Andreas Gruenbacher <[EMAIL PROTECTED]>
Acked-by: Christoph Hellwig <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/main.c  |2 +-
 fs/unionfs/super.c |   12 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index 4bc2c66..3585b29 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -371,7 +371,7 @@ static int parse_dirs_option(struct super_block *sb, struct 
unionfs_dentry_info
if (err) {
printk(KERN_ERR "unionfs: lower directory "
   "'%s' is not a valid branch\n", name);
-   path_release(&nd);
+   path_put(&nd.path);
goto out;
}
 
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index b71fc2a..773623e 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -234,7 +234,7 @@ static noinline int do_remount_mode_option(char *optarg, 
int cur_branches,
if (nd.mnt == new_lower_paths[idx].mnt &&
nd.dentry == new_lower_paths[idx].dentry)
break;
-   path_release(&nd);  /* no longer needed */
+   path_put(&nd.path); /* no longer needed */
if (idx == cur_branches) {
err = -ENOENT;  /* err may have been reset above */
printk(KERN_ERR "unionfs: branch \"%s\" "
@@ -277,7 +277,7 @@ static noinline int do_remount_del_option(char *optarg, int 
cur_branches,
if (nd.mnt == new_lower_paths[idx].mnt &&
nd.dentry == new_lower_paths[idx].dentry)
break;
-   path_release(&nd);  /* no longer needed */
+   path_put(&nd.path); /* no longer needed */
if (idx == cur_branches) {
printk(KERN_ERR "unionfs: branch \"%s\" "
   "not found\n", optarg);
@@ -296,7 +296,7 @@ static noinline int do_remount_del_option(char *optarg, int 
cur_branches,
 * new_data and new_lower_paths one to the left.  Finally, adjust
 * cur_branches.
 */
-   pathput(&new_lower_paths[idx]);
+   path_put(&new_lower_paths[idx]);
 
if (idx < cur_branches - 1) {
/* if idx==cur_branches-1, we delete last branch: easy */
@@ -361,7 +361,7 @@ static noinline int do_remount_add_option(char *optarg, int 
cur_branches,
if (nd.mnt == new_lower_paths[idx].mnt &&
nd.dentry == new_lower_paths[idx].dentry)
break;
-   path_release(&nd);  /* no longer needed */
+   path_put(&nd.path); /* no longer needed */
if (idx == cur_branches) {
printk(KERN_ERR "unionfs: branch \"%s\" "
   "not found\n", optarg);
@@ -408,7 +408,7 @@ found_insertion_point:
if (err) {
printk(KERN_ERR "unionfs: lower directory "
   "\"%s\" is not a valid branch\n", optarg);
-   path_release(&nd);
+   path_put(&nd.path);
goto out;
}
 
@@ -818,7 +818,7 @@ out_release:
/* no need to cleanup/release anything in tmp_data */
if (tmp_lower_paths)
for (i = 0; i < new_branches; i++)
-   pathput(&tmp_lower_paths[i]);
+   path_put(&tmp_lower_paths[i]);
 out_free:
kfree(tmp_lower_paths);
kfree(tmp_data);
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 12/17] Unionfs: branch management/configuration fixes

2008-02-16 Thread Erez Zadok

Remove unnecessary calls to update branch m/ctimes, and use them only when
needed.  Update branch vfsmounts after operations that could cause a copyup.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/commonfops.c |9 +++--
 fs/unionfs/copyup.c |3 ++-
 fs/unionfs/dentry.c |1 +
 fs/unionfs/mmap.c   |   17 -
 fs/unionfs/rename.c |7 +--
 5 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 491e2ff..2add167 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -604,6 +604,7 @@ out:
}
 out_nofree:
if (!err) {
+   unionfs_postcopyup_setmnt(dentry);
unionfs_copy_attr_times(inode);
unionfs_check_file(file);
unionfs_check_inode(inode);
@@ -839,13 +840,9 @@ int unionfs_flush(struct file *file, fl_owner_t id)
 
}
 
-   /* on success, update our times */
-   unionfs_copy_attr_times(dentry->d_inode);
-   /* parent time could have changed too (async) */
-   unionfs_copy_attr_times(dentry->d_parent->d_inode);
-
 out:
-   unionfs_check_file(file);
+   if (!err)
+   unionfs_check_file(file);
unionfs_unlock_dentry(file->f_path.dentry);
unionfs_read_unlock(dentry->d_sb);
return err;
diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index 9beac01..f71bddf 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -819,7 +819,8 @@ begin:
 * update times of this dentry, but also the parent, because if
 * we changed, the parent may have changed too.
 */
-   unionfs_copy_attr_times(parent_dentry->d_inode);
+   fsstack_copy_attr_times(parent_dentry->d_inode,
+   lower_parent_dentry->d_inode);
unionfs_copy_attr_times(child_dentry->d_inode);
 
parent_dentry = child_dentry;
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 17b297d..a956b94 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -482,6 +482,7 @@ static int unionfs_d_revalidate(struct dentry *dentry, 
struct nameidata *nd)
unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
err = __unionfs_d_revalidate_chain(dentry, nd, false);
if (likely(err > 0)) { /* true==1: dentry is valid */
+   unionfs_postcopyup_setmnt(dentry);
unionfs_check_dentry(dentry);
unionfs_check_nd(nd);
}
diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index ad770ac..d6ac61e 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -227,20 +227,11 @@ static int unionfs_prepare_write(struct file *file, 
struct page *page,
int err;
 
unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
-   /*
-* This is the only place where we unconditionally copy the lower
-* attribute times before calling unionfs_file_revalidate.  The
-* reason is that our ->write calls do_sync_write which in turn will
-* call our ->prepare_write and then ->commit_write.  Before our
-* ->write is called, the lower mtimes are in sync, but by the time
-* the VFS calls our ->commit_write, the lower mtimes have changed.
-* Therefore, the only reasonable time for us to sync up from the
-* changed lower mtimes, and avoid an invariant violation warning,
-* is here, in ->prepare_write.
-*/
-   unionfs_copy_attr_times(file->f_path.dentry->d_inode);
err = unionfs_file_revalidate(file, true);
-   unionfs_check_file(file);
+   if (!err) {
+   unionfs_copy_attr_times(file->f_path.dentry->d_inode);
+   unionfs_check_file(file);
+   }
unionfs_read_unlock(file->f_path.dentry->d_sb);
 
return err;
diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index 5ab13f9..cc16eb2 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -138,6 +138,11 @@ static int __unionfs_rename(struct inode *old_dir, struct 
dentry *old_dentry,
err = vfs_rename(lower_old_dir_dentry->d_inode, lower_old_dentry,
 lower_new_dir_dentry->d_inode, lower_new_dentry);
 out_err_unlock:
+   if (!err) {
+   /* update parent dir times */
+   fsstack_copy_attr_times(old_dir, lower_old_dir_dentry->d_inode);
+   fsstack_copy_attr_times(new_dir, lower_new_dir_dentry->d_inode);
+   }
unlock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
lockdep_on();
 
@@ -526,8 +531,6 @@ int unionfs_rename(struct inode *old_dir, struct dentry 
*old_dentry,
}
}
/* if all of this renaming succeeded, update our times */
-   unionfs_copy_attr_times(old_dir);
-   unionfs_copy_attr_times(new_dir);
unionfs

[PATCH 10/17] Unionfs: factor out revalidation routine

2008-02-16 Thread Erez Zadok

To be used by rest of revalidation code, as well a callers who already
locked the child and parent dentry branch-configurations.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dentry.c |   87 +++---
 fs/unionfs/union.h  |3 ++
 2 files changed, 57 insertions(+), 33 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index afa2120..3bd2dfb 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -285,6 +285,59 @@ void purge_sb_data(struct super_block *sb)
 }
 
 /*
+ * Revalidate a single file/symlink/special dentry.  Assume that info nodes
+ * of the dentry and its parent are locked.  Assume that parent(s) are all
+ * valid already, but the child may not yet be valid.  Returns true if
+ * valid, false otherwise.
+ */
+bool __unionfs_d_revalidate_one_locked(struct dentry *dentry,
+  struct nameidata *nd,
+  bool willwrite)
+{
+   bool valid = false; /* default is invalid */
+   int sbgen, dgen, bindex;
+
+   verify_locked(dentry);
+   verify_locked(dentry->d_parent);
+
+   sbgen = atomic_read(&UNIONFS_SB(dentry->d_sb)->generation);
+   dgen = atomic_read(&UNIONFS_D(dentry)->generation);
+
+   if (unlikely(is_newer_lower(dentry))) {
+   /* root dentry special case as aforementioned */
+   if (IS_ROOT(dentry)) {
+   unionfs_copy_attr_times(dentry->d_inode);
+   } else {
+   /*
+* reset generation number to zero, guaranteed to be
+* "old"
+*/
+   dgen = 0;
+   atomic_set(&UNIONFS_D(dentry)->generation, dgen);
+   }
+   if (!willwrite)
+   purge_inode_data(dentry->d_inode);
+   }
+   valid = __unionfs_d_revalidate_one(dentry, nd);
+
+   /*
+* If __unionfs_d_revalidate_one() succeeded above, then it will
+* have incremented the refcnt of the mnt's, but also the branch
+* indices of the dentry will have been updated (to take into
+* account any branch insertions/deletion.  So the current
+* dbstart/dbend match the current, and new, indices of the mnts
+* which __unionfs_d_revalidate_one has incremented.  Note: the "if"
+* test below does not depend on whether chain_len was 0 or greater.
+*/
+   if (!valid || sbgen == dgen)
+   goto out;
+   for (bindex = dbstart(dentry); bindex <= dbend(dentry); bindex++)
+   unionfs_mntput(dentry, bindex);
+out:
+   return valid;
+}
+
+/*
  * Revalidate a parent chain of dentries, then the actual node.
  * Assumes that dentry is locked, but will lock all parents if/when needed.
  *
@@ -404,42 +457,10 @@ out_this:
if (dentry != dentry->d_parent)
unionfs_lock_dentry(dentry->d_parent,
UNIONFS_DMUTEX_REVAL_PARENT);
-   dgen = atomic_read(&UNIONFS_D(dentry)->generation);
-
-   if (unlikely(is_newer_lower(dentry))) {
-   /* root dentry special case as aforementioned */
-   if (IS_ROOT(dentry)) {
-   unionfs_copy_attr_times(dentry->d_inode);
-   } else {
-   /*
-* reset generation number to zero, guaranteed to be
-* "old"
-*/
-   dgen = 0;
-   atomic_set(&UNIONFS_D(dentry)->generation, dgen);
-   }
-   if (!willwrite)
-   purge_inode_data(dentry->d_inode);
-   }
-   valid = __unionfs_d_revalidate_one(dentry, nd);
+   valid = __unionfs_d_revalidate_one_locked(dentry, nd, willwrite);
if (dentry != dentry->d_parent)
unionfs_unlock_dentry(dentry->d_parent);
 
-   /*
-* If __unionfs_d_revalidate_one() succeeded above, then it will
-* have incremented the refcnt of the mnt's, but also the branch
-* indices of the dentry will have been updated (to take into
-* account any branch insertions/deletion.  So the current
-* dbstart/dbend match the current, and new, indices of the mnts
-* which __unionfs_d_revalidate_one has incremented.  Note: the "if"
-* test below does not depend on whether chain_len was 0 or greater.
-*/
-   if (valid && sbgen != dgen)
-   for (bindex = dbstart(dentry);
-bindex <= dbend(dentry);
-bindex++)
-   unionfs_mntput(dentry, bindex);
-
 out_free:
/* unlock/dput all dentries in chain and return status */
if (chain_len >

[PATCH 08/17] Unionfs: improve debugging in copy_attr_times

2008-02-16 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/subr.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/subr.c b/fs/unionfs/subr.c
index 68a6280..1a40f63 100644
--- a/fs/unionfs/subr.c
+++ b/fs/unionfs/subr.c
@@ -247,8 +247,14 @@ void unionfs_copy_attr_times(struct inode *upper)
int bindex;
struct inode *lower;
 
-   if (!upper || ibstart(upper) < 0)
+   if (!upper)
return;
+   if (ibstart(upper) < 0) {
+#ifdef CONFIG_UNION_FS_DEBUG
+   WARN_ON(ibstart(upper) < 0);
+#endif /* CONFIG_UNION_FS_DEBUG */
+   return;
+   }
for (bindex = ibstart(upper); bindex <= ibend(upper); bindex++) {
lower = unionfs_lower_inode_idx(upper, bindex);
if (!lower)
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 13/17] Unionfs: use dget_parent in revalidation code

2008-02-16 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dentry.c |   13 -
 1 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index a956b94..f8f65e1 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -410,15 +410,10 @@ bool __unionfs_d_revalidate_chain(struct dentry *dentry, 
struct nameidata *nd,
goto out;
}
 
-   /*
-* lock all dentries in chain, in child to parent order.
-* if failed, then sleep for a little, then retry.
-*/
-   dtmp = dentry->d_parent;
-   for (i = chain_len-1; i >= 0; i--) {
-   chain[i] = dget(dtmp);
-   dtmp = dtmp->d_parent;
-   }
+   /* grab all dentries in chain, in child to parent order */
+   dtmp = dentry;
+   for (i = chain_len-1; i >= 0; i--)
+   dtmp = chain[i] = dget_parent(dtmp);
 
/*
 * call __unionfs_d_revalidate_one() on each dentry, but in parent
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 05/17] Unionfs: initialize path_save variable

2008-02-16 Thread Erez Zadok

This is not strictly necessary, but it helps quiet a gcc-4.2 warning (a good
optimizer may optimize this initialization away).

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
Signed-off-by: Josef 'Jeff' Sipek <[EMAIL PROTECTED]>
---
 fs/unionfs/inode.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 0b92da2..8d939dc 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -246,7 +246,7 @@ static struct dentry *unionfs_lookup(struct inode *parent,
 struct dentry *dentry,
 struct nameidata *nd)
 {
-   struct path path_save;
+   struct path path_save = {NULL, NULL};
struct dentry *ret;
 
unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 06/17] Unionfs: extend dentry branch configuration lock in open

2008-02-16 Thread Erez Zadok

Dentry branch configuration "info node" lock should extend to calls to
copy_attr_times.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/commonfops.c |   14 --
 1 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index f37192f..96473c4 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -521,11 +521,12 @@ int unionfs_open(struct inode *inode, struct file *file)
 {
int err = 0;
struct file *lower_file = NULL;
-   struct dentry *dentry = NULL;
+   struct dentry *dentry = file->f_path.dentry;
int bindex = 0, bstart = 0, bend = 0;
int size;
 
unionfs_read_lock(inode->i_sb, UNIONFS_SMUTEX_PARENT);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
 
file->private_data =
kzalloc(sizeof(struct unionfs_file_info), GFP_KERNEL);
@@ -551,9 +552,6 @@ int unionfs_open(struct inode *inode, struct file *file)
goto out;
}
 
-   dentry = file->f_path.dentry;
-   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
-
bstart = fbstart(file) = dbstart(dentry);
bend = fbend(file) = dbend(dentry);
 
@@ -573,15 +571,12 @@ int unionfs_open(struct inode *inode, struct file *file)
if (!lower_file)
continue;
 
-   branchput(file->f_path.dentry->d_sb, bindex);
+   branchput(dentry->d_sb, bindex);
/* fput calls dput for lower_dentry */
fput(lower_file);
}
}
 
-   /* XXX: should this unlock be moved to the function bottom? */
-   unionfs_unlock_dentry(dentry);
-
 out:
if (err) {
kfree(UNIONFS_F(file)->lower_files);
@@ -590,12 +585,11 @@ out:
}
 out_nofree:
if (!err) {
-   dentry = file->f_path.dentry;
-   unionfs_copy_attr_times(dentry->d_parent->d_inode);
unionfs_copy_attr_times(inode);
unionfs_check_file(file);
unionfs_check_inode(inode);
}
+   unionfs_unlock_dentry(dentry);
unionfs_read_unlock(inode->i_sb);
return err;
 }
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 15/17] Unionfs: embed a struct path into struct nameidata instead of nd dentrymnt

2008-02-16 Thread Erez Zadok

From: Andrew Morton <[EMAIL PROTECTED]>

Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/inode.c |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 2e791fd..640969d 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -254,8 +254,8 @@ static struct dentry *unionfs_lookup(struct inode *parent,
 
/* save the dentry & vfsmnt from namei */
if (nd) {
-   path_save.dentry = nd->dentry;
-   path_save.mnt = nd->mnt;
+   path_save.dentry = nd->path.dentry;
+   path_save.mnt = nd->path.mnt;
}
 
/*
@@ -266,8 +266,8 @@ static struct dentry *unionfs_lookup(struct inode *parent,
 
/* restore the dentry & vfsmnt in namei */
if (nd) {
-   nd->dentry = path_save.dentry;
-   nd->mnt = path_save.mnt;
+   nd->path.dentry = path_save.dentry;
+   nd->path.mnt = path_save.mnt;
}
if (!IS_ERR(ret)) {
if (ret)
@@ -885,7 +885,7 @@ static int unionfs_permission(struct inode *inode, int mask,
const int write_mask = (mask & MAY_WRITE) && !(mask & MAY_READ);
 
if (nd)
-   unionfs_lock_dentry(nd->dentry, UNIONFS_DMUTEX_CHILD);
+   unionfs_lock_dentry(nd->path.dentry, UNIONFS_DMUTEX_CHILD);
 
if (!UNIONFS_I(inode)->lower_inodes) {
if (is_file)/* dirs can be unlinked but chdir'ed to */
@@ -960,7 +960,7 @@ out:
unionfs_check_inode(inode);
unionfs_check_nd(nd);
if (nd)
-   unionfs_unlock_dentry(nd->dentry);
+   unionfs_unlock_dentry(nd->path.dentry);
return err;
 }
 
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 11/17] Unionfs: lock parents' branch configuration fixes

2008-02-16 Thread Erez Zadok

Ensure that we lock the branch configuration of parent and child dentries in
operations which need it, and in the right order.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/commonfops.c |   31 +---
 fs/unionfs/dentry.c |   26 +---
 fs/unionfs/inode.c  |   58 +++---
 fs/unionfs/union.h  |5 ++-
 fs/unionfs/unlink.c |   11 -
 5 files changed, 91 insertions(+), 40 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 96473c4..491e2ff 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -300,8 +300,9 @@ out:
  * Revalidate the struct file
  * @file: file to revalidate
  * @willwrite: true if caller may cause changes to the file; false otherwise.
+ * Caller must lock/unlock dentry's branch configuration.
  */
-int unionfs_file_revalidate(struct file *file, bool willwrite)
+int unionfs_file_revalidate_locked(struct file *file, bool willwrite)
 {
struct super_block *sb;
struct dentry *dentry;
@@ -311,7 +312,6 @@ int unionfs_file_revalidate(struct file *file, bool 
willwrite)
int err = 0;
 
dentry = file->f_path.dentry;
-   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
sb = dentry->d_sb;
 
/*
@@ -416,7 +416,17 @@ out:
 out_nofree:
if (!err)
unionfs_check_file(file);
-   unionfs_unlock_dentry(dentry);
+   return err;
+}
+
+int unionfs_file_revalidate(struct file *file, bool willwrite)
+{
+   int err;
+
+   unionfs_lock_dentry(file->f_path.dentry, UNIONFS_DMUTEX_CHILD);
+   err = unionfs_file_revalidate_locked(file, willwrite);
+   unionfs_unlock_dentry(file->f_path.dentry);
+
return err;
 }
 
@@ -524,9 +534,18 @@ int unionfs_open(struct inode *inode, struct file *file)
struct dentry *dentry = file->f_path.dentry;
int bindex = 0, bstart = 0, bend = 0;
int size;
+   int valid = 0;
 
unionfs_read_lock(inode->i_sb, UNIONFS_SMUTEX_PARENT);
unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
+   if (dentry != dentry->d_parent)
+   unionfs_lock_dentry(dentry->d_parent, UNIONFS_DMUTEX_PARENT);
+
+   valid = __unionfs_d_revalidate_chain(dentry->d_parent, NULL, false);
+   if (unlikely(!valid)) {
+   err = -ESTALE;
+   goto out_nofree;
+   }
 
file->private_data =
kzalloc(sizeof(struct unionfs_file_info), GFP_KERNEL);
@@ -589,6 +608,8 @@ out_nofree:
unionfs_check_file(file);
unionfs_check_inode(inode);
}
+   if (dentry != dentry->d_parent)
+   unionfs_unlock_dentry(dentry->d_parent);
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(inode->i_sb);
return err;
@@ -797,8 +818,9 @@ int unionfs_flush(struct file *file, fl_owner_t id)
int bindex, bstart, bend;
 
unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_PARENT);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
 
-   err = unionfs_file_revalidate(file, true);
+   err = unionfs_file_revalidate_locked(file, true);
if (unlikely(err))
goto out;
unionfs_check_file(file);
@@ -824,6 +846,7 @@ int unionfs_flush(struct file *file, fl_owner_t id)
 
 out:
unionfs_check_file(file);
+   unionfs_unlock_dentry(file->f_path.dentry);
unionfs_read_unlock(dentry->d_sb);
return err;
 }
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 3bd2dfb..17b297d 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -363,6 +363,7 @@ bool __unionfs_d_revalidate_chain(struct dentry *dentry, 
struct nameidata *nd,
chain_len = 0;
sbgen = atomic_read(&UNIONFS_SB(dentry->d_sb)->generation);
dtmp = dentry->d_parent;
+   verify_locked(dentry);
if (dentry != dtmp)
unionfs_lock_dentry(dtmp, UNIONFS_DMUTEX_REVAL_PARENT);
dgen = atomic_read(&UNIONFS_D(dtmp)->generation);
@@ -453,7 +454,7 @@ bool __unionfs_d_revalidate_chain(struct dentry *dentry, 
struct nameidata *nd,
 
 out_this:
/* finally, lock this dentry and revalidate it */
-   verify_locked(dentry);
+   verify_locked(dentry);  /* verify child is locked */
if (dentry != dentry->d_parent)
unionfs_lock_dentry(dentry->d_parent,
UNIONFS_DMUTEX_REVAL_PARENT);
@@ -491,24 +492,20 @@ static int unionfs_d_revalidate(struct dentry *dentry, 
struct nameidata *nd)
return err;
 }
 
-/*
- * At this point no one can reference this dentry, so we don't have to be
- * careful about concurrent access.
- */
 static void unionfs_d_release(struct dentry *dentry)
 {
int bindex, bstart, bend;
 
unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHI

[PATCH 14/17] Unionfs: stop using iget() and read_inode()

2008-02-16 Thread Erez Zadok

From: David Howells <[EMAIL PROTECTED]>

Replace unionfs_read_inode() with unionfs_iget(), and call that instead of
iget().  unionfs_iget() then uses iget_locked() directly and returns a
proper error code instead of an inode in the event of an error.

unionfs_fill_super() returns any error incurred when getting the root inode
instead of EINVAL.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/main.c  |   11 +--
 fs/unionfs/super.c |   19 ++-
 fs/unionfs/union.h |1 +
 3 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index ba3471d..4bc2c66 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -104,9 +104,8 @@ struct dentry *unionfs_interpose(struct dentry *dentry, 
struct super_block *sb,
BUG_ON(is_negative_dentry);
 
/*
-* We allocate our new inode below, by calling iget.
-* iget will call our read_inode which will initialize some
-* of the new inode's fields
+* We allocate our new inode below by calling unionfs_iget,
+* which will initialize some of the new inode's fields
 */
 
/*
@@ -128,9 +127,9 @@ struct dentry *unionfs_interpose(struct dentry *dentry, 
struct super_block *sb,
}
} else {
/* get unique inode number for unionfs */
-   inode = iget(sb, iunique(sb, UNIONFS_ROOT_INO));
-   if (!inode) {
-   err = -EACCES;
+   inode = unionfs_iget(sb, iunique(sb, UNIONFS_ROOT_INO));
+   if (IS_ERR(inode)) {
+   err = PTR_ERR(inode);
goto out;
}
if (atomic_read(&inode->i_count) > 1)
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index 175840f..b71fc2a 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -24,11 +24,19 @@
  */
 static struct kmem_cache *unionfs_inode_cachep;
 
-static void unionfs_read_inode(struct inode *inode)
+struct inode *unionfs_iget(struct super_block *sb, unsigned long ino)
 {
int size;
-   struct unionfs_inode_info *info = UNIONFS_I(inode);
+   struct unionfs_inode_info *info;
+   struct inode *inode;
 
+   inode = iget_locked(sb, ino);
+   if (!inode)
+   return ERR_PTR(-ENOMEM);
+   if (!(inode->i_state & I_NEW))
+   return inode;
+
+   info = UNIONFS_I(inode);
memset(info, 0, offsetof(struct unionfs_inode_info, vfs_inode));
info->bstart = -1;
info->bend = -1;
@@ -44,7 +52,8 @@ static void unionfs_read_inode(struct inode *inode)
if (unlikely(!info->lower_inodes)) {
printk(KERN_CRIT "unionfs: no kernel memory when allocating "
   "lower-pointer array!\n");
-   BUG();
+   iget_failed(inode);
+   return ERR_PTR(-ENOMEM);
}
 
inode->i_version++;
@@ -60,7 +69,8 @@ static void unionfs_read_inode(struct inode *inode)
inode->i_atime.tv_sec = inode->i_atime.tv_nsec = 0;
inode->i_mtime.tv_sec = inode->i_mtime.tv_nsec = 0;
inode->i_ctime.tv_sec = inode->i_ctime.tv_nsec = 0;
-
+   unlock_new_inode(inode);
+   return inode;
 }
 
 /*
@@ -1025,7 +1035,6 @@ out:
 }
 
 struct super_operations unionfs_sops = {
-   .read_inode = unionfs_read_inode,
.delete_inode   = unionfs_delete_inode,
.put_super  = unionfs_put_super,
.statfs = unionfs_statfs,
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 1bf0c09..533806c 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -364,6 +364,7 @@ extern int unionfs_fsync(struct file *file, struct dentry 
*dentry,
 extern int unionfs_fasync(int fd, struct file *file, int flag);
 
 /* Inode operations */
+extern struct inode *unionfs_iget(struct super_block *sb, unsigned long ino);
 extern int unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
  struct inode *new_dir, struct dentry *new_dentry);
 extern int unionfs_unlink(struct inode *dir, struct dentry *dentry);
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: unionfs_copy_attr_times oopses

2008-02-16 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Hugh Dickins writes:

> Hi Erez,
> 
> Aside from the occasional "unionfs: new lower inode mtime" messages
> on directories (which I've got into the habit of ignoring now), the
> only problem I'm still suffering with unionfs over tmpfs (not tested
> any other fs's below it recently) is oops in unionfs_copy_attr_times.
> 
> I believe I'm working with your latest: 2.6.24-rc8-mm1 plus the four
> patches you posted to lkml on 26 Jan.  But this problem has been around
> for a while before that: I'd been hoping to debug it myself, but taken
> too long to make too little progress, so now handing over to you.
> 
> The oops occurs while doing repeated "make -j20" kernel builds in a
> unionfs mount of a tmpfs (though I doubt tmpfs is relevant): most of
> my testing was while swapping, but today I find that's irrelevant,
> and it should happen much quicker without.  SMP kernels (4 cpus),
> I haven't tried UP; happens with or without PREEMPT, may just be
> coincidence that it happens quicker on the machines with PREEMPT.
> 
> Most commonly it's unionfs_copy_attr_times called from unionfs_create,
> but that's probably just the most common route in this workload:
> I've seen it also when called from unionfs_rename or unionfs_open or
> unionfs_unlink.  It looks like there's a locking or refcounting bug,
> hence a race: the unionfs_inode_info which unionfs_copy_attr_times
> is working on gets changed underneath it, so it oopses on NULL
> lower_inodes.
[...]

Hugh,

Check out my latest set of patches (which correspond to release 2.2.4 of
Unionfs).  Thanks to your info and the patch, I was able to trigger several
races more frequently, and fix them.  I've tested my code with make -j N
(for N=4 and N=20), on a 4 cpu machine a well as a 2 cpu machine (w/
different amounts of memory and CPU speeds, also 32-bit vs 64-bit); I ran a
kernel compile for ~10-12 hours.  With the patches I just posted, I wasn't
able to trigger any of the WARN_ON's in unionfs_copy_attr_times.  I also
tried it while flushing caches via /proc, and/or performing branch-mgmt
commands in unionfs.

Give it a good shake and let me know what you find.

Thanks,
Erez.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [2.6 patch] make vfs_ioctl() static

2008-02-17 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Christoph Hellwig writes:
> On Sun, Feb 17, 2008 at 10:18:42AM +0200, Adrian Bunk wrote:
> > This patch makes the needlessly global vfs_ioctl() static.
> 
> I think the point was toa eventually export it for stackable filesystem
> use.  But until they start using it marking it static seems fine with
> me.

Right.  I'm not using it yet in unionfs, although I could; for now I'm just
calling very similar code myself.  This is only used in unionfs after I
process my own ioctls; IOW, I pass all unknown ioctls to the lower level and
let it handle it.

eCryptfs, however, doesn't pass unknown ioctls to the lower layer: it only
processes its own.

Honestly I'm not sure which is more appropriate: should a stackable f/s pass
unknown ioctls to the lower f/s or not?  If it doesn't, would any important
functionality be lost?

Erez.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] fix up kerneldoc in fs/ioctl.c a little bit

2008-02-08 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Christoph Hellwig writes:
>  - remove non-standard in/out markers
>  - use tabs for formatting
> 
> 
> Signed-off-by: Christoph Hellwig <[EMAIL PROTECTED]>
> 
> Index: linux-2.6/fs/ioctl.c
> ===
> --- linux-2.6.orig/fs/ioctl.c 2008-02-09 07:49:02.0 +0100
> +++ linux-2.6/fs/ioctl.c  2008-02-09 07:49:20.0 +0100
> @@ -18,9 +18,9 @@
>  
>  /**
>   * vfs_ioctl - call filesystem specific ioctl methods
> - * @filp: [in] open file to invoke ioctl method on
> - * @cmd:  [in] ioctl command to execute
> - * @arg:  [in/out] command-specific argument for ioctl
> + * @filp:open file to invoke ioctl method on
> + * @cmd: ioctl command to execute
> + * @arg: command-specific argument for ioctl
>   *
>   * Invokes filesystem specific ->unlocked_ioctl, if one exists; otherwise
>   * invokes * filesystem specific ->ioctl method.  If neither method exists,
  ^

I also think this extra '*' in the last comment line above is spurious,
perhaps the result of a paragraph reformatting command that mixed comment
*'s with text.

Erez.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 1/3] Unionfs: use printk KERN_CONT for debugging messages

2008-01-03 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Joe Perches writes:
> On Thu, 2008-01-03 at 00:57 -0500, Erez Zadok wrote:

> I think printks should be single statements and
> KERN_CONT should be used as sparingly as possible.
[...]

KERN_CONT is documented as not being SMP safe, but I figured it was harmless
for just some debugging message.  Still, I like your way better.  Thanks
Joe.

> Perhaps:
>   pr_debug("IT(%lu:%d): %s:%s:%d "
>"um=%lu/%lu lm=%lu/%lu "
>"uc=%lu/%lu lc=%lu/%lu\n",
>inode->i_ino, bindex, file, fnx, line,
>inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
>lower_inode->i_mtime.tv_sec,
>lower_inode->i_mtime.tv_nsec
>inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
>lower_inode->i_ctime.tv_sec,
>lower_inode->i_ctime.tv_nsec);
[...]

Erez.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] block2mtd lockdep_init_map warning

2008-01-05 Thread Erez Zadok

Hi David,

I've reported before a lockdep warning when block2mtd is modloaded, and a
device is initialized, as in

modprobe block2mtd block2mtd=/dev/loop0

A typical warning looks like this:

BUG: key f88565c0 not in .data!
WARNING: at kernel/lockdep.c:2331 lockdep_init_map()
Pid: 1823, comm: modprobe Not tainted 2.6.24-rc6-unionfs2 #135
 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x6c/0x72
 [] lockdep_init_map+0x94/0x374
 [] debug_mutex_init+0x2c/0x3c
 [] __mutex_init+0x38/0x40
 [] 0xf885520d
 [] parse_args+0x121/0x1fb
 [] sys_init_module+0x10e8/0x1576
 [] sysenter_past_esp+0x5f/0xa5
 ===

This is a long-standing problem I've seen in several of the latest stable
kernels.  Once lockdep turns itself off, there's no easy way to turn it back
on short of a reboot.

I looked more closely at the mtd code.  I believe the problem is that
lockdep doesn't like a mutex_init to be called from a module_init code path,
possibly because the module's symbols aren't all initialized yet.  (This
could arguably be considered a limitation of lockdep.)

So I tried to defer the call to module_init until it's absolutely needed.  I
couldn't find a clean way to do that via the struct mtd_info ops (there's no
suitable ->init op), so instead I used an int to mark whether the mutex is
initialized or not.  Below is a patch.  It works, but it's not as clean as
it should be: a better way would be to probably add an mtd_info ->init op or
so.

At least with this patch, lockdep doesn't complain any longer, so I can run
a clean set of regression tests w/ unionfs on top of jffs2 and other file
systems.

In lieu of a better fix, is this patch acceptable?

Thanks,
Erez.

--

block2mtd: defer mutex initialization to avoid a lockdep warning

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>

diff --git a/drivers/mtd/devices/block2mtd.c b/drivers/mtd/devices/block2mtd.c
index be4b994..2c6d3e7 100644
--- a/drivers/mtd/devices/block2mtd.c
+++ b/drivers/mtd/devices/block2mtd.c
@@ -32,6 +32,7 @@ struct block2mtd_dev {
struct list_head list;
struct block_device *blkdev;
struct mtd_info mtd;
+   int mutex_initialized;
struct mutex write_mutex;
 };
 
@@ -85,6 +86,11 @@ static int block2mtd_erase(struct mtd_info *mtd, struct 
erase_info *instr)
size_t len = instr->len;
int err;
 
+   if (!dev->mutex_initialized) {
+   mutex_init(&dev->write_mutex);
+   dev->mutex_initialized = 1;
+   }
+
instr->state = MTD_ERASING;
mutex_lock(&dev->write_mutex);
err = _block2mtd_erase(dev, from, len);
@@ -194,6 +200,11 @@ static int block2mtd_write(struct mtd_info *mtd, loff_t 
to, size_t len,
struct block2mtd_dev *dev = mtd->priv;
int err;
 
+   if (!dev->mutex_initialized) {
+   mutex_init(&dev->write_mutex);
+   dev->mutex_initialized = 1;
+   }
+
if (!len)
return 0;
if (to >= mtd->size)
@@ -275,8 +286,6 @@ static struct block2mtd_dev *add_device(char *devname, int 
erase_size)
goto devinit_err;
}
 
-   mutex_init(&dev->write_mutex);
-
/* Setup the MTD structure */
/* make the name contain the block device in */
dev->mtd.name = kmalloc(sizeof("block2mtd: ") + strlen(devname),
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] block2mtd lockdep_init_map warning

2008-01-06 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, =?utf-8?B?SsO2cm4=?= Engel writes:
> Maybe it is not obvious that I maintain this driver and would like to be
> kept on Cc:.  Will send a patch to fix that shortly.

I was looking in MAINTAINERS, and you weren't listed on the main MTD section
as a maintainer.  Perhaps an update?

If you get a patch, I'd love to test it.

> On Sun, 6 January 2008 02:17:32 -0500, Erez Zadok wrote:
[...]
> > In lieu of a better fix, is this patch acceptable?
> 
> Not for me.  I don't mind if you keep such a hack until a proper
> solution if found in your private tree.  But it is a horrible solution
> to a problem introduced elsewhere.

Yup, that's what I figured.  I'll keep this ugly hack but I won't push it to
my unionfs tree on kernel.org.

> Ingo, Peter, does either of you actually care about this problem?  In
> the last round when I debugged this problem there was a notable lack of
> reaction from either of you.

The problem appears to be an interaction of two components--module loading
and lockdep--that's perhaps why it wasn't given enough attention.

I run a lot of regression tests w/ unionfs on top of many different file
systems, and monitoring for all sorts of problems, including lockdep
warnings.  So it's frustrating that each time I get to test w/ jffs2, I get
this lockdep warning, and lockdep shuts down until a reboot.  That kills my
automated testing and masks out possible other lockdep problems (that's why
I wanted to at least have some workaround).  Other lockdep problems I
reported with other subsystems had been fixed within days; this one,
unfortunately, has lingered for a while.  So if a fix can be made in a
reasonable time-frame, I'd really appreciate it.

Thanks,
Erez.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[GIT PULL -mm] 0/4 Unionfs updates/fixes/cleanups

2008-01-09 Thread Erez Zadok


The following is a series of patchsets related to Unionfs.  This is the
fourth set of patchsets resulting from an lkml review of the entire unionfs
code base, in preparation for a merge into mainline.  The most significant
changes here are a few locking/race bugfix related to branch-management.

These patches were tested (where appropriate) on Linus's 2.6.24 latest code
(as of v2.6.24-rc7-71-gfd0b45d), MM, as well as the backports to
2.6.{23,22,21,20,19,18,9} on ext2/3/4, xfs, reiserfs, nfs2/3/4, jffs2,
ramfs, tmpfs, cramfs, and squashfs (where available).  Also tested with
LTP-full and with a continuous parallel kernel compile (while forcing cache
flushing, manipulating lower branches, etc.).  See
http://unionfs.filesystems.org/ to download back-ported unionfs code.

Please pull from the 'master' branch of
git://git.kernel.org/pub/scm/linux/kernel/git/ezk/unionfs.git

to receive the following:

Erez Zadok (4):
  Unionfs: merged several printk KERN_CONT together into one pr_debug
  Unionfs: mmap fixes
  Unionfs: branch-management related locking fixes
  Unionfs: ensure we have lower dentries in d_iput

 commonfops.c |6 ++
 debug.c  |   51 +--
 dentry.c |9 +++--
 inode.c  |   17 +
 mmap.c   |   26 +-
 5 files changed, 76 insertions(+), 33 deletions(-)

---
Erez Zadok
[EMAIL PROTECTED]
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 3/4] Unionfs: branch-management related locking fixes

2008-01-09 Thread Erez Zadok

Add necessary locking to dentry/inode branch-configuration, so we get
consistent values during branch-management actions.  In d_revalidate_chain,
->permission, and ->create, also lock parent dentry.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/commonfops.c |6 ++
 fs/unionfs/dentry.c |6 +-
 fs/unionfs/inode.c  |   17 +
 3 files changed, 28 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 2c32ada..f37192f 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -318,6 +318,7 @@ int unionfs_file_revalidate(struct file *file, bool 
willwrite)
 * First revalidate the dentry inside struct file,
 * but not unhashed dentries.
 */
+reval_dentry:
if (unlikely(!d_deleted(dentry) &&
 !__unionfs_d_revalidate_chain(dentry, NULL, willwrite))) {
err = -ESTALE;
@@ -328,6 +329,11 @@ int unionfs_file_revalidate(struct file *file, bool 
willwrite)
dgen = atomic_read(&UNIONFS_D(dentry)->generation);
fgen = atomic_read(&UNIONFS_F(file)->generation);
 
+   if (unlikely(sbgen > dgen)) {
+   pr_debug("unionfs: retry dentry revalidation\n");
+   schedule();
+   goto reval_dentry;
+   }
BUG_ON(sbgen > dgen);
 
/*
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 7646828..d969640 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -203,7 +203,7 @@ bool is_newer_lower(const struct dentry *dentry)
if (!dentry || !UNIONFS_D(dentry))
return false;
inode = dentry->d_inode;
-   if (!inode || !UNIONFS_I(inode) ||
+   if (!inode || !UNIONFS_I(inode)->lower_inodes ||
ibstart(inode) < 0 || ibend(inode) < 0)
return false;
 
@@ -295,6 +295,8 @@ bool __unionfs_d_revalidate_chain(struct dentry *dentry, 
struct nameidata *nd,
chain_len = 0;
sbgen = atomic_read(&UNIONFS_SB(dentry->d_sb)->generation);
dtmp = dentry->d_parent;
+   if (dentry != dtmp)
+   unionfs_lock_dentry(dtmp, UNIONFS_DMUTEX_REVAL_PARENT);
dgen = atomic_read(&UNIONFS_D(dtmp)->generation);
/* XXX: should we check if is_newer_lower all the way up? */
if (unlikely(is_newer_lower(dtmp))) {
@@ -315,6 +317,8 @@ bool __unionfs_d_revalidate_chain(struct dentry *dentry, 
struct nameidata *nd,
}
purge_inode_data(dtmp->d_inode);
}
+   if (dentry != dtmp)
+   unionfs_unlock_dentry(dtmp);
while (sbgen != dgen) {
/* The root entry should always be valid */
BUG_ON(IS_ROOT(dtmp));
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 6095c4f..e15ddb9 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -30,6 +30,13 @@ static int unionfs_create(struct inode *parent, struct 
dentry *dentry,
struct nameidata lower_nd;
 
unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+   unionfs_lock_dentry(dentry->d_parent, UNIONFS_DMUTEX_PARENT);
+   valid = __unionfs_d_revalidate_chain(dentry->d_parent, nd, false);
+   unionfs_unlock_dentry(dentry->d_parent);
+   if (unlikely(!valid)) {
+   err = -ESTALE;  /* same as what real_lookup does */
+   goto out;
+   }
unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
 
valid = __unionfs_d_revalidate_chain(dentry, nd, false);
@@ -936,6 +943,14 @@ static int unionfs_permission(struct inode *inode, int 
mask,
const int is_file = !S_ISDIR(inode->i_mode);
const int write_mask = (mask & MAY_WRITE) && !(mask & MAY_READ);
 
+   if (nd)
+   unionfs_lock_dentry(nd->dentry, UNIONFS_DMUTEX_CHILD);
+
+   if (!UNIONFS_I(inode)->lower_inodes) {
+   if (is_file)/* dirs can be unlinked but chdir'ed to */
+   err = -ESTALE;  /* force revalidate */
+   goto out;
+   }
bstart = ibstart(inode);
bend = ibend(inode);
if (unlikely(bstart < 0 || bend < 0)) {
@@ -1003,6 +1018,8 @@ static int unionfs_permission(struct inode *inode, int 
mask,
 out:
unionfs_check_inode(inode);
unionfs_check_nd(nd);
+   if (nd)
+   unionfs_unlock_dentry(nd->dentry);
return err;
 }
 
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 2/4] Unionfs: mmap fixes

2008-01-09 Thread Erez Zadok

Ensure we have lower inodes in prepare/commit_write.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/mmap.c |   26 +-
 1 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index a0e654b..ad770ac 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -224,13 +224,26 @@ out:
 static int unionfs_prepare_write(struct file *file, struct page *page,
 unsigned from, unsigned to)
 {
+   int err;
+
+   unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
/*
-* Just copy lower inode attributes and return success.  Not much
-* else to do here.  No need to lock either (lockdep won't like it).
-* Let commit_write do all the hard work instead.
+* This is the only place where we unconditionally copy the lower
+* attribute times before calling unionfs_file_revalidate.  The
+* reason is that our ->write calls do_sync_write which in turn will
+* call our ->prepare_write and then ->commit_write.  Before our
+* ->write is called, the lower mtimes are in sync, but by the time
+* the VFS calls our ->commit_write, the lower mtimes have changed.
+* Therefore, the only reasonable time for us to sync up from the
+* changed lower mtimes, and avoid an invariant violation warning,
+* is here, in ->prepare_write.
 */
unionfs_copy_attr_times(file->f_path.dentry->d_inode);
-   return 0;
+   err = unionfs_file_revalidate(file, true);
+   unionfs_check_file(file);
+   unionfs_read_unlock(file->f_path.dentry->d_sb);
+
+   return err;
 }
 
 static int unionfs_commit_write(struct file *file, struct page *page,
@@ -252,7 +265,6 @@ static int unionfs_commit_write(struct file *file, struct 
page *page,
unionfs_check_file(file);
 
inode = page->mapping->host;
-   lower_inode = unionfs_lower_inode(inode);
 
if (UNIONFS_F(file) != NULL)
lower_file = unionfs_lower_file(file);
@@ -282,6 +294,10 @@ static int unionfs_commit_write(struct file *file, struct 
page *page,
goto out;
 
/* if vfs_write succeeded above, sync up our times/sizes */
+   lower_inode = lower_file->f_path.dentry->d_inode;
+   if (!lower_inode)
+   lower_inode = unionfs_lower_inode(inode);
+   BUG_ON(!lower_inode);
fsstack_copy_inode_size(inode, lower_inode);
unionfs_copy_attr_times(inode);
mark_inode_dirty_sync(inode);
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 4/4] Unionfs: ensure we have lower dentries in d_iput

2008-01-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dentry.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index d969640..cd15243 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -507,9 +507,10 @@ static void unionfs_d_iput(struct dentry *dentry, struct 
inode *inode)
 {
int bindex, rc;
 
+   BUG_ON(!dentry);
unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
 
-   if (dbstart(dentry) < 0)
+   if (!UNIONFS_D(dentry) || dbstart(dentry) < 0)
goto drop_lower_inodes;
for (bindex = dbstart(dentry); bindex <= dbend(dentry); bindex++) {
if (unionfs_lower_mnt_idx(dentry, bindex)) {
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 1/4] Unionfs: merged several printk KERN_CONT together into one pr_debug

2008-01-09 Thread Erez Zadok

CC: Joe Perches <[EMAIL PROTECTED]>

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/debug.c |   51 +--
 1 files changed, 25 insertions(+), 26 deletions(-)

diff --git a/fs/unionfs/debug.c b/fs/unionfs/debug.c
index 5f1d887..d154c32 100644
--- a/fs/unionfs/debug.c
+++ b/fs/unionfs/debug.c
@@ -472,16 +472,16 @@ void __show_inode_times(const struct inode *inode,
lower_inode = unionfs_lower_inode_idx(inode, bindex);
if (unlikely(!lower_inode))
continue;
-   pr_debug("IT(%lu:%d): ", inode->i_ino, bindex);
-   printk(KERN_CONT "%s:%s:%d ", file, fxn, line);
-   printk(KERN_CONT "um=%lu/%lu lm=%lu/%lu ",
-  inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
-  lower_inode->i_mtime.tv_sec,
-  lower_inode->i_mtime.tv_nsec);
-   printk(KERN_CONT "uc=%lu/%lu lc=%lu/%lu\n",
-  inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
-  lower_inode->i_ctime.tv_sec,
-  lower_inode->i_ctime.tv_nsec);
+   pr_debug("IT(%lu:%d): %s:%s:%d "
+"um=%lu/%lu lm=%lu/%lu uc=%lu/%lu lc=%lu/%lu\n",
+inode->i_ino, bindex,
+file, fxn, line,
+inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
+lower_inode->i_mtime.tv_sec,
+lower_inode->i_mtime.tv_nsec,
+inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
+lower_inode->i_ctime.tv_sec,
+lower_inode->i_ctime.tv_nsec);
}
 }
 
@@ -496,17 +496,16 @@ void __show_dinode_times(const struct dentry *dentry,
lower_inode = unionfs_lower_inode_idx(inode, bindex);
if (!lower_inode)
continue;
-   pr_debug("DT(%s:%lu:%d): ", dentry->d_name.name, inode->i_ino,
-bindex);
-   printk(KERN_CONT "%s:%s:%d ", file, fxn, line);
-   printk(KERN_CONT "um=%lu/%lu lm=%lu/%lu ",
-  inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
-  lower_inode->i_mtime.tv_sec,
-  lower_inode->i_mtime.tv_nsec);
-   printk(KERN_CONT "uc=%lu/%lu lc=%lu/%lu\n",
-  inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
-  lower_inode->i_ctime.tv_sec,
-  lower_inode->i_ctime.tv_nsec);
+   pr_debug("DT(%s:%lu:%d): %s:%s:%d "
+"um=%lu/%lu lm=%lu/%lu uc=%lu/%lu lc=%lu/%lu\n",
+dentry->d_name.name, inode->i_ino, bindex,
+file, fxn, line,
+inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
+lower_inode->i_mtime.tv_sec,
+lower_inode->i_mtime.tv_nsec,
+inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
+lower_inode->i_ctime.tv_sec,
+lower_inode->i_ctime.tv_nsec);
}
 }
 
@@ -525,10 +524,10 @@ void __show_inode_counts(const struct inode *inode,
lower_inode = unionfs_lower_inode_idx(inode, bindex);
if (unlikely(!lower_inode))
continue;
-   printk(KERN_CONT "SIC(%lu:%d:%d): ", inode->i_ino, bindex,
-  atomic_read(&(inode)->i_count));
-   printk(KERN_CONT "lc=%d ",
-  atomic_read(&(lower_inode)->i_count));
-   printk(KERN_CONT "%s:%s:%d\n", file, fxn, line);
+   pr_debug("SIC(%lu:%d:%d): lc=%d %s:%s:%d\n",
+inode->i_ino, bindex,
+atomic_read(&(inode)->i_count),
+atomic_read(&(lower_inode)->i_count),
+file, fxn, line);
}
 }
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 09/29] Unionfs: lower-level copyup routines

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/copyup.c |  899 +++
 1 files changed, 899 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/copyup.c

diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
new file mode 100644
index 000..16b2c7c
--- /dev/null
+++ b/fs/unionfs/copyup.c
@@ -0,0 +1,899 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * For detailed explanation of copyup see:
+ * Documentation/filesystems/unionfs/concepts.txt
+ */
+
+#ifdef CONFIG_UNION_FS_XATTR
+/* copyup all extended attrs for a given dentry */
+static int copyup_xattrs(struct dentry *old_lower_dentry,
+struct dentry *new_lower_dentry)
+{
+   int err = 0;
+   ssize_t list_size = -1;
+   char *name_list = NULL;
+   char *attr_value = NULL;
+   char *name_list_buf = NULL;
+
+   /* query the actual size of the xattr list */
+   list_size = vfs_listxattr(old_lower_dentry, NULL, 0);
+   if (list_size <= 0) {
+   err = list_size;
+   goto out;
+   }
+
+   /* allocate space for the actual list */
+   name_list = unionfs_xattr_alloc(list_size + 1, XATTR_LIST_MAX);
+   if (unlikely(!name_list || IS_ERR(name_list))) {
+   err = PTR_ERR(name_list);
+   goto out;
+   }
+
+   name_list_buf = name_list; /* save for kfree at end */
+
+   /* now get the actual xattr list of the source file */
+   list_size = vfs_listxattr(old_lower_dentry, name_list, list_size);
+   if (list_size <= 0) {
+   err = list_size;
+   goto out;
+   }
+
+   /* allocate space to hold each xattr's value */
+   attr_value = unionfs_xattr_alloc(XATTR_SIZE_MAX, XATTR_SIZE_MAX);
+   if (unlikely(!attr_value || IS_ERR(attr_value))) {
+   err = PTR_ERR(name_list);
+   goto out;
+   }
+
+   /* in a loop, get and set each xattr from src to dst file */
+   while (*name_list) {
+   ssize_t size;
+
+   /* Lock here since vfs_getxattr doesn't lock for us */
+   mutex_lock(&old_lower_dentry->d_inode->i_mutex);
+   size = vfs_getxattr(old_lower_dentry, name_list,
+   attr_value, XATTR_SIZE_MAX);
+   mutex_unlock(&old_lower_dentry->d_inode->i_mutex);
+   if (size < 0) {
+   err = size;
+   goto out;
+   }
+   if (size > XATTR_SIZE_MAX) {
+   err = -E2BIG;
+   goto out;
+   }
+   /* Don't lock here since vfs_setxattr does it for us. */
+   err = vfs_setxattr(new_lower_dentry, name_list, attr_value,
+  size, 0);
+   /*
+* Selinux depends on "security.*" xattrs, so to maintain
+* the security of copied-up files, if Selinux is active,
+* then we must copy these xattrs as well.  So we need to
+* temporarily get FOWNER privileges.
+* XXX: move entire copyup code to SIOQ.
+*/
+   if (err == -EPERM && !capable(CAP_FOWNER)) {
+   cap_raise(current->cap_effective, CAP_FOWNER);
+   err = vfs_setxattr(new_lower_dentry, name_list,
+  attr_value, size, 0);
+   cap_lower(current->cap_effective, CAP_FOWNER);
+   }
+   if (err < 0)
+   goto out;
+   name_list += strlen(name_list) + 1;
+   }
+out:
+   unionfs_xattr_kfree(name_list_buf);
+   unionfs_xattr_kfree(attr_value);
+   /* Ignore if xattr isn't supported */
+   if (err == -ENOTSUPP || err == -EOPNOTSUPP)
+   err = 0;
+   return err;
+}
+#endif /* CONFIG_UNION_FS_XATTR */
+
+/*
+ * Determine the mode based on the copyup flags, and the existing dentry.
+ *
+ * Handle file systems which may not support certain options.  For example
+ * jffs2 doesn't allow one to chmod a sy

[PATCH 17/29] Unionfs: unlink/rmdir operations

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/unlink.c |  251 +++
 1 files changed, 251 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/unlink.c

diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
new file mode 100644
index 000..1e370a1
--- /dev/null
+++ b/fs/unionfs/unlink.c
@@ -0,0 +1,251 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/* unlink a file by creating a whiteout */
+static int unionfs_unlink_whiteout(struct inode *dir, struct dentry *dentry)
+{
+   struct dentry *lower_dentry;
+   struct dentry *lower_dir_dentry;
+   int bindex;
+   int err = 0;
+
+   err = unionfs_partial_lookup(dentry);
+   if (err)
+   goto out;
+
+   bindex = dbstart(dentry);
+
+   lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   if (!lower_dentry)
+   goto out;
+
+   lower_dir_dentry = lock_parent(lower_dentry);
+
+   /* avoid destroying the lower inode if the file is in use */
+   dget(lower_dentry);
+   err = is_robranch_super(dentry->d_sb, bindex);
+   if (!err) {
+   /* see Documentation/filesystems/unionfs/issues.txt */
+   lockdep_off();
+   err = vfs_unlink(lower_dir_dentry->d_inode, lower_dentry);
+   lockdep_on();
+   }
+   /* if vfs_unlink succeeded, update our inode's times */
+   if (!err)
+   unionfs_copy_attr_times(dentry->d_inode);
+   dput(lower_dentry);
+   fsstack_copy_attr_times(dir, lower_dir_dentry->d_inode);
+   unlock_dir(lower_dir_dentry);
+
+   if (err && !IS_COPYUP_ERR(err))
+   goto out;
+
+   /*
+* We create whiteouts if (1) there was an error unlinking the main
+* file; (2) there is a lower priority file with the same name
+* (dbopaque); (3) the branch in which the file is not the last
+* (rightmost0 branch.  The last rule is an optimization to avoid
+* creating all those whiteouts if there's no chance they'd be
+* masking any lower-priority branch, as well as unionfs is used
+* with only one branch (using only one branch, while odd, is still
+* possible).
+*/
+   if (err) {
+   if (dbstart(dentry) == 0)
+   goto out;
+   err = create_whiteout(dentry, dbstart(dentry) - 1);
+   } else if (dbopaque(dentry) != -1) {
+   err = create_whiteout(dentry, dbopaque(dentry));
+   } else if (dbstart(dentry) < sbend(dentry->d_sb)) {
+   err = create_whiteout(dentry, dbstart(dentry));
+   }
+
+out:
+   if (!err)
+   inode_dec_link_count(dentry->d_inode);
+
+   /* We don't want to leave negative leftover dentries for revalidate. */
+   if (!err && (dbopaque(dentry) != -1))
+   update_bstart(dentry);
+
+   return err;
+}
+
+int unionfs_unlink(struct inode *dir, struct dentry *dentry)
+{
+   int err = 0;
+   struct inode *inode = dentry->d_inode;
+
+   BUG_ON(S_ISDIR(inode->i_mode));
+   unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
+
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
+   err = -ESTALE;
+   goto out;
+   }
+   unionfs_check_dentry(dentry);
+
+   err = unionfs_unlink_whiteout(dir, dentry);
+   /* call d_drop so the system "forgets" about us */
+   if (!err) {
+   unionfs_postcopyup_release(dentry);
+   if (inode->i_nlink == 0) {
+   /* drop lower inodes */
+   iput(unionfs_lower_inode(inode));
+   unionfs_set_lower_inode(inode, NULL);
+   ibstart(inode) = ibend(inode) = -1;
+   }
+   d_drop(dentry);
+   /*
+* if unlink/whiteout succeeded, parent dir mtime has
+* changed
+*/
+   unionfs_copy_attr_times(dir);
+   }
+
+out:
+   if (!err) {
+   unionfs_check

[PATCH 01/29] Unionfs: documentation

2008-01-10 Thread Erez Zadok

Includes index files, MAINTAINERS, and documentation on general concepts,
usage, issues, and renaming operations.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 Documentation/filesystems/00-INDEX |2 +
 Documentation/filesystems/unionfs/00-INDEX |   10 +
 Documentation/filesystems/unionfs/concepts.txt |  213 
 Documentation/filesystems/unionfs/issues.txt   |   28 +++
 Documentation/filesystems/unionfs/rename.txt   |   31 
 Documentation/filesystems/unionfs/usage.txt|  134 +++
 MAINTAINERS|9 +
 7 files changed, 427 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/filesystems/unionfs/00-INDEX
 create mode 100644 Documentation/filesystems/unionfs/concepts.txt
 create mode 100644 Documentation/filesystems/unionfs/issues.txt
 create mode 100644 Documentation/filesystems/unionfs/rename.txt
 create mode 100644 Documentation/filesystems/unionfs/usage.txt

diff --git a/Documentation/filesystems/00-INDEX 
b/Documentation/filesystems/00-INDEX
index 1de155e..b168331 100644
--- a/Documentation/filesystems/00-INDEX
+++ b/Documentation/filesystems/00-INDEX
@@ -96,6 +96,8 @@ udf.txt
- info and mount options for the UDF filesystem.
 ufs.txt
- info on the ufs filesystem.
+unionfs/
+   - info on the unionfs filesystem
 vfat.txt
- info on using the VFAT filesystem used in Windows NT and Windows 95
 vfs.txt
diff --git a/Documentation/filesystems/unionfs/00-INDEX 
b/Documentation/filesystems/unionfs/00-INDEX
new file mode 100644
index 000..96fdf67
--- /dev/null
+++ b/Documentation/filesystems/unionfs/00-INDEX
@@ -0,0 +1,10 @@
+00-INDEX
+   - this file.
+concepts.txt
+   - A brief introduction of concepts.
+issues.txt
+   - A summary of known issues with unionfs.
+rename.txt
+   - Information regarding rename operations.
+usage.txt
+   - Usage information and examples.
diff --git a/Documentation/filesystems/unionfs/concepts.txt 
b/Documentation/filesystems/unionfs/concepts.txt
new file mode 100644
index 000..bed69bd
--- /dev/null
+++ b/Documentation/filesystems/unionfs/concepts.txt
@@ -0,0 +1,213 @@
+Unionfs 2.x CONCEPTS:
+=
+
+This file describes the concepts needed by a namespace unification file
+system.
+
+
+Branch Priority:
+
+
+Each branch is assigned a unique priority - starting from 0 (highest
+priority).  No two branches can have the same priority.
+
+
+Branch Mode:
+
+
+Each branch is assigned a mode - read-write or read-only. This allows
+directories on media mounted read-write to be used in a read-only manner.
+
+
+Whiteouts:
+==
+
+A whiteout removes a file name from the namespace. Whiteouts are needed when
+one attempts to remove a file on a read-only branch.
+
+Suppose we have a two-branch union, where branch 0 is read-write and branch
+1 is read-only. And a file 'foo' on branch 1:
+
+./b0/
+./b1/
+./b1/foo
+
+The unified view would simply be:
+
+./union/
+./union/foo
+
+Since 'foo' is stored on a read-only branch, it cannot be removed. A
+whiteout is used to remove the name 'foo' from the unified namespace. Again,
+since branch 1 is read-only, the whiteout cannot be created there. So, we
+try on a higher priority (lower numerically) branch and create the whiteout
+there.
+
+./b0/
+./b0/.wh.foo
+./b1/
+./b1/foo
+
+Later, when Unionfs traverses branches (due to lookup or readdir), it
+eliminate 'foo' from the namespace (as well as the whiteout itself.)
+
+
+Duplicate Elimination:
+==
+
+It is possible for files on different branches to have the same name.
+Unionfs then has to select which instance of the file to show to the user.
+Given the fact that each branch has a priority associated with it, the
+simplest solution is to take the instance from the highest priority
+(numerically lowest value) and "hide" the others.
+
+
+Copyup:
+===
+
+When a change is made to the contents of a file's data or meta-data, they
+have to be stored somewhere.  The best way is to create a copy of the
+original file on a branch that is writable, and then redirect the write
+though to this copy.  The copy must be made on a higher priority branch so
+that lookup and readdir return this newer "version" of the file rather than
+the original (see duplicate elimination).
+
+An entire unionfs mount can be read-only or read-write.  If it's read-only,
+then none of the branches will be written to, even if some of the branches
+are physically writeable.  If the unionfs mount is read-write, then the
+leftmost (highest priority) branch must be writeable (for copyup to take
+place); the remaining branches can be any mix of read-write and read-only.
+
+In a writeable mount, unionfs will create new files/dir in the leftmost
+branch.  If one tries to modify a file in a read-only branch/media, unionfs
+will copyup the fi

[PATCH 11/29] Unionfs: lower-level lookup routines

2008-01-10 Thread Erez Zadok

Includes lower nameidata support routines.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/lookup.c |  652 +++
 1 files changed, 652 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/lookup.c

diff --git a/fs/unionfs/lookup.c b/fs/unionfs/lookup.c
new file mode 100644
index 000..b9ee072
--- /dev/null
+++ b/fs/unionfs/lookup.c
@@ -0,0 +1,652 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+static int realloc_dentry_private_data(struct dentry *dentry);
+
+/* is the filename valid == !(whiteout for a file or opaque dir marker) */
+static int is_validname(const char *name)
+{
+   if (!strncmp(name, UNIONFS_WHPFX, UNIONFS_WHLEN))
+   return 0;
+   if (!strncmp(name, UNIONFS_DIR_OPAQUE_NAME,
+sizeof(UNIONFS_DIR_OPAQUE_NAME) - 1))
+   return 0;
+   return 1;
+}
+
+/* The rest of these are utility functions for lookup. */
+static noinline int is_opaque_dir(struct dentry *dentry, int bindex)
+{
+   int err = 0;
+   struct dentry *lower_dentry;
+   struct dentry *wh_lower_dentry;
+   struct inode *lower_inode;
+   struct sioq_args args;
+
+   lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   lower_inode = lower_dentry->d_inode;
+
+   BUG_ON(!S_ISDIR(lower_inode->i_mode));
+
+   mutex_lock(&lower_inode->i_mutex);
+
+   if (!permission(lower_inode, MAY_EXEC, NULL)) {
+   wh_lower_dentry =
+   lookup_one_len(UNIONFS_DIR_OPAQUE, lower_dentry,
+  sizeof(UNIONFS_DIR_OPAQUE) - 1);
+   } else {
+   args.is_opaque.dentry = lower_dentry;
+   run_sioq(__is_opaque_dir, &args);
+   wh_lower_dentry = args.ret;
+   }
+
+   mutex_unlock(&lower_inode->i_mutex);
+
+   if (IS_ERR(wh_lower_dentry)) {
+   err = PTR_ERR(wh_lower_dentry);
+   goto out;
+   }
+
+   /* This is an opaque dir iff wh_lower_dentry is positive */
+   err = !!wh_lower_dentry->d_inode;
+
+   dput(wh_lower_dentry);
+out:
+   return err;
+}
+
+/*
+ * Main (and complex) driver function for Unionfs's lookup
+ *
+ * Returns: NULL (ok), ERR_PTR if an error occurred, or a non-null non-error
+ * PTR if d_splice returned a different dentry.
+ *
+ * If lookupmode is INTERPOSE_PARTIAL/REVAL/REVAL_NEG, the passed dentry's
+ * inode info must be locked.  If lookupmode is INTERPOSE_LOOKUP (i.e., a
+ * newly looked-up dentry), then unionfs_lookup_backend will return a locked
+ * dentry's info, which the caller must unlock.
+ */
+struct dentry *unionfs_lookup_backend(struct dentry *dentry,
+ struct nameidata *nd, int lookupmode)
+{
+   int err = 0;
+   struct dentry *lower_dentry = NULL;
+   struct dentry *wh_lower_dentry = NULL;
+   struct dentry *lower_dir_dentry = NULL;
+   struct dentry *parent_dentry = NULL;
+   struct dentry *d_interposed = NULL;
+   int bindex, bstart = -1, bend, bopaque;
+   int dentry_count = 0;   /* Number of positive dentries. */
+   int first_dentry_offset = -1; /* -1 is uninitialized */
+   struct dentry *first_dentry = NULL;
+   struct dentry *first_lower_dentry = NULL;
+   struct vfsmount *first_lower_mnt = NULL;
+   int opaque;
+   char *whname = NULL;
+   const char *name;
+   int namelen;
+
+   /*
+* We should already have a lock on this dentry in the case of a
+* partial lookup, or a revalidation. Otherwise it is returned from
+* new_dentry_private_data already locked.
+*/
+   if (lookupmode == INTERPOSE_PARTIAL || lookupmode == INTERPOSE_REVAL ||
+   lookupmode == INTERPOSE_REVAL_NEG)
+   verify_locked(dentry);
+   else/* this could only be INTERPOSE_LOOKUP */
+   BUG_ON(UNIONFS_D(dentry) != NULL);
+
+   switch (lookupmode) {
+   case INTERPOSE_PARTIAL:
+   break;
+   case INTERPOSE_LOOKUP:
+   err = new_dentry_private_data(dentry, UNIONFS_DMUTEX_CHILD);
+   if (unlikely(err))
+

[PATCH 21/29] Unionfs: extended attributes operations

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/xattr.c |  153 
 1 files changed, 153 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/xattr.c

diff --git a/fs/unionfs/xattr.c b/fs/unionfs/xattr.c
new file mode 100644
index 000..8001c65
--- /dev/null
+++ b/fs/unionfs/xattr.c
@@ -0,0 +1,153 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/* This is lifted from fs/xattr.c */
+void *unionfs_xattr_alloc(size_t size, size_t limit)
+{
+   void *ptr;
+
+   if (size > limit)
+   return ERR_PTR(-E2BIG);
+
+   if (!size)  /* size request, no buffer is needed */
+   return NULL;
+
+   ptr = kmalloc(size, GFP_KERNEL);
+   if (unlikely(!ptr))
+   return ERR_PTR(-ENOMEM);
+   return ptr;
+}
+
+/*
+ * BKL held by caller.
+ * dentry->d_inode->i_mutex locked
+ */
+ssize_t unionfs_getxattr(struct dentry *dentry, const char *name, void *value,
+size_t size)
+{
+   struct dentry *lower_dentry = NULL;
+   int err = -EOPNOTSUPP;
+
+   unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
+
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
+   err = -ESTALE;
+   goto out;
+   }
+
+   lower_dentry = unionfs_lower_dentry(dentry);
+
+   err = vfs_getxattr(lower_dentry, (char *) name, value, size);
+
+out:
+   unionfs_check_dentry(dentry);
+   unionfs_unlock_dentry(dentry);
+   unionfs_read_unlock(dentry->d_sb);
+   return err;
+}
+
+/*
+ * BKL held by caller.
+ * dentry->d_inode->i_mutex locked
+ */
+int unionfs_setxattr(struct dentry *dentry, const char *name,
+const void *value, size_t size, int flags)
+{
+   struct dentry *lower_dentry = NULL;
+   int err = -EOPNOTSUPP;
+
+   unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
+
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
+   err = -ESTALE;
+   goto out;
+   }
+
+   lower_dentry = unionfs_lower_dentry(dentry);
+
+   err = vfs_setxattr(lower_dentry, (char *) name, (void *) value,
+  size, flags);
+
+out:
+   unionfs_check_dentry(dentry);
+   unionfs_unlock_dentry(dentry);
+   unionfs_read_unlock(dentry->d_sb);
+   return err;
+}
+
+/*
+ * BKL held by caller.
+ * dentry->d_inode->i_mutex locked
+ */
+int unionfs_removexattr(struct dentry *dentry, const char *name)
+{
+   struct dentry *lower_dentry = NULL;
+   int err = -EOPNOTSUPP;
+
+   unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
+
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
+   err = -ESTALE;
+   goto out;
+   }
+
+   lower_dentry = unionfs_lower_dentry(dentry);
+
+   err = vfs_removexattr(lower_dentry, (char *) name);
+
+out:
+   unionfs_check_dentry(dentry);
+   unionfs_unlock_dentry(dentry);
+   unionfs_read_unlock(dentry->d_sb);
+   return err;
+}
+
+/*
+ * BKL held by caller.
+ * dentry->d_inode->i_mutex locked
+ */
+ssize_t unionfs_listxattr(struct dentry *dentry, char *list, size_t size)
+{
+   struct dentry *lower_dentry = NULL;
+   int err = -EOPNOTSUPP;
+   char *encoded_list = NULL;
+
+   unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
+
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
+   err = -ESTALE;
+   goto out;
+   }
+
+   lower_dentry = unionfs_lower_dentry(dentry);
+
+   encoded_list = list;
+   err = vfs_listxattr(lower_dentry, encoded_list, size);
+
+out:
+   unionfs_check_dentry(dentry);
+   unionfs_unlock_dentry(dentry);
+   unionfs_read_unlock(dentry->d_sb);
+   return err;
+}
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of

[PATCH 10/29] Unionfs: dentry revalidation

2008-01-10 Thread Erez Zadok

Includes d_release methods and cache-coherency support for dentries.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dentry.c |  548 +++
 1 files changed, 548 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/dentry.c

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
new file mode 100644
index 000..cd15243
--- /dev/null
+++ b/fs/unionfs/dentry.c
@@ -0,0 +1,548 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * Revalidate a single dentry.
+ * Assume that dentry's info node is locked.
+ * Assume that parent(s) are all valid already, but
+ * the child may not yet be valid.
+ * Returns true if valid, false otherwise.
+ */
+static bool __unionfs_d_revalidate_one(struct dentry *dentry,
+  struct nameidata *nd)
+{
+   bool valid = true;  /* default is valid */
+   struct dentry *lower_dentry;
+   int bindex, bstart, bend;
+   int sbgen, dgen;
+   int positive = 0;
+   int interpose_flag;
+   struct nameidata lowernd; /* TODO: be gentler to the stack */
+
+   if (nd)
+   memcpy(&lowernd, nd, sizeof(struct nameidata));
+   else
+   memset(&lowernd, 0, sizeof(struct nameidata));
+
+   verify_locked(dentry);
+   verify_locked(dentry->d_parent);
+
+   /* if the dentry is unhashed, do NOT revalidate */
+   if (d_deleted(dentry))
+   goto out;
+
+   BUG_ON(dbstart(dentry) == -1);
+   if (dentry->d_inode)
+   positive = 1;
+   dgen = atomic_read(&UNIONFS_D(dentry)->generation);
+   sbgen = atomic_read(&UNIONFS_SB(dentry->d_sb)->generation);
+   /*
+* If we are working on an unconnected dentry, then there is no
+* revalidation to be done, because this file does not exist within
+* the namespace, and Unionfs operates on the namespace, not data.
+*/
+   if (unlikely(sbgen != dgen)) {
+   struct dentry *result;
+   int pdgen;
+
+   /* The root entry should always be valid */
+   BUG_ON(IS_ROOT(dentry));
+
+   /* We can't work correctly if our parent isn't valid. */
+   pdgen = atomic_read(&UNIONFS_D(dentry->d_parent)->generation);
+   BUG_ON(pdgen != sbgen); /* should never happen here */
+
+   /* Free the pointers for our inodes and this dentry. */
+   bstart = dbstart(dentry);
+   bend = dbend(dentry);
+   if (bstart >= 0) {
+   struct dentry *lower_dentry;
+   for (bindex = bstart; bindex <= bend; bindex++) {
+   lower_dentry =
+   unionfs_lower_dentry_idx(dentry,
+bindex);
+   dput(lower_dentry);
+   }
+   }
+   set_dbstart(dentry, -1);
+   set_dbend(dentry, -1);
+
+   interpose_flag = INTERPOSE_REVAL_NEG;
+   if (positive) {
+   interpose_flag = INTERPOSE_REVAL;
+
+   bstart = ibstart(dentry->d_inode);
+   bend = ibend(dentry->d_inode);
+   if (bstart >= 0) {
+   struct inode *lower_inode;
+   for (bindex = bstart; bindex <= bend;
+bindex++) {
+   lower_inode =
+   unionfs_lower_inode_idx(
+   dentry->d_inode,
+   bindex);
+   iput(lower_inode);
+   }
+   }
+   kfree(UNIONFS_I(dentry->d_inode)->lower_inodes);
+   UNIONFS_I(dentry->d_inode)->lower_inodes = NULL;
+   ibstart(dentry->d_inode) = -1;
+

[PATCH 26/29] Unionfs: common header file for user-land utilities and kernel

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 include/linux/union_fs.h |   24 
 1 files changed, 24 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/union_fs.h

diff --git a/include/linux/union_fs.h b/include/linux/union_fs.h
new file mode 100644
index 000..9d601d2
--- /dev/null
+++ b/include/linux/union_fs.h
@@ -0,0 +1,24 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _LINUX_UNION_FS_H
+#define _LINUX_UNION_FS_H
+
+#define UNIONFS_VERSION  "2.2-mm"
+
+/*
+ * DEFINITIONS FOR USER AND KERNEL CODE:
+ */
+# define UNIONFS_IOCTL_INCGEN  _IOR(0x15, 11, int)
+# define UNIONFS_IOCTL_QUERYFILE   _IOR(0x15, 15, int)
+
+#endif /* _LINUX_UNIONFS_H */
+
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 12/29] Unionfs: rename method and helpers

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/rename.c |  531 +++
 1 files changed, 531 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/rename.c

diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
new file mode 100644
index 000..9306a2b
--- /dev/null
+++ b/fs/unionfs/rename.c
@@ -0,0 +1,531 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
+   struct inode *new_dir, struct dentry *new_dentry,
+   int bindex, struct dentry **wh_old)
+{
+   int err = 0;
+   struct dentry *lower_old_dentry;
+   struct dentry *lower_new_dentry;
+   struct dentry *lower_old_dir_dentry;
+   struct dentry *lower_new_dir_dentry;
+   struct dentry *lower_wh_dentry;
+   struct dentry *lower_wh_dir_dentry;
+   char *wh_name = NULL;
+
+   lower_new_dentry = unionfs_lower_dentry_idx(new_dentry, bindex);
+   lower_old_dentry = unionfs_lower_dentry_idx(old_dentry, bindex);
+
+   if (!lower_new_dentry) {
+   lower_new_dentry =
+   create_parents(new_dentry->d_parent->d_inode,
+  new_dentry, new_dentry->d_name.name,
+  bindex);
+   if (IS_ERR(lower_new_dentry)) {
+   err = PTR_ERR(lower_new_dentry);
+   if (IS_COPYUP_ERR(err))
+   goto out;
+   printk(KERN_ERR "unionfs: error creating directory "
+  "tree for rename, bindex=%d err=%d\n",
+  bindex, err);
+   goto out;
+   }
+   }
+
+   wh_name = alloc_whname(new_dentry->d_name.name,
+  new_dentry->d_name.len);
+   if (unlikely(IS_ERR(wh_name))) {
+   err = PTR_ERR(wh_name);
+   goto out;
+   }
+
+   lower_wh_dentry = lookup_one_len(wh_name, lower_new_dentry->d_parent,
+new_dentry->d_name.len +
+UNIONFS_WHLEN);
+   if (IS_ERR(lower_wh_dentry)) {
+   err = PTR_ERR(lower_wh_dentry);
+   goto out;
+   }
+
+   if (lower_wh_dentry->d_inode) {
+   /* get rid of the whiteout that is existing */
+   if (lower_new_dentry->d_inode) {
+   printk(KERN_ERR "unionfs: both a whiteout and a "
+  "dentry exist when doing a rename!\n");
+   err = -EIO;
+
+   dput(lower_wh_dentry);
+   goto out;
+   }
+
+   lower_wh_dir_dentry = lock_parent_wh(lower_wh_dentry);
+   err = is_robranch_super(old_dentry->d_sb, bindex);
+   if (!err)
+   err = vfs_unlink(lower_wh_dir_dentry->d_inode,
+lower_wh_dentry);
+
+   dput(lower_wh_dentry);
+   unlock_dir(lower_wh_dir_dentry);
+   if (err)
+   goto out;
+   } else {
+   dput(lower_wh_dentry);
+   }
+
+   err = is_robranch_super(old_dentry->d_sb, bindex);
+   if (err)
+   goto out;
+
+   dget(lower_old_dentry);
+   lower_old_dir_dentry = dget_parent(lower_old_dentry);
+   lower_new_dir_dentry = dget_parent(lower_new_dentry);
+
+   /*
+* ready to whiteout for old_dentry. caller will create the actual
+* whiteout, and must dput(*wh_old)
+*/
+   if (wh_old) {
+   char *whname;
+   whname = alloc_whname(old_dentry->d_name.name,
+ old_dentry->d_name.len);
+   err = PTR_ERR(whname);
+   if (unlikely(IS_ERR(whname)))
+   goto out_dput;
+   *wh_old = lookup_one_len(whname, lower_old_dir_dentry,
+old_dentry->d_name.len +
+

[PATCH 14/29] Unionfs: readdir helper functions

2008-01-10 Thread Erez Zadok

Includes whiteout handling for directories.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dirhelper.c |  267 
 1 files changed, 267 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/dirhelper.c

diff --git a/fs/unionfs/dirhelper.c b/fs/unionfs/dirhelper.c
new file mode 100644
index 000..4b73bb6
--- /dev/null
+++ b/fs/unionfs/dirhelper.c
@@ -0,0 +1,267 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * Delete all of the whiteouts in a given directory for rmdir.
+ *
+ * lower directory inode should be locked
+ */
+int do_delete_whiteouts(struct dentry *dentry, int bindex,
+   struct unionfs_dir_state *namelist)
+{
+   int err = 0;
+   struct dentry *lower_dir_dentry = NULL;
+   struct dentry *lower_dentry;
+   char *name = NULL, *p;
+   struct inode *lower_dir;
+   int i;
+   struct list_head *pos;
+   struct filldir_node *cursor;
+
+   /* Find out lower parent dentry */
+   lower_dir_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   BUG_ON(!S_ISDIR(lower_dir_dentry->d_inode->i_mode));
+   lower_dir = lower_dir_dentry->d_inode;
+   BUG_ON(!S_ISDIR(lower_dir->i_mode));
+
+   err = -ENOMEM;
+   name = __getname();
+   if (unlikely(!name))
+   goto out;
+   strcpy(name, UNIONFS_WHPFX);
+   p = name + UNIONFS_WHLEN;
+
+   err = 0;
+   for (i = 0; !err && i < namelist->size; i++) {
+   list_for_each(pos, &namelist->list[i]) {
+   cursor =
+   list_entry(pos, struct filldir_node,
+  file_list);
+   /* Only operate on whiteouts in this branch. */
+   if (cursor->bindex != bindex)
+   continue;
+   if (!cursor->whiteout)
+   continue;
+
+   strcpy(p, cursor->name);
+   lower_dentry =
+   lookup_one_len(name, lower_dir_dentry,
+  cursor->namelen +
+  UNIONFS_WHLEN);
+   if (IS_ERR(lower_dentry)) {
+   err = PTR_ERR(lower_dentry);
+   break;
+   }
+   if (lower_dentry->d_inode)
+   err = vfs_unlink(lower_dir, lower_dentry);
+   dput(lower_dentry);
+   if (err)
+   break;
+   }
+   }
+
+   __putname(name);
+
+   /* After all of the removals, we should copy the attributes once. */
+   fsstack_copy_attr_times(dentry->d_inode, lower_dir_dentry->d_inode);
+
+out:
+   return err;
+}
+
+/* delete whiteouts in a dir (for rmdir operation) using sioq if necessary */
+int delete_whiteouts(struct dentry *dentry, int bindex,
+struct unionfs_dir_state *namelist)
+{
+   int err;
+   struct super_block *sb;
+   struct dentry *lower_dir_dentry;
+   struct inode *lower_dir;
+   struct sioq_args args;
+
+   sb = dentry->d_sb;
+
+   BUG_ON(!S_ISDIR(dentry->d_inode->i_mode));
+   BUG_ON(bindex < dbstart(dentry));
+   BUG_ON(bindex > dbend(dentry));
+   err = is_robranch_super(sb, bindex);
+   if (err)
+   goto out;
+
+   lower_dir_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   BUG_ON(!S_ISDIR(lower_dir_dentry->d_inode->i_mode));
+   lower_dir = lower_dir_dentry->d_inode;
+   BUG_ON(!S_ISDIR(lower_dir->i_mode));
+
+   if (!permission(lower_dir, MAY_WRITE | MAY_EXEC, NULL)) {
+   err = do_delete_whiteouts(dentry, bindex, namelist);
+   } else {
+   args.deletewh.namelist = namelist;
+   args.deletewh.dentry = dentry;
+   args.deletewh.bindex = bindex;
+   run_sioq(__delete_whiteouts, &args);
+   err = args.err;
+   }
+
+out:
+   retur

[PATCH 22/29] Unionfs: async I/O queue

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/sioq.c |  119 +
 fs/unionfs/sioq.h |   92 +
 2 files changed, 211 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/sioq.c
 create mode 100644 fs/unionfs/sioq.h

diff --git a/fs/unionfs/sioq.c b/fs/unionfs/sioq.c
new file mode 100644
index 000..2a8c88e
--- /dev/null
+++ b/fs/unionfs/sioq.c
@@ -0,0 +1,119 @@
+/*
+ * Copyright (c) 2006-2007 Erez Zadok
+ * Copyright (c) 2006  Charles P. Wright
+ * Copyright (c) 2006-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2006  Junjiro Okajima
+ * Copyright (c) 2006  David P. Quigley
+ * Copyright (c) 2006-2007 Stony Brook University
+ * Copyright (c) 2006-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * Super-user IO work Queue - sometimes we need to perform actions which
+ * would fail due to the unix permissions on the parent directory (e.g.,
+ * rmdir a directory which appears empty, but in reality contains
+ * whiteouts).
+ */
+
+static struct workqueue_struct *superio_workqueue;
+
+int __init init_sioq(void)
+{
+   int err;
+
+   superio_workqueue = create_workqueue("unionfs_siod");
+   if (!IS_ERR(superio_workqueue))
+   return 0;
+
+   err = PTR_ERR(superio_workqueue);
+   printk(KERN_ERR "unionfs: create_workqueue failed %d\n", err);
+   superio_workqueue = NULL;
+   return err;
+}
+
+void stop_sioq(void)
+{
+   if (superio_workqueue)
+   destroy_workqueue(superio_workqueue);
+}
+
+void run_sioq(work_func_t func, struct sioq_args *args)
+{
+   INIT_WORK(&args->work, func);
+
+   init_completion(&args->comp);
+   while (!queue_work(superio_workqueue, &args->work)) {
+   /* TODO: do accounting if needed */
+   schedule();
+   }
+   wait_for_completion(&args->comp);
+}
+
+void __unionfs_create(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct create_args *c = &args->create;
+
+   args->err = vfs_create(c->parent, c->dentry, c->mode, c->nd);
+   complete(&args->comp);
+}
+
+void __unionfs_mkdir(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct mkdir_args *m = &args->mkdir;
+
+   args->err = vfs_mkdir(m->parent, m->dentry, m->mode);
+   complete(&args->comp);
+}
+
+void __unionfs_mknod(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct mknod_args *m = &args->mknod;
+
+   args->err = vfs_mknod(m->parent, m->dentry, m->mode, m->dev);
+   complete(&args->comp);
+}
+
+void __unionfs_symlink(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct symlink_args *s = &args->symlink;
+
+   args->err = vfs_symlink(s->parent, s->dentry, s->symbuf, s->mode);
+   complete(&args->comp);
+}
+
+void __unionfs_unlink(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct unlink_args *u = &args->unlink;
+
+   args->err = vfs_unlink(u->parent, u->dentry);
+   complete(&args->comp);
+}
+
+void __delete_whiteouts(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct deletewh_args *d = &args->deletewh;
+
+   args->err = do_delete_whiteouts(d->dentry, d->bindex, d->namelist);
+   complete(&args->comp);
+}
+
+void __is_opaque_dir(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+
+   args->ret = lookup_one_len(UNIONFS_DIR_OPAQUE, args->is_opaque.dentry,
+  sizeof(UNIONFS_DIR_OPAQUE) - 1);
+   complete(&args->comp);
+}
diff --git a/fs/unionfs/sioq.h b/fs/unionfs/sioq.h
new file mode 100644
index 000..afb71ee
--- /dev/null
+++ b/fs/unionfs/sioq.h
@@ -0,0 +1,92 @@
+/*
+ * Copyright (c) 2006-2007 Erez Zadok
+ * Copyright (c) 2006  Charles P. Wright
+ * Copyright (c) 2006-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2006  Junjiro Okajima
+ * Copyright (c) 2006  David P. Quigley
+ * Copyright (c) 2006-2007 Stony Brook University
+ * Copyright (c) 2006-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU

[PATCH 15/29] Unionfs: readdir state helpers

2008-01-10 Thread Erez Zadok

Includes duplicate name elimination and whiteout-handling code.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/rdstate.c |  285 ++
 1 files changed, 285 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/rdstate.c

diff --git a/fs/unionfs/rdstate.c b/fs/unionfs/rdstate.c
new file mode 100644
index 000..7ba1e1a
--- /dev/null
+++ b/fs/unionfs/rdstate.c
@@ -0,0 +1,285 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/* This file contains the routines for maintaining readdir state. */
+
+/*
+ * There are two structures here, rdstate which is a hash table
+ * of the second structure which is a filldir_node.
+ */
+
+/*
+ * This is a struct kmem_cache for filldir nodes, because we allocate a lot
+ * of them and they shouldn't waste memory.  If the node has a small name
+ * (as defined by the dentry structure), then we use an inline name to
+ * preserve kmalloc space.
+ */
+static struct kmem_cache *unionfs_filldir_cachep;
+
+int unionfs_init_filldir_cache(void)
+{
+   unionfs_filldir_cachep =
+   kmem_cache_create("unionfs_filldir",
+ sizeof(struct filldir_node), 0,
+ SLAB_RECLAIM_ACCOUNT, NULL);
+
+   return (unionfs_filldir_cachep ? 0 : -ENOMEM);
+}
+
+void unionfs_destroy_filldir_cache(void)
+{
+   if (unionfs_filldir_cachep)
+   kmem_cache_destroy(unionfs_filldir_cachep);
+}
+
+/*
+ * This is a tuning parameter that tells us roughly how big to make the
+ * hash table in directory entries per page.  This isn't perfect, but
+ * at least we get a hash table size that shouldn't be too overloaded.
+ * The following averages are based on my home directory.
+ * 14.44693Overall
+ * 12.29   Single Page Directories
+ * 117.93  Multi-page directories
+ */
+#define DENTPAGE 4096
+#define DENTPERONEPAGE 12
+#define DENTPERPAGE 118
+#define MINHASHSIZE 1
+static int guesstimate_hash_size(struct inode *inode)
+{
+   struct inode *lower_inode;
+   int bindex;
+   int hashsize = MINHASHSIZE;
+
+   if (UNIONFS_I(inode)->hashsize > 0)
+   return UNIONFS_I(inode)->hashsize;
+
+   for (bindex = ibstart(inode); bindex <= ibend(inode); bindex++) {
+   lower_inode = unionfs_lower_inode_idx(inode, bindex);
+   if (!lower_inode)
+   continue;
+
+   if (i_size_read(lower_inode) == DENTPAGE)
+   hashsize += DENTPERONEPAGE;
+   else
+   hashsize += (i_size_read(lower_inode) / DENTPAGE) *
+   DENTPERPAGE;
+   }
+
+   return hashsize;
+}
+
+int init_rdstate(struct file *file)
+{
+   BUG_ON(sizeof(loff_t) !=
+  (sizeof(unsigned int) + sizeof(unsigned int)));
+   BUG_ON(UNIONFS_F(file)->rdstate != NULL);
+
+   UNIONFS_F(file)->rdstate = alloc_rdstate(file->f_path.dentry->d_inode,
+fbstart(file));
+
+   return (UNIONFS_F(file)->rdstate ? 0 : -ENOMEM);
+}
+
+struct unionfs_dir_state *find_rdstate(struct inode *inode, loff_t fpos)
+{
+   struct unionfs_dir_state *rdstate = NULL;
+   struct list_head *pos;
+
+   spin_lock(&UNIONFS_I(inode)->rdlock);
+   list_for_each(pos, &UNIONFS_I(inode)->readdircache) {
+   struct unionfs_dir_state *r =
+   list_entry(pos, struct unionfs_dir_state, cache);
+   if (fpos == rdstate2offset(r)) {
+   UNIONFS_I(inode)->rdcount--;
+   list_del(&r->cache);
+   rdstate = r;
+   break;
+   }
+   }
+   spin_unlock(&UNIONFS_I(inode)->rdlock);
+   return rdstate;
+}
+
+struct unionfs_dir_state *alloc_rdstate(struct inode *inode, int bindex)
+{
+   int i = 0;
+   int hashsize;
+   unsigned long mallocsize = sizeof(struct unionfs_dir_state);
+   struct unionfs_dir_state *rdstate;
+
+   hashsize = guesstimate_hash_size(inode);
+   mallocsize += hashsize * sizeof(struct list_head);
+   mallocsize

[PATCH 03/29] Makefile: hook to compile unionfs

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/Makefile |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/Makefile b/fs/Makefile
index 500cf15..e202288 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -118,3 +118,4 @@ obj-$(CONFIG_HPPFS) += hppfs/
 obj-$(CONFIG_DEBUG_FS) += debugfs/
 obj-$(CONFIG_OCFS2_FS) += ocfs2/
 obj-$(CONFIG_GFS2_FS)   += gfs2/
+obj-$(CONFIG_UNION_FS) += unionfs/
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 02/29] VFS/eCryptfs: use simplified fs_stack API to fsstack_copy_attr_all

2008-01-10 Thread Erez Zadok

Acked-by: Mike Halcrow <[EMAIL PROTECTED]>
Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/ecryptfs/dentry.c |2 +-
 fs/ecryptfs/inode.c  |6 +++---
 fs/ecryptfs/main.c   |2 +-
 fs/stack.c   |   38 --
 include/linux/fs_stack.h |   21 -
 5 files changed, 45 insertions(+), 24 deletions(-)

diff --git a/fs/ecryptfs/dentry.c b/fs/ecryptfs/dentry.c
index cb20b96..a8c1686 100644
--- a/fs/ecryptfs/dentry.c
+++ b/fs/ecryptfs/dentry.c
@@ -62,7 +62,7 @@ static int ecryptfs_d_revalidate(struct dentry *dentry, 
struct nameidata *nd)
struct inode *lower_inode =
ecryptfs_inode_to_lower(dentry->d_inode);
 
-   fsstack_copy_attr_all(dentry->d_inode, lower_inode, NULL);
+   fsstack_copy_attr_all(dentry->d_inode, lower_inode);
}
 out:
return rc;
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 5a71918..89e8560 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -576,9 +576,9 @@ ecryptfs_rename(struct inode *old_dir, struct dentry 
*old_dentry,
lower_new_dir_dentry->d_inode, lower_new_dentry);
if (rc)
goto out_lock;
-   fsstack_copy_attr_all(new_dir, lower_new_dir_dentry->d_inode, NULL);
+   fsstack_copy_attr_all(new_dir, lower_new_dir_dentry->d_inode);
if (new_dir != old_dir)
-   fsstack_copy_attr_all(old_dir, lower_old_dir_dentry->d_inode, 
NULL);
+   fsstack_copy_attr_all(old_dir, lower_old_dir_dentry->d_inode);
 out_lock:
unlock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
dput(lower_new_dentry->d_parent);
@@ -912,7 +912,7 @@ static int ecryptfs_setattr(struct dentry *dentry, struct 
iattr *ia)
 
rc = notify_change(lower_dentry, ia);
 out:
-   fsstack_copy_attr_all(inode, lower_inode, NULL);
+   fsstack_copy_attr_all(inode, lower_inode);
return rc;
 }
 
diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
index e5580bc..6276cdf 100644
--- a/fs/ecryptfs/main.c
+++ b/fs/ecryptfs/main.c
@@ -211,7 +211,7 @@ int ecryptfs_interpose(struct dentry *lower_dentry, struct 
dentry *dentry,
d_add(dentry, inode);
else
d_instantiate(dentry, inode);
-   fsstack_copy_attr_all(inode, lower_inode, NULL);
+   fsstack_copy_attr_all(inode, lower_inode);
/* This size will be overwritten for real files w/ headers and
 * other metadata */
fsstack_copy_inode_size(inode, lower_inode);
diff --git a/fs/stack.c b/fs/stack.c
index 67716f6..4336f2b 100644
--- a/fs/stack.c
+++ b/fs/stack.c
@@ -1,24 +1,42 @@
+/*
+ * Copyright (c) 2006-2007 Erez Zadok
+ * Copyright (c) 2006-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2006-2007 Stony Brook University
+ * Copyright (c) 2006-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
 #include 
 #include 
 #include 
 
-/* does _NOT_ require i_mutex to be held.
+/*
+ * does _NOT_ require i_mutex to be held.
  *
  * This function cannot be inlined since i_size_{read,write} is rather
  * heavy-weight on 32-bit systems
  */
 void fsstack_copy_inode_size(struct inode *dst, const struct inode *src)
 {
-   i_size_write(dst, i_size_read((struct inode *)src));
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+   spin_lock(&dst->i_lock);
+#endif
+   i_size_write(dst, i_size_read(src));
dst->i_blocks = src->i_blocks;
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+   spin_unlock(&dst->i_lock);
+#endif
 }
 EXPORT_SYMBOL_GPL(fsstack_copy_inode_size);
 
-/* copy all attributes; get_nlinks is optional way to override the i_nlink
+/*
+ * copy all attributes; get_nlinks is optional way to override the i_nlink
  * copying
  */
-void fsstack_copy_attr_all(struct inode *dest, const struct inode *src,
-   int (*get_nlinks)(struct inode *))
+void fsstack_copy_attr_all(struct inode *dest, const struct inode *src)
 {
dest->i_mode = src->i_mode;
dest->i_uid = src->i_uid;
@@ -29,14 +47,6 @@ void fsstack_copy_attr_all(struct inode *dest, const struct 
inode *src,
dest->i_ctime = src->i_ctime;
dest->i_blkbits = src->i_blkbits;
dest->i_flags = src->i_flags;
-
-   /*
-* Update the nlinks AFTER updating the above fields, because the
-* get_links callback may depend on them.
-*/
-   if (!get_nlinks)
-   dest->i_nlink = src->i_nlink;
-   else
-   dest->i_nlink = (*get_nlinks)(dest);
+   dest->i_nlink = src->i_nlink;
 }
 EXPORT_SYMBOL_GPL(fsstack_copy_attr_all);
diff --git a/include/linux

[UNIONFS] 00/29 Unionfs and related patches pre-merge review (v2)

2008-01-10 Thread Erez Zadok

X files to indicate that file XXX has been whited-out.
This works well on many file systems, but it tends to clutter lower
branches with these .wh.* files.  We recently optimized our whiteout
creation algorithm so it minimizes the number of conditions in which
whiteouts are created, and that helped some people a lot.  But still, if
you unify a readonly and writeable branch, and you try to delete a file
from the readonly branch/medium, there's no way to avoid creating some
sort of a whiteout.  BTW, of course, these whiteouts are completely
hidden from the view of the user who accesses files/dirs via the union.

In the long run, we really hope to see native whiteout support in Linux (ala
BSD).  Of course, this would require a change to the VFS and several native
file systems (possibly even a change to the on-disk format), so we realize
that this isn't likely to happen soon.  If/when native whiteout support was
available, unionfs could easily use it.  Until that time, we have lots of
users who want to use unionfs on top of numerous different file systems, and
so we have to do the next best thing wrt whiteouts.

This is a good point to mention that the version of unionfs in -mm is 2.2.x.
We have been working on a newer and still experimental version of Unionfs,
called "Unionfs with On-Disk Format" or Unionfs-ODF.  Unionfs-ODF uses a
small persistent store (e.g., a small ext2 partition) to store whiteouts in,
among other info; this moves the union-level meta-data (e.g., whiteouts),
outside the lower file systems, and thus eliminates the need to create .wh.*
files.  Unionfs-ODF has other useful benefits, and you can get more detail
about it here: <http://www.filesystems.org/unionfs-odf.txt>.  We recently
sync'ed up our unionfs 2.2 and unionfs-odf releases and we're tracking
Linus's tree for both.  IOW, every fix and user-visible feature that has
gone into unionfs in -mm, is now also in unionfs-odf.  Our intent is to
continue to develop both versions, and gradually move features from
unionfs-odf into unionfs 2.2; this would be possible even if/after
unionfs-2.2 gets merged, because the changes will all be internal to the
implementation, and users won't need to change the way they, say, mount a
union or manipulate its branches.

(4) branch management.  One of the most useful features of unioning is to be
able to add/remove branches from the union.  We used to do this via
ioctl's, which was considered racy, unclean, and non-atomic (only one
branch-manipulation operation at a time).  We now do that via the
remount interface, and allow users to pass multiple branch-manipulation
commands, which are handled as one action.


* GENERAL

I should note that my philosophy in developing any stackable file system had
been to minimize changes to the VFS, and to not change any lower file system
whatsoever: that ensures that unionfs couldn't affect the stability of
performance of the rest of the kernel.  Still, some of the things unionfs
does could possibly be done more cleanly and easily at the VFS level (e.g.,
better hooks for cache coherency).

Unionfs 2.2.x is currently maintained on 2.6.9 and all major kernels since
2.6.18, all the way to Linus's latest 2.6.24-rc tree and -mm.  We've got a
lot users who use unionfs in more creative ways than even we could think of,
and this has helped us find the RIGHT set of features to please the users,
as well as stabilize the code.  Before every new release, we test the new
code on all versions using ltp-full, parallel compiles, and our own
unionfs-aware regression suite which exercises unionfs's unique features
(e.g., copy-up).

I therefore believe that unionfs is in a good enough shape now to be
considered for merging in 2.6.25.  (Heck, using Unionfs I've managed to
find, report, and in most cases fix bugs I found in nfsv2/3/4, xfs, jffs2,
and ext3/jbd :-) The user-visible unionfs behavior isn't likely to change;
and any changes to the VFS to better support stacking, could be handled
internally in subsequent kernels without affecting how users use unionfs.
Aside from greater exposure to stackable file systems and unionfs, I think
one of the other important benefits of a merge could be that we'd have more
than one stackable f/s in the kernel (i.e., ecryptfs and unionfs); this
would allow us to slowly and gradually generalize the VFS so it can better
support stackable file systems.


Lastly, shortlog and diffstats:

Erez Zadok (29):
  Unionfs: documentation
  VFS/eCryptfs: use simplified fs_stack API to fsstack_copy_attr_all
  Makefile: hook to compile unionfs
  Unionfs: main Makefile
  Unionfs: fanout header definitions
  Unionfs: main header file
  Unionfs: common file copyup/revalidation operations
  Unionfs: basic file operations
  Unionfs: lower-level copyup routines
  Unionfs: dentry revalidation
  Unionfs:

[PATCH 27/29] VFS path get/put ops used by Unionfs

2008-01-10 Thread Erez Zadok

Note: this will become obsolete once similar patches, now in -mm, make it to
mainline.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 include/linux/namei.h |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/include/linux/namei.h b/include/linux/namei.h
index 4cb4f8d..63f16d9 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 struct vfsmount;
 
@@ -100,4 +101,16 @@ static inline char *nd_get_link(struct nameidata *nd)
return nd->saved_names[nd->depth];
 }
 
+static inline void pathget(struct path *path)
+{
+   mntget(path->mnt);
+   dget(path->dentry);
+}
+
+static inline void pathput(struct path *path)
+{
+   dput(path->dentry);
+   mntput(path->mnt);
+}
+
 #endif /* _LINUX_NAMEI_H */
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 04/29] Unionfs: main Makefile

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/Makefile |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/Makefile

diff --git a/fs/unionfs/Makefile b/fs/unionfs/Makefile
new file mode 100644
index 000..17ca4a7
--- /dev/null
+++ b/fs/unionfs/Makefile
@@ -0,0 +1,13 @@
+obj-$(CONFIG_UNION_FS) += unionfs.o
+
+unionfs-y := subr.o dentry.o file.o inode.o main.o super.o \
+   rdstate.o copyup.o dirhelper.o rename.o unlink.o \
+   lookup.o commonfops.o dirfops.o sioq.o mmap.o
+
+unionfs-$(CONFIG_UNION_FS_XATTR) += xattr.o
+
+unionfs-$(CONFIG_UNION_FS_DEBUG) += debug.o
+
+ifeq ($(CONFIG_UNION_FS_DEBUG),y)
+EXTRA_CFLAGS += -DDEBUG
+endif
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 29/29] Put Unionfs and eCryptfs under one layered filesystems menu

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/Kconfig |   53 +
 1 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index 487236c..55a78b7 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -1041,6 +1041,47 @@ config CONFIGFS_FS
 
 endmenu
 
+menu "Layered filesystems"
+
+config ECRYPT_FS
+   tristate "eCrypt filesystem layer support (EXPERIMENTAL)"
+   depends on EXPERIMENTAL && KEYS && CRYPTO && NET
+   help
+ Encrypted filesystem that operates on the VFS layer.  See
+  to learn more about
+ eCryptfs.  Userspace components are required and can be
+ obtained from <http://ecryptfs.sf.net>.
+
+ To compile this file system support as a module, choose M here: the
+ module will be called ecryptfs.
+
+config UNION_FS
+   tristate "Union file system (EXPERIMENTAL)"
+   depends on EXPERIMENTAL
+   help
+ Unionfs is a stackable unification file system, which appears to
+ merge the contents of several directories (branches), while keeping
+ their physical content separate.
+
+ See <http://unionfs.filesystems.org> for details
+
+config UNION_FS_XATTR
+   bool "Unionfs extended attributes"
+   depends on UNION_FS
+   help
+ Extended attributes are name:value pairs associated with inodes by
+ the kernel or by users (see the attr(5) manual page).
+
+ If unsure, say N.
+
+config UNION_FS_DEBUG
+   bool "Debug Unionfs"
+   depends on UNION_FS
+   help
+ If you say Y here, you can turn on debugging output from Unionfs.
+
+endmenu
+
 menu "Miscellaneous filesystems"
 
 config ADFS_FS
@@ -1093,18 +1134,6 @@ config AFFS_FS
  To compile this file system support as a module, choose M here: the
  module will be called affs.  If unsure, say N.
 
-config ECRYPT_FS
-   tristate "eCrypt filesystem layer support (EXPERIMENTAL)"
-   depends on EXPERIMENTAL && KEYS && CRYPTO && NET
-   help
- Encrypted filesystem that operates on the VFS layer.  See
-  to learn more about
- eCryptfs.  Userspace components are required and can be
- obtained from <http://ecryptfs.sf.net>.
-
- To compile this file system support as a module, choose M here: the
- module will be called ecryptfs.
-
 config HFS_FS
tristate "Apple Macintosh file system support (EXPERIMENTAL)"
depends on BLOCK && EXPERIMENTAL
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 08/29] Unionfs: basic file operations

2008-01-10 Thread Erez Zadok

Includes read, write, mmap, fsync, and fasync.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/file.c |  184 +
 1 files changed, 184 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/file.c

diff --git a/fs/unionfs/file.c b/fs/unionfs/file.c
new file mode 100644
index 000..0c424f6
--- /dev/null
+++ b/fs/unionfs/file.c
@@ -0,0 +1,184 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+static int unionfs_file_readdir(struct file *file, void *dirent,
+   filldir_t filldir)
+{
+   return -ENOTDIR;
+}
+
+static int unionfs_mmap(struct file *file, struct vm_area_struct *vma)
+{
+   int err = 0;
+   bool willwrite;
+   struct file *lower_file;
+
+   unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
+
+   /* This might be deferred to mmap's writepage */
+   willwrite = ((vma->vm_flags | VM_SHARED | VM_WRITE) == vma->vm_flags);
+   err = unionfs_file_revalidate(file, willwrite);
+   if (unlikely(err))
+   goto out;
+   unionfs_check_file(file);
+
+   /*
+* File systems which do not implement ->writepage may use
+* generic_file_readonly_mmap as their ->mmap op.  If you call
+* generic_file_readonly_mmap with VM_WRITE, you'd get an -EINVAL.
+* But we cannot call the lower ->mmap op, so we can't tell that
+* writeable mappings won't work.  Therefore, our only choice is to
+* check if the lower file system supports the ->writepage, and if
+* not, return EINVAL (the same error that
+* generic_file_readonly_mmap returns in that case).
+*/
+   lower_file = unionfs_lower_file(file);
+   if (willwrite && !lower_file->f_mapping->a_ops->writepage) {
+   err = -EINVAL;
+   printk(KERN_ERR "unionfs: branch %d file system does not "
+  "support writeable mmap\n", fbstart(file));
+   } else {
+   err = generic_file_mmap(file, vma);
+   if (err)
+   printk(KERN_ERR
+  "unionfs: generic_file_mmap failed %d\n", err);
+   }
+
+out:
+   if (!err) {
+   /* copyup could cause parent dir times to change */
+   unionfs_copy_attr_times(file->f_path.dentry->d_parent->d_inode);
+   unionfs_check_file(file);
+   }
+   unionfs_read_unlock(file->f_path.dentry->d_sb);
+   return err;
+}
+
+int unionfs_fsync(struct file *file, struct dentry *dentry, int datasync)
+{
+   int bindex, bstart, bend;
+   struct file *lower_file;
+   struct dentry *lower_dentry;
+   struct inode *lower_inode, *inode;
+   int err = -EINVAL;
+
+   unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
+   err = unionfs_file_revalidate(file, true);
+   if (unlikely(err))
+   goto out;
+   unionfs_check_file(file);
+
+   bstart = fbstart(file);
+   bend = fbend(file);
+   if (bstart < 0 || bend < 0)
+   goto out;
+
+   inode = dentry->d_inode;
+   if (unlikely(!inode)) {
+   printk(KERN_ERR
+  "unionfs: null lower inode in unionfs_fsync\n");
+   goto out;
+   }
+   for (bindex = bstart; bindex <= bend; bindex++) {
+   lower_inode = unionfs_lower_inode_idx(inode, bindex);
+   if (!lower_inode || !lower_inode->i_fop->fsync)
+   continue;
+   lower_file = unionfs_lower_file_idx(file, bindex);
+   lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   mutex_lock(&lower_inode->i_mutex);
+   err = lower_inode->i_fop->fsync(lower_file,
+   lower_dentry,
+   datasync);
+   mutex_unlock(&lower_inode->i_mutex);
+   if (err)
+   goto out;
+   }
+
+   unionfs_copy_attr_times(inode);
+
+out:
+   unionfs

[PATCH 23/29] Unionfs: miscellaneous helper routines

2008-01-10 Thread Erez Zadok

Mostly related to whiteouts.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/subr.c |  242 +
 1 files changed, 242 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/subr.c

diff --git a/fs/unionfs/subr.c b/fs/unionfs/subr.c
new file mode 100644
index 000..0a0fce9
--- /dev/null
+++ b/fs/unionfs/subr.c
@@ -0,0 +1,242 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * Pass an unionfs dentry and an index.  It will try to create a whiteout
+ * for the filename in dentry, and will try in branch 'index'.  On error,
+ * it will proceed to a branch to the left.
+ */
+int create_whiteout(struct dentry *dentry, int start)
+{
+   int bstart, bend, bindex;
+   struct dentry *lower_dir_dentry;
+   struct dentry *lower_dentry;
+   struct dentry *lower_wh_dentry;
+   struct nameidata nd;
+   char *name = NULL;
+   int err = -EINVAL;
+
+   verify_locked(dentry);
+
+   bstart = dbstart(dentry);
+   bend = dbend(dentry);
+
+   /* create dentry's whiteout equivalent */
+   name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+   if (unlikely(IS_ERR(name))) {
+   err = PTR_ERR(name);
+   goto out;
+   }
+
+   for (bindex = start; bindex >= 0; bindex--) {
+   lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+
+   if (!lower_dentry) {
+   /*
+* if lower dentry is not present, create the
+* entire lower dentry directory structure and go
+* ahead.  Since we want to just create whiteout, we
+* only want the parent dentry, and hence get rid of
+* this dentry.
+*/
+   lower_dentry = create_parents(dentry->d_inode,
+ dentry,
+ dentry->d_name.name,
+ bindex);
+   if (!lower_dentry || IS_ERR(lower_dentry)) {
+   int ret = PTR_ERR(lower_dentry);
+   if (!IS_COPYUP_ERR(ret))
+   printk(KERN_ERR
+  "unionfs: create_parents for "
+  "whiteout failed: bindex=%d "
+  "err=%d\n", bindex, ret);
+   continue;
+   }
+   }
+
+   lower_wh_dentry =
+   lookup_one_len(name, lower_dentry->d_parent,
+  dentry->d_name.len + UNIONFS_WHLEN);
+   if (IS_ERR(lower_wh_dentry))
+   continue;
+
+   /*
+* The whiteout already exists. This used to be impossible,
+* but now is possible because of opaqueness.
+*/
+   if (lower_wh_dentry->d_inode) {
+   dput(lower_wh_dentry);
+   err = 0;
+   goto out;
+   }
+
+   err = init_lower_nd(&nd, LOOKUP_CREATE);
+   if (unlikely(err < 0))
+   goto out;
+   lower_dir_dentry = lock_parent_wh(lower_wh_dentry);
+   err = is_robranch_super(dentry->d_sb, bindex);
+   if (!err)
+   err = vfs_create(lower_dir_dentry->d_inode,
+lower_wh_dentry,
+~current->fs->umask & S_IRWXUGO,
+&nd);
+   unlock_dir(lower_dir_dentry);
+   dput(lower_wh_dentry);
+   release_lower_nd(&nd, err);
+
+   if (!err || !IS_COPYUP_ERR(err))
+   break;
+   }
+
+   /* set dbopaque so that lookup will not proceed after this branch */
+   if (!err)
+   set_dbopa

[PATCH 05/29] Unionfs: fanout header definitions

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/fanout.h |  366 +++
 1 files changed, 366 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/fanout.h

diff --git a/fs/unionfs/fanout.h b/fs/unionfs/fanout.h
new file mode 100644
index 000..4d9a45f
--- /dev/null
+++ b/fs/unionfs/fanout.h
@@ -0,0 +1,366 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _FANOUT_H_
+#define _FANOUT_H_
+
+/*
+ * Inode to private data
+ *
+ * Since we use containers and the struct inode is _inside_ the
+ * unionfs_inode_info structure, UNIONFS_I will always (given a non-NULL
+ * inode pointer), return a valid non-NULL pointer.
+ */
+static inline struct unionfs_inode_info *UNIONFS_I(const struct inode *inode)
+{
+   return container_of(inode, struct unionfs_inode_info, vfs_inode);
+}
+
+#define ibstart(ino) (UNIONFS_I(ino)->bstart)
+#define ibend(ino) (UNIONFS_I(ino)->bend)
+
+/* Superblock to private data */
+#define UNIONFS_SB(super) ((struct unionfs_sb_info *)(super)->s_fs_info)
+#define sbstart(sb) 0
+#define sbend(sb) (UNIONFS_SB(sb)->bend)
+#define sbmax(sb) (UNIONFS_SB(sb)->bend + 1)
+#define sbhbid(sb) (UNIONFS_SB(sb)->high_branch_id)
+
+/* File to private Data */
+#define UNIONFS_F(file) ((struct unionfs_file_info *)((file)->private_data))
+#define fbstart(file) (UNIONFS_F(file)->bstart)
+#define fbend(file) (UNIONFS_F(file)->bend)
+
+/* macros to manipulate branch IDs in stored in our superblock */
+static inline int branch_id(struct super_block *sb, int index)
+{
+   BUG_ON(!sb || index < 0);
+   return UNIONFS_SB(sb)->data[index].branch_id;
+}
+
+static inline void set_branch_id(struct super_block *sb, int index, int val)
+{
+   BUG_ON(!sb || index < 0);
+   UNIONFS_SB(sb)->data[index].branch_id = val;
+}
+
+static inline void new_branch_id(struct super_block *sb, int index)
+{
+   BUG_ON(!sb || index < 0);
+   set_branch_id(sb, index, ++UNIONFS_SB(sb)->high_branch_id);
+}
+
+/*
+ * Find new index of matching branch with an existing superblock of a known
+ * (possibly old) id.  This is needed because branches could have been
+ * added/deleted causing the branches of any open files to shift.
+ *
+ * @sb: the new superblock which may have new/different branch IDs
+ * @id: the old/existing id we're looking for
+ * Returns index of newly found branch (0 or greater), -1 otherwise.
+ */
+static inline int branch_id_to_idx(struct super_block *sb, int id)
+{
+   int i;
+   for (i = 0; i < sbmax(sb); i++) {
+   if (branch_id(sb, i) == id)
+   return i;
+   }
+   /* in the non-ODF code, this should really never happen */
+   printk(KERN_WARNING "unionfs: cannot find branch with id %d\n", id);
+   return -1;
+}
+
+/* File to lower file. */
+static inline struct file *unionfs_lower_file(const struct file *f)
+{
+   BUG_ON(!f);
+   return UNIONFS_F(f)->lower_files[fbstart(f)];
+}
+
+static inline struct file *unionfs_lower_file_idx(const struct file *f,
+ int index)
+{
+   BUG_ON(!f || index < 0);
+   return UNIONFS_F(f)->lower_files[index];
+}
+
+static inline void unionfs_set_lower_file_idx(struct file *f, int index,
+ struct file *val)
+{
+   BUG_ON(!f || index < 0);
+   UNIONFS_F(f)->lower_files[index] = val;
+   /* save branch ID (may be redundant?) */
+   UNIONFS_F(f)->saved_branch_ids[index] =
+   branch_id((f)->f_path.dentry->d_sb, index);
+}
+
+static inline void unionfs_set_lower_file(struct file *f, struct file *val)
+{
+   BUG_ON(!f);
+   unionfs_set_lower_file_idx((f), fbstart(f), (val));
+}
+
+/* Inode to lower inode. */
+static inline struct inode *unionfs_lower_inode(const struct inode *i)
+{
+   BUG_ON(!i);
+   return UNIONFS_I(i)->lower_inodes[ibstart(i)];
+}
+
+static inline struct inode *unionfs_lower_inode_idx(const struct inode *i,
+   int index)
+{
+   BUG_ON(!i || index < 0);
+   return UNIONFS_I(i)->lower_inodes[index];
+}
+
+static inline void unionfs_set_lower_inode_idx(struct inode *i

[PATCH 18/29] Unionfs: address-space operations

2008-01-10 Thread Erez Zadok

Includes writepage, writepages, readpage, prepare_write, and commit_write.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/mmap.c |  343 +
 1 files changed, 343 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/mmap.c

diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
new file mode 100644
index 000..ad770ac
--- /dev/null
+++ b/fs/unionfs/mmap.c
@@ -0,0 +1,343 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2006  Shaya Potter
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+static int unionfs_writepage(struct page *page, struct writeback_control *wbc)
+{
+   int err = -EIO;
+   struct inode *inode;
+   struct inode *lower_inode;
+   struct page *lower_page;
+   struct address_space *lower_mapping; /* lower inode mapping */
+   gfp_t mask;
+
+   BUG_ON(!PageUptodate(page));
+   inode = page->mapping->host;
+   /* if no lower inode, nothing to do */
+   if (!inode || !UNIONFS_I(inode) || UNIONFS_I(inode)->lower_inodes) {
+   err = 0;
+   goto out;
+   }
+   lower_inode = unionfs_lower_inode(inode);
+   lower_mapping = lower_inode->i_mapping;
+
+   /*
+* find lower page (returns a locked page)
+*
+* We turn off __GFP_FS while we look for or create a new lower
+* page.  This prevents a recursion into the file system code, which
+* under memory pressure conditions could lead to a deadlock.  This
+* is similar to how the loop driver behaves (see loop_set_fd in
+* drivers/block/loop.c).  If we can't find the lower page, we
+* redirty our page and return "success" so that the VM will call us
+* again in the (hopefully near) future.
+*/
+   mask = mapping_gfp_mask(lower_mapping) & ~(__GFP_FS);
+   lower_page = find_or_create_page(lower_mapping, page->index, mask);
+   if (!lower_page) {
+   err = 0;
+   set_page_dirty(page);
+   goto out;
+   }
+
+   /* copy page data from our upper page to the lower page */
+   copy_highpage(lower_page, page);
+   flush_dcache_page(lower_page);
+   SetPageUptodate(lower_page);
+   set_page_dirty(lower_page);
+
+   /*
+* Call lower writepage (expects locked page).  However, if we are
+* called with wbc->for_reclaim, then the VFS/VM just wants to
+* reclaim our page.  Therefore, we don't need to call the lower
+* ->writepage: just copy our data to the lower page (already done
+* above), then mark the lower page dirty and unlock it, and return
+* success.
+*/
+   if (wbc->for_reclaim) {
+   unlock_page(lower_page);
+   goto out_release;
+   }
+
+   BUG_ON(!lower_mapping->a_ops->writepage);
+   wait_on_page_writeback(lower_page); /* prevent multiple writers */
+   clear_page_dirty_for_io(lower_page); /* emulate VFS behavior */
+   err = lower_mapping->a_ops->writepage(lower_page, wbc);
+   if (err < 0)
+   goto out_release;
+
+   /*
+* Lower file systems such as ramfs and tmpfs, may return
+* AOP_WRITEPAGE_ACTIVATE so that the VM won't try to (pointlessly)
+* write the page again for a while.  But those lower file systems
+* also set the page dirty bit back again.  Since we successfully
+* copied our page data to the lower page, then the VM will come
+* back to the lower page (directly) and try to flush it.  So we can
+* save the VM the hassle of coming back to our page and trying to
+* flush too.  Therefore, we don't re-dirty our own page, and we
+* never return AOP_WRITEPAGE_ACTIVATE back to the VM (we consider
+* this a success).
+*
+* We also unlock the lower page if the lower ->writepage returned
+* AOP_WRITEPAGE_ACTIVATE.  (This "anomalous" behaviour may be
+* addressed in future shmem/VM code.)
+*/
+   if (err == AOP_WRITEPAGE_ACTIVATE) {
+   err = 0;
+   unlock_page(lower_page);
+   }
+
+   /* all is well

[PATCH 16/29] Unionfs: inode operations

2008-01-10 Thread Erez Zadok

Includes create, lookup, link, symlink, mkdir, mknod, readlink, follow_link,
put_link, permission, and setattr.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/inode.c | 1174 
 1 files changed, 1174 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/inode.c

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
new file mode 100644
index 000..e15ddb9
--- /dev/null
+++ b/fs/unionfs/inode.c
@@ -0,0 +1,1174 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+static int unionfs_create(struct inode *parent, struct dentry *dentry,
+ int mode, struct nameidata *nd)
+{
+   int err = 0;
+   struct dentry *lower_dentry = NULL;
+   struct dentry *wh_dentry = NULL;
+   struct dentry *lower_parent_dentry = NULL;
+   char *name = NULL;
+   int valid = 0;
+   struct nameidata lower_nd;
+
+   unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+   unionfs_lock_dentry(dentry->d_parent, UNIONFS_DMUTEX_PARENT);
+   valid = __unionfs_d_revalidate_chain(dentry->d_parent, nd, false);
+   unionfs_unlock_dentry(dentry->d_parent);
+   if (unlikely(!valid)) {
+   err = -ESTALE;  /* same as what real_lookup does */
+   goto out;
+   }
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
+
+   valid = __unionfs_d_revalidate_chain(dentry, nd, false);
+   /*
+* It's only a bug if this dentry was not negative and couldn't be
+* revalidated (shouldn't happen).
+*/
+   BUG_ON(!valid && dentry->d_inode);
+
+   /*
+* We shouldn't create things in a read-only branch; this check is a
+* bit redundant as we don't allow branch 0 to be read-only at the
+* moment
+*/
+   err = is_robranch_super(dentry->d_sb, 0);
+   if (err) {
+   err = -EROFS;
+   goto out;
+   }
+
+   /*
+* We _always_ create on branch 0
+*/
+   lower_dentry = unionfs_lower_dentry_idx(dentry, 0);
+   if (lower_dentry) {
+   /*
+* check if whiteout exists in this branch, i.e. lookup .wh.foo
+* first.
+*/
+   name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+   if (unlikely(IS_ERR(name))) {
+   err = PTR_ERR(name);
+   goto out;
+   }
+
+   wh_dentry = lookup_one_len(name, lower_dentry->d_parent,
+  dentry->d_name.len + UNIONFS_WHLEN);
+   if (IS_ERR(wh_dentry)) {
+   err = PTR_ERR(wh_dentry);
+   wh_dentry = NULL;
+   goto out;
+   }
+
+   if (wh_dentry->d_inode) {
+   /*
+* .wh.foo has been found, so let's unlink it
+*/
+   struct dentry *lower_dir_dentry;
+
+   lower_dir_dentry = lock_parent_wh(wh_dentry);
+   /* see Documentation/filesystems/unionfs/issues.txt */
+   lockdep_off();
+   err = vfs_unlink(lower_dir_dentry->d_inode, wh_dentry);
+   lockdep_on();
+   unlock_dir(lower_dir_dentry);
+
+   /*
+* Whiteouts are special files and should be deleted
+* no matter what (as if they never existed), in
+* order to allow this create operation to succeed.
+* This is especially important in sticky
+* directories: a whiteout may have been created by
+* one user, but the newly created file may be
+* created by another user.  Therefore, in order to
+* maintain Unix semantics, if the vfs_unlink above
+* ailed, then we have to try to directly unlink the
+* whiteout.  Note: in the ODF version of unionfs,
+*

[PATCH 25/29] Unionfs file system magic number

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 include/linux/magic.h |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/include/linux/magic.h b/include/linux/magic.h
index 1fa0c2c..67043ed 100644
--- a/include/linux/magic.h
+++ b/include/linux/magic.h
@@ -35,6 +35,8 @@
 #define REISER2FS_SUPER_MAGIC_STRING   "ReIsEr2Fs"
 #define REISER2FS_JR_SUPER_MAGIC_STRING"ReIsEr3Fs"
 
+#define UNIONFS_SUPER_MAGIC 0xf15f083d
+
 #define SMB_SUPER_MAGIC0x517B
 #define USBDEVICE_SUPER_MAGIC  0x9fa2
 #define CGROUP_SUPER_MAGIC 0x27e0eb
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 28/29] VFS: export release_open_intent symbol

2008-01-10 Thread Erez Zadok

Needed to release the resources of the lower nameidata structures that we
create and pass to lower file systems (e.g., when calling vfs_create).

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/namei.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 3b993db..14f9861 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -389,6 +389,7 @@ void release_open_intent(struct nameidata *nd)
else
fput(nd->intent.open.file);
 }
+EXPORT_SYMBOL(release_open_intent);
 
 static inline struct dentry *
 do_revalidate(struct dentry *dentry, struct nameidata *nd)
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 19/29] Unionfs: mount-time and stacking-interposition functions

2008-01-10 Thread Erez Zadok

Includes read_super and module-linkage routines.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/main.c |  794 +
 1 files changed, 794 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/main.c

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
new file mode 100644
index 000..23c18f7
--- /dev/null
+++ b/fs/unionfs/main.c
@@ -0,0 +1,794 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+#include 
+#include 
+
+static void unionfs_fill_inode(struct dentry *dentry,
+  struct inode *inode)
+{
+   struct inode *lower_inode;
+   struct dentry *lower_dentry;
+   int bindex, bstart, bend;
+
+   bstart = dbstart(dentry);
+   bend = dbend(dentry);
+
+   for (bindex = bstart; bindex <= bend; bindex++) {
+   lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   if (!lower_dentry) {
+   unionfs_set_lower_inode_idx(inode, bindex, NULL);
+   continue;
+   }
+
+   /* Initialize the lower inode to the new lower inode. */
+   if (!lower_dentry->d_inode)
+   continue;
+
+   unionfs_set_lower_inode_idx(inode, bindex,
+   igrab(lower_dentry->d_inode));
+   }
+
+   ibstart(inode) = dbstart(dentry);
+   ibend(inode) = dbend(dentry);
+
+   /* Use attributes from the first branch. */
+   lower_inode = unionfs_lower_inode(inode);
+
+   /* Use different set of inode ops for symlinks & directories */
+   if (S_ISLNK(lower_inode->i_mode))
+   inode->i_op = &unionfs_symlink_iops;
+   else if (S_ISDIR(lower_inode->i_mode))
+   inode->i_op = &unionfs_dir_iops;
+
+   /* Use different set of file ops for directories */
+   if (S_ISDIR(lower_inode->i_mode))
+   inode->i_fop = &unionfs_dir_fops;
+
+   /* properly initialize special inodes */
+   if (S_ISBLK(lower_inode->i_mode) || S_ISCHR(lower_inode->i_mode) ||
+   S_ISFIFO(lower_inode->i_mode) || S_ISSOCK(lower_inode->i_mode))
+   init_special_inode(inode, lower_inode->i_mode,
+  lower_inode->i_rdev);
+
+   /* all well, copy inode attributes */
+   unionfs_copy_attr_all(inode, lower_inode);
+   fsstack_copy_inode_size(inode, lower_inode);
+}
+
+/*
+ * Connect a unionfs inode dentry/inode with several lower ones.  This is
+ * the classic stackable file system "vnode interposition" action.
+ *
+ * @sb: unionfs's super_block
+ */
+struct dentry *unionfs_interpose(struct dentry *dentry, struct super_block *sb,
+int flag)
+{
+   int err = 0;
+   struct inode *inode;
+   int is_negative_dentry = 1;
+   int bindex, bstart, bend;
+   int need_fill_inode = 1;
+   struct dentry *spliced = NULL;
+
+   verify_locked(dentry);
+
+   bstart = dbstart(dentry);
+   bend = dbend(dentry);
+
+   /* Make sure that we didn't get a negative dentry. */
+   for (bindex = bstart; bindex <= bend; bindex++) {
+   if (unionfs_lower_dentry_idx(dentry, bindex) &&
+   unionfs_lower_dentry_idx(dentry, bindex)->d_inode) {
+   is_negative_dentry = 0;
+   break;
+   }
+   }
+   BUG_ON(is_negative_dentry);
+
+   /*
+* We allocate our new inode below, by calling iget.
+* iget will call our read_inode which will initialize some
+* of the new inode's fields
+*/
+
+   /*
+* On revalidate we've already got our own inode and just need
+* to fix it up.
+*/
+   if (flag == INTERPOSE_REVAL) {
+   inode = dentry->d_inode;
+   UNIONFS_I(inode)->bstart = -1;
+   UNIONFS_I(inode)->bend = -1;
+   atomic_set(&UNIONFS_I(inode)->generation,
+  atomic_read(&UNIONFS_SB(sb)->generation));
+
+   UNIONFS_I(inode)->lower_inodes =
+

[PATCH 24/29] Unionfs: debugging infrastructure

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/debug.c |  533 
 1 files changed, 533 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/debug.c

diff --git a/fs/unionfs/debug.c b/fs/unionfs/debug.c
new file mode 100644
index 000..d154c32
--- /dev/null
+++ b/fs/unionfs/debug.c
@@ -0,0 +1,533 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * Helper debugging functions for maintainers (and for users to report back
+ * useful information back to maintainers)
+ */
+
+/* it's always useful to know what part of the code called us */
+#define PRINT_CALLER(fname, fxn, line) \
+   do {\
+   if (!printed_caller) {  \
+   pr_debug("PC:%s:%s:%d\n", (fname), (fxn), (line)); \
+   printed_caller = 1; \
+   }   \
+   } while (0)
+
+/*
+ * __unionfs_check_{inode,dentry,file} perform exhaustive sanity checking on
+ * the fan-out of various Unionfs objects.  We check that no lower objects
+ * exist  outside the start/end branch range; that all objects within are
+ * non-NULL (with some allowed exceptions); that for every lower file
+ * there's a lower dentry+inode; that the start/end ranges match for all
+ * corresponding lower objects; that open files/symlinks have only one lower
+ * objects, but directories can have several; and more.
+ */
+void __unionfs_check_inode(const struct inode *inode,
+  const char *fname, const char *fxn, int line)
+{
+   int bindex;
+   int istart, iend;
+   struct inode *lower_inode;
+   struct super_block *sb;
+   int printed_caller = 0;
+   void *poison_ptr;
+
+   /* for inodes now */
+   BUG_ON(!inode);
+   sb = inode->i_sb;
+   istart = ibstart(inode);
+   iend = ibend(inode);
+   /* don't check inode if no lower branches */
+   if (istart < 0 && iend < 0)
+   return;
+   if (unlikely(istart > iend)) {
+   PRINT_CALLER(fname, fxn, line);
+   pr_debug(" Ci0: inode=%p istart/end=%d:%d\n",
+inode, istart, iend);
+   }
+   if (unlikely((istart == -1 && iend != -1) ||
+(istart != -1 && iend == -1))) {
+   PRINT_CALLER(fname, fxn, line);
+   pr_debug(" Ci1: inode=%p istart/end=%d:%d\n",
+inode, istart, iend);
+   }
+   if (!S_ISDIR(inode->i_mode)) {
+   if (unlikely(iend != istart)) {
+   PRINT_CALLER(fname, fxn, line);
+   pr_debug(" Ci2: inode=%p istart=%d iend=%d\n",
+inode, istart, iend);
+   }
+   }
+
+   for (bindex = sbstart(sb); bindex < sbmax(sb); bindex++) {
+   if (unlikely(!UNIONFS_I(inode))) {
+   PRINT_CALLER(fname, fxn, line);
+   pr_debug(" Ci3: no inode_info %p\n", inode);
+   return;
+   }
+   if (unlikely(!UNIONFS_I(inode)->lower_inodes)) {
+   PRINT_CALLER(fname, fxn, line);
+   pr_debug(" Ci4: no lower_inodes %p\n", inode);
+   return;
+   }
+   lower_inode = unionfs_lower_inode_idx(inode, bindex);
+   if (lower_inode) {
+   memset(&poison_ptr, POISON_INUSE, sizeof(void *));
+   if (unlikely(bindex < istart || bindex > iend)) {
+   PRINT_CALLER(fname, fxn, line);
+   pr_debug(" Ci5: inode/linode=%p:%p bindex=%d "
+"istart/end=%d:%d\n", inode,
+lower_inode, bindex, istart, iend);
+   } else if (unlikely(lower_inode == poison_ptr)) {
+   /* freed inode! */
+   PRINT_CALLER(fname, fxn, line);
+   pr_debug(" Ci6: inode/linode=%p:%p bindex=%d "
+"istart/end=%d:%d\n", inode,
+lower_inode, bin

[PATCH 13/29] Unionfs: directory reading file operations

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dirfops.c |  290 ++
 1 files changed, 290 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/dirfops.c

diff --git a/fs/unionfs/dirfops.c b/fs/unionfs/dirfops.c
new file mode 100644
index 000..a613862
--- /dev/null
+++ b/fs/unionfs/dirfops.c
@@ -0,0 +1,290 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/* Make sure our rdstate is playing by the rules. */
+static void verify_rdstate_offset(struct unionfs_dir_state *rdstate)
+{
+   BUG_ON(rdstate->offset >= DIREOF);
+   BUG_ON(rdstate->cookie >= MAXRDCOOKIE);
+}
+
+struct unionfs_getdents_callback {
+   struct unionfs_dir_state *rdstate;
+   void *dirent;
+   int entries_written;
+   int filldir_called;
+   int filldir_error;
+   filldir_t filldir;
+   struct super_block *sb;
+};
+
+/* based on generic filldir in fs/readir.c */
+static int unionfs_filldir(void *dirent, const char *name, int namelen,
+  loff_t offset, u64 ino, unsigned int d_type)
+{
+   struct unionfs_getdents_callback *buf = dirent;
+   struct filldir_node *found = NULL;
+   int err = 0;
+   int is_wh_entry = 0;
+
+   buf->filldir_called++;
+
+   if ((namelen > UNIONFS_WHLEN) &&
+   !strncmp(name, UNIONFS_WHPFX, UNIONFS_WHLEN)) {
+   name += UNIONFS_WHLEN;
+   namelen -= UNIONFS_WHLEN;
+   is_wh_entry = 1;
+   }
+
+   found = find_filldir_node(buf->rdstate, name, namelen, is_wh_entry);
+
+   if (found) {
+   /*
+* If we had non-whiteout entry in dir cache, then mark it
+* as a whiteout and but leave it in the dir cache.
+*/
+   if (is_wh_entry && !found->whiteout)
+   found->whiteout = is_wh_entry;
+   goto out;
+   }
+
+   /* if 'name' isn't a whiteout, filldir it. */
+   if (!is_wh_entry) {
+   off_t pos = rdstate2offset(buf->rdstate);
+   u64 unionfs_ino = ino;
+
+   err = buf->filldir(buf->dirent, name, namelen, pos,
+  unionfs_ino, d_type);
+   buf->rdstate->offset++;
+   verify_rdstate_offset(buf->rdstate);
+   }
+   /*
+* If we did fill it, stuff it in our hash, otherwise return an
+* error.
+*/
+   if (err) {
+   buf->filldir_error = err;
+   goto out;
+   }
+   buf->entries_written++;
+   err = add_filldir_node(buf->rdstate, name, namelen,
+  buf->rdstate->bindex, is_wh_entry);
+   if (err)
+   buf->filldir_error = err;
+
+out:
+   return err;
+}
+
+static int unionfs_readdir(struct file *file, void *dirent, filldir_t filldir)
+{
+   int err = 0;
+   struct file *lower_file = NULL;
+   struct inode *inode = NULL;
+   struct unionfs_getdents_callback buf;
+   struct unionfs_dir_state *uds;
+   int bend;
+   loff_t offset;
+
+   unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
+
+   err = unionfs_file_revalidate(file, false);
+   if (unlikely(err))
+   goto out;
+
+   inode = file->f_path.dentry->d_inode;
+
+   uds = UNIONFS_F(file)->rdstate;
+   if (!uds) {
+   if (file->f_pos == DIREOF) {
+   goto out;
+   } else if (file->f_pos > 0) {
+   uds = find_rdstate(inode, file->f_pos);
+   if (unlikely(!uds)) {
+   err = -ESTALE;
+   goto out;
+   }
+   UNIONFS_F(file)->rdstate = uds;
+   } else {
+   init_rdstate(file);
+   uds = UNIONFS_F(file)->rdstate;
+   }
+   }
+   bend = fbend(file);
+
+   while (uds->bindex <= bend) {
+   lower_file = unionfs_lower_file_idx(file, ud

[PATCH 07/29] Unionfs: common file copyup/revalidation operations

2008-01-10 Thread Erez Zadok

Includes open, ioctl, and flush operations.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/commonfops.c |  835 +++
 1 files changed, 835 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/commonfops.c

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
new file mode 100644
index 000..f37192f
--- /dev/null
+++ b/fs/unionfs/commonfops.c
@@ -0,0 +1,835 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * 1) Copyup the file
+ * 2) Rename the file to '.unionfs' - obviously
+ * stolen from NFS's silly rename
+ */
+static int copyup_deleted_file(struct file *file, struct dentry *dentry,
+  int bstart, int bindex)
+{
+   static unsigned int counter;
+   const int i_inosize = sizeof(dentry->d_inode->i_ino) * 2;
+   const int countersize = sizeof(counter) * 2;
+   const int nlen = sizeof(".unionfs") + i_inosize + countersize - 1;
+   char name[nlen + 1];
+   int err;
+   struct dentry *tmp_dentry = NULL;
+   struct dentry *lower_dentry;
+   struct dentry *lower_dir_dentry = NULL;
+
+   lower_dentry = unionfs_lower_dentry_idx(dentry, bstart);
+
+   sprintf(name, ".unionfs%*.*lx",
+   i_inosize, i_inosize, lower_dentry->d_inode->i_ino);
+
+   /*
+* Loop, looking for an unused temp name to copyup to.
+*
+* It's somewhat silly that we look for a free temp tmp name in the
+* source branch (bstart) instead of the dest branch (bindex), where
+* the final name will be created.  We _will_ catch it if somehow
+* the name exists in the dest branch, but it'd be nice to catch it
+* sooner than later.
+*/
+retry:
+   tmp_dentry = NULL;
+   do {
+   char *suffix = name + nlen - countersize;
+
+   dput(tmp_dentry);
+   counter++;
+   sprintf(suffix, "%*.*x", countersize, countersize, counter);
+
+   pr_debug("unionfs: trying to rename %s to %s\n",
+dentry->d_name.name, name);
+
+   tmp_dentry = lookup_one_len(name, lower_dentry->d_parent,
+   nlen);
+   if (IS_ERR(tmp_dentry)) {
+   err = PTR_ERR(tmp_dentry);
+   goto out;
+   }
+   } while (tmp_dentry->d_inode != NULL);  /* need negative dentry */
+   dput(tmp_dentry);
+
+   err = copyup_named_file(dentry->d_parent->d_inode, file, name, bstart,
+   bindex,
+   i_size_read(file->f_path.dentry->d_inode));
+   if (err) {
+   if (unlikely(err == -EEXIST))
+   goto retry;
+   goto out;
+   }
+
+   /* bring it to the same state as an unlinked file */
+   lower_dentry = unionfs_lower_dentry_idx(dentry, dbstart(dentry));
+   if (!unionfs_lower_inode_idx(dentry->d_inode, bindex)) {
+   atomic_inc(&lower_dentry->d_inode->i_count);
+   unionfs_set_lower_inode_idx(dentry->d_inode, bindex,
+   lower_dentry->d_inode);
+   }
+   lower_dir_dentry = lock_parent(lower_dentry);
+   err = vfs_unlink(lower_dir_dentry->d_inode, lower_dentry);
+   unlock_dir(lower_dir_dentry);
+
+out:
+   if (!err)
+   unionfs_check_dentry(dentry);
+   return err;
+}
+
+/*
+ * put all references held by upper struct file and free lower file pointer
+ * array
+ */
+static void cleanup_file(struct file *file)
+{
+   int bindex, bstart, bend;
+   struct file **lower_files;
+   struct file *lower_file;
+   struct super_block *sb = file->f_path.dentry->d_sb;
+
+   lower_files = UNIONFS_F(file)->lower_files;
+   bstart = fbstart(file);
+   bend = fbend(file);
+
+   for (bindex = bstart; bindex <= bend; bindex++) {
+   int i;  /* holds (possibly) updated branch index */
+   int old_bid;
+
+   lower_file = unionfs_lower_file_idx(file, bind

[PATCH 06/29] Unionfs: main header file

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/union.h |  602 
 1 files changed, 602 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/union.h

diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
new file mode 100644
index 000..d324f83
--- /dev/null
+++ b/fs/unionfs/union.h
@@ -0,0 +1,602 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _UNION_H_
+#define _UNION_H_
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include 
+
+/* the file system name */
+#define UNIONFS_NAME "unionfs"
+
+/* unionfs root inode number */
+#define UNIONFS_ROOT_INO 1
+
+/* number of times we try to get a unique temporary file name */
+#define GET_TMPNAM_MAX_RETRY   5
+
+/* maximum number of branches we support, to avoid memory blowup */
+#define UNIONFS_MAX_BRANCHES   128
+
+/* minimum time (seconds) required for time-based cache-coherency */
+#define UNIONFS_MIN_CC_TIME3
+
+/* Operations vectors defined in specific files. */
+extern struct file_operations unionfs_main_fops;
+extern struct file_operations unionfs_dir_fops;
+extern struct inode_operations unionfs_main_iops;
+extern struct inode_operations unionfs_dir_iops;
+extern struct inode_operations unionfs_symlink_iops;
+extern struct super_operations unionfs_sops;
+extern struct dentry_operations unionfs_dops;
+extern struct address_space_operations unionfs_aops;
+
+/* How long should an entry be allowed to persist */
+#define RDCACHE_JIFFIES(5*HZ)
+
+/* file private data. */
+struct unionfs_file_info {
+   int bstart;
+   int bend;
+   atomic_t generation;
+
+   struct unionfs_dir_state *rdstate;
+   struct file **lower_files;
+   int *saved_branch_ids; /* IDs of branches when file was opened */
+};
+
+/* unionfs inode data in memory */
+struct unionfs_inode_info {
+   int bstart;
+   int bend;
+   atomic_t generation;
+   int stale;
+   /* Stuff for readdir over NFS. */
+   spinlock_t rdlock;
+   struct list_head readdircache;
+   int rdcount;
+   int hashsize;
+   int cookie;
+
+   /* The lower inodes */
+   struct inode **lower_inodes;
+
+   struct inode vfs_inode;
+};
+
+/* unionfs dentry data in memory */
+struct unionfs_dentry_info {
+   /*
+* The semaphore is used to lock the dentry as soon as we get into a
+* unionfs function from the VFS.  Our lock ordering is that children
+* go before their parents.
+*/
+   struct mutex lock;
+   int bstart;
+   int bend;
+   int bopaque;
+   int bcount;
+   atomic_t generation;
+   struct path *lower_paths;
+};
+
+/* These are the pointers to our various objects. */
+struct unionfs_data {
+   struct super_block *sb;
+   atomic_t open_files;/* number of open files on branch */
+   int branchperms;
+   int branch_id;  /* unique branch ID at re/mount time */
+};
+
+/* unionfs super-block data in memory */
+struct unionfs_sb_info {
+   int bend;
+
+   atomic_t generation;
+
+   /*
+* This rwsem is used to make sure that a branch management
+* operation...
+*   1) will not begin before all currently in-flight operations
+*  complete.
+*   2) any new operations do not execute until the currently
+*  running branch management operation completes.
+*
+* The write_lock_owner records the PID of the task which grabbed
+* the rw_sem for writing.  If the same task also tries to grab the
+* read lock, we allow it.  This prevents a self-deadlock when
+* branch-management is used on a pivot_root'ed union, because we
+* have to ->lookup paths which belong to the same union.
+*/
+   struct rw_semaphore rwsem;
+   pid_t write_lock_owner; /* PID of rw_sem owner (write lock) */
+   int high_branch_id; /* last unique branch ID given */
+   struct unionfs_data *data;
+};
+
+/*
+ * structure for making the li

[PATCH 20/29] Unionfs: super_block operations

2008-01-10 Thread Erez Zadok

Includes read_inode, delete_inode, put_super, statfs, remount_fs (which
supports branch-management ops), clear_inode, alloc_inode, destroy_inode,
write_inode, umount_begin, and show_options.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/super.c | 1025 
 1 files changed, 1025 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/super.c

diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
new file mode 100644
index 000..986c980
--- /dev/null
+++ b/fs/unionfs/super.c
@@ -0,0 +1,1025 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * The inode cache is used with alloc_inode for both our inode info and the
+ * vfs inode.
+ */
+static struct kmem_cache *unionfs_inode_cachep;
+
+static void unionfs_read_inode(struct inode *inode)
+{
+   int size;
+   struct unionfs_inode_info *info = UNIONFS_I(inode);
+
+   memset(info, 0, offsetof(struct unionfs_inode_info, vfs_inode));
+   info->bstart = -1;
+   info->bend = -1;
+   atomic_set(&info->generation,
+  atomic_read(&UNIONFS_SB(inode->i_sb)->generation));
+   spin_lock_init(&info->rdlock);
+   info->rdcount = 1;
+   info->hashsize = -1;
+   INIT_LIST_HEAD(&info->readdircache);
+
+   size = sbmax(inode->i_sb) * sizeof(struct inode *);
+   info->lower_inodes = kzalloc(size, GFP_KERNEL);
+   if (unlikely(!info->lower_inodes)) {
+   printk(KERN_CRIT "unionfs: no kernel memory when allocating "
+  "lower-pointer array!\n");
+   BUG();
+   }
+
+   inode->i_version++;
+   inode->i_op = &unionfs_main_iops;
+   inode->i_fop = &unionfs_main_fops;
+
+   inode->i_mapping->a_ops = &unionfs_aops;
+
+   /*
+* reset times so unionfs_copy_attr_all can keep out time invariants
+* right (upper inode time being the max of all lower ones).
+*/
+   inode->i_atime.tv_sec = inode->i_atime.tv_nsec = 0;
+   inode->i_mtime.tv_sec = inode->i_mtime.tv_nsec = 0;
+   inode->i_ctime.tv_sec = inode->i_ctime.tv_nsec = 0;
+
+}
+
+/*
+ * we now define delete_inode, because there are two VFS paths that may
+ * destroy an inode: one of them calls clear inode before doing everything
+ * else that's needed, and the other is fine.  This way we truncate the inode
+ * size (and its pages) and then clear our own inode, which will do an iput
+ * on our and the lower inode.
+ *
+ * No need to lock sb info's rwsem.
+ */
+static void unionfs_delete_inode(struct inode *inode)
+{
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+   spin_lock(&inode->i_lock);
+#endif
+   i_size_write(inode, 0); /* every f/s seems to do that */
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+   spin_unlock(&inode->i_lock);
+#endif
+
+   if (inode->i_data.nrpages)
+   truncate_inode_pages(&inode->i_data, 0);
+
+   clear_inode(inode);
+}
+
+/*
+ * final actions when unmounting a file system
+ *
+ * No need to lock rwsem.
+ */
+static void unionfs_put_super(struct super_block *sb)
+{
+   int bindex, bstart, bend;
+   struct unionfs_sb_info *spd;
+   int leaks = 0;
+
+   spd = UNIONFS_SB(sb);
+   if (!spd)
+   return;
+
+   bstart = sbstart(sb);
+   bend = sbend(sb);
+
+   /* Make sure we have no leaks of branchget/branchput. */
+   for (bindex = bstart; bindex <= bend; bindex++)
+   if (unlikely(branch_count(sb, bindex) != 0)) {
+   printk(KERN_CRIT
+  "unionfs: branch %d has %d references left!\n",
+  bindex, branch_count(sb, bindex));
+   leaks = 1;
+   }
+   BUG_ON(leaks != 0);
+
+   kfree(spd->data);
+   kfree(spd);
+   sb->s_fs_info = NULL;
+}
+
+/*
+ * Since people use this to answer the "How big of a file can I write?"
+ * question, we report the size of the highest priority branch as the size of
+ * the union.
+ */
+static int unionfs_sta

Re: [UNIONFS] 00/29 Unionfs and related patches pre-merge review (v2)

2008-01-10 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Christoph Hellwig writes:
> On Thu, Jan 10, 2008 at 09:59:19AM -0500, Erez Zadok wrote:
> > 
> > Dear Linus, Al, Christoph, and Andrew,
> > 
> > As per your request, I'm posting for review the unionfs code (and related
> > code) that's in my korg tree against mainline (v2.6.24-rc7-71-gfd0b45d).
> > This is in preparation for merge in 2.6.25.
> 
> Huh?  There's still aboslutely not fix to the underlying problems of
> the whole idea.   I think we made it pretty clear that unionfs is not
> the way to go, and that we'll get the union mount patches clear once
> the per-mountpoint r/o and unprivilegued mount patches series are in
> and stable.

I'll reiterate what I've said before: unionfs is used today by many users,
it works, and is stable.  After years of working with unionfs, we've settled
on a set of features that users actually use.  This functionality can be in
mainline today.

Unioning at the VFS level, will take a long time to reach the same level of
maturity and support the same set of features.  Based on my years of
practical experience with it, unioning directories seems like a simple idea,
but in practice it's quite hard no matter the approach taken to implement
it.

Existing users of unioning aren't likely to switch to Union Mounts unless it
supports the same set of features.  How long will it realistically take to
get whiteout support in every lower file system that's used by Unionfs
users?  How will Union Mounts support persistent inode numbers at the VFS
level?  Those are just a few of the questions.

I think a better approach would be to start with Unionfs (a standalone file
system that doesn't touch the rest of the kernel).  And as Linux gradually
starts supporting more and more features that help unioning/stacking in
general, to change Unionfs to use those features (e.g., native whiteout
support).  Eventually there could be basic unioning support at the VFS
level, and concurrently a file-system which offers the extra features (e.g.,
persistency).  This can be done w/o affecting user-visible APIs.

Cheers,
Erez.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

NFSv2/3 broken exporting/mounting (permission denied) in 2.6.24-rc4

2007-12-06 Thread Erez Zadok

I get a "permission denied" when trying to mount a localhost nfsv2/3
exported volume, on v2.6.24-rc4-124-gf194d13.  It works w/ nfsv4 mounting.
It worked fine in 2.6.24-rc3.  Here's a sequence of ops I tried:

# mount -t ext2 /dev/hdb1 /n/lower/b0
# exportfs -o no_root_squash,rw localhost:/n/lower/b0
# mount -t nfs -o nfsvers=3 localhost:/n/lower/b0 /mnt

Erez.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [NFS] NFSv2/3 broken exporting/mounting (permission denied) in 2.6.24-rc4

2007-12-07 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, "J. Bruce Fields" writes:
> On Fri, Dec 07, 2007 at 03:00:13PM -0500, Erez Zadok wrote:
[...]
> Those files are actually in a separate filesystem (of type "nfsd") which
> is supposed to be mounted on /proc/fs/nfsd/.   So that mount must have
> failed in the bad case?  It's not immediately obvious to me what this
> patch has to do with that.  Hm.

Yes, it is indeed a separate mount in both cases, but in the broken case,
/proc/fs/nfsd is empty.

The patch in question introduces a proc ->d_revalidate method which does
this:

static int proc_revalidate_dentry(struct dentry *dentry, struct nameidata *nd)
{
d_drop(dentry);
return 0;
}

I'm not sure why it drops the dentry and then returns OK to the VFS; is it
to force the VFS to revalidate the dentry?  In that case, I think it should
return -ESTALE.  I also don't know why /proc needs a ->d_revalidate in the
first place (it was fine up until now).  Perhaps what proc does now is
correct, but its behavior has changed such that nfsd's /proc/fs/nfsd needs
to do something different (like grab an extra dentry ref?).

Anyway, if I comment out the d_drop line in proc_revalidate_dentry, or
remove proc's ->d_revalidate method, nfs exporting works again.

Someone more familiar with this patch and /proc should investigate.  Until
then, nfsv2/3 exporting are broken in 2.6.24-rc4.

> --b.

Cheers,
Erez.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [NFS] NFSv2/3 broken exporting/mounting (permission denied) in 2.6.24-rc4

2007-12-07 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, "J. Bruce Fields" writes:
> On Thu, Dec 06, 2007 at 09:20:41PM -0500, Erez Zadok wrote:
> > I get a "permission denied" when trying to mount a localhost nfsv2/3
> > exported volume, on v2.6.24-rc4-124-gf194d13.  It works w/ nfsv4 mounting.
> > It worked fine in 2.6.24-rc3.  Here's a sequence of ops I tried:
> > 
> > # mount -t ext2 /dev/hdb1 /n/lower/b0
> > # exportfs -o no_root_squash,rw localhost:/n/lower/b0
> > # mount -t nfs -o nfsvers=3 localhost:/n/lower/b0 /mnt
> 
> What do you see if you watch the network traffic in ethereal?
> 
> --b.

Bruce, I'm using nfs-utils-1.0.10-14.fc6 on an FC6 system with all latest
FC6 patches.  Using git-bisect I was able to find the patch which broke it:

commit 2b1e300a9dfc3196ccddf6f1d74b91b7af55e416
Author: Eric W. Biederman <[EMAIL PROTECTED]>
Date:   Sun Dec 2 00:33:17 2007 +1100

[NETNS]: Fix /proc/net breakage

Well I clearly goofed when I added the initial network namespace support
for /proc/net.  Currently things work but there are odd details visible to
user space, even when we have a single network namespace.

Since we do not cache proc_dir_entry dentries at the moment we can just
modify ->lookup to return a different directory inode depending on the
network namespace of the process looking at /proc/net, replacing the
current technique of using a magic and fragile follow_link method.

To accomplish that this patch:
- introduces a shadow_proc method to allow different dentries to
  be returned from proc_lookup.
- Removes the old /proc/net follow_link magic
- Fixes a weakness in our not caching of proc generic dentries.

As shadow_proc uses a task struct to decided which dentry to return we can
go back later and fix the proc generic caching without modifying any code
that uses the shadow_proc method.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Cc: "Rafael J. Wysocki" <[EMAIL PROTECTED]>
Cc: Pavel Machek <[EMAIL PROTECTED]>
Cc: Pavel Emelyanov <[EMAIL PROTECTED]>
Cc: "David S. Miller" <[EMAIL PROTECTED]>
Cc: Ingo Molnar <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

With the above patch, rpc.mountd is unable to open /proc/fs/nfsd/filehandle.
Strace shows:

open("/proc/fs/nfsd/filehandle", O_RDWR|O_LARGEFILE) = -1 ENOENT (No such file 
or directory)

Without the above patch, /proc/fs/nfsd is populated with a number of files,
including "filehandle".

Erez.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [NFS] NFSv2/3 broken exporting/mounting (permission denied) in 2.6.24-rc4

2007-12-07 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, "J. Bruce Fields" writes:
> On Thu, Dec 06, 2007 at 09:20:41PM -0500, Erez Zadok wrote:
> > I get a "permission denied" when trying to mount a localhost nfsv2/3
> > exported volume, on v2.6.24-rc4-124-gf194d13.  It works w/ nfsv4 mounting.
> > It worked fine in 2.6.24-rc3.  Here's a sequence of ops I tried:
> > 
> > # mount -t ext2 /dev/hdb1 /n/lower/b0
> > # exportfs -o no_root_squash,rw localhost:/n/lower/b0
> > # mount -t nfs -o nfsvers=3 localhost:/n/lower/b0 /mnt
> 
> What do you see if you watch the network traffic in ethereal?
> 
> --b.

Bruce, I'm using nfs-utils-1.0.10-14.fc6 on an FC6 system with all latest
FC6 patches.  Using git-bisect I was able to find the patch which broke it:

commit 2b1e300a9dfc3196ccddf6f1d74b91b7af55e416
Author: Eric W. Biederman <[EMAIL PROTECTED]>
Date:   Sun Dec 2 00:33:17 2007 +1100

[NETNS]: Fix /proc/net breakage

Well I clearly goofed when I added the initial network namespace support
for /proc/net.  Currently things work but there are odd details visible to
user space, even when we have a single network namespace.

Since we do not cache proc_dir_entry dentries at the moment we can just
modify ->lookup to return a different directory inode depending on the
network namespace of the process looking at /proc/net, replacing the
current technique of using a magic and fragile follow_link method.

To accomplish that this patch:
- introduces a shadow_proc method to allow different dentries to
  be returned from proc_lookup.
- Removes the old /proc/net follow_link magic
- Fixes a weakness in our not caching of proc generic dentries.

As shadow_proc uses a task struct to decided which dentry to return we can
go back later and fix the proc generic caching without modifying any code
that uses the shadow_proc method.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Cc: "Rafael J. Wysocki" <[EMAIL PROTECTED]>
Cc: Pavel Machek <[EMAIL PROTECTED]>
Cc: Pavel Emelyanov <[EMAIL PROTECTED]>
Cc: "David S. Miller" <[EMAIL PROTECTED]>
Cc: Ingo Molnar <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

With the above patch, rpc.mountd is unable to open /proc/fs/nfsd/filehandle.
Strace shows:

open("/proc/fs/nfsd/filehandle", O_RDWR|O_LARGEFILE) = -1 ENOENT (No such file 
or directory)

Without the above patch, /proc/fs/nfsd is populated with a number of files,
including "filehandle".

Erez.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[GIT PULL -mm] 0/2 Unionfs updates/fixes/cleanups

2007-12-08 Thread Erez Zadok


The following is a series of patches related to Unionfs: just a couple of
small bug fixes and/or optimizations.

These patches were tested (where appropriate) on Linus's 2.6.24 latest code
(as of v2.6.24-rc4-124-gf194d13), MM, as well as the backports to
2.6.{23,22,21,20,19,18,9} on ext2/3/4, xfs, reiserfs, nfs2/3/4, jffs2,
ramfs, tmpfs, cramfs, and squashfs (where available).  See
http://unionfs.filesystems.org/ to download back-ported unionfs code.

Please pull from the 'master' branch of
git://git.kernel.org/pub/scm/linux/kernel/git/ezk/unionfs.git

to receive the following:

Erez Zadok (2):
  Unionfs: cleanup/consolidate branch-mode parsing code
  Unionfs: reduce the amount of cache-coherency debugging messages

 dentry.c |   38 +++---
 main.c   |   44 ++--
 super.c  |   12 
 union.h  |3 +--
 4 files changed, 54 insertions(+), 43 deletions(-)

---
Erez Zadok
[EMAIL PROTECTED]
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 1/2] Unionfs: cleanup/consolidate branch-mode parsing code

2007-12-08 Thread Erez Zadok

Also a bug fix: disallow unrecognized branch modes at mount time, instead of
defaulting to "rw".

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/main.c  |   44 ++--
 fs/unionfs/super.c |   12 
 fs/unionfs/union.h |3 +--
 3 files changed, 31 insertions(+), 28 deletions(-)

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index ffb0da1..22aa6e6 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -256,30 +256,21 @@ static int is_branch_overlap(struct dentry *dent1, struct 
dentry *dent2)
 }
 
 /*
- * Parse branch mode helper function
+ * Parse "ro" or "rw" options, but default to "rw" if no mode options was
+ * specified.  Fill the mode bits in @perms.  If encounter an unknown
+ * string, return -EINVAL.  Otherwise return 0.
  */
-int __parse_branch_mode(const char *name)
+int parse_branch_mode(const char *name, int *perms)
 {
-   if (!name)
+   if (!name || !strcmp(name, "rw")) {
+   *perms = MAY_READ | MAY_WRITE;
return 0;
-   if (!strcmp(name, "ro"))
-   return MAY_READ;
-   if (!strcmp(name, "rw"))
-   return (MAY_READ | MAY_WRITE);
-   return 0;
-}
-
-/*
- * Parse "ro" or "rw" options, but default to "rw" of no mode options
- * was specified.
- */
-int parse_branch_mode(const char *name)
-{
-   int perms = __parse_branch_mode(name);
-
-   if (perms == 0)
-   perms = MAY_READ | MAY_WRITE;
-   return perms;
+   }
+   if (!strcmp(name, "ro")) {
+   *perms = MAY_READ;
+   return 0;
+   }
+   return -EINVAL;
 }
 
 /*
@@ -350,8 +341,17 @@ static int parse_dirs_option(struct super_block *sb, 
struct unionfs_dentry_info
if (mode)
*mode++ = '\0';
 
-   perms = parse_branch_mode(mode);
+   err = parse_branch_mode(mode, &perms);
+   if (err) {
+   printk(KERN_ERR "unionfs: invalid mode \"%s\" for "
+  "branch %d\n", mode, bindex);
+   goto out;
+   }
+   /* ensure that leftmost branch is writeable */
if (!bindex && !(perms & MAY_WRITE)) {
+   printk(KERN_ERR "unionfs: leftmost branch cannot be "
+  "read-only (use \"-o ro\" to create a "
+  "read-only union)\n");
err = -EINVAL;
goto out;
}
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index 88f77d7..d9cf2a7 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -181,8 +181,8 @@ static noinline int do_remount_mode_option(char *optarg, 
int cur_branches,
goto out;
}
*modename++ = '\0';
-   perms = __parse_branch_mode(modename);
-   if (perms == 0) {
+   err = parse_branch_mode(modename, &perms);
+   if (err) {
printk(KERN_ERR "unionfs: invalid mode \"%s\" for \"%s\"\n",
   modename, optarg);
goto out;
@@ -350,13 +350,17 @@ found_insertion_point:
modename = strchr(new_branch, '=');
if (modename)
*modename++ = '\0';
-   perms = parse_branch_mode(modename);
-
if (!new_branch || !*new_branch) {
printk(KERN_ERR "unionfs: null new branch\n");
err = -EINVAL;
goto out;
}
+   err = parse_branch_mode(modename, &perms);
+   if (err) {
+   printk(KERN_ERR "unionfs: invalid mode \"%s\" for "
+  "branch \"%s\"\n", modename, new_branch);
+   goto out;
+   }
err = path_lookup(new_branch, LOOKUP_FOLLOW, &nd);
if (err) {
printk(KERN_ERR "unionfs: error accessing "
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 283b085..20bff7b 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -474,8 +474,7 @@ static inline int is_robranch(const struct dentry *dentry)
  */
 extern char *alloc_whname(const char *name, int len);
 extern int check_branch(struct nameidata *nd);
-extern int __parse_branch_mode(const char *name);
-extern int parse_branch_mode(const char *name);
+extern int parse_branch_mode(const char *name, int *perms);
 
 /*
  * These two functions are here because it is kind of daft to copy and paste
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 2/2] Unionfs: reduce the amount of cache-coherency debugging messages

2007-12-08 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dentry.c |   38 +++---
 1 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 05d9914..7d27987 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -195,9 +195,11 @@ out:
  * file system doesn't change its inode times quick enough, resulting in a
  * false positive indication (which is harmless, it just makes unionfs do
  * extra work in re-validating the objects).  To minimize the chances of
- * these situations, we delay the detection of changed times by
- * UNIONFS_MIN_CC_TIME (which defaults to 3 seconds, as with NFS's
- * acregmin).
+ * these situations, we still consider such small time changes valid, but we
+ * don't print debugging messages unless the time changes are greater than
+ * UNIONFS_MIN_CC_TIME (which defaults to 3 seconds, as with NFS's acregmin)
+ * because significant changes are more likely due to users manually
+ * touching lower files.
  */
 bool is_newer_lower(const struct dentry *dentry)
 {
@@ -219,20 +221,26 @@ bool is_newer_lower(const struct dentry *dentry)
continue;
 
/* check if mtime/ctime have changed */
-   if (unlikely((lower_inode->i_mtime.tv_sec -
- inode->i_mtime.tv_sec) > UNIONFS_MIN_CC_TIME)) {
-   pr_info("unionfs: new lower inode mtime "
-   "(bindex=%d, name=%s)\n", bindex,
-   dentry->d_name.name);
-   show_dinode_times(dentry);
+   if (unlikely(timespec_compare(&inode->i_mtime,
+ &lower_inode->i_mtime) < 0)) {
+   if ((lower_inode->i_mtime.tv_sec -
+inode->i_mtime.tv_sec) > UNIONFS_MIN_CC_TIME) {
+   pr_info("unionfs: new lower inode mtime "
+   "(bindex=%d, name=%s)\n", bindex,
+   dentry->d_name.name);
+   show_dinode_times(dentry);
+   }
return true;
}
-   if (unlikely((lower_inode->i_ctime.tv_sec -
- inode->i_ctime.tv_sec) > UNIONFS_MIN_CC_TIME)) {
-   pr_info("unionfs: new lower inode ctime "
-   "(bindex=%d, name=%s)\n", bindex,
-   dentry->d_name.name);
-   show_dinode_times(dentry);
+   if (unlikely(timespec_compare(&inode->i_ctime,
+ &lower_inode->i_ctime) < 0)) {
+   if ((lower_inode->i_ctime.tv_sec -
+inode->i_ctime.tv_sec) > UNIONFS_MIN_CC_TIME) {
+   pr_info("unionfs: new lower inode ctime "
+   "(bindex=%d, name=%s)\n", bindex,
+   dentry->d_name.name);
+   show_dinode_times(dentry);
+   }
return true;
}
}
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[UNIONFS] 00/42 Unionfs and related patches review

2007-12-09 Thread Erez Zadok


Al, Christoph, and Andrew,

As per your request, I'm posting for review the unionfs code (and related
code) that's in my korg tree against mainline (v2.6.24-rc4-190-g94545ba).
This code is nearly identical to what's in -mm (the mm code has a couple of
additional things that depend on mm-specific patches that aren't in mainline
yet).

I really tried to keep this message short, by offering pointers to more
info, but still there's a bunch of info here.

Andrew, you've asked me to list the main issues that came about in
discussions regarding unionfs, and how were they addressed.  So I've
reviewed my notes from OLS'06, LSF'07, and OLS'07, as well as assorted
postings in mailing lists, and I came up with this prioritized list (in
descending priority order):

1. cache coherency
2. nameidata handling
3. namespace pollution
4. use of ioctls for branch management

(1) Cache coherency: by far, the biggest concern had been around cache
coherency: what happens if someone modifies a lower object
(file/dir/etc.).  I met with Mike Halcrow in October and we discussed
stacking in general; Mike also emphasized that cache-coherency was one
of his most pressing concerns in ecryptfs.

At OLS'06, several suggestions were made, including fancy tricks to hide the
lower namespace or "lock" it so users have readonly access.  None of these
solutions would have been able to easily handle the problem of an existing
open file descriptor on a lower file, and they might have required
significant VFS changes.  Moreover, unionfs users actually want to modify
lower branches directly, and then be able to see their changes reflected in
the union immediately.  So we explored a number of ideas.  We feel that the
VFS is complex enough so we tried our best to handle cache-coherency inside
unionfs.  The solution we have implemented is to compare the mtime/ctime of
upper/lower objects during revalidation (esp. of dentries); and if the lower
times are newer, we reconstruct the union object (drop the older objects,
and re-lookup them).  This time-based cache-coherency works well and is
similar to the NFS model.  Because Unionfs users tend to have a burst of
activity on lower branches, our current cache-coherency also defers the
revalidation actions until absolutely needed, so this idea tends to also be
more efficient for the common usage patterns.  More details about how we
handle cache-coherency are available in our
Documentation/filesystems/unionfs/concepts.txt file.

That said, we're now developing some VFS patches that would allow lower file
systems to more directly inform the upper objects about such (mtime)
changes.  We're exploring a couple of different options but our key goals
are to (a) minimize VFS changes and (b) avoid any changes to lower file
systems.

(2) nameidata handling.  Another important question raised (esp. by NFS
people) was how we handle struct nameidata.  The VFS passes nameidata
structs to file systems, and some file systems use that.  We used to
either pass NULL or the upper nd to the lower f/s.  That caused NULL
de-refs inside nfsv4, among other problems.  We now create our own
nameidata structure, fill it up as needed (esp. for intent data), and
pass it down.  We do this every time we call any VFS function that takes
a nameidata (e.g., vfs_create).  This seems to work well.

There's been some discussion on lkml about splitting struct nameidata in
two, one of which would handle just the intent information.  I'd like to see
that happen, maybe even help, because right now we pass a whole large-ish
struct nameidata for just a couple of intent bits of information that the
lower f/s needs.

(3) namespace pollution.  Unioning readonly and readwrite directories
requires the ability to mask, or white-out, files that are being deleted
from a readonly directory.  Unionfs does this in a portable way, by
creating .wh.XXX files to indicate that file XXX has been whited-out.
This works well on many file systems, but it tends to clutter lower
branches with these .wh.* files.  We recently optimized our whiteout
creation algorithm so it minimizes the number of conditions in which
whiteouts are created, and that helped some people a lot.  But still, if
you unify a readonly and writeable branch, and you try to delete a file
from the readonly branch/medium, there's no way to avoid creating some
sort of a whiteout.  BTW, of course, these whiteouts are completely
hidden from the view of the user who accesses files/dirs via the union.

In the long run, we really hope to see native whiteout support in Linux (ala
BSD).  Of course, this would require a change to the VFS and several native
file systems (possibly even a change to the on-disk format), so we realize
that this isn't likely to happen soon.  If/when native whiteout support was
available, unionfs could easily use it.  Until that time, we have lots of
users who want to use unionfs on top of numerous dif

[PATCH 01/42] Unionfs: filesystems documentation index

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 Documentation/filesystems/00-INDEX |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/Documentation/filesystems/00-INDEX 
b/Documentation/filesystems/00-INDEX
index 1de155e..b168331 100644
--- a/Documentation/filesystems/00-INDEX
+++ b/Documentation/filesystems/00-INDEX
@@ -96,6 +96,8 @@ udf.txt
- info and mount options for the UDF filesystem.
 ufs.txt
- info on the ufs filesystem.
+unionfs/
+   - info on the unionfs filesystem
 vfat.txt
- info on using the VFAT filesystem used in Windows NT and Windows 95
 vfs.txt
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 18/42] Unionfs: directory reading file operations

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dirfops.c |  290 ++
 1 files changed, 290 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/dirfops.c

diff --git a/fs/unionfs/dirfops.c b/fs/unionfs/dirfops.c
new file mode 100644
index 000..88df635
--- /dev/null
+++ b/fs/unionfs/dirfops.c
@@ -0,0 +1,290 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/* Make sure our rdstate is playing by the rules. */
+static void verify_rdstate_offset(struct unionfs_dir_state *rdstate)
+{
+   BUG_ON(rdstate->offset >= DIREOF);
+   BUG_ON(rdstate->cookie >= MAXRDCOOKIE);
+}
+
+struct unionfs_getdents_callback {
+   struct unionfs_dir_state *rdstate;
+   void *dirent;
+   int entries_written;
+   int filldir_called;
+   int filldir_error;
+   filldir_t filldir;
+   struct super_block *sb;
+};
+
+/* based on generic filldir in fs/readir.c */
+static int unionfs_filldir(void *dirent, const char *name, int namelen,
+  loff_t offset, u64 ino, unsigned int d_type)
+{
+   struct unionfs_getdents_callback *buf = dirent;
+   struct filldir_node *found = NULL;
+   int err = 0;
+   int is_wh_entry = 0;
+
+   buf->filldir_called++;
+
+   if ((namelen > UNIONFS_WHLEN) &&
+   !strncmp(name, UNIONFS_WHPFX, UNIONFS_WHLEN)) {
+   name += UNIONFS_WHLEN;
+   namelen -= UNIONFS_WHLEN;
+   is_wh_entry = 1;
+   }
+
+   found = find_filldir_node(buf->rdstate, name, namelen, is_wh_entry);
+
+   if (found) {
+   /*
+* If we had non-whiteout entry in dir cache, then mark it
+* as a whiteout and but leave it in the dir cache.
+*/
+   if (is_wh_entry && !found->whiteout)
+   found->whiteout = is_wh_entry;
+   goto out;
+   }
+
+   /* if 'name' isn't a whiteout, filldir it. */
+   if (!is_wh_entry) {
+   off_t pos = rdstate2offset(buf->rdstate);
+   u64 unionfs_ino = ino;
+
+   err = buf->filldir(buf->dirent, name, namelen, pos,
+  unionfs_ino, d_type);
+   buf->rdstate->offset++;
+   verify_rdstate_offset(buf->rdstate);
+   }
+   /*
+* If we did fill it, stuff it in our hash, otherwise return an
+* error.
+*/
+   if (err) {
+   buf->filldir_error = err;
+   goto out;
+   }
+   buf->entries_written++;
+   err = add_filldir_node(buf->rdstate, name, namelen,
+  buf->rdstate->bindex, is_wh_entry);
+   if (err)
+   buf->filldir_error = err;
+
+out:
+   return err;
+}
+
+static int unionfs_readdir(struct file *file, void *dirent, filldir_t filldir)
+{
+   int err = 0;
+   struct file *lower_file = NULL;
+   struct inode *inode = NULL;
+   struct unionfs_getdents_callback buf;
+   struct unionfs_dir_state *uds;
+   int bend;
+   loff_t offset;
+
+   unionfs_read_lock(file->f_path.dentry->d_sb);
+
+   err = unionfs_file_revalidate(file, false);
+   if (unlikely(err))
+   goto out;
+
+   inode = file->f_path.dentry->d_inode;
+
+   uds = UNIONFS_F(file)->rdstate;
+   if (!uds) {
+   if (file->f_pos == DIREOF) {
+   goto out;
+   } else if (file->f_pos > 0) {
+   uds = find_rdstate(inode, file->f_pos);
+   if (unlikely(!uds)) {
+   err = -ESTALE;
+   goto out;
+   }
+   UNIONFS_F(file)->rdstate = uds;
+   } else {
+   init_rdstate(file);
+   uds = UNIONFS_F(file)->rdstate;
+   }
+   }
+   bend = fbend(file);
+
+   while (uds->bindex <= bend) {
+   lower_file = unionfs_lower_file_idx(file, uds->bindex);
+   if (!lower_file) {
+

[PATCH 02/42] Unionfs: unionfs documentation index

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 Documentation/filesystems/unionfs/00-INDEX |   10 ++
 1 files changed, 10 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/filesystems/unionfs/00-INDEX

diff --git a/Documentation/filesystems/unionfs/00-INDEX 
b/Documentation/filesystems/unionfs/00-INDEX
new file mode 100644
index 000..96fdf67
--- /dev/null
+++ b/Documentation/filesystems/unionfs/00-INDEX
@@ -0,0 +1,10 @@
+00-INDEX
+   - this file.
+concepts.txt
+   - A brief introduction of concepts.
+issues.txt
+   - A summary of known issues with unionfs.
+rename.txt
+   - Information regarding rename operations.
+usage.txt
+   - Usage information and examples.
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 05/42] Unionfs: documentation for any known issues

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 Documentation/filesystems/unionfs/issues.txt |   24 
 1 files changed, 24 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/filesystems/unionfs/issues.txt

diff --git a/Documentation/filesystems/unionfs/issues.txt 
b/Documentation/filesystems/unionfs/issues.txt
new file mode 100644
index 000..9db1d70
--- /dev/null
+++ b/Documentation/filesystems/unionfs/issues.txt
@@ -0,0 +1,24 @@
+KNOWN Unionfs 2.1 ISSUES:
+=
+
+1. Unionfs should not use lookup_one_len() on the underlying f/s as it
+   confuses NFSv4.  Currently, unionfs_lookup() passes lookup intents to the
+   lower file-system, this eliminates part of the problem.  The remaining
+   calls to lookup_one_len may need to be changed to pass an intent.  We are
+   currently introducing VFS changes to fs/namei.c's do_path_lookup() to
+   allow proper file lookup and opening in stackable file systems.
+
+2. Lockdep (a debugging feature) isn't aware of stacking, and so it
+   incorrectly complains about locking problems.  The problem boils down to
+   this: Lockdep considers all objects of a certain type to be in the same
+   class, for example, all inodes.  Lockdep doesn't like to see a lock held
+   on two inodes within the same task, and warns that it could lead to a
+   deadlock.  However, stackable file systems do precisely that: they lock
+   an upper object, and then a lower object, in a strict order to avoid
+   locking problems; in addition, Unionfs, as a fan-out file system, may
+   have to lock several lower inodes.  We are currently looking into Lockdep
+   to see how to make it aware of stackable file systems.  In the meantime,
+   if you get any warnings from Lockdep, you can safely ignore them (or feel
+   free to report them to the Unionfs maintainers, just to be sure).
+
+For more information, see <http://unionfs.filesystems.org/>.
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 03/42] Unionfs: documentation for general concepts

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 Documentation/filesystems/unionfs/concepts.txt |  199 
 1 files changed, 199 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/filesystems/unionfs/concepts.txt

diff --git a/Documentation/filesystems/unionfs/concepts.txt 
b/Documentation/filesystems/unionfs/concepts.txt
new file mode 100644
index 000..7654ccc
--- /dev/null
+++ b/Documentation/filesystems/unionfs/concepts.txt
@@ -0,0 +1,199 @@
+Unionfs 2.1 CONCEPTS:
+=
+
+This file describes the concepts needed by a namespace unification file
+system.
+
+
+Branch Priority:
+
+
+Each branch is assigned a unique priority - starting from 0 (highest
+priority).  No two branches can have the same priority.
+
+
+Branch Mode:
+
+
+Each branch is assigned a mode - read-write or read-only. This allows
+directories on media mounted read-write to be used in a read-only manner.
+
+
+Whiteouts:
+==
+
+A whiteout removes a file name from the namespace. Whiteouts are needed when
+one attempts to remove a file on a read-only branch.
+
+Suppose we have a two-branch union, where branch 0 is read-write and branch
+1 is read-only. And a file 'foo' on branch 1:
+
+./b0/
+./b1/
+./b1/foo
+
+The unified view would simply be:
+
+./union/
+./union/foo
+
+Since 'foo' is stored on a read-only branch, it cannot be removed. A
+whiteout is used to remove the name 'foo' from the unified namespace. Again,
+since branch 1 is read-only, the whiteout cannot be created there. So, we
+try on a higher priority (lower numerically) branch and create the whiteout
+there.
+
+./b0/
+./b0/.wh.foo
+./b1/
+./b1/foo
+
+Later, when Unionfs traverses branches (due to lookup or readdir), it
+eliminate 'foo' from the namespace (as well as the whiteout itself.)
+
+
+Duplicate Elimination:
+==
+
+It is possible for files on different branches to have the same name.
+Unionfs then has to select which instance of the file to show to the user.
+Given the fact that each branch has a priority associated with it, the
+simplest solution is to take the instance from the highest priority
+(numerically lowest value) and "hide" the others.
+
+
+Copyup:
+===
+
+When a change is made to the contents of a file's data or meta-data, they
+have to be stored somewhere. The best way is to create a copy of the
+original file on a branch that is writable, and then redirect the write
+though to this copy. The copy must be made on a higher priority branch so
+that lookup and readdir return this newer "version" of the file rather than
+the original (see duplicate elimination).
+
+
+Cache Coherency:
+
+
+Unionfs users often want to be able to modify files and directories directly
+on the lower branches, and have those changes be visible at the Unionfs
+level.  This means that data (e.g., pages) and meta-data (dentries, inodes,
+open files, etc.) have to be synchronized between the upper and lower
+layers.  In other words, the newest changes from a layer below have to be
+propagated to the Unionfs layer above.  If the two layers are not in sync, a
+cache incoherency ensues, which could lead to application failures and even
+oopses.  The Linux kernel, however, has a rather limited set of mechanisms
+to ensure this inter-layer cache coherency---so Unionfs has to do most of
+the hard work on its own.
+
+Maintaining Invariants:
+
+The way Unionfs ensures cache coherency is as follows.  At each entry point
+to a Unionfs file system method, we call a utility function to validate the
+primary objects of this method.  Generally, we call unionfs_file_revalidate
+on open files, and __unionfs_d_revalidate_chain on dentries (which also
+validates inodes).  These utility functions check to see whether the upper
+Unionfs object is in sync with any of the lower objects that it represents.
+The checks we perform include whether the Unionfs superblock has a newer
+generation number, or if any of the lower objects mtime's or ctime's are
+newer.  (Note: generation numbers change when branch-management commands are
+issued, so in a way, maintaining cache coherency is also very important for
+branch-management.)  If indeed we determine that any Unionfs object is no
+longer in sync with its lower counterparts, then we rebuild that object
+similarly to how we do so for branch-management.
+
+While rebuilding Unionfs's objects, we also purge any page mappings and
+truncate inode pages (see fs/unionfs/dentry.c:purge_inode_data).  This is to
+ensure that Unionfs will re-get the newer data from the lower branches.  We
+perform this purging only if the Unionfs operation in question is a reading
+operation; if Unionfs is performing a data writing operation (e.g., ->write,
+->commit_write, etc.) then we do NOT flush the lower mappings/pages: this is
+because (1) a self-deadlock could

[PATCH 07/42] Unionfs maintainers

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 MAINTAINERS |9 +
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index f3d7256..95f16f0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3805,6 +3805,15 @@ L:   linux-kernel@vger.kernel.org
 W: http://www.kernel.dk
 S: Maintained
 
+UNIONFS
+P:     Erez Zadok
+M: [EMAIL PROTECTED]
+P: Josef "Jeff" Sipek
+M: [EMAIL PROTECTED]
+L: [EMAIL PROTECTED]
+W: http://unionfs.filesystems.org
+S: Maintained
+
 USB ACM DRIVER
 P: Oliver Neukum
 M: [EMAIL PROTECTED]
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 08/42] Makefile: hook to compile unionfs

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/Makefile |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/Makefile b/fs/Makefile
index 500cf15..e202288 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -118,3 +118,4 @@ obj-$(CONFIG_HPPFS) += hppfs/
 obj-$(CONFIG_DEBUG_FS) += debugfs/
 obj-$(CONFIG_OCFS2_FS) += ocfs2/
 obj-$(CONFIG_GFS2_FS)   += gfs2/
+obj-$(CONFIG_UNION_FS) += unionfs/
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 04/42] Unionfs: usage documentation for users

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 Documentation/filesystems/unionfs/usage.txt |  115 +++
 1 files changed, 115 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/filesystems/unionfs/usage.txt

diff --git a/Documentation/filesystems/unionfs/usage.txt 
b/Documentation/filesystems/unionfs/usage.txt
new file mode 100644
index 000..a6b1aca
--- /dev/null
+++ b/Documentation/filesystems/unionfs/usage.txt
@@ -0,0 +1,115 @@
+Unionfs is a stackable unification file system, which can appear to merge
+the contents of several directories (branches), while keeping their physical
+content separate.  Unionfs is useful for unified source tree management,
+merged contents of split CD-ROM, merged separate software package
+directories, data grids, and more.  Unionfs allows any mix of read-only and
+read-write branches, as well as insertion and deletion of branches anywhere
+in the fan-out.  To maintain Unix semantics, Unionfs handles elimination of
+duplicates, partial-error conditions, and more.
+
+# mount -t unionfs -o branch-option[,union-options[,...]] none MOUNTPOINT
+
+The available branch-option for the mount command is:
+
+   dirs=branch[=ro|=rw][:...]
+
+specifies a separated list of which directories compose the union.
+Directories that come earlier in the list have a higher precedence than
+those which come later. Additionally, read-only or read-write permissions of
+the branch can be specified by appending =ro or =rw (default) to each
+directory.
+
+Syntax:
+
+   dirs=/branch1[=ro|=rw]:/branch2[=ro|=rw]:...:/branchN[=ro|=rw]
+
+Example:
+
+   dirs=/writable_branch=rw:/read-only_branch=ro
+
+
+DYNAMIC BRANCH MANAGEMENT AND REMOUNTS
+==
+
+You can remount a union and change its overall mode, or reconfigure the
+branches, as follows.
+
+To downgrade a union from read-write to read-only:
+
+# mount -t unionfs -o remount,ro none MOUNTPOINT
+
+To upgrade a union from read-only to read-write:
+
+# mount -t unionfs -o remount,rw none MOUNTPOINT
+
+To delete a branch /foo, regardless where it is in the current union:
+
+# mount -t unionfs -o remount,del=/foo none MOUNTPOINT
+
+To insert (add) a branch /foo before /bar:
+
+# mount -t unionfs -o remount,add=/bar:/foo none MOUNTPOINT
+
+To insert (add) a branch /foo (with the "rw" mode flag) before /bar:
+
+# mount -t unionfs -o remount,add=/bar:/foo=rw none MOUNTPOINT
+
+To insert (add) a branch /foo (in "rw" mode) at the very beginning (i.e., a
+new highest-priority branch), you can use the above syntax, or use a short
+hand version as follows:
+
+# mount -t unionfs -o remount,add=/foo none MOUNTPOINT
+
+To append a branch to the very end (new lowest-priority branch):
+
+# mount -t unionfs -o remount,add=:/foo none MOUNTPOINT
+
+To append a branch to the very end (new lowest-priority branch), in
+read-only mode:
+
+# mount -t unionfs -o remount,add=:/foo=ro none MOUNTPOINT
+
+Finally, to change the mode of one existing branch, say /foo, from read-only
+to read-write, and change /bar from read-write to read-only:
+
+# mount -t unionfs -o remount,mode=/foo=rw,mode=/bar=ro none MOUNTPOINT
+
+Note: in Unionfs 2.x, you cannot set the leftmost branch to readonly because
+then Unionfs won't have any writable place for copyups to take place.
+Moreover, the VFS can get confused when it tries to modify something in a
+file system mounted read-write, but isn't permitted to write to it.
+Instead, you should set the whole union as readonly, as described above.
+If, however, you must set the leftmost branch as readonly, perhaps so you
+can get a snapshot of it at a point in time, then you should insert a new
+writable top-level branch, and mark the one you want as readonly.  This can
+be accomplished as follows, assuming that /foo is your current leftmost
+branch:
+
+# mount -t tmpfs -o size=NNN /new
+# mount -t unionfs -o remount,add=/new,mode=/foo=ro none MOUNTPOINT
+
+# mount -t unionfs -o remount,del=/new,mode=/foo=rw none MOUNTPOINT
+
+# umount /new
+
+CACHE CONSISTENCY
+=
+
+If you modify any file on any of the lower branches directly, while there is
+a Unionfs 2.1 mounted above any of those branches, you should tell Unionfs
+to purge its caches and re-get the objects.  To do that, you have to
+increment the generation number of the superblock using the following
+command:
+
+# mount -t unionfs -o remount,incgen none MOUNTPOINT
+
+Note that the older way of incrementing the generation number using an
+ioctl, is no longer supported in Unionfs 2.0 and newer.  Ioctls in general
+are not encouraged.  Plus, an ioctl is per-file concept, whereas the
+generation number is a per-file-system concept.  Worse, such an ioctl
+requires an open file, which then has to be invalidated by the very nature
+of the generation number increase (read: the old generation increase ioctl
+was pretty racy).
+
+
+For more information, see <http:/

[PATCH 22/42] Unionfs: unlink/rmdir operations

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/unlink.c |  236 +++
 1 files changed, 236 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/unlink.c

diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
new file mode 100644
index 000..423ff36
--- /dev/null
+++ b/fs/unionfs/unlink.c
@@ -0,0 +1,236 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/* unlink a file by creating a whiteout */
+static int unionfs_unlink_whiteout(struct inode *dir, struct dentry *dentry)
+{
+   struct dentry *lower_dentry;
+   struct dentry *lower_dir_dentry;
+   int bindex;
+   int err = 0;
+
+   err = unionfs_partial_lookup(dentry);
+   if (err)
+   goto out;
+
+   bindex = dbstart(dentry);
+
+   lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   if (!lower_dentry)
+   goto out;
+
+   lower_dir_dentry = lock_parent(lower_dentry);
+
+   /* avoid destroying the lower inode if the file is in use */
+   dget(lower_dentry);
+   err = is_robranch_super(dentry->d_sb, bindex);
+   if (!err)
+   err = vfs_unlink(lower_dir_dentry->d_inode, lower_dentry);
+   /* if vfs_unlink succeeded, update our inode's times */
+   if (!err)
+   unionfs_copy_attr_times(dentry->d_inode);
+   dput(lower_dentry);
+   fsstack_copy_attr_times(dir, lower_dir_dentry->d_inode);
+   unlock_dir(lower_dir_dentry);
+
+   if (err && !IS_COPYUP_ERR(err))
+   goto out;
+
+   /*
+* We create whiteouts if (1) there was an error unlinking the main
+* file; (2) there is a lower priority file with the same name
+* (dbopaque); (3) the branch in which the file is not the last
+* (rightmost0 branch.  The last rule is an optimization to avoid
+* creating all those whiteouts if there's no chance they'd be
+* masking any lower-priority branch, as well as unionfs is used
+* with only one branch (using only one branch, while odd, is still
+* possible).
+*/
+   if (err) {
+   if (dbstart(dentry) == 0)
+   goto out;
+   err = create_whiteout(dentry, dbstart(dentry) - 1);
+   } else if (dbopaque(dentry) != -1) {
+   err = create_whiteout(dentry, dbopaque(dentry));
+   } else if (dbstart(dentry) < sbend(dentry->d_sb)) {
+   err = create_whiteout(dentry, dbstart(dentry));
+   }
+
+out:
+   if (!err)
+   dentry->d_inode->i_nlink--;
+
+   /* We don't want to leave negative leftover dentries for revalidate. */
+   if (!err && (dbopaque(dentry) != -1))
+   update_bstart(dentry);
+
+   return err;
+}
+
+int unionfs_unlink(struct inode *dir, struct dentry *dentry)
+{
+   int err = 0;
+
+   unionfs_read_lock(dentry->d_sb);
+   unionfs_lock_dentry(dentry);
+
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
+   err = -ESTALE;
+   goto out;
+   }
+   unionfs_check_dentry(dentry);
+
+   err = unionfs_unlink_whiteout(dir, dentry);
+   /* call d_drop so the system "forgets" about us */
+   if (!err) {
+   if (!S_ISDIR(dentry->d_inode->i_mode))
+   unionfs_postcopyup_release(dentry);
+   d_drop(dentry);
+   /*
+* if unlink/whiteout succeeded, parent dir mtime has
+* changed
+*/
+   unionfs_copy_attr_times(dir);
+   }
+
+out:
+   if (!err) {
+   unionfs_check_dentry(dentry);
+   unionfs_check_inode(dir);
+   }
+   unionfs_unlock_dentry(dentry);
+   unionfs_read_unlock(dentry->d_sb);
+   return err;
+}
+
+static int unionfs_rmdir_first(struct inode *dir, struct dentry *dentry,
+  struct unionfs_dir_state *namelist)
+{
+   int err;
+   struct dentry *lower_dentry;
+   struct dentry *lower_dir_dentry = NULL;
+
+   /* Here we need to remove whiteout entries. */
+   err = delete_whi

[PATCH 27/42] Unionfs: async I/O queue headers

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/sioq.h |   92 +
 1 files changed, 92 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/sioq.h

diff --git a/fs/unionfs/sioq.h b/fs/unionfs/sioq.h
new file mode 100644
index 000..afb71ee
--- /dev/null
+++ b/fs/unionfs/sioq.h
@@ -0,0 +1,92 @@
+/*
+ * Copyright (c) 2006-2007 Erez Zadok
+ * Copyright (c) 2006  Charles P. Wright
+ * Copyright (c) 2006-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2006  Junjiro Okajima
+ * Copyright (c) 2006  David P. Quigley
+ * Copyright (c) 2006-2007 Stony Brook University
+ * Copyright (c) 2006-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _SIOQ_H
+#define _SIOQ_H
+
+struct deletewh_args {
+   struct unionfs_dir_state *namelist;
+   struct dentry *dentry;
+   int bindex;
+};
+
+struct is_opaque_args {
+   struct dentry *dentry;
+};
+
+struct create_args {
+   struct inode *parent;
+   struct dentry *dentry;
+   umode_t mode;
+   struct nameidata *nd;
+};
+
+struct mkdir_args {
+   struct inode *parent;
+   struct dentry *dentry;
+   umode_t mode;
+};
+
+struct mknod_args {
+   struct inode *parent;
+   struct dentry *dentry;
+   umode_t mode;
+   dev_t dev;
+};
+
+struct symlink_args {
+   struct inode *parent;
+   struct dentry *dentry;
+   char *symbuf;
+   umode_t mode;
+};
+
+struct unlink_args {
+   struct inode *parent;
+   struct dentry *dentry;
+};
+
+
+struct sioq_args {
+   struct completion comp;
+   struct work_struct work;
+   int err;
+   void *ret;
+
+   union {
+   struct deletewh_args deletewh;
+   struct is_opaque_args is_opaque;
+   struct create_args create;
+   struct mkdir_args mkdir;
+   struct mknod_args mknod;
+   struct symlink_args symlink;
+   struct unlink_args unlink;
+   };
+};
+
+/* Extern definitions for SIOQ functions */
+extern int __init init_sioq(void);
+extern void stop_sioq(void);
+extern void run_sioq(work_func_t func, struct sioq_args *args);
+
+/* Extern definitions for our privilege escalation helpers */
+extern void __unionfs_create(struct work_struct *work);
+extern void __unionfs_mkdir(struct work_struct *work);
+extern void __unionfs_mknod(struct work_struct *work);
+extern void __unionfs_symlink(struct work_struct *work);
+extern void __unionfs_unlink(struct work_struct *work);
+extern void __delete_whiteouts(struct work_struct *work);
+extern void __is_opaque_dir(struct work_struct *work);
+
+#endif /* not _SIOQ_H */
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 09/42] Unionfs: main Makefile

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/Makefile |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/Makefile

diff --git a/fs/unionfs/Makefile b/fs/unionfs/Makefile
new file mode 100644
index 000..17ca4a7
--- /dev/null
+++ b/fs/unionfs/Makefile
@@ -0,0 +1,13 @@
+obj-$(CONFIG_UNION_FS) += unionfs.o
+
+unionfs-y := subr.o dentry.o file.o inode.o main.o super.o \
+   rdstate.o copyup.o dirhelper.o rename.o unlink.o \
+   lookup.o commonfops.o dirfops.o sioq.o mmap.o
+
+unionfs-$(CONFIG_UNION_FS_XATTR) += xattr.o
+
+unionfs-$(CONFIG_UNION_FS_DEBUG) += debug.o
+
+ifeq ($(CONFIG_UNION_FS_DEBUG),y)
+EXTRA_CFLAGS += -DDEBUG
+endif
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 28/42] Unionfs: async I/O queue operations

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/sioq.c |  119 +
 1 files changed, 119 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/sioq.c

diff --git a/fs/unionfs/sioq.c b/fs/unionfs/sioq.c
new file mode 100644
index 000..2a8c88e
--- /dev/null
+++ b/fs/unionfs/sioq.c
@@ -0,0 +1,119 @@
+/*
+ * Copyright (c) 2006-2007 Erez Zadok
+ * Copyright (c) 2006  Charles P. Wright
+ * Copyright (c) 2006-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2006  Junjiro Okajima
+ * Copyright (c) 2006  David P. Quigley
+ * Copyright (c) 2006-2007 Stony Brook University
+ * Copyright (c) 2006-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * Super-user IO work Queue - sometimes we need to perform actions which
+ * would fail due to the unix permissions on the parent directory (e.g.,
+ * rmdir a directory which appears empty, but in reality contains
+ * whiteouts).
+ */
+
+static struct workqueue_struct *superio_workqueue;
+
+int __init init_sioq(void)
+{
+   int err;
+
+   superio_workqueue = create_workqueue("unionfs_siod");
+   if (!IS_ERR(superio_workqueue))
+   return 0;
+
+   err = PTR_ERR(superio_workqueue);
+   printk(KERN_ERR "unionfs: create_workqueue failed %d\n", err);
+   superio_workqueue = NULL;
+   return err;
+}
+
+void stop_sioq(void)
+{
+   if (superio_workqueue)
+   destroy_workqueue(superio_workqueue);
+}
+
+void run_sioq(work_func_t func, struct sioq_args *args)
+{
+   INIT_WORK(&args->work, func);
+
+   init_completion(&args->comp);
+   while (!queue_work(superio_workqueue, &args->work)) {
+   /* TODO: do accounting if needed */
+   schedule();
+   }
+   wait_for_completion(&args->comp);
+}
+
+void __unionfs_create(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct create_args *c = &args->create;
+
+   args->err = vfs_create(c->parent, c->dentry, c->mode, c->nd);
+   complete(&args->comp);
+}
+
+void __unionfs_mkdir(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct mkdir_args *m = &args->mkdir;
+
+   args->err = vfs_mkdir(m->parent, m->dentry, m->mode);
+   complete(&args->comp);
+}
+
+void __unionfs_mknod(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct mknod_args *m = &args->mknod;
+
+   args->err = vfs_mknod(m->parent, m->dentry, m->mode, m->dev);
+   complete(&args->comp);
+}
+
+void __unionfs_symlink(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct symlink_args *s = &args->symlink;
+
+   args->err = vfs_symlink(s->parent, s->dentry, s->symbuf, s->mode);
+   complete(&args->comp);
+}
+
+void __unionfs_unlink(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct unlink_args *u = &args->unlink;
+
+   args->err = vfs_unlink(u->parent, u->dentry);
+   complete(&args->comp);
+}
+
+void __delete_whiteouts(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct deletewh_args *d = &args->deletewh;
+
+   args->err = do_delete_whiteouts(d->dentry, d->bindex, d->namelist);
+   complete(&args->comp);
+}
+
+void __is_opaque_dir(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+
+   args->ret = lookup_one_len(UNIONFS_DIR_OPAQUE, args->is_opaque.dentry,
+  sizeof(UNIONFS_DIR_OPAQUE) - 1);
+   complete(&args->comp);
+}
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 26/42] Unionfs: extended attributes operations

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/xattr.c |  153 
 1 files changed, 153 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/xattr.c

diff --git a/fs/unionfs/xattr.c b/fs/unionfs/xattr.c
new file mode 100644
index 000..00c6d0d
--- /dev/null
+++ b/fs/unionfs/xattr.c
@@ -0,0 +1,153 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/* This is lifted from fs/xattr.c */
+void *unionfs_xattr_alloc(size_t size, size_t limit)
+{
+   void *ptr;
+
+   if (size > limit)
+   return ERR_PTR(-E2BIG);
+
+   if (!size)  /* size request, no buffer is needed */
+   return NULL;
+
+   ptr = kmalloc(size, GFP_KERNEL);
+   if (unlikely(!ptr))
+   return ERR_PTR(-ENOMEM);
+   return ptr;
+}
+
+/*
+ * BKL held by caller.
+ * dentry->d_inode->i_mutex locked
+ */
+ssize_t unionfs_getxattr(struct dentry *dentry, const char *name, void *value,
+size_t size)
+{
+   struct dentry *lower_dentry = NULL;
+   int err = -EOPNOTSUPP;
+
+   unionfs_read_lock(dentry->d_sb);
+   unionfs_lock_dentry(dentry);
+
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
+   err = -ESTALE;
+   goto out;
+   }
+
+   lower_dentry = unionfs_lower_dentry(dentry);
+
+   err = vfs_getxattr(lower_dentry, (char *) name, value, size);
+
+out:
+   unionfs_check_dentry(dentry);
+   unionfs_unlock_dentry(dentry);
+   unionfs_read_unlock(dentry->d_sb);
+   return err;
+}
+
+/*
+ * BKL held by caller.
+ * dentry->d_inode->i_mutex locked
+ */
+int unionfs_setxattr(struct dentry *dentry, const char *name,
+const void *value, size_t size, int flags)
+{
+   struct dentry *lower_dentry = NULL;
+   int err = -EOPNOTSUPP;
+
+   unionfs_read_lock(dentry->d_sb);
+   unionfs_lock_dentry(dentry);
+
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
+   err = -ESTALE;
+   goto out;
+   }
+
+   lower_dentry = unionfs_lower_dentry(dentry);
+
+   err = vfs_setxattr(lower_dentry, (char *) name, (void *) value,
+  size, flags);
+
+out:
+   unionfs_check_dentry(dentry);
+   unionfs_unlock_dentry(dentry);
+   unionfs_read_unlock(dentry->d_sb);
+   return err;
+}
+
+/*
+ * BKL held by caller.
+ * dentry->d_inode->i_mutex locked
+ */
+int unionfs_removexattr(struct dentry *dentry, const char *name)
+{
+   struct dentry *lower_dentry = NULL;
+   int err = -EOPNOTSUPP;
+
+   unionfs_read_lock(dentry->d_sb);
+   unionfs_lock_dentry(dentry);
+
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
+   err = -ESTALE;
+   goto out;
+   }
+
+   lower_dentry = unionfs_lower_dentry(dentry);
+
+   err = vfs_removexattr(lower_dentry, (char *) name);
+
+out:
+   unionfs_check_dentry(dentry);
+   unionfs_unlock_dentry(dentry);
+   unionfs_read_unlock(dentry->d_sb);
+   return err;
+}
+
+/*
+ * BKL held by caller.
+ * dentry->d_inode->i_mutex locked
+ */
+ssize_t unionfs_listxattr(struct dentry *dentry, char *list, size_t size)
+{
+   struct dentry *lower_dentry = NULL;
+   int err = -EOPNOTSUPP;
+   char *encoded_list = NULL;
+
+   unionfs_read_lock(dentry->d_sb);
+   unionfs_lock_dentry(dentry);
+
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
+   err = -ESTALE;
+   goto out;
+   }
+
+   lower_dentry = unionfs_lower_dentry(dentry);
+
+   encoded_list = list;
+   err = vfs_listxattr(lower_dentry, encoded_list, size);
+
+out:
+   unionfs_check_dentry(dentry);
+   unionfs_unlock_dentry(dentry);
+   unionfs_read_unlock(dentry->d_sb);
+   return err;
+}
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 15/42] Unionfs: dentry revalidation

2007-12-09 Thread Erez Zadok

Includes d_release methods and cache-coherency support for dentries.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dentry.c |  498 +++
 1 files changed, 498 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/dentry.c

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
new file mode 100644
index 000..7d27987
--- /dev/null
+++ b/fs/unionfs/dentry.c
@@ -0,0 +1,498 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * Revalidate a single dentry.
+ * Assume that dentry's info node is locked.
+ * Assume that parent(s) are all valid already, but
+ * the child may not yet be valid.
+ * Returns true if valid, false otherwise.
+ */
+static bool __unionfs_d_revalidate_one(struct dentry *dentry,
+  struct nameidata *nd)
+{
+   bool valid = true;  /* default is valid */
+   struct dentry *lower_dentry;
+   int bindex, bstart, bend;
+   int sbgen, dgen;
+   int positive = 0;
+   int locked = 0;
+   int interpose_flag;
+   struct nameidata lowernd; /* TODO: be gentler to the stack */
+
+   if (nd)
+   memcpy(&lowernd, nd, sizeof(struct nameidata));
+   else
+   memset(&lowernd, 0, sizeof(struct nameidata));
+
+   verify_locked(dentry);
+
+   /* if the dentry is unhashed, do NOT revalidate */
+   if (d_deleted(dentry))
+   goto out;
+
+   BUG_ON(dbstart(dentry) == -1);
+   if (dentry->d_inode)
+   positive = 1;
+   dgen = atomic_read(&UNIONFS_D(dentry)->generation);
+   sbgen = atomic_read(&UNIONFS_SB(dentry->d_sb)->generation);
+   /*
+* If we are working on an unconnected dentry, then there is no
+* revalidation to be done, because this file does not exist within
+* the namespace, and Unionfs operates on the namespace, not data.
+*/
+   if (unlikely(sbgen != dgen)) {
+   struct dentry *result;
+   int pdgen;
+
+   /* The root entry should always be valid */
+   BUG_ON(IS_ROOT(dentry));
+
+   /* We can't work correctly if our parent isn't valid. */
+   pdgen = atomic_read(&UNIONFS_D(dentry->d_parent)->generation);
+   BUG_ON(pdgen != sbgen); /* should never happen here */
+
+   /* Free the pointers for our inodes and this dentry. */
+   bstart = dbstart(dentry);
+   bend = dbend(dentry);
+   if (bstart >= 0) {
+   struct dentry *lower_dentry;
+   for (bindex = bstart; bindex <= bend; bindex++) {
+   lower_dentry =
+   unionfs_lower_dentry_idx(dentry,
+bindex);
+   dput(lower_dentry);
+   }
+   }
+   set_dbstart(dentry, -1);
+   set_dbend(dentry, -1);
+
+   interpose_flag = INTERPOSE_REVAL_NEG;
+   if (positive) {
+   interpose_flag = INTERPOSE_REVAL;
+   /*
+* During BRM, the VFS could already hold a lock on
+* a file being read, so don't lock it again
+* (deadlock), but if you lock it in this function,
+* then release it here too.
+*/
+   if (!mutex_is_locked(&dentry->d_inode->i_mutex)) {
+   mutex_lock(&dentry->d_inode->i_mutex);
+   locked = 1;
+   }
+
+   bstart = ibstart(dentry->d_inode);
+   bend = ibend(dentry->d_inode);
+   if (bstart >= 0) {
+   struct inode *lower_inode;
+   for (bindex = bstart; bindex <= bend;
+bindex++) {
+   lower_inode =
+

[PATCH 29/42] Unionfs: miscellaneous helper routines

2007-12-09 Thread Erez Zadok

Mostly related to whiteouts.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/subr.c |  242 +
 1 files changed, 242 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/subr.c

diff --git a/fs/unionfs/subr.c b/fs/unionfs/subr.c
new file mode 100644
index 000..1a26c57
--- /dev/null
+++ b/fs/unionfs/subr.c
@@ -0,0 +1,242 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * Pass an unionfs dentry and an index.  It will try to create a whiteout
+ * for the filename in dentry, and will try in branch 'index'.  On error,
+ * it will proceed to a branch to the left.
+ */
+int create_whiteout(struct dentry *dentry, int start)
+{
+   int bstart, bend, bindex;
+   struct dentry *lower_dir_dentry;
+   struct dentry *lower_dentry;
+   struct dentry *lower_wh_dentry;
+   struct nameidata nd;
+   char *name = NULL;
+   int err = -EINVAL;
+
+   verify_locked(dentry);
+
+   bstart = dbstart(dentry);
+   bend = dbend(dentry);
+
+   /* create dentry's whiteout equivalent */
+   name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+   if (unlikely(IS_ERR(name))) {
+   err = PTR_ERR(name);
+   goto out;
+   }
+
+   for (bindex = start; bindex >= 0; bindex--) {
+   lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+
+   if (!lower_dentry) {
+   /*
+* if lower dentry is not present, create the
+* entire lower dentry directory structure and go
+* ahead.  Since we want to just create whiteout, we
+* only want the parent dentry, and hence get rid of
+* this dentry.
+*/
+   lower_dentry = create_parents(dentry->d_inode,
+ dentry,
+ dentry->d_name.name,
+ bindex);
+   if (!lower_dentry || IS_ERR(lower_dentry)) {
+   int ret = PTR_ERR(lower_dentry);
+   if (!IS_COPYUP_ERR(ret))
+   printk(KERN_ERR
+  "unionfs: create_parents for "
+  "whiteout failed: bindex=%d "
+  "err=%d\n", bindex, ret);
+   continue;
+   }
+   }
+
+   lower_wh_dentry =
+   lookup_one_len(name, lower_dentry->d_parent,
+  dentry->d_name.len + UNIONFS_WHLEN);
+   if (IS_ERR(lower_wh_dentry))
+   continue;
+
+   /*
+* The whiteout already exists. This used to be impossible,
+* but now is possible because of opaqueness.
+*/
+   if (lower_wh_dentry->d_inode) {
+   dput(lower_wh_dentry);
+   err = 0;
+   goto out;
+   }
+
+   err = init_lower_nd(&nd, LOOKUP_CREATE);
+   if (unlikely(err < 0))
+   goto out;
+   lower_dir_dentry = lock_parent(lower_wh_dentry);
+   err = is_robranch_super(dentry->d_sb, bindex);
+   if (!err)
+   err = vfs_create(lower_dir_dentry->d_inode,
+lower_wh_dentry,
+~current->fs->umask & S_IRWXUGO,
+&nd);
+   unlock_dir(lower_dir_dentry);
+   dput(lower_wh_dentry);
+   release_lower_nd(&nd, err);
+
+   if (!err || !IS_COPYUP_ERR(err))
+   break;
+   }
+
+   /* set dbopaque so that lookup will not proceed after this branch */
+   if (!err)
+   set_dbopaque(d

[PATCH 31/42] VFS: fs_stack header cleanups

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 include/linux/fs_stack.h |   21 -
 1 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/include/linux/fs_stack.h b/include/linux/fs_stack.h
index bb516ce..6b52faf 100644
--- a/include/linux/fs_stack.h
+++ b/include/linux/fs_stack.h
@@ -1,17 +1,28 @@
+/*
+ * Copyright (c) 2006-2007 Erez Zadok
+ * Copyright (c) 2006-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2006-2007 Stony Brook University
+ * Copyright (c) 2006-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
 #ifndef _LINUX_FS_STACK_H
 #define _LINUX_FS_STACK_H
 
-/* This file defines generic functions used primarily by stackable
+/*
+ * This file defines generic functions used primarily by stackable
  * filesystems; none of these functions require i_mutex to be held.
  */
 
 #include 
 
 /* externs for fs/stack.c */
-extern void fsstack_copy_attr_all(struct inode *dest, const struct inode *src,
-   int (*get_nlinks)(struct inode *));
-
-extern void fsstack_copy_inode_size(struct inode *dst, const struct inode 
*src);
+extern void fsstack_copy_attr_all(struct inode *dest, const struct inode *src);
+extern void fsstack_copy_inode_size(struct inode *dst,
+   const struct inode *src);
 
 /* inlines */
 static inline void fsstack_copy_attr_atime(struct inode *dest,
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 41/42] eCryptfs: use simplified fs_stack API for inode operations

2007-12-09 Thread Erez Zadok

CC: Mike Halcrow <[EMAIL PROTECTED]>

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/ecryptfs/inode.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 0b1ab01..a846557 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -588,9 +588,9 @@ ecryptfs_rename(struct inode *old_dir, struct dentry 
*old_dentry,
lower_new_dir_dentry->d_inode, lower_new_dentry);
if (rc)
goto out_lock;
-   fsstack_copy_attr_all(new_dir, lower_new_dir_dentry->d_inode, NULL);
+   fsstack_copy_attr_all(new_dir, lower_new_dir_dentry->d_inode);
if (new_dir != old_dir)
-   fsstack_copy_attr_all(old_dir, lower_old_dir_dentry->d_inode, 
NULL);
+   fsstack_copy_attr_all(old_dir, lower_old_dir_dentry->d_inode);
 out_lock:
unlock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
dput(lower_new_dentry->d_parent);
@@ -924,7 +924,7 @@ static int ecryptfs_setattr(struct dentry *dentry, struct 
iattr *ia)
 
rc = notify_change(lower_dentry, ia);
 out:
-   fsstack_copy_attr_all(inode, lower_inode, NULL);
+   fsstack_copy_attr_all(inode, lower_inode);
return rc;
 }
 
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 38/42] VFS: simplified fsstack_copy_attr_all

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/stack.c |   30 +-
 1 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/fs/stack.c b/fs/stack.c
index 67716f6..a548aac 100644
--- a/fs/stack.c
+++ b/fs/stack.c
@@ -1,8 +1,20 @@
+/*
+ * Copyright (c) 2006-2007 Erez Zadok
+ * Copyright (c) 2006-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2006-2007 Stony Brook University
+ * Copyright (c) 2006-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
 #include 
 #include 
 #include 
 
-/* does _NOT_ require i_mutex to be held.
+/*
+ * does _NOT_ require i_mutex to be held.
  *
  * This function cannot be inlined since i_size_{read,write} is rather
  * heavy-weight on 32-bit systems
@@ -14,11 +26,11 @@ void fsstack_copy_inode_size(struct inode *dst, const 
struct inode *src)
 }
 EXPORT_SYMBOL_GPL(fsstack_copy_inode_size);
 
-/* copy all attributes; get_nlinks is optional way to override the i_nlink
+/*
+ * copy all attributes; get_nlinks is optional way to override the i_nlink
  * copying
  */
-void fsstack_copy_attr_all(struct inode *dest, const struct inode *src,
-   int (*get_nlinks)(struct inode *))
+void fsstack_copy_attr_all(struct inode *dest, const struct inode *src)
 {
dest->i_mode = src->i_mode;
dest->i_uid = src->i_uid;
@@ -29,14 +41,6 @@ void fsstack_copy_attr_all(struct inode *dest, const struct 
inode *src,
dest->i_ctime = src->i_ctime;
dest->i_blkbits = src->i_blkbits;
dest->i_flags = src->i_flags;
-
-   /*
-* Update the nlinks AFTER updating the above fields, because the
-* get_links callback may depend on them.
-*/
-   if (!get_nlinks)
-   dest->i_nlink = src->i_nlink;
-   else
-   dest->i_nlink = (*get_nlinks)(dest);
+   dest->i_nlink = src->i_nlink;
 }
 EXPORT_SYMBOL_GPL(fsstack_copy_attr_all);
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 24/42] Unionfs: mount-time and stacking-interposition functions

2007-12-09 Thread Erez Zadok

Includes read_super and module-linkage routines.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/main.c |  783 +
 1 files changed, 783 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/main.c

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
new file mode 100644
index 000..22aa6e6
--- /dev/null
+++ b/fs/unionfs/main.c
@@ -0,0 +1,783 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+#include 
+#include 
+
+static void unionfs_fill_inode(struct dentry *dentry,
+  struct inode *inode)
+{
+   struct inode *lower_inode;
+   struct dentry *lower_dentry;
+   int bindex, bstart, bend;
+
+   bstart = dbstart(dentry);
+   bend = dbend(dentry);
+
+   for (bindex = bstart; bindex <= bend; bindex++) {
+   lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   if (!lower_dentry) {
+   unionfs_set_lower_inode_idx(inode, bindex, NULL);
+   continue;
+   }
+
+   /* Initialize the lower inode to the new lower inode. */
+   if (!lower_dentry->d_inode)
+   continue;
+
+   unionfs_set_lower_inode_idx(inode, bindex,
+   igrab(lower_dentry->d_inode));
+   }
+
+   ibstart(inode) = dbstart(dentry);
+   ibend(inode) = dbend(dentry);
+
+   /* Use attributes from the first branch. */
+   lower_inode = unionfs_lower_inode(inode);
+
+   /* Use different set of inode ops for symlinks & directories */
+   if (S_ISLNK(lower_inode->i_mode))
+   inode->i_op = &unionfs_symlink_iops;
+   else if (S_ISDIR(lower_inode->i_mode))
+   inode->i_op = &unionfs_dir_iops;
+
+   /* Use different set of file ops for directories */
+   if (S_ISDIR(lower_inode->i_mode))
+   inode->i_fop = &unionfs_dir_fops;
+
+   /* properly initialize special inodes */
+   if (S_ISBLK(lower_inode->i_mode) || S_ISCHR(lower_inode->i_mode) ||
+   S_ISFIFO(lower_inode->i_mode) || S_ISSOCK(lower_inode->i_mode))
+   init_special_inode(inode, lower_inode->i_mode,
+  lower_inode->i_rdev);
+
+   /* all well, copy inode attributes */
+   unionfs_copy_attr_all(inode, lower_inode);
+   fsstack_copy_inode_size(inode, lower_inode);
+}
+
+/*
+ * Connect a unionfs inode dentry/inode with several lower ones.  This is
+ * the classic stackable file system "vnode interposition" action.
+ *
+ * @sb: unionfs's super_block
+ */
+struct dentry *unionfs_interpose(struct dentry *dentry, struct super_block *sb,
+int flag)
+{
+   int err = 0;
+   struct inode *inode;
+   int is_negative_dentry = 1;
+   int bindex, bstart, bend;
+   int need_fill_inode = 1;
+   struct dentry *spliced = NULL;
+
+   verify_locked(dentry);
+
+   bstart = dbstart(dentry);
+   bend = dbend(dentry);
+
+   /* Make sure that we didn't get a negative dentry. */
+   for (bindex = bstart; bindex <= bend; bindex++) {
+   if (unionfs_lower_dentry_idx(dentry, bindex) &&
+   unionfs_lower_dentry_idx(dentry, bindex)->d_inode) {
+   is_negative_dentry = 0;
+   break;
+   }
+   }
+   BUG_ON(is_negative_dentry);
+
+   /*
+* We allocate our new inode below, by calling iget.
+* iget will call our read_inode which will initialize some
+* of the new inode's fields
+*/
+
+   /*
+* On revalidate we've already got our own inode and just need
+* to fix it up.
+*/
+   if (flag == INTERPOSE_REVAL) {
+   inode = dentry->d_inode;
+   UNIONFS_I(inode)->bstart = -1;
+   UNIONFS_I(inode)->bend = -1;
+   atomic_set(&UNIONFS_I(inode)->generation,
+  atomic_read(&UNIONFS_SB(sb)->generation));
+
+   UNIONFS_I(inode)->lower_inodes =
+

[PATCH 33/42] MM: extern for drop_pagecache_sb

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 include/linux/mm.h |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1b7b95c..fc61bd3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -19,6 +19,7 @@ struct anon_vma;
 struct file_ra_state;
 struct user_struct;
 struct writeback_control;
+struct super_block;
 
 #ifndef CONFIG_DISCONTIGMEM  /* Don't use mapnrs, do it properly */
 extern unsigned long max_mapnr;
@@ -1135,6 +1136,7 @@ int drop_caches_sysctl_handler(struct ctl_table *, int, 
struct file *,
void __user *, size_t *, loff_t *);
 unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
unsigned long lru_pages);
+extern void drop_pagecache_sb(struct super_block *);
 void drop_pagecache(void);
 void drop_slab(void);
 
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 14/42] Unionfs: lower-level copyup routines

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/copyup.c |  897 +++
 1 files changed, 897 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/copyup.c

diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
new file mode 100644
index 000..3fe4865
--- /dev/null
+++ b/fs/unionfs/copyup.c
@@ -0,0 +1,897 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * For detailed explanation of copyup see:
+ * Documentation/filesystems/unionfs/concepts.txt
+ */
+
+#ifdef CONFIG_UNION_FS_XATTR
+/* copyup all extended attrs for a given dentry */
+static int copyup_xattrs(struct dentry *old_lower_dentry,
+struct dentry *new_lower_dentry)
+{
+   int err = 0;
+   ssize_t list_size = -1;
+   char *name_list = NULL;
+   char *attr_value = NULL;
+   char *name_list_buf = NULL;
+
+   /* query the actual size of the xattr list */
+   list_size = vfs_listxattr(old_lower_dentry, NULL, 0);
+   if (list_size <= 0) {
+   err = list_size;
+   goto out;
+   }
+
+   /* allocate space for the actual list */
+   name_list = unionfs_xattr_alloc(list_size + 1, XATTR_LIST_MAX);
+   if (unlikely(!name_list || IS_ERR(name_list))) {
+   err = PTR_ERR(name_list);
+   goto out;
+   }
+
+   name_list_buf = name_list; /* save for kfree at end */
+
+   /* now get the actual xattr list of the source file */
+   list_size = vfs_listxattr(old_lower_dentry, name_list, list_size);
+   if (list_size <= 0) {
+   err = list_size;
+   goto out;
+   }
+
+   /* allocate space to hold each xattr's value */
+   attr_value = unionfs_xattr_alloc(XATTR_SIZE_MAX, XATTR_SIZE_MAX);
+   if (unlikely(!attr_value || IS_ERR(attr_value))) {
+   err = PTR_ERR(name_list);
+   goto out;
+   }
+
+   /* in a loop, get and set each xattr from src to dst file */
+   while (*name_list) {
+   ssize_t size;
+
+   /* Lock here since vfs_getxattr doesn't lock for us */
+   mutex_lock(&old_lower_dentry->d_inode->i_mutex);
+   size = vfs_getxattr(old_lower_dentry, name_list,
+   attr_value, XATTR_SIZE_MAX);
+   mutex_unlock(&old_lower_dentry->d_inode->i_mutex);
+   if (size < 0) {
+   err = size;
+   goto out;
+   }
+   if (size > XATTR_SIZE_MAX) {
+   err = -E2BIG;
+   goto out;
+   }
+   /* Don't lock here since vfs_setxattr does it for us. */
+   err = vfs_setxattr(new_lower_dentry, name_list, attr_value,
+  size, 0);
+   /*
+* Selinux depends on "security.*" xattrs, so to maintain
+* the security of copied-up files, if Selinux is active,
+* then we must copy these xattrs as well.  So we need to
+* temporarily get FOWNER privileges.
+* XXX: move entire copyup code to SIOQ.
+*/
+   if (err == -EPERM && !capable(CAP_FOWNER)) {
+   cap_raise(current->cap_effective, CAP_FOWNER);
+   err = vfs_setxattr(new_lower_dentry, name_list,
+  attr_value, size, 0);
+   cap_lower(current->cap_effective, CAP_FOWNER);
+   }
+   if (err < 0)
+   goto out;
+   name_list += strlen(name_list) + 1;
+   }
+out:
+   unionfs_xattr_kfree(name_list_buf);
+   unionfs_xattr_kfree(attr_value);
+   /* Ignore if xattr isn't supported */
+   if (err == -ENOTSUPP || err == -EOPNOTSUPP)
+   err = 0;
+   return err;
+}
+#endif /* CONFIG_UNION_FS_XATTR */
+
+/*
+ * Determine the mode based on the copyup flags, and the existing dentry.
+ *
+ * Handle file systems which may not support certain options.  For example
+ * jffs2 doesn't allow one to chmod a sy

[PATCH 37/42] VFS: export release_open_intent symbol

2007-12-09 Thread Erez Zadok

Needed to release the resources of the lower nameidata structures that we
create and pass to lower file systems (e.g., when calling vfs_create).

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/namei.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 3b993db..14f9861 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -389,6 +389,7 @@ void release_open_intent(struct nameidata *nd)
else
fput(nd->intent.open.file);
 }
+EXPORT_SYMBOL(release_open_intent);
 
 static inline struct dentry *
 do_revalidate(struct dentry *dentry, struct nameidata *nd)
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 32/42] Unionfs file system magic number

2007-12-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 include/linux/magic.h |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/include/linux/magic.h b/include/linux/magic.h
index 1fa0c2c..67043ed 100644
--- a/include/linux/magic.h
+++ b/include/linux/magic.h
@@ -35,6 +35,8 @@
 #define REISER2FS_SUPER_MAGIC_STRING   "ReIsEr2Fs"
 #define REISER2FS_JR_SUPER_MAGIC_STRING"ReIsEr3Fs"
 
+#define UNIONFS_SUPER_MAGIC 0xf15f083d
+
 #define SMB_SUPER_MAGIC0x517B
 #define USBDEVICE_SUPER_MAGIC  0x9fa2
 #define CGROUP_SUPER_MAGIC 0x27e0eb
-- 
1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 25/42] Unionfs: super_block operations

2007-12-09 Thread Erez Zadok

Includes read_inode, delete_inode, put_super, statfs, remount_fs (which
supports branch-management ops), clear_inode, alloc_inode, destroy_inode,
write_inode, umount_begin, and show_options.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/super.c | 1020 
 1 files changed, 1020 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/super.c

diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
new file mode 100644
index 000..d9cf2a7
--- /dev/null
+++ b/fs/unionfs/super.c
@@ -0,0 +1,1020 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * The inode cache is used with alloc_inode for both our inode info and the
+ * vfs inode.
+ */
+static struct kmem_cache *unionfs_inode_cachep;
+
+static void unionfs_read_inode(struct inode *inode)
+{
+   int size;
+   struct unionfs_inode_info *info = UNIONFS_I(inode);
+
+   unionfs_read_lock(inode->i_sb);
+
+   memset(info, 0, offsetof(struct unionfs_inode_info, vfs_inode));
+   info->bstart = -1;
+   info->bend = -1;
+   atomic_set(&info->generation,
+  atomic_read(&UNIONFS_SB(inode->i_sb)->generation));
+   spin_lock_init(&info->rdlock);
+   info->rdcount = 1;
+   info->hashsize = -1;
+   INIT_LIST_HEAD(&info->readdircache);
+
+   size = sbmax(inode->i_sb) * sizeof(struct inode *);
+   info->lower_inodes = kzalloc(size, GFP_KERNEL);
+   if (unlikely(!info->lower_inodes)) {
+   printk(KERN_CRIT "unionfs: no kernel memory when allocating "
+  "lower-pointer array!\n");
+   BUG();
+   }
+
+   inode->i_version++;
+   inode->i_op = &unionfs_main_iops;
+   inode->i_fop = &unionfs_main_fops;
+
+   inode->i_mapping->a_ops = &unionfs_aops;
+
+   unionfs_read_unlock(inode->i_sb);
+}
+
+/*
+ * we now define delete_inode, because there are two VFS paths that may
+ * destroy an inode: one of them calls clear inode before doing everything
+ * else that's needed, and the other is fine.  This way we truncate the inode
+ * size (and its pages) and then clear our own inode, which will do an iput
+ * on our and the lower inode.
+ *
+ * No need to lock sb info's rwsem.
+ */
+static void unionfs_delete_inode(struct inode *inode)
+{
+   i_size_write(inode, 0); /* every f/s seems to do that */
+
+   if (inode->i_data.nrpages)
+   truncate_inode_pages(&inode->i_data, 0);
+
+   clear_inode(inode);
+}
+
+/*
+ * final actions when unmounting a file system
+ *
+ * No need to lock rwsem.
+ */
+static void unionfs_put_super(struct super_block *sb)
+{
+   int bindex, bstart, bend;
+   struct unionfs_sb_info *spd;
+   int leaks = 0;
+
+   spd = UNIONFS_SB(sb);
+   if (!spd)
+   return;
+
+   bstart = sbstart(sb);
+   bend = sbend(sb);
+
+   /* Make sure we have no leaks of branchget/branchput. */
+   for (bindex = bstart; bindex <= bend; bindex++)
+   if (unlikely(branch_count(sb, bindex) != 0)) {
+   printk(KERN_CRIT
+  "unionfs: branch %d has %d references left!\n",
+  bindex, branch_count(sb, bindex));
+   leaks = 1;
+   }
+   BUG_ON(leaks != 0);
+
+   kfree(spd->data);
+   kfree(spd);
+   sb->s_fs_info = NULL;
+}
+
+/*
+ * Since people use this to answer the "How big of a file can I write?"
+ * question, we report the size of the highest priority branch as the size of
+ * the union.
+ */
+static int unionfs_statfs(struct dentry *dentry, struct kstatfs *buf)
+{
+   int err = 0;
+   struct super_block *sb;
+   struct dentry *lower_dentry;
+
+   sb = dentry->d_sb;
+
+   unionfs_read_lock(sb);
+   unionfs_lock_dentry(dentry);
+
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
+   err = -ESTALE;
+   goto out;
+   }
+   unionfs_check_dentry(dentry);
+
+   lower_dentry = unionfs_lower_dentry(sb->s_root);
+   err = vfs_statf

[PATCH 19/42] Unionfs: readdir helper functions

2007-12-09 Thread Erez Zadok

Includes whiteout handling for directories.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dirhelper.c |  272 
 1 files changed, 272 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/dirhelper.c

diff --git a/fs/unionfs/dirhelper.c b/fs/unionfs/dirhelper.c
new file mode 100644
index 000..2e52fc3
--- /dev/null
+++ b/fs/unionfs/dirhelper.c
@@ -0,0 +1,272 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * Delete all of the whiteouts in a given directory for rmdir.
+ *
+ * lower directory inode should be locked
+ */
+int do_delete_whiteouts(struct dentry *dentry, int bindex,
+   struct unionfs_dir_state *namelist)
+{
+   int err = 0;
+   struct dentry *lower_dir_dentry = NULL;
+   struct dentry *lower_dentry;
+   char *name = NULL, *p;
+   struct inode *lower_dir;
+   int i;
+   struct list_head *pos;
+   struct filldir_node *cursor;
+
+   /* Find out lower parent dentry */
+   lower_dir_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   BUG_ON(!S_ISDIR(lower_dir_dentry->d_inode->i_mode));
+   lower_dir = lower_dir_dentry->d_inode;
+   BUG_ON(!S_ISDIR(lower_dir->i_mode));
+
+   err = -ENOMEM;
+   name = __getname();
+   if (unlikely(!name))
+   goto out;
+   strcpy(name, UNIONFS_WHPFX);
+   p = name + UNIONFS_WHLEN;
+
+   err = 0;
+   for (i = 0; !err && i < namelist->size; i++) {
+   list_for_each(pos, &namelist->list[i]) {
+   cursor =
+   list_entry(pos, struct filldir_node,
+  file_list);
+   /* Only operate on whiteouts in this branch. */
+   if (cursor->bindex != bindex)
+   continue;
+   if (!cursor->whiteout)
+   continue;
+
+   strcpy(p, cursor->name);
+   lower_dentry =
+   lookup_one_len(name, lower_dir_dentry,
+  cursor->namelen +
+  UNIONFS_WHLEN);
+   if (IS_ERR(lower_dentry)) {
+   err = PTR_ERR(lower_dentry);
+   break;
+   }
+   if (lower_dentry->d_inode)
+   err = vfs_unlink(lower_dir, lower_dentry);
+   dput(lower_dentry);
+   if (err)
+   break;
+   }
+   }
+
+   __putname(name);
+
+   /* After all of the removals, we should copy the attributes once. */
+   fsstack_copy_attr_times(dentry->d_inode, lower_dir_dentry->d_inode);
+
+out:
+   return err;
+}
+
+/* delete whiteouts in a dir (for rmdir operation) using sioq if necessary */
+int delete_whiteouts(struct dentry *dentry, int bindex,
+struct unionfs_dir_state *namelist)
+{
+   int err;
+   struct super_block *sb;
+   struct dentry *lower_dir_dentry;
+   struct inode *lower_dir;
+   struct sioq_args args;
+
+   sb = dentry->d_sb;
+
+   BUG_ON(!S_ISDIR(dentry->d_inode->i_mode));
+   BUG_ON(bindex < dbstart(dentry));
+   BUG_ON(bindex > dbend(dentry));
+   err = is_robranch_super(sb, bindex);
+   if (err)
+   goto out;
+
+   lower_dir_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   BUG_ON(!S_ISDIR(lower_dir_dentry->d_inode->i_mode));
+   lower_dir = lower_dir_dentry->d_inode;
+   BUG_ON(!S_ISDIR(lower_dir->i_mode));
+
+   mutex_lock(&lower_dir->i_mutex);
+   if (!permission(lower_dir, MAY_WRITE | MAY_EXEC, NULL)) {
+   err = do_delete_whiteouts(dentry, bindex, namelist);
+   } else {
+   args.deletewh.namelist = namelist;
+   args.deletewh.dentry = dentry;
+   args.deletewh.bindex = bindex;
+   run_sioq(__delete_whiteouts, &args);
+   err = args.

1 2 3 4 5 >

1 - 100 of 405 matches

Mail list logo