Thanks for the help!

> > Thanks for the quick answer!
> >
> > I think I can reduce data on the "full" bricks, solving the problem 
> > temporarily.
> >
> > The thing is that the behavior changed from 3.12 to 6.5: 3.12 didn't 
> > have problems with almost-full bricks, so I thought everything was fine. 
> > Then, after the upgrade, I ran into this problem. This might be a corner 
> > case that will go away once no one uses 3.12 any more.
> >
> > But I think I can create a situation with 6.5 only that reproduces the 
> > error. Suppose I have a brick that is 99% full, so a write() will succeed. 
> > After the write, the brick can be 100% full, so a subsequent mkdir() will 
> > produce stale file handles (i.e., bricks that have different directory 
> > trees).  The funny thing is that the mkdir() on the user side does not 
> > produce an error.  Clearly, no one should ever let the file system get to 
> > 99%, but even so, mkdir should fail in that case... 
> 
> I think there is a soft and a hard limit that prevents creation of 
> files/folders when a specific threshold is hit, but that threshold might be 
> per brick instead of per replica set.

There is cluster.min-free-disk, which states that the server should look 
for another brick if the hash would place the file on a brick with less than 
"min-free-disk" free space.  However, this seems to be a "should": if all 
bricks have less free space than "min-free-disk", the file is written anyway. 
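For reference, the option is set per volume; a minimal example (the volume 
name "testvol" is a placeholder):

```shell
# Ask the distribute layer to avoid placing new files on bricks with
# less than 10% free space.  "testvol" is a placeholder volume name;
# note that, as described above, this is advisory rather than enforced.
gluster volume set testvol cluster.min-free-disk 10%
```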

Apart from that, I have some really large bricks (around 200 TB each), which 
means that even when they are 99% full, there are still 2 TB left (a 
significant amount).  The logic of "do not create a directory if the brick is 
100% full" seems to be hard-coded; I didn't find a setting to disable it.

Nonetheless, I think I can construct a test case where a sequence of write() 
and mkdir() creates stale file handles even though all userland operations 
succeed.  Should I consider this a bug and make the effort to construct a test 
case?  (Not on my production system, but on a toy model; it will take me a few 
days...)
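The reproduction I have in mind would look roughly like this (purely a 
sketch: the mount point /mnt/testvol, the brick paths, and the file size are 
made-up placeholders for a toy volume, not real values from my setup):

```shell
# Sketch against a small test volume whose bricks are already ~99% full.
# Both operations report success from the client's point of view:
dd if=/dev/zero of=/mnt/testvol/filler bs=1M count=100   # pushes a brick to 100%
mkdir /mnt/testvol/newdir                                # still returns 0

# ...but inspecting the bricks directly may then show diverging
# directory trees (the directory missing on the full brick):
ls -ld /bricks/brick1/newdir /bricks/brick2/newdir
```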

> > What remains:  is there a recommended way how to deal with the situation 
> > that I have some bricks that don't have all directories?
> 
> I think that you can mount the gluster volume and run a find with stat that 
> will force a sync.
> find /rhev/mnt/full-path/directory-missing-on-some-bricks -iname '*' -exec 
> stat {} \;

Thanks a lot! That indeed fixed the missing directories!  (I didn't know that 
a "stat" triggers a sync across the bricks.)

best wishes,
Stefan

________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
