Please do not make the option have the same name but different semantics. Strongly suggest adding the Darwin name as a toggle and a FreeBSD name as a specific size option.
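A minimal sketch of what the two-name approach could look like from application code, assuming F_RDAHEAD keeps Darwin's on/off meaning and F_READAHEAD takes a byte count as in the quoted patch. hint_read_ahead() is a hypothetical helper, not part of any patch in this thread; on platforms that define neither command it does nothing.

```c
#include <fcntl.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: ask the kernel to read ahead on fd.
 * Returns 0 on success (or when no read-ahead command exists on this
 * platform) and -1 if fcntl() fails.
 */
static int
hint_read_ahead(int fd, size_t preload_size)
{
	/* fcntl() takes an int argument, so clamp oversized requests. */
	int arg = preload_size > (size_t)INT_MAX ? INT_MAX : (int)preload_size;

#if defined(F_READAHEAD)
	/* Assumed size-based command: arg is the preload size in bytes. */
	return (fcntl(fd, F_READAHEAD, arg) == -1 ? -1 : 0);
#elif defined(F_RDAHEAD)
	/* Darwin-style toggle: only zero versus non-zero matters. */
	return (fcntl(fd, F_RDAHEAD, arg != 0) == -1 ? -1 : 0);
#else
	/* Neither command is defined here; treat the hint as a no-op. */
	(void)fd;
	(void)arg;
	return (0);
#endif
}
```

With separate names, a caller that asks for an exact size never silently gets toggle semantics, and the preprocessor checks stay in one place.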
-Alfred

* Xin LI <delp...@delphij.net> [090917 15:27] wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi, Igor,
>
> Igor Sysoev wrote:
> > Hi,
> >
> > nginx-0.8.15 can use completely non-blocking sendfile() using the
> > SF_NODISKIO flag. When sendfile() returns EBUSY, nginx calls aio_read()
> > to read a single byte. The first aio_read() preloads the first 128K part
> > of a file in VM cache; however, all successive aio_read()s preload just
> > 16K parts of the file. This makes non-blocking sendfile() usage
> > ineffective for files larger than 128K.
> >
> > I've created a small patch for a Darwin-compatible F_RDAHEAD fcntl:
> >
> >    fcntl(fd, F_RDAHEAD, preload_size)
> >
> > There is a small incompatibility: Darwin's fcntl only allows enabling or
> > disabling read ahead, while the proposed patch allows setting an exact
> > preload size.
> >
> > Currently the preload size affects the vn_read() code path only and does
> > not affect the sendfile() code path. However, it can easily be extended
> > to the sendfile() part too. The preload size is still limited by sysctl
> > vfs.read_max.
> >
> > The patch is against FreeBSD 7.2 and was tested on FreeBSD 7.2-STABLE only.
>
> I have ported this as a patch against -HEAD (should apply on 8.0-R but
> it's too late for us to add a new feature) plus a manual page entry
> documenting the feature.
>
> I've used F_READAHEAD as the name, but reading the manual page, it looks
> like we can just use F_RDAHEAD, since Darwin seems to distinguish only the
> 0 and !=0 cases, so that programmers won't have to use #ifdef or something
> else to get code working on different platforms?
>
> Cheers,
> - --
> Xin LI <delp...@delphij.net> http://www.delphij.net/
> FreeBSD - The Power to Serve!
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.12 (FreeBSD)
>
> iEYEARECAAYFAkqyt40ACgkQi+vbBBjt66AdKgCfXOo/Vn+zw0cCjS+gGJUgPo8t
> WToAmgKIXaVKsKUcqVOqTwHl4eTFsbkM
> =uP3m
> -----END PGP SIGNATURE-----
> Index: lib/libc/sys/fcntl.2
> ===================================================================
> --- lib/libc/sys/fcntl.2	(revision 197297)
> +++ lib/libc/sys/fcntl.2	(working copy)
> @@ -28,7 +28,7 @@
>  .\"     @(#)fcntl.2	8.2 (Berkeley) 1/12/94
>  .\" $FreeBSD$
>  .\"
> -.Dd March 8, 2008
> +.Dd September 19, 2009
>  .Dt FCNTL 2
>  .Os
>  .Sh NAME
> @@ -241,6 +241,14 @@
>  .Dv SA_RESTART
>  (see
>  .Xr sigaction 2 ) .
> +.It Dv F_READAHEAD
> +Set or clear the read ahead amount for sequential access to the third
> +argument,
> +.Fa arg ,
> +which is rounded up to the nearest block size.
> +A zero value in
> +.Fa arg
> +turns off read ahead.
>  .El
>  .Pp
>  When a shared lock has been set on a segment of a file,
> Index: sys/kern/kern_descrip.c
> ===================================================================
> --- sys/kern/kern_descrip.c	(revision 197297)
> +++ sys/kern/kern_descrip.c	(working copy)
> @@ -421,6 +421,7 @@
>  	struct vnode *vp;
>  	int error, flg, tmp;
>  	int vfslocked;
> +	uint64_t bsize;
>  
>  	vfslocked = 0;
>  	error = 0;
> @@ -686,6 +687,31 @@
>  		vfslocked = 0;
>  		fdrop(fp, td);
>  		break;
> +
> +	case F_READAHEAD:
> +		FILEDESC_SLOCK(fdp);
> +		if ((fp = fdtofp(fd, fdp)) == NULL) {
> +			FILEDESC_SUNLOCK(fdp);
> +			error = EBADF;
> +			break;
> +		}
> +		if (fp->f_type != DTYPE_VNODE) {
> +			FILEDESC_SUNLOCK(fdp);
> +			error = EBADF;
> +			break;
> +		}
> +		fhold(fp);
> +		FILEDESC_SUNLOCK(fdp);
> +		if (arg) {
> +			bsize = fp->f_vnode->v_mount->mnt_stat.f_iosize;
> +			fp->f_seqcount = (arg + bsize - 1) / bsize;
> +			fp->f_flag |= O_READAHEAD;
> +		} else {
> +			fp->f_flag &= ~O_READAHEAD;
> +		}
> +		fdrop(fp, td);
> +		break;
> +
>  	default:
>  		error = EINVAL;
>  		break;
> Index: sys/kern/vfs_vnops.c
> ===================================================================
> --- sys/kern/vfs_vnops.c	(revision 197297)
> +++ sys/kern/vfs_vnops.c	(working copy)
> @@ -312,6 +312,9 @@
>  sequential_heuristic(struct uio *uio, struct file *fp)
>  {
>  
> +	if (fp->f_flag & O_READAHEAD)
> +		return (fp->f_seqcount << IO_SEQSHIFT);
> +
>  	/*
>  	 * Offset 0 is handled specially.  open() sets f_seqcount to 1 so
>  	 * that the first I/O is normally considered to be slightly
> Index: sys/sys/fcntl.h
> ===================================================================
> --- sys/sys/fcntl.h	(revision 197297)
> +++ sys/sys/fcntl.h	(working copy)
> @@ -112,7 +112,11 @@
>  #if __BSD_VISIBLE
>  /* Attempt to bypass buffer cache */
>  #define	O_DIRECT	0x00010000
> +#ifdef _KERNEL
> +/* Read ahead */
> +#define	O_READAHEAD	0x00020000
>  #endif
> +#endif
>  
>  /* Defined by POSIX Extended API Set Part 2 */
>  #if __BSD_VISIBLE
> @@ -218,6 +222,7 @@
>  #define	F_SETLK		12	/* set record locking information */
>  #define	F_SETLKW	13	/* F_SETLK; wait if blocked */
>  #define	F_SETLK_REMOTE	14	/* debugging support for remote locks */
> +#define	F_READAHEAD	15	/* read ahead */
>  
>  /* file descriptor flags (F_GETFD, F_SETFD) */
>  #define	FD_CLOEXEC	1	/* close-on-exec flag */

> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"

-- 
- Alfred Perlstein
.- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250
.- FreeBSD committer