Please do not give the option the same name as Darwin's but different
semantics.

I strongly suggest adding the Darwin name as a plain on/off toggle and a
FreeBSD-specific name as the option that sets an exact size.
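
To illustrate what the two-name scheme would mean for application code, here
is a minimal sketch, not taken from the patch. The helper set_readahead() is
hypothetical, and it assumes F_READAHEAD ends up as the size-taking command
while F_RDAHEAD keeps Darwin's zero/non-zero meaning. Where neither macro is
defined it simply fails with ENOTSUP.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical helper, not part of the patch: ask for read-ahead of
 * about `nbytes` bytes on `fd`, or turn read-ahead off when `nbytes`
 * is 0.
 */
static int
set_readahead(int fd, int nbytes)
{
#if defined(F_READAHEAD)
	/* Assumed size-taking command: byte count, 0 disables. */
	return (fcntl(fd, F_READAHEAD, nbytes));
#elif defined(F_RDAHEAD)
	/* Darwin-style toggle: only zero versus non-zero matters. */
	return (fcntl(fd, F_RDAHEAD, nbytes != 0));
#else
	/* Neither command exists on this platform. */
	(void)fd;
	(void)nbytes;
	errno = ENOTSUP;
	return (-1);
#endif
}
```

With distinct names the caller can tell at compile time whether an exact size
will be honoured, which a single shared name with two meanings would hide.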

-Alfred

* Xin LI <delp...@delphij.net> [090917 15:27] wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hi, Igor,
> 
> Igor Sysoev wrote:
> > Hi,
> > 
> > nginx-0.8.15 can use completely non-blocking sendfile() with the
> > SF_NODISKIO flag. When sendfile() returns EBUSY, nginx calls aio_read()
> > to read a single byte. The first aio_read() preloads the first 128K of
> > the file into the VM cache; however, all successive aio_read()s preload
> > just 16K of the file each. This makes non-blocking sendfile() ineffective
> > for files larger than 128K.
> > 
> > I've created a small patch that adds a Darwin-compatible F_RDAHEAD fcntl:
> > 
> >    fcntl(fd, F_RDAHEAD, preload_size)
> > 
> > There is a small incompatibility: Darwin's fcntl only allows enabling or
> > disabling read ahead, while the proposed patch allows setting an exact
> > preload size.
> > 
> > Currently the preload size affects the vn_read() code path only and does
> > not affect the sendfile() code path. However, it can easily be extended
> > to the sendfile() path too. The preload size is still limited by the
> > sysctl vfs.read_max.
> > 
> > The patch is against FreeBSD 7.2 and was tested on FreeBSD 7.2-STABLE only.
> 
> I have ported this as a patch against -HEAD (should apply on 8.0-R but
> it's too late for us to add a new feature) plus a manual page entry
> documenting the feature.
> 
> I've used F_READAHEAD as the name, but reading the manual page, it looks
> like we can just use F_RDAHEAD, since Darwin seems to distinguish only the
> 0 and != 0 cases, so that programmers won't have to use #ifdef or something
> else to get code working on different platforms?
> 
> Cheers,
> - --
> Xin LI <delp...@delphij.net>  http://www.delphij.net/
> FreeBSD - The Power to Serve!
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.12 (FreeBSD)
> 
> iEYEARECAAYFAkqyt40ACgkQi+vbBBjt66AdKgCfXOo/Vn+zw0cCjS+gGJUgPo8t
> WToAmgKIXaVKsKUcqVOqTwHl4eTFsbkM
> =uP3m
> -----END PGP SIGNATURE-----

> Index: lib/libc/sys/fcntl.2
> ===================================================================
> --- lib/libc/sys/fcntl.2      (revision 197297)
> +++ lib/libc/sys/fcntl.2      (working copy)
> @@ -28,7 +28,7 @@
>  .\"     @(#)fcntl.2  8.2 (Berkeley) 1/12/94
>  .\" $FreeBSD$
>  .\"
> -.Dd March 8, 2008
> +.Dd September 19, 2009
>  .Dt FCNTL 2
>  .Os
>  .Sh NAME
> @@ -241,6 +241,14 @@
>  .Dv SA_RESTART
>  (see
>  .Xr sigaction 2 ) .
> +.It Dv F_READAHEAD
> +Set or clear the read ahead amount for sequential access to the third
> +argument,
> +.Fa arg ,
> +which is rounded up to the nearest block size.
> +A zero value in
> +.Fa arg
> +turns off read ahead.
>  .El
>  .Pp
>  When a shared lock has been set on a segment of a file,
> Index: sys/kern/kern_descrip.c
> ===================================================================
> --- sys/kern/kern_descrip.c   (revision 197297)
> +++ sys/kern/kern_descrip.c   (working copy)
> @@ -421,6 +421,7 @@
>       struct vnode *vp;
>       int error, flg, tmp;
>       int vfslocked;
> +     uint64_t bsize;
>  
>       vfslocked = 0;
>       error = 0;
> @@ -686,6 +687,31 @@
>               vfslocked = 0;
>               fdrop(fp, td);
>               break;
> +
> +     case F_READAHEAD:
> +             FILEDESC_SLOCK(fdp);
> +             if ((fp = fdtofp(fd, fdp)) == NULL) {
> +                     FILEDESC_SUNLOCK(fdp);
> +                     error = EBADF;
> +                     break;
> +             }
> +             if (fp->f_type != DTYPE_VNODE) {
> +                     FILEDESC_SUNLOCK(fdp);
> +                     error = EBADF;
> +                     break;
> +             }
> +             fhold(fp);
> +             FILEDESC_SUNLOCK(fdp);
> +             if (arg) {
> +                     bsize = fp->f_vnode->v_mount->mnt_stat.f_iosize;
> +                     fp->f_seqcount = (arg + bsize - 1) / bsize;
> +                     fp->f_flag |= O_READAHEAD;
> +             } else {
> +                     fp->f_flag &= ~O_READAHEAD;
> +             }
> +             fdrop(fp, td);
> +             break;
> +
>       default:
>               error = EINVAL;
>               break;
> Index: sys/kern/vfs_vnops.c
> ===================================================================
> --- sys/kern/vfs_vnops.c      (revision 197297)
> +++ sys/kern/vfs_vnops.c      (working copy)
> @@ -312,6 +312,9 @@
>  sequential_heuristic(struct uio *uio, struct file *fp)
>  {
>  
> +     if (fp->f_flag & O_READAHEAD)
> +             return (fp->f_seqcount << IO_SEQSHIFT);
> +
>       /*
>        * Offset 0 is handled specially.  open() sets f_seqcount to 1 so
>        * that the first I/O is normally considered to be slightly
> Index: sys/sys/fcntl.h
> ===================================================================
> --- sys/sys/fcntl.h   (revision 197297)
> +++ sys/sys/fcntl.h   (working copy)
> @@ -112,7 +112,11 @@
>  #if __BSD_VISIBLE
>  /* Attempt to bypass buffer cache */
>  #define O_DIRECT     0x00010000
> +#ifdef _KERNEL
> +/* Read ahead */
> +#define O_READAHEAD  0x00020000
>  #endif
> +#endif
>  
>  /* Defined by POSIX Extended API Set Part 2 */
>  #if __BSD_VISIBLE
> @@ -218,6 +222,7 @@
>  #define      F_SETLK         12              /* set record locking information */
>  #define      F_SETLKW        13              /* F_SETLK; wait if blocked */
>  #define      F_SETLK_REMOTE  14              /* debugging support for remote locks */
> +#define      F_READAHEAD     15              /* read ahead */
>  
>  /* file descriptor flags (F_GETFD, F_SETFD) */
>  #define      FD_CLOEXEC      1               /* close-on-exec flag */
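
For readers following the arithmetic in the F_READAHEAD case above, the
expression (arg + bsize - 1) / bsize is a ceiling division that turns a byte
count into a whole number of filesystem blocks. A minimal user-space sketch
follows. The helper name readahead_blocks() is hypothetical, and the 16384
byte block size in the note below is only an assumed example value for
f_iosize.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the patch's rounding: number of
 * bsize-byte blocks needed to cover nbytes, rounded up.  The patch
 * handles arg == 0 separately by clearing O_READAHEAD; this sketch
 * just returns 0 for that case and for a zero block size.
 */
static uint64_t
readahead_blocks(uint64_t nbytes, uint64_t bsize)
{
	if (nbytes == 0 || bsize == 0)
		return (0);
	return ((nbytes + bsize - 1) / bsize);
}
```

With an assumed 16384 byte block size, a 128K request maps to 8 blocks, and a
request of a single byte still maps to 1 full block.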

> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"

-- 
- Alfred Perlstein
.- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250
.- FreeBSD committer