On Sat, 24 Jul 2004, Andi Gutmans wrote:
> Has anyone had a chance to try, test and benchmark the realpath() patch I 
> send to the list?

I had a look the other day, but here is a more detailed look at the system 
calls involved in a php5 request.  I have stripped out most of the stuff 
we can't do anything about.  The script I am testing is 
/var/home/rasmus/php5/o which contains:
<?php
        include '/var/www/lerdorf.com/abc';
        include '/var/www/lerdorf.com/abc';
        include '/var/www/lerdorf.com/def';
        include '/var/www/phpics.com/ghi';
?>

The system calls required to open and read the script:

getcwd("/var/home/rasmus/php5", 4096)   = 22
lstat64("/var/home/rasmus/php5/o", {st_mode=S_IFREG|0644, st_size=156, ...}) = 0
setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
rt_sigaction(SIGPROF, {0x4067a7a0, [PROF], SA_RESTORER|SA_RESTART, 0x406ea658}, 
{0x4067a7a0, [PROF], SA_RESTORER|SA_RESTART, 0x406ea658}, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [PROF], NULL, 8) = 0
read(3, "<?php\n\tinclude \'/var/www/lerdorf"..., 8192) = 156
read(3, "", 4096)                       = 0
read(3, "", 8192)                       = 0
close(3)                                = 0

ok so far.  Now the first include:

getcwd("/var/home/rasmus/php5", 4096)   = 22
lstat64("/var", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/var/www", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/var/www/lerdorf.com", {st_mode=S_IFDIR|0777, st_size=16384, ...}) = 0
lstat64("/var/www/lerdorf.com/abc", {st_mode=S_IFREG|0644, st_size=4, ...}) = 0
open("/var/www/lerdorf.com/abc", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=4, ...}) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=4, ...}) = 0
lseek(3, 0, SEEK_CUR)                   = 0
read(3, "abc\n", 8192)                  = 4
read(3, "", 8192)                       = 0
read(3, "", 8192)                       = 0
close(3)                                = 0

Since nothing is cached yet we see realpath() statting the entire tree 
leading to the abc include file.  There is an obvious double fstat on the 
opened descriptor there that we should track down.

Then we get:

fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x40018000
write(1, "abc\n", 4)                    = 4

To write out the results.

The second time we include the same file the stat cache kicks in and we 
have:

getcwd("/var/home/rasmus/php5", 4096)   = 22
time(NULL)                              = 1090679437
open("/var/www/lerdorf.com/abc", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=4, ...}) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=4, ...}) = 0
lseek(3, 0, SEEK_CUR)                   = 0
read(3, "abc\n", 8192)                  = 4
read(3, "", 8192)                       = 0
read(3, "", 8192)                       = 0
close(3)                                = 0
write(1, "abc\n", 4)                    = 4

Not sure where that time() call came from here, and again we see the 
double fstat.  And since we have already started writing the output the 
results are written out immediately in a single write.

Now the /var/www/lerdorf.com/def include:

getcwd("/var/home/rasmus/php5", 4096)   = 22
lstat64("/var", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/var/www", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/var/www/lerdorf.com", {st_mode=S_IFDIR|0777, st_size=16384, ...}) = 0
lstat64("/var/www/lerdorf.com/def", {st_mode=S_IFREG|0644, st_size=4, ...}) = 0
time(NULL)                              = 1090679437
open("/var/www/lerdorf.com/def", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=4, ...}) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=4, ...}) = 0
lseek(3, 0, SEEK_CUR)                   = 0
read(3, "def\n", 8192)                  = 4
read(3, "", 8192)                       = 0
read(3, "", 8192)                       = 0
close(3)                                = 0
write(1, "def\n", 4)                    = 4

We get what we expect here too.  Since the stat cache only caches full 
paths and we are now including a different file from the same location we 
have to stat the whole tree again.

And the final include:

getcwd("/var/home/rasmus/php5", 4096)   = 22
lstat64("/var", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/var/www", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/var/www/phpics.com", {st_mode=S_IFDIR|0755, st_size=8192, ...}) = 0
lstat64("/var/www/phpics.com/ghi", {st_mode=S_IFREG|0644, st_size=4, ...}) = 0
time(NULL)                              = 1090679437
open("/var/www/phpics.com/ghi", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=4, ...}) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=4, ...}) = 0
lseek(3, 0, SEEK_CUR)                   = 0
read(3, "ghi\n", 8192)                  = 4
read(3, "", 8192)                       = 0
read(3, "", 8192)                       = 0
close(3)                                = 0
write(1, "ghi\n", 4)                    = 4

No surprises there either.

Shutting down (cli version):

close(1)                                = 0
close(0)                                = 0
setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={30, 0}}, NULL) = 0
rt_sigaction(SIGPROF, {0x4067a7a0, [PROF], SA_RESTORER|SA_RESTART, 0x406ea658}, 
{0x4067a7a0, [PROF], SA_RESTORER|SA_RESTART, 0x406ea658}, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [PROF], NULL, 8) = 0
setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
munmap(0x40018000, 4096)                = 0
exit_group(0)                           = ?


I'd love to see a realpath() replacement function which makes use of the 
stat cache for partial paths as well.  Chances are that on a busy ISP 
server, for example, you will have thousands of scripts and include files 
served up from the same base path and you are going to be statting your 
way along that tree every time you hit a file that isn't in the cache.

And that double-fstat needs to go away.  A very quick look for it seems to
show one checking if the opened file is a regular file in
fopen_wrappers.c.  Ideally we would simply use the stat struct from our
own realpath() implementation and neither of these fstat() calls would be
needed.  The goal should be to average at most 1 stat per file, whether it
be the main script or include files.  For the main script, if we are
running under Apache, Apache has already done this stat and passed us the
stat struct

-Rasmus

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to