On Sat, 24 Apr 2004, Christian Schneider wrote:
> Rasmus Lerdorf wrote:
> > 1. The included_files list gets updated each time you include a file. In
> > order to make sure that the same file included by different paths or
> > symlinks don't conflict we do a realpath() on the file to be included.
>
> That's done by PHP, not APC, right? Does this only apply to require_once
> or require as well?
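The point being discussed can be seen outside of PHP too. Here is a small Python illustration (not PHP internals, just the same filesystem behavior) of why a pathname string compare is not enough and a realpath()-style canonicalization is needed to deduplicate includes:

```python
import os
import tempfile

# Illustration: the same file reached via two different pathnames.
d = tempfile.mkdtemp()
target = os.path.join(d, "config.php")
open(target, "w").close()
alias = os.path.join(d, "current.php")
os.symlink(target, alias)

# A naive string compare sees two distinct entries...
assert alias != target

# ...but canonicalizing both names (as PHP's realpath() call does for
# included files) maps them onto one path, so an included-files list
# keyed on the canonical name records the file only once.
assert os.path.realpath(alias) == os.path.realpath(target)
```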
Right, it is done by PHP which is why APC can't do anything about it. It
happens on all files parsed by PHP no matter how they are included.

> > 2. APC uses the file's device and inode as the lookup key into shared
> > memory to find the opcodes for the file, so a stat has to be done on
>
> Hmm.. If the stat on the file and the check for device/inode would be
> done first then you wouldn't have to do a realpath, right? But I guess
> that's not easily done until after the realpath check.

Well, the included_files list is pathname based and when checking to see
if a file has already been included we do a string compare. We could
potentially follow APC's lead and use device+inode instead and thereby
not need the realpath call and still maintain the current functionality.

There are some other issues with this though. Like, for example, when
you edit a file most editors will actually create a new file and delete
the old, so you end up with the same pathname having a new inode. We
could state that you should never edit a file on a live production web
server, but everyone does and I can just see the stream of bug reports
this might spur.

> > So yes, jumping from 20 to 30 include files could very well bring a
> > rather significant performance hit.
>
> I guess that's only important if your PHP code is really simple and you
> don't do something like e.g. DB queries because otherwise that'd be 90%
> of the running time anyway, right?

Sure, you could argue that the stat calls will get lost in the noise of
a complex script. But they really can add up fast. Bouncing along the
include_path looking for include files adds extra stats, open_basedir is
another killer when it comes to stats. Even just having "." as the first
piece of the include_path is going to cost you an extra stat when
including stuff from PEAR (unless you hardcode the pear path).
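Both halves of the trade-off above can be sketched in a few lines of Python (again an illustration of the filesystem behavior, not APC's actual C code): a (device, inode) key makes two pathnames to the same file collide without any realpath(), but an editor that saves by write-new-then-rename silently changes the inode behind a stable pathname:

```python
import os
import tempfile

def cache_key(path):
    # One stat() per include; the (st_dev, st_ino) pair identifies the
    # underlying file regardless of which pathname reached it.
    st = os.stat(path)
    return (st.st_dev, st.st_ino)

d = tempfile.mkdtemp()
real = os.path.join(d, "lib.php")
with open(real, "w") as f:
    f.write("<?php ?>")
alias = os.path.join(d, "alias.php")
os.symlink(real, alias)

# Same file via two names: one cache entry, no realpath() needed.
assert cache_key(real) == cache_key(alias)

# The catch: many editors save by writing a new file and renaming it
# over the old one, so the same pathname ends up with a new inode and
# the previously cached key goes stale.
key_before = cache_key(real)
tmp = os.path.join(d, "lib.php.new")
with open(tmp, "w") as f:
    f.write("<?php /* edited */ ?>")
os.rename(tmp, real)
assert cache_key(real) != key_before
```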
And the sort of people that need to worry about this tend to cache stuff
from backends like crazy which, if done right, means that the backend is
only touched once in a while and the bulk of the requests will pull data
right out of a fast memory-based session mechanism or use other similar
tricks. However, if at the same time you are stuck with 30 includes and
each of these costs you 10 stat calls, that is 300 disk-touching system
calls per request that you really could do without.

> I guess someone _that_ concerned about performance could easily do a
>   cat *.php | grep -v require | php -w >app.lib
> or the like and include app.lib.

Yes, and I know a number of folks that do pre-processing like this on
their code before pushing it to their production servers. I am all for
pre-runtime content management systems that allow authors to structure
their code in whatever manner they see fit and when they push to the
production servers the content is optimized for the delivery mechanism.

For the majority of users PHP is plenty fast enough. We could stick a
sleep(1) in there and I doubt anybody would notice. When you are only
serving up a few hundred thousand requests per day it really doesn't
matter how you architect things. Regardless of this basic fact, I don't
think we should be sticking sleep(1)'s in the code, and we should be
thinking about reducing the system call overhead where we can, and
projects like PEAR should carefully evaluate the cost vs. convenience
ratio when making decisions like the one which prompted this.

-Rasmus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
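The multiplication Rasmus describes is easy to make concrete. The Python sketch below (a hypothetical directory layout, not PHP's resolver) shows how an include_path where the first entries miss turns one include into several stat calls:

```python
import os
import tempfile

d = tempfile.mkdtemp()
pear = os.path.join(d, "pear")
os.makedirs(pear)
open(os.path.join(pear, "DB.php"), "w").close()

# Hypothetical include_path: a "." stand-in and one dead entry are
# searched before the PEAR directory where the file actually lives.
include_path = [os.path.join(d, "cwd"), os.path.join(d, "old-lib"), pear]

stat_calls = 0

def probe(candidate):
    # Every probe along the path costs one stat()-style system call,
    # whether or not the file is there.
    global stat_calls
    stat_calls += 1
    return os.path.exists(candidate)

def resolve(name):
    for entry in include_path:
        candidate = os.path.join(entry, name)
        if probe(candidate):
            return candidate
    return None

# Finding one PEAR file costs three stats instead of one; multiply by
# 30 includes and the per-request overhead adds up fast.
assert resolve("DB.php") == os.path.join(pear, "DB.php")
assert stat_calls == 3
```

Hardcoding the final directory (the "hardcode the pear path" suggestion above) collapses this back to a single stat per include.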