Hello,

Jeff Bearer wrote:
> Hello,
>
> I have a PHP content management application that I've developed. I'm
> looking to add data caching to it so the database doesn't get pounded
> all day long, the content on the site changes slowly, once or twice a
> day.
>
> Does anyone know of where I can look to find an application that does
> this? I've searched and have yet to find anything that does the same
> kind of thing. I'll take anything, a module or library that does it, or
> even some other application that does the same thing that I can look at.
You are on the right track avoiding as much database access as possible. Database accesses get slower and slower as your site contents grow. This means not only that database queries take longer, but also that they eat your memory as the audience grows. Each database connection takes a good amount of memory in the database server process (about 1MB for MySQL), and the longer a page takes to generate, the longer it stalls Web server processes, forcing the server to fork more processes to attend to the growing number of simultaneous requests. That raises the amount of memory consumed at the same time until it is exhausted, leading to server crashes.

The best thing to do is not just to use caches to avoid database accesses, but also to minimize the number of dynamic pages and other dynamically generated content, meaning avoid not only database access but as much PHP scripting as you can. PHP is good and provides flexibility, but for scalability's sake avoid it wherever you can. In practice, this means avoiding personalized content and generating and serving static content for non-personalized pages.

This is easier said than done. I know that very well because my site, the PHP Classes repository, has been suffering outages because of that. The greatest problem is that, I admit, I got carried away by the power of PHP and database programming when I planned the site initially. Today the site is so busy that it can't handle that many database accesses and dynamically generated pages for everything. I have been working around the problem as much as I can with caching solutions. They reduce the problem but they don't solve it. The real solution is to redesign the site to serve static pages before I need to scale the hardware. Making the site static will also help with mirroring it to other servers, which is a good thing for which I have had plenty of requests.

Still, making the site static for non-personalized content only solves part of the problem. The personalized pages can't be accessing the database on every request either. One thing to avoid is authenticating sessions against the database on each request. I will be solving this with a back-end daemon that caches database accesses for session authentication. The Web server processes will establish persistent Unix domain socket connections to the daemon just to verify session authentication. I will also study shared memory caching to avoid the overhead of connecting to that daemon.

Still, there is the problem of personalized content. Fortunately, personalized pages are made of views that are often common even for different users, so those parts are worth caching. That problem is mostly solved with a caching class that I developed that stores caches in local disk files. The class is freely available from here:

http://phpclasses.upperdesign.com/browse.html/package/313

> Ideally I think it would be cool if it would be a PEAR module that the
> application connects to just like the database, and it manages caches
> and querying the database for data in the module.

My experience is that it is not worth caching just the queries; it is better to cache the page contents generated from those queries and thus avoid the overhead of reprocessing query results on each access. I looked at the PEAR cache classes. PEAR provides an infrastructure for caching data in different types of containers: database, shared memory and files. Caching content in the database is counterproductive because database access is precisely what you need to avoid most.
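Just to illustrate what I mean by caching page contents rather than query results, here is a rough sketch of the file cache idea. This is not the actual class; the cache file path, the expiry time and the generate_page_content() function are made up for the example:

<?php
/*
 * Rough sketch of caching generated page contents in a disk file.
 * The cache file path, the expiry time and generate_page_content()
 * are made-up names, not part of any real package.
 */
$cache_file = '/tmp/cache/front_page.html';
$expiry = 3600;  /* rebuild the cache at most once per hour */

$fp = @fopen($cache_file, 'r');
if ($fp && filemtime($cache_file) > time() - $expiry) {
    /* Fresh cache: a shared lock lets many processes read at once */
    flock($fp, LOCK_SH);
    fpassthru($fp);
    flock($fp, LOCK_UN);
    fclose($fp);
} else {
    if ($fp) {
        fclose($fp);
    }
    /* Missing or stale cache: an exclusive lock makes sure that only
       one process rebuilds it while the others wait */
    $wp = fopen($cache_file, 'c+');
    flock($wp, LOCK_EX);
    clearstatcache();
    if (filemtime($cache_file) > time() - $expiry
        && filesize($cache_file) > 0) {
        /* Another process rebuilt the cache while we were waiting */
        fpassthru($wp);
    } else {
        ob_start();
        generate_page_content();   /* runs the queries, prints the HTML */
        $contents = ob_get_contents();
        ob_end_flush();            /* still send the page to the browser */
        ftruncate($wp, 0);
        fwrite($wp, $contents);
    }
    flock($wp, LOCK_UN);
    fclose($wp);
}
?>

The locking part is important, and I get back to it below.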
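For comparison, the shared memory container idea would look roughly like this with the SysV shared memory functions. Again, the segment key, the size and the entry layout are arbitrary, and a real implementation would also need to serialize the writers with a semaphore (sem_get() and sem_acquire()) for the same race condition reasons I discuss below:

<?php
/*
 * Rough sketch of a shared memory content cache using the SysV
 * shared memory functions (--enable-sysvshm). The key, segment size
 * and entry key are arbitrary; generate_page_content() is made up.
 */
define('CACHE_SHM_KEY',  0x5350);   /* arbitrary segment key          */
define('CACHE_SHM_SIZE', 1048576);  /* 1MB shared by all processes    */
define('CACHE_ENTRY',    1);        /* numeric key inside the segment */

$shm = shm_attach(CACHE_SHM_KEY, CACHE_SHM_SIZE, 0600);
$entry = @shm_get_var($shm, CACHE_ENTRY);
if (is_array($entry) && $entry['expires'] > time()) {
    /* Serve the cached contents straight from memory */
    echo $entry['contents'];
} else {
    ob_start();
    generate_page_content();        /* runs the queries, prints the HTML */
    $contents = ob_get_contents();
    ob_end_flush();
    shm_put_var($shm, CACHE_ENTRY, array(
        'expires'  => time() + 3600,
        'contents' => $contents
    ));
}
shm_detach($shm);
?>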
Caching content in shared memory is probably the fastest way, but usually you are limited in the amount of shared memory that each process may use. Caching contents in files seems to be the most flexible way, although it could lead to better results when combined with shared memory caching. In any case, most of the time the implicit filesystem caching of the OS is already good enough to avoid most of the overhead of reading the same files over and over again.

If your cached data is going to be shared by multiple Web server processes that access it concurrently, you certainly need an engine that prevents more than one process from attempting to update the cache contents at the same time, due to the race conditions that may happen when the stored cache is outdated and needs to be rebuilt. The PEAR file cache class is not addressing this aspect. This means that if you attempt to use it on a busy site, you may end up with corrupted cache files that compromise what the site displays.

Fortunately, this issue is properly handled by the class that I pointed to above. It uses file locks to allow multiple read accesses but only one write access at a time. It has proven to work robustly on a site as busy as the PHP Classes repository, so you may want to give it a serious try.

Regards,

Manuel Lemos