Hoi Arend,

Have you already taken a look at MySQL Proxy?
http://forge.mysql.com/wiki/MySQL_Proxy

It might be able to do what you want, or might need a slight modification to do so.
Best regards,
Arnold

On Sat, 10 Nov 2007 16:51:03 +0100, "Arend van Beelen" <[EMAIL PROTECTED]> wrote:
> Hi Donal,
>
> Thanks for your suggestion. While I think this approach might provide some quick solutions in the short term, there actually is a much bigger problem we are trying to attack. I don't know exactly how much detail I can give, but I will give some background information to provide more insight into the situation...
>
> We are dealing with literally hundreds of webservers and hundreds of database servers, and are expanding both on a frequent basis. Whenever we increase the number of webservers, the databases become our bottleneck, and vice versa. We realize we won't magically solve any of these bottlenecks by introducing parallel querying on the databases. We have lots of tables which are divided over more than a dozen database clusters, and we are getting more and more tables which become so big they have to be spread out over multiple databases. Because of the distribution of these tables, querying them becomes increasingly hard, and we are approaching a limit where further distribution will become virtually undoable with our current approach, which is to query the various databases serially from PHP and merge the results manually. If we continue down this path, our PHP application will have to do more and more queries serially, and the latencies will add up. Not to mention the code maintenance required for finding the correct databases to query and merging all the results. Therefore we will need parallelization techniques that can transparently handle communication with the databases, both to keep our latencies low and to relieve our PHP application from having to deal with all the distributed databases.
>
> Thanks!
> Arend.
>
>
> -----Original Message-----
> From: Donal McMullan [mailto:[EMAIL PROTECTED]]
> Sent: Sat 10-11-2007 13:43
> To: Arend van Beelen
> CC: internals@lists.php.net; Alexey Zakhlestin
> Subject: Re: [PHP-DEV] Making parallel database queries from PHP
>
> Hi Arend -
>
> If your webserver CPUs are already maxed out, that problem won't go away on its own, but once you've solved that (optimized your code or added more webservers), the curl_multi_* functions might help you out.
>
> A cheap way to parallelize your database or data-object access is to implement a kind of services-oriented architecture, where you have one PHP script* that does little except get data from a database, serialize that data, and return it to your main PHP script.
>
> The main PHP script uses the curl_multi_init, curl_multi_add_handle, etc. functions to call this script multiple times in parallel, returning different data objects for each call.
>
> Because this introduces latency into the data retrieval trip, it will be slower for most applications. Some circumstances that might make it viable include:
> * you have more than one data store
> * you have multiple slow queries that aren't interdependent
> * you have to do expensive processing on the data you retrieve
> * you have lots of slack (CPU, RAM, processes) on the webservers
>
> In its favor, it should take just a couple of hours to prototype. If you have a single canonical data store, you might find that as soon as you enable parallel queries against the database, your database becomes the bottleneck, and throughput doesn't actually increase. This technique should reveal that as a potential problem without much development cost.
>
> Interested to know how you proceed.
>
> Donal McMullan
>
> -----------------------------------------------------------------------
> Donal @ Catalyst.Net.NZ          PO Box 11-053, Manners St, Wellington
> WEB: http://catalyst.net.nz/     PHYS: Level 2, 150-154 Willis St
> OFFICE: +64(4)803-2372           MOB: +64(21)661-254
> -----------------------------------------------------------------------
>
> *actually - Java's a pretty good option for this tier too.
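A minimal sketch of the pattern Donal describes, assuming a hypothetical fetch.php data service; the shard numbers, URLs, table, query, and credentials are all placeholders:

<?php
// fetch.php -- the data-service side: it does little except pick a database
// for the requested shard, run one query, and return the serialized rows.
$shard = (int) $_GET['shard'];
$db = mysqli_connect('db' . $shard . '.example.com', 'user', 'pass', 'dbname');
$result = mysqli_query($db, 'SELECT id, name FROM some_table');
$rows = array();
while ($row = mysqli_fetch_assoc($result)) {
    $rows[] = $row;
}
echo serialize($rows);

<?php
// Main script: request all shards in parallel with curl_multi and merge the
// result sets, instead of querying each database serially.
$shards = array(1, 2, 3, 4);

$mh = curl_multi_init();
$handles = array();
foreach ($shards as $shard) {
    $ch = curl_init('http://localhost/fetch.php?shard=' . $shard);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}

// Drive all transfers to completion; curl_multi_select() blocks until at
// least one transfer has activity, so the loop does not spin.
$running = null;
do {
    curl_multi_exec($mh, $running);
    if ($running > 0) {
        curl_multi_select($mh);
    }
} while ($running > 0);

// Collect and merge the serialized result sets.
$merged = array();
foreach ($handles as $ch) {
    $merged = array_merge($merged, unserialize(curl_multi_getcontent($ch)));
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);

The round trips to the shards now overlap instead of adding up, which addresses exactly the latency build-up Arend describes; the price is an extra HTTP hop and a serialization step per query.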
> Arend van Beelen wrote:
>> While I can see the theoretical advantage of this, I wonder how much there is to gain in practice (at least for us, that is).
>>
>> In our current codebase, when a database query is done, PHP can only continue when it has the result anyway, so it would require serious code modifications to make use of such functionality. Also, while it may theoretically shorten page load times, our webservers are already constrained by CPU load anyway, so we would probably not be able to get more pageviews out of it either.
>>
>> -----Original Message-----
>> From: Alexey Zakhlestin [mailto:[EMAIL PROTECTED]]
>> Sent: Sat 10-11-2007 11:31
>> To: Arend van Beelen
>> CC: internals@lists.php.net
>> Subject: Re: [PHP-DEV] Making parallel database queries from PHP
>>
>> I would prefer to have some function which would check whether the requested data is already available (if it is not, I would still be able to do something useful while waiting).
>>
>> On 11/10/07, Arend van Beelen <[EMAIL PROTECTED]> wrote:
>>> Hi there,
>>>
>>> I am researching the possibility of developing a shared library which can perform queries in parallel against multiple databases. One important requirement is that I will be able to use this functionality from PHP. Because I know PHP is not thread-safe due to other libraries, I am wondering what would be the best way to implement this. Right now I can imagine three solutions:
>>>
>>> - Use multiple threads to connect to the databases, but let the library export a blocking single-threaded API. So, PHP calls a function in the library, and this function spawns new threads, which do the real work. Meanwhile the function waits for the threads to finish, and when all threads are done it returns the final result back to PHP.
>>> - Use a single thread and asynchronous socket communication. So, PHP calls the library function, and this function handles all connections within the same thread using asynchronous communication, returning the result to PHP when all communication is completed.
>>> - Use a daemon on localhost. Make a connection from PHP to the daemon; the daemon handles all the connections to the databases and passes the results back over the connection made from PHP.
>>>
>>> Can someone give me some advice about the advantages of one approach over another? Please keep in mind that I'm hoping for a solution which is both stable and minimizes overhead.
>>>
>>> Thanks,
>>> Arend.
>>>
>>> --
>>> Arend van Beelen jr.
>>> "If you want my address, it's number one at the end of the bar."
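For comparison, a minimal sketch of the "check whether the data is ready yet" interface Alexey asks for, written against mysqli's asynchronous query support (MYSQLI_ASYNC with mysqli_poll(), which requires the mysqlnd driver); host names, credentials, and the query are placeholders:

<?php
// Fire the same query at several databases at once from a single thread
// (roughly Arend's second option), then poll until every connection has
// produced its result.
$hosts = array('db1.example.com', 'db2.example.com', 'db3.example.com');

$links = array();
foreach ($hosts as $host) {
    $link = mysqli_connect($host, 'user', 'pass', 'dbname');
    mysqli_query($link, 'SELECT COUNT(*) FROM some_table', MYSQLI_ASYNC);
    $links[] = $link;
}

$merged = array();
while ($links) {
    $read = $error = $reject = $links;
    // Returns as soon as at least one connection has a result ready. With a
    // zero-second timeout this becomes a pure "is it done yet?" check,
    // leaving the script free to do useful work between polls.
    if (mysqli_poll($read, $error, $reject, 1) < 1) {
        continue;
    }
    foreach ($read as $link) {
        if ($result = mysqli_reap_async_query($link)) {
            $merged[] = mysqli_fetch_row($result);
            mysqli_free_result($result);
        }
        // This connection is finished; stop polling it.
        unset($links[array_search($link, $links, true)]);
    }
}

Everything stays in one thread and one process, so none of PHP's thread-safety concerns apply; the trade-off is that this only covers MySQL, whereas a shared library or daemon could multiplex arbitrary backends.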