Hi Arend -

If your webserver CPUs are already maxed out, that problem won't go away on its own, but once you've solved that (optimized your code or added more webservers), the curl_multi_* functions might help you out.

A cheap way to parallelize your database or data-object access is to implement a kind of service-oriented architecture, where one PHP script* does little except get data from a database, serialize that data, and return it to your main PHP script.
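To make that concrete, here is a minimal sketch of what such a data script might look like. The file name, the query map, the table names and the DSN are all made up for illustration, and I've picked JSON as the serialization format just to keep it short; PHP's serialize() would do as well.

```php
<?php
// data_service.php (hypothetical name): the thin data-tier script.

// Illustrative mapping from a requested object name to its query.
const QUERIES = [
    'user'   => 'SELECT id, name FROM users WHERE id = :id',
    'orders' => 'SELECT id, total FROM orders WHERE user_id = :id',
];

// Wraps rows in an envelope and serializes them for the caller.
function serialize_rows(array $rows): string
{
    return json_encode(['data' => $rows]);
}

// Runs the named query and returns its rows, or null for an unknown name.
function fetch_rows(PDO $db, string $object, int $id): ?array
{
    if (!array_key_exists($object, QUERIES)) {
        return null;
    }
    $stmt = $db->prepare(QUERIES[$object]);
    $stmt->execute([':id' => $id]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Act as an HTTP endpoint only when called through the web server.
if (PHP_SAPI !== 'cli' && isset($_GET['object'], $_GET['id'])) {
    // Assumed DSN and credentials; replace with your own.
    $db = new PDO('mysql:host=localhost;dbname=app', 'app_user', 'secret');
    $rows = fetch_rows($db, (string) $_GET['object'], (int) $_GET['id']);
    if ($rows === null) {
        http_response_code(404);
        $rows = [];
    }
    header('Content-Type: application/json');
    echo serialize_rows($rows);
}
```

Looking queries up by name, rather than accepting SQL from the caller, keeps the script from becoming an open door to the database.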

The main PHP script uses the curl_multi_init, curl_multi_add_handle, etc. functions to call this script multiple times in parallel, with each call returning a different data object.
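The calling side could be sketched roughly as follows. fetch_parallel is a hypothetical helper name, and the timeout and the "anything other than HTTP 200 counts as a failure" rule are my assumptions, not something the curl_multi_* functions impose.

```php
<?php
// Fetches several URLs in parallel; returns response bodies keyed like the
// input array, with null for any request that did not come back as HTTP 200.
function fetch_parallel(array $urls, int $timeoutSeconds = 5): array
{
    $multi = curl_multi_init();
    $handles = [];
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // keep the body, don't echo it
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still running.
    $running = 0;
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0 && curl_multi_select($multi, 1.0) === -1) {
            usleep(10000); // select can fail immediately; avoid a busy loop
        }
    } while ($running > 0 && $status === CURLM_OK);

    $results = [];
    foreach ($handles as $key => $ch) {
        $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $results[$key] = ($code === 200) ? curl_multi_getcontent($ch) : null;
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);
    return $results;
}
```

You would call it with one URL per data object, for example fetch_parallel(['user' => $userUrl, 'orders' => $ordersUrl]) where both URLs point at your data script, and then unserialize each non-null body. The total wait is then roughly that of the slowest call rather than the sum of all of them.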

Because this introduces latency into the data retrieval trip, it will be slower for most applications. Some circumstances that might make it viable include:
 * you have more than one data store
 * you have multiple slow queries that aren't interdependent
 * you have to do expensive processing on the data you retrieve
 * you have lots of slack (CPU, RAM, processes) on the webservers

In its favor - it should take just a couple of hours to prototype. If you have a single canonical data store, you might find that as soon as you enable parallel queries against the database, your database becomes the bottleneck, and throughput doesn't actually increase. This technique should reveal that as a potential problem without much development cost.

Interested to know how you proceed.

Donal McMullan

-----------------------------------------------------------------------
Donal @ Catalyst.Net.NZ          PO Box 11-053, Manners St,  Wellington
WEB: http://catalyst.net.nz/           PHYS: Level 2, 150-154 Willis St
OFFICE: +64(4)803-2372                              MOB: +64(21)661-254
-----------------------------------------------------------------------

*actually - Java's a pretty good option for this tier too.

Arend van Beelen wrote:
While I can see the theoretical advantage of this, I wonder how much there is 
to gain in practice (at least for us, that is).

In our current codebase, when a database query is done, PHP can only continue 
when it has the result anyway, so it would require serious code modifications 
to make use of such functionality. Also, while it may theoretically shorten 
page load times, our webservers are already constrained by CPU load anyway, so 
we would probably not be able to get more pageviews out of it either.

-----Original message-----
From: Alexey Zakhlestin [mailto:[EMAIL PROTECTED]
Sent: Sat 10-11-2007 11:31
To: Arend van Beelen
CC: internals@lists.php.net
Subject: Re: [PHP-DEV] Making parallel database queries from PHP
I would prefer to have some function that checks whether the
requested data is already available (if it is not, I would still be
able to do something useful while waiting)

On 11/10/07, Arend van Beelen <[EMAIL PROTECTED]> wrote:
Hi there,

I am researching the possibility of developing a shared library which can
perform database queries in parallel to multiple databases. One important
requirement is that I will be able to use this functionality from PHP.
Because I know PHP is not thread-safe due to other libraries, I am wondering
what would be the best way to implement this. Right now I can imagine three
solutions:

- Use multiple threads to connect to the databases, but let the library
export a blocking single-threaded API. So, PHP calls a function in the
library, this function spawns new threads, which do the real work. Meanwhile
the function waits for the threads to finish, and when all threads are done
it returns the final result back to PHP.
- Use a single thread and asynchronous socket communication. So, PHP calls
the library function and this function handles all connections within the
same thread using asynchronous communication, and returns the result to PHP
when all communication is completed.
- Use a daemon on the localhost. Make a connection from PHP to the daemon,
the daemon handles all the connections to the databases and passes the
result back to the connection made from PHP.

Can someone give me some advice about the advantages of using one approach or
another? Please keep in mind that I'm hoping for a solution that is both
stable and low in overhead.

Thanks,
Arend.

--
Arend van Beelen jr.
"If you want my address, it's number one at the end of the bar."




--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php