Hoi Arend,

Have you already taken a look at MySQL Proxy?
http://forge.mysql.com/wiki/MySQL_Proxy

It might be able to do what you want, or might need a slight modification to do so.
Best regards,
Arnold

On Sat, 10 Nov 2007 16:51:03 +0100, "Arend van Beelen" <[EMAIL PROTECTED]> wrote:
> Hi Donal,
>
> Thanks for your suggestion. While I think this approach might provide some quick solutions in the short term, there actually is a much bigger problem we are trying to attack. I don't know exactly how much detail I can give, but I will give some background information to provide more insight into the situation...
>
> We are dealing with literally hundreds of webservers and hundreds of database servers, and are expanding both on a frequent basis. Whenever we increase the number of webservers, the databases become our bottleneck, and vice versa. We realize we won't magically solve any of these bottlenecks by introducing parallel querying on the databases. We have lots of tables which are divided over more than a dozen database clusters, and we are getting more and more tables which become so big they have to be spread out over multiple databases. Because of the distribution of these tables, querying them becomes increasingly hard, and we are approaching a limit where further distribution will become virtually undoable with our current approach, which is to query the various databases serially from PHP and merge the results manually. If we continue down this path, our PHP application will have to do more and more queries serially, and the latencies will add up. Not to mention the code maintenance required for finding the correct databases to query and merging all the results. Therefore we will need parallelization techniques that can transparently handle communication with the databases, both to keep our latencies low and to relieve our PHP application from having to deal with all the distributed databases.
>
> Thanks!
> Arend.
>
>
> -----Original Message-----
> From: Donal McMullan [mailto:[EMAIL PROTECTED]]
> Sent: Sat 10-11-2007 13:43
> To: Arend van Beelen
> CC: internals@lists.php.net; Alexey Zakhlestin
> Subject: Re: [PHP-DEV] Making parallel database queries from PHP
>
> Hi Arend -
>
> If your webserver CPUs are already maxed out, that problem won't go away on its own, but once you've solved that (optimized your code or added more webservers), the curl_multi_* functions might help you out.
>
> A cheap way to parallelize your database or data-object access is to implement a kind of services-oriented architecture, where you have one PHP script* that does little except get data from a database, serialize that data, and return it to your main PHP script.
>
> The main PHP script uses the curl_multi_init, curl_multi_add_handle, etc. functions to call this script multiple times in parallel, returning different data objects for each call.
>
> Because this introduces latency into the data retrieval trip, it will be slower for most applications. Some circumstances that might make it viable include:
> * you have more than one data store
> * you have multiple slow queries that aren't interdependent
> * you have to do expensive processing on the data you retrieve
> * you have lots of slack (CPU, RAM, processes) on the webservers
>
> In its favor, it should take just a couple of hours to prototype. If you have a single canonical data store, you might find that as soon as you enable parallel queries against the database, your database becomes the bottleneck, and throughput doesn't actually increase. This technique should reveal that as a potential problem without much development cost.
>
> Interested to know how you proceed.
>
> Donal McMullan
>
> -----------------------------------------------------------------------
> Donal @ Catalyst.Net.NZ          PO Box 11-053, Manners St, Wellington
> WEB: http://catalyst.net.nz/     PHYS: Level 2, 150-154 Willis St
> OFFICE: +64(4)803-2372           MOB: +64(21)661-254
> -----------------------------------------------------------------------
>
> *actually - Java's a pretty good option for this tier too.
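A minimal sketch of the pattern Donal describes, assuming a hypothetical fetch.php data service; the shard numbers, URLs, table, query, and credentials are all placeholders:

<?php
// fetch.php -- the data-service side: it does little except pick a database
// for the requested shard, run one query, and return the serialized rows.
$shard = (int) $_GET['shard'];
$db = mysqli_connect('db' . $shard . '.example.com', 'user', 'pass', 'dbname');
$result = mysqli_query($db, 'SELECT id, name FROM some_table');
$rows = array();
while ($row = mysqli_fetch_assoc($result)) {
    $rows[] = $row;
}
echo serialize($rows);

<?php
// Main script: request all shards in parallel with curl_multi and merge the
// result sets, instead of querying each database serially.
$shards = array(1, 2, 3, 4);

$mh = curl_multi_init();
$handles = array();
foreach ($shards as $shard) {
    $ch = curl_init('http://localhost/fetch.php?shard=' . $shard);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}

// Drive all transfers to completion; curl_multi_select() blocks until at
// least one transfer has activity, so the loop does not spin.
$running = null;
do {
    curl_multi_exec($mh, $running);
    if ($running > 0) {
        curl_multi_select($mh);
    }
} while ($running > 0);

// Collect and merge the serialized result sets.
$merged = array();
foreach ($handles as $ch) {
    $merged = array_merge($merged, unserialize(curl_multi_getcontent($ch)));
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);

The round trips to the shards now overlap instead of adding up, which addresses exactly the latency build-up Arend describes; the price is an extra HTTP hop and a serialization step per query.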
> Arend van Beelen wrote:
>> While I can see the theoretical advantage of this, I wonder how much there is to gain in practice (at least for us, that is).
>>
>> In our current codebase, when a database query is done, PHP can only continue when it has the result anyway, so it would require serious code modifications to make use of such functionality. Also, while it may theoretically shorten page load times, our webservers are already constrained by CPU load anyway, so we would probably not be able to get more pageviews out of it either.
>>
>> -----Original Message-----
>> From: Alexey Zakhlestin [mailto:[EMAIL PROTECTED]]
>> Sent: Sat 10-11-2007 11:31
>> To: Arend van Beelen
>> CC: internals@lists.php.net
>> Subject: Re: [PHP-DEV] Making parallel database queries from PHP
>>
>> I would prefer to have some function which would check whether the requested data is already available (if it is not, I would still be able to do something useful while waiting).
>>
>> On 11/10/07, Arend van Beelen <[EMAIL PROTECTED]> wrote:
>>> Hi there,
>>>
>>> I am researching the possibility of developing a shared library which can perform queries in parallel against multiple databases. One important requirement is that I will be able to use this functionality from PHP. Because I know PHP is not thread-safe due to other libraries, I am wondering what would be the best way to implement this. Right now I can imagine three solutions:
>>>
>>> - Use multiple threads to connect to the databases, but let the library export a blocking single-threaded API. So, PHP calls a function in the library, and this function spawns new threads, which do the real work. Meanwhile the function waits for the threads to finish, and when all threads are done it returns the final result back to PHP.
>>> - Use a single thread and asynchronous socket communication. So, PHP calls the library function, and this function handles all connections within the same thread using asynchronous communication, returning the result to PHP when all communication is completed.
>>> - Use a daemon on localhost. Make a connection from PHP to the daemon; the daemon handles all the connections to the databases and passes the results back over the connection made from PHP.
>>>
>>> Can someone give me some advice about the advantages of one approach over another? Please keep in mind that I'm hoping for a solution which is both stable and minimizes overhead.
>>>
>>> Thanks,
>>> Arend.
>>>
>>> --
>>> Arend van Beelen jr.
>>> "If you want my address, it's number one at the end of the bar."
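For comparison, a minimal sketch of the "check whether the data is ready yet" interface Alexey asks for, written against mysqli's asynchronous query support (MYSQLI_ASYNC with mysqli_poll(), which requires the mysqlnd driver); host names, credentials, and the query are placeholders:

<?php
// Fire the same query at several databases at once from a single thread
// (roughly Arend's second option), then poll until every connection has
// produced its result.
$hosts = array('db1.example.com', 'db2.example.com', 'db3.example.com');

$links = array();
foreach ($hosts as $host) {
    $link = mysqli_connect($host, 'user', 'pass', 'dbname');
    mysqli_query($link, 'SELECT COUNT(*) FROM some_table', MYSQLI_ASYNC);
    $links[] = $link;
}

$merged = array();
while ($links) {
    $read = $error = $reject = $links;
    // Returns as soon as at least one connection has a result ready. With a
    // zero-second timeout this becomes a pure "is it done yet?" check,
    // leaving the script free to do useful work between polls.
    if (mysqli_poll($read, $error, $reject, 1) < 1) {
        continue;
    }
    foreach ($read as $link) {
        if ($result = mysqli_reap_async_query($link)) {
            $merged[] = mysqli_fetch_row($result);
            mysqli_free_result($result);
        }
        // This connection is finished; stop polling it.
        unset($links[array_search($link, $links, true)]);
    }
}

Everything stays in one thread and one process, so none of PHP's thread-safety concerns apply; the trade-off is that this only covers MySQL, whereas a shared library or daemon could multiplex arbitrary backends.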