On 12/1/2015 9:51 PM, Robby Findler wrote:
You probably thought of this, but could you hand the ports themselves
over to the worker places after the dispatch but before any other
processing happened? Or maybe just make each place run the entire
server and then hand the ports over before dispatching and dispatch
inside each worker place?

Hi Robby,

I didn't think of that exactly. Using serve/servlet, I don't know how to get the port before I wind up in a handler function. Once inside a handler, AFAIK, the port is only accessible from response/output (or a custom response function which I haven't attempted).

My initial idea was to have each place run serve/servlet using its own set of dispatch rules. I was thinking about having one place handle all the low volume requests plus some background tasks, and having a number of identical places handle the high volume requests.
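That first idea might be sketched roughly like this (untested): one worker place per core, each running its own serve/servlet with its own dispatch rules on its own port, with the existing Apache front end balancing across the ports. The handler name and port numbers here are illustrative assumptions, not from the real application:

```racket
#lang racket
(require racket/place)

;; Body of each worker place: receive a port number, then run the
;; web server for that place. The serve/servlet call is sketched in
;; comments because each place would install its own dispatch rules.
(define (worker-main ch)
  (define port-no (place-channel-get ch))
  ;; In the real thing, something like:
  ;;   (require web-server/servlet-env)
  ;;   (serve/servlet search-handler
  ;;                  #:port port-no
  ;;                  #:servlet-regexp #rx"")
  (printf "place would serve on port ~a\n" port-no))

(module+ main
  ;; Spawn one place per port and wait on all of them.
  (define places
    (for/list ([p (in-list '(8081 8082 8083))])
      (define pl (place ch (worker-main ch)))
      (place-channel-put pl p)
      pl))
  (for-each place-wait places))
```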

I did think of having just one place perform dispatch, and using handler stubs such as:

(define (high-volume-function request)
  (response/output
   (lambda (port)
     (define ch  (get-a-worker))
     (define msg (hash 'req request 'port port))
     (place-channel-put/get ch msg))
   ;; ... other response/output keyword arguments ...
   ))

while doing most of the processing in the worker place, perhaps with some sanity checking in the dispatch place before delegating.
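The matching worker-place side might look something like this (untested): a loop that receives the hash sent by the stub, writes its result to the supplied port, and replies so that place-channel-put/get can return. It assumes the response/output port is actually place-message-allowed, which is exactly the open question, and `run-search` is a hypothetical stand-in for the real query processing:

```racket
;; Sketch of a worker place's main loop, paired with the
;; high-volume-function stub above. Assumes the port survives the
;; trip through the place channel; run-search is hypothetical and
;; is assumed to return a byte string.
(define (worker-loop ch)
  (let loop ()
    (define msg (place-channel-get ch))
    (define req (hash-ref msg 'req))
    (define out (hash-ref msg 'port))
    (write-bytes (run-search req) out)
    (flush-output out)
    (place-channel-put ch 'done) ; unblocks place-channel-put/get
    (loop)))
```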

But I'm not positive this approach would work. Is that the actual TCP port in response/output, or a local bytestring that will buffer the data in the dispatch place before sending it on to the client? [Have to model that and find out]. And dealing with a whole new set of potential errors due to using places could get interesting.

I can't afford to have data copied or buffered unnecessarily. RAM isn't an issue, but processing performance is. There's no point in relaying data from place to place only to have both garbage collectors recycle it later.

The biggest issue I'm facing is that there's no practical way to limit the data. The primary function of the application is to be a specialized DBMS search engine. Most realistic queries are expected to produce < 250KB ... but poorly targeted queries combined with [not my choice] an option to "show all results" (as opposed to paging them) have the potential to produce many Mbytes. Meanwhile the database - and with it the largest possible result - keeps growing. The user base is also expected to grow significantly in the next year.


Most of the time, the single-core application handles its load just fine. I have it limited to 16 concurrent requests; most searches return in a few seconds, and the odd large one is absorbed nicely even under saturating traffic. But 99.9994% of all requests are some kind of search - every other function combined lurks in the rounding error. Just a handful of concurrent large queries can bog down the server for minutes while the application struggles with post-processing and packaging results for the clients. Apache and the DBMS are apportioned carefully over the available cores - the limiting factor for more users and/or bigger searches on the current hardware is now the application.

So my application will be going "places" :-) ... it's only a matter of exactly how.

George
