You need to split the request handler from the request processor. Receive
the request in mod_perl, then queue it for a separate application that
does the actual heavy lifting.
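
To make the split concrete, here is a rough sketch of the pattern, written
in Python only to keep it short. Everything in it is an assumption for
illustration: the names (handle_request, worker, score), the toy logistic
scoring function, and the in-process queue, which stands in for whatever
broker or socket would sit between mod_perl and the separate scoring
application in a real deployment.

```python
import math
import queue
import threading
from concurrent.futures import Future

# Hypothetical job queue. In a real setup this would be an external broker
# or socket between the mod_perl front end and a separate scoring process.
jobs = queue.Queue(maxsize=1000)


def score(features, weights):
    """Toy logistic score, standing in for the expensive model call."""
    z = sum(f * w for f, w in zip(features, weights))
    return 1.0 / (1.0 + math.exp(-z))


def worker(weights):
    """Processor side: pull jobs off the queue and do the heavy lifting."""
    while True:
        item = jobs.get()
        if item is None:  # sentinel: shut this worker down
            break
        features, fut = item
        try:
            fut.set_result(score(features, weights))
        except Exception as exc:  # report failures back to the caller
            fut.set_exception(exc)


def handle_request(features):
    """Handler side: enqueue the work and return a future for the result."""
    fut = Future()
    jobs.put((features, fut))
    return fut


def run_demo(n_workers=4, n_requests=8, n_features=1000):
    """Start workers, submit a few requests, collect results, shut down."""
    weights = [0.0] * n_features  # placeholder model parameters
    threads = [threading.Thread(target=worker, args=(weights,))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    futures = [handle_request([1.0] * n_features) for _ in range(n_requests)]
    results = [f.result(timeout=5) for f in futures]
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

The point of the design is that the handler does nothing but enqueue, so
the number of scoring workers can be sized and scaled independently of the
number of web server children.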

On Wed, Jan 8, 2020 at 8:59 PM Wesley Peng <wes...@magenta.de> wrote:

> Hello
>
> We are running LR[1], GBDT[2], and similar algorithms in MP2 handlers.
> For each request, about 1000 features are passed to the handlers as
> arguments via HTTP POST.
> Each request waits about 100 ms for a response, because the
> calculation is not cheap.
> My question is: how can we improve throughput through architectural
> optimization?
> Yes, we know there are prediction frameworks such as TFS[3] and RT[4],
> but we have not used TensorFlow yet.
>
>
> [1] https://en.wikipedia.org/wiki/Logistic_regression
> [2] https://en.wikipedia.org/wiki/Gradient_boosting
> [3] https://www.tensorflow.org/tfx/guide/serving
> [4] https://developer.nvidia.com/tensorrt
>
>
> Thanks.
>
