Thanks for digging up all the information. I think adding a caching validator would provide a nice improvement, so I'm in favor of it, and if I were to decide, I'd also include it in Koha.

Still, getting it upstream would be great in the long run, so I'd support that as well.

--Ere

[email protected] wrote on 27.4.2021 at 5.34:
After more experimenting, I have concluded that JSON::Validator->validate() is the culprit in terms of CPU time and memory usage.

Fortunately, I’ve determined that Mojo::JSON and Digest::MD5 can be used together to create consistent reproducible checksums, which could be used for caching validated schemas.
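
A minimal sketch of that checksum idea, under stated assumptions: it uses the core JSON::PP module in canonical mode (sorted keys) in place of Mojo::JSON so that it runs without Mojolicious installed, and spec_checksum, validate_once and the in-memory %validated hash are hypothetical names, not part of JSON::Validator.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Digest::MD5 qw(md5_hex);
use JSON::PP ();

# Canonical mode sorts hash keys, so equal data always encodes to equal bytes.
my $encoder = JSON::PP->new->utf8->canonical;

# Hypothetical helper: stable checksum of a decoded schema.
sub spec_checksum {
    my ($spec) = @_;
    return md5_hex( $encoder->encode($spec) );
}

# Hypothetical per-process cache: checksum => 1 once a schema has validated.
my %validated;

# Runs the expensive $validate callback only for schemas not seen before.
# Returns 1 if validation ran, 0 on a cache hit.
sub validate_once {
    my ( $spec, $validate ) = @_;
    my $key = spec_checksum($spec);
    return 0 if $validated{$key};
    $validate->($spec);    # expected to die on an invalid schema
    $validated{$key} = 1;
    return 1;
}
```

A per-process hash only saves repeat validations inside one worker; sharing the result between Starman workers or across restarts would need the checksum stored somewhere persistent, which this sketch does not attempt.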

Of course, a solution would involve changes to JSON::Validator (and possibly Mojolicious::Plugin::OpenAPI depending on the chosen solution), and then we’d have to wait for the new and improved version to come downstream, so we wouldn’t see the benefit of this for years.

That said… we could always roll our own JSON::Validator. And if we don’t want to do it as a community, I could always just do it myself.

In terms of testing… with 18 CPUs I can restart 60 instances (120 processes) and get through the app setup in about 60 seconds with significant server load, when using validation. Without validation, I can do it in 20 seconds without significant server load (beyond a few short-lived CPU spikes).

I’m thinking about writing a patch and sending a pull request for JSON::Validator, but I’m also seriously considering implementing it locally, at least in the meantime.

I haven’t heard from the author of JSON::Validator for a little while now, but I hope I do hear back from him. I think it would be a great addition to the library.

David Cook
Software Engineer
Prosentient Systems
Suite 7.03
6a Glen St
Milsons Point NSW 2061
Australia

Office: 02 9212 0899
Online: 02 8005 0595

*From:* Koha-devel <[email protected]> *On Behalf Of* [email protected]
*Sent:* Monday, 26 April 2021 7:01 PM
*To:* 'Renvoize, Martin' <[email protected]>
*Cc:* 'Koha Devel' <[email protected]>
*Subject:* Re: [Koha-devel] Optimizing Starman startup

After some more experimenting, it’s clear that the problem isn’t JSON::Validator::OpenAPI::Mojolicious or Koha::REST::Plugin::PluginRoutes. If you exclude Mojolicious::Plugin::OpenAPI, the startup is very fast. It’s 30 seconds start to finish to restart 60 instances and each instance restarts very quickly.

When using Mojolicious::Plugin::OpenAPI, it takes about 3 minutes and there’s a fair bit of downtime during that time.

When I do a strace, I’m noticing that a process can spend 30 seconds just allocating memory for Mojolicious::Plugin::OpenAPI, but it only happens once you hit a certain volume of processes. If you’re just starting up 1 or 2, then it’s only a couple of seconds. But if you have, say, 60-120 processes, it can take up to 30 seconds for Mojolicious::Plugin::OpenAPI to do its work. I’m putting 10 CPUs to this work, but clearly that’s not enough. I imagine there may be other bottlenecks accessing the memory as well.

Has anyone profiled Mojolicious before? I’m guessing maybe Martin?

I suspect that this is just a problem that I’m going to have to live with but maybe it is a case where I can find a way to optimize Mojolicious::Plugin::OpenAPI.

David Cook


*From:* Koha-devel <[email protected]> *On Behalf Of* [email protected]
*Sent:* Monday, 26 April 2021 5:12 PM
*To:* 'Renvoize, Martin' <[email protected]>
*Cc:* 'Koha Devel' <[email protected]>
*Subject:* Re: [Koha-devel] Optimizing Starman startup

So I just tried the following…

--

root@kohadevbox:koha(master)$ npm install -g swagger-cli
/usr/bin/swagger-cli -> /usr/lib/node_modules/swagger-cli/swagger-cli.js
npm WARN @apidevtools/[email protected] requires a peer of openapi-types@>=7 but none is installed. You must install peer dependencies yourself.
+ [email protected]
added 46 packages from 27 contributors in 8.203s

--

root@kohadevbox:koha(master)$ time swagger-cli bundle api/v1/swagger/swagger.json --outfile api/v1/swagger/openapi.json --type json
Created api/v1/swagger/openapi.json from api/v1/swagger/swagger.json

real    0m0.296s
user    0m0.346s
sys     0m0.032s

openapi.json is 10891 lines long, but it still contains 741 $ref lines like "$ref": "#/definitions/error" and "$ref": "#/definitions/patron_extended_attribute".
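
One way to check what kind of references remain after bundling is to walk the decoded JSON rather than count lines. The following is a sketch only, using the core JSON::PP module; count_refs is a hypothetical helper name, and the file path in the usage comment is simply the output file from the command above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use JSON::PP ();

# Walk a decoded JSON structure and count "$ref" string values, split into
# local pointers ("#/...") and references to other documents.
sub count_refs {
    my ( $node, $counts ) = @_;
    $counts //= { local => 0, external => 0 };
    if ( ref $node eq 'HASH' ) {
        my $target = $node->{'$ref'};
        if ( defined $target && !ref $target ) {
            $counts->{ $target =~ /^#/ ? 'local' : 'external' }++;
        }
        count_refs( $_, $counts ) for values %$node;
    }
    elsif ( ref $node eq 'ARRAY' ) {
        count_refs( $_, $counts ) for @$node;
    }
    return $counts;
}

# Optional command-line use: perl count_refs.pl api/v1/swagger/openapi.json
if (@ARGV) {
    open my $fh, '<:raw', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $doc = JSON::PP->new->utf8->decode( do { local $/; <$fh> } );
    my $counts = count_refs($doc);
    print "local: $counts->{local}, external: $counts->{external}\n";
}
```

If every remaining reference is local, the validator still has pointers to resolve. swagger-cli documents a --dereference option for its bundle command; whether inlining every reference would help startup time here is untested.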

--

Now to do some benchmarking… I ran the following code:

#!/usr/bin/perl
use strict;
use warnings;

use JSON::Validator::OpenAPI::Mojolicious;

my $validator = JSON::Validator::OpenAPI::Mojolicious->new;

# Bundle the spec, replacing each $ref with the schema it points to.
my $spec = $validator->bundle({
     replace => 1,
     schema  => "api/v1/swagger/swagger.json",
});

The first time I ran it… it took 1.343 seconds. The second time and subsequent times it took 0.354 seconds. (That’s using Ubuntu 20.04 and JSON::Validator 3.14.) That suggests caching, although I’m not sure where. I don’t see anything obvious in /usr/share/perl5/JSON/Validator/cache.

Trying with openapi.json yields 0.280 seconds instead of 0.354 seconds. It’s faster, but not significantly.
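
The gap between the first run and later runs is not pinned down above. Timing the module load and the bundle call separately, inside the script, is one way to narrow it. A sketch with the core Time::HiRes module follows; time_step is a hypothetical helper, and the two workloads are stand-ins rather than the real JSON::Validator calls.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: run a code ref, then print and return the elapsed
# wall-clock seconds.
sub time_step {
    my ( $label, $code ) = @_;
    my $start = [gettimeofday];
    $code->();
    my $elapsed = tv_interval($start);
    printf "%-10s %.3f s\n", $label, $elapsed;
    return $elapsed;
}

# Stand-in workloads; in the benchmark above these would be the module load
# (a require) and the bundle() call respectively.
time_step( 'load',   sub { require JSON::PP } );
time_step( 'bundle', sub { my $n = 0; $n += $_ for 1 .. 100_000 } );
```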

So that suggests that the problem is actually with Koha::REST::Plugin::PluginRoutes or Mojolicious::Plugin::OpenAPI more specifically…

David Cook


*From:* [email protected] <[email protected]>
*Sent:* Monday, 26 April 2021 11:24 AM
*To:* 'Renvoize, Martin' <[email protected]>
*Cc:* 'Koha Devel' <[email protected]>
*Subject:* RE: [Koha-devel] Optimizing Starman startup

I think that I accidentally offended him, as he hasn’t responded to me since his initial response.

I do wonder if reducing the number of references would help, although I wonder how easy that would be to do in practice. It looks like we have about 5767 lines of JSON all up as is… so it would probably get even bigger if we dereferenced them.

Oh… here’s a thought. Why don’t we compile it? According to https://davidgarcia.dev/posts/how-to-split-open-api-spec-into-multiple-files/, you can maintain many different files, and then use something like swagger-cli to create a single built/compiled OpenAPI file.

That way JSON::Validator wouldn’t need to resolve any references for the core API. I don’t know if the plugins have any $ref in them but I’m guessing not (just based on Coverflow). So that could be a big win.

I’m working on other things at the moment, but I’m going to put that on my eternal list.

David Cook


*From:* Renvoize, Martin <[email protected]>
*Sent:* Friday, 23 April 2021 5:24 PM
*To:* David Cook <[email protected]>
*Cc:* Koha Devel <[email protected]>
*Subject:* Re: [Koha-devel] Optimizing Starman startup

Jan's code is certainly challenging to read and understand at times, I agree. I used to contribute to the plugin a number of years ago, but the project that gave me time to play with that has since been sold on, so I'm not involved at the level I used to be. He uses lots of Perl foo, which often takes me a long time to wrap my head around.

As for the refs, I think we split our spec up too much, in all honesty. Even the swagger spec suggests we went too far. I think I might have been unclear when I first pushed for a split from one massive file. We could/should certainly reduce that somewhat. It'll be interesting to see if it makes much difference; that could be a fairly quick win.

On Thu, 22 Apr 2021, 12:32 am, <[email protected]> wrote:

    Hi Ere,

    I think you're right about the refs. While they get resolved by the
    OpenAPI plugin, you probably have to resolve them before trying to
    dynamically inject the routes from plugins.

    Jan Thorsen (the author of Mojolicious::Plugin::OpenAPI and
    JSON::Validator) thinks that the ref resolution is actually what's
    taking so long. I looked it up and I think we have over 400
    different references in the main OpenAPI spec alone. I haven't
    profiled it but something to think about.

    At some point, I'm going to have a play with newer versions of the
    modules. I'm going to look at Ubuntu 20.04 and newer Debian versions
    to see what I can get away with in terms of newness. Needs more
    investigation, but I am really hoping that this is an issue that can
    be solved by just upgrading the OS.

    I find Jan's code to be unnecessarily opaque (could use more
    descriptive comments and function naming) but... I'll investigate.
    Probably not right away as I have a bunch of other priorities that I
    have to address but... this is on my mind.

    Starman startup time is probably the thing about Koha annoying me
    the most right now and probably the most practical thing I can
    improve at the moment...

    David Cook

    -----Original Message-----
    From: Ere Maijala <[email protected]>
    Sent: Wednesday, 21 April 2021 6:31 PM
    To: [email protected]; [email protected]
    Subject: Re: [Koha-devel] Optimizing Starman startup

    Hi David,

    I wish I'd remember all the details, but my memory fails me. I think
    not using JSON had something to do with how the refs are resolved.
    That may or may not have been the reason, but if everything works
    with the JSON module, I can't think of a reason not to use it.

    Thanks for taking a look!

    --Ere

    [email protected] wrote on 21.4.2021 at 3.28:
     > Hi Ere,
     >
     > Thanks for your reply. 24700 looks much better. I'll look at
     > backporting it locally.
     >
     > Although I'm looking at JSON::Validator::OpenAPI::Mojolicious at
     > https://metacpan.org/pod/release/JHTHORSEN/Mojolicious-Plugin-OpenAPI-2.19/lib/JSON/Validator/OpenAPI/Mojolicious.pm
     > and it says "Do not use this module directly. Use
     > Mojolicious::Plugin::OpenAPI instead." I notice that you're using
     > the "bundle" method. Do we really need that there? Why don't we just
     > load the JSON using the JSON module, merge with the plugin spec
     > files, and then pass it to the OpenAPI plugin? Shouldn't the plugin
     > take care of the $ref replacement?
     >
     > Hmm... I didn't realize until now that the OpenAPI plugin was
     > doing a validate behind the scenes. That's tricky.
     >
     > At a glance, we might be able to pre-load the app into the Starman
     > master process pre-fork. There are warnings about doing that with
     > open database connections, so we'd need to review plack.psgi, but a
     > quick glance suggests it might be OK. (Alternatively, I have wondered
     > about running the REST API as a separate process apart from Starman
     > using hypnotoad. According to
     > https://docs.mojolicious.org/Mojolicious/Guides/Cookbook,
     > Mojo::Server::Prefork preloads the application in the manager/master
     > process, and Hypnotoad is based off that, so that would help.)
     >
     > It does seem like changes to the OpenAPI plugin would be needed
     > for caching.
     >
     > I'm going to try backporting your change and try pre-loading and
     > see how far that gets me.
     >
     > David Cook
     >
     > -----Original Message-----
     > From: Koha-devel <[email protected]> On
     > Behalf Of Ere Maijala
     > Sent: Tuesday, 20 April 2021 4:48 PM
     > To: [email protected]
     > Subject: Re: [Koha-devel] Optimizing Starman startup
     >
     > Hi,
     >
     > I did some work on improving it here:
     >
     > https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=24700
     >
     > That shaved a good bit of time from it, but it's still a heavy
     > operation, and it would make sense to
     >
     > 1.) avoid doing it too often
     >
     > 2.) cache the results and avoid doing it if results are cached
     >
     > If you could address the first one, that'd go a long way. I'm
     > afraid the second one would require changes to the OpenAPI plugin to
     > support caching.
     >
     > --Ere
     >
     > [email protected] wrote on 20.4.2021 at 6.15:
     >> Hi all,
     >>
     >> Do you despair when you see the following periodically in “top” when
     >> a starman worker is recreated?
     >>
     >>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM   TIME+ COMMAND
     >>  9529 my-koha   20   0  460108 197212  17172 R 100.0  0.4 0:03.41 /usr/share/koha/api/v1/app.pl
     >>
     >> Or the following in top when you install the koha-common package or
     >> restart the koha-common service?
     >>
     >> 11101 1-koha    20   0  447232 193320  16076 R  10.6  0.4 0:09.09 /usr/share/koha/api/v1/app.pl
     >> 11168 1-koha    20   0  447240 193264  16056 R  10.6  0.4 0:08.72 /usr/share/koha/api/v1/app.pl
     >> 11306 2-koha    20   0  447220 193148  16000 R  10.6  0.4 0:08.07 /usr/share/koha/api/v1/app.pl
     >> 11543 2-koha    20   0  447232 193036  15828 R  10.6  0.4 0:07.07 /usr/share/koha/api/v1/app.pl
     >> 11784 3-koha    20   0  441536 189664  16172 R  10.6  0.4 0:06.04 /usr/share/koha/api/v1/app.pl
     >> 11830 3-koha    20   0  439548 187212  15748 R  10.6  0.4 0:05.82 /usr/share/koha/api/v1/app.pl
     >> 11831 4-koha    20   0  438620 186344  15748 R  10.6  0.4 0:05.81 /usr/share/koha/api/v1/app.pl
     >> 11853 4-koha    20   0  437680 185672  16000 R  10.6  0.4 0:05.79 /usr/share/koha/api/v1/app.pl
     >>
     >> Well, I still have a lot of investigation left to do, but I notice
     >> that one place where a lot of time is taken is here (per worker):
     >>
     >>       my $validator = JSON::Validator::OpenAPI::Mojolicious->new;
     >>       $validator->load_and_validate_schema(
     >>           $self->home->rel_file("api/v1/swagger/swagger.json"),
     >>           {
     >>             allow_invalid_ref  => 1,
     >>           }
     >>         );
     >>
     >>       push @{$self->routes->namespaces}, 'Koha::Plugin';
     >>
     >>       my $spec = $validator->schema->data;
     >>       $self->plugin(
     >>           'Koha::REST::Plugin::PluginRoutes' => {
     >>               spec      => $spec,
     >>               validator => $validator
     >>           }
     >>       );
     >>
     >>       $self->plugin(
     >>           OpenAPI => {
     >>               spec  => $spec,
     >>               route => $self->routes->under('/api/v1')->to('Auth#under'),
     >>               allow_invalid_ref =>
     >>                 1,    # required by our spec because $ref directly under
     >>                       # Paths-, Parameters-, Definitions- & Info-object
     >>                       # is not allowed by the OpenAPI specification.
     >>           }
     >>       );
     >>
     >> Anyone have ideas for improving this? Do we have to validate the
     >> schema every time? Can we move the schema validation into a different
     >> module and preload it into Starman using the -M flag so that it’s
     >> done 1 time per Starman master instance rather than 1 time per
     >> Starman worker instance?
     >>
     >> I find “/usr/share/koha/api/v1/app.pl” to be the bane of deployments,
     >> as it puts a massive load on a server, when you have multiple Koha
     >> instances on the server.
     >>
     >> David Cook
     >>
     >>
     >> _______________________________________________
     >> Koha-devel mailing list
     >> [email protected]
     >> https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
     >> website : https://www.koha-community.org/
     >> git : https://git.koha-community.org/
     >> bugs : https://bugs.koha-community.org/
     >>
     >
     > --
     > Ere Maijala
     > Kansalliskirjasto / The National Library of Finland


--
Ere Maijala
Kansalliskirjasto / The National Library of Finland