After more experimenting, I have concluded that
JSON::Validator->validate() is the culprit in terms of CPU time and
memory usage.
Fortunately, I’ve determined that Mojo::JSON and Digest::MD5 can be used
together to create consistent reproducible checksums, which could be
used for caching validated schemas.
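As a rough illustration of the checksum idea, here is a minimal sketch. It assumes the schema is already decoded into a Perl data structure and relies on Mojo::JSON's encode_json emitting hash keys in sorted order, which is what makes the digest reproducible. The names schema_checksum, validate_once and %validated are hypothetical and not part of JSON::Validator; the cache is only an in-memory hash, so sharing results between processes would need some persistent store instead.

```perl
use strict;
use warnings;
use Mojo::JSON qw(encode_json);
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: checksum of a decoded schema (hash or array ref).
# encode_json returns bytes, so the result can go straight into md5_hex.
sub schema_checksum {
    my ($schema) = @_;
    return md5_hex(encode_json($schema));
}

# Hypothetical in-memory cache: checksum => array ref of validation errors.
my %validated;

# Run the (expensive) validation callback only once per distinct schema.
sub validate_once {
    my ($schema, $validate) = @_;
    my $key = schema_checksum($schema);
    $validated{$key} //= [ $validate->($schema) ];
    return @{ $validated{$key} };
}
```

The callback passed to validate_once would wrap whatever validation call is actually in use; a second call with an equivalent schema returns the stored errors without invoking it again.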
Of course, a solution would involve changes to JSON::Validator (and
possibly Mojolicious::Plugin::OpenAPI depending on the chosen solution),
and then we’d have to wait for the new and improved version to come
downstream, so we wouldn’t see the benefit of this for years.
That said… we could always roll our own JSON::Validator. And if we don’t
want to do it as a community, I could always just do it myself.
In terms of testing… with 18 CPUs I can restart 60 instances (120
processes) and get through the app setup in about 60 seconds with
significant server load, when using validation. Without validation, I
can do it in 20 seconds without significant server load (beyond a few
short-lived CPU spikes).
I’m thinking about writing a patch and sending a pull request for
JSON::Validator, but I’m also seriously considering implementing it
locally, at least in the meantime.
I haven’t heard from the author of JSON::Validator for a little while
now, but I hope I do hear back from him. I think it would be a great
addition to the library.
David Cook
Software Engineer
Prosentient Systems
Suite 7.03
6a Glen St
Milsons Point NSW 2061
Australia
Office: 02 9212 0899
Online: 02 8005 0595
From: Koha-devel <[email protected]> On Behalf Of [email protected]
Sent: Monday, 26 April 2021 7:01 PM
To: 'Renvoize, Martin' <[email protected]>
Cc: 'Koha Devel' <[email protected]>
Subject: Re: [Koha-devel] Optimizing Starman startup
After some more experimenting, it’s clear that the problem isn’t
JSON::Validator::OpenAPI::Mojolicious or
Koha::REST::Plugin::PluginRoutes. If you exclude
Mojolicious::Plugin::OpenAPI, the startup is very fast. It’s 30 seconds
start to finish to restart 60 instances and each instance restarts very
quickly.
When using Mojolicious::Plugin::OpenAPI, it takes about 3 minutes and
there’s a fair bit of downtime during that time.
When I do a strace, I’m noticing that a process can spend 30 seconds
just allocating memory for Mojolicious::Plugin::OpenAPI, but it only
happens once you hit a certain volume of processes. If you’re just
starting up 1 or 2, then it’s only a couple seconds. But if you have say
60-120 processes, it can take up to 30 seconds for
Mojolicious::Plugin::OpenAPI to do its work. I’m putting 10 CPUs to this
work, but clearly that’s not enough. I imagine there may be other
bottlenecks accessing the memory as well.
Has anyone profiled Mojolicious before? I’m guessing maybe Martin?
I suspect that this is just a problem that I’m going to have to live
with but maybe it is a case where I can find a way to optimize
Mojolicious::Plugin::OpenAPI.
David Cook
From: Koha-devel <[email protected]> On Behalf Of [email protected]
Sent: Monday, 26 April 2021 5:12 PM
To: 'Renvoize, Martin' <[email protected]>
Cc: 'Koha Devel' <[email protected]>
Subject: Re: [Koha-devel] Optimizing Starman startup
So I just tried the following…
--
root@kohadevbox:koha(master)$ npm install -g swagger-cli
/usr/bin/swagger-cli -> /usr/lib/node_modules/swagger-cli/swagger-cli.js
npm WARN @apidevtools/[email protected] requires a peer of
openapi-types@>=7 but none is installed. You must install peer
dependencies yourself.
+ [email protected]
added 46 packages from 27 contributors in 8.203s
--
root@kohadevbox:koha(master)$ time swagger-cli bundle api/v1/swagger/swagger.json --outfile api/v1/swagger/openapi.json --type json
Created api/v1/swagger/openapi.json from api/v1/swagger/swagger.json
real 0m0.296s
user 0m0.346s
sys 0m0.032s
openapi.json is 10891 lines long but it actually contains 741 $ref lines
like "$ref": "#/definitions/error" and "$ref":
"#/definitions/patron_extended_attribute".
--
Now to do some benchmarking… I ran the following code:
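#!/usr/bin/perl
use strict;
use warnings;
use JSON::Validator::OpenAPI::Mojolicious;

# Bundle the split spec into one structure; replace => 1 asks for each
# $ref to be replaced with the data it points to.
my $validator = JSON::Validator::OpenAPI::Mojolicious->new;
my $spec = $validator->bundle({
    replace => 1,
    schema  => "api/v1/swagger/swagger.json",
});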
The first time I ran it, it took 1.343 seconds. The second and
subsequent times it took 0.354 seconds. (That’s using Ubuntu 20.04 and
JSON::Validator 3.14.) That suggests caching, although I’m not sure where.
I don’t see anything obvious in /usr/share/perl5/JSON/Validator/cache.
Trying with openapi.json yields 0.280 seconds instead of 0.354 seconds.
It’s faster, but not significantly.
So that suggests that the problem is actually with
Koha::REST::Plugin::PluginRoutes or Mojolicious::Plugin::OpenAPI more
specifically…
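For anyone wanting to repeat this kind of first-run versus later-run comparison from inside Perl, here is a minimal sketch using the core Time::HiRes module. The time_it helper and the stand-in workload are hypothetical; the thread does not say how the figures above were measured, and in practice the code reference would wrap the bundle() call instead.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: run a code reference once and return the elapsed
# wall-clock time in seconds.
sub time_it {
    my ($code) = @_;
    my $start = [gettimeofday];
    $code->();
    return tv_interval($start);
}

# Stand-in workload; replace with the call being measured.
my $work = sub { my $sum = 0; $sum += $_ for 1 .. 100_000; return $sum };

printf "first run:  %.3fs\n", time_it($work);
printf "second run: %.3fs\n", time_it($work);
```

Timing two runs within one process only shows in-process caching; timing separate invocations of the script, as with the shell's time command, also captures any caching on disk or by the operating system.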
David Cook
From: [email protected]
Sent: Monday, 26 April 2021 11:24 AM
To: 'Renvoize, Martin' <[email protected]>
Cc: 'Koha Devel' <[email protected]>
Subject: RE: [Koha-devel] Optimizing Starman startup
I think that I accidentally offended him, as he hasn’t responded to me
since his initial response.
I do wonder if reducing the number of references would help, although I
wonder how easy that would be to do in practice. It looks like we have
about 5767 lines of JSON all up as is… so it would probably get even
bigger if we dereferenced them.
Oh… here’s a thought. Why don’t we compile it? According to
https://davidgarcia.dev/posts/how-to-split-open-api-spec-into-multiple-files/,
you can maintain many different files, and then use something like
swagger-cli to create a single built/compiled OpenAPI file.
That way JSON::Validator wouldn’t need to resolve any references for the
core API. I don’t know if the plugins have any $ref in them but I’m
guessing not (just based on Coverflow). So that could be a big win.
I’m working on other things at the moment, but I’m going to put that on
my eternal list.
David Cook
From: Renvoize, Martin <[email protected]>
Sent: Friday, 23 April 2021 5:24 PM
To: David Cook <[email protected]>
Cc: Koha Devel <[email protected]>
Subject: Re: [Koha-devel] Optimizing Starman startup
Jan's code is certainly challenging to read and understand at times, I
agree. I used to contribute to the plugin a number of years ago, but the
project that gave me time to play with that has since been sold on, so
I'm not involved at the level I used to be. He uses lots of Perl foo,
which often takes me a long time to wrap my head around.
As for the refs, I think we split our spec up too much, in all honesty.
Even the Swagger spec suggests we went too far. I think I might have
been unclear when I first pushed for a split from one massive file. We
could/should certainly reduce that somewhat. It'll be interesting to
see if it makes much difference; that could be a fairly quick win.
On Thu, 22 Apr 2021, 12:32 am, <[email protected]> wrote:
Hi Ere,
I think you're right about the refs. While they get resolved by the
OpenAPI plugin, you probably have to resolve them before trying to
dynamically inject the routes from plugins.
Jan Thorsen (the author of Mojolicious::Plugin::OpenAPI and
JSON::Validator) thinks that the ref resolution is actually what's
taking so long. I looked it up and I think we have over 400
different references in the main OpenAPI spec alone. I haven't
profiled it but something to think about.
At some point, I'm going to have a play with newer versions of the
modules. I'm going to look at Ubuntu 20.04 and newer Debian versions
to see what I can get away with in terms of newness. It needs more
investigation, but I am really hoping that this is an issue that can
be solved by just upgrading the OS.
I find Jan's code to be unnecessarily opaque (could use more
descriptive comments and function naming) but... I'll investigate.
Probably not right away as I have a bunch of other priorities that I
have to address but... this is on my mind.
Starman startup time is probably the thing about Koha annoying me
the most right now and probably the most practical thing I can
improve at the moment...
David Cook
-----Original Message-----
From: Ere Maijala <[email protected]>
Sent: Wednesday, 21 April 2021 6:31 PM
To: [email protected]; [email protected]
Subject: Re: [Koha-devel] Optimizing Starman startup
Hi David,
I wish I'd remember all the details, but my memory fails me. I think
not using JSON had something to do with how the refs are resolved.
That may or may not have been the reason, but if everything works
with JSON module, I can't think of a reason not to use it.
Thanks for taking a look!
--Ere
[email protected] wrote on 21.4.2021 at 3.28:
> Hi Ere,
>
> Thanks for your reply. 24700 looks much better. I'll look at
> backporting it locally.
>
> Although I'm looking at JSON::Validator::OpenAPI::Mojolicious at
> https://metacpan.org/pod/release/JHTHORSEN/Mojolicious-Plugin-OpenAPI-2.19/lib/JSON/Validator/OpenAPI/Mojolicious.pm
> and it says "Do not use this module directly. Use
> Mojolicious::Plugin::OpenAPI instead." I notice that you're using
> the "bundle" method. Do we really need that there? Why don't we just
> load the JSON using the JSON module, merge with the plugin spec
> files, and then pass it to the OpenAPI plugin? Shouldn't the plugin
> take care of the $ref replacement?
>
> Hmm... I didn't realize until now that the OpenAPI plugin was
> doing a validate behind the scenes. That's tricky.
>
> At a glance, we might be able to pre-load the app into the Starman
> master process pre-fork. There are warnings about doing that with open
> database connections, so we'd need to review plack.psgi, but a quick
> glance suggests it might be OK. (Alternatively, I have wondered about
> running the REST API as a separate process apart from Starman using
> hypnotoad. According to
> https://docs.mojolicious.org/Mojolicious/Guides/Cookbook,
> Mojo::Server::Prefork preloads the application in the manager/master
> process, and Hypnotoad is based off that, so that would help.)
>
> It does seem like changes to the OpenAPI plugin would be needed for
> caching.
>
> I'm going to try backporting your change and try pre-loading and see
> how far that gets me.
>
> David Cook
>
> -----Original Message-----
> From: Koha-devel <[email protected]> On
> Behalf Of Ere Maijala
> Sent: Tuesday, 20 April 2021 4:48 PM
> To: [email protected]
> Subject: Re: [Koha-devel] Optimizing Starman startup
>
> Hi,
>
> I did some work on improving it here:
>
> https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=24700
>
> That shaved a good bit of time from it, but it's still a heavy
> operation, and it would make sense to
>
> 1.) avoid doing it too often
>
> 2.) cache the results and avoid doing it if results are cached
>
> If you could address the first one, that'd go a long way. I'm afraid
> the second one would require changes to the OpenAPI plugin to support
> caching.
>
> --Ere
>
> [email protected] wrote on 20.4.2021 at 6.15:
>> Hi all,
>>
>> Do you despair when you see the following periodically in “top” when
>> a starman worker is recreated?
>>
>>   PID USER     PR NI   VIRT    RES   SHR S  %CPU %MEM   TIME+ COMMAND
>>  9529 my-koha  20  0 460108 197212 17172 R 100.0  0.4 0:03.41 /usr/share/koha/api/v1/app.pl
>>
>> Or the following in top when you install koha-common package or
>> restart the koha-common service?
>>
>> 11101 1-koha   20  0 447232 193320 16076 R  10.6  0.4 0:09.09 /usr/share/koha/api/v1/app.pl
>> 11168 1-koha   20  0 447240 193264 16056 R  10.6  0.4 0:08.72 /usr/share/koha/api/v1/app.pl
>> 11306 2-koha   20  0 447220 193148 16000 R  10.6  0.4 0:08.07 /usr/share/koha/api/v1/app.pl
>> 11543 2-koha   20  0 447232 193036 15828 R  10.6  0.4 0:07.07 /usr/share/koha/api/v1/app.pl
>> 11784 3-koha   20  0 441536 189664 16172 R  10.6  0.4 0:06.04 /usr/share/koha/api/v1/app.pl
>> 11830 3-koha   20  0 439548 187212 15748 R  10.6  0.4 0:05.82 /usr/share/koha/api/v1/app.pl
>> 11831 4-koha   20  0 438620 186344 15748 R  10.6  0.4 0:05.81 /usr/share/koha/api/v1/app.pl
>> 11853 4-koha   20  0 437680 185672 16000 R  10.6  0.4 0:05.79 /usr/share/koha/api/v1/app.pl
>>
>> Well, I still have a lot of investigation left to do, but I notice
>> that one place where a lot of time is spent is here (per worker):
>>
>> my $validator = JSON::Validator::OpenAPI::Mojolicious->new;
>> $validator->load_and_validate_schema(
>>     $self->home->rel_file("api/v1/swagger/swagger.json"),
>>     {
>>         allow_invalid_ref => 1,
>>     }
>> );
>>
>> push @{$self->routes->namespaces}, 'Koha::Plugin';
>>
>> my $spec = $validator->schema->data;
>> $self->plugin(
>>     'Koha::REST::Plugin::PluginRoutes' => {
>>         spec      => $spec,
>>         validator => $validator
>>     }
>> );
>>
>> $self->plugin(
>>     OpenAPI => {
>>         spec  => $spec,
>>         route => $self->routes->under('/api/v1')->to('Auth#under'),
>>         allow_invalid_ref =>
>>           1,    # required by our spec because $ref directly under
>>                 # Paths-, Parameters-, Definitions- & Info-object
>>                 # is not allowed by the OpenAPI specification.
>>     }
>> );
>>
>> Anyone have ideas for improving this? Do we have to validate the
>> schema every time? Can we move the schema validation into a different
>> module and preload it into Starman using the -M flag so that it’s done
>> 1 time per Starman master instance rather than 1 time per Starman
>> worker instance?
>>
>> I find “/usr/share/koha/api/v1/app.pl” to be the bane of deployments,
>> as it puts a massive load on a server, when you have multiple Koha
>> instances on the server.
>>
>> David Cook
>>
>>
>> _______________________________________________
>> Koha-devel mailing list
>> [email protected]
>> https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
>> website : https://www.koha-community.org/
>> git : https://git.koha-community.org/
>> bugs : https://bugs.koha-community.org/
>>
>
> --
> Ere Maijala
> Kansalliskirjasto / The National Library of Finland