Mariusz,

Thank you. I don't know why, but the command "pm2 -i max start 
dspace-ui.json" did not work as needed. Only after I added the entries to the 
dspace-ui.json file and rebooted the whole server (pm2 restarts had no 
effect) did it run on 4 processors in cluster mode. The repository now runs 
much faster; I'm curious how it will behave during the day.
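
(For anyone hitting the same thing: a possible explanation, assuming the app 
was first started in fork mode, is that pm2 restart/reload keeps the original 
exec mode, so a full server reboot should not be needed. A clean delete and 
start usually picks up the new cluster settings. The app name "dspace-ui" 
below is an assumption.)

# assumed app name "dspace-ui"; restart/reload keep the old exec mode,
# so switching from fork to cluster usually needs a delete + start:
pm2 delete dspace-ui
pm2 start dspace-ui.json
pm2 save   # persist the new process list so it is resurrected on boot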
Thanks and best regards,

Karol

On Wednesday, June 28, 2023 at 1:44:51 PM UTC+2 Technologiczny Informator 
wrote:

Hi, 

Exactly. I don't know which Linux distribution you are using, but you should 
probably look at the systemd unit that is responsible for the pm2 service. 
There you may find Type=forking by default instead of Type=cluster.
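
For reference, a unit generated by pm2 startup systemd typically looks 
roughly like the sketch below (the user name and pm2 install path are 
assumptions). Note that the systemd Type= line only controls how systemd 
tracks the pm2 daemon itself; whether the app runs in fork or cluster mode is 
set in pm2's own configuration (exec_mode).

# /etc/systemd/system/pm2-dspace.service (illustrative sketch, paths assumed)
[Unit]
Description=PM2 process manager
After=network.target

[Service]
Type=forking
User=dspace
Environment=PM2_HOME=/home/dspace/.pm2
ExecStart=/usr/lib/node_modules/pm2/bin/pm2 resurrect
ExecReload=/usr/lib/node_modules/pm2/bin/pm2 reload all
ExecStop=/usr/lib/node_modules/pm2/bin/pm2 kill
Restart=on-failure

[Install]
WantedBy=multi-user.target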

Regards,
Mariusz

On Wednesday, June 28, 2023 at 1:24:11 PM UTC+2 Karol wrote:

Hi,

Mariusz, this is a screenshot of the output of pm2 list:
[image: Screenshot from 2023-06-28 13-22-05.png]

If I understand correctly, "mode" should be "cluster"?
Thanks,

Karol

On Wednesday, June 28, 2023 at 10:17:27 AM UTC+2 Technologiczny Informator 
wrote:

Hi,

Run the pm2 list command. The mode column will show which mode your 
frontend is running in.
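
Trimmed, illustrative output (names and ids are placeholders): in cluster 
mode you should see one row per worker with "cluster" in the mode column, 
whereas a single row showing "fork" means only one Node process is serving 
requests.

  id  name        mode     status
  0   dspace-ui   cluster  online
  1   dspace-ui   cluster  online
  2   dspace-ui   cluster  online
  3   dspace-ui   cluster  online

pm2 describe dspace-ui also reports the exec mode for a single app.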

Regards,
Mariusz

On Wednesday, June 28, 2023 at 9:55:07 AM UTC+2 Karol wrote:

Hi,

Edmund, 

thank you very much for the hints. I have a few questions:

1) Yes, the system has started swapping, but I can't identify what is causing 
the swapping (Tomcat, Angular, or PostgreSQL). How can I identify which 
service is swapping?
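
(One way to answer this, assuming a Linux kernel that exposes VmSwap in 
/proc, is to read the per-process swap counters; the snippet below is a 
generic sketch, not DSpace-specific.)

# list the processes using the most swap (VmSwap is reported in kB)
awk '/^Name:/{n=$2} /^VmSwap:/{print $2, "kB", n}' /proc/[0-9]*/status 2>/dev/null | sort -rn | head -15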

2) Is config.prod.yml the place where I can reduce the number of pm2 
instances?

# The rateLimiter settings limit each IP to a 'max' of 500 requests per 'windowMs' (1 minute).
rateLimiter:
  windowMs: 60000 # 1 minute
  max: 500 # limit each IP to 500 requests per windowMs
# Trust X-FORWARDED-* headers from proxies (default = true)
useProxies: true

3) This command is great, thanks: cat access.log | grep -v " 403 " | grep 
-v " 301 " | grep -v " 408 " | cut -d " " -f 1 | sort | uniq -c | sort -n
It returns:

  13395 195.164.49.68 - amazon bot
  16903 3.224.220.101 - amazon bot
 177081 52.70.240.171 - amazon bot
  17146 23.22.35.162 - amazon bot
1644494 IP address of my server

Do I understand correctly that each Amazon query generates several additional 
proxy queries, and that is why there are so many requests from my own 
server's address?
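
(That matches the earlier observation that the Angular SSR proxies REST 
calls, so Apache sees the server's own address for them. To make external 
clients stand out, the same pipeline can simply exclude that address; 
YOUR_SERVER_IP below is a placeholder.)

cat access.log | grep -v " 403 " | grep -v " 301 " | grep -v " 408 " | grep -v "^YOUR_SERVER_IP " | cut -d " " -f 1 | sort | uniq -c | sort -rn | head -20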

4) I totally agree, this is a huge and elegant project, but we need to report 
the various problems we run into; it will allow better development in the 
future :)

Mariusz,

Thanks, I added the following to dspace-ui.json:

  "instances": "max",
  "exec_mode": "cluster",
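
For context, in the ecosystem file those entries sit inside the app 
definition; a minimal sketch looks like the following. The app name and paths 
here are placeholders, adjust them to your install.

{
  "apps": [
    {
      "name": "dspace-ui",
      "cwd": "/home/dspace/dspace-angular",
      "script": "dist/server/main.js",
      "instances": "max",
      "exec_mode": "cluster",
      "env": { "NODE_ENV": "production" }
    }
  ]
}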

I then started it with pm2 -i max start dspace-ui.json, so I was convinced 
that this would be enough.

Do you know how I can confirm this?

Greetings,

Karol



On Tuesday, June 27, 2023 at 8:01:50 AM UTC+2 Technologiczny Informator 
wrote:

Hi,

are you sure your frontend is running in cluster mode?
https://wiki.lyrasis.org/display/DSDOC7x/Performance+Tuning+DSpace

Regards,
Mariusz

On Monday, June 26, 2023 at 11:14:22 PM UTC+2 Edmund Balnaves wrote:

There is an architectural issue in the angular -> API design which means 
that a very large number of calls are made to the API for each page load.  

This also makes the logs very noisy.

I have found 16GB lean for DSpace 7.5 where the database is on the same 
server. Tomcat needs about 3GB, each pm2 instance takes about 1GB, and Solr 
and Postgres chew up a lot. The ClamAV daemon can chew up another 2GB.

You might want to *reduce* the number of pm2 instances, as you may be 
running low on memory. If your system is starting to swap, this can slow 
things down terribly.

Adjust robots.txt to block entity paths and browse paths, as robots can get 
lost in the DSpace search (a historical problem).
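
A sketch of the kind of entries meant here; the stock dspace-angular 
robots.txt already contains similar lines, the exact paths depend on your 
version, and blocking /entities/* trades crawler discoverability for load 
(sitemaps still expose the items).

User-agent: *
# Sitemap: <URL of your sitemap index, so items stay discoverable>
Disallow: /search
Disallow: /browse/*
Disallow: /community-list
Disallow: /statistics
Disallow: /entities/*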

The following can help show where traffic is coming from

cat access.log | grep -v " 403 " | grep -v " 301 " | grep -v " 408 " | cut 
-d " " -f 1 | sort | uniq -c | sort -n

Unfortunately you will find that a lot of the traffic is to the API server 
but you can identify bots this way.

fail2ban is a useful tool to block misbehaving bots.
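
For example, a minimal jail in jail.local using the apache-badbots filter 
that ships with fail2ban (the log path assumes a Debian/Ubuntu layout):

[apache-badbots]
enabled  = true
port     = http,https
logpath  = /var/log/apache2/access.log
maxretry = 1
bantime  = 86400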

IMHO DSpace 7 needs a bit more work architecturally to improve 
performance. That is understandable; it is a huge (and impressive) 
migration that has been completed from DSpace 6. The new version is a 
very fresh and nice design, and the new API is nice.

Edmund Balnaves
Prosentient Systems

On Tuesday, June 27, 2023 at 5:57:38 AM UTC+10 Karol wrote:

Hi,

I have deployed DSpace 7 in production on 4 vCPUs and 16 GB RAM. I start 
Angular using all CPUs with pm2 -i max, but the performance of the whole site 
is very bad. I can see my Apache logs are growing fast: access.log and 
dspace.log. Probably bots are indexing new content and this is killing my 
site, and real users can't submit work or use the repository. Unfortunately, 
I can't tell 100% what or who is overloading the system, because the Apache 
logs show my server's own address (probably because of the proxy for Angular).
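
One possible way to get more out of those logs, if the proxy chain sets an 
X-Forwarded-For header on the SSR-originated API calls (worth verifying 
first; it may simply be absent), is to have Apache log that header alongside 
the remote address and user agent. The log name and format below are 
illustrative only.

# illustrative Apache config; X-Forwarded-For is logged only if the upstream actually sets it
LogFormat "%{X-Forwarded-For}i %h %l %u %t \"%r\" %>s %b \"%{User-Agent}i\"" proxycombined
CustomLog ${APACHE_LOG_DIR}/dspace_access.log proxycombined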

* "top" shows 130% CPU and 20% RAM for node /dspace-angular-7.5/dist/server/main.js 
- this is where I'm looking for the performance problem.

* The Apache access logs take up 400 MB per day - I see continuous logging, 
but I can't tell from which IP addresses. The DSpace log (dspace.log) grows 
by 300 MB per day.

1) How can I increase the performance of Angular (node 
/dspace-angular-7.5/dist/server/main.js)? (I already use pm2 -i max)

2) How can I check which addresses all these requests to DSpace 7 are coming 
from?


Thanks and best regards,

Karol
