Having quit Amazon, where I was doing Postgres development, I've started 
looking at various things I might work on for fun. One thought is to start with 
something easy like the scalability of GetSnapshotData(). :-)


I recently found it interesting to examine performance while running near 1 
million pgbench selects per sec on a 48 core/96 HT Skylake box (pgbench scale 
10000 with 2048 clients). I noticed that additional sessions trying to connect 
were timing out because they got stuck in ProcArrayAdd, waiting to acquire 
ProcArrayLock in EXCLUSIVE mode.


The question is whether the GetSnapshotData() scaling problem has reached a 
critical point on the newest high end systems.


I didn't have time to complete my analysis as I lost access to the hardware on 
my last day. Still, it shouldn't cost me much more than about $6 per hour to 
do experiments on a 48 core system.


What I'd like is a pointer to any current discussions of ideas to improve 
snapshot scaling. I have some ideas of my own but want to review the existing 
work before posting them.
