Hi all,

I am a master's student in database systems, working on a stress testing methodology. The goal is to stress-test PostgreSQL under different workloads, and I would like to discuss some results with you.
My methodology is based on incrementing two variables: complexity and workload. The complexity is the setup of the testing environment and the SUT, i.e. the hardware and software configuration. The workload is the number of transactions submitted to the SUT. The test case increases the number of transactions to find the limit of the system.

I had some problems with the max_connections parameter. By default its value is 100; I set it to 2,000, according to the amount of available memory. I used the following formula to set the operating system parameter SHMMAX:

SHMMAX = 250 kB + 8.2 kB * shared_buffers + 14.2 kB * max_connections

The database started correctly.

I began the tests using 5 client machines and 1 server. On the server side I ran only Postgres. On the client side I used threads to simulate a large number of transactions, submitting 1,000, 10,000 and 100,000 transactions sequentially. I used the TPC-B transaction; this benchmark simulates a banking system, with inserts, updates and selects on customer accounts.

In the first test, 1,000 transactions were submitted to the database within 2 seconds. The analysis window is divided into one-second intervals. The distribution of transactions per interval is uneven due to client startup delay. In the first interval, 200 transactions were submitted and all were served and completed successfully. In the subsequent interval, 800 transactions were submitted; of these, 33 were not started because the DB refused the connection. This is not expected, since the DB was configured to serve 2,000 concurrent connections. An important point worth emphasizing is that there were no aborted transactions, i.e. every transaction submitted and accepted by the DB completed successfully, with an average execution time of 1 second or less.

The worst case happened with 100,000 transactions. In that case the success rate was close to 800 transactions per second, but the connection errors reached 10,000 per second. The configured max_connections limit was never reached at any time.

Is this a bug? I hope I have explained the setup clearly.

Best regards
Jorge Augusto Meira
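
P.S. To make the client side concrete, below is a minimal sketch of the kind of threaded client I described above. It is only illustrative, not my exact harness: the psycopg2 driver, the connection string, and the accounts/history table layout are assumptions (a pgbench-style schema), but the structure (one connection per transaction, counting refused connections separately from aborted transactions) matches what I am measuring.

    # Minimal sketch of a threaded load client (illustrative assumptions, not the real harness).
    # Assumes psycopg2 is installed and a pgbench-style accounts/history schema exists.
    import threading
    import psycopg2

    DSN = "dbname=bench user=postgres host=server"   # assumed connection string
    N_THREADS = 100                                  # concurrent client threads
    TXN_PER_THREAD = 10                              # transactions per thread

    connection_errors = 0
    lock = threading.Lock()

    def run_client(thread_id):
        global connection_errors
        for _ in range(TXN_PER_THREAD):
            try:
                # One new connection per transaction, as in the tests.
                conn = psycopg2.connect(DSN)
            except psycopg2.OperationalError:
                # These are the "connection refused" cases I am counting.
                with lock:
                    connection_errors += 1
                continue
            try:
                with conn, conn.cursor() as cur:
                    # TPC-B style transaction: update, select and insert on account data.
                    cur.execute("UPDATE accounts SET abalance = abalance + %s WHERE aid = %s",
                                (10, thread_id))
                    cur.execute("SELECT abalance FROM accounts WHERE aid = %s", (thread_id,))
                    cur.execute("INSERT INTO history (aid, delta) VALUES (%s, %s)",
                                (thread_id, 10))
            finally:
                conn.close()

    threads = [threading.Thread(target=run_client, args=(t,)) for t in range(N_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("connection errors:", connection_errors)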