Re: Why HDD performance is better than SSD in this case
Hi Neto,

RAID 0 should never be used to store production data. It is never a good idea, in my opinion. The simple reason is that when you lose one disk, you lose everything.

If your goal is to benchmark the disk, go for a single disk. If you want to be closer to a production setup, go for RAID 10, or pick a RAID setup close to your needs and capabilities (more reads? more writes? SSD? HDD? cache? ...?).

If you only have 2 disks, your only redundant choice is RAID 1.

regards,

fabio pardi

On 18/07/18 03:24, Neto pr wrote:
>> As side note: why to run a test on a setup you can never use on production?
>>
>> regards,
>>
>> fabio pardi
>
> Can you just explain why you said it below?
>
> "As side note: why to run a test on a setup you can never use on production?"
>
> You think that a RAID ZERO configuration for a DBMS is little used?
> Which one do you think would be good? I accept suggestions because I
> am in the middle of a work for my
> research of the postgraduate course and I can change the environment
> to something that is more useful and really used in real production
> environments.
>
> Best Regards
> []`s Neto
Re: Why HDD performance is better than SSD in this case
On 18/07/2018 at 03:16, Neto pr wrote:
> 2018-07-17 22:13 GMT-03:00 Neto pr :
>> 2018-07-17 20:04 GMT-03:00 Mark Kirkwood :
>>> Ok, so dropping the cache is good.
>>>
>>> How are you ensuring that you have one test setup on the HDDs and one on the
>>> SSDs? i.e do you have 2 postgres instances? or are you using one instance
>>> with tablespaces to locate the relevant tables? If the 2nd case then you
>>> will get pollution of shared_buffers if you don't restart between the HDD
>>> and SSD tests. If you have 2 instances then you need to carefully check the
>>> parameters are set the same (and probably shut the HDD instance down when
>>> testing the SSD etc).
>>
>> Dear Mark
>> To ensure that the test is honest and has the same configuration the
>> O.S. and also DBMS, my O.S. is installed on the SSD and DBMS as well.
>> I have an instance only of DBMS and two database.
>> - a database called tpch40gnorhdd with tablespace on the HDD disk.
>> - a database called tpch40gnorssd with tablespace on the SSD disk.
>> See below:
>>
>> postgres=# \l
>>                                   List of databases
>>      Name      |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
>> ---------------+----------+----------+-------------+-------------+-----------------------
>>  postgres      | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
>>  template0     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
>>                |          |          |             |             | postgres=CTc/postgres
>>  template1     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
>>                |          |          |             |             | postgres=CTc/postgres
>>  tpch40gnorhdd | user1    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
>>  tpch40gnorssd | user1    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
>> (5 rows)
>>
>> postgres=#
>>
>> After 7 query execution in a database tpch40gnorhdd I restart the DBMS
>> (/etc/init.d/pg101norssd restart and drop cache of the O.S.) and go to
>> execution test with the database tpch40gnorssd.
>> You think in this case there is pollution of shared_buffers?
>> Why do you think having O.S. on SSD is bad? Do you could explain better?
>>
>> Best regards
>> []`s Neto
>
> +1 information about EVO SSD Samsung:
> Model: 850 Evo 500 GB SATA III 6Gb/s -
> http://www.samsung.com/semiconductor/minisite/ssd/product/consumer/850evo/

As stated on this mailing list in January, the Samsung 850 Evo is not a particularly fast SSD. In particular, its performance is not very consistent (see https://www.anandtech.com/show/8747/samsung-ssd-850-evo-review/5 and https://www.anandtech.com/bench/product/1913 ).

This is not a product for professional usage, and you should not expect great performance from it. As reported by these benchmarks, you can see a latency of about 34 ms under very intensive usage:

ATSB - The Destroyer (99th Percentile Write Latency)
99th Percentile Latency in Microseconds - Lower is Better
34923

Even the average write latency of the Samsung 850 Evo is 3.3 ms under an intensive workload.

Why are you using this type of SSD for your benchmark? What do you plan to achieve?

>>> I can see a couple of things in your setup that might pessimize the SSD
>>> case:
>>> - you have OS on the SSD - if you tests make the system swap then this will
>>> wreck the SSD result
>>> - you have RAID 0 SSD...some of the cheaper ones slow down when you do this.
>>> maybe test with a single SSD
>>>
>>> regards
>>> Mark
>>>
>>> On 18/07/18 01:04, Neto pr wrote (note snippage):
>>>
>>>> (echo 3 > /proc/sys/vm/drop_caches;
>>>>
>>>> discs:
>>>> - 2 units of Samsung Evo SSD 500 GB (mounted on ZERO RAID)
>>>> - 2 SATA 7500 Krpm HDD units - 1TB (mounted on ZERO RAID)
>>>>
>>>> - The Operating System and the Postgresql DBMS are installed on the SSD
>>>> disk.
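The gap between the average and the 99th percentile write latency quoted above is what "not really consistent" means in practice: a small fraction of slow writes barely moves the mean but dominates the tail. A minimal Python sketch of that effect follows. The sample latencies are invented for illustration and are not measurements of any drive, and the helper names `percentile` and `us_to_ms` are hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def us_to_ms(microseconds):
    """Convert microseconds to milliseconds."""
    return microseconds / 1000.0


# Hypothetical write latencies in microseconds: 98 fast writes, 2 slow ones.
latencies_us = [200] * 98 + [35000] * 2

mean_us = sum(latencies_us) / len(latencies_us)
p99_us = percentile(latencies_us, 99)

print("mean latency (ms):", us_to_ms(mean_us))
print("p99 latency (ms):", us_to_ms(p99_us))
# The benchmark figure quoted in the thread, converted to milliseconds.
print("quoted p99 (ms):", us_to_ms(34923))
```

When comparing HDD and SSD query times, reporting a tail percentile next to the mean makes this kind of stall visible.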
Re: Why HDD performance is better than SSD in this case
Ok, so you are using 1 instance and tablespaces. Also I see you are restarting the instance between HDD and SSD tests, so all good there.

The point I made about having the OS on the SSDs means that if these tests make your system swap, and your swap device is on the SSDs (which it probably is by default), then swap activity will compete with db access activity for IOPS on your SSDs and spoil the results of your test (i.e. slow down your SSDs).

You can check this using top, sar or iostat to see *if* you are swapping during the tests.

Ideally you would design your setup to use 3 separate devices:

- one device (or array) for os, swap, tmp etc
- one device (HDD array) for your 'HDD' tablespace
- one device (SSD array) for your 'SSD' tablespace

regards
Mark

On 18/07/18 13:13, Neto pr wrote:
> Dear Mark
> To ensure that the test is honest and has the same configuration the
> O.S. and also DBMS, my O.S. is installed on the SSD and DBMS as well.
> I have an instance only of DBMS and two database.
> - a database called tpch40gnorhdd with tablespace on the HDD disk.
> - a database called tpch40gnorssd with tablespace on the SSD disk.
> See below:
>
> postgres=# \l
>                                   List of databases
>      Name      |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
> ---------------+----------+----------+-------------+-------------+-----------------------
>  postgres      | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
>  template0     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
>                |          |          |             |             | postgres=CTc/postgres
>  template1     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
>                |          |          |             |             | postgres=CTc/postgres
>  tpch40gnorhdd | user1    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
>  tpch40gnorssd | user1    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
> (5 rows)
>
> postgres=#
>
> After 7 query execution in a database tpch40gnorhdd I restart the DBMS
> (/etc/init.d/pg101norssd restart and drop cache of the O.S.) and go to
> execution test with the database tpch40gnorssd.
> You think in this case there is pollution of shared_buffers?
> Why do you think having O.S. on SSD is bad? Do you could explain better?
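Besides watching top, sar or iostat by hand, the swap check Mark describes can be scripted. On Linux, /proc/vmstat exposes the cumulative counters `pswpin` and `pswpout` (pages swapped in and out since boot), so a snapshot taken before and after a test run shows whether any swapping happened in between. This is a minimal sketch under that assumption; the function names are hypothetical and the sample text in the usage note is made up.

```python
def parse_swap_counters(vmstat_text):
    """Extract the pswpin/pswpout counters from the text of /proc/vmstat."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def swap_activity(before_text, after_text):
    """Return how many pages were swapped in and out between two snapshots."""
    before = parse_swap_counters(before_text)
    after = parse_swap_counters(after_text)
    return {
        name: after.get(name, 0) - before.get(name, 0)
        for name in ("pswpin", "pswpout")
    }


def read_vmstat(path="/proc/vmstat"):
    """Read the current counters (Linux only)."""
    with open(path) as handle:
        return handle.read()


def swapped(delta):
    """True if any page was swapped in or out during the interval."""
    return delta["pswpin"] > 0 or delta["pswpout"] > 0
```

Typical use would be `before = read_vmstat()`, then run the queries, then `swapped(swap_activity(before, read_vmstat()))`. A result of `True` means the run should be treated as contaminated by swap I/O.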
Re: Why HDD performance is better than SSD in this case
On Wed, 18 Jul 2018 09:46:32 +0200, Fabio Pardi wrote:
> RAID 0 to store production data should never be used. Never a good idea, in my opinion.

RAID 0 by itself should never be used. Combined with other RAID levels, it can boost performance without sacrificing reliability.
https://en.wikipedia.org/wiki/Nested_RAID_levels

Personally, I don't like RAID 0 + ? schemes because they use too many disks (with associated reliability issues). The required performance usually can be achieved in other ways.

But YMMV.

George
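The reliability argument in this exchange can be put into numbers with a simplified model. The sketch below assumes each disk fails independently with the same probability p over some period and ignores rebuild windows, controller failures and correlated failures, so it is an illustration rather than a reliability estimate. The value p = 0.05 and the function names are hypothetical.

```python
def raid0_loss_probability(p, disks):
    """RAID 0 loses all data if any one of its disks fails."""
    return 1 - (1 - p) ** disks


def raid1_loss_probability(p, disks=2):
    """A RAID 1 mirror loses data only if every copy fails."""
    return p ** disks


def raid10_loss_probability(p, mirrored_pairs):
    """RAID 10 (a stripe over two-disk mirrors) loses data if any
    mirrored pair loses both of its disks."""
    pair_loss = raid1_loss_probability(p, 2)
    return 1 - (1 - pair_loss) ** mirrored_pairs


p = 0.05  # hypothetical per-disk failure probability
print("RAID 0, 2 disks: ", raid0_loss_probability(p, 2))
print("RAID 1, 2 disks: ", raid1_loss_probability(p, 2))
print("RAID 10, 4 disks:", raid10_loss_probability(p, 2))
```

Under these assumptions a two-disk RAID 0 is roughly twice as likely to lose data as a single disk, while striping over mirrors keeps the loss probability far below that of one disk. Adding more mirrored pairs raises it again, which matches George's remark that schemes with many disks carry their own reliability cost.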
Re: Why HDD performance is better than SSD in this case
> Model: 850 Evo 500 GB SATA III 6Gb/s -

Please check the SSD "DRIVE HEALTH STATUS" and the "S.M.A.R.T values of specified disk", for example with the "smartctl" tool ( https://www.smartmontools.org/ ) ( -x "Show all information for device" ).

Expected output with "Samsung SSD 850 EVO 500GB":
https://superuser.com/questions/1169810/smart-data-of-a-new-ssd

Regards,
Imre

Neto pr wrote (on Wed, 18 Jul 2018, 3:17):
> 2018-07-17 22:13 GMT-03:00 Neto pr :
> > 2018-07-17 20:04 GMT-03:00 Mark Kirkwood :
> >> Ok, so dropping the cache is good.
> >>
> >> How are you ensuring that you have one test setup on the HDDs and one on the
> >> SSDs? i.e do you have 2 postgres instances? or are you using one instance
> >> with tablespaces to locate the relevant tables? If the 2nd case then you
> >> will get pollution of shared_buffers if you don't restart between the HDD
> >> and SSD tests. If you have 2 instances then you need to carefully check the
> >> parameters are set the same (and probably shut the HDD instance down when
> >> testing the SSD etc).
> >>
> > Dear Mark
> > To ensure that the test is honest and has the same configuration the
> > O.S. and also DBMS, my O.S. is installed on the SSD and DBMS as well.
> > I have an instance only of DBMS and two database.
> > - a database called tpch40gnorhdd with tablespace on the HDD disk.
> > - a database called tpch40gnorssd with tablespace on the SSD disk.
> > See below:
> >
> > postgres=# \l
> >                                   List of databases
> >      Name      |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
> > ---------------+----------+----------+-------------+-------------+-----------------------
> >  postgres      | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
> >  template0     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
> >                |          |          |             |             | postgres=CTc/postgres
> >  template1     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
> >                |          |          |             |             | postgres=CTc/postgres
> >  tpch40gnorhdd | user1    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
> >  tpch40gnorssd | user1    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
> > (5 rows)
> >
> > postgres=#
> >
> > After 7 query execution in a database tpch40gnorhdd I restart the DBMS
> > (/etc/init.d/pg101norssd restart and drop cache of the O.S.) and go to
> > execution test with the database tpch40gnorssd.
> > You think in this case there is pollution of shared_buffers?
> > Why do you think having O.S. on SSD is bad? Do you could explain better?
> >
> > Best regards
> > []`s Neto
> >
>
> +1 information about EVO SSD Samsung:
>
> Model: 850 Evo 500 GB SATA III 6Gb/s -
> http://www.samsung.com/semiconductor/minisite/ssd/product/consumer/850evo/
>
> >> I can see a couple of things in your setup that might pessimize the SSD
> >> case:
> >> - you have OS on the SSD - if you tests make the system swap then this will
> >> wreck the SSD result
> >> - you have RAID 0 SSD...some of the cheaper ones slow down when you do this.
> >> maybe test with a single SSD
> >>
> >> regards
> >> Mark
> >>
> >> On 18/07/18 01:04, Neto pr wrote (note snippage):
> >>
> >>> (echo 3 > /proc/sys/vm/drop_caches;
> >>>
> >>> discs:
> >>> - 2 units of Samsung Evo SSD 500 GB (mounted on ZERO RAID)
> >>> - 2 SATA 7500 Krpm HDD units - 1TB (mounted on ZERO RAID)
> >>>
> >>> - The Operating System and the Postgresql DBMS are installed on the SSD
> >>> disk.
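Imre's suggestion to collect the health status and S.M.A.R.T. values with `smartctl -x` can be wrapped in a small script so the output is saved alongside each benchmark run. This is a minimal sketch: the device path is a placeholder, the function names are hypothetical, and smartctl normally needs root privileges to query a drive.

```python
import shutil
import subprocess


def smartctl_command(device):
    """Build the command line; -x asks smartctl for all device information."""
    return ["smartctl", "-x", device]


def run_smartctl(device):
    """Return smartctl's report as text, or None if the tool is not installed.

    The exit status is deliberately not treated as an error here, because
    smartctl uses it as a bitmask that can be non-zero even when a report
    was produced.
    """
    if shutil.which("smartctl") is None:
        return None
    result = subprocess.run(
        smartctl_command(device), capture_output=True, text=True
    )
    return result.stdout
```

For example, `run_smartctl("/dev/sda")` would return the report for the first SATA device on many Linux systems; the actual device names behind a RAID 0 array have to be looked up on the machine under test.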
Re: Faster str to int conversion (was Table with large number of int columns, very slow COPY FROM)
On Sat, Jul 7, 2018 at 4:01 PM, Andres Freund wrote:
> FWIW, here's a rebased version of this patch. Could probably be polished
> further. One might argue that we should do a bit more wide ranging
> changes, to convert scanint8 and pg_atoi to be also unified. But it
> might also just be worthwhile to apply without those, given the
> performance benefit.

Wouldn't hurt to do that one too, but might be OK to just do this much.

Questions:

1. Why the error message changes? If there's a good reason, it should be done as a separate commit, or at least well-documented in the commit message.

2. Does the likely/unlikely stuff make a noticeable difference?

3. If this is a drop-in replacement for pg_atoi, why not just recode pg_atoi this way -- or have it call this -- and leave the callers unchanged?

4. Are we sure this is faster on all platforms, or could it work out the other way on, say, BSD?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
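For readers who have not seen the patch, the general technique under discussion is a hand-rolled, digit-by-digit string to integer parser with explicit overflow checks, used in place of a generic library routine. The sketch below illustrates that idea only. It is not the code from the patch and not pg_atoi's actual behaviour, and the name `parse_int32` and its error messages are hypothetical. Python is used purely to show the logic, since its integers never overflow.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def parse_int32(text):
    """Parse a decimal string into a value that fits a signed 32-bit integer."""
    s = text.strip()
    if not s:
        raise ValueError("empty input")

    index = 0
    negative = False
    if s[0] in "+-":
        negative = s[0] == "-"
        index = 1
    if index == len(s):
        raise ValueError("sign without digits")

    # Accumulate as a negative number so that INT32_MIN, which has no
    # positive counterpart in 32 bits, can be parsed without a special case.
    acc = 0
    for ch in s[index:]:
        if not ("0" <= ch <= "9"):
            raise ValueError("invalid character: %r" % ch)
        acc = acc * 10 - (ord(ch) - ord("0"))
        # In C this check has to happen before the multiply and subtract;
        # Python integers are unbounded, so checking afterwards is safe.
        if acc < INT32_MIN:
            raise OverflowError("value out of range for int32")

    if not negative:
        if acc == INT32_MIN:
            raise OverflowError("value out of range for int32")
        acc = -acc
    return acc
```

Calling `parse_int32("-2147483648")` returns the minimum 32-bit value, while `parse_int32("2147483648")` raises `OverflowError`. Whether such a loop beats a library call on every platform is exactly what question 4 asks, and only measurement can answer it.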