Tom Lane wrote:
So I have some results. I have tested the query on both PostgreSQL 8.0.1 and MySQL 4.1.8 with LIMIT set to 30 and OFFSET set to 6000. PostgreSQL result is 11,667.916 ms, MySQL result is 448.4 ms.
That's a fairly impressive discrepancy :-(, and even the slot_getattr()
patch that Atsushi Ogawa provided isn't going to close the gap.
(I got about a 4x speedup on Miroslav's example in my testing, which
leaves us still maybe 6x slower than MySQL.)
As I wrote, the comparison is not "fair". Here are the conditions:
"Both databases are running on the same machine (my laptop) and contain the same data. However there are some differences in the data table definitions:
1) in PostgreSQL I use 'varchar(1)' for a lot of fields and in MySQL I use 'enum'
2) in PostgreSQL in some cases I use connection fields that are not of the same type (smallint <-> integer (SERIAL)), in MySQL I use the same types"
For those not used to MySQL, enum is an integer "mapped" to a text string (that's how I see it). That means that when you have field such as enum('yes','no','DK'), in the table there are stored numbers 1, 2 and 3 which are mapped to the text values 'yes', 'no' and 'DK'. The description is not accurate (I'm not MySQL programmer, I didn't check it recently and I didn't inspect the code - I wouldn't understand it either) but I think it's not that important. What is important is the fact that MySQL has to work with some dozen fields that are numbers and PostgreSQL has to work with the same fields as varchar(). Some of the other fields are also varchars. This might (or might not) cause the speed difference. However I think that if devs figure out how to speed this up, other cases will benefit from the improvement too.
As I understood from the contributions of other, the 2) shouldn't have a great impact on the speed.
Mine thought is that I don't know what you are talking about :-) Now seriously, I am far below this level of knowledge. But I can contribute a test that (maybe) can help. I have rewritten the query so it JOINs the varchar() fields (in fact all fields except the IDPK) at the last INNER JOIN. Though there is one more JOIN, the query is more than 5 times faster (1975.312 ms) :-)Looking at the post-patch profile for the test case, there is still quite a lot of cycles going into tuple assembly and disassembly:
Each sample counts as 0.01 seconds.
% cumulative self self total time seconds seconds calls ms/call ms/call name 24.47 4.49 4.49 _mcount
8.01 5.96 1.47 9143692 0.00 0.00 ExecEvalVar
6.92 7.23 1.27 6614373 0.00 0.00 slot_deformtuple
6.54 8.43 1.20 9143692 0.00 0.00 slot_getattr
6.21 9.57 1.14 103737 0.01 0.03 ExecTargetList
5.56 10.59 1.02 103775 0.01 0.01 DataFill
3.22 11.18 0.59 103775 0.01 0.01 ComputeDataSize
2.83 11.70 0.52 ExecEvalVar
2.72 12.20 0.50 9094122 0.00 0.00 memcpy
2.51 12.66 0.46 encore
2.40 13.10 0.44 427448 0.00 0.00 nocachegetattr
2.13 13.49 0.39 103775 0.00 0.02 heap_formtuple
2.07 13.87 0.38 noshlibs
1.20 14.09 0.22 225329 0.00 0.00 _doprnt
1.20 14.31 0.22 msquadloop
1.14 14.52 0.21 chunks
0.98 14.70 0.18 871885 0.00 0.00 AllocSetAlloc
0.98 14.88 0.18 $$dyncall
0.76 15.02 0.14 594242 0.00 0.00 FunctionCall3
0.71 15.15 0.13 213312 0.00 0.00 comparetup_heap
0.65 15.27 0.12 6364 0.02 0.13 printtup
0.60 15.38 0.11 790702 0.00 0.00 pfree
(_mcount is profiling overhead, ignore it.) It looks to me like just about everything in the top dozen functions is there as a result of the fact that join steps form new tuples that are the merge of their input tuples. Even our favorite villains, palloc and pfree, are down in the sub-percent range.
I am guessing that the reason MySQL wins on this is that they avoid doing any data copying during a join step. I wonder whether we could accomplish the same by taking Ogawa's patch to the next level: allow a TupleTableSlot to contain either a "materialized" tuple as now, or a "virtual" tuple that is simply an array of Datums and null flags. (It's virtual in the sense that any pass-by-reference Datums would have to be pointing to data at the next level down.) This would essentially turn the formtuple and deformtuple operations into no-ops, and get rid of a lot of the associated overhead such as ComputeDataSize and DataFill. The only operations that would have to forcibly materialize a tuple would be ones that need to keep the tuple till after they fetch their next input tuple --- hashing and sorting are examples, but very many plan node types don't ever need to do that.
I haven't worked out the details, but it seems likely that this could be a relatively nonintrusive patch. The main thing that would be an issue would be that direct reference to slot->val would become verboten (since you could no longer be sure there was a materialized tuple there). I think this would possibly affect some contrib stuff, which is a strong hint that it'd break some existing user-written code out there.
Thoughts?
So my silly opinion is that if the planner could decide that there are too much time expensive fields that are not needed during performing JOINs and these could be JOINed at the last step, it would do it this way :-)
Below is the adjusted query and the EXPLAIN ANALYZE output. (Tom, you can run it on the data I have sent you and it should run without changes.)
SELECT AdDevicesSites.IDPK, AdDevicesSites2.AdDevicesSiteSizeIDFK, AdDevicesSites2.AdDevicesSiteRegionIDFK, AdDevicesSites2.AdDevicesSiteCountyIDFK, AdDevicesSites2.AdDevicesSiteCityIDFK, AdDevicesSites2.AdDevicesSiteDistrictIDFK, AdDevicesSites2.AdDevicesSiteStreetIDFK, AdDevicesSites2.AdDevicesSiteStreetDescriptionIDFK, AdDevicesSites2.AdDevicesSitePositionIDFK, AdDevicesSites2.AdDevicesSiteVisibilityIDFK, AdDevicesSites2.AdDevicesSiteStatusTypeIDFK, AdDevicesSites2.AdDevicesSitePartnerIdentificationOperatorIDFK, AdDevicesSites2.AdDevicesSitePartnerElectricitySupplierIDFK, AdDevicesSites2.AdDevicesSitePartnerMaintainerIDFK, AdDevicesSites2.AdDevicesSitePartnerStickerIDFK, AdDevicesSites2.CadastralUnitIDFK, AdDevicesSites2.MediaType, AdDevicesSites2.Mark, AdDevicesSites2.Amount, AdDevicesSites2.Distance, AdDevicesSites2.OwnLightening, AdDevicesSites2.LocationDownTown, AdDevicesSites2.LocationSuburb, AdDevicesSites2.LocationBusinessDistrict, AdDevicesSites2.LocationResidentialDistrict, AdDevicesSites2.LocationIndustrialDistrict, AdDevicesSites2.LocationNoBuildings, AdDevicesSites2.ParkWayHighWay, AdDevicesSites2.ParkWayFirstClassRoad, AdDevicesSites2.ParkWayOtherRoad, AdDevicesSites2.ParkWayStreet, AdDevicesSites2.ParkWayAccess, AdDevicesSites2.ParkWayExit, AdDevicesSites2.ParkWayParkingPlace, AdDevicesSites2.ParkWayPassangersOnly, AdDevicesSites2.ParkWayCrossRoad, AdDevicesSites2.PositionStandAlone, AdDevicesSites2.NeighbourhoodPublicTransportation, AdDevicesSites2.NeighbourhoodInterCityTransportation, AdDevicesSites2.NeighbourhoodPostOffice, AdDevicesSites2.NeighbourhoodNewsStand, AdDevicesSites2.NeighbourhoodAmenities, AdDevicesSites2.NeighbourhoodSportsSpot, AdDevicesSites2.NeighbourhoodHealthServiceSpot, AdDevicesSites2.NeighbourhoodShops, AdDevicesSites2.NeighbourhoodShoppingCenter, AdDevicesSites2.NeighbourhoodSuperMarket, AdDevicesSites2.NeighbourhoodPetrolStation, AdDevicesSites2.NeighbourhoodSchool, AdDevicesSites2.NeighbourhoodBank, AdDevicesSites2.NeighbourhoodRestaurant, AdDevicesSites2.NeighbourhoodHotel, AdDevicesSites2.RestrictionCigarettes, AdDevicesSites2.RestrictionPolitics, AdDevicesSites2.RestrictionSpirits, AdDevicesSites2.RestrictionSex, AdDevicesSites2.RestrictionOther, AdDevicesSites2.RestrictionNote, AdDevicesSites2.SpotMapFile, AdDevicesSites2.SpotPhotoFile, AdDevicesSites2.SourcePhotoTimeStamp, AdDevicesSites2.SourceMapTimeStamp, AdDevicesSites2.Price, AdDevicesSites2.WebPrice, AdDevicesSites2.CadastralUnitCode, AdDevicesSites2.BuildingNumber, AdDevicesSites2.ParcelNumber, AdDevicesSites2.GPSLatitude, AdDevicesSites2.GPSLongitude, AdDevicesSites2.GPSHeight, AdDevicesSites2.MechanicalOpticalCoordinates, AdDevicesSites2.Deleted, AdDevicesSites2.Protected, AdDevicesSites2.DateCreated, AdDevicesSites2.DateLastModified, AdDevicesSites2.DateDeleted, AdDevicesSites2.CreatedByUserIDFK, AdDevicesSites2.LastModifiedByUserIDFK, AdDevicesSites2.DeletedByUserIDFK, AdDevicesSites2.PhotoLastModificationDate, AdDevicesSites2.MapLastModificationDate, AdDevicesSites2.DateLastImported, AdDevicesSiteRegions.Name AS AdDevicesSiteRegionName, AdDevicesSiteCounties.Name AS AdDevicesSiteCountyName, AdDevicesSiteCities.Name AS AdDevicesSiteCityName, AdDevicesSiteStreets.Name AS AdDevicesSiteStreetName, AdDevicesSiteDistricts.Name AS AdDevicesSiteDistrictName, AdDevicesSiteStreetDescriptions.Name_cs AS AdDevicesSiteStreetDescriptionName_cs, AdDevicesSiteStreetDescriptions.Name_en AS AdDevicesSiteStreetDescriptionName_en, AdDevicesSiteSizes.Name AS AdDevicesSiteSizeName, SUBSTRING(AdDevicesSiteVisibilities.Name_cs, 3) AS AdDevicesSiteVisibilityName_cs, SUBSTRING(AdDevicesSiteVisibilities.Name_en, 3) AS AdDevicesSiteVisibilityName_en, AdDevicesSitePositions.Name_cs AS AdDevicesSitePositionName_cs, AdDevicesSitePositions.Name_en AS AdDevicesSitePositionName_en, AdDevicesSiteStatusTypes.Name_cs AS AdDevicesSiteStatusTypeName_cs, AdDevicesSiteStatusTypes.Name_en AS AdDevicesSiteStatusTypeName_en, PartnerIdentificationsOperator.Name AS PartnerIdentificationOperatorName, PartnersElectricitySupplier.Name AS PartnerElectricitySupplierName, PartnersMaintainer.Name AS PartnerMaintainerName, PartnersSticker.Name AS PartnerStickerName, CadastralUnits.Code AS CadastralUnitCodeNative, CadastralUnits.Name AS CadastralUnitName
FROM AdDevicesSites
LEFT JOIN AdDevicesSiteRegions ON AdDevicesSites.AdDevicesSiteRegionIDFK = AdDevicesSiteRegions.IDPK
LEFT JOIN AdDevicesSiteCounties ON AdDevicesSites.AdDevicesSiteCountyIDFK = AdDevicesSiteCounties.IDPK
LEFT JOIN AdDevicesSiteCities ON AdDevicesSites.AdDevicesSiteCityIDFK = AdDevicesSiteCities.IDPK
LEFT JOIN AdDevicesSiteStreets ON AdDevicesSites.AdDevicesSiteStreetIDFK = AdDevicesSiteStreets.IDPK
LEFT JOIN AdDevicesSiteStreetDescriptions ON AdDevicesSites.AdDevicesSiteStreetDescriptionIDFK = AdDevicesSiteStreetDescriptions.IDPK
LEFT JOIN AdDevicesSiteDistricts ON AdDevicesSites.AdDevicesSiteDistrictIDFK = AdDevicesSiteDistricts.IDPK
LEFT JOIN AdDevicesSiteSizes ON AdDevicesSites.AdDevicesSiteSizeIDFK = AdDevicesSiteSizes.IDPK
LEFT JOIN AdDevicesSiteVisibilities ON AdDevicesSites.AdDevicesSiteVisibilityIDFK = AdDevicesSiteVisibilities.IDPK
LEFT JOIN AdDevicesSitePositions ON AdDevicesSites.AdDevicesSitePositionIDFK = AdDevicesSitePositions.IDPK
LEFT JOIN AdDevicesSiteStatusTypes ON AdDevicesSites.AdDevicesSiteStatusTypeIDFK = AdDevicesSiteStatusTypes.IDPK
LEFT JOIN PartnerIdentifications AS PartnerIdentificationsOperator ON AdDevicesSites.AdDevicesSitePartnerIdentificationOperatorIDFK = PartnerIdentificationsOperator.IDPK
LEFT JOIN Partners AS PartnersElectricitySupplier ON AdDevicesSites.AdDevicesSitePartnerElectricitySupplierIDFK = PartnersElectricitySupplier.IDPK
LEFT JOIN Partners AS PartnersMaintainer ON AdDevicesSites.AdDevicesSitePartnerMaintainerIDFK = PartnersMaintainer.IDPK
LEFT JOIN Partners AS PartnersSticker ON AdDevicesSites.AdDevicesSitePartnerStickerIDFK = PartnersSticker.IDPK
LEFT JOIN CadastralUnits ON AdDevicesSites.CadastralUnitIDFK = CadastralUnits.IDPK
INNER JOIN AdDevicesSites AS AdDevicesSites2 ON AdDevicesSites.IDPK = AdDevicesSites2.IDPK
QUERY PLAN
Hash Join (cost=193867.68..235417.92 rows=142556 width=815) (actual time=1577.080..2200.677 rows=6364 loops=1)
Hash Cond: ("outer".idpk = "inner".idpk)
-> Seq Scan on addevicessites addevicessites2 (cost=0.00..13118.56 rows=142556 width=473) (actual time=40.650..49.195 rows=6364 loops=1)
-> Hash (cost=186898.29..186898.29 rows=142556 width=350) (actual time=1534.080..1534.080 rows=0 loops=1)
-> Merge Right Join (cost=184758.38..186898.29 rows=142556 width=350) (actual time=1187.653..1244.955 rows=6364 loops=1)
Merge Cond: ("outer".idpk = "inner".cadastralunitidfk)
-> Index Scan using cadastralunits_pkey on cadastralunits (cost=0.00..314.49 rows=13027 width=31) (actual time=0.034..0.128 rows=63 loops=1)
-> Sort (cost=184758.38..185114.77 rows=142556 width=325) (actual time=1187.582..1190.111 rows=6364 loops=1)
Sort Key: addevicessites.cadastralunitidfk
-> Hash Left Join (cost=93887.04..143074.76 rows=142556 width=325) (actual time=502.584..1129.145 rows=6364 loops=1)
Hash Cond: ("outer".addevicessitepartnerstickeridfk = "inner".idpk)
-> Hash Left Join (cost=93884.28..140933.65 rows=142556 width=307) (actual time=502.388..1075.572 rows=6364 loops=1)
Hash Cond: ("outer".addevicessitepartnermaintaineridfk = "inner".idpk)
-> Hash Left Join (cost=93881.51..138792.55 rows=142556 width=289) (actual time=502.208..1023.802 rows=6364 loops=1)
Hash Cond: ("outer".addevicessitepartnerelectricitysupplieridfk = "inner".idpk)
-> Hash Left Join (cost=93878.75..136651.45 rows=142556 width=271) (actual time=502.041..969.959 rows=6364 loops=1)
Hash Cond: ("outer".addevicessitepartneridentificationoperatoridfk = "inner".idpk)
-> Nested Loop Left Join (cost=93875.99..134510.35 rows=142556 width=253) (actual time=501.849..915.256 rows=6364 loops=1)
Join Filter: ("outer".addevicessitestatustypeidfk = "inner".idpk)
-> Nested Loop Left Join (cost=93874.93..118471.74 rows=142556 width=228) (actual time=501.826..818.436 rows=6364 loops=1)
Join Filter: ("outer".addevicessitepositionidfk = "inner".idpk)
-> Nested Loop Left Join (cost=93873.90..108848.18 rows=142556 width=207) (actual time=501.802..737.137 rows=6364 loops=1)
Join Filter: ("outer".addevicessitevisibilityidfk = "inner".idpk)
-> Merge Right Join (cost=93872.86..96017.09 rows=142556 width=177) (actual time=501.741..554.834 rows=6364 loops=1)
Merge Cond: ("outer".idpk = "inner".addevicessitesizeidfk)
-> Index Scan using addevicessitesizes_pkey on addevicessitesizes (cost=0.00..5.62 rows=110 width=14) (actual time=0.009..0.264 rows=110 loops=1)
-> Sort (cost=93872.86..94229.25 rows=142556 width=169) (actual time=501.697..504.267 rows=6364 loops=1)
Sort Key: addevicessites.addevicessitesizeidfk
-> Hash Left Join (cost=57752.91..65764.23 rows=142556 width=169) (actual time=234.321..466.130 rows=6364 loops=1)
Hash Cond: ("outer".addevicessitedistrictidfk = "inner".idpk)
-> Hash Left Join (cost=57743.17..63616.15 rows=142556 width=164) (actual time=233.456..421.267 rows=6364 loops=1)
Hash Cond: ("outer".addevicessitestreetdescriptionidfk = "inner".idpk)
-> Hash Left Join (cost=57634.86..62707.43 rows=142556 width=137) (actual time=223.396..362.945 rows=6364 loops=1)
Hash Cond: ("outer".addevicessitestreetidfk = "inner".idpk)
-> Hash Left Join (cost=57566.95..61844.20 rows=142556 width=119) (actual time=217.101..312.605 rows=6364 loops=1)
Hash Cond: ("outer".addevicessitecityidfk = "inner".idpk)
-> Merge Right Join (cost=57561.85..59700.76 rows=142556 width=110) (actual time=216.635..266.672 rows=6364 loops=1)
Merge Cond: ("outer".idpk = "inner".addevicessitecountyidfk)
-> Sort (cost=6.19..6.48 rows=117 width=17) (actual time=0.350..0.389 rows=116 loops=1)
Sort Key: addevicessitecounties.idpk
-> Seq Scan on addevicessitecounties (cost=0.00..2.17 rows=117 width=17) (actual time=0.008..0.146 rows=117 loops=1)
-> Sort (cost=57555.66..57912.05 rows=142556 width=99) (actual time=216.250..218.611 rows=6364 loops=1)
Sort Key: addevicessites.addevicessitecountyidfk
-> Merge Right Join (cost=33573.63..35712.03 rows=142556 width=99) (actual time=173.646..199.036 rows=6364 loops=1)
Merge Cond: ("outer".idpk = "inner".addevicessiteregionidfk)
-> Sort (cost=1.44..1.48 rows=15 width=23) (actual time=0.055..0.059 rows=13 loops=1)
Sort Key: addevicessiteregions.idpk
-> Seq Scan on addevicessiteregions (cost=0.00..1.15 rows=15 width=23) (actual time=0.016..0.032 rows=15 loops=1)
-> Sort (cost=33572.19..33928.58 rows=142556 width=82) (actual time=173.559..176.398 rows=6364 loops=1)
Sort Key: addevicessites.addevicessiteregionidfk
-> Seq Scan on addevicessites (cost=0.00..13118.56 rows=142556 width=82) (actual time=62.345..164.783 rows=6364 loops=1)
-> Hash (cost=4.48..4.48 rows=248 width=17) (actual time=0.283..0.283 rows=0 loops=1)
-> Seq Scan on addevicessitecities (cost=0.00..4.48 rows=248 width=17) (actual time=0.011..0.162 rows=138 loops=1)
-> Hash (cost=60.53..60.53 rows=2953 width=34) (actual time=6.229..6.229 rows=0 loops=1)
-> Seq Scan on addevicessitestreets (cost=0.00..60.53 rows=2953 width=34) (actual time=0.010..3.816 rows=2984 loops=1)
-> Hash (cost=96.85..96.85 rows=4585 width=43) (actual time=10.017..10.017 rows=0 loops=1)
-> Seq Scan on addevicessitestreetdescriptions (cost=0.00..96.85 rows=4585 width=43) (actual time=0.008..6.371 rows=4585 loops=1)
-> Hash (cost=8.59..8.59 rows=459 width=21) (actual time=0.815..0.815 rows=0 loops=1)
-> Seq Scan on addevicessitedistricts (cost=0.00..8.59 rows=459 width=21) (actual time=0.007..0.541 rows=382 loops=1)
-> Materialize (cost=1.04..1.08 rows=4 width=36) (actual time=0.000..0.002 rows=4 loops=6364)
-> Seq Scan on addevicessitevisibilities (cost=0.00..1.04 rows=4 width=36) (actual time=0.026..0.035 rows=4 loops=1)
-> Materialize (cost=1.03..1.06 rows=3 width=27) (actual time=0.000..0.001 rows=3 loops=6364)
-> Seq Scan on addevicessitepositions (cost=0.00..1.03 rows=3 width=27) (actual time=0.007..0.010 rows=3 loops=1)
-> Materialize (cost=1.05..1.10 rows=5 width=31) (actual time=0.000..0.002 rows=5 loops=6364)
-> Seq Scan on addevicessitestatustypes (cost=0.00..1.05 rows=5 width=31) (actual time=0.006..0.016 rows=5 loops=1)
-> Hash (cost=2.61..2.61 rows=61 width=34) (actual time=0.144..0.144 rows=0 loops=1)
-> Seq Scan on partneridentifications partneridentificationsoperator (cost=0.00..2.61 rows=61 width=34) (actual time=0.007..0.103 rows=61 loops=1)
-> Hash (cost=2.61..2.61 rows=61 width=34) (actual time=0.122..0.122 rows=0 loops=1)
-> Seq Scan on partners partnerselectricitysupplier (cost=0.00..2.61 rows=61 width=34) (actual time=0.004..0.072 rows=61 loops=1)
-> Hash (cost=2.61..2.61 rows=61 width=34) (actual time=0.136..0.136 rows=0 loops=1)
-> Seq Scan on partners partnersmaintainer (cost=0.00..2.61 rows=61 width=34) (actual time=0.005..0.086 rows=61 loops=1)
-> Hash (cost=2.61..2.61 rows=61 width=34) (actual time=0.143..0.143 rows=0 loops=1)
-> Seq Scan on partners partnerssticker (cost=0.00..2.61 rows=61 width=34) (actual time=0.009..0.098 rows=61 loops=1)
Total runtime: 2210.937 ms
regards, tom laneMiroslav
begin:vcard fn;quoted-printable:Miroslav =C5=A0ulc n;quoted-printable:=C5=A0ulc;Miroslav org:StartNet s.r.o. adr;quoted-printable;quoted-printable:;;Vrchlick=C3=A9ho 161/5;Praha 5;;150 00;=C4=8Cesk=C3=A1 republika email;internet:[EMAIL PROTECTED] title:CEO tel;work:+420 257 225 602 tel;cell:+420 603 711 413 x-mozilla-html:TRUE url:http://www.startnet.cz version:2.1 end:vcard
---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend