Summary: special inet operators ( << >> <<= =>> ) are up to 1000000X off in estimating rowcounts Type: performance Severity: normal Tested on: 9.1.0 Description:
We've been noticing that row estimates for queries which use the =>> and <<= operators for inet data were way, way off. We finally narrowed the problem down to a simple test: =========== USING <<= : =========== explain analyze SELECT count(*) FROM partition1 lh WHERE lh.ip <<= '1.2.3'::cidr; QUERY PLAN ..... -> Index Scan using partition1_ip on partition1 lh (cost=0.00..10.21 rows=6956732 width=0) (actual time=0.016..0.016 rows=0 loops=1) Index Cond: ((ip >= '1.2.3.0/24'::inet) AND (ip <= '1.2.3.255'::inet)) Filter: (ip <<= '1.2.3.0/24'::inet) ..... explain analyze SELECT count(*) FROM partition2 WHERE 1=1 AND ip <<= '87.178.193.0/24'::inet; QUERY PLAN Aggregate (cost=18296.78..18296.79 rows=1 width=0) (actual time=0.037..0.038 rows=1 loops=1) -> Index Scan using partition2_ip on partition2 (cost=0.00..38.36 rows=7303365 width=0) (actual ti me=0.022..0.031 rows=5 loops=1) Index Cond: ((ip >= '87.178.193.0/24'::inet) AND (ip <= '87.178.193.255'::inet)) Filter: (ip <<= '87.178.193.0/24'::inet) Total runtime: 0.107 ms ============ USING < > : ============ explain analyze SELECT count(*) FROM partition1 lh WHERE lh.ip >= '1.2.3.0/24'::inet and lh.ip <= '1.2.3.255'::inet; QUERY PLAN .... -> Index Scan using partition1_ip on partition1 lh (cost=0.00..10.22 rows=1 width=0) (actual time=0.016..0.016 rows=0 loops=1) Index Cond: ((ip >= '1.2.3.0/24'::inet) AND (ip <= '1.2.3.255'::inet)) .... explain analyze SELECT count(*) FROM partition2 WHERE 1=1 AND ip > '87.178.193.0'::inet and ip <= '87.178.193.255'::inet; QUERY PLAN Aggregate (cost=26.34..26.35 rows=1 width=0) (actual time=0.033..0.033 rows=1 loops=1) -> Index Scan using partition2_ip on partition2 (cost=0.00..26.33 rows=5 width=0) (actual time=0.0 19..0.029 rows=5 loops=1) Index Cond: ((ip > '87.178.193.0'::inet) AND (ip <= '87.178.193.255'::inet)) Total runtime: 0.097 ms ==== Note that the mis-estimate of rows returned in each case is almost exactly 50% of the total rows in the table. That would suggest that match_special_index_operator is failing, and not recognizing the <<= operator for estimation purposes and just going with a default estimate of 0.5. I've tried to locate the cause of this problem, but the code involved is rather convoluted and crusty, and I can't follow the logic. Help? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs