Given a country table and an order table:

CREATE TABLE g.country
(
-- inherited from table g.standard:  oid uuid NOT NULL,
-- ... some more columns inherited...
  lisocode integer NOT NULL, -- Numeric ISO 3166 code
  szisoalpha2 character varying(2), -- The 2 letter country code
  szisoalpha3 character varying(3), -- The 3 letter country code
  szisonum3 character varying(3), -- The NUMERIC country code with leading zeros
-- a few more columns here
CONSTRAINT country_pkey PRIMARY KEY (oid)
) INHERITS (g.standard)
WITH (
  OIDS=FALSE
);

This table is filled with ISO 3166 country codes.

The order table:

CREATE TABLE t."order"
(
-- inherited from table g.standard:  oid uuid NOT NULL,
-- ... some more columns inherited...
  szxmladdress text,
-- many more columns in this table
CONSTRAINT country_pkey PRIMARY KEY (oid)
) INHERITS (g.standard)
WITH (
  OIDS=FALSE
);


A typical entry in t."order".szxmladdress looks like

<address entityType="Person" country="DE">
  <field name="Address.oid">ae0eb84f-9b8b-4fef-b87a-d6757bdfeaf9</field>
  <field 
name="Address.uaddrtyperef">0bbdb48c-21c7-429e-944e-59a4d9ace9d5</field>
  <field name="Address.szstreet">Hauptstraße</field>
  <!-some more xml tags here -->
</address>

No XML field in the order table exceeds 2kB.

Getting the 2 letter country code from the xml address by this function:

CREATE OR REPLACE FUNCTION g.getxmladdresscountryid(xaddr text)
  RETURNS uuid AS
$BODY$BEGIN
        RETURN (SELECT oID FROM G.Country WHERE szIsoAlpha2 = 
array_to_string(xpath('/address/@country', xAddr::xml), '') ORDER BY lIsoCode 
LIMIT 1);
END$BODY$
  LANGUAGE plpgsql IMMUTABLE;

I know that this function is not really IMMUTABLE but the country table is 
changing only every several years.

Created a functional index on the order table:

CREATE INDEX order_getxmladdresscountryid_fidx
  ON t."order"
  USING btree
  (g.getxmladdresscountryid(szxmladdress));

Joining order and country table limiting to 10 rows uses the index:

explain analyse
SELECT
GCountry.szISOAlpha2,
GCountry.szISOAlpha3,
GCountry.szISONum3
from
t.order torder
left join G.Country GCountry ON G.GetXmlAddressCountryID(TOrder.szXmlAddress) = 
GCountry.oID
limit 10

Gives:

Limit  (cost=0.56..8.45 rows=10 width=11) (actual time=0.644..4.764 rows=10 
loops=1)
  ->  Merge Right Join  (cost=0.56..10670.45 rows=13517 width=11) (actual 
time=0.642..4.754 rows=10 loops=1)
        Merge Cond: (gcountry.oid = 
g.getxmladdresscountryid(torder.szxmladdress))
        ->  Index Scan using country_pkey on country gcountry  
(cost=0.27..38.05 rows=258 width=27) (actual time=0.025..0.067 rows=32 loops=1)
        ->  Index Scan using order_getxmladdresscountryid_fidx on "order" 
torder  (cost=0.29..7019.04 rows=13517 width=366) (actual time=0.020..0.058 
rows=10 loops=1)
Planning time: 0.603 ms
Execution time: 4.898 ms

But when I remove the "limit 10", the index is no longer used:

Hash Left Join  (cost=17.81..5397.46 rows=13517 width=11) (actual 
time=0.941..4721.372 rows=13510 loops=1)
  Hash Cond: (g.getxmladdresscountryid(torder.szxmladdress) = gcountry.oid)
  ->  Seq Scan on "order" torder  (cost=0.00..3504.17 rows=13517 width=366) 
(actual time=0.011..27.542 rows=13510 loops=1)
  ->  Hash  (cost=14.58..14.58 rows=258 width=27) (actual time=0.427..0.427 
rows=258 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 23kB
        ->  Seq Scan on country gcountry  (cost=0.00..14.58 rows=258 width=27) 
(actual time=0.008..0.226 rows=258 loops=1)
Planning time: 0.580 ms
Execution time: 4728.602 ms

Event a "limit 100"  does not use the index any more.

However, the result of 4728 ms is almost a linear increase from 10 rows to 
13500 rows.

I played around with COST of the function between 1 and 20000 and with several 
options on postgresql.conf without luck.

A sequential scan on the order table alone is carried out in 15 ms.

Thanks for any idea.

Klaus



Reply via email to