On 05/03/12 04:06, Chris Angelico wrote:
On Mon, Mar 5, 2012 at 2:50 PM, David Johnston<pol...@yahoo.com>  wrote:
Any efficient, non-RegEx, alternative would require more context to evaluate 
than you provide.  Mainly, would it be faster to have a separate field to store 
the parsed (at input) number and then query that field directly (even if it is 
a text field as well)?  Basically cache the parse.
Caching's looking tempting, but I don't know if it'll be worth it
(these fields won't be searched-as-int very often compared to
search-as-string, and there's potentially a lot of such fields). All I
need out of it is the leading digits - I can strip them with trim(),
but I can't keep _only_ those digits.

The other possibility that may be of value is to write the function in
C instead of pl/pgsql, which will then actually call atoi() itself. Is
that going to be a better option?

Can you use to_number() here? It sounds like something along the lines of cast(to_number('0' || field::varchar, '999999999.') as int) might give the behaviour you're after, and a quick test seems to indicate that it's about 4x faster than the original function:

postgres=# explain analyze select cast(to_number('0' || generate_series::varchar, '999999999.') as int) from generate_series(1,1000000);
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Function Scan on generate_series (cost=0.00..22.50 rows=1000 width=4) (actual time=137.720..1065.752 rows=1000000 loops=1)
 Total runtime: 1144.993 ms
(2 rows)

postgres=# explain analyze select str2int(generate_series::varchar) from generate_series(1,1000000);
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Function Scan on generate_series (cost=0.00..265.00 rows=1000 width=4) (actual time=135.180..4022.408 rows=1000000 loops=1)
 Total runtime: 4121.233 ms
(2 rows)

Hopefully there's a cleaner way of writing that without a long list of 9s in the format string, and if the field is nullable I'd guess you probably need a coalesce(..., 0) around that as well.

cheers,

Tom


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Reply via email to