2010/11/18 Pavel Stehule <pavel.steh...@gmail.com>: > 2010/11/18 Cédric Villemain <cedric.villemain.deb...@gmail.com>: >> 2010/11/18 Pavel Stehule <pavel.steh...@gmail.com>: >>> 2010/11/18 Tom Lane <t...@sss.pgh.pa.us>: >>>> Merlin Moncure <mmonc...@gmail.com> writes: >>>>> On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <ja...@2ndquadrant.com> >>>>> wrote: >>>>>> i will start the review of this one... but before that sorry for >>>>>> suggesting this a bit later but about using UNNEST as part of the >>>>>> sintax? >>>> >>>>> Does for-in-array do what unnset does? >>>> >>>> Yes, which begs the question of why bother at all. AFAICS this patch >>>> simply allows you to replace >>>> >>>> for x in select unnest(array_value) loop >>>> >>>> with >>>> >>>> for x in unnest array_value loop >>>> >>>> (plus or minus a parenthesis or so). I do not think we need to add a >>>> bunch of code and create even more syntactic ambiguity (FOR loops are >>>> already on the hairy edge of unparsability) to save people from writing >>>> "select". >>> >>> this patch is semantically equal to SELECT unnest(..), but it is >>> evaluated as simple expression and does directly array unpacking and >>> iteration, - so it means this fragment is significantly >>faster<<. >> >> Did you implement a method to be able to walk the array and detoast >> only the current needed data ? > > not only - iteration over array can help with readability but a > general work with SRF (set returning functions is more harder and > slower) - so special loop statement can to safe a some toast op / when > you use a large array and access via index, or can to safe a some work > with memory, because there isn't necessary convert array to set of > tuples. Please, recheck these tests. > > test: > > CREATE OR REPLACE FUNCTION rndstr() RETURNS text AS $$select > array_to_string(array(select substring('ABCDEFGHIJKLMNOPQ' FROM > (random()*16)::int FOR 1) from generate_series(1,10)),'')$$ LANGUAGE > sql; > > create or replace function rndarray(int) returns text[] as $$select > array(select rndstr() from generate_series(1,$1)) $$ language sql; > > create table t10(x text[]); > insert into t10 select rndarray(10) from generate_series(1,10000); > create table t100(x text[]); > insert into t100 select rndarray(100) from generate_series(1,10000); > create table t1000(x text[]); > insert into t1000 select rndarray(1000) from generate_series(1,10000); > > CREATE OR REPLACE FUNCTION public.filter(text[], text, integer) > RETURNS text[] > LANGUAGE plpgsql > AS $function$ > DECLARE > s text[] := '{}'; > l int := 0; > v text; > BEGIN > FOR v IN ARRAY $1 > LOOP > EXIT WHEN l = $3; > IF v LIKE $2 THEN > s := s || v; > l := l + 1; > END IF; > END LOOP; > RETURN s; > END;$function$; > > postgres=# select avg(array_upper(filter(x,'%AA%', 10),1)) from t10; > avg > -------------------- > 1.1596079803990200 > (1 row) > > Time: 393.649 ms > > postgres=# select avg(array_upper(filter(x,'%AA%', 10),1)) from t100; > avg > -------------------- > 3.4976777789245536 > (1 row) > > Time: 2804.502 ms > > postgres=# select avg(array_upper(filter(x,'%AA%', 10),1)) from t1000; > avg > --------------------- > 10.0000000000000000 > (1 row) > > Time: 9729.994 ms > > CREATE OR REPLACE FUNCTION public.filter01(text[], text, integer) > RETURNS text[] > LANGUAGE plpgsql > AS $function$ > DECLARE > s text[] := '{}'; > l int := 0; > v text; > BEGIN > FOR v IN SELECT UNNEST($1) > LOOP > EXIT WHEN l = $3; > IF v LIKE $2 THEN > s := s || v; > l := l + 1; > END IF; > END LOOP; > RETURN s; > END;$function$; > > postgres=# select avg(array_upper(filter01(x,'%AA%', 10),1)) from t10; > avg > -------------------- > 1.1596079803990200 > (1 row) > > Time: 795.383 ms > > postgres=# select avg(array_upper(filter01(x,'%AA%', 10),1)) from t100; > avg > -------------------- > 3.4976777789245536 > (1 row) > > Time: 3848.258 ms > > postgres=# select avg(array_upper(filter01(x,'%AA%', 10),1)) from t1000; > avg > --------------------- > 10.0000000000000000 > (1 row) > > Time: 12366.093 ms > > The iteration via specialized FOR IN ARRAY is about 25-30% faster than > FOR IN SELECT UNNEST > > postgres=# CREATE OR REPLACE FUNCTION public.filter02(text[], text, integer) > RETURNS text[] > LANGUAGE plpgsql > AS $function$ > DECLARE > s text[] := '{}'; > l int := 0; i int; > v text; > BEGIN > FOR i IN array_lower($1,1)..array_upper($1,1) > LOOP > EXIT WHEN l = $3; > IF $1[i] LIKE $2 THEN > s := s || $1[i]; > l := l + 1; > END IF; > END LOOP; > RETURN s; > END;$function$ > ; > > postgres=# select avg(array_upper(filter02(x,'%AA%', 10),1)) from t10; > avg > -------------------- > 1.1596079803990200 > (1 row) > > Time: 414.960 ms > > postgres=# select avg(array_upper(filter02(x,'%AA%', 10),1)) from t100; > avg > -------------------- > 3.4976777789245536 > (1 row) > > Time: 3460.970 ms > > there FOR IN ARRAY is faster about 30% then access per index > > for T1000 I had to cancel over 1 minute!!!!
I can't test until this week-end. But I will. > > >> >> (I wonder because I have something like that in that garage : select >> array_filter(foo,'like','%bar%',10); where 10 is the limit and can be >> avoided, foo is the array, like is callback function, '%bar%' the >> parameter for the callback function for filtering results.) >> >> It will make my toy in the garage a fast race car (and probably doable >> in (plpg)SQL instead of C) ... > > it can help with reading of array. But it doesn't help with array > updating :(. For large arrays it can be slow too. select fast is already a good job, thank you. -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers