2013/8/19 Robert Sosinski <rsosin...@ticketevolution.com>

> Hi Pavel,
>
> What kind of example do you need? I can't give you the actual data I have
> in the table, but I can give you an example query and the schema attached
> below. From there, I would just put in 2 million rows, worth 1.2 GB of
> data. The average sizes of the extended columns (using the pg_column_size
> function), in bytes, are:
>
> guid: 33
> name: 2.41
> currency: 4
> fields: 120.32
>
> example query:
>
> -- find duplicate records using a guid
> select guid, array_agg(id) from orders group by guid;
>
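The string_agg variant that Robert reports as a working substitute (in the
original report quoted below) would look roughly like this sketch; the
id::text cast is an assumption, since string_agg aggregates text values,
not integers:

    -- sketch: the same duplicate-finding query rewritten with string_agg
    -- instead of array_agg; ids are concatenated into a comma-separated string
    select guid, string_agg(id::text, ',') from orders group by guid;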
How many distinct guids are there, and how many duplicates?

regards

Pavel

> example schema:
>
>                                                  Table "public.things"
>    Column   |            Type             |                      Modifiers                       | Storage  | Stats target | Description
> ------------+-----------------------------+------------------------------------------------------+----------+--------------+-------------
>  id         | integer                     | not null default nextval('things_id_seq'::regclass)  | plain    |              |
>  version    | integer                     | not null                                              | plain    |              |
>  created_at | timestamp without time zone | not null                                              | plain    |              |
>  updated_at | timestamp without time zone | not null                                              | plain    |              |
>  foo_id     | integer                     | not null                                              | plain    |              |
>  bar_id     | integer                     | not null                                              | plain    |              |
>  baz_id     | integer                     | not null                                              | plain    |              |
>  guid       | character varying           | not null                                              | extended |              |
>  name       | character varying           | not null                                              | extended |              |
>  price      | numeric(12,2)               | not null                                              | main     |              |
>  currency   | character varying           | not null                                              | extended |              |
>  amount     | integer                     | not null                                              | plain    |              |
>  the_date   | date                        | not null                                              | plain    |              |
>  fields     | hstore                      |                                                       | extended |              |
> Indexes:
>     "things_pkey" PRIMARY KEY, btree (id)
>     "things_foo_id_idx" btree (foo_id)
>     "things_bar_id_idx" btree (bar_id)
>     "things_baz_id_idx" btree (baz_id)
>     "things_guid_uidx" UNIQUE, btree (guid)
>     "things_lpad_lower_name_eidx" btree (lpad(lower(name::text), 10, '0'::text))
>     "things_price_idx" btree (price)
> Foreign-key constraints:
>     "things_foo_id_fkey" FOREIGN KEY (foo_id) REFERENCES foos(id)
>     "things_bar_id_fkey" FOREIGN KEY (bar_id) REFERENCES bars(id)
>     "things_baz_id_fkey" FOREIGN KEY (baz_id) REFERENCES bazs(id)
> Triggers:
>     timestamps_trig BEFORE INSERT OR UPDATE ON things FOR EACH ROW
>     EXECUTE PROCEDURE timestamps_tfun()
>
> Let me know if you need anything else.
>
> Thanks,
>
>
> On Mon, Aug 19, 2013 at 3:29 AM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
>
>> Hello
>>
>> please, can you send some example or test?
>>
>> Regards
>>
>> Pavel Stehule
>>
>>
>> 2013/8/19 Robert Sosinski <rsosin...@ticketevolution.com>
>>
>>> When using array_agg on a large table, memory usage seems to spike up
>>> until Postgres crashes with the following error:
>>>
>>> 2013-08-17 18:41:02 UTC [2716]: [2] WARNING: terminating connection
>>> because of crash of another server process
>>> 2013-08-17 18:41:02 UTC [2716]: [3] DETAIL: The postmaster has commanded
>>> this server process to roll back the current transaction and exit,
>>> because another server process exited abnormally and possibly corrupted
>>> shared memory.
>>> 2013-08-17 18:41:02 UTC [2716]: [4] HINT: In a moment you should be able
>>> to reconnect to the database and repeat your command.
>>>
>>> I've definitely isolated it down to using array_agg: when I changed the
>>> query to use string_agg, it worked fine. I also tried array_agg on a few
>>> different queries, all yielding the same issue, and swapping in
>>> string_agg fixed it each time.
>>>
>>> This particular table has over 2 million rows and is 1.2 GB, and when I
>>> ran the query while viewing htop, the virtual size of the Postgres
>>> process ballooned to 13.9 GB before crashing.
>>>
>>> The version of Postgres I am using is: PostgreSQL 9.2.4 on
>>> x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.7.2-2ubuntu1)
>>> 4.7.2, 64-bit
>>>
>>> Any help would be much appreciated, thanks!
>>>
>>> -Robert
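For completeness, two sketches of queries that could answer Pavel's question
about distinct and duplicated guids; they assume the orders table from
Robert's example query and are untested:

    -- sketch: count distinct guids, and how many surplus rows are duplicates
    select count(distinct guid) as distinct_guids,
           count(*) - count(distinct guid) as duplicate_rows
    from orders;

    -- sketch: list only the guids that repeat, with how many rows share each
    select guid, count(*) as copies
    from orders
    group by guid
    having count(*) > 1;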