Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-24 Thread Amador Alvarez
I would also give it a try on turning on statistics on service_id and follower_id fields and tune collecting of distinct values for the optimizer. Cheers, Amador A. On Wed, Apr 24, 2013 at 9:04 AM, François Beausoleil wrote: > Hi! > > Le 2013-04-17 à 14:15, Jeff Janes a écrit : > > On Wed, Apr

Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-24 Thread François Beausoleil
Hi! Le 2013-04-17 à 14:15, Jeff Janes a écrit : > On Wed, Apr 17, 2013 at 4:26 AM, François Beausoleil > wrote: > > > Insert on public.persona_followers (cost=139261.12..20483497.65 > rows=6256498 width=16) (actual time=4729255.535..4729255.535 rows=0 loops=1) >Buffers: shared hit=3313

Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread Jeff Janes
On Wed, Apr 17, 2013 at 1:19 PM, François Beausoleil wrote: > > Le 2013-04-17 à 14:15, Jeff Janes a écrit : > > > It looks like 12% of the time is being spent figuring out what rows to > insert, and 88% actually doing the insertions. > > So I think that index maintenance is killing you. You could

Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread François Beausoleil
Le 2013-04-17 à 14:15, Jeff Janes a écrit : > On Wed, Apr 17, 2013 at 4:26 AM, François Beausoleil > wrote: > > > Insert on public.persona_followers (cost=139261.12..20483497.65 > rows=6256498 width=16) (actual time=4729255.535..4729255.535 rows=0 loops=1) >Buffers: shared hit=33135295

Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread Moshe Jacobson
On Tue, Apr 16, 2013 at 10:51 PM, François Beausoleil wrote: > INSERT INTO persona_followers(service_id, follower_id, valid_at) > SELECT service_id, follower_id, NOW() > FROM (SELECT DISTINCT service_id, follower_id FROM import) AS import > WHERE NOT EXISTS(SELECT * FROM persona_followers WH

Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread Jeff Janes
On Wed, Apr 17, 2013 at 4:26 AM, François Beausoleil wrote: > > > Insert on public.persona_followers (cost=139261.12..20483497.65 > rows=6256498 width=16) (actual time=4729255.535..4729255.535 rows=0 loops=1) >Buffers: shared hit=33135295 read=4776921 >-> Subquery Scan on t1 (cost=1392

Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread Chris Curvey
> INSERT INTO persona_followers(service_id, follower_id, valid_at) > SELECT service_id, follower_id, NOW() > FROM (SELECT DISTINCT service_id, follower_id FROM import) AS import > WHERE NOT EXISTS(SELECT * FROM persona_followers WHERE import.service_id > = persona_followers.service_id AND imp

Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread François Beausoleil
Le 2013-04-16 à 22:51, François Beausoleil a écrit : > Hi all! > > I track Twitter followers in my database. I have the following table: > > # \d persona_followers > Table "public.persona_followers" > Column|Type | Modifiers > -+--

[GENERAL] Most efficient way to insert without duplicates

2013-04-16 Thread François Beausoleil
Hi all! I track Twitter followers in my database. I have the following table: # \d persona_followers Table "public.persona_followers" Column|Type | Modifiers -+-+--- service_id | bigint