Re: [PERFORM] Shared memory for large PostGIS operations
On 03/16/2012 05:30 PM, Kevin Grittner wrote: Brian Hamlin wrote: On Mar 16, 2012, at 7:17 AM, Kevin Grittner wrote: Andy Colson wrote: I tried shared_buffers at both 2400M and 18000M, and it took 4.5 hours both times. ... (weak attempts at humor omitted) Ah, I didn't pick up on the attempts at humor; perhaps that's why you mistook something I said as an attempt at an insult. It wasn't you Kevin, it was me that insulted him. (Although I was trying to be funny, and not mean). Sorry again Brian. -Andy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Shared memory for large PostGIS operations
Andy Colson wrote: On 03/16/2012 05:30 PM, Kevin Grittner wrote: >> Ah, I didn't pick up on the attempts at humor; perhaps that's why >> you mistook something I said as an attempt at an insult. > > It wasn't you Kevin, it was me that insulted him. (Although I was > trying to be funny, and not mean). Adding to the confusion, I think I missed one of the emails/posts. Oh, well, it sounds like mostly people need to use more smiley-faces, since humor can be so easy to miss in this medium. Brian, I hope this doesn't put you off from posting -- we try to be helpful here. -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] slow self-join query
Disclaimer: this is a re-post, since I wasn't subscribed the first time I posted. Pardon if this is a duplicate.] The following query is abysmally slow (e.g. 7 hours+). The goal is to find "among the users that follow user #1, who do they also follow?" and to count the latter. SELECT L2.followed_id as followed_id, COUNT(U1.id) AS count FROM users AS U1 INNER JOIN links AS L1 ON L1.follower_id = U1.id INNER JOIN links AS L2 ON L2.follower_id = U1.id WHERE U1.type = 'User::Twitter' AND L1.followed_id = 1 GROUP BY L2.followed_id Here's the rest of the info. === versions psql (9.1.2, server 8.3.14) === schema create_table "users", :force => true do |t| t.string "type" end create_table "links", :force => true do |t| t.integer "followed_id" t.integer "follower_id" end add_index "links", ["follower_id"], :name => "index_links_on_follower_id" add_index "links", ["followed_id", "follower_id"], :name => "index_links_on_followed_id_and_follower_id", :unique => true add_index "links", ["followed_id"], :name => "index_links_on_followed_id" === # of rows users: 2,525,539 links: 4,559,390 === explain "HashAggregate (cost=490089.52..490238.78 rows=11941 width=8)" " -> Hash Join (cost=392604.44..483190.22 rows=1379860 width=8)" "Hash Cond: (f1.follower_id = u1.id)" "-> Bitmap Heap Scan on links f1 (cost=14589.95..55597.70 rows=764540 width=4)" " Recheck Cond: (followed_id = 1)" " -> Bitmap Index Scan on index_links_on_followed_id (cost=0.00..14398.82 rows=764540 width=0)" "Index Cond: (followed_id = 1)" "-> Hash (cost=300976.98..300976.98 rows=4559881 width=12)" " -> Hash Join (cost=94167.40..300976.98 rows=4559881 width=12)" "Hash Cond: (f2.follower_id = u1.id)" "-> Seq Scan on links f2 (cost=0.00..77049.81 rows=4559881 width=8)" "-> Hash (cost=53950.20..53950.20 rows=2526496 width=4)" " -> Seq Scan on users u1 (cost=0.00..53950.20 rows=2526496 width=4)" "Filter: ((type)::text = 'User::Twitter'::text)" === other comments I'm assuming I'm doing something obviously stupid and that the above info will be sufficient for anyone skilled in the art to detect the problem. However, if needed I will gladly invest the time to create a subset of the data in order to run EXPLAIN ANALYZE. (With the whole dataset, it requires > 7 hours to complete the query. I don't want to go down that path again!) - rdp -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] slow self-join query
On Sat, Mar 17, 2012 at 2:56 PM, Robert Poor wrote: > Disclaimer: this is a re-post, since I wasn't subscribed the first > time I posted. Pardon if this is a duplicate.] > > The following query is abysmally slow (e.g. 7 hours+). The goal is to > find "among the users that follow user #1, who do they also follow?" > and to count the latter. > > SELECT L2.followed_id as followed_id, COUNT(U1.id) AS count > FROM users AS U1 > INNER JOIN links AS L1 ON L1.follower_id = U1.id > INNER JOIN links AS L2 ON L2.follower_id = U1.id > WHERE U1.type = 'User::Twitter' > AND L1.followed_id = 1 > GROUP BY L2.followed_id > > Here's the rest of the info. > > === versions > > psql (9.1.2, server 8.3.14) > > === schema > > create_table "users", :force => true do |t| > t.string "type" > end > > create_table "links", :force => true do |t| > t.integer "followed_id" > t.integer "follower_id" > end > > add_index "links", ["follower_id"], :name => "index_links_on_follower_id" > add_index "links", ["followed_id", "follower_id"], :name => > "index_links_on_followed_id_and_follower_id", :unique => true > add_index "links", ["followed_id"], :name => "index_links_on_followed_id" > > === # of rows > > users: 2,525,539 > links: 4,559,390 > > === explain > > "HashAggregate (cost=490089.52..490238.78 rows=11941 width=8)" > " -> Hash Join (cost=392604.44..483190.22 rows=1379860 width=8)" > " Hash Cond: (f1.follower_id = u1.id)" > " -> Bitmap Heap Scan on links f1 (cost=14589.95..55597.70 > rows=764540 width=4)" > " Recheck Cond: (followed_id = 1)" > " -> Bitmap Index Scan on index_links_on_followed_id > (cost=0.00..14398.82 rows=764540 width=0)" > " Index Cond: (followed_id = 1)" > " -> Hash (cost=300976.98..300976.98 rows=4559881 width=12)" > " -> Hash Join (cost=94167.40..300976.98 rows=4559881 > width=12)" > " Hash Cond: (f2.follower_id = u1.id)" > " -> Seq Scan on links f2 (cost=0.00..77049.81 > rows=4559881 width=8)" > " -> Hash (cost=53950.20..53950.20 rows=2526496 > width=4)" > " -> Seq Scan on users u1 > (cost=0.00..53950.20 rows=2526496 width=4)" > " Filter: ((type)::text = > 'User::Twitter'::text)" > > === other comments > > I'm assuming I'm doing something obviously stupid and that the above > info will be sufficient for anyone skilled in the art to detect the > problem. However, if needed I will gladly invest the time > to create a subset of the data in order to run EXPLAIN ANALYZE. (With > the whole dataset, it requires > 7 hours to complete the query. I > don't want to go down that path again!) Do you have an index on users.type? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance