I see others have responded with suggestions to improve query performance, but one thing I noticed when you gave the data structure is there are no no primary keys defined for friends or posting, neither are there any indexes. Was that an omission? If not, then please note that PostgreSQL is a _relational_ database and it is critical to have primary keys and additional indexes for data integrity and performance.
FYI, defining a foreign key in a table does not automatically generate an associated index. I therefore suggest you do the following to improve performance. ALTER TABLE posting ADD CONSTRAINT posting_pk PRIMARY KEY ( id ); ALTER TABLE friends ADD CONSTRAINT friends_pk PRIMARY KEY ( user_id__a, user_id__b ); ALTER TABLE membership ADD CONSTRAINT membership_pk PRIMARY KEY ( user_id, group_id ); CREATE INDEX posting_group_id__in_idx on posting USING BTREE ( group_id__in ); ANALYZE posting; ANALYZE friends; ANALYZE membership; On Wed, Apr 29, 2015 at 6:54 AM, Ladislav Lenart <lenart...@volny.cz> wrote: > Hello. > > On 29.4.2015 01:57, Jonathan Vanasco wrote: > > Sorry, I was trying to ask something very abstract as I have similar > situations > > on multiple groups of queries/tables (and they're all much more complex). > > > > I'm on pg 9.3 > > > > The relevant structure is: > > > > posting: > > id > > timestamp_publish > > group_id__in > > user_id__author > > > > friends: > > user_id__a > > user_id__b > > > > memberships: > > user_id > > group_id > > role_id > > > > > > -- working sql > > CREATE TABLE groups( > > id SERIAL NOT NULL PRIMARY KEY > > ); > > CREATE TABLE users( > > id SERIAL NOT NULL PRIMARY KEY > > ); > > CREATE TABLE friends ( > > user_id__a INT NOT NULL REFERENCES users( id ), > > user_id__b INT NOT NULL REFERENCES users( id ) > > ); > > CREATE TABLE memberships ( > > user_id INT NOT NULL REFERENCES users( id ), > > group_id INT NOT NULL REFERENCES groups( id ), > > role_id INT NOT NULL > > ); > > CREATE TABLE posting ( > > id SERIAL NOT NULL, > > timestamp_publish timestamp not null, > > group_id__in INT NOT NULL REFERENCES groups(id), > > user_id__author INT NOT NULL REFERENCES users(id), > > is_published BOOL > > ); > > > > The output that I'm trying to get is: > > posting.id > > {the context of the select} > > posting.timestamp_publish (this may need to get correlated into > other queries) > > > > > > These approaches had bad performance: > > > > -- huge selects / memory > > -- it needs to load everything from 2 tables before it limits > > EXPLAIN ANALYZE > > SELECT id, feed_context FROM ( > > SELECT id, timestamp_publish, 'in-group' AS feed_context FROM > posting > > WHERE ( > > group_id__in IN (SELECT group_id FROM memberships WHERE > user_id = 57 > > AND role_id IN (1,2,3)) > > AND (is_published = True AND timestamp_publish <= > CURRENT_TIMESTAMP > > AT TIME ZONE 'UTC') > > ) > > UNION > > SELECT id, timestamp_publish, 'by-user' AS feed_context FROM > posting > > WHERE ( > > user_id__author IN (SELECT user_id__b FROM friends WHERE > user_id__a > > = 57) > > AND (is_published = True AND timestamp_publish <= > CURRENT_TIMESTAMP > > AT TIME ZONE 'UTC') > > ) > > ) AS feed > > ORDER BY timestamp_publish DESC > > LIMIT 10 > > ; > > > I think you can propagate ORDER BY and LIMIT also to the subqueries of the > UNION, i.e.: > > select... > from ( > ( > select... > from posting > where... -- friends > order by timestamp_publish desc > limit 10 > ) union ( > ( > select... > from posting > where... -- groups > order by timestamp_publish desc > limit 10 > ) > ) as feed > order by timestamp_publish desc > limit 10 > > That might behave better. > > > Ladislav Lenart > > > > -- selects minimized, but repetitive subqueries > > SELECT > > id, > > CASE > > WHEN group_id__in IN (SELECT group_id FROM memberships WHERE > user_id > > = 57 AND role_id IN (1,2,3)) THEN True > > ELSE NULL > > END AS feed_context_group, > > CASE > > WHEN user_id__author IN (SELECT user_id__b FROM friends WHERE > > user_id__a = 57) THEN True > > ELSE NULL > > END AS feed_context_user > > FROM posting > > WHERE ( > > group_id__in IN (SELECT group_id FROM memberships WHERE > user_id = 57 > > AND role_id IN (1,2,3)) > > OR > > user_id__author IN (SELECT user_id__b FROM friends WHERE > user_id__a > > = 57) > > ) > > AND (is_published = True AND timestamp_publish <= > CURRENT_TIMESTAMP AT > > TIME ZONE 'UTC') > > ORDER BY timestamp_publish DESC > > LIMIT 10 > > ; > > > > > > > > On Apr 28, 2015, at 6:56 PM, Melvin Davidson wrote: > > > >> Since you very nicely DID NOT provide the pg version, O/S or table > >> structure(s), which is what you should do REGARDLESS of the > >> type of question (it's just the smart and polite thing to do when > asking for > >> help) The best I can suggest is: > >> SELECT > >> CASE WHEN context = 'friend' THEN p.junka > >> WHEN context = 'group' THEN p.junkb > >> WHEN context = 'both' THEN p.junka || ' ' || p.junkb > >> END > >> FROM posting p > >> where p.author_id in (SELECT f.friend_id > >> FROM friends f > >> WHERE f.user_id = ?) > >> OR p.group_id in (SELECT m.group_id > >> FROM memberships m > >> WHERE m.user_id = ?); > > > > > > > -- *Melvin Davidson* I reserve the right to fantasize. Whether or not you wish to share my fantasy is entirely up to you.