Nothing? Are subqueries just not meant to be used this way?

On Wed, May 9, 2012 at 9:42 AM, Chris Hanks
<christopher.m.ha...@gmail.com>wrote:

> Hello -
>
> I have two tables:
>
> CREATE TABLE users
> (
>   id serial NOT NULL,
>   created_at timestamp with time zone NOT NULL,
>   last_seen_at timestamp with time zone NOT NULL,
>   -- some other columns...
> )
>
> CREATE TABLE emails
> (
>   user_id integer NOT NULL,
>   address text NOT NULL,
>   created_at timestamp with time zone NOT NULL,
>   confirmed_at timestamp with time zone,
>   -- some other columns...
>   CONSTRAINT emails_user_id_fkey FOREIGN KEY (user_id)
>       REFERENCES users (id) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE CASCADE
> )
>
> The gist is that a single user can be related to multiple emails, and some
> email addresses are confirmed (they've clicked a link I've sent there, so I
> know it's valid) and some aren't.
>
> Routinely, when I'm fetching users from the db I also want to get the best
> email address for each user. That is, the email address that they've
> confirmed the most recently, or failing that, the one that they created the
> most recently. I've been doing this via a subselect:
>
> SELECT *,
>   (SELECT "address" FROM "emails" WHERE ("user_id" = "id") ORDER BY
> "confirmed_at" DESC NULLS LAST, "created_at" DESC LIMIT 1) AS "best_email"
> FROM "users"
>
> I like the subquery approach because I can use my ORM to easily append it
> to whatever query I'm running against the users table (whether I'm looking
> up one user or many), without having to do an explicit join and trim out
> the unnecessary rows. Also, in the future I'm planning on adding additional
> subqueries to get (for example) each user's current subscription status,
> and I'm afraid that the joins will get ungainly. Besides, I find subqueries
> much easier to reason about than joins.
>
> My problem is that now I need to get not only the best email's address,
> but whether it is confirmed (whether confirmed_at is not null). My first
> attempt was to simply repeat the subquery:
>
> SELECT *,
>   (SELECT "address" FROM "emails" WHERE ("user_id" = "id") ORDER BY
> "confirmed_at" DESC NULLS LAST, "created_at" DESC LIMIT 1) AS "best_email",
>   ((SELECT "confirmed_at" FROM "emails" WHERE ("user_id" = "id") ORDER BY
> "confirmed_at" DESC NULLS LAST, "created_at" DESC LIMIT 1) IS NOT NULL) AS
> "best_email_confirmed"
> FROM "users"
>
> I had hoped Postgres would recognize that the two subqueries were
> identical, but judging from the explain output from my development database
> it's not, and it's simply running the subquery twice instead:
>
> "Seq Scan on users  (cost=0.00..333.65 rows=13 width=81)"
> "  SubPlan 1"
> "    ->  Limit  (cost=12.79..12.79 rows=1 width=48)"
> "          ->  Sort  (cost=12.79..12.80 rows=5 width=48)"
> "                Sort Key: public.emails.confirmed_at,
> public.emails.created_at"
> "                ->  Bitmap Heap Scan on emails  (cost=4.29..12.76 rows=5
> width=48)"
> "                      Recheck Cond: (user_id = users.id)"
> "                      ->  Bitmap Index Scan on emails_pkey
>  (cost=0.00..4.29 rows=5 width=0)"
> "                            Index Cond: (user_id = users.id)"
> "  SubPlan 2"
> "    ->  Limit  (cost=12.79..12.79 rows=1 width=16)"
> "          ->  Sort  (cost=12.79..12.80 rows=5 width=16)"
> "                Sort Key: public.emails.confirmed_at,
> public.emails.created_at"
> "                ->  Bitmap Heap Scan on emails  (cost=4.29..12.76 rows=5
> width=16)"
> "                      Recheck Cond: (user_id = users.id)"
>  "                      ->  Bitmap Index Scan on emails_pkey
>  (cost=0.00..4.29 rows=5 width=0)"
> "                            Index Cond: (user_id = users.id)"
>
> It would be ideal if I could pull both results from the same subquery,
> something like:
>
> SELECT *,
>   (SELECT "address", "confirmed_at" IS NOT NULL FROM "emails" WHERE
> ("user_id" = "id") ORDER BY "confirmed_at" DESC NULLS LAST, "created_at"
> DESC LIMIT 1) AS ("best_email", "best_email_confirmed")
> FROM "users"
>
> But this isn't valid syntax. I tried putting the subquery under a FROM
> clause, but it won't work with my "user_id" = "id" condition, and throws
> "ERROR:  subquery in FROM cannot refer to other relations of same query
> level". I think CTEs might be an answer, but I'm stuck on 8.3 for the
> foreseeable future, which doesn't support them.
>
> Does anyone have any suggestions?
>

Reply via email to