On Tue, Apr 15, 2014 at 10:56 AM, Chris Curvey <ch...@chriscurvey.com>wrote:

> On Mon, Apr 14, 2014 at 12:27 PM, Robert DiFalco <robert.difa...@gmail.com
> > wrote:
>
>> I have several related tables that represent a call state. Let's think of
>> these as phone calls to simplify things. Sometimes I need to determine the
>> last time a user was called, the last time a user answered a call, or the
>> last time a user completed a call.
>>
>> The basic schema is something like this:
>>
>> CREATE TABLE calls (
>>   id      BIGINT NOT NULL, // sequence generator
>>   user_id BIGINT NOT NULL,
>>   called  TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
>>
>>   PRIMARY KEY (id),
>>   FOREIGN KEY (user_id) REFERENCES my_users(id) ON DELETE CASCADE
>> );
>>
>> CREATE TABLE calls_answered (
>>   id        BIGINT NOT NULL,
>>   answered  TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
>>
>>   PRIMARY KEY (id),
>>   FOREIGN KEY (id) REFERENCES calls(id) ON DELETE CASCADE
>> );
>>
>>
>> And so on for calls_connected, calls_completed, call_errors, etc.
>>
>> Occasionally I will want to know things like "When was the last time a
>> user answered a call" or "How many times has a user been called".
>>
>> I can do these queries using a combination of MAX or COUNT. But I'm
>> concerned about the performance.
>>
>> SELECT MAX(a.id)
>> FROM calls_answered a JOIN calls c ON c.id = a.id
>> WHERE c.user_id = ?;
>>
>>
>> Or the number of answered calls:
>>
>> SELECT MAX(a.id)
>> FROM calls_answered a JOIN calls c ON c.id = a.id
>> WHERE c.user_id = ?;
>>
>>
>> Sometimes I might want to get this data for a whole bunch of users. For
>> example, "give me all users whose have not answered a call in the last 5
>> days." Or even "what percentage of users called actually answered a call."
>> This approach could become a performance issue. So the other option is to
>> create a call_summary table that is updated with triggers.
>>
>> The summary table would need fields like "user_id", "last_call_id",
>> "call_count", "last_answered_id", "answered_count", "last_completed_id",
>> "last_completed_count", etc.
>>
>> My only issue with a summary table is that I don't want a bunch of null
>> fields. For example, if the user was *called* but they have never
>> *answered* at call then the last_call_id and call_count fields on the
>> summary table would be non-NULL but the last_answer_id and answer_count
>> fields WOULD be NULL. But over time all fields would eventually become
>> non-NULL.
>>
>> So that leads me to a summary table for EACH call state. Each summary
>> table would have a user id, a ref_id, and a count -- one summary table for
>> each state e.g. call_summary, call_answered_summary, etc.
>>
>> This approach has the down side that it creates a lot of tables and
>> triggers. It has the upside of being pretty efficient without having to
>> deal with NULL values.  It's also pretty easy to reason about.
>>
>> So for my question -- is the choice between these a personal preference
>> sort of thing or is there a right or wrong approach? Am I missing another
>> approach that would be better?  I'm okay with SQL but I'm not expert so I'm
>> not sure if there is an accepted DESIGN PATTERN for this that I am missing.
>>
>> Thanks!
>>
>>
>>
> (Sorry, fat-fingered and hit "send too early"...)

CREATE TABLE calls (
  id      BIGINT NOT NULL, // sequence generator
  user_id BIGINT NOT NULL,
  called  TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
  answered  TIMESTAMPTZ NULL,
  connected  TIMESTAMPTZ NULL,
  completed  TIMESTAMPTZ NULL,


  PRIMARY KEY (id),
  FOREIGN KEY (user_id) REFERENCES my_users(id) ON DELETE CASCADE
);

Then  your queries end up looking like this:

--last time john answered
SELECT MAX(a.id)
FROM calls
where answered is not null
and user_id = ?

-- users that have not answered a call in the last five days (I can think
of a few ways to interpret that phrase)
select myusers.*
from myusers
where not exists
(  select *
  from calls
  where calls.user_id = myusers.user_id
  and answered >= <five days ago>)

-- average ring time
select avg(extract ('seconds' from called - answered))
where answered is not null



-- 
I asked the Internet how to train my cat, and the Internet told me to get a
dog.

Reply via email to