Hello everyone,

I'm just beginning a PHP / PostgreSQL project for someone and there are a 
couple of things where I am not sure the best design for the database.

The first is that we want to store kind of a history of values for the past 
3 days or so.  We want to use this so that we can analyze how the values 
have changed over the last few days.  The solution I thought of was having 
4 columns in the table for each value, like this:

somedata        int,
somedata_24     int,
somedata_48     int,
somedata_72     int,

There are 3 different variables that we want to keep a history for in each 
row.  So what I thought we could do is at some arbitrary time each day, 
copy the values from somedata into somedata_24, from somedata_24 into 
somedata_48, from somedata_48 into somedata_72, and just forget whatever 
was in somedata_72.  My question is, how long would something like this 
take (relatively speaking, I don't know the hardware specs of the server 
exactly, it will be professionally hosted I believe) if there were around 
20,000 rows?  If it would take too long or be too taxing on resources, do 
you have any other ideas on how to handle something like this?


The second thing is we are going to be having our users update the data in 
the database quite frequently.  What will happen is that when User A wants 
to search the database for something, the system will first check and see 
what the oldest set of data it has is and then ask User A to fetch new data 
to replace the old stuff.  Now since we expect very frequent use of this, 
we want a way to mark which data was requested from User A so that when 
User B comes along, the system doesn't request the same data from him.  But 
we don't want to change the timestamp for when the data was updated until 
we actually get the information, in case the user decides he doesn't want 
to do a search and doesn't send the new data.  One way I thought I could do 
this is by having a table something like this:

dataset_id      int,
last_update     timestamp, (or some other date/time field...I can never keep 
them straight in my head)
locked  bool

Then, when I request a dataset from User A, I set the bool field to true 
and my SQL for finding the oldest one already only selects from datasets 
where locked is false.  But I thought if PGSQL's transaction functions 
already did something like this, it would be easier and faster to use 
them.  Will SELECT ... FOR UPDATE help me out here?  Is there a way to 
check and see if a row is locked and ignore locked rows in another SELECT 
... FOR UPDATE?  Any input on a better way to handle this would also be 
appreciated.

Thanks for the time. :)

-Jesse

Reply via email to