BTW, it's not a problem to query data across multiple days as long as you query 
from the parent table -- PostgreSQL will use the child table constraints to 
search only the child tables that could contain the requested data.
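For example, with the data_tbl setup from my earlier message below, a query 
that spans a day boundary (values illustrative) is planned against only the 
matching child tables, given the default constraint_exclusion = partition 
setting:

    SELECT *
    FROM data_tbl
    WHERE ts >= '2012-05-08 12:00:00-05'
      AND ts <  '2012-05-09 12:00:00-05';

Running this under EXPLAIN should show scans of just data_tbl_ts_20120508 and 
data_tbl_ts_20120509 (plus the parent table itself, which stays empty in this 
scheme, since constraint exclusion never excludes the parent).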

Guy

On May 10, 2012, at 11:01 AM, Horaci Macias wrote:

> Thanks Guy. I had thought about using per-day tables (although I didn't know 
> about child tables), but some of my entries are related and can span several 
> minutes, so my worry was that I would miss some of the related entries when 
> searching for entries that fall close to the end-of-day / start-of-day 
> boundary.
> Anyway, worth a thought for sure, so thanks.
> 
> H
> 
> On 10/05/12 16:42, Guy Helmer wrote:
>> On May 10, 2012, at 4:31 AM, Horaci Macias wrote:
>> 
>>> Hi everybody,
>>> 
>>> I'm running PostgreSQL 9.1 and am having disk space problems.
>>> My application captures information 24x7 and stores it in the database. Each 
>>> entry includes several bytea columns, and there can be ~5M entries a day, so 
>>> size becomes an issue after several days.
>>> My application also cleans up entries older than 10 days; it does this every 
>>> night, and the delete operations complete successfully. I cannot truncate 
>>> the tables as they contain both stale and active data.
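>>> For what it's worth, the nightly cleanup is essentially this, for each table 
>>> (table and column names illustrative):
>>> 
>>> DELETE FROM entries WHERE created < now() - interval '10 days';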
>>> The database can store all the entries for ~15 days without problems, but 
>>> for some reason deleting old entries is not freeing up the space (or 
>>> inserting new entries is not reusing the space left by old ones), because 
>>> after running the application for ~20 days I run out of space on disk.
>>> I've been reading this forum and the Postgres documentation; VACUUM FULL is 
>>> not recommended, and apparently plain VACUUM should be all I need. I'm using 
>>> autovacuum, but this doesn't seem to be solving the problem (perhaps because 
>>> the application keeps inserting entries 24x7 while vacuum is running?).
>>> 
>>> Just to clarify, I don't really care whether the disk space is returned to 
>>> the OS; what I do need is to be sure that I can keep a window of 10 days of 
>>> records (assuming, of course, my HD is big enough for those 10 days, which 
>>> seems to be the case).
>>> 
>>> Some questions:
>>> * Although not generally recommended, I've read that VACUUM FULL is 
>>> sometimes the only choice for maintaining the database when large deletions 
>>> are involved. Is this the case here?
>>> * Should I try to have a "maintenance window" and stop all inserts/writes 
>>> while vacuum is running? If so, is there any way to configure when autovacuum 
>>> runs, or should I rely on cron-type jobs for this? And is there any way to 
>>> prevent external connections at certain times of day, to make sure 
>>> inserts/writes don't happen while vacuum is running, or should I again use 
>>> cron-type jobs for this?
>>> * Any other suggestions/ideas to troubleshoot this or any pointers to 
>>> further documentation?
>> I would expect a plain VACUUM to make unused space available for re-use -- 
>> not sure why it would not be helping.
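>> 
>> As a first check (a sketch -- substitute your own table names), you could 
>> see whether autovacuum is keeping up and how many dead rows remain:
>> 
>>     SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
>>     FROM pg_stat_user_tables
>>     ORDER BY n_dead_tup DESC;
>> 
>> and run VACUUM VERBOSE on one of the large tables to see how much space is 
>> being marked as reusable.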
>> 
>> Since PostgreSQL supports child tables (table inheritance), a neat trick in 
>> this situation is to create per-day child tables and insert new data directly 
>> into the appropriate per-day table; with this approach, deleting old data is 
>> simply a matter of dropping outdated child tables, avoiding VACUUM entirely. 
>> With CHECK constraints on the child tables, PostgreSQL can optimize a query 
>> on the parent table: it knows which child table holds which day's data, and 
>> it will only scan the child tables that could have rows matching a given 
>> query.
>> 
>> For example, I have a table called data_tbl and per-day child tables like 
>> data_tbl_ts_20120509. The child table data_tbl_ts_20120509 has the constraint:
>> 
>>     "data_tbl_20120509_ts_check" CHECK (ts >= '2012-05-08 19:00:00-05'::timestamp with time zone
>>                                     AND ts <  '2012-05-09 19:00:00-05'::timestamp with time zone)
>> 
>> (each child table holds data from 00:00:00 GMT to 23:59:59 GMT for that day)
>> 
>> Each day on my systems, a cron job creates the next day's child table, its 
>> constraint, and its index(es), and another cron job drops any outdated child 
>> tables. I believe the command to create the example child table above would 
>> have been:
>> 
>> CREATE TABLE data_tbl_ts_20120509 (
>>     CHECK (ts >= '2012-05-08 19:00:00-05'::timestamp with time zone
>>        AND ts <  '2012-05-09 19:00:00-05'::timestamp with time zone)
>> ) INHERITS (data_tbl);
>> 
>> (followed by any necessary GRANT commands to provide access to the new child 
>> table)
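>> 
>> The cleanup side is then just a DROP of the expired table, e.g. (date 
>> illustrative):
>> 
>> DROP TABLE IF EXISTS data_tbl_ts_20120429;
>> 
>> which frees the space immediately, with no VACUUM involved.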
>> 
>> Hope this helps,
>> Guy
>> 
>> 

