Consolidating my responses in one email.

1. The total data expected is some 1 - 1.5 TB a day. 75% of the data comes in a period of 10 hours; the remaining 25% comes in the other 14 hours. Of course there are ways to smooth the load patterns, but the current scenario is as explained.
2. I do expect that the customer will roll in something like a NAS/SAN with terabytes of disk space. The idea is to retain the data for a while and then offload it to tape.

That leads to the question: can the data be compressed? Since the data is very similar, I would expect compression to shrink it by some 6x-10x. Is there a way to identify which partitions are in which data files, and keep them compressed until they are actually read?

Regards
Dhaval

On 5/12/07, Lincoln Yeoh <[EMAIL PROTECTED]> wrote:
At 04:43 AM 5/12/2007, Dhaval Shah wrote:
>1. Large amount of streamed rows. In the order of @50-100k rows per
>second. I was thinking that the rows can be stored into a file and the
>file then copied into a temp table using copy and then appending those
>rows to the master table. And then dropping and recreating the index
>very lazily [during the first query hit or something like that]

Is it one process inserting or can it be many processes?

Is it just a short (relatively) high burst or is that rate sustained for a long time? If it's sustained I don't see the point of doing so many copies.

How many bytes per row? If the rate is sustained and the rows are big then you are going to need LOTs of disks (e.g. a large RAID10).

When do you need to do the reads, and how up to date do they need to be?

Regards,
Link.
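As a rough sanity check on the figures in this thread, the sketch below turns the daily volume and load pattern into average byte rates, and combines the peak rate with the 50-100k rows per second figure to get an implied row size. It assumes 1 TB = 10^12 bytes, a uniform arrival rate within each window, and that the quoted row rate applies to the 10-hour peak window; none of these is stated in the thread, and the function names are purely illustrative.

```python
# Back-of-the-envelope ingest rates from the figures quoted in this thread.
# Assumptions (not stated in the thread): 1 TB = 10**12 bytes, data arrives
# at a uniform rate within each window, and the 50-100k rows/s figure
# refers to the 10-hour peak window.

TB = 10**12
MB = 10**6


def window_rate_mb_s(daily_tb, share, hours):
    """Average MB/s needed to absorb `share` of the daily volume in `hours`."""
    return daily_tb * TB * share / (hours * 3600) / MB


def implied_row_bytes(rate_mb_s, rows_per_s):
    """Average row size in bytes implied by a byte rate and a row rate."""
    return rate_mb_s * MB / rows_per_s


if __name__ == "__main__":
    for daily_tb in (1.0, 1.5):
        peak = window_rate_mb_s(daily_tb, 0.75, 10)      # 75% in 10 hours
        off_peak = window_rate_mb_s(daily_tb, 0.25, 14)  # 25% in 14 hours
        print(f"{daily_tb} TB/day: peak {peak:.1f} MB/s, "
              f"off-peak {off_peak:.1f} MB/s")
        for rows_per_s in (50_000, 100_000):
            size = implied_row_bytes(peak, rows_per_s)
            print(f"  at {rows_per_s} rows/s: about {size:.0f} bytes per row")
```

Under those assumptions the peak window averages roughly 21-31 MB/s and the off-peak window roughly 5-7.4 MB/s, which would imply rows of a few hundred bytes (about 208-625 bytes).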
--
Dhaval Shah