Consolidating my responses in one email.

1. The total expected volume is some 1-1.5 TB of data a day. 75% of
the data arrives within a 10-hour window; the remaining 25% comes in
over the other 14 hours, so the busy window works out to roughly
20-30 MB/s sustained. Of course there are ways to smooth the load
pattern, but the current scenario is as described.

2. I do expect the customer to roll in something like a NAS/SAN with
terabytes of disk space. The idea is to retain the data for a certain
duration and then offload it to tape.
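
If the retention window lines up with time-based partitions, aging out
a whole partition at once is the usual pattern: detach it, dump it to
a file bound for tape, then drop it. A rough sketch, assuming a modern
PostgreSQL with declarative partitioning (the 8.x releases would use
inheritance-based partitions instead); table and column names here are
made up for illustration:

    -- Hypothetical parent table, range-partitioned by arrival time.
    CREATE TABLE readings (
        recorded_at  timestamptz NOT NULL,
        payload      text
    ) PARTITION BY RANGE (recorded_at);

    -- One child partition per day, created ahead of the load.
    CREATE TABLE readings_2007_05_12
        PARTITION OF readings
        FOR VALUES FROM ('2007-05-12') TO ('2007-05-13');

    -- When a day ages out of the retention window: detach it, dump it
    -- (e.g. pg_dump -t readings_2007_05_12 | gzip, written to the tape
    -- staging area), then drop it to reclaim the disk.
    ALTER TABLE readings DETACH PARTITION readings_2007_05_12;
    DROP TABLE readings_2007_05_12;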

That leads to the question: can the data be compressed? Since the data
is very similar, compressing it should give something like a 6x-10x
reduction. Is there a way to identify which partitions live in which
data files and keep them compressed until they are actually read?
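
On the "which data files" part: each partition is its own relation
with its own files under the data directory, and the path can be
looked up from SQL. A minimal sketch, assuming a PostgreSQL version
that provides pg_relation_filepath() and the hypothetical readings_*
partitions sketched above:

    -- For each partition: its file path (relative to the data
    -- directory) and its current on-disk size.
    SELECT c.relname,
           pg_relation_filepath(c.oid)                   AS file_path,
           pg_size_pretty(pg_total_relation_size(c.oid)) AS on_disk
    FROM pg_class c
    WHERE c.relname LIKE 'readings_%'   -- hypothetical naming scheme
    ORDER BY c.relname;

As far as I know, the server expects to read those relation files
directly, so keeping them compressed until first read would come from
the filesystem layer, or from archiving detached partitions as
compressed dumps, rather than from PostgreSQL itself.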

Regards
Dhaval

On 5/12/07, Lincoln Yeoh <[EMAIL PROTECTED]> wrote:
At 04:43 AM 5/12/2007, Dhaval Shah wrote:

>1. A large volume of streamed rows, on the order of 50-100k rows per
>second. I was thinking that the rows could be written to a file, the
>file copied into a temp table using COPY, and those rows then appended
>to the master table, with the index dropped and recreated very lazily
>[during the first query hit or something like that].
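
A minimal sketch of that load path, for illustration only (the staging
table, index, column and file path are assumptions; COPY here reads a
server-side file, and psql's \copy would be the client-side
equivalent):

    -- Stage each batch file into an unindexed table, then append it.
    CREATE TEMP TABLE staging (LIKE master);
    COPY staging FROM '/data/incoming/batch_0001.dat';
    INSERT INTO master SELECT * FROM staging;
    DROP TABLE staging;

    -- "Lazy" index handling: drop before the heavy load window and
    -- rebuild just before the first read that needs it.
    DROP INDEX IF EXISTS master_recorded_at_idx;
    -- ... many COPY/append batches ...
    CREATE INDEX master_recorded_at_idx ON master (recorded_at);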

Is it one process inserting or can it be many processes?

Is it just a relatively short, high burst, or is that rate sustained
for a long time? If it's sustained, I don't see the point of doing so
many copies.

How many bytes per row? If the rate is sustained and the rows are big,
then you are going to need lots of disks (e.g. a large RAID10).

When do you need to do the reads, and how up to date do they need to be?

Regards,
Link.

--
Dhaval Shah
