On 29 November 2016 at 16:50, Thomas Güttler <guettl...@thomas-guettler.de>
wrote:

>
>
> On 29.11.2016 at 01:52, Mike Sofen wrote:
>
>> From: Thomas Güttler   Sent: Monday, November 28, 2016 6:28 AM
>>
>> ...I have 2.3TBytes of files. File count is 17M
>>
>> Since we already store our structured data in postgres, I think about
>> storing the files in PostgreSQL, too.
>>
>> Is it feasible to store files in PostgreSQL?
>>
>>

> I guess I will use some key-to-blob store like S3. AFAIK there are open
> source S3 implementations available.
>
> Thank you all for your feedback!
>
>  Regards, Thomas
>


I have a similar setup, with about 20TB of data in over 60 million files.
It might be possible to store that in PG, but I think it would be a huge
headache that is easily avoided. Files are GPG-encrypted and backed up
offsite to S3, with lifecycle rules migrating them to Glacier storage. A
tool like boto lets you sync things easily to S3, and maybe directly to
Glacier, and there are alternatives out there.
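
Something like the following is enough to push an already-encrypted file
to S3 and let a lifecycle rule on the bucket migrate it to Glacier later.
This is only a rough sketch using boto3 (the current incarnation of boto);
the bucket name and the 30-day transition are placeholders, not my actual
configuration.

import boto3

s3 = boto3.client("s3")
BUCKET = "my-backup-bucket"  # placeholder bucket name

def upload_encrypted_file(local_path, key):
    """Upload one already-GPG-encrypted file to S3."""
    s3.upload_file(Filename=local_path, Bucket=BUCKET, Key=key)

# One-off: a lifecycle rule on the bucket transitions objects to Glacier
# after a chosen number of days (30 here is only an example).
s3.put_bucket_lifecycle_configuration(
    Bucket=BUCKET,
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-to-glacier",
            "Filter": {"Prefix": ""},   # apply to every object
            "Status": "Enabled",
            "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
        }]
    },
)

The upload script never has to talk to Glacier directly; the migration is
handled entirely by the lifecycle rule on the bucket.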

If your rsync is taking too long, though, syncing to S3 will be even
worse. If that is your bottleneck, then you need to fix it first, probably
by tracking which files have changed and only resyncing those, for example
using timestamps from the database or by storing 'incoming' files in a
separate area from your 'archive'. Once you have this sorted, you can run
your backups every few minutes and reduce your potential data loss.
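
As a rough illustration of what I mean, assuming a hypothetical "files"
table with path, s3_key and modified_at columns (your schema will differ),
an incremental sync driven by database timestamps could look like this
with psycopg2 and boto3:

import boto3
import psycopg2

BUCKET = "my-backup-bucket"  # placeholder bucket name
s3 = boto3.client("s3")

def sync_changed_files(last_sync):
    """Upload only the files recorded as modified since the last sync."""
    conn = psycopg2.connect("dbname=filestore")  # placeholder DSN
    with conn, conn.cursor() as cur:
        cur.execute(
            "SELECT path, s3_key FROM files WHERE modified_at > %s",
            (last_sync,),
        )
        for path, key in cur:
            s3.upload_file(Filename=path, Bucket=BUCKET, Key=key)
    conn.close()

Record the time you start each run and pass it in as last_sync on the next
one, so each pass only touches the files that actually changed.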


-- 
Stuart Bishop <stu...@stuartbishop.net>
http://www.stuartbishop.net/
