From: Thomas Güttler   Sent: Monday, November 28, 2016 6:28 AM


...I have 2.3 TB of files; the file count is 17 million.

Since we already store our structured data in Postgres, I am thinking about
storing the files in PostgreSQL, too.

Is it feasible to store files in PostgreSQL?

-------

I am doing something similar, but in reverse.  The legacy MySQL databases I'm
converting into a modern Postgres data model have very large genomic strings
stored in 3 separate columns.  Out of the 25 TB of legacy data storage (in 800
dbs across 4 servers, about 22 billion rows), those 3 columns consume 90% of
the total space, and they are used only for reference, never in searches or
calculations.  They range from 1 KB to several MB.

Since I am collapsing all 800 dbs into a single PG db, being very smart about
storage was critical.  Because we're also migrating everything to AWS, we're
placing those 3 strings (per row) into a single JSON document and storing the
document in S3 buckets, with the object key being the globally unique PK for
the row…super simple.  The app tier knows to fetch the relational data from
the db and the large-string JSON from S3.  The retrieval time is surprisingly
fast; this is all real-time web app stuff.
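
For anyone wanting to try the same pattern, here is a rough sketch of the
write and read paths in Python.  All names are hypothetical (the genomic_rows
table, the genomic-blobs bucket, the seq_* columns), and it assumes boto3 and
psycopg2 with AWS credentials already configured -- an illustration of the
idea, not our production code:

    import json

    import boto3
    import psycopg2

    BUCKET = "genomic-blobs"  # hypothetical bucket name

    s3 = boto3.client("s3")

    def store_row(conn, pk, metadata, seq_a, seq_b, seq_c):
        # Bundle the three large strings into one JSON document and write
        # it to S3 under the row's globally unique PK -- the PK itself is
        # the "pointer", so no extra reference column is needed in Postgres.
        doc = json.dumps({"seq_a": seq_a, "seq_b": seq_b, "seq_c": seq_c})
        s3.put_object(Bucket=BUCKET, Key=str(pk), Body=doc.encode("utf-8"))
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO genomic_rows (id, metadata) VALUES (%s, %s)",
                (pk, metadata),
            )
        conn.commit()

    def fetch_row(conn, pk):
        # Read the relational part from Postgres, then the large strings
        # from S3 using the same PK as the object key.
        with conn.cursor() as cur:
            cur.execute("SELECT metadata FROM genomic_rows WHERE id = %s", (pk,))
            (metadata,) = cur.fetchone()
        obj = s3.get_object(Bucket=BUCKET, Key=str(pk))
        return metadata, json.loads(obj["Body"].read())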

This is a model that could work for anyone dealing with large objects (text
or binary).  The nice part is that the original 25 TB of data storage drops
to 5 TB, a much more manageable number, allowing for the significant growth
that is on the horizon.

Mike Sofen  (Synthetic Genomics USA)
