Package: wnpp
Severity: wishlist
X-Debbugs-Cc: h...@torproject.org, debian-r...@lists.debian.org

* Package name    : pg-parquet
  Version         : 0.3.0
  Upstream Contact: https://github.com/CrunchyData/
* URL             : https://github.com/CrunchyData/pg_parquet/
* License         : PostgreSQL
  Programming Lang: Rust
  Description     : Copy to/from Parquet in S3 or Azure Blob Storage from within PostgreSQL

pg_parquet is a PostgreSQL extension that lets you read and write
Parquet files, located in S3 or on the file system, from PostgreSQL
via COPY TO/FROM commands. It depends on the Apache Arrow project to
read and write Parquet files and on the pgrx project to extend
PostgreSQL's COPY command.

----

This is something we're looking at using inside Tor to archive
long-term data from PostgreSQL into object storage. See
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41416 and related.

An alternative is a "foreign data wrapper" such as:

https://github.com/pgspider/parquet_s3_fdw

We're not absolutely sure which one is best; we might need both.
