Hi all,

I would love to get some feedback on a little project I put together. I paired my company's Parquet conversion routines (a wrapper around pyarrow) with SQLAlchemy's table reflection capabilities to make an "easy mode" Redshift --> Redshift Spectrum converter.
You can find it here: https://github.com/hellonarrativ/spectrify

I would be curious to hear impressions of the project (is it obvious what it does? Would you find it useful?) and, more specifically, of the Parquet conversion. I ended up not using numpy/pandas, to avoid issues with null values. Performance-wise that is obviously not the best choice, but for this application (occasional conversion of data to Parquet) performance is not critical.

I thought this project might be useful for people evaluating Redshift Spectrum, or for those without an existing setup for converting to Parquet.

Thanks for reading!

Best,
Colin