Hi,

Do the Guix project and its members suggest best Guix-ish practices for managing on-premise mirrors of large file-based datasets, such as those that appear in genomics HPC environments?
Perhaps a Guix-ish response to [Go Get Data \(GGD\) is a framework that facilitates reproducible access to genomic data](https://www.nature.com/articles/s41467-021-22381-z) that would build on GWL?

Use cases would be, e.g., downloading/syncing selected (versions of) genomes from Ensembl, NCBI, etc. and indexing them for BLAST, BLAT, bowtie{2}, bwa, STAR, GMAP, HISAT, IGV, Bioconductor, etc.

I see much that addresses analysis workflows, such as:

- [Reproducible genomics analysis pipelines with GNU Guix](https://www.biorxiv.org/content/10.1101/298653v2.full)
- [Scalable Workflows and Reproducible Data Analysis for Genomics](https://pubmed.ncbi.nlm.nih.gov/31278683/)
- [PiGx: reproducible genomics analysis pipelines with GNU Guix](https://academic.oup.com/gigascience/article/7/12/giy123/5114263)

Am I missing similar efforts toward maintaining an up-to-date catalog of the genomic resources that such workflows require?

Thanks!

Malcolm Cook
Database Applications Manager
Stowers Institute for Medical Research
Kansas City, MO USA