A new, more inclusive GWAS catalog is available (GRASP, from Andrew Johnson at NHLBI), with 6 million records and voluminous metadata (though the metadata seems sparse and could perhaps be trimmed or reshaped).
I made a GRanges from it, and it takes 3 minutes to load. Even after stripping all the metadata, a GRanges with 6 million records takes 20 seconds to load. That's probably acceptable, but a managed chromosome-specific distribution might be closer to interactive availability. The metadata would probably be best kept in SQLite.

It occurred to me to consider an arrangement in which the GRanges manages the ranges plus a key into the database. Range operations would then generate queries to retrieve metadata, and metadata queries in the database would return keys that retrieve the matching ranges. Is anyone doing something along these lines?

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
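
To make the proposed arrangement concrete, here is a minimal, language-neutral sketch in Python (standard library only), not GRanges code. It assumes point records (one position per SNP), a toy four-row dataset, and hypothetical names (`RangeStore`, `build_metadata_db`, `metadata_for_range`, `ranges_for_phenotype`, and the `meta` table and its columns); none of these come from GRASP or Bioconductor. The in-memory store holds only positions and an integer key, split by chromosome, while SQLite holds the metadata under the same key.

```python
import sqlite3
from bisect import bisect_left, bisect_right

# Hypothetical toy records: (key, chromosome, position, p-value, phenotype).
RECORDS = [
    (1, "chr1", 1000, 1e-8, "LDL cholesterol"),
    (2, "chr1", 5000, 3e-6, "height"),
    (3, "chr2", 2000, 5e-9, "LDL cholesterol"),
    (4, "chr2", 9000, 2e-5, "asthma"),
]


class RangeStore:
    """In-memory positions, one sorted list per chromosome; no metadata."""

    def __init__(self, records):
        self.by_chrom = {}  # chromosome -> sorted list of (position, key)
        self.by_key = {}    # key -> (chromosome, position)
        for key, chrom, pos, _pvalue, _phenotype in records:
            self.by_chrom.setdefault(chrom, []).append((pos, key))
            self.by_key[key] = (chrom, pos)
        for hits in self.by_chrom.values():
            hits.sort()

    def keys_in(self, chrom, start, end):
        """Return keys of records with start <= position <= end."""
        hits = self.by_chrom.get(chrom, [])
        lo = bisect_left(hits, (start,))
        hi = bisect_right(hits, (end, float("inf")))
        return [key for _pos, key in hits[lo:hi]]


def build_metadata_db(records):
    """Put the metadata columns in SQLite, indexed by the shared key."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE meta (key INTEGER PRIMARY KEY, pvalue REAL, phenotype TEXT)"
    )
    db.executemany(
        "INSERT INTO meta VALUES (?, ?, ?)",
        [(key, pvalue, phenotype) for key, _c, _p, pvalue, phenotype in records],
    )
    return db


def metadata_for_range(store, db, chrom, start, end):
    """Range operation first, then a keyed query for the metadata."""
    keys = store.keys_in(chrom, start, end)
    if not keys:
        return []
    marks = ",".join("?" * len(keys))  # one placeholder per key
    sql = f"SELECT key, pvalue, phenotype FROM meta WHERE key IN ({marks}) ORDER BY key"
    return db.execute(sql, keys).fetchall()


def ranges_for_phenotype(store, db, phenotype):
    """Metadata query first, then look up the matching positions by key."""
    rows = db.execute(
        "SELECT key FROM meta WHERE phenotype = ? ORDER BY key", (phenotype,)
    )
    return [store.by_key[key] for (key,) in rows]


store = RangeStore(RECORDS)
db = build_metadata_db(RECORDS)
print(metadata_for_range(store, db, "chr1", 0, 2000))
print(ranges_for_phenotype(store, db, "LDL cholesterol"))
```

The `IN (...)` query is only suitable for modest numbers of hits, since SQLite limits the number of bound parameters per statement; a range query returning very many keys would more likely go through a temporary key table and a join.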