[Bioc-devel] range-directed metadata management

Vincent Carey Thu, 10 Jul 2014 13:53:27 -0700

a new, more inclusive GWAS catalog is available (GRASP, from Andrew Johnson
at NHLBI), with 6 million records and voluminous metadata (though the
metadata seem sparse and could perhaps be trimmed/reshaped).


i made a GRanges and it takes 3 minutes to load.  even after stripping all
the metadata, a GRanges with 6 million records takes 20 seconds to load.
that's probably acceptable, but a managed chromosome-specific distribution
might be closer to interactive availability.

the metadata probably would be best kept in SQLite.  it occurred to me to
consider an arrangement in which we have the GRanges managing the ranges
and a key to the database.  range operations can engender queries to
retrieve metadata, and metadata queries in the db can generate indices to
retrieve matching ranges.
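
a minimal sketch of that two-way arrangement, written in Python with the
standard sqlite3 module rather than with GRanges itself.  everything here is
an assumption made for illustration: the toy records, the table name `meta`,
the columns `phenotype` and `pvalue`, and all function names are hypothetical
and are not taken from GRASP or from any Bioconductor package.  a plain list
of tuples stands in for the ranges object, and a linear scan stands in for a
real overlap query.

```python
import sqlite3

# Hypothetical toy ranges: (key, chrom, start, end), 1-based inclusive.
# In the arrangement described above, a GRanges would hold these.
RANGES = [
    (1, "chr1", 100, 100),
    (2, "chr1", 500, 500),
    (3, "chr2", 250, 250),
]

# Hypothetical metadata rows sharing the same integer key.
METADATA = [
    (1, "height", 1e-9),
    (2, "BMI", 3e-6),
    (3, "height", 2e-8),
]


def build_db():
    """Create an in-memory SQLite table holding only the metadata."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE meta (key INTEGER PRIMARY KEY, phenotype TEXT, pvalue REAL)"
    )
    con.executemany("INSERT INTO meta VALUES (?, ?, ?)", METADATA)
    con.commit()
    return con


def keys_overlapping(chrom, start, end):
    """Range operation: keys of ranges overlapping [start, end] on chrom."""
    return [k for (k, c, s, e) in RANGES if c == chrom and s <= end and e >= start]


def metadata_for_region(con, chrom, start, end):
    """Range query first, then fetch metadata for the surviving keys."""
    keys = keys_overlapping(chrom, start, end)
    if not keys:
        return []
    placeholders = ",".join("?" * len(keys))
    sql = (
        "SELECT key, phenotype, pvalue FROM meta "
        f"WHERE key IN ({placeholders}) ORDER BY key"
    )
    return con.execute(sql, keys).fetchall()


def ranges_for_phenotype(con, phenotype):
    """Metadata query first, then use the returned keys to pick out ranges."""
    rows = con.execute("SELECT key FROM meta WHERE phenotype = ?", (phenotype,))
    keys = {row[0] for row in rows}
    return [r for r in RANGES if r[0] in keys]


if __name__ == "__main__":
    con = build_db()
    print(metadata_for_region(con, "chr1", 1, 200))
    print(ranges_for_phenotype(con, "height"))
```

the point of the design is that the ranges object stays small (coordinates
plus one integer key), while the bulky annotation is only read from disk for
the keys a query actually touches.  with millions of keys, binding them one
by one in an `IN (...)` clause would likely hit SQLite's limit on bound
parameters, so a temporary key table and a join would probably be the better
choice there.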

is anyone doing something along these lines?


