Hi, I would like to do some analysis on the TCGA data as provided in ExperimentHub's GSE62944 ExpressionSet.
The description of the dataset reads:

"TCGA re-processed RNA-Seq data from 9264 Tumor Samples and 741 normal samples across 24 cancer types"

However, loading the dataset via

> library(ExperimentHub)
> eh <- ExperimentHub()
> query(eh, "GSE62944")
> tcga_data <- eh[["EH1"]]

and counting the samples

> dim(tcga_data)
Features  Samples
   23368     7706

as well as the cancer types

> length(table(pData(tcga_data)[, "CancerType"]))

gives numbers that disagree with the description above, which suggests that this is an outdated version of the dataset.

Is it possible to
(1) update it accordingly, and
(2) include a varLabel, i.e. a pData column, indicating whether a sample is a tumor or an adjacent normal sample for the respective cancer type?

That would be great!

Thx & Best,
Ludwig

--
Dr. Ludwig Geistlinger
Lehr- und Forschungseinheit für Bioinformatik
Institut für Informatik
Ludwig-Maximilians-Universität München
Amalienstrasse 17, 2. Stock, Büro A201
80333 München

Tel.: 089-2180-4067
eMail: ludwig.geistlin...@bio.ifi.lmu.de

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel