Hi,

In the QA report of the 'ShortRead' package, samples/files are by default referenced throughout the report by short sequential integer labels. Would it be reasonable/possible to allow optional custom names as sample labels, to make the results of the report easier to understand?

In general, I have three ideas for what would be handy to have:

1. Derive a label from the file names. This is probably hard to generalize and to implement in a way that actually helps.

2. In case the 'dirPath' argument in the 'qa' function call is a named vector, such as

    qa(dirPath=c(p1="bam_file1.bam", p2="bam_file2.bam"))

use the names ["p1", "p2"] for the labeling later on. This would require storing the names in the object returned by 'qa', but should not be too hard to implement.

3. Optionally, pass a named vector (e.g. as a 'samples' argument) to the 'report' method, matching file names to sample labels. In case the file names do not match or 'samples' is missing, default to the sequential labeling.
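
To make the fallback rule in option 3 concrete, here is a minimal sketch of how the labels could be resolved. Note that 'resolve_labels' is a hypothetical helper written only for illustration; it is neither part of ShortRead nor taken from the attached patch. Matching on basename() and falling back for all samples as soon as one file is unmatched are assumptions on my side.

```r
## Hypothetical helper (not part of ShortRead): map file names to labels.
## 'files'   : character vector of file names as stored in the QA object
## 'samples' : optional named character vector; the values are file names,
##             the names are the desired labels (as in option 3)
resolve_labels <- function(files, samples = NULL) {
    ## default: sequential integer labels, as in the current report
    default <- as.character(seq_along(files))
    if (is.null(samples) || is.null(names(samples)))
        return(default)
    ## assumption: compare on the base name, ignoring directories
    idx <- match(basename(files), basename(samples))
    labels <- names(samples)[idx]
    ## fall back unless every file got a unique, non-empty label
    if (anyNA(labels) || any(!nzchar(labels)) || anyDuplicated(labels) > 0)
        return(default)
    labels
}

files <- c("bam_file1.bam", "bam_file2.bam")
default_labels   <- resolve_labels(files)
named_labels     <- resolve_labels(files, c(p1 = "bam_file1.bam", p2 = "bam_file2.bam"))
unmatched_labels <- resolve_labels(files, c(p1 = "other.bam", p2 = "bam_file2.bam"))
```

Falling back for the whole report, rather than mixing custom and sequential labels, keeps the labeling consistent within one report; a per-sample fallback would work as well if that is preferred.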


For option 3, I have created a simple example patch to illustrate how this could be implemented (see attached). With the patch applied, usage could look like this:


    library(ShortRead)
    files = c(p1="bam_file1.bam", p2="bam_file2.bam")
    qa = qa(files, type="BAM")

    ## default sequential labeling ##
    ShortRead:::.report_html_BAMQA(qa, dest="report_normal")

    ## samples named according to names(files) ##
    ShortRead:::.report_html_BAMQA(qa, dest="report_named", samples=files)


I'd be happy to hear any input or thoughts on this.


Best wishes
Julian