Dear all, I would like to ask if it is possible to change the seqnames of a bam file giving a vector of character to the function renameSeqlevels. This is because in order to use the fuction summarizeOverlap or count/find, the seqnames have to match. >From the bamfile below I have extracted the locus annotations form the seqnames (i.e ERCC00002, NC_001133.9...etc) and I have created a list (same length as the seqlevels of the bam file).
*bamfile* GAlignments object with 6 alignments and 0 metadata columns: seqnames <Rle> [1] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061 [2] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061 [3] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061 [4] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061 [5] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061 [6] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061 strand cigar qwidth start end width njunc <Rle> <character> <integer> <integer> <integer> <integer> <integer> [1] + 8M2D27M 35 1025 1061 37 0 [2] + 8M2D27M 35 1025 1061 37 0 [3] - 36M 36 1025 1060 36 0 [4] - 36M 36 1026 1061 36 0 [5] + 35M 35 1027 1061 35 0 [6] + 35M 35 1027 1061 35 0 ------- *gffile* GRanges object with 6 ranges and 12 metadata columns: seqnames ranges strand | source type score <Rle> <IRanges> <Rle> | <factor> <factor> <numeric> [1] NC_001133.9 [ 24837, 25070] + | s_cerevisiae exon <NA> [2] NC_001133.9 [ 25048, 25394] + | s_cerevisiae exon <NA> [3] NC_001133.9 [ 27155, 27786] + | s_cerevisiae exon <NA> [4] NC_001133.9 [ 73431, 73792] + | s_cerevisiae exon <NA> [5] NC_001133.9 [165314, 165561] + | s_cerevisiae exon <NA> [6] NC_001133.9 [165388, 165781] + | s_cerevisiae exon <NA> phase gene_id transcript_id exon_number gene_name <integer> <character> <character> <character> <character> [1] <NA> XLOC_000040 TCONS_00000191 1 FLO9 [2] <NA> XLOC_000040 TCONS_00000192 1 FLO9 [3] <NA> XLOC_000041 TCONS_00000193 1 FLO9 [4] <NA> XLOC_000055 TCONS_00000200 1 YAL037C-A [5] <NA> XLOC_000075 TCONS_00000100 1 YAR010C [6] <NA> XLOC_000075 TCONS_00000219 1 YAR010C oId nearest_ref class_code <character> <character> <character> [1] {TRINITY_GG_normal}16_c1_g1_i1.mrna1 rna8 x [2] {TRINITY_GG_normal}16_c0_g1_i1.mrna1 rna8 x [3] {TRINITY_GG_normal}12_c0_g1_i1.mrna1 rna8 x [4] {TRINITY_GG_normal}3_c3_g1_i1.mrna1 rna31 x [5] {TRINITY_GG_normal}3479_c0_g1_i1.mrna1 rna77 x [6] {TRINITY_GG_normal}24_c0_g1_i1.mrna1 rna77 x tss_id <character> [1] TSS42 [2] TSS43 [3] TSS44 [4] TSS71 [5] TSS118 [6] TSS118 ------- It is possible to replace the seqlevels names with the list? I have tried: bamfile1 <- renameSeqlevels(seqlevels(bamfile), listx) Thank you for any advice, Kind regards, Teresa
______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.