Hi,
I can't reproduce this error. Here is a read/write example using a file
from VariantAnnotation where the results are as expected.
fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation")
dest <- tempfile()
vcf1 <- readVcf(fl, "hg19")
> rownames(vcf1)
[1] "rs6054257" "20:17330_T/A" "rs6040355" "20:1230237_T/."
[5] "microsat1"
writeVcf(vcf1, dest)
vcf2 <- readVcf(dest, "hg19")
> rownames(vcf2)
[1] "rs6054257" "20:17330_T/A" "rs6040355" "20:1230237_T/."
[5] "microsat1"
I need a reproducible example in order to help. Is the vcf you're
working with publicly available?
Valerie
On 11/27/2013 03:37 AM, Becq, Jennifer wrote:
Hi Valerie,
Thank you for cc'ing my message.
The "ID" values are removed when reading a VCF through readVcf() and re-writing
it with writeVcf():
V = readVcf("test.vcf", "hg19")
rownames(V)
[1] "DEL:561590:0:1:0:0:0" "BND:81424:0:1:1:1"
[3] "BND:54200:0:1:0:1" "DEL:572151:1:1:6:4:0"
[5] "DEL:572151:1:1:6:4:1" "DEL:572151:1:1:11:0:0"
[7] "DEL:572433:0:0:5:2:0" "DUP:TANDEM:572544:0:0:8:0:0"
writeVcf(V, "writeTest.vcf")
V2 = readVcf("writeTest.vcf", "hg19")
rownames(V2)
[1] "chr20:14855644_C/<DEL>"
[2] "chr20:29627290_G/[chr2:114173319[G"
[3] "chr20:35365307_T/]chr1:230941520]T"
[4]
"chr20:60520225_AACGATGAGGAGCATCGCGGCTGTCTGCACCATGGGAGCCCCTTCTCACTGACAATGAGGAGCATTCAGAGTGTCTACACCGTGGCCACGCCTTCTCACCGATGCTGAGGAGCACCGAGACTGTCTGCACTGTGGCCGCCCCTTCTCACCG/A"
[5]
"chr20:60520443_GACTGTCTGCACCGTGGCCGCCCCTTCTCACTGACGATGAGGAGCACTGCGACTGTCTGCACCGTGGCCGCCCTTTCTGACTGATGATAAGGAACATTGCGACTGTCTGCACCGTGGCTGCCCCTTCTCACCAACGCTGAGGAGCACTGCAACCATCTGCACCGTGGCCGCCCCTTCTCACCGATGATGAGGAACATTGAGACTGTCTGCCCCGTGGCTGCCCCTTCTCACCGATGCTGAGGAGCACTGTGACTGTCTGCACCATGGGAGCCCCTTCTCACTGACAATGAGGAGCATTCAGAGTGTCTACACCGTGGCCGCGCCTTCTCACCGATGCTGAGGAGCACCGAGACTGTCTGCACCGTGGCCGCCCCTTCTCACCGATGACGAGGAGCACTGCGA/GC"
[6] "chr20:60520937_C/<DEL>"
[7]
"chr20:61766068_CAGAGGGGAGGCGGGTCCCGGAGGGGAGGCGGGTCCCGGAGGGGAGGCGGGTCCCGGAGGGGAGGCAGGGCCACAGGGGAGGCAGGGCCCAGAGAGGAGGCGGGGCCACAGGGGAGGCGGGTCCCGGAGGGGAGGCGGGTCCCGGAGGGGAGGCAGGGCCAC/CGG"
[8] "chr20:62063686_T/<DUP:TANDEM>"
sessionInfo()
R version 3.0.2 Patched (2013-10-27 r64116)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] VariantAnnotation_1.8.6 Rsamtools_1.14.2 Biostrings_2.30.1
[4] GenomicRanges_1.14.3 XVector_0.2.0 IRanges_1.20.5
[7] BiocGenerics_0.8.0
loaded via a namespace (and not attached):
[1] AnnotationDbi_1.24.0 Biobase_2.22.0 biomaRt_2.18.0
[4] bitops_1.0-6 BSgenome_1.30.0 DBI_0.2-7
[7] GenomicFeatures_1.14.2 RCurl_1.95-4.1 RSQLite_0.11.4
[10] rtracklayer_1.22.0 stats4_3.0.2 tools_3.0.2
[13] XML_3.98-1.1 zlibbioc_1.8.0
***** With the following VCF test.vcf:
##fileformat=VCFv4.1
#CHROM POS ID REF ALT QUAL FILTER INFO
chr20 14855644 DEL:561590:0:1:0:0:0 C <DEL> . PASS
.
chr20 29627290 BND:81424:0:1:1:1 G [chr2:114173319[G
. MaxDepth .
chr20 35365307 BND:54200:0:1:0:1 T ]chr1:230941520]T
. PASS .
chr20 60520225 DEL:572151:1:1:6:4:0
AACGATGAGGAGCATCGCGGCTGTCTGCACCATGGGAGCCCCTTCTCACTGACAATGAGGAGCATTCAGAGTGTCTACACCGTGGCCACGCCTTCTCACCGATGCTGAGGAGCACCGAGACTGTCTGCACTGTGGCCGCCCCTTCTCACCG
A
. PASS .
chr20 60520443 DEL:572151:1:1:6:4:1
GACTGTCTGCACCGTGGCCGCCCCTTCTCACTGACGATGAGGAGCACTGCGACTGTCTGCACCGTGGCCGCCCTTTCTGACTGATGATAAGGAACATTGCGACTGTCTGCACCGTGGCTGCCCCTTCTCACCAACGCTGAGGAGCACTGCAACCATCTGC
ACCGTGGCCGCCCCTTCTCACCGATGATGAGGAACATTGAGACTGTCTGCCCCGTGGCTGCCCCTTCTCACCGATGCTGAGGAGCACTGTGACTGTCTGCACCATGGGAGCCCCTTCTCACTGACAATGAGGAGCATTCAGAGTGTCTACACCGTGGCCGCGCCTTCTCACCGATGCTGAGGAGCACCGAGACTGTCTGCACCGTGGC
CGCCCCTTCTCACCGATGACGAGGAGCACTGCGA GC . PASS .
chr20 60520937 DEL:572151:1:1:11:0:0 C <DEL> . PASS
.
chr20 61766068 DEL:572433:0:0:5:2:0
CAGAGGGGAGGCGGGTCCCGGAGGGGAGGCGGGTCCCGGAGGGGAGGCGGGTCCCGGAGGGGAGGCAGGGCCACAGGGGAGGCAGGGCCCAGAGAGGAGGCGGGGCCACAGGGGAGGCGGGTCCCGGAGGGGAGGCGGGTCCCGGAGGGGAGGCAGGGCC
AC CGG . PASS .
chr20 62063686 DUP:TANDEM:572544:0:0:8:0:0 T <DUP:TANDEM>
. PASS .
Thanks
Jennifer
Jennifer Becq
Bioinformatics Scientist
Illumina Cambridge Ltd
Tel: +44 (0) 1799 532300
email: jb...@illumina.com
-----Original Message-----
From: Valerie Obenchain [mailto:voben...@fhcrc.org]
Sent: 20 November 2013 17:28
To: Becq, Jennifer; bioc-devel@r-project.org
Subject: Re: VariantAnnotation writeVcf problem
Hi Jennifer,
I've cc'd your message to the Bioconductor mailing list. We have two lists, one
for general questions and the other for bug reports/feature requests. Please
post future questions to one of these lists instead of sending them to a single
person. The lists reach a wider audience and others can chime in with their
responses/experience. You can find info about the mailing lists here,
http://www.bioconductor.org/help/mailing-list/
writeVcf() should only write out '.' for ID if the ID is missing. There is no
restriction on the format of the ID. Can you provide a small sample of the vcf
file you're having trouble with (just a few lines is enough)? Also include the
output of your sessionInfo().
Valerie
On 11/15/2013 08:56 AM, Becq, Jennifer wrote:
Hi Valerie,
I've been using VariantAnnotation for quite a while now and it's been great!
However I've just encountered a problem:
If I read in a VCF and re-write it directly, the ID column has
disappeared and becomes "." instead of the original
"DEL:9586:0:1:0:0:0", even though the rownames of my VCF object are
correctly populated with the original ID column.
> library(VariantAnnotation)
> in1 = readVcf("my.vcf.gz", "hg19")
> writeVcf(in1, "test.vcf")
I was wondering if that was because ID only accepts a specific format
(rsID or chr:pos)?
Thank you for your help
Jennifer
*Jennifer Becq*
*Bioinformatics Scientist*
*Illumina Cambridge Ltd*
Tel: +44 (0) 1799 532300
email: jb...@illumina.com <mailto:jb...@illumina.com>
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel