Hi Dario,

On 09/27/2016 01:00 AM, Dario Strbenac wrote:
> Good day,
>
> When importing a VCF file from VariantAnnotation's data directory into R, a
> warning is emitted.
>
> library(VariantAnnotation)
> aFile <- system.file("extdata", "hapmap_exome_chr22.vcf.gz", package =
> "VariantAnnotation")
> aSet <- readVcf(aFile, "hg19")
>
> Warning message:
> In .bcfHeaderAsSimpleList(header) :
>   duplicate keys in header will be forced to unique rownames

Header info is grouped by category (geno, info, meta) and put into
DataFrames. Within each grouping, row names are taken from different
parts of the header. In this case the warning comes from having two
'source' lines.

less hapmap_exome_chr22.vcf.gz
...
##source=CalculateGenotypePosteriors
##source=SelectVariants

These end up as element 'META' in the meta() list. As you can see, this
is a catch-all category that holds key-value pairs that don't meet other
criteria.

> meta(hdr)$META
DataFrame with 5 rows and 1 column
            Value
            <character>
fileformat  VCFv4.1
GVCFBlock   minGQ=0(inclusive),maxGQ=1(exclusive)
reference   file:///projects/cidr/Amos/amos_cidr/Analysis_Pipeline_Files/human_g1k_v37_decoy.fasta
source      CalculateGenotypePosteriors
source.1    SelectVariants

>
> Is there some problem with one of the VCF file's format which is distributed
> with VariantAnnotation ? I wouldn't expect any package data files to emit
> warnings to the end user.

It's uncommon (I think) for files to have multiple 'source' lines, but it
is not incorrect. The warning was added as a heads-up, probably after the
file was already in the package. I've added some documentation to
?scanVcfHeader about this. If others feel strongly about this, it could be
changed to a message or just removed.

Valerie

>
> R version 3.3.1 (2016-06-21)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 15.10
> VariantAnnotation 1.18.7
>
> --------------------------------------
> Dario Strbenac
> University of Sydney
> Camperdown NSW 2050
> Australia
>
> _______________________________________________
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
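
P.S. For anyone who wants to see the renaming in isolation, here is a minimal base R sketch. It is not the code inside .bcfHeaderAsSimpleList(); it only assumes the renaming behaves like make.unique(), which gives the same 'source' / 'source.1' names as the meta(hdr)$META output above. The keys and values are copied from that output; the variable names (keys, values, dup_keys, unique_keys, meta_tbl) are just for illustration.

```r
# Keys and values of the '##key=value' header lines that land in META,
# copied from the meta(hdr)$META output in this thread.
keys <- c("fileformat", "GVCFBlock", "reference", "source", "source")
values <- c(
  "VCFv4.1",
  "minGQ=0(inclusive),maxGQ=1(exclusive)",
  "file:///projects/cidr/Amos/amos_cidr/Analysis_Pipeline_Files/human_g1k_v37_decoy.fasta",
  "CalculateGenotypePosteriors",
  "SelectVariants"
)

# Which keys occur more than once? These are what trigger the warning.
dup_keys <- unique(keys[duplicated(keys)])

# Row names must be unique, so repeated keys get a numeric suffix.
# ASSUMPTION: make.unique() stands in for whatever the package does internally.
unique_keys <- make.unique(keys)

# A plain data.frame stands in for the S4Vectors DataFrame here.
meta_tbl <- data.frame(Value = values, row.names = unique_keys,
                       stringsAsFactors = FALSE)
print(meta_tbl["Value"])
```

Both 'source' values are kept; only the row name of the second one changes, so nothing from the header is lost.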