Ludwig,

If you do this search on the UCSC genome browser (which this annotation
package is built from), you will see that the longest variant is what is
shown:

http://genome.ucsc.edu/cgi-bin/hgTracks?clade=mammal&org=Human&db=hg38&position=brca1&hgt.positionInput=brca1&hgt.suggestTrack=knownGene&Submit=submit&hgsid=429339723_8sd4QD2jSAnAsa6cVCevtoOy4GAz&pix=1885

If instead of "genes" you do "transcripts", you will see 20 different
transcripts for this gene, including the one listed by NCBI.

I haven't tried it yet (I haven't upgraded R or Bioconductor to the latest
version), but there is now an Ensembl-based annotation package as well,
which may work better:

http://bioconductor.org/packages/release/data/annotation/html/EnsDb.Hsapiens.v79.html

-Robert

On Wed, Jun 3, 2015 at 7:04 AM Ludwig Geistlinger
<ludwig.geistlin...@bio.ifi.lmu.de> wrote:

> Dear Bioc annotation team,
>
> Querying TxDb.Hsapiens.UCSC.hg38.knownGene for gene coordinates, e.g. for
>
> BRCA1; ENSG00000012048; entrez:672
>
> via
>
> > genes(TxDb.Hsapiens.UCSC.hg38.knownGene, vals=list(gene_id="672"))
>
> gives me:
>
> GRanges object with 1 range and 1 metadata column:
>       seqnames               ranges strand |     gene_id
>          <Rle>            <IRanges>  <Rle> | <character>
>   672    chr17 [43044295, 43170403]      - |         672
>   -------
>   seqinfo: 455 sequences (1 circular) from hg38 genome
>
> However, querying Ensembl and NCBI Gene
>
> http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000012048
> http://www.ncbi.nlm.nih.gov/gene/672
>
> the gene is located at (note the difference in the end position)
>
> Chromosome 17: 43,044,295-43,125,483 reverse strand
>
> How is the inconsistency explained, and how can I extract an
> Ensembl/NCBI-conformant annotation from the TxDb object?
> (I am aware of biomaRt, but I want to explicitly use the Bioc annotation
> functionality.)
>
> Thanks!
> Ludwig
>
> --
> Dipl.-Bioinf. Ludwig Geistlinger
>
> Lehr- und Forschungseinheit für Bioinformatik
> Institut für Informatik
> Ludwig-Maximilians-Universität München
> Amalienstrasse 17, 2. Stock, Büro A201
> 80333 München
>
> Tel.: 089-2180-4067
> eMail: ludwig.geistlin...@bio.ifi.lmu.de
>
> _______________________________________________
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
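
On the question in the quoted message about where the longer end coordinate comes from, here is a minimal sketch, assuming the TxDb.Hsapiens.UCSC.hg38.knownGene package is installed. It lists the individual transcripts annotated for Entrez gene 672 and then takes their overall span. As far as I understand it, genes() reports the span across all transcripts of a gene, so a single long transcript is enough to extend the end position. The variable names (txdb, tx_by_gene, brca1_tx, brca1_span) are only illustrative, and the transcripts returned will depend on the package version.

```r
# Assumes Bioconductor with this annotation package installed.
library(TxDb.Hsapiens.UCSC.hg38.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene

# Group all transcripts by Entrez gene ID (returns a GRangesList).
tx_by_gene <- transcriptsBy(txdb, by = "gene")

# Transcripts annotated for BRCA1 (Entrez ID 672).
brca1_tx <- tx_by_gene[["672"]]
print(brca1_tx)

# Transcript widths, longest first.
print(sort(width(brca1_tx), decreasing = TRUE))

# Overall span of these transcripts; compare with the genes() result.
brca1_span <- range(brca1_tx)
print(brca1_span)
```

If a single transcript model matching the NCBI/Ensembl coordinates is wanted, it can probably be picked out of brca1_tx by its tx_name column rather than relying on the gene-level range.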