Hey guys,

Thanks for your responses!
This is exactly the kind of feedback I wanted, to check that what I intend
makes sense.

The resource to be used is UniProtKB. Preview the first 10 entries:
http://www.uniprot.org/uniprot/?sort=&desc=&compress=no&query=&fil=organism:%22Homo%20sapiens%20(Human)%20[9606]%22&limit=10&force=no&preview=true&format=txt

What will it look like? Similar to org.Hs.eg.db, using the
AnnotationDbi::select interface etc.

Why not org.Hs.eg.db?

·         Many UniProt accessions are simply not present in org.Hs.eg.db. Take
non-canonical isoforms: they are poorly represented in org.Hs.eg.db, but are
essential in LCMS proteomics.

·         org.Hs.eg.db does not offer an easy way to map UniProt accessions to
the UniProt summary. I would include that in the org.Hs.uniprot.db package.

·         In LCMS proteomics, protein identification is performed by comparing 
the observed MS spectra to those you would expect from an organism-specific 
protein sequence database. Using the same protein sequence database for 
annotation as is being used for identification would provide a one-to-one 
mapping between analysis and annotation.
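
To make the accession-to-summary mapping a bit more concrete, here is a rough
sketch of how the UniProtKB text format could be turned into such a lookup. It
is written in Python only to keep it self-contained (the package itself would
be R). The record is a made-up toy entry that merely follows the ID/AC/CC
layout of the text format, and parse_summaries is a hypothetical helper, not
part of any existing package.

```python
# Toy record in the UniProtKB flat-text layout: two-letter line codes,
# content from column 6 onwards, and '//' closing each entry.
# The accessions and wording are invented for illustration only.
SAMPLE = """\
ID   TOY1_HUMAN              Reviewed;         100 AA.
AC   X00001; X00002;
CC   -!- FUNCTION: Toy protein used to illustrate
CC       the parsing of a summary paragraph.
CC   -!- SUBUNIT: Monomer.
//
"""


def parse_summaries(text):
    """Map every accession of each entry to its FUNCTION comment (hypothetical helper)."""
    summaries = {}
    accessions, function_parts, in_function = [], [], False
    for line in text.splitlines():
        code, content = line[:2], line[5:].strip()
        if line.startswith("//"):
            # End of entry: store the summary under all of its accessions.
            for accession in accessions:
                summaries[accession] = " ".join(function_parts)
            accessions, function_parts, in_function = [], [], False
        elif code == "AC":
            accessions += [a.strip() for a in content.split(";") if a.strip()]
            in_function = False
        elif code == "CC":
            if content.startswith("-!-"):
                # A new comment topic starts; only FUNCTION is collected here.
                in_function = content.startswith("-!- FUNCTION:")
                if in_function:
                    function_parts.append(content[len("-!- FUNCTION:"):].strip())
            elif in_function:
                # Continuation line of the FUNCTION comment.
                function_parts.append(content)
        else:
            in_function = False
    return summaries


summaries = parse_summaries(SAMPLE)
print(summaries["X00001"])
```

A real package would of course store more fields than the FUNCTION comment;
the sketch only shows that the summary sits in the same file as the
accessions, so no extra resource is needed.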

Why not biomaRt?

·         A reasonably deep LCMS proteomics experiment quantifies 7000 
proteins. Retrieving annotation for these through biomaRt would be slow (an 
overnight activity). I want something that works instantaneously.

·         From what I remember, you cannot actually access the UniProt summary
field (a paragraph on what is currently known about a protein) from within
biomaRt.
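
The "instantaneous" part comes from doing the lookup locally instead of over
the network. The existing org.*.db packages are SQLite-backed, so a similar
design seems natural. Below is a minimal sketch of that idea, again in Python
for self-containedness. The table layout, the select_summaries helper and the
placeholder accessions are all assumptions made for illustration, not an
actual schema.

```python
import sqlite3

# Hypothetical schema; accessions and summaries are placeholders, not UniProt data.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE uniprot (accession TEXT PRIMARY KEY, summary TEXT)")
con.executemany(
    "INSERT INTO uniprot VALUES (?, ?)",
    [(f"ACC{i:05d}", f"placeholder summary {i}") for i in range(7000)],
)


def select_summaries(con, keys, batch=500):
    """Return {accession: summary} for the requested keys (hypothetical helper).

    Keys are queried in batches so that the number of bound parameters stays
    below SQLite's per-statement limit, which is 999 in older builds.
    """
    found = {}
    keys = list(keys)
    for start in range(0, len(keys), batch):
        chunk = keys[start:start + batch]
        marks = ",".join("?" * len(chunk))
        query = f"SELECT accession, summary FROM uniprot WHERE accession IN ({marks})"
        found.update(con.execute(query, chunk).fetchall())
    return found


# Look up all 7000 placeholder accessions plus one key that does not exist.
wanted = [f"ACC{i:05d}" for i in range(7000)] + ["MISSING"]
hits = select_summaries(con, wanted)
print(len(hits))
```

Because accession is the primary key, each batch is an indexed lookup, and
unknown keys are simply absent from the result.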

What do you guys think?
Thanks for your feedback!

Cheers,

Aditya


From: Karim Mezhoud [mailto:kmezh...@gmail.com]
Sent: Wednesday, September 13, 2017 1:35 PM
To: Vincent Carey
Cc: Aditya Bhagwat; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] Creating an org.Hs.uniprot.db package

Hi,
In general, LCMSMS generates mass/charge data for amino acids or peptides.
The goal is to identify which protein the peptides belong to.
The software used with LCMSMS can match the peptides to the UniProt database
and ranks putative proteins by score.
Could you tell us what the benefit of org.Hs.uniprot.db is over the default
UniProtKB?
Thanks,
Karim


On Wed, Sep 13, 2017 at 11:19 AM, Vincent Carey
<st...@channing.harvard.edu> wrote:
Can you say a little more about what resource will be tapped and what it
will look like? You can already use uniprot identifiers as keys into
org.Hs.eg.db.

On Tue, Sep 12, 2017 at 9:05 AM, Aditya Bhagwat
<adb2...@qatar-med.cornell.edu> wrote:

> Hey guys,
>
> I love the org.Hs.eg.db package (and similar others for other organisms).
>
> I work a lot with LCMSMS proteomics data, and I have always missed a
> similar org.Hs.uniprot.db package, so I am thinking of creating that (and
> then sharing it on BioC, to benefit fellow proteomics R users).
>
> Would you agree that this is of general interest? (Or does this in some
> form already exist and have I overlooked it?)
>
> Thanks for your feedback!
>
> Cheers,
>
> Aditya
>
>
>
> Disclaimer: This email and its attachments may be confidential and are
> intended solely for the use of the individual to whom it is addressed. If
> you are not the intended recipient, any reading, printing, storage,
> disclosure, copying or any other action taken in respect of this e-mail is
> prohibited and may be unlawful. If you are not the intended recipient,
> please notify the sender immediately by using the reply function and then
> permanently delete what you have received..
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
