Re: [ccp4bb] pdb sequence search

sameer Sat, 23 Jun 2012 03:38:24 -0700

Hi,
  An up-to-date list of mappings between the PDB and the UniProt sequence
database is available at:


ftp://ftp.ebi.ac.uk/pub/databases/msd/sifts/csv/pdb_chain_uniprot.csv

This file maps PDB chains to UniProt accession numbers, so you can
find all PDB entries for a particular UniProt accession number.
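
As a rough sketch of how the file could be used (not an official PDBe
script): the Python below filters the CSV for one accession. It assumes
the file begins with a '#' comment line followed by a header row with
columns named PDB, CHAIN and SP_PRIMARY; please check this against the
file you actually download. The function name and the sample rows are
made up for illustration.

```python
import csv


def pdb_chains_for_uniprot(lines, accession):
    """Return sorted (pdb_id, chain) pairs mapped to a UniProt accession."""
    # Assumption: comment lines start with '#' and precede the header row.
    rows = csv.DictReader(line for line in lines if not line.startswith("#"))
    hits = set()
    for row in rows:
        # Assumption: the UniProt accession is in the SP_PRIMARY column.
        if row["SP_PRIMARY"] == accession:
            hits.add((row["PDB"], row["CHAIN"]))
    return sorted(hits)


# Made-up sample rows in the assumed layout, for illustration only.
sample = """# assumed comment line with release information
PDB,CHAIN,SP_PRIMARY,RES_BEG,RES_END,PDB_BEG,PDB_END,SP_BEG,SP_END
101m,A,P02185,1,154,0,153,1,154
102l,A,P00720,1,40,1,40,1,40
1mbn,A,P02185,1,153,1,153,2,154
""".splitlines()

print(pdb_chains_for_uniprot(sample, "P02185"))
```

For the real file, pass an open file object instead of `sample`, e.g.
`with open("pdb_chain_uniprot.csv") as fh: pdb_chains_for_uniprot(fh, "P02185")`.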

To answer the original question about sequence search: the following PDBe
service

pdbe.org/fasta

allows you to set a % identity value and search against PDB sequences.
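
If it is easier to think in terms of "no more than N mutations", as Ed
mentions in the quoted message, the count can be turned into a %
identity value with simple arithmetic. A minimal sketch, assuming
identity is counted over the full query length and ignoring gaps and
partial alignments (the service may compute identity differently); the
function name is hypothetical.

```python
def min_identity_percent(length, max_mutations):
    """Lowest % identity that still allows at most max_mutations mismatches."""
    if length <= 0:
        raise ValueError("length must be positive")
    if not 0 <= max_mutations <= length:
        raise ValueError("max_mutations must be between 0 and length")
    # Identical positions divided by the full query length.
    return 100.0 * (length - max_mutations) / length


# Example: a 154-residue query with up to 5 substitutions allowed.
print(round(min_identity_percent(154, 5), 2))
```

For a 154-residue query this gives roughly 96.75, so a cutoff just
below that value would keep hits with up to 5 substitutions.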

cheers,
Sameer Velankar
PDBe

>
> Hi Ed, if you are looking for a specific protein, why not get all PDB
> files with a DBREF record pointing at the UniProt record of the protein
> you want? You can do a simple text search in the PDB, e.g. 'MYG_PHYCA'.
> Cheers,
> Robbie
>
>> Date: Fri, 22 Jun 2012 22:39:12 -0400
>> From: epozh...@umaryland.edu
>> Subject: Re: [ccp4bb] pdb sequence search
>> To: CCP4BB@JISCMAIL.AC.UK
>>
>> Tim,
>>
>>
>> > I did not understand your objection against solution 1 - is it because
>> > it is not automated? You can sort the results by max. Ident so that
>> > you can scroll down to the limit you set yourself.
>>
>> More that it does not generate a list of PDB IDs.  What I want to do is
>> to find every structure of a particular protein and line them all up.  I
>> am not saying it's not doable with option 1, it's just not too
>> convenient.
>> >
>> > Why do you think an identity cut-off is a good criterion? I usually
>> > cut by E-value because I assume the developers of blast know what they
>> > are doing and I have the impression they consider the E-value a better
>> > criterion than the max. Ident.
>> Because I want all the structures of a particular protein itself, not
>> its homologues.  I just went through several cycles of reducing E-value
>> down to 1e-100, and I still get one hit included at 88% identity.
>> Setting E-value cutoff to 0 doesn't work, it just returns them all.
>> Well, thanks to you I now see how to figure out the cutoff - the results
>> are sorted by E-value and list the values, so I can just go to the first
>> non-identical hit and use a slightly smaller number.  It's just that
>> sequence identity is easier for me to interpret and it's (emotionally)
>> easier to select a cutoff at, say, no more than 5 mutations rather than
>> an E-value of 10e-150.
>>
>> Cheers,
>>
>> Ed.
>>
>> Cheers
>>
>>
>>
>> --
>> Oh, suddenly throwing a giraffe into a volcano to make water is crazy?
>>                                                  Julian, King of Lemurs
>
