Dear Doeke,
Here is the process we recommend:
For now, you can get the entire collection of ligand definitions from the
wwPDB Chemical Component Dictionary (http://www.wwpdb.org/data/ccd) by
downloading the file
https://files.wwpdb.org/pub/pdb/data/monomers/components.cif.gz
From there, you would parse the SMILES from the
_pdbx_chem_comp_descriptor category.
You can also use the RCSB.org Data API, as described at
https://data.rcsb.org/#gql-example-7
Select "OPEN IN EDITOR" and add a chemical component descriptor query
as follows:
{
  chem_comps(comp_ids: ["NAG", "EBW"]) {
    rcsb_id
    chem_comp {
      type
      formula_weight
      name
      formula
    }
    pdbx_chem_comp_descriptor {
      type
      descriptor
    }
    rcsb_chem_comp_info {
      initial_release_date
    }
  }
}
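If you would rather run this query from a script than from the editor, a sketch along the following lines may help. It assumes the GraphQL endpoint is https://data.rcsb.org/graphql and that the response nests the results under data.chem_comps. The function names are invented for this example, and only the standard library is used.

```python
import json
import urllib.request

# Assumption: the Data API accepts GraphQL queries POSTed to this URL.
GRAPHQL_URL = "https://data.rcsb.org/graphql"


def build_query(comp_ids):
    """Return the GraphQL query text for a list of chemical component IDs."""
    ids = json.dumps(list(comp_ids))  # e.g. ["NAG", "EBW"]
    return (
        "{ chem_comps(comp_ids: %s) { rcsb_id "
        "pdbx_chem_comp_descriptor { type descriptor } } }" % ids
    )


def extract_smiles(response):
    """Map each component ID to its SMILES and SMILES_CANONICAL strings."""
    out = {}
    for comp in (response.get("data") or {}).get("chem_comps") or []:
        descriptors = comp.get("pdbx_chem_comp_descriptor") or []
        out[comp["rcsb_id"]] = [
            d["descriptor"] for d in descriptors
            if (d.get("type") or "").startswith("SMILES")
        ]
    return out


def fetch_smiles(comp_ids, url=GRAPHQL_URL):
    """POST the query and return {comp_id: [smiles, ...]} (needs network)."""
    body = json.dumps({"query": build_query(comp_ids)}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request, timeout=30) as resp:
        return extract_smiles(json.load(resp))
```

For example, fetch_smiles(["NAG", "EBW"]) should return one list of SMILES strings per component. For a long list of IDs it is probably kinder to the server to send them in batches.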
Then, you can get all IDs either by running a "grep data_" command on the
file downloaded above, or by using the RCSB Advanced Search >> Chemical
Attributes >> Chemical ID(s) >> "is not empty"
<https://www.rcsb.org/search?request=%7B%22query%22%3A%7B%22type%22%3A%22group%22%2C%22logical_operator%22%3A%22and%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22group%22%2C%22logical_operator%22%3A%22and%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22group%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22terminal%22%2C%22service%22%3A%22text_chem%22%2C%22parameters%22%3A%7B%22attribute%22%3A%22rcsb_chem_comp_container_identifiers.comp_id%22%2C%22operator%22%3A%22exists%22%2C%22negation%22%3Afalse%7D%7D%5D%2C%22logical_operator%22%3A%22and%22%7D%5D%2C%22label%22%3A%22text_chem%22%7D%5D%7D%2C%22return_type%22%3A%22entry%22%2C%22request_options%22%3A%7B%22paginate%22%3A%7B%22start%22%3A0%2C%22rows%22%3A25%7D%2C%22results_content_type%22%3A%5B%22experimental%22%5D%2C%22sort%22%3A%5B%7B%22sort_by%22%3A%22score%22%2C%22direction%22%3A%22desc%22%7D%5D%2C%22scoring_strategy%22%3A%22combined%22%7D%2C%22request_info%22%3A%7B%22query_id%22%3A%22757e2e66d52ec03ec085a0c4cc2ba0ba%22%7D%7D>.
Using the "Search API" button will then bring you to:
{
  "query": {
    "type": "terminal",
    "label": "text_chem",
    "service": "text_chem",
    "parameters": {
      "attribute": "rcsb_chem_comp_container_identifiers.comp_id",
      "operator": "exists",
      "negation": false
    }
  },
  "return_type": "entry",
  "request_options": {
    "paginate": {
      "start": 0,
      "rows": 25
    },
    "results_content_type": [
      "experimental"
    ],
    "sort": [
      {
        "sort_by": "score",
        "direction": "desc"
      }
    ],
    "scoring_strategy": "combined"
  }
}
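The query above returns only the first 25 hits, so a script needs to page through the results. The sketch below shows one way to do that. It assumes the Search API accepts this JSON as a POST to https://search.rcsb.org/rcsbsearch/v2/query and answers with "total_count" and a "result_set" of objects carrying an "identifier" field. The function names and the fetch parameter are invented for this example.

```python
import json
import urllib.request

# Assumption: the Search API accepts the JSON query POSTed to this URL.
SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"


def build_request(start, rows, return_type="entry"):
    """Build the search request shown above for one page of results."""
    return {
        "query": {
            "type": "terminal",
            "label": "text_chem",
            "service": "text_chem",
            "parameters": {
                "attribute": "rcsb_chem_comp_container_identifiers.comp_id",
                "operator": "exists",
                "negation": False,
            },
        },
        "return_type": return_type,
        "request_options": {"paginate": {"start": start, "rows": rows}},
    }


def post_json(payload, url=SEARCH_URL):
    """POST a JSON payload and decode the JSON response (needs network)."""
    request = urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.load(resp)


def iter_identifiers(rows=1000, fetch=post_json, return_type="entry"):
    """Yield every identifier, requesting one page of `rows` hits at a time."""
    start = 0
    while True:
        page = fetch(build_request(start, rows, return_type))
        hits = page.get("result_set", [])
        for hit in hits:
            yield hit["identifier"]
        start += rows
        # Stop once the last page has been read or nothing came back.
        if not hits or start >= page.get("total_count", 0):
            break
```

With the default return_type of "entry", as in the query above, the identifiers are PDB entry IDs. The fetch argument exists only so that the paging logic can be tried without network access.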
Please feel free to write to us at i...@rcsb.org if you have any questions.
Sincerely,
Rachel Green
-------------------------------------------------------------
*RACHEL KRAMER GREEN, PH.D.*
Scientific Support & Customer Service Lead
RCSB Protein Data Bank
Rutgers, The State University of New Jersey
174 Frelinghuysen Road, Piscataway NJ 08854
rachel.gr...@rcsb.org
On 9/13/2024 9:28 AM, Hekstra, Doeke Romke wrote:
Hi,
Does anyone have experience (or scripts!) to extract ligand SMILES
from macromolecular PDB records using the PDB data API or from a
library of PDB or mmCIF files? We would greatly appreciate any
pointers to get us started.
Thank you,
Doeke
=====
Doeke Hekstra
Assistant Professor of Molecular & Cellular Biology, and of Applied
Physics (SEAS),
Director of Undergraduate Studies, Chemical and Physical Biology
Center for Systems Biology, Harvard University
52 Oxford Street, NW311
Cambridge, MA 02138
Office: 617-496-4740
Admin: 617-495-5651 (Lin Song)
------------------------------------------------------------------------
To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list
hosted by www.jiscmail.ac.uk, terms & conditions are available at
https://www.jiscmail.ac.uk/policyandsecurity/