Dear Doeke,

Here is the process we recommend:

You can currently download the entire collection of ligand definitions (the Chemical Component Dictionary, described at http://www.wwpdb.org/data/ccd) as a single file: https://files.wwpdb.org/pub/pdb/data/monomers/components.cif.gz

From that file, you can parse the SMILES strings from the _pdbx_chem_comp_descriptor category.

You can also use the RCSB.org Data API, as described at https://data.rcsb.org/#gql-example-7

Select "OPEN IN EDITOR" and add a chemical descriptor component search as follows:
{
  chem_comps(comp_ids:["NAG", "EBW"]) {
    rcsb_id
    chem_comp {
      type
      formula_weight
      name
      formula
    }
    pdbx_chem_comp_descriptor {
      type
      descriptor
    }
    rcsb_chem_comp_info {
      initial_release_date
    }
  }
}
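
The same query can also be sent from a script instead of the in-browser editor. The sketch below is an illustration only: it assumes the GraphQL endpoint is https://data.rcsb.org/graphql, that it accepts a JSON POST body with a "query" field, and that the reply nests results under data.chem_comps. The helper names (build_request, extract_smiles, fetch_smiles) are hypothetical.

```python
import json
import urllib.request

GRAPHQL_URL = "https://data.rcsb.org/graphql"  # assumed endpoint


def build_request(comp_ids):
    """Build a POST request asking for the descriptors of the given IDs."""
    query = (
        "{ chem_comps(comp_ids: " + json.dumps(list(comp_ids)) + ") "
        "{ rcsb_id pdbx_chem_comp_descriptor { type descriptor } } }"
    )
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        GRAPHQL_URL, data=body, headers={"Content-Type": "application/json"}
    )


def extract_smiles(response):
    """Map rcsb_id -> list of (type, descriptor) for SMILES-type descriptors."""
    result = {}
    for comp in (response.get("data") or {}).get("chem_comps") or []:
        descriptors = comp.get("pdbx_chem_comp_descriptor") or []
        result[comp["rcsb_id"]] = [
            (d["type"], d["descriptor"])
            for d in descriptors
            if d.get("type", "").upper().startswith("SMILES")
        ]
    return result


def fetch_smiles(comp_ids):
    """Send the query over the network and return the parsed SMILES table."""
    with urllib.request.urlopen(build_request(comp_ids)) as reply:
        return extract_smiles(json.load(reply))


# Usage (needs network access):
# print(fetch_smiles(["NAG", "EBW"]))
```

For a long list of IDs it is probably kinder to the service to send them in batches rather than in one very large request.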

Then, you can get all IDs either by running a "grep data_" command on the file downloaded above, or by using the RCSB Advanced Search >> Chemical Attributes >> Chemical ID(s) >> "is not empty" <https://www.rcsb.org/search?request=%7B%22query%22%3A%7B%22type%22%3A%22group%22%2C%22logical_operator%22%3A%22and%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22group%22%2C%22logical_operator%22%3A%22and%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22group%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22terminal%22%2C%22service%22%3A%22text_chem%22%2C%22parameters%22%3A%7B%22attribute%22%3A%22rcsb_chem_comp_container_identifiers.comp_id%22%2C%22operator%22%3A%22exists%22%2C%22negation%22%3Afalse%7D%7D%5D%2C%22logical_operator%22%3A%22and%22%7D%5D%2C%22label%22%3A%22text_chem%22%7D%5D%7D%2C%22return_type%22%3A%22entry%22%2C%22request_options%22%3A%7B%22paginate%22%3A%7B%22start%22%3A0%2C%22rows%22%3A25%7D%2C%22results_content_type%22%3A%5B%22experimental%22%5D%2C%22sort%22%3A%5B%7B%22sort_by%22%3A%22score%22%2C%22direction%22%3A%22desc%22%7D%5D%2C%22scoring_strategy%22%3A%22combined%22%7D%2C%22request_info%22%3A%7B%22query_id%22%3A%22757e2e66d52ec03ec085a0c4cc2ba0ba%22%7D%7D>. Using the "Search API" button will then bring you to:

{
  "query": {
    "type": "terminal",
    "label": "text_chem",
    "service": "text_chem",
    "parameters": {
      "attribute": "rcsb_chem_comp_container_identifiers.comp_id",
      "operator": "exists",
      "negation": false
    }
  },
  "return_type": "entry",
  "request_options": {
    "paginate": {
      "start": 0,
      "rows": 25
    },
    "results_content_type": [
      "experimental"
    ],
    "sort": [
      {
        "sort_by": "score",
        "direction": "desc"
      }
    ],
    "scoring_strategy": "combined"
  }
}
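
The query above returns only the first 25 hits, so a script has to page through the results to collect every identifier. The following sketch shows one way to do that. It rests on several assumptions that are not stated in this message: that the Search API accepts JSON POST requests at https://search.rcsb.org/rcsbsearch/v2/query, that each reply carries "total_count" and a "result_set" of objects with an "identifier" field, and that an empty reply body means no hits. The helper names are hypothetical. Note that "return_type": "entry" yields PDB entry IDs; to the best of my knowledge "mol_definition" is the return type for chemical component IDs, but please check that against the Search API documentation.

```python
import json
import urllib.request

SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"  # assumed endpoint


def build_query(start, rows, return_type="entry"):
    """Same terminal query as in the message, with adjustable paging."""
    return {
        "query": {
            "type": "terminal",
            "service": "text_chem",
            "parameters": {
                "attribute": "rcsb_chem_comp_container_identifiers.comp_id",
                "operator": "exists",
                "negation": False,
            },
        },
        "return_type": return_type,
        "request_options": {"paginate": {"start": start, "rows": rows}},
    }


def post_json(payload):
    """POST a JSON payload to the Search API and decode the JSON reply."""
    request = urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        body = reply.read()
    return json.loads(body) if body else {}


def iter_identifiers(rows=1000, return_type="entry", post=post_json):
    """Yield every identifier, fetching one page of `rows` hits at a time."""
    start = 0
    while True:
        page = post(build_query(start, rows, return_type))
        hits = page.get("result_set", [])
        for hit in hits:
            yield hit["identifier"]
        start += len(hits)
        if not hits or start >= page.get("total_count", 0):
            break


# Usage (needs network access):
# all_ids = list(iter_identifiers(return_type="mol_definition"))
```

The `post` argument exists only so the paging logic can be exercised without network access, for example by passing a function that returns canned pages.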

Please feel free to write to us at i...@rcsb.org if you have any questions.

Sincerely,

Rachel Green

-------------------------------------------------------------
*RACHEL KRAMER GREEN, PH.D.*

Scientific Support & Customer Service Lead
RCSB Protein Data Bank

Rutgers, The State University of New Jersey

174 Frelinghuysen Road, Piscataway NJ 08854

rachel.gr...@rcsb.org

rcsb.org <http://rcsb.org/> | facebook <https://www.facebook.com/RCSBPDB> | twitter <https://twitter.com/buildmodels>

Coronavirus Resources: RCSB.org/covid19 <http://RCSB.org/covid19>

Undergraduates and Graduates:  Opportunities for Scientific Software-focused Summer Research, Gap Year, and Postdoctoral Research: www.rcsb.org/pages/jobs <http://www.rcsb.org/pages/jobs>

On 9/13/2024 9:28 AM, Hekstra, Doeke Romke wrote:

Hi,

Does anyone have experience (or scripts!) to extract ligand SMILES from macromolecular PDB records using the PDB data API or from a library of PDB or mmCIF files? We would greatly appreciate any pointers to get us started.

Thank you,

Doeke

=====

Doeke Hekstra

Assistant Professor of Molecular & Cellular Biology, and of Applied Physics (SEAS),

Director of Undergraduate Studies, Chemical and Physical Biology

Center for Systems Biology, Harvard University

52 Oxford Street, NW311

Cambridge, MA 02138

Office:    617-496-4740

Admin:   617-495-5651 (Lin Song)


------------------------------------------------------------------------

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/