> -----Original Message-----
> From: Charlie
> Sent: Wednesday, February 18, 2004 8:56 AM
> To: Warren DeLano
> Subject: Re: [PyMOL] selecting multiple atoms ie oxygen
>
> Warren DeLano wrote:
> > John,
> >
> > color red, 5paa and elem o
> >
> > The problem with using asterices as wildcards in atom names is that
> > some ill-conceived PDB files actually use them in atom names.
> >
> > However, PyMOL does support the use of a terminal wildcard in some
> > cases, such as with the delete command...
> >
> > create obj01, none
> > create obj02, none
> > delete obj*
> >
> > And with residue names
> >
> > color red, as*
> > color blue, gl*
> > color pink, hi*
> >
> > Cheers,
> > Warren
>
> Hi Warren,
> I've struggled with this a couple of times.
>
> Could it be worth finding another way round, so that a
> consistent role for * exists.
>
> Might it not be better to require users to somehow escape *'s
> in atom names and then allow * as a wildcard. Although this
> might be less clear for newbies, the lack of * as a wilcard
> in atom names is currently unclear. I don't know much about
> python, but the unix system of escaping special chars with a
> backslash would be an option, then one could select
> foo,(bar//A/50/*1\*) to get atoms O1* and C1* from the molecule.
>
> You've probably been through this and have a perfectly sound
> reason for not doing it !
>
The newbie issue is what troubled me. Unix hacks know how to edit PDB files to replace asterisks with something more benign, and they know to escape common wildcards, but the ordinary person does not.

Clearly we can't make everyone happy, so perhaps consistency should be the guide? On the other hand, we are talking about a huge portion of the PDB. Nearly every nucleic acid structure seems to suffer from this unfortunate naming convention (~5000 PDB entries contain "C1\*" according to grep of a recent copy of the PDB).

It is true that well-established conventions already exist for handling asterisks, but I don't believe in following conventions blindly, particularly when so many people would be negatively affected.

I welcome further discussion on this point. Guidance from the community will be crucial, since I don't have a good solution in mind yet. Some food for thought:

1) Are atom name wildcards really needed when a more precise way of selecting by element symbol already exists?

2) If so, then what are the proposals?

   a. Escape non-wildcard asterisks with a backslash? (This is the regexp convention, but it would trip up newbies, break current PyMOL scripts, and inconvenience a whole field of research.)

   b. Escape wildcard asterisks in atom names with a backslash? (That would be backwards from the standard convention and create further confusion.)

   c. Add a configurable "atom_name_wildcard" toggle?

   d. Support alternative wildcards for atom names? Would "." or ".*" work?

Also note that PyMOL currently doesn't have a regexp engine, and it doesn't support full wildcards, just terminal asterisks in a few situations. Full regexp matching (and thus full adherence to convention) would be a nice addition in the future, but it will need to be configurable as well, probably via some global setting like "regexp_based_matching".

Cheers,
Warren
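
To make proposal 2a concrete, here is a minimal Python sketch of what backslash-escaped asterisks could look like. It is only an illustration and not PyMOL code: the function names atom_pattern_to_regex and select_atoms are hypothetical, the atom names are a made-up list, and it leans on Python's standard re module rather than anything PyMOL provides.

```python
import re

def atom_pattern_to_regex(pattern):
    """Translate an atom-name pattern into a compiled regex.

    Assumed convention (proposal 2a, hypothetical): a bare '*' matches
    any run of characters, while a backslash-escaped '*' is a literal
    asterisk in the atom name.
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Escaped character: match the next character literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
        elif ch == "*":
            # Unescaped asterisk: wildcard for zero or more characters.
            parts.append(".*")
            i += 1
        else:
            parts.append(re.escape(ch))
            i += 1
    # \Z anchors the end so the whole atom name has to match.
    return re.compile("".join(parts) + r"\Z")

def select_atoms(names, pattern):
    """Return the atom names that match the pattern, in input order."""
    rx = atom_pattern_to_regex(pattern)
    return [name for name in names if rx.match(name)]

# Made-up atom names in the style of a nucleic acid residue.
names = ["C1*", "O1*", "C1", "O1P", "N1", "C2*"]

print(select_atoms(names, r"*1\*"))  # ['C1*', 'O1*']
print(select_atoms(names, "C1*"))    # ['C1*', 'C1']
```

The second call shows the cost mentioned under 2a: a script that writes C1* today to mean the atom literally named C1* would silently start matching C1 as well.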