The new PDB format (version 3) has a lot of very useful improvements, and an update is long overdue. However, I am irate that RCSB chose NOT to use the ACA meeting to discuss the changes. Instead, the format is being put into production at the same time as the ACA meeting. That essentially states that opinions expressed at the ACA do not count. There was a lot of conflict over their last attempt at an update. Instead of working to better involve the structural biology community, I feel that they are intentionally discounting our interests because working with the user community is too much effort.
Unfortunately, structural biologists generally do not want to spend time arguing about file formats, while computer scientists can carry on for weeks over minor details. This change is going to affect all of us. If you have concerns about the new format that have not been addressed, it is important to take action now. The PDB format is not just their personal database format (that's what mmCIF is for), but the format that we all use in our daily research. They don't even want to keep the PDB format at all. Its primary purpose now is to serve structural biologists. It is essential that we be part of the decision-making process.

I just sent the following letter to the wwPDB, which is where comments about the new format are supposed to go. If you will be at the ACA meeting, I encourage you to complain loudly.

Joe Krahn

-----------------------------------------------------------------------
To: [EMAIL PROTECTED]
Subject: The new PDB format is WRONG.

It seems obvious to me that the RCSB and wwPDB worked on the new format with database users' needs in mind, but have intentionally ignored the rest of the user community. RCSB manages mmCIF for database purposes, and has declared a lack of interest in even keeping the PDB format. Obviously, the primary purpose of the PDB format is to serve structural biologists working with individual structures, not database users.

Most of the updates are quite positive and beneficial, but I think that some changes are detrimental. My only serious complaint is that RCSB, and now wwPDB, seem to be ignoring the interests of much of the scientific community that they are supposed to be serving. All that I ask for is appropriate inclusion of the whole user community. This is a big change that will affect thousands of people. We should ensure that it is the best possible format update before we all have to expend a huge effort to deal with it.

I have seen many comments about the format from well-known crystallographers go ignored.
One example is the use of SegID. Most structural biologists have favored it for years, but RCSB has continued to deny it to us, on the grounds that it is not "well defined". It would be better to give it a proper definition, and allow it to be used to group together non-covalently associated groups, such as waters with a specific protein molecule. This is important because the use of ChainID for non-polymers has been banned, which also goes against the wishes of most users.

The latest change to the atom alignment rules is also detrimental. RCSB has totally broken the element alignment rules, on the baseless grounds that they were too hard to follow. The new change convolutes the rules even further, and essentially follows an earlier attempt at IUPAC hydrogen names that the community strongly rejected. At this point, the best solution is probably to make atom names completely left-justified. Again, my main concern is not that my idea be followed, but that the user community gets a fair chance to participate in the final decision.

Another problem is that the original meaning of HET groups continues to be corrupted. ATOM records are for commonly occurring residues from a list of standard residues. Water is obviously common, and should not have been converted to a HET group. HET groups have NO relationship to polymeric state. With water as a HET group, a proper PDB file for a modeller with bulk solvent would require CONECT entries for every single water.

It is also important to emphasize that the HETNAM is the actual unique ID, not the 3-letter code. The current hack is to treat everything as an ATOM, which has a pre-determined connectivity. This cannot continue forever, and we are already stuck with meaningless 3-letter codes instead of useful 3-letter abbreviations. The unique 3-letter code should be continued for now, but there should be an emphasis on beginning to use the full HETNAM, so that the inevitable switch to non-unique 3-letter codes will not have a big impact.

Thank you,
Joe Krahn