Why molecular weight? That's just arbitrary.
There is a simple way of referring to a protein which avoids any
ambiguity: by its sequence. When referring to a protein, we should use
its sequence as an identifier. No ambiguity.
Now, I understand that some smart people in America are solving
proteins of more than a dozen aa in length. For these, quoting the whole
sequence could be a bit long. Fortunately this is a solved problem: all
we need to do is quote a CRC64 hash of the ASCII representation of the
protein sequence. This gives a namespace big enough that we can name
about 4 billion proteins before the probability of a name clash becomes
significant.
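
For anyone who wants to try this at home, here is a rough sketch in Python. The post above does not say which CRC64 variant to use, so the ECMA-182 polynomial here is an assumption on my part, as are the function names (crc64, protein_name, clash_probability) and the choice to upper-case the one-letter codes before hashing. The last function is the usual birthday-bound estimate for a 64-bit namespace, which is where a figure of a few billion comes from.

```python
import math

# Assumption: CRC-64/ECMA-182 parameters (init 0, no reflection, no final XOR).
# Other CRC64 variants would give different names for the same sequence.
POLY = 0x42F0E1EBA9EA3693
MASK = 0xFFFFFFFFFFFFFFFF


def crc64(data: bytes) -> int:
    """Bitwise, MSB-first CRC-64 over the given bytes."""
    crc = 0
    for byte in data:
        crc ^= byte << 56
        for _ in range(8):
            if crc & (1 << 63):
                crc = ((crc << 1) ^ POLY) & MASK
            else:
                crc = (crc << 1) & MASK
    return crc


def protein_name(sequence: str) -> str:
    """Return a 16-hex-digit name for a one-letter-code protein sequence."""
    # Assumption: whitespace is stripped and case is ignored.
    ascii_bytes = sequence.strip().upper().encode("ascii")
    return f"{crc64(ascii_bytes):016X}"


def clash_probability(n_proteins: int, bits: int = 64) -> float:
    """Birthday-bound estimate of at least one name clash among n names."""
    exponent = n_proteins * (n_proteins - 1) / 2 / 2**bits
    return -math.expm1(-exponent)


if __name__ == "__main__":
    # Hypothetical short sequence, for illustration only.
    print(protein_name("MKTAYIAKQR"))
    print(clash_probability(2**32))
```

With 2**32 names (about 4.3 billion) the estimate comes out at roughly 39%, so in practice you would want to stop handing out names somewhat before that.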
James Stroud wrote:
I think actually *naming* the proteins would be too extreme. Even the
current alpha-numeric system is overwrought. I liked it better when we
just called proteins "p75" or "p105". For instance, how many proteins in
the human genome are 75 kD, anyway? My guess is not enough to make the
situation ambiguous in any catastrophic way.