Apologies to one of my favorite chemistry blogs for the title of this series—it just fit too well!
I’ve become very interested in the field of chemoinformatics lately. It’s mind-boggling to think about how chemoinformatics could influence education as student responses are digitized. It’s a young field with a lot of potential! A series of upcoming posts will investigate some of the interesting aspects of chemoinformatics in a general sense (divorced from its most common bedfellow, chemical biology).
Here’s an interesting problem: how can one systematically and uniquely number the atoms of a molecular graph? We might need to do so, for instance, to compare two structures to see if they’re identical.
One solution would be to assign, systematically, a unique number to each atom in a structure based on connectivity. Atoms with identical connectivity are in identical chemical environments anyway, so this procedure would give us a nice way to number the atoms of a molecular graph. The toughest aspect of this solution is that little word “systematically.” Procedures that assign unique numbers to atoms must be designed so that the same numbering scheme results every time, irrespective of how the molecule is drawn. In a nutshell, the numbering must depend only on intrinsic properties of the molecular graph itself, and not at all on how it is represented.
While working for Chemical Abstracts, Morgan devised an ingenious algorithm that meets this criterion. Let’s begin by numbering each non-hydrogen atom according to its “non-hydrogen degree,” that is, the number of heavy atoms to which it is attached. Ignore multiple bonds for now.
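As a minimal sketch of this first step, here is one way to compute the non-hydrogen degrees for tyrosine in Python. The atom indices, the bond list, and the function name `heavy_atom_degrees` are my own assumptions for illustration; any input ordering of the atoms would do, which is exactly why a canonical numbering is needed later.

```python
# Heavy-atom skeleton of tyrosine as a list of bonds between arbitrary
# input indices (this particular indexing is an assumption of the sketch).
TYROSINE_BONDS = [
    (0, 1),                                            # phenol O to ring C
    (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1),    # aromatic ring
    (4, 7), (7, 8),                                    # ring C to CH2 to alpha C
    (8, 9),                                            # alpha C to N
    (8, 10), (10, 11), (10, 12),                       # carboxyl C and its two O
]


def heavy_atom_degrees(n_atoms, bonds):
    """Count heavy-atom neighbors per atom; each bond counts once,
    whatever its bond order."""
    degrees = [0] * n_atoms
    for a, b in bonds:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


print(heavy_atom_degrees(13, TYROSINE_BONDS))
# [1, 3, 2, 2, 3, 2, 2, 2, 3, 1, 3, 1, 1]
```

Only three distinct values (1, 2, and 3) come out, so the degrees alone are nowhere near enough to tell the thirteen atoms apart.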
Next, a weird, iterative addition trick assigns unique numbers to atoms based on their connectivity. For each atom, sum the current numbers of its neighbors (in the first pass, these are their degrees), and give that sum to the atom. Rinse and repeat this process until the numbers are as unique as possible, that is, until another pass no longer increases the count of distinct values. For our tyrosine example, this happens after five iterations…I’ll spare you the details, and show only the final result. Suffice it to say, if we repeated this, we wouldn’t introduce any more uniqueness in the numbers.
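The addition trick can be sketched in a few lines of Python. This is a rough illustration under stated assumptions rather than Morgan’s exact published procedure: the function name `extended_connectivity` is mine, the molecule is given as a neighbor list, and I stop as soon as a pass fails to increase the number of distinct values, keeping the values from the pass before. To keep the arithmetic easy to check by hand, the example uses the carbon chain of n-pentane instead of tyrosine.

```python
def extended_connectivity(neighbors):
    """neighbors[i] lists the heavy atoms bonded to atom i."""
    values = [len(nbrs) for nbrs in neighbors]      # start from the degrees
    n_classes = len(set(values))
    while True:
        # Each atom's new value is the sum of its neighbors' current values.
        new_values = [sum(values[j] for j in nbrs) for nbrs in neighbors]
        new_classes = len(set(new_values))
        if new_classes <= n_classes:
            return values                           # no gain in uniqueness
        values, n_classes = new_values, new_classes


# n-pentane skeleton: 0-1-2-3-4 (a hypothetical stand-in example)
pentane = [[1], [0, 2], [1, 3], [2, 4], [3]]
print(extended_connectivity(pentane))
# [2, 3, 4, 3, 2]
```

The degrees [1, 2, 2, 2, 1] have two distinct values, the first pass gives [2, 3, 4, 3, 2] with three, and the second pass gives [3, 6, 6, 6, 3], which drops back to two, so the sketch keeps the first-pass values. The two ends of the chain share a value, as do the two atoms next to them, which is what the symmetry of the chain demands.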
At this point, most of the atoms have different labels. Begin at the atom with the highest number, and assign it as “1.” Look at atom 1’s neighbors, and assign the one with the highest label as 2, the second highest as 3, etc. Then move to atom 2 and do the same for any unassigned atoms attached to it, then move to atom 3, and so on. Where ties emerge, assign the atom with the higher bond order the lower number. When all is said and done, we get…
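Here is a rough Python sketch of that numbering walk, written as a breadth-first traversal. Several things in it are my assumptions: the function name `morgan_numbering`, the input format (a list of connectivity labels plus a dictionary of bond orders), and the reading of “bond order” as the order of the bond to the atom currently being processed. Ties that survive both rules simply fall back to input order here, which a real canonicalizer would have to resolve properly.

```python
from collections import deque


def morgan_numbering(labels, bonds):
    """labels[i]: connectivity label of atom i.
    bonds: {(i, j): bond_order}. Returns {atom_index: assigned_number}."""
    neighbors = {i: [] for i in range(len(labels))}
    for (a, b), order in bonds.items():
        neighbors[a].append((b, order))
        neighbors[b].append((a, order))

    # Start at the atom with the highest label (first one wins on a tie).
    start = max(range(len(labels)), key=lambda i: labels[i])
    number = {start: 1}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        unassigned = [(j, o) for j, o in neighbors[atom] if j not in number]
        # Highest label first; on equal labels, higher bond order first.
        unassigned.sort(key=lambda pair: (-labels[pair[0]], -pair[1]))
        for j, _ in unassigned:
            number[j] = len(number) + 1
            queue.append(j)
    return number


# Pent-2-ene skeleton with the double bond between atoms 2 and 3
# (a hypothetical example; labels are those of a five-atom chain).
labels = [2, 3, 4, 3, 2]
bonds = {(0, 1): 1, (1, 2): 1, (2, 3): 2, (3, 4): 1}
print(morgan_numbering(labels, bonds))
# {2: 1, 3: 2, 1: 3, 4: 4, 0: 5}
```

Atom 2 has the highest label and becomes 1. Its two neighbors tie on label, so the one attached through the double bond (atom 3) gets the lower number.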
In an ideal world, Morgan’s and related “relaxation” algorithms (which iteratively examine the neighbors of atoms) would assign identical numbers to symmetry-equivalent atoms and different numbers to symmetry-inequivalent atoms in all cases. However, there are known examples of molecules with symmetry-inequivalent atoms that cannot be distinguished by Morgan’s algorithm. For some applications of Morgan’s algorithm in the chemical literature, check these out. The alternative proposals to the Cahn-Ingold-Prelog system are particularly intriguing!
Let’s ignore enantiotopic and diastereotopic groups for now… 😀