It may not be a stretch to say that the study of reaction kinetics has claimed more hours of chemistry graduate student labor than any other enterprise. Waiting for a reaction to go to “completion” could require hours or even days, and one must keep a watchful eye on the data collection apparatus to avoid wasted runs. There’s a good chance that guy who’s reserved the NMR all night long is battening down for a kinetics run.
All of that effort, of course, leads to supposedly valuable data. The party line in introductory chemistry courses is that under pseudo-first-order conditions, one can determine the order of a reactant in the rate law just by watching its concentration over time. We merely need to fit the data to each kinetic “scheme” (zero-, first-, and second-order kinetics) and see which fit looks best to ascertain the order. What could be simpler? The typical method, carried out by thousands (dare I say millions?) of chemistry students over the years, involves attempting to linearize the data by plotting [A] versus t, ln [A] versus t, and 1/[A] versus t. The transformation that leads to the largest R² value is declared the winner, and the rate constant and order of A are pulled directly from the “winning” equation.
Would you believe that this linearization approach is painfully wrong? Zielinski and Allendoerfer, McNaught, and others have discussed the problems with linearization at length in flowery technical prose. Try the method yourself, and I think you’ll see the problems intuitively: correlation coefficients for linearized data tend to be very large no matter what linearization scheme we choose to use. Even when we know a kinetic order through other means (say, kinetic isotope effects), kinetic schemes known to be incorrect come out looking relatively nice after linearization. I came up against the issue recently while investigating the kinetics of the addition of hydroxide to crystal violet. Linearization using first-order (ln [A] vs. t) and second-order (1/[A] vs. t) schemes gave R² values of 0.9996 and 0.9902, respectively. If you’re willing to place your bets on an “advantage” (we’ll see in a bit why it really isn’t an advantage) of less than 1%, I have a bridge to sell you…
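If you’d rather try the method on a computer than at the bench, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the data are synthetic (not my crystal violet run), generated from a first-order decay with a made-up rate constant, initial concentration, and noise level. It applies the three textbook linearizations and reports R² for each; the point is only to see how close the “wrong” schemes come to the right one.

```python
import math
import random

def r_squared(x, y):
    """R^2 of the ordinary least-squares line through (x, y).

    For a straight-line fit this equals the squared Pearson correlation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)

# Hypothetical experiment: truly first-order decay plus constant-size noise.
random.seed(1)
K_TRUE = 0.05      # assumed rate constant (arbitrary time units)
A0 = 1.0           # assumed initial concentration (arbitrary units)
NOISE_SD = 0.005   # assumed standard deviation of each [A] reading

times = [2.0 * i for i in range(11)]  # t = 0, 2, ..., 20
conc = [A0 * math.exp(-K_TRUE * t) + random.gauss(0.0, NOISE_SD) for t in times]

# The three linearizations from the introductory-course recipe.
r2_zero = r_squared(times, conc)                             # [A] vs. t
r2_first = r_squared(times, [math.log(a) for a in conc])     # ln [A] vs. t
r2_second = r_squared(times, [1.0 / a for a in conc])        # 1/[A] vs. t

print(f"zero-order   R^2 = {r2_zero:.4f}")
print(f"first-order  R^2 = {r2_first:.4f}")
print(f"second-order R^2 = {r2_second:.4f}")
```

Even though the data were built to be first-order, the zero- and second-order plots should still come out with respectable-looking R² values, which is exactly the trouble.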
The problem with linearization is not its effect on the data itself, but how it influences the variations in the data. A kinetics run is a single data set, each point of which has an identical standard deviation that depends only on the quirks of the experimental setup. We must assume the standard deviation is the same throughout the experiment, but even if it isn’t, that doesn’t affect how linearization creates problems. The point is that standard deviation is a function of the experimental setup, not the shape of the data. Mathematically, we can write the uncertainty in every measurement of [A] as d[A] = c, a constant.
Applying two different transformations to the single data set all of a sudden creates two transformed data sets with different standard deviations, both of which depend on [A]. y1 = ln [A] has a standard deviation profile of dy1 = d(ln [A]) = d[A]/[A] = c/[A]. y2 = 1/[A] has a standard deviation profile of dy2 = d(1/[A]) = −d[A]/[A]² = −c/[A]², which is c/[A]² in magnitude (the sign is irrelevant for a standard deviation). At small [A], linearization has introduced greater variation in the second-order model than it did in the first-order model…and all we did was transform the data! Ack! This effect of linearization will cause us to reject linearized second-order models too often, even when non-linear regression would’ve suggested that the second-order model was superior. Ouch.
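The c/[A] and c/[A]² profiles are easy to check numerically. The sketch below is a quick Monte Carlo in plain Python under assumed conditions: the noise level c and the three “true” concentrations are invented for illustration, and the noise is taken to be Gaussian. For each concentration it simulates many noisy readings, transforms them, and compares the observed spread with the propagated estimate.

```python
import math
import random
import statistics

random.seed(0)
C = 0.005          # assumed constant standard deviation of a raw [A] reading
N_DRAWS = 20000    # number of simulated readings per concentration

def transformed_spread(true_conc):
    """Return the observed standard deviations of ln[A] and 1/[A]."""
    readings = [random.gauss(true_conc, C) for _ in range(N_DRAWS)]
    sd_ln = statistics.stdev([math.log(a) for a in readings])
    sd_inv = statistics.stdev([1.0 / a for a in readings])
    return sd_ln, sd_inv

results = {}
for conc in (1.0, 0.5, 0.1):   # hypothetical concentrations, arbitrary units
    sd_ln, sd_inv = transformed_spread(conc)
    results[conc] = (sd_ln, sd_inv)
    print(f"[A] = {conc:.1f}: "
          f"sd(ln[A]) = {sd_ln:.4f} (c/[A] = {C / conc:.4f}), "
          f"sd(1/[A]) = {sd_inv:.4f} (c/[A]^2 = {C / conc ** 2:.4f})")
```

The raw noise is the same in every row; only the transformation changes. The spread of ln [A] should track 1/[A], while the spread of 1/[A] should track 1/[A]², so the late, low-concentration points are penalized far more heavily in the second-order plot.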
Non-linear regression is a thing, of course, and we can use it to fit the raw data without the problems associated with linearization. Don’t count on Excel’s trendlines for this, by the way: its exponential and power trendlines (shouldn’t they be called “trendcurves”?) are fit by ordinary least squares on log-transformed data, which is linearization all over again. A true non-linear fit requires something like Solver or a dedicated statistics package. Even this doesn’t solve our problem, however. Yes, we’ve eliminated the “multiplying variances” issue, but comparing R² values is still the wrong approach. The Wikipedia articles on regression model validation and Anscombe’s quartet do a nice job of explaining why. R² alone is not an adequate basis for assessing the relative fit of two statistical models.
The way to go is to use the F-test, which can be used to assess the relative fit of two models to a set of data. The F statistic is a ratio of reduced χ² statistics for the two models, each of which captures (essentially) how much the data differs from the model. An F statistic larger than a critical value indicates that one model is significantly better than the other. Packed into the F and reduced χ² values are all kinds of checks (missing from R²) to ensure that the χ² values are directly comparable and thus that the F statistic has meaning.
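Here is a rough sketch of what that comparison might look like in plain Python. It is not a full treatment, and it leans on several assumptions of mine: the data are synthetic first-order data, the measurement standard deviation is taken as known, the initial concentration is held fixed so that each model has a single fitted parameter (k), and the non-linear fit is a simple golden-section search rather than a general-purpose optimizer. All names and values are hypothetical.

```python
import math
import random

def golden_min(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if f(c) < f(d):
            hi = d
        else:
            lo = c
    return (lo + hi) / 2

def first_order(t, k, a0):
    return a0 * math.exp(-k * t)

def second_order(t, k, a0):
    return a0 / (1.0 + a0 * k * t)

def reduced_chi2(model, k, times, conc, a0, sigma, n_params=1):
    """Sum of squared, sigma-weighted residuals per degree of freedom."""
    chi2 = sum(((a - model(t, k, a0)) / sigma) ** 2 for t, a in zip(times, conc))
    return chi2 / (len(times) - n_params)

# Hypothetical data: truly first-order, constant noise on the raw readings.
random.seed(2)
K_TRUE, A0, SIGMA = 0.05, 1.0, 0.005   # all assumed values
times = [2.0 * i for i in range(11)]
conc = [first_order(t, K_TRUE, A0) + random.gauss(0.0, SIGMA) for t in times]

# Fit k for each model directly against the untransformed data.
fits = {}
for name, model in (("first", first_order), ("second", second_order)):
    k_best = golden_min(
        lambda k: reduced_chi2(model, k, times, conc, A0, SIGMA), 0.0, 1.0
    )
    fits[name] = (k_best, reduced_chi2(model, k_best, times, conc, A0, SIGMA))

# Ratio of reduced chi-squared values: worse model over better model.
f_ratio = fits["second"][1] / fits["first"][1]

for name, (k_best, chi2_red) in fits.items():
    print(f"{name}-order: k = {k_best:.4f}, reduced chi^2 = {chi2_red:.2f}")
print(f"F = {f_ratio:.1f}")
```

The resulting F would then be compared with a critical value for the appropriate degrees of freedom (10 and 10 here), taken from an F table or, if SciPy is available, from `scipy.stats.f.ppf`. A reduced χ² near 1 says the model accounts for the data to within the noise; a value far above 1 says it does not.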
This linearization issue is the kind of thing that the typical chemistry student doesn’t hear about until his/her third or fourth year of college—a classic example of “we know this isn’t quite right, but it’ll get you by in general chemistry.” It brings up the broader issue of the role of statistics in the chemistry curriculum, and the fact that there probably isn’t enough of it around in the early stages. In my view, statistics ought to be a mindset promoted by chemistry teachers rather than something that necessarily receives dedicated class time. The student needs to know when to question her results and subject them to statistical scrutiny, but the details of the exact statistical tests to use are not so important.