At work I am currently writing software that calculates some thermodynamic properties of proteins. I recently refactored some of the code and I wanted to make sure that I didn’t screw something up and change the output. So I was concerned when I compared the old and new output and discovered some differences:

Corex output before and after the refactor. The second column of numbers in each file gives the conformational entropy (Sconf).
Looking into the difference further, I finally tracked it back to a single change in the C source code. The original programmer liked to write out additions explicitly:
Sconf=Sconf+backboneEntropy+sidechainEntropy;
During the refactor I changed this to the more readable (IMHO)
Sconf += backboneEntropy + sidechainEntropy;
This small change resulted in all the numerical differences I was seeing. To better understand the reason for the difference, here’s a simple C program:
#include <stdio.h>
int main(int args, char *argv[]) {
float a, b, sum1, sum2;
a= 4.1;
b = -0.12;
sum1 = sum2 = 11.9;
sum1 = sum1 + a + b;
sum2 += a + b;
printf("=+ %f\n+= %f",sum1,sum2);
return 0;
}
This yields the output
=+ 15.880000
+= 15.879999
Here is the unoptimized assembly code for lines 9-10, commented with “pseudo-c” descriptions of what’s happening. XMM0 and XMM1 are two 128-bit registers used by 64-bit processors (like my Intel Core 2 duo) for floating point operations. However, the ‘ss’ at the end of all the operations means that only the lowest 32 bits are used in the computation, with overflow being discarded.
.loc 1 9 0 ;line 9 movss -12(%rbp), %xmm0 ; XMM0 = sum1 addss -4(%rbp), %xmm0 ; XMM0 += a addss -8(%rbp), %xmm0 ; XMM0 += b movss %xmm0, -12(%rbp) ; sum1 = XMM0 .loc 1 10 0 ;line 10 movss -4(%rbp), %xmm0 ; XMM0 = a movaps %xmm0, %xmm1 ; XMM1 = XMM0 addss -8(%rbp), %xmm1 ; XMM1 += b movss -16(%rbp), %xmm0 ; XMM0 = sum2 addss %xmm1, %xmm0 ; XMM0 += XMM1 movss %xmm0, -16(%rbp) ; sum2 = XMM0
So basically, the difference between “=…+” and “+=” is just the order of operations; the former does (sum1 + a) + b while the latter does sum1 + (a + b). The small differences in rounding behavior between these two variants add up to account for the 0.01 differences I was observing in my program.