又一个类型提升引起的Bug

在好几年前我已经中招过一次了, 没想到最近一不留神又中招一次。不过这次的花样和上次又不太一样。

Bug的起因是,我需要一个函数,根据指定速度(可能不是整数)和距离来获取到达目的点的时间,于是就有了下面这样一段代码。

//#define TIME time(NULL)
#define TIME 1526796356
time_t foo(int distance, float speed)
{
        return TIME + distance / speed;
}
int main()
{
        printf("%ld\n", foo(30, 3.0f));
}

咋一看这代码几乎没毛病,严格遵循牛顿大神给出来的公式来算。

然而他的输出却是`1526796416`这样一个值。在思考了将近20分钟之后,我才恍然大悟。

根据《The C Programming Language》第二版中的A6.5节算术转换一节的内容

`
First, if either operand is long double, the other is converted to long double. Otherwise, if either operand is double, the other is converted to double.

Otherwise, if either operand is float, the other is converted to float.

Otherwise, the integral promotions are performed on’ both operands; then, if either operand is unsigned long int, the other is converted to unsigned long int

Otherwise, if one operand is long int and the other is unsigned int, the effect depends on whether a long int can represent all values of an unsigned int; if so, the unsigned int operand is converted to long int; if not, both are converted to unsigned long int

Otherwise, if one operand is long int, the other is converted to long int

Otherwise, if either operand is unsigned int, the other is converted to unsigned int

Otherwise, both operands have type int

There are two changes here. First, arithmetic on float operands may be done in single precision, rather than double; the first edition specified that all floating arithmetic was double precision. Second, shorter unsigned types, when combined with a larger signed type, do not propagate the unsigned property to the result type; in the first edition, the unsigned always dominated. The new rules are slightly more complicated, but reduce somewhat the surprises that may occur when an unsigned quantity meets signed. Unexpected results may still occur when an unsigned expression is compared to a signed expression of the same size.
`

代码`return TIME + distance / speed`在实际执行时会被转换成`return (float)TIME + (float)distance / speed`来执行。

之所以思考了那么久才发现问题,是因为在写代码时就知道编译器会进行类型提升。而从理论上来讲,编译器在进行数学运算时,总是会向最大值更大的类型进行转换,以保证隐式转换的正确性。所以首先就把这种错误是类型提升造成的可能给排除了。

这种理论本身也是正确的,因为float的最大值为`340282346638528859811704183484516925440`,可是我忽略了一个很重要的事实,就是float即然叫单精度,就说明它本身是有精度的。


来看一下float的构成.


Float example.svg

float由1个符号位,8个指数位和23位尾数位构成,而精度完全是由23位尾数控制的,因此虽然float可以表示很大,但那是通过指数位进行跳跃(乘以2的N次方)来得到了,换句话说,float能表示的数其实是不连续的,而我们现在的time(NULL)值`1526796356(0101 1011 0000 0001 0000 0100 0100b)`已经使用了31位二进制了,它的二进制有效数字位为29位,已经超出了float中的尾数位。因此在int转向float时必然会丢失精度。

弄清楚了问题之后,只需要加个类型强制转换就可以解决问题了:

//#define TIME time(NULL)
#define TIME 1526796356
time_t foo(int distance, float speed)
{
        return TIME + (time_t)(distance / speed);
}

BTW. 在查阅文档时,发现了一个有意思的事,《The C Program Language》第二版 在算术类型转换一节中特意标注,在第一版中,所有的float类型算术运算都是先转换成double再进行,而第二版的描述是float类型算术运算‘有可能’直接进行,而不是转换成double之后再进行。

发表评论

− one = five