Verification of Floating-Point-Intensive Programs: An Industry-Academic Research Program

Roberto Bagnara, BUGSENG srl and University of Parma, Italy
Roberta Gori, University of Pisa, Italy
During the last decade, the use of floating-point computations in the design of critical systems has become increasingly acceptable. Even in civil and military avionics, which are among the most critical domains for software, floating-point numbers are now seen as a sufficiently safe, faster, and cheaper alternative to fixed-point arithmetic, to the point that, in modern avionics, floating-point is the norm rather than the exception [7].
Acceptance of floating-point computations in the design of critical systems took a long time. In fact, rounding errors can cause subtle bugs that are often missed by non-experts [11] and can lead to catastrophic failures. For instance, during the first Persian Gulf War, the failure of a Patriot missile battery in Dhahran was traced to an accumulating rounding error in the continuous execution of tracking and guidance software: this failure prevented the interception of an Iraqi Scud that hit the barracks in Dhahran, Saudi Arabia, killing 28 US soldiers [12]. A careful analysis of this failure revealed that, even though the rounding error introduced at each step of the floating-point computation was very small, its propagation along a long loop-iterating path could lead to dramatic imprecision.
The adoption of floating-point computations in critical systems requires adequate software verification techniques. While the situation has improved thanks to the wide adoption of significant portions of the IEEE 754 standard for binary floating-point arithmetic [8], verifying floating-point-intensive programs remains a challenging task.
Most implementations of the C and C++ programming languages provide floating-point data types that conform to IEEE 754 as far as the basic arithmetic operations and the conversions are concerned. Some implementations push compliance further by providing other operations required by IEEE 754, such as correctly rounded fused multiply-add.
The C and C++ floating-point mathematical functions are part of the standard libraries of the respective languages. Different versions of the language standards define different, though backwards-compatible, sets of basic mathematical functions (see, e.g., [9, 10]). Access to such functions requires inclusion of the math.h header file in C, or the cmath header file in C++. While IEEE 754 provides recommendations for these functions [8, Section 9.2], no C/C++ implementation we know of complies with the recommendation that such functions be correctly rounded.
#include <math.h>
#include <stdint.h>

#define RadOfDeg(x) ((x) * (M_PI/180.))
#define E 0.08181919106 /* Computation for the WGS84 geoid only */
#define LambdaOfUtmZone(utm_zone) RadOfDeg((utm_zone-1)*6-180+3)
#define CScal(k, z) { z.re *= k; z.im *= k; }
#define CAdd(z1, z2) { z2.re += z1.re; z2.im += z1.im; }
#define CSub(z1, z2) { z2.re -= z1.re; z2.im -= z1.im; }
#define CI(z) { float tmp = z.re; z.re = - z.im; z.im = tmp; }
#define CExp(z) { float e = exp(z.re); z.re = e*cos(z.im); \
  z.im = e*sin(z.im); }
#define CSin(z) { CI(z); struct complex _z = {-z.re, -z.im}; \
  float e = exp(z.re); float cos_z_im = cos(z.im); z.re = e*cos_z_im; \
  float sin_z_im = sin(z.im); z.im = e*sin_z_im; _z.re = cos_z_im/e; \
  _z.im = -sin_z_im/e; CSub(_z, z); CScal(-0.5, z); CI(z); }

static inline float isometric_latitude(float phi, float e) {
  return log(tan(M_PI_4 + phi / 2.0))
    - e / 2.0 * log((1.0 + e * sin(phi)) / (1.0 - e * sin(phi)));
}

static inline float isometric_latitude0(float phi) {
  return log(tan(M_PI_4 + phi / 2.0));
}

void latlong_utm_of(float phi, float lambda, uint8_t utm_zone) {
  float lambda_c = LambdaOfUtmZone(utm_zone);
  float ll = isometric_latitude(phi, E);
  float dl = lambda - lambda_c;
  float phi_ = asin(sin(dl) / cosh(ll));
  float ll_ = isometric_latitude0(phi_);
  float lambda_ = atan(sinh(ll) / cos(dl));
  struct complex z_ = { lambda_, ll_ };
  CScal(serie_coeff_proj_mercator[0], z_);
  uint8_t k;
  for(k = 1; k < 3; k++) {
    struct complex z = { lambda_, ll_ };
    CScal(2*k, z);
    CSin(z);
    CScal(serie_coeff_proj_mercator[k], z);
    CAdd(z, z_);
  }
  CScal(N, z_);
  latlong_utm_x = XS + z_.im;
  latlong_utm_y = z_.re;
}

Listing 1: Reduced version of a real-world example extracted from a critical embedded system.
To illustrate the concrete problem raised by the use of floating-point arithmetic and transcendental functions in program verification settings, consider the code reproduced in Listing 1. It is a reduced version of a real-world example extracted from a critical embedded system.^{1} Some of the questions to be answered for this code are:
Concerning the second question, we call the argument of a floating-point trigonometric function ill-conditioned if its absolute value exceeds some application-dependent threshold. Ideally, this threshold should be just above π. To understand this often-overlooked programming error, consider that, if x is an IEEE 754 single-precision number and x ≥ 2^{23}, then consecutive single-precision numbers around x are at least 1 apart, so the interval [x, x + 2π) contains no more than seven floating-point numbers; for x ≥ 2^{25} it contains no more than three.^{2} Current implementations of floating-point trigonometric functions contain precise range-reduction algorithms that have no problem computing a very accurate result even for inputs much larger than 2^{23}: the point is that the input is so inaccurate that the very accurate result is, from the point of view of the application, indistinguishable from a pseudo-random number.
The collaboration has two main partners:
These partners have signed a research agreement coordinated by Roberto Bagnara (roberto.bagnara@bugseng.com, BUGSENG) and Roberta Gori (roberta.gori@unipi.it, DIPI). The collaboration is open to other researchers and institutions.
Current and past collaborators (in addition to BUGSENG personnel) include:
The objective of the collaboration is twofold:
To achieve these objectives, the collaboration combines rigorous theoretical investigation with the best practices of software engineering and experimental validation.
In [2], we presented a new version of FPSE, a research prototype for the symbolic evaluation of computation paths originating from C programs. FPSE solves so-called path conditions containing floating-point computations by exploiting, among other things, a property that is peculiar to the binary representation defined by the IEEE 754 standard [8]. This feature makes FPSE suitable for the generation of test input data covering complex computation paths that depend on floating-point computations. [2] describes some of the key choices in the implementation of FPSE and presents an experimental evaluation showing that, on ordinary PC hardware, FPSE can generate correct input data exercising computation paths containing hundreds of iterations and thousands of floating-point operations.
The work presented in [1, 5] constitutes, on the one hand, the theoretical counterpart of [2], in which all results are thoroughly proved correct; on the other hand, [1, 5] generalize the ideas of [2] to subnormal numbers and to floating-point division.
In [6] we tackled the problem of verifying software that uses mathematical libraries, which, generally speaking, come without any formal guarantee about what they compute. This would seem to make the task of (even partial) verification of such programs an impossible one. [6] presents a viable alternative to surrender: even if we do not have a proper specification for the functions implemented by mathematical libraries, we might have a partial specification; at the very least we have their implementation, and from the implementation we can sometimes extract, automatically, a partial specification. For those cases where we cannot even count on a partial specification, we can still detect serious program defects by testing, using automatic generation of test input data. The pragmatic approach put forward in [6] exploits the fact that the majority of functions provided by C and C++ mathematical libraries are almost piecewise monotonic: they may contain glitches, that is, violations of monotonicity that, however, are usually small and infrequent. This allowed us to develop effective interval refinement techniques for such functions; these techniques enable verification techniques based on abstract interpretation and symbolic model checking, as well as the automatic generation of test input data via constraint resolution. In particular, the proposed techniques make it possible to quickly and reliably compute the answers to the questions posed in Section 1 concerning the program in Listing 1.
All of this has been implemented in the ECLAIR software verification platform for C, C++ and Java source code, as well as Java bytecode (http://bugseng.com/products/eclair).
Several subjects for theses at all levels and internships are available: they range from more theoretical work to purely experimental work. For more details, get in touch with Roberto Bagnara (roberto.bagnara@bugseng.com) and Roberta Gori (roberta.gori@unipi.it).