I would love to know how the author tested/verified that the code is correct.
The _algorithm_ is certainly correct (this is a relatively straightforward and well-known theorem of floating-point arithmetic; see e.g. "Handbook of Floating-Point Arithmetic" by J.-M. Muller et al.).
Whether or not the code presented is a faithful translation of that algorithm is more subtle and somewhat dependent on the environment; the `AddWithError` operation requires that no excess precision is used (`FLT_EVAL_METHOD == 0`), which is the norm for compilers today but not quite universal, especially for compilers that target 32-bit x86. The `LowestBitOfMantissa`, `AddOneBitToMantissa` and `SubtractOneBitFromMantissa` operations are not implemented in the post--those aren't hard to implement, but doing it without invoking undefined behavior can be subtle (and what's actually permitted is a critical distinction between C and C++).
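Since those helpers come up again below, here's a minimal UB-free sketch of one way to write them. The names are the article's; the `memcpy` punning and the ulp-walking semantics are my assumptions, not the post's code:

```c
#include <stdint.h>
#include <string.h>

/* Sketch only: type-pun through memcpy, which is well defined in both C
   and C++ (a union pun is legal in C but not in C++, and pointer casts
   violate strict aliasing in both). */
static uint64_t BitsOf(double x)     { uint64_t b; memcpy(&b, &x, sizeof b); return b; }
static double   DoubleOf(uint64_t b) { double x;   memcpy(&x, &b, sizeof x); return x; }

static int LowestBitOfMantissa(double x) {
    return (int)(BitsOf(x) & 1);
}

/* Incrementing/decrementing the raw encoding moves one ulp away from or
   toward zero, carrying correctly across binade boundaries.  Callers must
   keep non-finite and zero inputs out: bits+1 on the largest finite value
   yields the infinity encoding, and bits-1 on +0.0 wraps to a NaN pattern. */
static double AddOneBitToMantissa(double x)        { return DoubleOf(BitsOf(x) + 1); }
static double SubtractOneBitFromMantissa(double x) { return DoubleOf(BitsOf(x) - 1); }
```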
It also may not handle some non-finite cases correctly--when the result is infinite or NaN, the error term as computed will be NaN, which passes the `errorTerm != 0.0 && LowestBitOfMantissa(value) == 0` check (NaN compares unequal to 0.0), so a bit gets subtracted from the result, rounding infinity back to a finite value. That will overflow back to infinity on the final conversion in default rounding, but in directed rounding modes it will give the wrong answer. It also gets the FP flags wrong in some cases, which doesn't matter much for most uses, but means you couldn't use it as a C standard library implementation.
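Using the helper sketches above, a guard along these lines would avoid that failure mode. The variable names come from the quoted condition; the guard itself, and the fixup body around it, are my paraphrase of the failure description, not the post's code:

```c
#include <math.h>

/* Sketch: skip the final mantissa fixup entirely when the rounded result
   is inf or NaN -- the error term is NaN in those cases, so the
   errorTerm != 0.0 test is meaningless. */
double RoundingFixupSketch(double value, double errorTerm) {
    if (!isfinite(value))
        return value;                       /* inf/NaN: nothing to adjust */
    if (errorTerm != 0.0 && LowestBitOfMantissa(value) == 0)
        value = SubtractOneBitFromMantissa(value);
    return value;
}
```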
A fully correct implementation that handles all these small details isn't really any more complicated, and can be written in ~20 lines of C without any code golfing (I wrote Apple's libm fma[f] for targets that do not have hardware support). The "hard" part is writing the thousands and thousands of tests that allow you to have some confidence in the implementation details--this can largely be automated, but it's definitely the harder side of the task.
> The "hard" part is writing the thousands and thousands of tests
Indeed that is what I was getting at -- and I wouldn't know where to start. I'd like to know more about this.
Thanks for the Muller book reference; I'll have to acquire it. I have their "Elementary Functions" book.
Interesting test cases for FMA are pretty easy to characterize, since it's a simple linear function up to rounding. The obvious edge cases include zero/infinity/nan arithmetic, cases where the product would overflow or underflow but the final result does not, and cases where the product has finite overflow/underflow but the addend is an exact infinity or zero which hides it. After those, cases with significant cancellation (e.g. fma(1+e, 1-e, -1)) are interesting, as well as cases that fall exactly on or very close to rounding boundaries (these can be constructed via integer arithmetic fairly easily; see the sketch below).
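To make that concrete, here's one halfway case built purely from small integers (my construction, not from the thread): (2^27+1) * (2^27-1) + 3 = 2^54 + 2, an exact tie that must round to 2^54 under round-to-nearest-even, while a naive multiply-then-add double-rounds its way to 2^54 + 4:

```c
#include <math.h>
#include <stdio.h>

int main(void) {
    double a = 134217729.0;   /* 2^27 + 1, exactly representable */
    double b = 134217727.0;   /* 2^27 - 1, exactly representable */
    double c = 3.0;

    /* volatile stops the compiler from contracting the separate multiply
       and add into an fma behind our back (-ffp-contract=off also works). */
    volatile double p = a * b;       /* fl(2^54 - 1) = 2^54, itself a tie */
    double naive = p + c;            /* fl(2^54 + 3) = 2^54 + 4 */
    double fused = fma(a, b, c);     /* exact sum 2^54 + 2 ties to 2^54 */

    printf("naive = %.17g\nfused = %.17g\n", naive, fused);
    return fused == 18014398509481984.0 ? 0 : 1;   /* expect 2^54 */
}
```

Sweeping constructions like this over random integer significands yields a large family of boundary tests.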
fma is also interesting because it can produce results very close to the underflow boundary (add/subtract/divide/sqrt cannot do this), so one sometimes unearths bugs specific to fma there, occasionally even hardware bugs; those cases are well worth exercising. For subnormal results, double rounding happens in a different bit position, so it's important to exercise those cases too. One also wants to exercise cases where the product underflows by a huge amount but the final result does not, because people sometimes write implementations based on integer arithmetic that incur out-of-range shift counts, which may not be correctly handled and invoke UB in C or C++.
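Two quick examples of those underflow shapes, with inputs of my own choosing and assuming round-to-nearest-even:

```c
#include <assert.h>
#include <math.h>

int main(void) {
    /* Product lands hundreds of binades below the subnormal range but the
       addend dominates; an integer-based implementation can compute a
       shift count far larger than the word size here, which is UB in C/C++. */
    assert(fma(0x1p-1000, 0x1p-1000, 1.0) == 1.0);

    /* Subnormal result: the exact product 1.5 * 2^-1074 is a tie at the
       *subnormal* rounding position (not at bit 53 of the significand)
       and must round to the even neighbor, 2^-1073. */
    assert(fma(0x1.8p-537, 0x1p-537, 0.0) == 0x1p-1073);
    return 0;
}
```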
I would imagine checking on a machine with working FMADD instructions would be a start. Exhaustive A/B might take a while...
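Something like the sketch below is the usual shape of that A/B test on hardware with a trusted fma (`my_fma` is a hypothetical software candidate). Note the use of random *bit patterns* rather than random values, so inf/NaN/subnormal encodings actually get hit, and a bitwise compare so -0.0 and NaN results are checked too (differing NaN payloads may need to be forgiven):

```c
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

double my_fma(double a, double b, double c);   /* hypothetical candidate */

static uint64_t state = 0x9E3779B97F4A7C15u;   /* any nonzero seed */
static uint64_t next64(void) {                 /* Marsaglia xorshift64 */
    state ^= state << 13;
    state ^= state >> 7;
    state ^= state << 17;
    return state;
}

static double as_double(uint64_t bits) {
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

int main(void) {
    for (long i = 0; i < 100000000L; i++) {
        double a = as_double(next64());
        double b = as_double(next64());
        double c = as_double(next64());
        double want = fma(a, b, c);            /* hardware fma, via libm */
        double got  = my_fma(a, b, c);
        if (memcmp(&want, &got, sizeof want) != 0)
            printf("mismatch: fma(%a, %a, %a): want %a, got %a\n",
                   a, b, c, want, got);
    }
    return 0;
}
```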
It doesn’t even compile (the last code snippet uses some undefined variables).
As a vaguely related observation, I recently found a bug in a JVM written in JavaScript which implemented the l2f (64-bit long to 32-bit float) instruction incorrectly as Math.fround(Number(x)), performing an intermediate conversion to double (because there's no way to directly round a JS bigint to float in one step). My workaround was to round the intermediate result to odd before calling Math.fround: https://gist.github.com/anematode/46bb73dbe6134cc861975c6443...
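In C terms, the same trick (Boldo-Melquiond round-to-odd) is: truncate toward zero, then set the low significand bit if the conversion was inexact, after which the second rounding is harmless because the double carries at least two more significand bits than the float. A sketch under the assumption of an IEEE-conforming fenv (and note that `#pragma STDC FENV_ACCESS` support varies by compiler):

```c
#include <fenv.h>
#include <stdint.h>
#include <string.h>

/* int64 -> float without double rounding: round the intermediate double
   to odd.  Round-to-odd = round toward zero, then force the low bit of
   the significand to 1 if anything was lost, so the final float rounding
   can never land on a spurious tie or lose the sticky information. */
static float i64_to_f32(int64_t x) {
    int old_round = fegetround();
    fesetround(FE_TOWARDZERO);
    feclearexcept(FE_INEXACT);
    double d = (double)x;                    /* truncated; flag records loss */
    int inexact = fetestexcept(FE_INEXACT);
    fesetround(old_round);
    if (inexact) {
        uint64_t bits;
        memcpy(&bits, &d, sizeof bits);
        bits |= 1;                           /* pick the odd neighbor */
        memcpy(&d, &bits, sizeof d);
    }
    return (float)d;                         /* second rounding is now safe */
}
```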
I think the link should be: https://drilian.com/posts/2024.12.31-emulating-the-fmadd-ins...