field | value | date |
---|---|---|
author | Szabolcs Nagy <nsz@port70.net> | 2018-10-16 22:20:39 +0000 |
committer | Rich Felker <dalias@aerifal.cx> | 2019-04-17 13:02:47 -0400 |
commit | e980ca7a571465e8a4c887a199491c2cd8d0c0ee | |
tree | d86a07a748a2c59bf1f12609f497cfd12f0dfc7a /dist | |
parent | 65c8be380431eebe4d70d130bd38563f8df9a7d7 | |
define FP_FAST_FMA* when fma* can be inlined
FP_FAST_FMA can be defined if "the fma function generally executes about
as fast as, or faster than, a multiply and an add of double operands",
which can only be true if the fma call is inlined as an instruction.
gcc sets __FP_FAST_FMA if __builtin_fma is inlined as an instruction,
but that does not mean an fma call will be inlined (e.g. the macro is
still defined with -fno-builtin-fma). other compilers (clang) don't even
have such a macro, but this is the closest we can get.
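as an illustration of the approach described above, a header could map the compiler-provided macros onto the standard ones roughly as follows. this is only a sketch: the diff itself is not shown on this page, the float and long double variants are assumed to follow the same pattern as the double one, and fp_fast_fma_tracks_compiler is a hypothetical helper added purely to make the sketch checkable.

```c
/* Sketch of definitions a libc <math.h> could carry.
   Assumption: the compiler predefines __FP_FAST_FMA, __FP_FAST_FMAF and
   __FP_FAST_FMAL when the matching __builtin_fma* expands to a single
   instruction (gcc does this; other compilers may not). */
#ifdef __FP_FAST_FMA
#define FP_FAST_FMA 1
#endif

#ifdef __FP_FAST_FMAF
#define FP_FAST_FMAF 1
#endif

#ifdef __FP_FAST_FMAL
#define FP_FAST_FMAL 1
#endif

/* Hypothetical helper, not part of any libc: returns 1 when the public
   macro is defined exactly when the compiler macro is, 0 otherwise. */
int fp_fast_fma_tracks_compiler(void)
{
#if defined(__FP_FAST_FMA) == defined(FP_FAST_FMA)
	return 1;
#else
	return 0;
#endif
}
```

on a target without a hardware fma in the selected isa level, none of the three standard macros would end up defined under this scheme.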
(even if the libc fma implementation is a single instruction, the extern
call overhead is already too big when the macro is used to decide between
x*y+z and fma(x,y,z), so the definition cannot be based on libc only.
defining the macro unconditionally on targets which have fma in the base
isa is also incorrect: the compiler might not inline fma anyway.)
this solution works with gcc unless fma inlining is explicitly turned off.
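for context, the kind of caller the message has in mind, one that uses the macro to decide between x*y+z and fma(x,y,z), might look like the sketch below. muladd is a hypothetical name, not something from musl; the example assumes a hosted C99 environment with <math.h>.

```c
#include <math.h>

/* Hypothetical helper: computes x*y + z, picking the form that is
   expected to be cheaper on this target. */
double muladd(double x, double y, double z)
{
#ifdef FP_FAST_FMA
	/* the header claims fma is about as fast as a multiply and an add,
	   so use it and get a single rounding of the exact result */
	return fma(x, y, z);
#else
	/* fma may be an extern call (or a software fallback); use the
	   plain expression, which may round twice */
	return x * y + z;
#endif
}
```

the two branches can differ in the last bit for some inputs, since fma rounds once. for operands whose product and sum are exactly representable, such as muladd(2.0, 3.0, 1.0), both give the same result.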
Diffstat (limited to 'dist')
0 files changed, 0 insertions, 0 deletions