define FP_FAST_FMA* when fma* can be inlined

FP_FAST_FMA can be defined if "the fma function generally executes about as fast as, or faster than, a multiply and an add of double operands", which can only be true if the fma call is inlined as an instruction.

gcc sets __FP_FAST_FMA if __builtin_fma is inlined as an instruction, but that does not mean an fma call will be inlined (e.g. the macro is still defined with -fno-builtin-fma). other compilers (clang) don't even have such a macro, but this is the closest we can get.

(even if the libc fma implementation is a single instruction, the extern call overhead is already too big when the macro is used to decide between x*y+z and fma(x,y,z), so it cannot be based on libc only. defining the macro unconditionally on targets which have fma in the base isa is also incorrect: the compiler might not inline fma anyway.)

this solution works with gcc unless fma inlining is explicitly turned off.
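
as an illustration of the use case described above, a caller might select between fma(x,y,z) and x*y+z at compile time roughly as follows. this is only a sketch: mul_add is a hypothetical helper name that is not part of musl or of this patch, and the two branches can round differently in general (fma rounds once, x*y+z may round twice).

```c
#include <math.h>

/* hypothetical helper, not part of musl: computes x*y+z, using fma()
 * only when <math.h> advertises it as fast via FP_FAST_FMA */
static double mul_add(double x, double y, double z)
{
#ifdef FP_FAST_FMA
	/* the compiler reported that fma is inlined as an instruction */
	return fma(x, y, z);
#else
	/* no fast fma advertised: avoid a possible extern call */
	return x * y + z;
#endif
}
```

for inputs where the product and the sum are exactly representable, such as mul_add(2.0, 3.0, 4.0), both branches return the same value.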
author: Szabolcs Nagy <nsz@port70.net> 2018-10-16 22:20:39 +0000
committer: Rich Felker <dalias@aerifal.cx> 2019-04-17 13:02:47 -0400
commit: e980ca7a571465e8a4c887a199491c2cd8d0c0ee (patch)
tree: d86a07a748a2c59bf1f12609f497cfd12f0dfc7a /include
parent: 65c8be380431eebe4d70d130bd38563f8df9a7d7 (diff)
download: musl-e980ca7a571465e8a4c887a199491c2cd8d0c0ee.tar.gz
musl-e980ca7a571465e8a4c887a199491c2cd8d0c0ee.tar.bz2
musl-e980ca7a571465e8a4c887a199491c2cd8d0c0ee.tar.xz
musl-e980ca7a571465e8a4c887a199491c2cd8d0c0ee.zip
1 files changed, 12 insertions, 0 deletions
diff --git a/include/math.h b/include/math.h
index fea34686..14f28ec8 100644
--- a/include/math.h
+++ b/include/math.h
@@ -36,6 +36,18 @@ extern "C" {
 #define FP_SUBNORMAL 3
 #define FP_NORMAL    4
 
+#ifdef __FP_FAST_FMA
+#define FP_FAST_FMA 1
+#endif
+
+#ifdef __FP_FAST_FMAF
+#define FP_FAST_FMAF 1
+#endif
+
+#ifdef __FP_FAST_FMAL
+#define FP_FAST_FMAL 1
+#endif
+
 int __fpclassify(double);
 int __fpclassifyf(float);
 int __fpclassifyl(long double);