Age | Commit message (Collapse) | Author | Files | Lines |
|
Handle when after reduction |y| > pi/4+tiny. This happens in directed
rounding modes because the fast round to int code does not give the
nearest integer. In such cases the reduction may not be symmetric
between x and -x so e.g. cos(x)==cos(-x) may not hold (but polynomial
evaluation is not symmetric either with directed rounding so fixing
that would require more changes with bigger performance impact).
The fix only adds two predictable branches in nearest rounding mode,
simple ubenchmark does not show relevant performance regression in
nearest rounding mode.
The code could be improved: e.g reducing the medium size threshold
such that two step reduction is enough instead of three, and the
single precision case can avoid the issue by doing the round to int
differently, but this fix was kept minimal.
|
|
a conforming compiler for an arch with excess precision floating point
(FLT_EVAL_METHOD!=0; presently i386 is the only such arch supported)
computes all intermediate results in the types float_t and double_t
rather than the nominal type of the expression. some incorrect
compilers, however, only keep excess precision in registers, and
convert down to the nominal type when spilling intermediate results to
memory, yielding unpredictable results that depend on the compiler's
choices of what/when to spill. in particular, this happens on old gcc
versions with -ffloat-store, which we need in order to work around
bugs where the compiler wrongly keeps explicitly-dropped excess
precision.
by explicitly converting to double_t where expressions are expected be
be evaluated in double_t precision, we can avoid depending on the
compiler to get types correct when spilling; the nominal and
intermediate precision now match. this commit should not change the
code generated by correct compilers, or by old ones on non-i386 archs
where double_t is defined as double.
this fixes a serious bug in argument reduction observed on i386 with
gcc 4.2: for values of x outside the unit circle, sin(x) was producing
results outside the interval [-1,1]. changes made in commit
0ce946cf808274c2d6e5419b139e130c8ad4bd30 were likely responsible for
breaking compatibility with this and other old gcc versions.
patch by Szabolcs Nagy.
|
|
the idiomatic rounding of x is
n = x + toint - toint;
where toint is either 1/EPSILON (x is non-negative) or 1.5/EPSILON
(x may be negative and nearest rounding mode is assumed) and EPSILON is
according to the evaluation precision (the type of toint is not very
important, because single precision float can represent the 1/EPSILON of
ieee binary128).
in case of FLT_EVAL_METHOD!=0 this avoids a useless store to double or
float precision, and the long double code became cleaner with
1/LDBL_EPSILON instead of ifdefs for toint.
__rem_pio2f and __rem_pio2 functions slightly changed semantics:
on i386 a double-rounding is avoided so close to half-way cases may
get evaluated differently eg. as sin(pi/4-eps) instead of cos(pi/4+eps)
|
|
- remove the HAVE_EFFICIENT_IRINT case: fn is an exact integer, so
it can be converted to int32_t a bit more efficiently than with a
cast (the rounding mode change can be avoided), but musl does not
support this case on any arch.
- __rem_pio2: use double_t where possible
- __rem_pio2f: use less assignments to avoid stores on i386
- use unsigned int bit manipulation (and union instead of macros)
- use hexfloat literals instead of named constants
|
|
gcc did not always drop excess precision according to c99 at assignments
before version 4.5 even if -std=c99 was requested which caused badly
broken mathematical functions on i386 when FLT_EVAL_METHOD!=0
but STRICT_ASSIGN was not used consistently and it is worked around for
old compilers with -ffloat-store so it is no longer needed
the new convention is to get the compiler respect c99 semantics and when
excess precision is not harmful use float_t or double_t or to specialize
code using FLT_EVAL_METHOD
|
|
|
|
thanks to the hard work of Szabolcs Nagy (nsz), identifying the best
(from correctness and license standpoint) implementations from freebsd
and openbsd and cleaning them up! musl should now fully support c99
float and long double math functions, and has near-complete complex
math support. tgmath should also work (fully on gcc-compatible
compilers, and mostly on any c99 compiler).
based largely on commit 0376d44a890fea261506f1fc63833e7a686dca19 from
nsz's libm git repo, with some additions (dummy versions of a few
missing long double complex functions, etc.) by me.
various cleanups still need to be made, including re-adding (if
they're correct) some asm functions that were dropped.
|