Age | Commit message (Collapse) | Author | Files | Lines |
|
there were two problems:
* omitted underflow on subnormal results: exp2l(-16383.5) was calculated
as sqrt(2)*2^-16384, the last bits of sqrt(2) are zero so the down scaling
does not underflow eventhough the result is in subnormal range
* spurious underflow for subnormal inputs: exp2l(0x1p-16400) was evaluated
as f2xm1(x)+1 and f2xm1 raised underflow (because inexact subnormal result)
the first issue is fixed by raising underflow manually if x is in
(-32768,-16382] and not integer (x-0x1p63+0x1p63 != x)
the second issue is fixed by treating x in (-0x1p64,0x1p64) specially
for these fixes the special case handling was completely rewritten
|
|
underflow is raised by an inexact subnormal float store,
since subnormal operations are slow, check the underflow
flag and skip the store if it's already raised
|
|
with naive exp2l(x*log2e) the last 12bits of the result was incorrect
for x with large absolute value
with hi + lo = x*log2e is caluclated to 128 bits precision and then
expl(x) = exp2l(hi) + exp2l(hi) * f2xm1(lo)
this gives <1.5ulp measured error everywhere in nearest rounding mode
|
|
exp(inf), exp(-inf), exp(nan) used to raise wrong flags
|
|
exponents (base 2) near 16383 were broken due to (1) wrong cutoff, and
(2) inability to fit the necessary range of scalings into a long
double value.
as a solution, we fall back to using frndint/fscale for insanely large
exponents, and also have to special-case infinities here to avoid
inf-inf generating nan.
thankfully the costly code never runs in normal usage cases.
|
|
up to 30% faster exp2 by avoiding slow frndint and fscale functions.
expm1 also takes a much more direct path for small arguments (the
expected usage case).
|
|
infinities were getting converted into nans. the new code simply tests
for infinity and replaces it with a large magnitude value of the same
sign.
also, the fcomi instruction is apparently not part of the i387
instruction set, so avoid using it.
|
|
|