|
lrint in (LONG_MAX, 1/DBL_EPSILON) and in (-1/DBL_EPSILON, LONG_MIN)
is not trivial: rounding to int may be inexact, but the conversion to
int may overflow and then the inexact flag must not be raised. (the
overflow threshold is rounding mode dependent).
this matters on 32bit targets (without single instruction lrint or
rint), so the common case (when there is no overflow) is optimized by
inlining the lrint logic, otherwise the old code is kept as a fallback.
on my laptop an i486 lrint call is asm:10ns, old c:30ns, new c:21ns
on a smaller arm core: old c:71ns, new c:34ns
on a bigger arm core: old c:27ns, new c:19ns
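a minimal sketch of the inlined common case, assuming a 32-bit result type (modeled here as int32_t); the names lrint32_fast, bits_of and double_of are hypothetical and this is not the committed code. when the magnitude is safely below the overflow threshold, the value is rounded by adding and subtracting 1/DBL_EPSILON and the conversion cannot overflow; everything else is left to the slower fallback.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* hypothetical helpers: reinterpret a double as its bit pattern and back */
static uint64_t bits_of(double x)
{
	uint64_t u;
	memcpy(&u, &x, sizeof u);
	return u;
}

static double double_of(uint64_t u)
{
	double x;
	memcpy(&x, &u, sizeof x);
	return x;
}

/* returns 1 and stores the rounded value when |x| < 0x7ffffc00, so the
 * conversion to a 32-bit integer cannot overflow; returns 0 when the
 * caller must take the slow path that takes care of the inexact flag */
static int lrint32_fast(double x, int32_t *out)
{
	uint32_t abstop = bits_of(x) >> 32 & 0x7fffffff;
	uint64_t sign = bits_of(x) & (1ULL << 63);

	assert(out);
	if (abstop < 0x41dfffff) {
		/* 1/DBL_EPSILON is 2^52; adding and subtracting it (with
		 * the sign of x) rounds x in the current rounding mode */
		double toint = double_of(bits_of(1 / DBL_EPSILON) | sign);
		volatile double sum = x + toint;
		*out = (int32_t)(sum - toint);
		return 1;
	}
	return 0;
}
```

the volatile temporary is only there to keep excess-precision targets from skipping the rounding step; nan, infinities and large magnitudes all fall through to the return of 0.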
|
|
analogous to commit ddc7c4f936c7a90781072f10dbaa122007e939d0 for
mips64 and n32, remove the hack to load the syscall number into $2 via
asm, and use a constraint to let the compiler load it instead.
now, only $4, $5, and $6 are potential input-only registers. $2 is
always input and output, and $7 is both when it's an argument,
otherwise output-only. previously, $7 was treated as an input (with a
"1" constraint matching its output position) even when it was not an
input, which was arguably undefined behavior (asm input from
indeterminate value). this is corrected.
as before, $8, $9, and $10 are conditionally input-output registers
for 5-, 6-, and 7-argument syscalls. their role in input is carrying
in the values that will be stored on the stack for arguments 5-7.
their role in output is carrying back whatever the kernel has
clobbered them with, so that the compiler cannot assume they still
contain the input values.
|
|
|
|
mips32 has two fpu register file variants: FR=0 with 32 32-bit
registers, where pairs of neighboring even/odd registers are used to
represent doubles, and FR=1 with 32 64-bit registers, each of which
can store a single or double.
up through r5 (our "mips" arch), the supported ABI uses FR=0, but
modern compilers generate "fpxx" model code that can safely operate
with either model. r6, which is an incompatible but similar ISA, drops
FR=0 and only provides the FR=1 model. as such, setjmp and longjmp,
which depended on being able to save and restore call-saved doubles by
storing and loading their 32-bit halves, were completely broken in the
presence of floating point code on mips r6.
to fix this, use the s.d and l.d mnemonics to store and load fpu
registers. these expand to the existing swc1 and lwc1 instructions for
pairs of 32-bit fpu registers on mips1, but on mips2 and later they
translate directly to the 64-bit sdc1 and ldc1.
with FR=0, sdc1 and ldc1 behave just like the pairs of swc1 and lwc1
instructions they replace, storing or loading the even/odd pair of fpu
registers that can be treated as separate single-precision floats or
as a unit representing a double. but with FR=1, they store/load
individual 64-bit registers. this yields the ABI-correct behavior on
mips r6, and should make linking of pre-r6 (plain "mips") code with
"fp64" model code workable, although this is and will likely remain
unsupported usage.
in addition to the mips r6 problem this change fixes, reportedly
clang's internal assembler refuses to assemble swc1 and lwc1
instructions for odd register indices when building for "fpxx" model
(the default). this caused setjmp and longjmp not to build. by using
the s.d and l.d forms, this problem is avoided too.
as a bonus, code size is reduced everywhere but mips1.
|
|
mips r6 (an isa incompatible with traditional mips) removes the hi and
lo registers used for mul/div results. older gcc versions accepted
them in the clobber list for asm, but their presence is incorrect and
breaks on later versions.
in the process of fixing this, the clobber list for 32-bit mips
syscalls has been deduplicated via a macro like on mips64 and n32.
|
|
armv8 removed the coprocessor instructions other than cp14, so
on an armv8 system the related hwcaps should never be set.
new llvm complains about the use of coprocessor instructions in
armv8-a mode (even though they are never executed at runtime),
so ifdef them out when musl is built for armv8.
|
|
in the timer thread start function, self->timer_id was accessed
without synchronization; the timer thread could fail to see the store
from the calling thread, resulting in timer_delete failing to delete
the correct kernel-level timer.
this fix is based on a patch by changdiankang, but with the load moved
to after receiving the timer_delete signal rather than just after the
start barrier, so as not to retain the possibility of data race with
timer_delete.
|
|
The operand specifiers in a_cas and a_cas_p for riscv64 were incorrect:
there's a backwards branch in the routine, so despite tmp being written
at the end of the assembly fragment it cannot be allocated in one of the
input registers because the input values may be needed for another trip
around the loop.
For code that follows the guaranteed forward progress requirements, the
backwards branch is rarely taken: SiFive's hardware only fails a store
conditional on exceptional cases (i.e., instruction cache misses inside
the loop), and until recently a bug in QEMU allowed back-to-back
store conditionals to succeed. The bug has been fixed in the latest
QEMU release, but it turns out that the fix caused this latent bug in
musl to manifest.
|
|
commit 8a544ee3a2a75af278145b09531177cab4939b41 introduced a
dependency of the failure path for explicit scheduling at thread
creation on __clone's handling of the start function returning, which
should result in SYS_exit.
as noted in commit 05870abeaac0588fb9115cfd11f96880a0af2108, the arm
version of __clone was broken in this case. in the past, the mips
version was also broken; it was fixed in commit
8b2b61e0001281be0dcd3dedc899bf187172fecb.
since this code path is pretty much entirely untested (previously only
reachable in applications that call the public clone() and return from
the start function) and consists of fragile per-arch asm, don't assume
it works, at least not until it's been thoroughly tested. instead make
the SYS_exit syscall from the start function's failure path.
|
|
commit cc3a4466605fe8dfc31f3b75779110ac93055bc1 fixed this for printf
but neglected to fix wprintf.
Previously, %lf caused a failure to output.
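a small check of the behavior described, assuming the conversion is reached through swprintf (one member of the wprintf family); format_lf is a hypothetical helper name. with a double argument the l modifier has no effect, so %lf should print the same as %f.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* hypothetical helper: format a double with %lf into a wide buffer;
 * returns the number of wide characters written, or a negative value
 * on failure */
static int format_lf(wchar_t *buf, size_t n, double x)
{
	assert(buf && n > 0);
	return swprintf(buf, n, L"%lf", x);
}
```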
|
|
we don't actually support building asm source files as thumb1, but
it's possible that the condition __ARM_ARCH>=5 would be false on old
compilers that did not define __ARM_ARCH at all. avoiding that would
require enumerating all of the possible __ARM_ARCH_*__ macros for
testing.
as noted in commit 05870abeaac0588fb9115cfd11f96880a0af2108, mov lr,pc
is not valid for saving a return address when in thumb mode. since
this code is a hot path (dynamic TLS access), don't do the out-of-line
bl->bx chaining to save the return value; instead, use the fact that
this file is preprocessed asm to add the missing thumb bit with an add
in place of the mov.
the change here does not affect builds for ISA levels new enough to
have a thread pointer read instruction, or for armv5 and later as long
as the compiler properly defines __ARM_ARCH, or for any build as arm
(not thumb) code. it's likely that it makes no difference whatsoever
to any present-day practical build environments, but nonetheless now
it's safe.
as an alternative, we could just assume __thumb__ implies availability
of blx since we don't support building asm source files as thumb1. I
didn't do that in order to avoid having a wrong assumption here if
that ever changes.
|
|
as noted in commit 05870abeaac0588fb9115cfd11f96880a0af2108, mov lr,pc
is not a valid method for saving the return address in code that might
be built as thumb.
this one is unlikely to matter, since any ISA level that has thumb2
should also have native implementations of atomics that don't involve
kuser_helper, and the affected code is only used on very old kernels
to begin with.
|
|
mov lr,pc is not a valid way to save the return address in thumb mode
since it omits the thumb bit. use a chain of bl and bx to emulate blx.
this could be avoided by converting to a .S file with preprocessor
conditions to use blx if available, but the time cost here is
dominated by the syscall anyway.
while making this change, also remove the remnants of support for
pre-bx ISA levels. commit 9f290a49bf9ee247d540d3c83875288a7991699c
removed the hack from the parent code paths, but left the unnecessary
code in the child. keeping it would require rewriting two code paths
rather than one, and is useless for reasons described in that commit.
|
|
AT_HWCAP2 flags, see
linux commit 671db581815faf17cbedd7fcbc48823a247d90b1
arm64: Expose DC CVADP to userspace
linux commit 06a916feca2b262ab0c1a2aeb68882f4b1108a07
arm64: Expose SVE2 features for userspace
|
|
new mount api syscalls were added, same numbers on all targets, see
linux commit a07b20004793d8926f78d63eb5980559f7813404
vfs: syscall: Add open_tree(2) to reference or clone a mount
linux commit 2db154b3ea8e14b04fee23e3fdfd5e9d17fbc6ae
vfs: syscall: Add move_mount(2) to move mounts around
linux commit 24dcb3d90a1f67fe08c68a004af37df059d74005
vfs: syscall: Add fsopen() to prepare for superblock creation
linux commit ecdab150fddb42fe6a739335257949220033b782
vfs: syscall: Add fsconfig() for configuring and managing a context
linux commit 93766fbd2696c2c4453dd8e1070977e9cd4e6b6d
vfs: syscall: Add fsmount() to create a mount for a superblock
linux commit cf3cba4a429be43e5527a3f78859b1bfd9ebc5fb
vfs: syscall: Add fspick() to select a superblock for reconfiguration
linux commit 9c8ad7a2ff0bfe58f019ec0abc1fb965114dde7d
uapi, x86: Fix the syscall numbering of the mount API syscalls [ver #2]
linux commit d8076bdb56af5e5918376cd1573a6b0007fc1a89
uapi: Wire up the mount API syscalls on non-x86 arches [ver #2]
|
|
apply an open_tree call with OPEN_TREE_CLONE to the entire subtree, see
linux commit a07b20004793d8926f78d63eb5980559f7813404
vfs: syscall: Add open_tree(2) to reference or clone a mount
|
|
see
linux commit a528d35e8bfcc521d7cb70aaf03e1bd296c8493f
statx: Add a system call to make enhanced file info available
these are linux specific and not reserved names for fcntl.h so they
are under _BSD_SOURCE|_GNU_SOURCE.
|
|
when set a pidfd is stored in parent_tidptr, see
linux commit b3e5838252665ee4cfa76b82bdf1198dca81e5be
clone: add CLONE_PIDFD
|
|
ethertype for fake VLAN header for DSA, see
linux commit bf5bc3ce8a8f32a0d45b6820ede8f9fc3e9c23df
ether: Add dedicated Ethertype for pseudo-802.1Q DSA tagging
|
|
historically, a number of 32-bit archs used long rather than int for
wchar_t, for no good reason. GCC still uses the historical types, but
clang replaced them all with int, and it seems PCC uses int too.
mismatching the compiler's type for wchar_t is not an option due to
wide string literals.
note that the mismatch does not affect C++ ABI since wchar_t is its
own builtin type/keyword in C++, distinct from both int and long, not
a typedef.
i386 already worked around this by honoring __WCHAR_TYPE__ if defined
by the compiler, and only using the official legacy ABI type if not.
add the same to the other affected archs.
it might make sense at some point to switch to using int as the
default if __WCHAR_TYPE__ is not defined, if the expectation is that
new compilers will treat int as the correct choice, but it's unlikely
that the case where __WCHAR_TYPE__ is undefined will ever be used
anyway. I actually wanted to move the definition of wchar_t to the
top-level shared alltypes.h.in, using __WCHAR_TYPE__ and falling back
to int if not defined, but that can't be done without assuming all
compilers define __WCHAR_TYPE__ thanks to some pathological archs
where the ABI has wchar_t as an unsigned type.
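the fallback described for i386 can be sketched as follows. sketch_wchar_t and sketch_wchar_size are hypothetical names chosen to avoid clashing with the real typedef, and long merely stands in for the legacy ABI type of an affected arch.

```c
#include <assert.h>
#include <stddef.h>

/* honor the compiler's own choice when it advertises one */
#ifdef __WCHAR_TYPE__
typedef __WCHAR_TYPE__ sketch_wchar_t;
#else
/* assumption: legacy ABI type on a hypothetical affected 32-bit arch */
typedef long sketch_wchar_t;
#endif

/* wide string literals use the compiler's wchar_t as element type, so
 * the library typedef has to agree with it in size */
static size_t sketch_wchar_size(void)
{
	/* in C, L'x' has the compiler's wchar_t type */
	assert(sizeof(sketch_wchar_t) == sizeof L'x');
	return sizeof(sketch_wchar_t);
}
```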
|
|
previously, when pthread_create failed due to inability to set
explicit scheduling according to the requested attributes, the nascent
thread was detached and made responsible for its own cleanup via the
standard pthread_exit code path. this left it consuming resources
potentially well after pthread_create returned, in a way that the
application could not see or mitigate, and unnecessarily exposed its
existence to the rest of the implementation via the global thread
list.
instead, attempt explicit scheduling early and reuse the failure path
for __clone failure if it fails. the nascent thread's exit futex is
not needed for unlocking the thread list, since the thread calling
pthread_create holds the thread list lock the whole time, so it can be
repurposed to ensure the thread has finished exiting. no pthread_exit
is needed, and freeing the stack, if needed, can happen just as it
would if __clone failed.
|
|
if setting scheduling properties succeeds, the new thread may end up
with lower priority than the caller, and may be unable to continue
running due to another intermediate-priority thread. this produces a
priority inversion situation for the thread calling pthread_create,
since it cannot return until the new thread reports success.
originally, the parent was responsible for setting the new thread's
priority; commits b8742f32602add243ee2ce74d804015463726899 and
40bae2d32fd6f3ffea437fa745ad38a1fe77b27e changed it as part of
trimming down the pthread structure. since then, commit
04335d9260c076cf4d9264bd93dd3b06c237a639 partly reversed the changes,
but did not switch responsibilities back. do that now.
|
|
commit 8f11e6127fe93093f81a52b15bb1537edc3fc8af wrongly documented
that all changes to libc.threads_minus_1 were guarded by the thread
list lock, but the decrement for failed SYS_clone took place after the
thread list lock was released.
|
|
commit 030e52639248ac8417a4934298caa78c21a228d1 added optreset, a BSD
extension to getopt duplicating the functionality (also an extension)
of setting optind to 0, but failed to provide a public declaration for
it. according to the BSD documentation and headers, the application is
not supposed to need to provide its own declaration.
|
|
these are presently extensions, thus named with _np to match glibc and
other implementations that provide them; however they are likely to be
standardized in the future without the _np suffix as a result of
Austin Group issue 1208. if so, both names will be kept as aliases.
|
|
|
|
due to historical accident/sloppiness in glibc, the powerpc,
powerpc64, and sh versions of struct user, defined by sys/user.h, used
struct pt_regs from the kernel asm/ptrace.h for their regs member.
this made it impossible to define the type in an API-compatible manner
without either including asm/ptrace.h like glibc does (contrary to our
policy of not depending on kernel headers), or clashing with
asm/ptrace.h's definition of struct pt_regs if both headers are
included (which is almost always the case in software using
sys/user.h).
for a long time I viewed this problem as having no reasonable fix. I
even explored the possibility of having the powerpc[64] and sh
versions of user.h just include the kernel header (breaking with
policy), but that looked like it might introduce new clashes with
sys/ptrace.h. and it would also bring in a lot of additional cruft
that makes no sense for sys/user.h to expose. glibc goes out of its
way to suppress some of that with #undef, possibly leading to
different problems. this is a rabbit-hole that should be explored no
further.
as it turns out, however, nothing actually uses struct user
sufficiently to care about the type of the regs member; most software
including sys/user.h does not even use struct user at all. so, the
problem can be fixed just by doing away with the insistence on strict
glibc API compatibility for the struct tag of the regs member.
rather than renaming the tag, which might lead to the new name
entering use as API, simply use an untagged structure inside struct
user with the same members/layout as struct pt_regs.
for sh, struct pt_dspregs is just removed entirely since it was not
used.
|
|
commit 9b14ad541068d4f7d0be9bcd1ff4c70090d868d3 introduced this
namespace violation.
|
|
these members are associated with an unsupported option group. with
time_t changing size on 32-bit archs, all interfaces taking struct
sched_param arguments would need redirection and compat shims in order
to be able to continue offering these members, for no benefit. just
convert them to reserved space instead.
|
|
d493206de7df4db07ad34f24701539ba0a6ed38c deleted all the content of
user.h, but sys/procfs.h expects this from sys/user.h.
therefore we retain the non-conflicting parts.
|
|
commit ffab43602b5900c86b7040abdda8ccf6cdec95f5 broke this by moving
relocations after not only the allocation of storage for the main
thread's static TLS, but after the copying of the TLS image. thus,
relocation results were not reflected in the main thread's copy. this
could be fixed by calling __reset_tls after relocations, but instead
split the allocation and installation before/after relocations so that
there's not a redundant copy.
due to commit 71af5309874269bcc9e4b84ea716fab33d888c1d, updating of
static_tls_cnt needs to be kept with allocation of static TLS, before
relocations, rather than after installation.
|
|
commit 7590203c486d9002522019045d34ee3dee0a66f5 omitted static here.
|
|
Using common code path for all symbol lookups fixes three dlsym issues:
- st_shndx of STT_TLS symbols were not checked and thus an undefined
tls symbol reference could be incorrectly treated as a definition
(the sysv hash lookup returns undefined symbols, gnu does not, so should
be rare in practice).
- symbol binding was not checked so a hidden symbol may be returned
(in principle STB_LOCAL symbols may appear in the dynamic symbol table
for hidden symbols, but linkers most likely don't produce it).
- mips specific behaviour was not applied (ARCH_SYM_REJECT_UND) so
undefined symbols may be returned on mips.
always_inline is used to avoid relocation performance regression, the
code generation for find_sym should not be affected.
|
|
commit 7a9669e977e5f750cf72ccbd2614f8b72ce02c4c added use of the
symbol reference as the definition, in place of performing a lookup,
for STT_SECTION symbol references that were first found used in FDPIC.
such references may happen in certain other cases, such as
local-dynamic TLS and with relocation types that require a symbol but
that are being used for non-symbolic purposes, like the powerpc
unaligned address relocations.
in all such cases I'm aware of, the symbol referenced is a section
symbol (STT_SECTION); however, the important semantic property is not
its being a section, but rather its binding local (STB_LOCAL). check
the latter instead of the former for greater generality and semantic
correctness.
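a minimal sketch contrasting the two checks, assuming a 64-bit ELF target and the standard <elf.h> macros; the function names are hypothetical and this is not the dynamic linker's actual code.

```c
#include <assert.h>
#include <elf.h>

/* former test: only section symbols serve as their own definition */
static int is_section_sym(const Elf64_Sym *sym)
{
	assert(sym);
	return ELF64_ST_TYPE(sym->st_info) == STT_SECTION;
}

/* more general test: any symbol with local binding serves as its own
 * definition, whatever its type */
static int is_local_sym(const Elf64_Sym *sym)
{
	assert(sym);
	return ELF64_ST_BIND(sym->st_info) == STB_LOCAL;
}
```

a local symbol of type STT_NOTYPE passes the second test but not the first, which is the extra generality the binding check buys.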
|
|
R_PPC_UADDR32 (R_PPC64_UADDR64) has the same meaning as R_PPC_ADDR32
(R_PPC64_ADDR64), except that its address need not be aligned. For
powerpc64, BFD ld(1) will automatically convert between ADDR<->UADDR
relocations when the address is/isn't at its native alignment. This
will happen if, for example, there is a pointer in a packed struct.
gold and lld do not currently generate R_PPC64_UADDR64, but pass
through misaligned R_PPC64_ADDR64 relocations from object files,
possibly relaxing them to misaligned R_PPC64_RELATIVE. In both cases
(relaxed or not) this violates the PSABI, which defines the relevant
field type as "a 64-bit field occupying 8 bytes, the alignment of
which is 8 bytes unless otherwise specified."
All three linkers violate the PSABI on 32-bit powerpc, where the only
difference is that the field is 32 bits wide, aligned to 4 bytes.
Currently musl fails to load executables linked by BFD ld containing
R_PPC64_UADDR64, with the error "unsupported relocation type 43".
This change provides compatibility with BFD ld on powerpc64, and any
static linker on either architecture that starts following the PSABI
more closely.
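One way to apply such a relocation without assuming alignment is to copy the value bytewise. A minimal sketch, assuming a 64-bit target; store_unaligned64 is a hypothetical name and not necessarily how the dynamic linker does it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* write val at reloc_addr, which need not be 8-byte aligned; memcpy
 * avoids the undefined behavior of a misaligned uint64_t store */
static void store_unaligned64(unsigned char *reloc_addr, uint64_t val)
{
	assert(reloc_addr);
	memcpy(reloc_addr, &val, sizeof val);
}
```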
|
|
as a result of commit ffab43602b5900c86b7040abdda8ccf6cdec95f5,
static_tls_cnt is now valid during relocations at program startup, so
it's no longer necessary to condition the check against static_tls_cnt
on this being a runtime (dlopen) relocation.
|
|
this is analogous to commit 2f1f51ae7b2d78247568e7fdb8462f3c19e469a4,
and should have been caught at the same time since it was right next
to the code moved in that commit. between final stage 3 reloc_all and
the jump to the main program's entry point, it is not valid to call
any functions which may be interposed by the application; doing so
results in execution of application code before ctors have run, and on
fdpic archs, before the main program's fdpic self-fixups have taken
place, which will produce runaway wrong execution.
|
|
This function is a GNU extension introduced in glibc 2.17.
|
|
POSIX allows a null pointer, in which case the function only checks
the validity of the clock id argument.
|
|
at the point of this check, the pointer has already been dereferenced.
clock_settime is not defined for null pointer arguments.
|
|
somewhat analogous to commit d0b547dfb5f7678cab6bc39dd736ed6454357ca4,
but here the omission of the null timeout check was in the time64
syscall code path. this code is not yet used except on x32.
|
|
these accept the netbsd/openbsd message catalog file format,
consisting of a sorted list of set headers and a sorted list of
message headers for each set, admitting trivial binary search for
lookups.
the gnu format was not chosen because it's unusably bad. it does not
admit efficient (log time or better) lookups; rather, it requires
linear search or hash table lookups, and the hash function is awful:
it's literally set_id*msg_id.
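the lookup this layout permits can be sketched as two binary searches. the structs below are a simplified, hypothetical in-memory model (the on-disk field layout and byte order are not reproduced), and find_msg is an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* hypothetical model of a catalog: sets sorted by id, each pointing at
 * a run of messages sorted by id */
struct msg_hdr { int msg_id; const char *text; };
struct set_hdr { int set_id; size_t nmsgs; const struct msg_hdr *msgs; };

static int cmp_set(const void *key, const void *elem)
{
	int k = *(const int *)key;
	int e = ((const struct set_hdr *)elem)->set_id;
	return (k > e) - (k < e);
}

static int cmp_msg(const void *key, const void *elem)
{
	int k = *(const int *)key;
	int e = ((const struct msg_hdr *)elem)->msg_id;
	return (k > e) - (k < e);
}

/* returns the message text, or dflt if the set or message is absent */
static const char *find_msg(const struct set_hdr *sets, size_t nsets,
                            int set_id, int msg_id, const char *dflt)
{
	const struct set_hdr *s;
	const struct msg_hdr *m;

	assert(sets);
	s = bsearch(&set_id, sets, nsets, sizeof *sets, cmp_set);
	if (!s) return dflt;
	m = bsearch(&msg_id, s->msgs, s->nmsgs, sizeof *s->msgs, cmp_msg);
	return m ? m->text : dflt;
}
```

both searches are logarithmic in the number of entries, which is the property the sorted headers are meant to provide.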
|
|
commit 722a1ae3351a03ab25010dbebd492eced664853b inadvertently passed a
copy of {s,us} to the syscall even if the timeout argument tv was
null, thereby causing immediate timeout (polling) in place of
unlimited timeout. only archs using SYS_select were affected.
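the intended handling can be sketched as below, assuming the syscall takes a pointer to a {seconds, microseconds} pair of longs; select_timeout_arg is a hypothetical helper, not the actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* returns the pointer to hand to the syscall: a filled-in copy when the
 * caller supplied a timeout, or null (wait without limit) when not */
static long *select_timeout_arg(const struct timeval *tv, long copy[2])
{
	assert(copy);
	if (!tv) return NULL;
	copy[0] = tv->tv_sec;
	copy[1] = tv->tv_usec;
	return copy;
}
```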
|
|
when the pattern ended with one or more literal path components, or
when the GLOB_MARK flag was passed to request that glob flag directory
results and the type obtained by readdir was unknown or inconclusive
(symlink), the stat function was called to evaluate existence and/or
determine type. however, stat fails with ENOENT for broken symlinks,
and this caused the match to be omitted from the results.
instead, use stat only for the unknown/inconclusive cases with
GLOB_MARK, and otherwise, or if stat fails, use lstat if existence
still needs to be determined. this minimizes the number of costly syscalls,
performing both only in the case where GLOB_MARK is in use and there
is a final literal path component which is a broken symlink.
based on/simplified from patch by James Y Knight.
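a minimal sketch of that decision order. entry_exists is a hypothetical helper, type_known stands for a conclusive type already obtained from readdir, and the real glob code is structured differently.

```c
#include <assert.h>
#include <sys/stat.h>

/* returns 1 if path names an existing directory entry (a broken symlink
 * counts), 0 otherwise; *is_dir reports a directory target only when
 * mark is set and the type was not already known */
static int entry_exists(const char *path, int mark, int type_known, int *is_dir)
{
	struct stat st;

	assert(path && is_dir);
	*is_dir = 0;
	/* follow symlinks only when the type is needed and still unknown */
	if (mark && !type_known && stat(path, &st) == 0) {
		*is_dir = S_ISDIR(st.st_mode);
		return 1;
	}
	/* otherwise, or if stat failed, lstat settles existence */
	return lstat(path, &st) == 0;
}
```

both calls are made only when marking is requested and stat fails, for example on a broken symlink.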
|
|
the contents conflicted with asm/ptrace.h. glibc does not provide
anything in user.h for riscv, so software cannot be depending on it.
simplified from patch submitted by Baruch Siach.
|
|
Rename user registers struct definitions to avoid conflict with the
asm/ptrace.h kernel header that defines the same structs. Use the
__riscv_mc prefix as glibc does.
|
|
The only reason we needed to preserve the link register was because we
were using a branch-link instruction to branch to __cp_cancel.
Replacing this with a branch means we can avoid the save/restore as
the link register is no longer modified.
|
|
|
|
otherwise alarm will break on 32-bit archs when time_t is changed to
64-bit. a second itimerval object is introduced for retrieving the old
value, since the setitimer function has restrict-qualified arguments.
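a minimal sketch of alarm built on setitimer with separate input and output objects; sketch_alarm is a hypothetical name, and rounding the remaining time up to whole seconds is an assumption about the desired behavior.

```c
#include <assert.h>
#include <sys/time.h>

/* arm (or, with seconds == 0, disarm) the real-time timer and return
 * the seconds left on the previous one, rounded up */
static unsigned sketch_alarm(unsigned seconds)
{
	/* distinct objects: the restrict-qualified arguments of
	 * setitimer must not alias */
	struct itimerval it = { .it_value.tv_sec = seconds }, old = { 0 };
	int r = setitimer(ITIMER_REAL, &it, &old);

	assert(r == 0); /* these arguments are always in range */
	(void)r;
	return old.it_value.tv_sec + !!old.it_value.tv_usec;
}
```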
|
|
commit f3ed8bfe8a82af1870ddc8696ed4cc1d5aa6b441 inadvertently removed
labels that were still needed.
|