Age | Commit message (Collapse) | Author | Files | Lines |
|
this cleans up what had become widespread direct inline use of "GNU C"
style attributes directly in the source, and lowers the barrier to
increased use of hidden visibility, which will be useful to recovering
some of the efficiency lost when the protected visibility hack was
dropped in commit dc2f368e565c37728b0d620380b849c3a1ddd78f, especially
on archs where the PLT ABI is costly.
|
|
commit ad1cd43a86645ba2d4f7c8747240452a349d6bc1 eliminated
preprocessor-level omission of references to the init/fini array
symbols from object files going into libc.so. the references are weak,
and the intent was that the linker would resolve them to zero in
libc.so, but instead it leaves undefined references that could be
satisfied at runtime. normally these references would be harmless,
since the code using them does not even get executed, but some older
binutils versions produce a linking error: when linking a program
against libc.so, ld first tries to use the hidden init/fini array
symbols produced by the linker script to satisfy the references in
libc.so, then produces an error because the definitions are hidden.
ideally ld would have already provided definitions of these symbols
when linking libc.so, but the linker script for -shared omits them.
to avoid this situation, the dynamic linker now provides its own dummy
definitions of the init/fini array symbols for libc.so. since they are
hidden, everything binds at ld time and no references remain in the
dynamic symbol table. with modern binutils and --gc-sections, both
the dummy empty array objects and the code referencing them get
dropped at link time, anyway.
the _init and _fini symbols are also switched back to using weak
definitions rather than weak references since the latter behave
somewhat problematically in general, and the weak definition approach
was known to work well.
|
|
use weak definitions that the dynamic linker can override instead of
preprocessor conditionals on SHARED so that the same libc start and
exit code can be used for both static and dynamic linking.
|
|
this was originally added as a cheap but portable way to quell
warnings about reaching the end of a function that does not return,
but since _Exit is marked _Noreturn, it's not needed. removing it
makes the call to _Exit into a tail call and shaves off a few bytes of
code from minimal static programs.
|
|
the purpose of this logic is to avoid linking __stdio_exit unless any
stdio reads (which might require repositioning the file offset at exit
time) or writes (which might require flushing at exit time) could have
been performed.
previously, exit called two wrapper functions for __stdio_exit named
__flush_on_exit and __seek_on_exit. both of these functions actually
performed both tasks (seek and flushing) by calling the underlying
__stdio_exit. in order to avoid doing this twice, an overridable data
object __towrite_used was used to cause __seek_on_exit to act as a nop
when __towrite was linked.
now, exit only makes one call, directly to __stdio_exit. this is
satisfiable by a weak dummy definition in exit.c, but the real
definition is pulled in by either __toread.c or __towrite.c through
their referencing a symbol which is defined only in __stdio_exit.c.
|
|
calling exit more than once invokes undefined behavior. in some cases
it's desirable to detect undefined behavior and diagnose it via a
predictable crash, but the code here was silently covering up an
uncommon case (exit from more than one thread) and turning a much more
common case (recursive calls to exit) into a permanent hang.
|
|
|
|
modern (4.7.x and later) gcc uses init/fini arrays, rather than the
legacy _init/_fini function pasting and crtbegin/crtend ctors/dtors
system, on most or all archs. some archs had already switched a long
time ago. without following this change, global ctors/dtors will cease
to work under musl when building with new gcc versions.
the most surprising part of this patch is that it actually reduces the
size of the init code, for both static and shared libc. this is
achieved by (1) unifying the handling main program and shared
libraries in the dynamic linker, and (2) eliminating the
glibc-inspired rube goldberg machine for passing around init and fini
function pointers. to clarify, some background:
the function signature for __libc_start_main was based on glibc, as
part of the original goal of being able to run some glibc-linked
binaries. it worked by having the crt1 code, which is linked into
every application, static or dynamic, obtain and pass pointers to the
init and fini functions, which __libc_start_main is then responsible
for using and recording for later use, as necessary. however, in
neither the static-linked nor dynamic-linked case do we actually need
crt1.o's help. with dynamic linking, all the pointers are available in
the _DYNAMIC block. with static linking, it's safe to simply access
the _init/_fini and __init_array_start, etc. symbols directly.
obviously changing the __libc_start_main function signature in an
incompatible way would break both old musl-linked programs and
glibc-linked programs, so let's not do that. instead, the function can
just ignore the information it doesn't need. new archs need not even
provide the useless args in their versions of crt1.o. existing archs
should continue to provide it as long as there is an interest in
having newly-linked applications be able to run on old versions of
musl; at some point in the future, this support can be removed.
|
|
|
|
for seekable files, posix imposed requirements on the offset of the
underlying open file description after a stream is closed. this was
correctly handled (as a side effect of the unconditional fflush call)
when streams were explicitly closed by fclose, but was not handled
correctly at program exit time, where fflush(0) was being used.
the weak symbol hackery is to pull in __stdio_exit if either of
__toread or __towrite is used, but avoid calling it twice so we don't
have to keep extra state. the new __stdio_exit is a streamlined fflush
variant that avoids performing any unnecessary operations and which
never unlocks the files or open file list, so we can be sure no other
threads write new data to a stream's buffer after it's already
flushed.
|
|
this is required in case dtors use stdio.
also remove the old comments; one was cruft from when the code used to
be using function pointers and conditional calls, and has little
motivation now that we're using weak symbols. the other was just
complaining about having to support dtors even though the cost was
made essentially zero in the non-use case by the way it's done here.
|
|
there's no sense in using a powerful lock in exit, because it will
never be unlocked. a thread that arrives at exit while exit is already
in progress just needs to hang forever. use the pause syscall for this
because it's cheap and easy and universally available.
|
|
i did some testing trying to switch malloc to use the new internal
lock with priority inheritance, and my malloc contention test got
20-100 times slower. if priority inheritance futexes are this slow,
it's simply too high a price to pay for avoiding priority inversion.
maybe we can consider them somewhere down the road once the kernel
folks get their act together on this (and perferably don't link it to
glibc's inefficient lock API)...
as such, i've switch __lock to use malloc's implementation of
lightweight locks, and updated all the users of the code to use an
array with a waiter count for their locks. this should give optimal
performance in the vast majority of cases, and it's simple.
malloc is still using its own internal copy of the lock code because
it seems to yield measurably better performance with -O3 when it's
inlined (20% or more difference in the contention stress test).
|
|
|
|
the biggest change in this commit is that stdio now uses readv to fill
the caller's buffer and the FILE buffer with a single syscall, and
likewise writev to flush the FILE buffer and write out the caller's
buffer in a single syscall.
making this change required fundamental architectural changes to
stdio, so i also made a number of other improvements in the process:
- the implementation no longer assumes that further io will fail
following errors, and no longer blocks io when the error flag is set
(though the latter could easily be changed back if desired)
- unbuffered mode is no longer implemented as a one-byte buffer. as a
consequence, scanf unreading has to use ungetc, to the unget buffer
has been enlarged to hold at least 2 wide characters.
- the FILE structure has been rearranged to maintain the locations of
the fields that might be used in glibc getc/putc type macros, while
shrinking the structure to save some space.
- error cases for fflush, fseek, etc. should be more correct.
- library-internal macros are used for getc_unlocked and putc_unlocked
now, eliminating some ugly code duplication. __uflow and __overflow
are no longer used anywhere but these macros. switch to read or
write mode is also separated so the code can be better shared, e.g.
with ungetc.
- lots of other small things.
|
|
|