Age | Commit message (Collapse) | Author | Files | Lines |
|
this is the first part of a series of patches intended to make
__syscall fully self-contained in the object file produced using
syscall.h, which will make it possible for crt1 code to perform
syscalls.
the (confusingly named) i386 __vsyscall mechanism, which this commit
removes, was introduced before the presence of a valid thread pointer
was mandatory; back then the thread pointer was setup lazily only if
threads were used. the intent was to be able to perform syscalls using
the kernel's fast entry point in the VDSO, which can use the sysenter
(Intel) or syscall (AMD) instruction instead of int $128, but without
inlining an access to the __syscall global at the point of each
syscall, which would incur a significant size cost from PIC setup
everywhere. the mechanism also shuffled registers/calling convention
around to avoid spills of call-saved registers, and to avoid
allocating ebx or ebp via asm constraints, since there are plenty of
broken-but-supported compiler versions which are incapable of
allocating ebx with -fPIC or ebp with -fno-omit-frame-pointer.
the new mechanism preserves the properties of avoiding spills and
avoiding allocation of ebx/ebp in constraints, but does it inline,
using some fairly simple register shuffling, and uses a field of the
thread structure rather than global data for the vdso-provided syscall
code address.
for now, the external __syscall function is refactored not to use the
old __vsyscall so it can be kept, but the intent is to remove it too.
|
|
commit f9cccfc16e58b39ee381fbdfb8688db3bb8e3555 left behind the part
in libc.c; remove it too.
|
|
this is a bit ugly, and the motivation for supporting it is
questionable. however the main factors were:
1. it will be useful to have this for certain internal purposes
anyway -- things like syslog.
2. applications can just save argv[0] in main, but it's hard to fix
non-portable library code that's depending on being able to get the
invocation name without the main application's help.
|
|
this doubles the performance of the fastest syscalls on the atom I
tested it on; improvement is reportedly much more dramatic on
worst-case cpus. cannot be used for cancellable syscalls.
|
|
it's expected that this will be needed/useful only in asm, so I've
given it its own symbol that can be addressed in pc-relative ways from
asm rather than adding a field in the __libc structure which would
require hard-coding the offset wherever it's used.
|
|
since gcc is failing to generate the necessary ".hidden" directive in
the output asm, generate it explicitly with an __asm__ statement...
|
|
this was a failed attempt at working around the gcc 3 visibility bug
affecting x86_64. subsequent patch will address it with an ugly but
working hack.
|
|
in gcc 3, the visibility attribute must be placed on both the
declaration and on the definition. if it's omitted from the
definition, the compiler fails to emit the ".hidden" directive in the
assembly, and the linker will either generate textrels (if supported,
such as on i386) or refuse to link (on targets where certain types of
textrels are forbidden or impossible without further assumptions about
memory layout, such as on x86_64).
this patch also unifies the decision about when to use visibility into
libc.h and makes the visibility in the utf-8 state machine tables
based on libc.h rather than a duplicate test.
|
|
prefer using visibility=hidden for __libc internal data, rather than
an accessor function, if the compiler has visibility.
optimize with -O3 for PIC targets (shared library). without heavy
inlining, reloading the GOT register in small functions kills
performance. 20-30% size increase for a single libc.so is not a big
deal, compared to comparaible size increase in every static binaries.
use -Bsymbolic-functions, not -Bsymbolic. global variables are subject
to COPY relocations, and thus binding their addresses in the library
at link time will cause library functions to read the wrong (original)
copies instead of the copies made in the main program's bss section.
add entry point, _start, for dynamic linker.
|
|
prior to this change, a large portion of libc was unusable prior to
relocation by the dynamic linker, due to dependence on the global data
in the __libc structure and the need to obtain its address through the
GOT. with this patch, the accessor function __libc_loc is now able to
obtain the address of __libc via PC-relative addressing without using
the GOT. this means the majority of libc functionality is now
accessible right away.
naturally, the above statements all depend on having an architecture
where PC-relative addressing and jumps/calls are feasible, and a
compiler that generates the appropriate code.
|
|
|