path: root/arch/arm/atomic.h
Age | Commit message | Author | Files | Lines
2014-11-19 | overhaul ARM atomics/tls for performance and compatibility | Rich Felker | 1 | -41/+153
previously, builds for pre-armv6 targets hard-coded use of the "kuser helper" system for atomics and thread-pointer access, resulting in binaries that fail to run (crash) on systems where this functionality has been disabled (as a security/hardening measure) in the kernel. additionally, builds for armv6 hard-coded an outdated/deprecated memory barrier instruction which may require emulation (extremely slow) on future models.

this overhaul replaces the behavior for all pre-armv7 builds (both of the above cases) to perform runtime detection of the appropriate mechanisms for barrier, atomic compare-and-swap, and thread pointer access. detection is based on information provided by the kernel in auxv: presence of the HWCAP_TLS bit for AT_HWCAP and the architecture version encoded in AT_PLATFORM. direct use of the instructions is preferred when possible, since probing for the existence of the kuser helper page would be difficult and would incur runtime cost.

for builds targeting armv7 or later, the runtime detection code is not compiled at all, and much more efficient versions of the non-cas atomic operations are provided by using ldrex/strex directly rather than wrapping cas.
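The detection step described in this message can be illustrated roughly as follows. This is a minimal sketch, not the code from the commit: every `sketch_`/`SKETCH_` name is hypothetical, and the functions take the hwcap word and platform string as parameters (on Linux these would typically come from `getauxval(AT_HWCAP)` and `getauxval(AT_PLATFORM)`) so the logic can be exercised on any host. It assumes AT_PLATFORM strings of the form "v6l" or "v7l" on ARM.

```c
#include <stddef.h>

/* Bit 15 of AT_HWCAP is HWCAP_TLS in Linux's ARM hwcap header; it is
 * redefined here under a sketch-local name so the example builds anywhere. */
#define SKETCH_HWCAP_TLS (1UL << 15)

/* Hypothetical labels for the three families of barrier/cas code. */
enum sketch_impl { SKETCH_KUSER_HELPER, SKETCH_V6_INSNS, SKETCH_V7_INSNS };

/* Parse the architecture version out of an AT_PLATFORM string such as
 * "v6l" or "v7l". Returns 0 if the string is missing or malformed. */
static int sketch_arch_version(const char *platform)
{
	if (!platform || platform[0] != 'v') return 0;
	if (platform[1] < '0' || platform[1] > '9') return 0;
	return platform[1] - '0';
}

/* Prefer direct instructions when the reported version allows them;
 * otherwise fall back to the kuser helper. */
static enum sketch_impl sketch_pick_atomics(const char *platform)
{
	int ver = sketch_arch_version(platform);
	if (ver >= 7) return SKETCH_V7_INSNS;
	if (ver == 6) return SKETCH_V6_INSNS;
	return SKETCH_KUSER_HELPER;
}

/* The thread pointer may be read inline only if the kernel reports
 * the HWCAP_TLS bit. */
static int sketch_inline_tp_ok(unsigned long hwcap)
{
	return (hwcap & SKETCH_HWCAP_TLS) != 0;
}
```

Because the decision depends only on values the kernel already hands to the process, no probing of the kuser helper page is needed, which matches the rationale given in the message.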
2014-10-10 | add explicit barrier operation to internal atomic.h API | Rich Felker | 1 | -1/+3
2014-08-25 | fix build error on arm due to new a_spin code | Rich Felker | 1 | -1/+1
this was broken by commit ea818ea8340c13742a4f41e6077f732291aea4bc.
2014-08-25 | add working a_spin() atomic for non-x86 targets | Rich Felker | 1 | -0/+1
conceptually, a_spin needs to be at least a compiler barrier, so the compiler will not optimize out loops (and the load on each iteration) while spinning. it should also be a memory barrier, or the spinning thread might keep spinning without noticing stores from other threads, thus delaying for longer than it should. ideally, an optimal a_spin implementation that avoids unnecessary cache/memory contention should be chosen for each arch, but for now, the easiest thing is to perform a useless a_cas on the calling thread's stack.
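As a rough illustration of the idea in this message, the sketch below performs a no-op compare-and-swap on a stack variable so that a spin loop gets both a compiler barrier and a memory barrier. The `sketch_` names are hypothetical, and the GCC/Clang `__sync_val_compare_and_swap` builtin stands in for the arch-specific `a_cas`; it is not the code from the commit.

```c
/* Stand-in compare-and-swap: if *p == t, store s; return the old value.
 * The __sync builtin acts as a full barrier on GCC and Clang. */
static inline int sketch_a_cas(volatile int *p, int t, int s)
{
	return __sync_val_compare_and_swap(p, t, s);
}

/* A "useless" cas on the caller's stack: it changes nothing, but it
 * forces the compiler to reload memory and orders memory accesses. */
static inline void sketch_a_spin(void)
{
	volatile int dummy = 0;
	sketch_a_cas(&dummy, 0, 0);
}

/* Example spin loop: wait until another thread sets *flag nonzero. */
static void sketch_wait_for(volatile int *flag)
{
	while (!*flag) sketch_a_spin();
}
```

The cas on a private stack slot avoids contention on any shared cache line, at the cost of being heavier than a dedicated pause or yield instruction would be.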
2014-07-27 | clean up unused and inconsistent atomics in arch dirs | Rich Felker | 1 | -5/+0
the a_cas_l, a_swap_l, a_swap_p, and a_store_l operations were probably used a long time ago when only i386 and x86_64 were supported. as other archs were added, support for them was inconsistent, and they are obviously not in use at present. having them around potentially confuses readers working on new ports, and the type-punning hacks and inconsistent use of types in their definitions are not a style I wish to perpetuate in the source tree, so removing them seems appropriate.
2014-04-30 | fix arm thread-pointer/atomic asm when compiling to thumb code | Rich Felker | 1 | -3/+5
armv7/thumb2 provides a way to do atomics in thumb mode, but for armv6 we need a call to arm mode. this commit is based on a patch by Stephen Thomas which fixed the armv7 cases but not the armv6 ones. all of this should be revisited if/when runtime selection of thread pointer access and atomics is added.
2014-04-14 | use dmb barrier instruction for atomics on arm v7 | Rich Felker | 1 | -2/+9
aside from potentially offering better performance, this change is needed since the old coprocessor-based approach to barriers is deprecated in arm v7, and some compilers/assemblers issue errors when using the deprecated instruction for v7 targets.
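A compile-time choice between the two barrier forms might look roughly like the sketch below. The `sketch_` names are hypothetical, the `__ARM_ARCH` macro is assumed to be provided by the compiler, and the non-ARM branch uses the `__sync_synchronize` builtin only so the example builds on other hosts. It illustrates the distinction the message draws and is not the code from the commit.

```c
/* Full memory barrier, selected at compile time. */
static inline void sketch_barrier(void)
{
#if defined(__arm__) && __ARM_ARCH >= 7
	/* armv7 and later: dedicated barrier instruction */
	__asm__ __volatile__("dmb ish" : : : "memory");
#elif defined(__arm__) && __ARM_ARCH == 6 && !defined(__thumb__)
	/* armv6 in arm mode: coprocessor-based barrier, deprecated in v7 */
	__asm__ __volatile__("mcr p15,0,%0,c7,c10,5" : : "r"(0) : "memory");
#else
	/* other hosts: compiler builtin as a stand-in */
	__sync_synchronize();
#endif
}

/* Example use: make the data store visible before the ready flag. */
static void sketch_publish(volatile int *data, volatile int *ready, int v)
{
	*data = v;
	sketch_barrier();
	*ready = 1;
}
```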
2014-04-07 | fix arm atomic asm register constraint | Rich Felker | 1 | -1/+1
the "m" constraint could give a memory reference with an offset that's not compatible with ldrex/strex, so the arm-specific "Q" constraint is needed instead.
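To show where the constraint matters, here is a hedged sketch of a fetch-and-add built on ldrex/strex. With "Q" the compiler must present the operand as a single base register with no offset, which is the only addressing form these instructions accept. The function name is hypothetical, the ARM path is guarded to armv7 and has no barriers (so it is a relaxed operation), and other hosts fall back to a compiler builtin so the example still builds.

```c
/* Atomically add v to *p and return the previous value. */
static inline int sketch_fetch_add(volatile int *p, int v)
{
#if defined(__arm__) && __ARM_ARCH >= 7
	int old, sum, fail;
	__asm__ __volatile__(
		"1:	ldrex %0, %3\n"       /* old = *p, exclusive load */
		"	add %1, %0, %4\n"     /* sum = old + v */
		"	strex %2, %1, %3\n"   /* try *p = sum; fail = 0 on success */
		"	cmp %2, #0\n"
		"	bne 1b\n"             /* retry if the store did not succeed */
		: "=&r"(old), "=&r"(sum), "=&r"(fail), "+Q"(*p)
		: "r"(v)
		: "memory", "cc");
	return old;
#else
	/* stand-in for non-ARM hosts */
	return __sync_fetch_and_add(p, v);
#endif
}
```

Had the memory operand used "m", the compiler would be free to emit something like `[r3, #8]`, which the assembler rejects for ldrex/strex.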
2014-04-07 | use inline atomics and thread pointer on arm models supporting them | Rich Felker | 1 | -0/+21
this is perhaps not the optimal implementation; a_cas still compiles to nested loops due to the different interface contracts of the kuser helper cas function (whose contract this patch implements) and the a_cas function (whose contract mimics the x86 cmpxchg). fixing this may be possible, but it's more complicated and thus deferred until a later time.

aside from improving performance and code size, this patch also provides a means of producing binaries which can run on hardened kernels where the kuser helpers have been disabled. however, at present this requires producing binaries for armv6k or later, which will not run on older cpus.

a real solution to the problem of kernels that omit the kuser helpers would be runtime detection, so that universal binaries which run on all arm cpu models can also be compatible with all kernel hardening profiles. robust detection however is a much harder problem, and will be addressed at a later time.
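The contract mismatch mentioned in this message can be sketched as follows. The kuser-style cas reports only success (zero) or failure (nonzero), while a cmpxchg-style `a_cas` must return the value it observed, so the wrapper needs its own retry loop around a primitive that, on real hardware, already loops internally. The `sketch_` names are hypothetical and a compiler builtin stands in for the helper.

```c
/* Kuser-helper-style contract: returns 0 if *p was t and is now s,
 * nonzero otherwise. The builtin is only a stand-in here. */
static int sketch_k_cas(int t, int s, volatile int *p)
{
	return !__sync_bool_compare_and_swap(p, t, s);
}

/* cmpxchg-style contract: returns the old value of *p, which equals t
 * exactly when the swap happened. */
static int sketch_a_cas(volatile int *p, int t, int s)
{
	int old;
	for (;;) {
		if (!sketch_k_cas(t, s, p))
			return t;          /* swap succeeded */
		if ((old = *p) != t)
			return old;        /* genuine mismatch: report it */
		/* failure was transient; try again */
	}
}
```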
2013-09-22 | fix arm atomic store and generate simpler/less-bloated/faster code | Rich Felker | 1 | -6/+8
atomic store was lacking a barrier, which was fine for legacy arm with no real smp and kernel-emulated cas, but unsuitable for more modern systems. the kernel provides another "kuser" function, at 0xffff0fa0, which could be used for the barrier, but using that would drop support for kernels 2.6.12 through 2.6.14 unless an extra conditional were added to check for barrier availability. just using the barrier in the kernel cas is easier, and, based on my reading of the assembly code in the kernel, does not appear to be significantly slower.

at the same time, other atomic operations are adapted to call the kernel cas function directly rather than using a_cas; due to small differences in their interface contracts, this makes the generated code much simpler.
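Building the other operations straight on a success/failure cas, as this message describes, could look like the sketch below. Each operation is a single retry loop, and the store gets its barrier for free from the cas. The `sketch_` names are hypothetical and a compiler builtin stands in for the kernel cas function.

```c
/* Stand-in for the kernel cas: 0 on success, nonzero on failure. */
static int sketch_k_cas(int t, int s, volatile int *p)
{
	return !__sync_bool_compare_and_swap(p, t, s);
}

/* Atomic exchange: retry until the cas from the observed value succeeds. */
static int sketch_swap(volatile int *x, int v)
{
	int old;
	do old = *x;
	while (sketch_k_cas(old, v, x));
	return old;
}

/* Atomic fetch-and-add, same single-loop shape. */
static int sketch_fetch_add(volatile int *x, int v)
{
	int old;
	do old = *x;
	while (sketch_k_cas(old, old + v, x));
	return old;
}

/* Atomic store: going through the cas supplies the barrier that a
 * plain assignment would lack. */
static void sketch_store(volatile int *p, int x)
{
	while (sketch_k_cas(*p, x, p));
}
```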
2013-08-11 | add missing a_or_l to atomic.h for non-x86 archs | Rich Felker | 1 | -0/+5
this is needed for recently committed sigaction code
2012-07-08 | remove little-endian assumption from arm atomic.h | Rich Felker | 1 | -4/+6
this hidden endian dependency had left big endian arm badly broken.
2011-09-18 | initial commit of the arm port | Rich Felker | 1 | -0/+112
this port assumes eabi calling conventions, eabi linux syscall convention, and presence of the kernel helpers at 0xffff0f?0 needed for threads support. otherwise it makes very few assumptions, and the code should work even on armv4 without thumb support, as well as on systems with thumb interworking. the bits headers declare this a little endian system, but as far as i can tell the code should work equally well on big endian. some small details are probably broken; so far, testing has been limited to qemu/aboriginal linux.