summaryrefslogtreecommitdiff
path: root/arch/or1k
AgeCommit message (Collapse)AuthorFilesLines
2020-11-29bits/syscall.h: add __NR_close_range from linux v5.9Szabolcs Nagy1-0/+1
see linux commit 9b4feb630e8e9801603f3cab3a36369e3c1cf88d arch: wire-up close_range() linux commit 278a5fbaed89dacd04e9d052f4594ffd0e0585de open: add close_range()
2020-09-09bits/syscall.h: add __NR_faccessat2 from linux v5.8Szabolcs Nagy1-0/+1
the linux faccessat syscall lacks a flag argument that is necessary to implement the posix api, see linux commit c8ffd8bcdd28296a198f237cc595148a8d4adfbe vfs: add faccessat2 syscall
2020-09-09add pidfd_getfd and openat2 syscall numbers from linux v5.6Szabolcs Nagy1-0/+2
also added clone3 on sh and m68k, on sh it's still missing (not yet wired up), but reserved so safe to add. see linux commit fddb5d430ad9fa91b49b1d34d0202ffe2fa0e179 open: introduce openat2(2) syscall linux commit 9a2cef09c801de54feecd912303ace5c27237f12 arch: wire up pidfd_getfd syscall linux commit 8649c322f75c96e7ced2fec201e123b2b073bf09 pid: Implement pidfd_getfd syscall linux commit e8bb2a2a1d51511e6b3f7e08125d52ec73c11139 m68k: Wire up clone3() syscall
2020-08-27deduplicate __pthread_self thread pointer adjustment out of each archRich Felker1-5/+4
the adjustment made is entirely a function of TLS_ABOVE_TP and TP_OFFSET. aside from avoiding repetition of the TP_OFFSET value and arithmetic, this change makes pthread_arch.h independent of the definition of struct __pthread from pthread_impl.h. this in turn will allow inclusion of pthread_arch.h to be moved to the top of pthread_impl.h so that it can influence the definition of the structure. previously, arch files were very inconsistent about the type used for the thread pointer. this change unifies the new __get_tp interface to always use uintptr_t, which is the most correct when performing arithmetic that may involve addresses outside the actual pointed-to object (due to TP_OFFSET).
2020-08-24deduplicate TP_ADJ logic out of each arch, replace with TP_OFFSETRich Felker1-1/+0
the only part of TP_ADJ that was not uniquely determined by TLS_ABOVE_TP was the 0x7000 adjustment used mainly on mips and powerpc variants.
2020-02-05remove legacy time32 timer[fd] syscalls from public syscall.hRich Felker1-4/+4
this extends commit 5a105f19b5aae79dd302899e634b6b18b3dcd0d6, removing timer[fd]_settime and timer[fd]_gettime. the timerfd ones are likely to have been used in software that started using them before it could rely on libc exposing functions.
2020-02-05remove further legacy time32 clock syscalls from public syscall.hRich Felker1-4/+4
this extends commit 5a105f19b5aae79dd302899e634b6b18b3dcd0d6, removing clock_settime, clock_getres, clock_nanosleep, and settimeofday.
2020-01-30remove legacy clock_gettime and gettimeofday from public syscall.hRich Felker1-2/+2
some nontrivial number of applications have historically performed direct syscalls for these operations rather than using the public functions. such usage is invalid now that time_t is 64-bit and these syscalls no longer match the types they are used with, and it was already harmful before (by suppressing use of vdso). since syscall() has no type safety, incorrect usage of these syscalls can't be caught at compile-time. so, without manually inspecting or running additional tools to check sources, the risk of such errors slipping through is high. this patch renames the syscalls on 32-bit archs to clock_gettime32 and gettimeofday_time32, so that applications using the original names will fail to build without being fixed. note that there are a number of other syscalls that may also be unsafe to use directly after the time64 switchover, but (1) these are the main two that seem to be in widespread use, and (2) most of the others continue to have valid usage with a null timeval/timespec argument, as the argument is an optional timeout or similar.
2019-12-30add clone3 syscall number from linux v5.3Szabolcs Nagy1-0/+1
the syscall number is reserved on all targets, but it is not wired up on all targets, see linux commit 8f6ccf6159aed1f04c6d179f61f6fb2691261e84 Merge tag 'clone3-v5.3' of ... brauner/linux linux commit 8f3220a806545442f6f26195bc491520f5276e7c arch: wire-up clone3() syscall linux commit 7f192e3cd316ba58c88dfa26796cf77789dd9872 fork: add clone3
2019-12-30add pidfd_open syscall number from linux v5.3Szabolcs Nagy1-0/+1
see linux commit 7615d9e1780e26e0178c93c55b73309a5dc093d7 arch: wire-up pidfd_open() linux commit 32fcb426ec001cb6d5a4a195091a8486ea77e2df pid: add pidfd_open()
2019-11-02move time_t and suseconds_t definitions to common alltypes.h.inRich Felker1-3/+0
now that all 32-bit archs have 64-bit time_t (and suseconds_t), the arch-provided _Int64 macro (long or long long, as appropriate) can be used to define them, and arch-specific definitions are no longer needed.
2019-11-02move time64 ioctl numbers to generic bits/ioctl.hRich Felker1-4/+0
now that all 32-bit archs have 64-bit time types, the values for the time-related ioctls can be shared. the mechanism for this is an arch/generic version of the bits header. archs which don't use the generic header still need to duplicate the definitions. x32, which does not use the new time64 values of the macros, already has its own overrides, so this commit does not affect it.
2019-11-02move time64 socket options from arch bits to top-level sys/socket.hRich Felker1-5/+0
now that all 32-bit archs have 64-bit time types, the values for the time-related socket option macros can be treated as universal for 32-bit archs. the sys/socket.h mechanism for this predates arch/generic and is instead in the top-level header. x32, which does not use the new time64 values of the macros, already has its own overrides, so this commit does not affect it.
2019-11-02switch all existing 32-bit archs to 64-bit time_tRich Felker9-22/+46
this commit preserves ABI fully for existing interface boundaries between libc and libc consumers (applications or libraries), by retaining existing symbol names for the legacy 32-bit interfaces and redirecting sources compiled against the new headers to alternate symbol names. this does not necessarily, however, preserve the pairwise ABI of libc consumers with one another; where they use time_t-derived types in their interfaces with one another, it may be necessary to synchronize updates with each other. the intent is that ABI resulting from this commit already be stable and permanent, but it will not be officially so until a release is made. changes to some header-defined types that do not play any role in the ABI between libc and its consumers may still be subject to change. mechanically, the changes made by this commit for each 32-bit arch are as follows: - _REDIR_TIME64 is defined to activate the symbol redirections in public headers - COMPAT_SRC_DIRS is defined in arch.mak to activate build of ABI compat shims to serve as definitions for the original symbol names - time_t and suseconds_t definitions are changed to long long (64-bit) - IPC_STAT definition is changed to add the IPC_TIME64 bit (0x100), triggering conversion of semid_ds, shmid_ds, and msqid_ds split low/high time bits into new time_t members - structs semid_ds, shmid_ds, msqid_ds, and stat are modified to add new 64-bit time_t/timespec members at the end, maintaining existing layout of other members. - socket options (SO_*) and ioctl (sockios) command macros are redefined to use the kernel's "_NEW" values. in addition, on archs where vdso clock_gettime is used, the VDSO_CGT_SYM macro definition in syscall_arch.h is changed to use a new time64 vdso function if available, and a new VDSO_CGT32_SYM macro is added for use as fallback on kernels lacking time64.
2019-10-17move pthread types out of per-arch alltypes.hRich Felker1-8/+0
policy has long been that these definitions are purely a function of whether long/pointer is 32- or 64-bit, and that they are not allowed to vary per-arch. move the definition to the shared alltypes.h.in fragment, using integer constant expressions in terms of sizeof to vary the array dimensions appropriately. I'm not sure whether this is more or less ugly than using preprocessor conditionals and two sets of definitions here, but either way is a lot less ugly than repeating the same thing for every arch.
2019-10-17define LONG_MAX via arch alltypes.h, strip down bits/limits.hRich Felker2-7/+1
LLONG_MAX is uniform for all archs we support and plenty of header and code level logic assumes it is, so it does not make sense for limits.h bits mechanism to pretend it's variable. LONG_BIT can be defined in terms of LONG_MAX; there's no reason to put it in bits. by moving LONG_MAX definition to __LONG_MAX in alltypes.h and moving LLONG_MAX out of bits, there are now no plain-C limits that are defined in the bits header, so the bits header only needs to be included in the POSIX or extended profiles. this allows the feature test macro logic to be removed from the bits header, facilitating a long-term goal of getting such logic out of bits. having __LONG_MAX in alltypes.h will allow further generalization of headers. archs without a constant PAGESIZE no longer need bits/limits.h at all.
2019-10-17move __BYTE_ORDER definition to alltypes.hRich Felker2-1/+2
this change is motivated by the intersection of several factors. presently, despite being a nonstandard header, endian.h is exposing the unprefixed byte order macros and functions only if _BSD_SOURCE or _GNU_SOURCE is defined. this is to accommodate use of endian.h from other headers, including bits headers, which need to define structure layout in terms of endianness. with time64 switch-over, even more headers will need to do this. at the same time, the resolution of Austin Group issue 162 makes endian.h a standard header for POSIX-future, requiring that it expose the unprefixed macros and the functions even in standards-conforming profiles. changes to meet this new requirement would break existing internal usage of endian.h by causing it to violate namespace where it's used. instead, have the arch's alltypes.h define __BYTE_ORDER, either as a fixed constant or depending on the right arch-specific predefined macros for determining endianness. explicit literals 1234 and 4321 are used instead of __LITTLE_ENDIAN and __BIG_ENDIAN so that there's no danger of getting the wrong result if a macro is undefined and implicitly evaluates to 0 at the preprocessor level. the powerpc (32-bit) bits/endian.h being removed had logic for varying endianness, but our powerpc arch has never supported that and has always been big-endian-only. this logic is not carried over to the new __BYTE_ORDER definition in alltypes.h.
2019-10-17remove per-arch definitions for va_listRich Felker1-3/+0
now that commit f7f1079796abc6f97c69521d2334e9c7d3945dd8 removed the legacy i386 conditional definition, va_list is in no way arch-specific, and has no reason to be in the future. move it to the shared part of alltypes.h.in
2019-09-11add new syscall numbers from linux v5.2Szabolcs Nagy1-0/+6
new mount api syscalls were added, same numers on all targets, see linux commit a07b20004793d8926f78d63eb5980559f7813404 vfs: syscall: Add open_tree(2) to reference or clone a mount linux commit 2db154b3ea8e14b04fee23e3fdfd5e9d17fbc6ae vfs: syscall: Add move_mount(2) to move mounts around linux commit 24dcb3d90a1f67fe08c68a004af37df059d74005 vfs: syscall: Add fsopen() to prepare for superblock creation linux commit ecdab150fddb42fe6a739335257949220033b782 vfs: syscall: Add fsconfig() for configuring and managing a context linux commit 93766fbd2696c2c4453dd8e1070977e9cd4e6b6d vfs: syscall: Add fsmount() to create a mount for a superblock linux commit cf3cba4a429be43e5527a3f78859b1bfd9ebc5fb vfs: syscall: Add fspick() to select a superblock for reconfiguration linux commit 9c8ad7a2ff0bfe58f019ec0abc1fb965114dde7d uapi, x86: Fix the syscall numbering of the mount API syscalls [ver #2] linux commit d8076bdb56af5e5918376cd1573a6b0007fc1a89 uapi: Wire up the mount API syscalls on non-x86 arches [ver #2]
2019-08-02move IPC_STAT definition to a new bits/ipcstat.h fileRich Felker1-0/+1
otherwise, 32-bit archs that could otherwise share the generic bits/ipc.h would need to duplicate the struct ipc_perm definition, obscuring the fact that it's the same. sysvipc is not widely used and these headers are not commonly included, so there is no performance gain to be had by limiting the number of indirectly included files here. files with the existing time32 definition of IPC_STAT are added to all current 32-bit archs now, so that when it's changed the change will show up as a change rather than addition of a new file where it's less obvious that the value is changing vs the generic one that was used before.
2019-07-30remove arch-specific bits/ipc.h that are identical to genericRich Felker1-11/+0
previously these differed from generic because they needed their own definitions of IPC_64. now that it's no longer in public header, they're identical.
2019-07-30move IPC_64 from public bits/ipc.h to syscall_arch.hRich Felker2-2/+2
the definition of the IPC_64 macro controls the interface between libc and the kernel through syscalls; it's not a public API. the meaning is rather obscure. long ago, Linux's sysvipc *id_ds structures used 16-bit uids/gids and wrong types for a few other fields. this was in the libc5 era, before glibc. the IPC_64 flag (64 is a misnomer; it's more like 32) tells the kernel to use the modern[-ish] versions of the structures. the definition of IPC_64 has nothing to do with whether the arch is 32- or 64-bit. rather, due to either historical accident or intentional obnoxiousness, the kernel only accepts and masks off the 0x100 IPC_64 flag conditional on CONFIG_ARCH_WANT_IPC_PARSE_VERSION, i.e. for archs that want to provide, or that accidentally provided, both. for archs which don't define this option, no masking is performed and commands with the 0x100 bit set will fail as invalid. so ultimately, the definition is just a matter of matching an arbitrary switch defined per-arch in the kernel.
2019-07-29collapse out byte order conditions in bits/sem.h for fixed-endian archsRich Felker1-5/+0
having preprocessor conditionals on byte order in the bits headers for fixed-endian archs is confusing at best. remove them.
2019-07-29duplicate generic bits/sem.h for each arch using it, in prep to changeRich Felker1-0/+16
2019-07-29remove trailing newlines from various versions of bits/shm.hRich Felker1-1/+0
2019-07-29duplicate generic bits/shm.h for each arch using it, in prep to changeRich Felker1-0/+28
there are more archs sharing the generic 64-bit version of the struct, which is uniform and much more reasonable, than sharing the current "generic" one, and depending on how time64 sysvipc is done for 32-bit archs, even more may be sharing the "64-bit version" in the future. so, duplicate the current generic to all archs using it (arm, i386, m68k, microblaze, or1k) so that the generic can be changed freely. this is recorded as its own commit mainly as a hint to git tooling, to assist in copy/move tracking.
2019-07-18decouple struct stat from kernel typeRich Felker1-0/+21
presently, all archs/ABIs have struct stat matching the kernel stat[64] type, except mips/mipsn32/mips64 which do conversion hacks in syscall_arch.h to work around bugs in the kernel type. this patch completely decouples them and adds a translation step to the success path of fstatat. at present, this is just a gratuitous copying, but it opens up multiple possibilities for future support for 64-bit time_t on 32-bit archs and for cleaned-up/unified ABIs. for clarity, the mips hacks are not yet removed in this commit, so the mips kstat structs still correspond to the output of the hacks in their syscall_arch.h files, not the raw kernel type. a subsequent commit will fix this.
2019-07-01add new syscall numbers from linux v5.1Szabolcs Nagy1-0/+24
syscall numbers are now synced up across targets (starting from 403 the numbers are the same on all targets other than an arch specific offset) IPC syscalls sem*, shm*, msg* got added where they were missing (except for semop: only semtimedop got added), the new semctl, shmctl, msgctl imply IPC_64, see linux commit 0d6040d4681735dfc47565de288525de405a5c99 arch: add split IPC system calls where needed new 64bit time_t syscall variants got added on 32bit targets, see linux commit 48166e6ea47d23984f0b481ca199250e1ce0730a y2038: add 64-bit time_t syscalls to all 32-bit architectures new async io syscalls got added, see linux commit 2b188cc1bb857a9d4701ae59aa7768b5124e262e Add io_uring IO interface linux commit edafccee56ff31678a091ddb7219aba9b28bc3cb io_uring: add support for pre-mapped user IO buffers a new syscall got added that uses the fd of /proc/<pid> as a stable handle for processes: allows sending signals without pid reuse issues, intended to eventually replace rt_sigqueueinfo, kill, tgkill and rt_tgsigqueueinfo, see linux commit 3eb39f47934f9d5a3027fe00d906a45fe3a15fad signal: add pidfd_send_signal() syscall on some targets (arm, m68k, s390x, sh) some previously missing syscall numbers got added as well.
2019-04-10remove cruft for supposedly-buggy clang from or1k & microblaze syscall_archRich Felker1-9/+0
it was never demonstrated to me that this workaround was needed, and seems likely that, if there ever was any clang version for which it was needed, it's old enough to be unusably buggy in other ways. if it turns out some compilers actually can't do the register allocation right, we'll need to replace this with inline shuffling code, since the external __syscall dependency is being removed.
2019-03-13aarch64, or1k: add kexec_file_load syscall number from linux v5.0Szabolcs Nagy1-0/+1
added in linux commit 4e21565b7fd4d9045765f697887e74a704135fe2
2019-03-13aarch64, or1k: define rseq syscall number following linux v4.19Szabolcs Nagy1-0/+1
added in linux commit db7a2d1809a5b6b08d138ff68837f805fc073351
2018-12-09add io_pgetevents and rseq syscall numbers from linux v4.18Szabolcs Nagy1-0/+1
io_pgetevents is new in linux commit 7a074e96dee62586c935c80cecd931431bfdd0be rseq is new in linux commit d7822b1e24f2df5df98c76f0e94a5416349ff759
2018-10-16make thread-pointer-loading asm non-volatileRich Felker1-2/+2
this will allow the compiler to cache and reuse the result, meaning we no longer have to take care not to load it more than once for the sake of archs where the load may be expensive. depends on commit 1c84c99913bf1cd47b866ed31e665848a0da84a2 for correctness, since otherwise the compiler could hoist loads during stage 3 of dynamic linking before the initial thread-pointer setup.
2018-06-02fix TLS layout of TLS variant I when there is a gap above TPSzabolcs Nagy1-0/+1
In TLS variant I the TLS is above TP (or above a fixed offset from TP) but on some targets there is a reserved gap above TP before TLS starts. This matters for the local-exec tls access model when the offsets of TLS variables from the TP are hard coded by the linker into the executable, so the libc must compute these offsets the same way as the linker. The tls offset of the main module has to be alignup(GAP_ABOVE_TP, main_tls_align). If there is no TLS in the main module then the gap can be ignored since musl does not use it and the tls access models of shared libraries are not affected. The previous setup only worked if (tls_align & -GAP_ABOVE_TP) == 0 (i.e. TLS did not require large alignment) because the gap was treated as a fixed offset from TP. Now the TP points at the end of the pthread struct (which is aligned) and there is a gap above it (which may also need alignment). The fix required changing TP_ADJ and __pthread_self on affected targets (aarch64, arm and sh) and in the tlsdesc asm the offset to access the dtv changed too.
2018-03-10reverse definition dependency between PAGESIZE and PAGE_SIZERich Felker1-1/+1
PAGESIZE is actually the version defined in POSIX base, with PAGE_SIZE being in the XSI option. use PAGESIZE as the underlying definition to facilitate making exposure of PAGE_SIZE conditional.
2017-11-05add statx syscall numbers from linux v4.11Szabolcs Nagy1-0/+1
statx was added in linux commit a528d35e8bfcc521d7cb70aaf03e1bd296c8493f (there is no libc wrapper yet and microblaze and sh misses the number).
2017-09-06make syscall.h consistent with linuxSzabolcs Nagy1-0/+1
most of the found naming differences don't matter to musl, because internally it unifies the syscall names that vary across targets, but for external code the names should match the kernel uapi. aarch64: __NR_fstatat is called __NR_newfstatat in linux. __NR_or1k_atomic got mistakenly copied from or1k. arm: __NR_arm_sync_file_range is an alias for __NR_sync_file_range2 __NR_fadvise64_64 is called __NR_arm_fadvise64_64 in linux, the old non-arm name is kept too, it should not cause issues. (powerpc has similar nonstandard fadvise and it uses the normal name.) i386: __NR_madvise1 was removed from linux in commit 303395ac3bf3e2cb488435537d416bc840438fcb 2011-11-11 microblaze: __NR_fadvise, __NR_fstatat, __NR_pread, __NR_pwrite had different name in linux. mips: __NR_fadvise, __NR_fstatat, __NR_pread, __NR_pwrite, __NR_select had different name in linux. mipsn32: __NR_fstatat is called __NR_newfstatat in linux. or1k: __NR__llseek is called __NR_llseek in linux. the old name is kept too because that's the name musl uses internally. powerpc: __NR_{get,set}res{gid,uid}32 was never present in powerpc linux. __NR_timerfd was briefly defined in linux but then got renamed.
2016-12-29add pkey_{mprotect,alloc,free} syscalls from linux v4.9Szabolcs Nagy1-0/+3
see linux commit e8c24d3a23a469f1f40d4de24d872ca7023ced0a and linux Documentation/x86/protection-keys.txt
2016-07-06remove or1k version of sem.hBobby Bingham1-11/+0
It's identical to the generic version, after evaluating the endian preprocessor checks in the generic version.
2016-06-09add preadv2 and pwritev2 syscall numbers for linux v4.6Szabolcs Nagy1-0/+2
the syscalls take an additional flag argument, they were added in commit f17d8b35452cab31a70d224964cd583fb2845449 and a RWF_HIPRI priority hint flag was added to linux/fs.h in 97be7ebe53915af504fb491fb99f064c7cf3cb09. the syscall is not allocated for microblaze and sh yet.
2016-05-12deduplicate __NR_* and SYS_* syscall number definitionsBobby Bingham2-543/+272
2016-03-19add copy_file_range syscall numbers from linux v4.5Szabolcs Nagy1-0/+2
it was introduced for offloading copying between regular files in linux commit 29732938a6289a15e907da234d6692a2ead71855 (microblaze and sh does not yet have the syscall number.)
2016-03-18deduplicate bits/mman.hSzabolcs Nagy1-59/+0
currently five targets use the same mman.h constants and the rest share most constants too, so move them to sys/mman.h before the bits/mman.h include where the differences can be corrected by redefinition of the macros. this fixes two minor bugs: POSIX_MADV_DONTNEED was wrong on most targets (it should be the same as MADV_DONTNEED), and sh defined the x86-only MAP_32BIT mmap flag.
2016-01-27deduplicate the bulk of the arch bits headersRich Felker12-593/+0
all bits headers that were identical for a number of 'clean' archs are moved to the new arch/generic tree. in addition, a few headers that differed only cosmetically from the new generic version are removed. additional deduplication may be possible in mman.h and in several headers (limits.h, posix.h, stdint.h) that mostly depend on whether the arch is 32- or 64-bit, but they are left alone for now because greater gains are likely possible with more invasive changes to header logic, which is beyond the scope of this commit.
2016-01-26add MCL_ONFAULT and MLOCK_ONFAULT mlockall and mlock2 flagsSzabolcs Nagy1-0/+1
they lock faulted pages into memory (useful when a small part of a large mapped file needs efficient access), new in linux v4.4, commit b0f205c2a3082dd9081f9a94e50658c5fa906ff1 MLOCK_* is not in the POSIX reserved namespace for sys/mman.h
2016-01-26add mlock2 syscall number from linux v4.4Szabolcs Nagy1-0/+2
this is mlock with a flags argument, new in linux commit a8ca5d0ecbdde5cc3d7accacbd69968b0c98764e as usual microblaze and sh don't have allocated syscall number yet.
2016-01-26add new membarrier, userfaultfd and switch_endian syscallsSzabolcs Nagy1-0/+4
new in linux v4.3 added for aarch64, arm, i386, mips, or1k, powerpc, x32 and x86_64. membarrier is a system wide memory barrier, moves most of the synchronization cost to one side, new in kernel commit 5b25b13ab08f616efd566347d809b4ece54570d1 userfaultfd is useful for qemu and is new in kernel commit 8d2afd96c20316d112e04d935d9e09150e988397 switch_endian is powerpc only for switching endianness, new in commit 529d235a0e190ded1d21ccc80a73e625ebcad09b
2016-01-21refactor internal atomic.hRich Felker2-120/+14
rather than having each arch provide its own atomic.h, there is a new shared atomic.h in src/internal which pulls arch-specific definitions from arc/$(ARCH)/atomic_arch.h. the latter can be extremely minimal, defining only a_cas or new ll/sc type primitives which the shared atomic.h will use to construct everything else. this commit avoids making heavy changes to the individual archs' atomic implementations. definitions which are identical or near-identical to what the new shared atomic.h would produce have been removed, but otherwise the changes made are just hooking up the arch-specific files to the new infrastructure. major changes to take advantage of the new system will come in subsequent commits.
2015-11-02properly access mcontext_t program counter in cancellation handlerRich Felker1-2/+1
using the actual mcontext_t definition rather than an overlaid pointer array both improves correctness/readability and eliminates some ugly hacks for archs with 64-bit registers bit 32-bit program counter. also fix UB due to comparison of pointers not in a common array object.
2015-10-15prevent reordering of or1k and powerpc thread pointer loadsRich Felker1-0/+1
other archs use asm for the thread pointer load, so making that asm volatile is sufficient to inform the compiler that it has a "side effect" (crashing or giving the wrong result if the thread pointer was not yet initialized) that prevents reordering. however, powerpc and or1k have dedicated general purpose registers for the thread pointer and did not need to use any asm to access it; instead, "local register variables with a specified register" were used. however, there is no specification for ordering constraints on this type of usage, and presumably use of the thread pointer could be reordered across its initialization. to impose an ordering, I have added empty volatile asm blocks that produce the "local register variable with a specified register" as an output constraint.