path: root/src/mman/mmap.c
Age | Commit message | Author | Files | Lines
2019-12-20 | revert unwanted and inadvertent change that slipped into mmap.c | Rich Felker | 1 | -1/+0
commit ae388becb529428ac926da102f1d025b3c3968da accidentally introduced #define SYSCALL_NO_TLS 1 in mmap.c, which was probably a stale change left around from unrelated syscall timing measurements. reverse it.
2019-12-17 | implement SO_TIMESTAMP[NS] fallback for kernels without time64 versions | Rich Felker | 1 | -0/+1
the definitions of SO_TIMESTAMP* changed on 32-bit archs in commit 38143339646a4ccce8afe298c34467767c899f51 to the new versions that provide 64-bit versions of timeval/timespec structure in control message payload. socket options, being state attached to the socket rather than function calls, are not trivial to implement as fallbacks on ENOSYS, and support for them was initially omitted on the assumption that the ioctl-based polling alternatives (SIOCGSTAMP*) could be used instead by applications if setsockopt fails.
unfortunately, it turns out that SO_TIMESTAMP is sufficiently old and widely supported that a number of applications assume it's available and treat errors as fatal.
this patch introduces emulation of SO_TIMESTAMP[NS] on pre-time64 kernels by falling back to setting the "_OLD" (time32) versions of the options if the time64 ones are not recognized, and performing translation of the SCM_TIMESTAMP[NS] control messages in recvmsg.
since recvmsg does not know whether its caller is legacy time32 code or time64, it performs translation for any SCM_TIMESTAMP[NS]_OLD control messages it sees, leaving the original time32 timestamp as-is (it can't be rewritten in-place anyway, and memmove would be mildly expensive) and appending the converted time64 control message at the end of the buffer. legacy time32 callers will see the converted one as a spurious control message of unknown type; time64 callers running on pre-time64 kernels will see the original one as a spurious control message of unknown type. a time64 caller running on a kernel with native time64 support will only see the time64 version of the control message.
emulation of SO_TIMESTAMPING is not included at this time since (1) applications which use it seem to be prepared for the possibility that it's not present or working, and (2) it can also be used in sendmsg control messages, in a manner that looks complex to emulate completely, and costly even when running on a time64-supporting kernel.
corresponding changes in recvmmsg are not made at this time; they will be done separately.
2018-09-12 | remove spurious inclusion of libc.h for LFS64 ABI aliases | Rich Felker | 1 | -2/+1
the LFS64 macro was not self-documenting and barely saved any characters. simply use weak_alias directly so that it's clear what's being done, and doesn't depend on a header to provide a strange macro.
2017-09-06 | work around incorrect EPERM from mmap syscall | Rich Felker | 1 | -2/+7
under some conditions, the mmap syscall wrongly fails with EPERM instead of ENOMEM when memory is exhausted; this is probably the result of the kernel trying to fit the allocation somewhere that crosses into the kernel range or below mmap_min_addr. in any case it's a conformance bug, so work around it. for now, only handle the case of anonymous mappings with no requested address; in other cases EPERM may be a legitimate error. this indirectly fixes the possibility of malloc failing with the wrong errno value.
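The condition described above can be sketched as a small filter on the raw syscall result. This is only an illustration of the stated rule, not the code from the commit: the helper name `fixup_mmap_eperm` is hypothetical, and the sketch assumes the Linux convention that a raw syscall returns a negated errno value on failure.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <sys/mman.h>

/* Hypothetical helper: rewrite a spurious -EPERM to -ENOMEM, but only for
 * anonymous mappings with no requested address and no MAP_FIXED. In every
 * other case EPERM may be legitimate, so the result passes through as-is.
 * Assumes `ret` is a raw kernel-style return value (negated errno). */
static long fixup_mmap_eperm(long ret, void *start, int flags)
{
    if (ret == -EPERM && !start
        && (flags & MAP_ANONYMOUS) && !(flags & MAP_FIXED))
        ret = -ENOMEM;
    return ret;
}
```

A caller such as malloc that requests anonymous memory with a null address hint would then observe ENOMEM on exhaustion, which is the errno it is expected to report.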
2017-04-21 | allow full-range file offsets to mmap on archs with 64-bit syscall args | Rich Felker | 1 | -1/+1
normally 32-bit archs use the mmap2 syscall and are limited to an offset of 2^32 pages. however some 32-bit archs (mainly ILP32-on-64 ones like x32) have 64-bit syscall argument slots and thus can accept the full range. don't artificially limit them.
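To make the limit concrete: mmap2 takes its offset in fixed-size units rather than bytes, so the usable range is bounded by the width of the argument slot. The following is a rough sketch of that arithmetic only; the function name `fits_mmap2_slot` and its parameters are made up for illustration, and the 4096-byte unit used in the comment is an assumption (the unit can vary by arch).

```c
#include <stdint.h>

/* Hypothetical check: does a byte offset, expressed in mmap2 units, fit in
 * a syscall argument slot of `slot_bits` bits? With a 4096-byte unit and a
 * 32-bit slot the limit is 2^32 units, i.e. 2^44 bytes; with a 64-bit slot
 * every non-negative offset fits. */
static int fits_mmap2_slot(uint64_t off, uint64_t unit, unsigned slot_bits)
{
    uint64_t units = off / unit;
    /* Guard the shift: shifting a 64-bit value by 64 is undefined in C. */
    return slot_bits >= 64 || (units >> slot_bits) == 0;
}
```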
2015-04-10 | redesign and simplify vmlock system | Rich Felker | 1 | -6/+3
this global lock allows certain unlock-type primitives to exclude mmap/munmap operations which could change the identity of virtual addresses while references to them still exist. the original design mistakenly assumed mmap/munmap would conversely need to exclude the same operations which exclude mmap/munmap, so the vmlock was implemented as a sort of 'symmetric recursive rwlock'. this turned out to be unnecessary. commit 25d12fc0fc51f1fae0f85b4649a6463eb805aa8f already shortened the interval during which mmap/munmap held their side of the lock, but left the inappropriate lock design and some inefficiency. the new design uses a separate function, __vm_wait, which does not hold any lock itself and only waits for lock users which were already present when it was called to release the lock. this is sufficient because of the way operations that need to be excluded are sequenced: the "unlock-type" operations using the vmlock need only block mmap/munmap operations that are precipitated by (and thus sequenced after) the atomic-unlock they perform while holding the vmlock. this allows for a spectacular lack of synchronization in the __vm_wait function itself.
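The shape of this design can be sketched with C11 atomics. This is a simplified illustration, not the library's implementation: the names `vm_users`, `vm_lock`, `vm_unlock`, and `vm_wait` are hypothetical, and the sketch polls with `sched_yield` where a real implementation would presumably block on a futex. It also waits for the user count to reach zero, which is at least as strong as waiting only for the users present at call time.

```c
#include <sched.h>
#include <stdatomic.h>

/* Count of "unlock-type" operations currently inside their critical window. */
static atomic_int vm_users;

/* Taken by operations that must not race with mmap/munmap. */
static void vm_lock(void)   { atomic_fetch_add(&vm_users, 1); }
static void vm_unlock(void) { atomic_fetch_sub(&vm_users, 1); }

/* Called by mmap/munmap before changing mappings. It holds nothing itself;
 * it only waits until the current lock users have drained. */
static void vm_wait(void)
{
    while (atomic_load(&vm_users) != 0)
        sched_yield();
}
```

Because the waiter never holds anything, lock users never block on mmap/munmap, which is the asymmetry the commit message describes.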
2014-08-16 | optimize locking against vm changes for mmap/munmap | Rich Felker | 1 | -7/+6
the whole point of this locking is to prevent munmap, or mmap with MAP_FIXED, from deallocating virtual addresses, or changing the backing a given virtual address refers to, during certain race windows involving self-synchronized unmapping or destruction of pthread synchronization objects. there is no need for exclusion in the other direction, so it suffices to take the lock momentarily and release it before making the syscall, rather than holding it across the syscall.
2014-07-30 | add framework for mmap2 syscall unit to vary by arch | Rich Felker | 1 | -2/+3
2013-06-27 | disallow creation of objects larger than PTRDIFF_MAX via mmap | Rich Felker | 1 | -0/+5
internally, other parts of the library assume sizes don't overflow ssize_t and/or ptrdiff_t, and the way this assumption is made valid is by preventing creation of such large objects. malloc already does so, but the check was missing from mmap. this is also a quality of implementation issue: even if the implementation internally could handle such objects, applications could inadvertently invoke undefined behavior by subtracting pointers within an object. it is very difficult to guard against this in applications, so a good implementation should simply ensure that it does not happen.
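A length check of this kind might look like the sketch below. The function name `check_map_len` is hypothetical, the choice of ENOMEM as the error is an assumption, and the comparison follows the commit title ("larger than PTRDIFF_MAX"); the actual boundary condition in the source may differ.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pre-check: refuse any mapping length that could not be
 * represented as a ptrdiff_t, so that subtracting two pointers into the
 * resulting object can never overflow. Returns 0 if the length is
 * acceptable, or an errno value (assumed here to be ENOMEM) if not. */
static int check_map_len(size_t len)
{
    if (len > (size_t)PTRDIFF_MAX)
        return ENOMEM;
    return 0;
}
```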
2012-12-20 | clean up and fix logic for making mmap fail on invalid/unsupported offsets | Rich Felker | 1 | -3/+7
the previous logic was assuming the kernel would give EINVAL when passed an invalid address, but instead with MAP_FIXED it was giving EPERM, as it considered this an attempt to map over kernel memory. instead of trying to get the kernel to do the right thing, the new code just handles the error in userspace.
I have also cleaned up the code to use a single mask to check for invalid low bits and unsupported high bits, so it's simpler and more clearly correct. the old code was actually wrong for sizeof(long) smaller than sizeof(off_t) but not equal to 4; now it should be correct for all possibilities.
for 64-bit systems, the low-bits test is new and extraneous (the kernel should catch the error anyway when the mmap2 syscall is not used), but it's cheap anyway. if this is an issue, the OFF_MASK definition could be tweaked to omit the low bits when SYS_mmap2 is not defined.
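One way to build such a combined mask is sketched below. This is an illustration under assumptions rather than the definition from the commit: the functions `off_mask` and `off_is_valid` and their parameters are hypothetical, the argument size is passed explicitly so both the 32-bit and 64-bit cases can be exercised, and a 4096-byte unit is assumed in the comments.

```c
#include <stdint.h>

/* Hypothetical mask builder. `arg_size` is the size in bytes of a syscall
 * argument slot (4 or 8); `unit` is the offset granularity, a power of two.
 * High part: -0x2000ULL has every bit from 13 upward set, so shifting it
 * left by (bits - 1) sets every bit from (bits + 12) upward. For 4-byte
 * slots that is bit 44 and above; for 8-byte slots all bits shift out and
 * the high part is zero. Shifting by bits - 1 rather than bits keeps the
 * shift count below 64.
 * Low part: unit - 1 flags offsets that are not unit-aligned. */
static uint64_t off_mask(unsigned arg_size, uint64_t unit)
{
    uint64_t high = -0x2000ULL << (8 * arg_size - 1);
    return high | (unit - 1);
}

/* An offset is acceptable only if it has no bit in common with the mask. */
static int off_is_valid(int64_t off, unsigned arg_size, uint64_t unit)
{
    return ((uint64_t)off & off_mask(arg_size, unit)) == 0;
}
```

With a 4-byte slot and a 4096-byte unit, the single test rejects both misaligned offsets and offsets at or beyond 2^44 bytes; with an 8-byte slot only the alignment test remains.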
2011-09-27 | process-shared barrier support, based on discussion with bdonlan | Rich Felker | 1 | -2/+11
this implementation is rather heavy-weight, but it's the first solution i've found that's actually correct. all waiters actually wait twice at the barrier so that they can synchronize exit, and they hold a "vm lock" that prevents changes to virtual memory mappings (and blocks pthread_barrier_destroy) until all waiters are finished inspecting the barrier. thus, it is safe for any thread to destroy and/or unmap the barrier's memory as soon as pthread_barrier_wait returns, without further synchronization.
2011-04-06 | consistency: change all remaining syscalls to use SYS_ rather than __NR_ prefix | Rich Felker | 1 | -1/+1
2011-03-20 | global cleanup to use the new syscall interface | Rich Felker | 1 | -2/+2
2011-02-13 | cleaning up syscalls in preparation for x86_64 port | Rich Felker | 1 | -0/+4
- hide all the legacy xxxxxx32 name cruft in syscall.h so the actual source files can be clean and uniform across all archs.
- cleanup llseek/lseek and mmap2/mmap handling for 32/64 bit systems
- alternate implementation for nice if the target lacks nice syscall
2011-02-12 | initial check-in, version 0.5.0 (tag: v0.5.0) | Rich Felker | 1 | -0/+18