path: root/src/malloc/lite_malloc.c
Age | Commit message | Author | Files | Lines
2015-11-04 | remove external linkage from __simple_malloc definition | Rich Felker | 1 | -1/+1
this function is used only as a weak definition for malloc, for static linking in programs which do not call realloc or free. since it had external linkage and was thereby exported in libc.so's dynamic symbol table, --gc-sections was unable to drop it. this was merely an oversight; there's no reason for it to be external, so make it static.
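The linkage pattern this message describes can be sketched as below. This is only an illustration, not the actual musl source: `demo_simple_malloc` and `demo_malloc` are hypothetical names chosen so the snippet does not override the real `malloc`, the arena-based body is a placeholder, and the `weak`/`alias` attributes assume GCC or Clang targeting ELF.

```c
#include <stddef.h>

/* Hypothetical stand-in for __simple_malloc. It has internal linkage, so it
 * is never exported under its own name. */
static void *demo_simple_malloc(size_t n)
{
	/* Placeholder body (assumption): carve slices out of a fixed arena. */
	static unsigned char arena[4096] __attribute__((aligned(16)));
	static size_t used;

	if (!n) n = 1;
	if (n > sizeof arena - used) return NULL;
	/* Round up to 16 bytes so the next slice stays aligned. */
	n = (n + 15) & ~(size_t)15;

	void *p = arena + used;
	used += n;
	return p;
}

/* Weak, externally visible name that resolves to the static function above.
 * A strong definition of demo_malloc in another object file would take
 * precedence over this one. */
void *demo_malloc(size_t) __attribute__((weak, alias("demo_simple_malloc")));
```

Only the alias name is visible outside the translation unit; the function's own name stays local, which is the property the commit restores.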
2015-06-22 | fix regression/typo that disabled __simple_malloc when calloc is used | Rich Felker | 1 | -1/+1
commit ba819787ee93ceae94efd274f7849e317c1bff58 introduced this regression. since the __malloc0 weak alias was not properly provided by __simple_malloc, use of calloc forced the full malloc to be linked.
2015-06-22 | fix calloc when __simple_malloc implementation is used | Rich Felker | 1 | -0/+1
previously, calloc's implementation encoded assumptions about the implementation of malloc, accessing a size_t word just prior to the allocated memory to determine if it was obtained by mmap to optimize out the zero-filling. when __simple_malloc is used (static linking a program with no realloc/free), it doesn't matter if the result of this check is wrong, since all allocations are zero-initialized anyway. but the access could be invalid if it crosses a page boundary or if the pointer is not sufficiently aligned, which can happen for very small allocations. this patch fixes the issue by moving the zero-fill logic into malloc.c with the full malloc, as a new function named __malloc0, which is provided by a weak alias to __simple_malloc (which always gives zero-filled memory) when the full malloc is not in use.
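A minimal sketch of the resulting split, assuming hypothetical names (`demo_malloc0`, `demo_calloc`) rather than musl's real symbols: the calloc-style wrapper only checks for multiplication overflow and delegates to a zero-filling allocator, so it never reads the word in front of the returned pointer. The static arena is an assumption standing in for an allocator whose memory is always zero-initialized.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical zero-filling allocator standing in for __malloc0. It is backed
 * by a static arena, which starts out all-zero and is never reused, so no
 * explicit memset is needed. */
static void *demo_malloc0(size_t n)
{
	static unsigned char arena[1 << 16] __attribute__((aligned(16)));
	static size_t used;

	if (!n) n = 1;
	if (n > sizeof arena - used) return NULL;
	n = (n + 15) & ~(size_t)15;

	void *p = arena + used;
	used += n;
	return p;
}

/* calloc-style wrapper: no assumptions about allocator headers, and no access
 * to memory before the returned pointer. */
void *demo_calloc(size_t m, size_t n)
{
	if (n && m > SIZE_MAX / n) {
		errno = ENOMEM;
		return NULL;
	}
	return demo_malloc0(m * n);
}
```

With this shape, whichever allocator is linked decides how zero-filling is achieved, and the wrapper stays valid for pointers of any size or alignment.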
2015-06-14 | refactor malloc's expand_heap to share with __simple_malloc | Rich Felker | 1 | -23/+26
this extends the brk/stack collision protection added to full malloc in commit 276904c2f6bde3a31a24ebfa201482601d18b4f9 to also protect the __simple_malloc function used in static-linked programs that don't reference the free function. it also extends support for using mmap when brk fails, which full malloc got in commit 5446303328adf4b4e36d9fba21848e6feb55fab4, to __simple_malloc.

since __simple_malloc may expand the heap by arbitrarily large increments, the stack collision detection is enhanced to detect interval overlap rather than just proximity of a single address to the stack. code size is increased a bit, but this is partly offset by the sharing of code between the two malloc implementations, which, due to linking semantics, both get linked into a program that needs the full malloc with realloc/free support.
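The difference between a proximity check and an interval-overlap check can be sketched as follows. The function name, its parameters, and the 8 MiB window are assumptions made for illustration; they are not taken from the commit.

```c
#include <stdint.h>

/* Hypothetical guard size: treat this many bytes below a known stack address
 * as potentially belonging to the stack. */
#define DEMO_STACK_WINDOW ((uintptr_t)8 << 20)

/* Returns nonzero if growing the heap from old_end to new_end, i.e. the range
 * [old_end, new_end), intersects the window [lo, hi) just below stack_addr.
 * Two ranges overlap exactly when each one starts before the other ends. */
int demo_traverses_stack(uintptr_t old_end, uintptr_t new_end,
                         uintptr_t stack_addr)
{
	uintptr_t hi = stack_addr;
	uintptr_t lo = hi > DEMO_STACK_WINDOW ? hi - DEMO_STACK_WINDOW : 0;
	return new_end > lo && old_end < hi;
}
```

A check that only asks whether `new_end` lies near the stack would miss a single large expansion that starts below the window and ends above it; the overlap test catches that case because it looks at both ends of the growth.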
2015-03-03 | make all objects used with atomic operations volatile | Rich Felker | 1 | -1/+1
the memory model we use internally for atomics permits plain loads of values which may be subject to concurrent modification without requiring that a special load function be used. since a compiler is free to make transformations that alter the number of loads or the way in which loads are performed, the compiler is theoretically free to break this usage. the most obvious concern is with atomic cas constructs: something of the form tmp=*p;a_cas(p,tmp,f(tmp)); could be transformed to a_cas(p,*p,f(*p)); where the latter is intended to show multiple loads of *p whose resulting values might fail to be equal; this would break the atomicity of the whole operation. but even more fundamental breakage is possible.

with the changes being made now, objects that may be modified by atomics are modeled as volatile, and the atomic operations performed on them by other threads are modeled as asynchronous stores by hardware which happens to be acting on the request of another thread. such modeling of course does not itself address memory synchronization between cores/cpus, but that aspect was already handled.

this all seems less than ideal, but it's the best we can do without mandating a C11 compiler and using the C11 model for atomics.

in the case of pthread_once_t, the ABI type of the underlying object is not volatile-qualified. so we are assuming that accessing the object through a volatile-qualified lvalue via casts yields volatile access semantics. the language of the C standard is somewhat unclear on this matter, but this is an assumption the linux kernel also makes, and seems to be the correct interpretation of the standard.
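The load-then-cas construct discussed above might look like this in practice. This is a sketch only: `demo_cas` is a hypothetical wrapper built on the GCC/Clang `__sync_val_compare_and_swap` builtin, not musl's `a_cas`, and `demo_increment` plays the role of the `f(tmp)` update.

```c
/* Hypothetical CAS wrapper in the spirit of a_cas: returns the value that was
 * in *p before the operation. Relies on a GCC/Clang builtin. */
static int demo_cas(volatile int *p, int expected, int desired)
{
	return __sync_val_compare_and_swap(p, expected, desired);
}

/* Atomically replace *p with *p + 1 and return the new value. */
int demo_increment(volatile int *p)
{
	int tmp;
	do {
		/* *p is volatile, so this is a single load per iteration; both
		 * uses of tmp below see the same value instead of two separate
		 * reads of *p that might differ. */
		tmp = *p;
	} while (demo_cas(p, tmp, tmp + 1) != tmp);
	return tmp + 1;
}
```

If the compared value and the value fed into the update came from different loads, the cas could succeed while storing a result computed from a stale or newer value, which is the breakage the message warns about.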
2012-04-24 | ditch the priority inheritance locks; use malloc's version of lock | Rich Felker | 1 | -4/+4
i did some testing trying to switch malloc to use the new internal lock with priority inheritance, and my malloc contention test got 20-100 times slower. if priority inheritance futexes are this slow, it's simply too high a price to pay for avoiding priority inversion. maybe we can consider them somewhere down the road once the kernel folks get their act together on this (and preferably don't link it to glibc's inefficient lock API)...

as such, i've switched __lock to use malloc's implementation of lightweight locks, and updated all the users of the code to use an array with a waiter count for their locks. this should give optimal performance in the vast majority of cases, and it's simple.

malloc is still using its own internal copy of the lock code because it seems to yield measurably better performance with -O3 when it's inlined (20% or more difference in the contention stress test).
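A rough sketch of a lock kept in a two-element array with a waiter count is shown below. It is not the musl implementation: the `demo_` names are hypothetical, the GCC/Clang `__sync` builtins stand in for musl's internal atomics, and `sched_yield` is a simplified substitute for sleeping on a futex.

```c
#include <sched.h>

/* lock[0]: 0 = free, 1 = held.  lock[1]: number of threads currently waiting. */
void demo_lock(volatile int *lock)
{
	while (__sync_lock_test_and_set(&lock[0], 1)) {
		/* Contended: register as a waiter, back off, then retry.
		 * A futex-based lock would sleep in the kernel here instead. */
		__sync_fetch_and_add(&lock[1], 1);
		sched_yield();
		__sync_fetch_and_sub(&lock[1], 1);
	}
}

void demo_unlock(volatile int *lock)
{
	__sync_lock_release(&lock[0]);
	/* With a futex, a nonzero lock[1] at this point would trigger a wake. */
}

/* Nonblocking attempt: returns 1 if the lock was acquired, 0 otherwise. */
int demo_trylock(volatile int *lock)
{
	return __sync_lock_test_and_set(&lock[0], 1) == 0;
}
```

The uncontended path is one atomic swap to lock and one store to unlock; the waiter count is only touched under contention, so an unlocker can skip the wake-up entirely when nobody is waiting.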
2011-03-30 | rename __simple_malloc.c to lite_malloc.c - yes this affects behavior! | Rich Felker | 1 | -0/+46
why does this affect behavior? well, the linker seems to traverse archive files starting from its current position when resolving symbols. since calloc.c comes alphabetically (and thus in sequence in the archive file) between __simple_malloc.c and malloc.c, attempts to resolve the "malloc" symbol for use by calloc.c were pulling in the full malloc.c implementation rather than the __simple_malloc.c implementation. as of now, lite_malloc.c and malloc.c are adjacent in the archive and in the correct order, so malloc.c should never be used to resolve "malloc" unless it's already needed to resolve another symbol ("free" or "realloc").